My colleague asked me to take a look at a logging issue on a server last week. He noticed that the error logs had way too little information about exceptions. In this particular instance, we had switched to Nginx + gunicorn instead of our usual Nginx + Apache + mod_wsgi (yeah, we’re weird). I took a quick look this morning and everything looked exactly like it should. I think I’ve read more gunicorn docs today than I ever have before.
Eventually, I asked my colleague Tryggvi for help. I needed a third person to tell me if I was making an obvious mistake. He asked me if I had tried running gunicorn without supervisor, which I hadn’t. I tried that locally first, and it worked! I was all set to blame supervisor for my woes and tried it on production. Nope. No luck. As any good sysadmin would do, I checked if the versions matched, and they did. CKAN itself has its dependencies frozen, which led to more confusion in my brain. It didn’t make sense.
I started looking at the exception in more detail. There was a note about email not working, followed by the actual traceback. Well, since I didn’t actually have a mail server on my local machine, I commented those configs out, and now I just had the right traceback. A few minutes later, it dawned on me. It’s a Pylons “feature”: the full traceback is printed to stdout if and only if there’s no email handling. Our default configs have an email configured, our servers have postfix installed, and all the errors go to an email alias that’s way too noisy to be useful (Sentry. Soon). I went and commented out the relevant bits of configuration on the server and voilà, it works!
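To make that behaviour concrete, here’s a toy Python model of what I understand to be going on. It is not Pylons’ actual code: `report_error`, `email_to` and `send_email` are names made up for illustration, and the real error middleware does considerably more. It only captures the rule described above, which is that the full traceback reaches the log stream only when no email handling is configured.

```python
import io
import traceback


def report_error(exc, email_to=None, send_email=None, stream=None):
    """Toy model of the reporting rule (hypothetical, not Pylons' code)."""
    if stream is None:
        stream = io.StringIO()

    # Format the complete traceback once so either branch can use it.
    full_traceback = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )

    # A one-line summary always lands in the log stream.
    stream.write("Error - %s: %s\n" % (type(exc).__name__, exc))

    if email_to and send_email is not None:
        # Email handling is configured: the details go to the mailbox,
        # and the log keeps only the summary line.
        send_email(email_to, full_traceback)
    else:
        # No email handling: the full traceback is written to the stream.
        stream.write(full_traceback)

    return stream
```

With an address configured, the stream gets only the summary line, which matches the sparse logs we saw on the server. With the email settings commented out, the same call writes the whole traceback.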
Image source: Unknown, but provided by Tryggvi :)