Dear Nic, Should we log directly?

I need some Unix genius advise and I thought of you...

We have a project here working on a tool to pipe the output of logs into LogStash, that logs thing.

Here's how I've known this to work in the past:

App > Logwriter Library > Log File on File System > \
  Light process tailing this > \
  Forwards on a socket to Logstash > \

Here's what they want to do:

App > Logs to STDERR/OUT | pipes directly to logstash.

So when upstart starts the app it starts it piping to another process which forwards to logstash direct.

Do you know what unix does to the downstream process if the upstream gets blocked?

Is this even a sensible question?

Thanks, Jim

Dear Jim,

So really you mean:

App | stasher

There is a problem with doing this which is that the unix pipe buffer isn't that big so if the stasher process blocks for any reason sending to stash then the pipe buffer can fill and that can be a problem. If the pipe buffer fills up then the process writing to the pipe (App) will be blocked. You don't really want that.

It's also a surprisingly difficult thing to do and easy to get wrong. If the programmer in question is inexperienced it may be a good idea to dissuade them.

The thing you mention first is this:

App > logfile & ( tail -f logfile | stash)

which is much safer except that can still be a problem; the logfile is quite hard to co-ordinate. Let me elaborate: at some point your're going to want to truncate logfile at which point you'll be looking to change it's name. The 2nd process then needs to notice and repoint the tail command. This is not trivial.

The BEST way to do this is:

App |{}| syslog => wherever

where |{}| is something I just invented to show piping over TCP. The syslog daemon collects the logs data, assembles it according to some protocol and then can send the logs over the network, over a pipe, dump to files, all of them at the same time - whatever it likes. If you're on unix always use a local syslog daemon as the first port of call.

There is a problem with this approach as well though, what happens if the syslog daemon is down? the app can't connect to the TCP server to send it's logs.

There are 3 mitigations to this:

1. look after the syslog server - check that we're getting logging from app, if we don't get any for an hour (a minute, 10seconds, whatever) is it broken?

2. have multiple syslog servers, one local, one localish (same LAN?) one less localish? that the app knows how to back off to.

3. have the people running the app think about what should happen when the app fails to find a log server, is it critical? Should the app stop? is there something else that could be done to alert people?

I should blog post this.

And now I have.

Follow up

Jim came back the next day with some more questions

Re: the tailing option, I heard that using tail -F rather than tail -f gets round the process dance issue? Add in a little log rotate and problem solved?

Not really. Consider this:

Process A

( while [ 1 ] ; do echo $(date) ; sleep 2 ; done ) > /tmp/somefile

Process B

tail -F /tmp/somefile

This works great. But now emulate what logrotate does:

mv /tmp/somefile /tmp/somefile.log

tail notices but it doesn't matter. No more log will ever arrive because the open file descriptor in process A will never get repointed to output to the new file. Data effectively runs down the plughole.

What is the best way of connecting an upstart / unicorn based apps std out to syslog?

I'd say, don't. Alter your application to use syslog directly. You can get Ruby logger tools to do that for you.

Having said that, obviously you want to collect data from unicorn itself because there might be important information there. If you can reduce the logging to a smallish level (less than 10s of MB for a days log file) then you can do this:

unicorn 2>&1 | while read line ; do echo $ line >> $LOGFILE ; done 

Of course, if unicorn provides a command line way to direct it's output to a file that would be ideal but see all the caveats about logrotateing logfiles above. Often the process will provide a SIGHUP to turn the logfiles over. Which is fine. Especially as you'll have moved most of application specific logging to syslog.