[nycphp-talk] Logging best practices, 2 of 2
Gary Mort
garyamort at gmail.com
Thu May 8 18:30:37 EDT 2014
On 05/06/2014 09:39 AM, Federico Ulfo wrote:
> We're considering to use syslog as common logging for all languages,
> mostly PHP, Java and Python (Scala, Perl, Ruby ...), so we can finally
> log with a standard format, e.g.:
> TIME - HOSTNAME PID [LOG LEVEL] MESSAGE
>
>
I prefer syslog to all other logging transports. Using syslog means
you can save logs offsite, which is important if you ever need to do
forensic security analysis. If all the log data is on the server and
the server is compromised, the log data is compromised. Even if you
archive it off, it can still be modified before that archive occurs.
Truthfully, I prefer rsyslog, not syslog - but from the PHP side they
are the same thing. You can easily set up a Log Analyzer server to send
all the data to, which gives you the ability to tail/view your
logs as you're developing. And since Log Analyzer is written in PHP, you
can change it to suit your purposes. http://loganalyzer.adiscon.com/
and http://www.rsyslog.com/
A common mistake I see in people implementing syslog logging is that
they use direct connections to their syslog server, e.g. using monolog's
SyslogUdpHandler:
https://github.com/Seldaek/monolog/blob/master/src/Monolog/Handler/SyslogUdpHandler.php
To me, the whole point of using syslog/rsyslog is that it is FAST. Your
app connects to a syslog server over a local socket and fires strings at it.
If you connect to a /remote/ syslog server then every single message you
log will slow down your app. I've seen some weird UDP implementations
that wait for some form of ack from the network device that the message
has been queued. Or, if it is important to you, it's possible to use TCP
to provide guaranteed delivery.
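
As a rough illustration of the local-socket approach, here is a minimal
sketch in Python (one of the languages mentioned in the original
question), using the standard library's SysLogHandler. The helper name
build_local_syslog_logger, the "myapp" tag, the LOCAL0 facility and the
/dev/log socket path are assumptions for the example, not anything
prescribed above; /dev/log is the usual local syslog socket on Linux.

```python
import logging
import logging.handlers


def build_local_syslog_logger(name="myapp", address="/dev/log"):
    """Return a logger that writes to a local syslog socket (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # A string address makes SysLogHandler use a Unix domain socket,
    # so the application itself never talks to a remote server.
    handler = logging.handlers.SysLogHandler(
        address=address,
        facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
    )
    # The app supplies tag, PID, level and message; the local daemon
    # typically adds the timestamp and hostname on its own.
    handler.setFormatter(
        logging.Formatter(name + "[%(process)d]: [%(levelname)s] %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Usage (assumes a syslog daemon is listening on /dev/log):
#   log = build_local_syslog_logger()
#   log.info("user login ok")
```

Forwarding to the remote server is then left to the local daemon's
configuration rather than to the application.
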
It does take a bit more configuration, but with a local rsyslog server
you can set up anything from an extremely simple 'forward every message
to the remote server AND log it locally' rule to complex conditional logs.
I like to set it up to log to local files in a tmpfs directory, and then
purge the logs every couple of hours. That gives me local access to
the logs for development/debugging - while I have the remote server for
archives. Putting them on a tmpfs mount means that they get stored in
memory instead of on the hard drive, so there is no issue of slowing down
the server by making it write log messages to a hard drive.
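
The periodic purge could be done in many ways (cron plus find, logrotate,
and so on). Purely as a sketch, here is one way to express it in Python.
The function name purge_old_logs and the two-hour default are assumptions
chosen to match "every couple of hours"; the directory is whatever tmpfs
path the local logs are written to.

```python
import os
import time


def purge_old_logs(directory, max_age_seconds=2 * 60 * 60, now=None):
    """Delete regular files in `directory` older than `max_age_seconds`.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    # Materialize the listing first so removals do not disturb iteration.
    for entry in list(os.scandir(directory)):
        # Skip subdirectories and symlinks; only plain files are purged.
        if not entry.is_file(follow_symlinks=False):
            continue
        if now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

One caveat: a syslog daemon usually keeps its output files open, so a
real setup would likely also need to tell it to reopen its files after
a purge, or rely on a rotation tool that does so.
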
Note: just because you're using syslog does not mean you are locked into
the limitation of the message being a single string. You can json_encode
anything into a string - rsyslog has a number of plugins to explode that
data into fields and then save them into a database or such. Or if you
don't want to roll your own, Papertrail provides access via the syslog
protocol and already handles json. https://papertrailapp.com/ Sometimes
it's better to just pay someone else $20-40 a month and let them deal
with the headaches.
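
To make the "JSON in a single string" idea concrete, here is a small
sketch in Python (json.dumps playing the role of PHP's json_encode). The
function names encode_event and decode_event and the "msg" key are
hypothetical. The "@cee:" prefix is an assumption as well: it is the
cookie that rsyslog's JSON parsing module looks for by default, and other
receivers may not need it.

```python
import json

CEE_COOKIE = "@cee: "  # assumed marker for JSON payloads; adjust per receiver


def encode_event(message, **fields):
    """Pack a message plus arbitrary fields into one single-line string."""
    payload = dict(fields)
    payload["msg"] = message
    # json.dumps escapes embedded newlines, so the result stays on one
    # line; sort_keys keeps the output stable between runs.
    return CEE_COOKIE + json.dumps(payload, sort_keys=True)


def decode_event(line):
    """Recover the fields, roughly what a JSON-aware receiver would do."""
    if not line.startswith(CEE_COOKIE):
        return None  # plain-text message, nothing to explode
    return json.loads(line[len(CEE_COOKIE):])
```

The encoded string is what gets passed to the logger as the message, for
example encode_event("login failed", user="bob", attempts=3).
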
And with syslog, you're not locked into a vendor. You could send your log
files to Papertrail, save them onto a local tmpfs drive, AND send them
to a centralized garbage heap syslog server filled with cheap drives and
no real reporting/access policy. I.e., keep a copy of the data archived
forever 'just in case' but don't bother setting up an interface for it
until the expense of paying a provider like Papertrail grows large
enough to justify bringing it in house.