Service logs are gold, if you can mine them. We scan them for occasional debugging. Perhaps we grep them looking for errors or warnings, or setup an occasional nagios log regex monitor. If that. This is a waste of the best channel for data about a service.
+Service logs are gold, if you can mine them. We scan them for occasional debugging. Perhaps we grep them looking for errors or warnings, or setup an occasional nagios log regex monitor. If that. This is a waste of the best channel for data about a service.
-
+["Log. (Huh) What is it good for. Absolutely ..."](https://www.youtube.com/watch?v=01-2pNCZiNk)
-
-
debugging
-
monitors tools that alert operators
-
non real-time analysis (business or operational analysis)
-
historical analysis
-
+- debugging
+- monitors tools that alert operators
+- non real-time analysis (business or operational analysis)
+- historical analysis
-
These are what logs are good for. The current state of logging is barely adequate for the first of these. Doing reliable analysis, and even monitoring, of varied "printf-style" logs is a grueling or hacky task that most either don't bother with, fallback to paying someone else to do (viz. Splunk's great successes), or, for web sites, punt and use the plethora of JavaScript-based web analytics tools.
+These are what logs are good for. The current state of logging is barely adequate for the first of these. Doing reliable analysis, and even monitoring, of varied ["printf-style" logs](http://journal.paul.querna.org/articles/2011/12/26/log-for-machines-in-json/) is a grueling or hacky task that most either don't bother with, fallback to paying someone else to do (viz. Splunk's great successes), or, for web sites, punt and use the plethora of JavaScript-based web analytics tools.
-
Let's log in JSON. Let's format log records with a filter outside the app. Let's put more info in log records by not shoehorning into a printf-message. Debuggability can be improved. Monitoring and analysis can definitely be improved. Let's not write another regex-based parser, and use the time we've saved writing tools to collate logs from multiple nodes and services, to query structured logs (from all services, not just web servers), etc.
+Let's log in JSON. Let's format log records with a filter _outside_ the app. Let's put more info in log records by not shoehorning into a printf-message. Debuggability can be improved. Monitoring and analysis can _definitely_ be improved. Let's _not_ write another regex-based parser, and use the time we've saved writing tools to collate logs from multiple nodes and services, to query structured logs (from all services, not just web servers), etc.
-
At Joyent we use node.js for running many core services -- loosely coupled through HTTP REST APIs and/or AMQP. In this post I'll draw on experiences from my work on Joyent's SmartDataCenter product and observations of Joyent Cloud operations to suggest some improvements to service logging. I'll show the (open source) Bunyan logging library and tool that we're developing to improve the logging toolchain.
+At [Joyent](http://joyent.com) we use node.js for running many core services -- loosely coupled through HTTP REST APIs and/or AMQP. In this post I'll draw on experiences from my work on Joyent's [SmartDataCenter product](http://www.joyent.com/products/smartdatacenter/) and observations of [Joyent Cloud](http://www.joyentcloud.com/) operations to suggest some improvements to service logging. I'll show the (open source) **Bunyan logging library and tool** that we're developing to improve the logging toolchain.
-
Current State of Log Formatting
+## Current State of Log Formatting
-
# apache access log
+```
+# apache access log
10.0.1.22 - - [15/Oct/2010:11:46:46 -0700] "GET /favicon.ico HTTP/1.1" 404 209
fe80::6233:4bff:fe29:3173 - - [15/Oct/2010:11:46:58 -0700] "GET / HTTP/1.1" 200 44
@@ -52,43 +51,46 @@ Oct 14 12:32:20 banana com.apple.SecurityServer[25]: UID 501 authenticated as us
# typical expressjs log output
[Mon, 21 Nov 2011 20:52:11 GMT] 200 GET /foo (1ms)
Blah, some other unstructured output to from a console.log call.
-
+```
-
What're we doing here? Five logs at random. Five different date formats. As Paul Querna points out we haven't improved log parsability in 20 years. Parsability is enemy number one. You can't use your logs until you can parse the records, and faced with the above the inevitable solution is a one-off regular expression.
+What're we doing here? Five logs at random. Five different date formats. As [Paul Querna points out](http://journal.paul.querna.org/articles/2011/12/26/log-for-machines-in-json/) we haven't improved log parsability in 20 years. Parsability is enemy number one. You can't use your logs until you can parse the records, and faced with the above the inevitable solution is a one-off regular expression.
-
The current state of the art is various parsing libs, analysistools and homebrew scripts ranging from grep to Perl, whose scope is limited to a few niches log formats.
+The current state of the art is various [parsing libs](http://search.cpan.org/~akira/Apache-ParseLog-1.02/ParseLog.pm), [analysis](http://www.webalizer.org/) [tools](http://awstats.sourceforge.net/) and homebrew scripts ranging from grep to Perl, whose scope is limited to a few niches log formats.
-
JSON for Logs
+## JSON for Logs
-
JSON.parse() solves all that. Let's log in JSON. But it means a change in thinking: The first-level audience for log files shouldn't be a person, but a machine.
+`JSON.parse()` solves all that. Let's log in JSON. But it means a change in thinking: **The first-level audience for log files shouldn't be a person, but a machine.**
-
That is not said lightly. The "Unix Way" of small focused tools lightly coupled with text output is important. JSON is less "text-y" than, e.g., Apache common log format. JSON makes grep and awk awkward. Using less directly on a log is handy.
+That is not said lightly. The "Unix Way" of small focused tools lightly coupled with text output is important. JSON is less "text-y" than, e.g., Apache common log format. JSON makes `grep` and `awk` awkward. Using `less` directly on a log is handy.
-
But not handy enough. That 80's pastel jumpsuit awkwardness you're feeling isn't the JSON, it's your tools. Time to find a json tool -- json is one, bunyan described below is another one. Time to learn your JSON library instead of your regex library: JavaScript, Python, Ruby, Java, Perl.
+But not handy enough. That [80's pastel jumpsuit awkwardness](http://bit.ly/wTPlN3) you're feeling isn't the JSON, it's your tools. Time to find a `json` tool -- [json](https://github.com/trentm/json) is one, `bunyan` described below is another one. Time to learn your JSON library instead of your regex library: [JavaScript](https://developer.mozilla.org/en/JSON), [Python](http://docs.python.org/library/json.html), [Ruby](http://flori.github.com/json/), [Java](http://json.org/java/), [Perl](http://search.cpan.org/~makamaka/JSON-2.53/lib/JSON.pm).
-
Time to burn your log4j Layout classes and move formatting to the tools side. Creating a log message with semantic information and throwing that away to make a string is silly. The win at being able to trivially parse log records is huge. The possibilities at being able to add ad hoc structured information to individual log records is interesting: think program state metrics, think feeding to Splunk, or loggly, think easy audit logs.
+Time to burn your log4j Layout classes and move formatting to the tools side. Creating a log message with semantic information and throwing that away to make a string is silly. The win at being able to trivially parse log records is huge. The possibilities at being able to add ad hoc structured information to individual log records is interesting: think program state metrics, think feeding to Splunk, or loggly, think easy audit logs.
-
Introducing Bunyan
+## Introducing Bunyan
-
Bunyan is a node.js module for logging in JSON and a bunyan CLI tool to view those logs.
+[Bunyan](https://github.com/trentm/node-bunyan) is **a node.js module for logging in JSON** and **a `bunyan` CLI tool** to view those logs.
-
Logging with Bunyan basically looks like this:
+Logging with Bunyan basically looks like this:
-
$ cat hi.js
+```
+$ cat hi.js
var Logger = require('bunyan');
var log = new Logger({name: 'hello' /*, ... */});
log.info("hi %s", "paul");
-
Bunyan is log4j-like: create a Logger with a name, call log.info(...), etc. However it has no intention of reproducing much of the functionality of log4j. IMO, much of that is overkill for the types of services you'll tend to be writing with node.js.
+Bunyan is log4j-like: create a Logger with a name, call `log.info(...)`, etc. However it has no intention of reproducing much of the functionality of log4j. IMO, much of that is overkill for the types of services you'll tend to be writing with node.js.
-
Longer Bunyan Example
+## Longer Bunyan Example
-
Let's walk through a bigger example to show some interesting things in Bunyan. We'll create a very small "Hello API" server using the excellent restify library -- which we used heavily here at Joyent. (Bunyan doesn't require restify at all, you can easily use Bunyan with Express or whatever.)
+Let's walk through a bigger example to show some interesting things in Bunyan. We'll create a very small "Hello API" server using the excellent [restify](https://github.com/mcavage/node-restify) library -- which we used heavily here at [Joyent](http://joyent.com). (Bunyan doesn't require restify at all, you can easily use Bunyan with [Express](http://expressjs.com/) or whatever.)
-
You can follow along in https://github.com/trentm/hello-json-logging if you like. Note that I'm using the current HEAD of the bunyan and restify trees here, so details might change a bit. Prerequisite: a node 0.6.x installation.
+_You can follow along in [https://github.com/trentm/hello-json-logging](https://github.com/trentm/hello-json-logging) if you like. Note that I'm using the current HEAD of the bunyan and restify trees here, so details might change a bit. Prerequisite: a node 0.6.x installation._
-
git clone https://github.com/trentm/hello-json-logging.git
+```
+git clone https://github.com/trentm/hello-json-logging.git
cd hello-json-logging
make
-
+Our [server](https://github.com/trentm/hello-json-logging/blob/master/server.js) first creates a Bunyan logger:
-
var Logger = require('bunyan');
+```javascript
+var Logger = require('bunyan');
var log = new Logger({
name: 'helloapi',
streams: [
@@ -138,37 +142,40 @@ var log = new Logger({
res: restify.bunyan.serializers.response,
},
});
-
+```
-
Every Bunyan logger must have a name. Unlike log4j, this is not a hierarchical dotted namespace. It is just a name field for the log records.
+Every Bunyan logger must have a **name**. Unlike log4j, this is not a hierarchical dotted namespace. It is just a name field for the log records.
-
Every Bunyan logger has one or more streams, to which log records are written. Here we've defined two: logging at DEBUG level and above is written to stdout, and logging at TRACE and above is appended to 'hello.log'.
+Every Bunyan logger has one or more **streams**, to which log records are written. Here we've defined two: logging at DEBUG level and above is written to stdout, and logging at TRACE and above is appended to 'hello.log'.
-
Bunyan has the concept of serializers: a registry of functions that know how to convert a JavaScript object for a certain log record field to a nice JSON representation for logging. For example, here we register the Logger.stdSerializers.req function to convert HTTP Request objects (using the field name "req") to JSON. More on serializers later.
+Bunyan has the concept of **serializers**: a registry of functions that know how to convert a JavaScript object for a certain log record field to a nice JSON representation for logging. For example, here we register the `Logger.stdSerializers.req` function to convert HTTP Request objects (using the field name "req") to JSON. More on serializers later.
-
Restify Server
+### Restify Server
-
Restify 1.x and above has bunyan support baked in. You pass in your Bunyan logger like this:
+Restify 1.x and above has bunyan support baked in. You pass in your Bunyan logger like this:
-
var server = restify.createServer({
+```javascript
+var server = restify.createServer({
name: 'Hello API',
log: log // Pass our logger to restify.
});
-
+```
-
Our simple API will have a single GET /hello?name=NAME endpoint:
+Our simple API will have a single `GET /hello?name=NAME` endpoint:
-
Let's add two things to our server. First, we'll use the server.pre to hook into restify's request handling before routing where we'll log the request.
+Let's add two things to our server. First, we'll use the `server.pre` to hook into restify's request handling before routing where we'll **log the request**.
-
This is the first time we've seen this log.info style with an object as the first argument. Bunyan logging methods (log.trace, log.debug, ...) all support an optional first object argument with extra log record fields:
+This is the first time we've seen this `log.info` style with an object as the first argument. Bunyan logging methods (`log.trace`, `log.debug`, ...) all support an optional **first object argument with extra log record fields**:
-