A Note on JSON Formatting

If you want rsyslog to reformat syslog data into JSON before sending it to an output, you will need to use a template.  The template that comes up most often for this purpose is this one:

template( name = "json_output" type = "list" ) {
    constant(value = "{")
    constant(value = "\"timestamp\":\"")
    property(name = "timereported" dateFormat = "rfc3339")
    constant(value = "\",\"message\":\"")
    property(name = "msg")
    constant(value = "\",\"host\":\"")
    property(name = "hostname")
    constant(value = "\",\"severity\":\"")
    property(name = "syslogseverity-text")
    constant(value = "\",\"facility\":\"")
    property(name = "syslogfacility-text")
    constant(value = "\",\"syslog-tag\":\"")
    property(name = "syslogtag")
    constant(value = "\"}\n")
}

(If you’re not up on rsyslog templates, go through the doc here.)

It basically turns a standard syslog message like this:

2017-02-01T19:30:12+00:00 localhost filemonitor: Directory C:\Users\Bill is at 90%  

Into:

{
    "timestamp":"2017-02-01T19:30:12+00:00",
    "message":"Directory C:\Users\Bill is at 90%",
    "host":"localhost",
    "severity":"notice",
    "facility":"kern",
    "syslog-tag":"filemonitor"
}

However, when used with omkafka this template has one flaw: it doesn’t do anything about backslashes or control characters in the message field.  Omkafka will accept the above JSON object and forward it to a Kafka topic, and Kafka will accept it, store it and serve it up to consumers.  The problem arises when a consumer deserializes the Kafka message (which is a byte stream) back into a JSON object.  At that point most JSON parsers, including Jackson, the de facto standard Java JSON library, will croak with an exception like:

org.apache.kafka.common.errors.SerializationException: com.fasterxml.jackson.core.JsonParseException: Unrecognized character escape 'U' (code 120)

The problem portion is C:\Users: the JSON parser interprets the \ as the start of an escape sequence and expects the next character to complete it.  The JSON standard supports only a limited set of escape sequences (\", \\, \/, \b, \f, \n, \r, \t and \u followed by four hex digits), and \U is not one of them.  Any Java developer will tell you the problem is the source data, not the parser.  Fix your source data and the problem goes away.
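To see the failure without a Kafka consumer in the loop, here is a minimal sketch that feeds the unescaped output to a JSON parser. It uses Python's standard json module as a stand-in for Jackson (an assumption made for brevity; the error text differs, but the rejection is the same), and the helper name try_parse is made up for this example.

```python
import json

# JSON text as the unescaped template would emit it: the backslashes in the
# Windows path are single, so \U and \B look like (invalid) escape sequences.
raw = r'{"message":"Directory C:\Users\Bill is at 90%","host":"localhost"}'


def try_parse(text):
    """Return (parsed_object, None) on success or (None, error_text) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


parsed, error = try_parse(raw)
print("parsed:", parsed)
print("error:", error)
```

The parser gives up at the first backslash it cannot interpret, so nothing after that point in the message is recovered.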

Your first reaction may be to write a complex regex to find and replace escape characters as rsyslog receives them, but that’s error-prone and unpredictable given the wide variety of messages rsyslog could receive, not to mention CPU intensive.  You’re already reformatting messages on their way out of rsyslog with a template, so it is better to add the logic there.

Fortunately, rsyslog provides several options for working with JSON in templates, namely the format parameter of a property.  Add the format parameter to the msg property portion of the template, and set it to json:

property(name = "msg" format = "json")

Then add a parameter called controlcharacters to the same property; this setting tells the template how to deal with control characters.  Your options are escape, which will prefix a \ to all escape characters; space, which replaces them with a blank space; or drop, which drops them from the message altogether.

property(name = "msg" format = "json" controlcharacters = "escape")

Now the templated message will output:

{
    "timestamp":"2017-02-01T19:30:12+00:00",
    "message":"Directory C:\\Users\\Bill is at 90%",
    "host":"localhost",
    "severity":"notice",
    "facility":"kern",
    "syslog-tag":"filemonitor"
}
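As a quick check that the escaped form is what a consumer needs, the sketch below parses a trimmed-down version of the output above. Again, Python's json module is standing in for the Java consumer, which is an assumption of this example rather than part of the original setup.

```python
import json

# Escaped output: each backslash in the path is doubled, which is the JSON
# escape sequence for a single literal backslash.
escaped = r'{"message":"Directory C:\\Users\\Bill is at 90%","host":"localhost"}'

record = json.loads(escaped)
message = record["message"]

# The parser turns each \\ back into one backslash, restoring the original text.
print(message)  # prints: Directory C:\Users\Bill is at 90%
```

The doubled backslashes exist only on the wire; once parsed, the consumer sees the original message text.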

Another option for the format parameter is jsonf, which formats the property as a complete JSON field (key and value), whereas json formats only the value.  This format will also handle control characters properly via the controlcharacters parameter; however, it also escapes all the double-quotes.  E.g.:

{
    \"timestamp\":\"2017-02-01T19:30:12+00:00\",
    \"message\":\"Directory C:\\Users\\Bill is at 90%\",
    \"host\":\"localhost\",
    \"severity\":\"notice\",
    \"facility\":\"kern\",
    \"syslog-tag\":\"filemonitor\"
}

I’m not sure if that’s a remnant of older JSON standards, but the current JSON standard only allows \" inside a string value, not in place of the double-quotes that delimit keys and values.  Some JSON parsing libraries may be able to parse this, but the Jackson libraries do not.
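To make the distinction concrete, this small sketch compares an escaped quote inside a string value with escaped quotes used as delimiters. It uses Python's json module rather than Jackson (an assumption for illustration), and the sample strings and the is_valid_json helper are made up for the example.

```python
import json

# \" inside a string value is a legal JSON escape sequence.
inside = r'{"message":"he said \"hi\""}'

# \" in place of the quotes that delimit keys and values is not valid JSON.
outside = r'{\"host\":\"localhost\"}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


inside_ok = is_valid_json(inside)
outside_ok = is_valid_json(outside)
print(inside_ok, outside_ok)  # prints: True False
```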
