Thursday, August 6, 2009

JSON

ESR recently put up a paper on his redesign of the request-response protocol for gpsd. The gpsd project is interesting in its own right, but what interests me is his use of JSON to serialize command and response data to the back end of the daemon.

Like many hackers, I'm indifferent to Java. I've never had the urge to learn it and I've been fortunate that my employment has never required its use. I have noticed, however, that Java-related things always seem to start with a capital J, so when I saw references to JSON I always assumed it was yet another Java something or other and passed it by without further investigation. Like most forms of prejudice, of course, this was merely ignorance. While it's true that the J does begin with Java, JSON actually stands for JavaScript Object Notation, and it's a really neat way of serializing complex data for transfer between applications.

JSON has two types of data structures:

  • arrays, in which the values are comma-separated and enclosed in square brackets, and
  • objects, which are serialized representations of what Lispers call hash tables or alists and Pythonistas call dictionaries. An object is a comma-separated list of name/value pairs enclosed in curly braces. The name/value pairs have the form "name" : value.

These structures are fully recursive and each can contain instances of itself or the other. Names are always strings, but object and array values can be strings, numbers, objects, arrays, true, false, or null. The exact syntax, complete with railroad diagrams, is given on the JSON.org site linked above; a formal description is given in RFC 4627. As an example, here is a hypothetical object representing a family:

{"father":{"name":"John Smith", "age":45, "employer":"YoYodyne, inc."},
"mother":{"name":"Wilma Smith", "age":42},
"children":[{"name":"William Smith", "age":15},
         {"name":"Sally Smith","age":17}]}

Notice that white space can be inserted between any pair of tokens to make the object more human-readable, but that it is not required. Also notice that unused fields in objects (but not arrays) can be omitted as the employer field is in all but the father's object.
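As a rough sketch of what this looks like from the consuming side, here is the family object parsed with Python's standard json module, once in compact form and once spread out with extra white space. The variable names are made up for illustration, and Python is just one of many languages with a JSON parser; nothing here is specific to gpsd.

```python
import json

# The family object from above in compact form (no optional white space).
compact = (
    '{"father":{"name":"John Smith","age":45,"employer":"YoYodyne, inc."},'
    '"mother":{"name":"Wilma Smith","age":42},'
    '"children":[{"name":"William Smith","age":15},'
    '{"name":"Sally Smith","age":17}]}'
)

# The same object with white space added between tokens for readability.
spaced = """
{
  "father"   : { "name" : "John Smith", "age" : 45, "employer" : "YoYodyne, inc." },
  "mother"   : { "name" : "Wilma Smith", "age" : 42 },
  "children" : [ { "name" : "William Smith", "age" : 15 },
                 { "name" : "Sally Smith",   "age" : 17 } ]
}
"""

family = json.loads(compact)
same_family = json.loads(spaced)

# JSON objects become dicts and JSON arrays become lists.
father = family["father"]
mother = family["mother"]
children = family["children"]

# An omitted field is simply absent; dict.get returns None for it.
mother_employer = mother.get("employer")
child_names = [child["name"] for child in children]

print(family == same_family)
print(father["employer"], mother_employer, child_names)
```

Both texts parse to the same value, and the consumer decides how to treat a missing field such as the mother's employer.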

What I like about this is that it's fundamentally Lispy, which shouldn't be a surprise given JavaScript's Scheme lineage. For example, the above JSON family object rendered in a typical Lisp way is shown below. Notice how similar it is to the JSON rendering.

'((father . ((name . "John Smith") (age . 45) (employer . "YoYodyne, inc.")))
(mother . ((name . "Wilma Smith") (age . 42)))
(children . #(((name . "William Smith") (age . 15))
            ((name . "Sally Smith") (age . 17)))))

The nice thing about this representation is that the Lisp/Scheme reader will parse it directly. True, the name/value pairs are represented as alists, but for many applications that's the appropriate choice. Common Lisp hash tables don't have an external representation that is directly readable, but we could certainly use alists as such a representation at the cost of spinning down the list and hashing the names.
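To make the correspondence concrete, here is a minimal sketch in Python that walks a parsed JSON value and prints it in the alist-and-vector style used above. The helper name to_sexp is hypothetical, and several choices are assumptions rather than any standard mapping: object names are printed as bare symbols (so they must be symbol-like), true and false become #t and #f, and null becomes the empty list.

```python
import json

def to_sexp(value):
    """Render a parsed JSON value as Lisp-style text (hypothetical helper)."""
    if isinstance(value, dict):
        # Object -> alist of (name . value) pairs; names are assumed to be
        # simple enough to print as bare symbols.
        pairs = " ".join(f"({name} . {to_sexp(v)})" for name, v in value.items())
        return f"({pairs})"
    if isinstance(value, list):
        # Array -> vector literal.
        return "#(" + " ".join(to_sexp(v) for v in value) + ")"
    if isinstance(value, bool):
        # Checked before numbers because bool is a subclass of int in Python.
        return "#t" if value else "#f"
    if value is None:
        return "()"  # assumption: null rendered as the empty list
    if isinstance(value, str):
        # Escape backslashes and double quotes, then wrap in double quotes.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return str(value)  # numbers

mother = json.loads('{"name":"Wilma Smith", "age":42}')
print(to_sexp(mother))  # prints ((name . "Wilma Smith") (age . 42))
```

Going the other way, from the alist to a hash table, is the "spinning down the list" step mentioned above: one pass over the pairs, hashing each name.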

None of this is to say that we should eschew JSON in favor of s-expressions, especially when doing data-interchange between applications written in non-Lisp languages or even when interchanging data between a Lisp application and, say, a C application. Rather, my point is that Lispers should feel familiar and comfortable with it.

So what's the takeaway? First, JSON is a simple and ubiquitous data-interchange format (the JSON.org page has links to parsers for dozens of languages, including CL and Scheme) that can make interprocess communication much easier when complex data is involved. Second, I think it nicely illustrates the power of Lisp ideas and data structures, even though JSON itself is not Lisp.
