Saturday, January 17, 2009

There are Three -- and Only Three -- Ways to Process Json!

With the proliferation of Json processing packages on the Java platform, there seem to be n+1 ways to slice and dice Json. And each library seems to consider its way (however obscure) to be the One True Way to do things, without even acknowledging that there might be other ways, or bothering to offer alternative methods itself. This is particularly odd for people with an xml background, who are used to standardization on a limited set of APIs, even though xml as a data format is much more versatile (and complicated) than Json. So what's with this nonsense about myriad ways to slice and dice a dead simple data format?

I'll let you in on a secret: there is not just one sensible way to process Json. There also aren't dozens of sensible alternatives. There are exactly 3 methods to this madness.

  1. Iteration: Iterating over an event (or token) stream
  2. Data Binding: Binding Json data into Objects (of your favorite language)
  3. Tree Traversal: Building a tree structure (from Json) and traversing it using suitable methods

To give a better idea of what these mean, let us consider the standard Java xml APIs for these canonical processing methods:

  1. SAX and Stax. These are APIs that essentially allow iterating over events: with SAX it's the parser that spams you with the events, whereas with Stax you traverse them at your own leisurely pace. Push versus Pull, but an event stream all the same. Ditto regarding how events are expressed: as callbacks (SAX), event objects (Stax Event API) or just logical cursor state (Stax Cursor API), these are just variants of the same approach.
    1. (*) (It is also possible to build more elaborate and convenient facades around this approach: witness StaxMate ("your Stax parser's perfect mate") with its smooth mellow (and yet surprisingly sophisticated!) approach to efficient xml processing -- but I digress!)
  2. JAXB is the standard for data binding; and while there are n+1 alternatives (JiBX, XMLBeans, Castor etc. etc. etc.), they all do the same thing: convert (Java) Objects to xml and vice versa, some conveniently and efficiently, others less so.
  3. DOM is the "most standard" API that defines a tree structure and machinery around it; but as with data binding, there are multiple (better) alternatives as well (XOM, JDOM, dom4j). And you traverse the tree either node-by-node, or using XPath.
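
To make the distinction a bit more concrete, here is a minimal sketch of the push, pull and tree styles using only the xml APIs bundled with the JDK. The sample document, the class name and the method names are made up for illustration. Data binding is left out, since JAXB no longer ships with recent JDKs (it was removed in Java 11) and would need an extra dependency.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class XmlProcessingStyles {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<order id=\"7\"><item>tea</item><item>milk</item></order>";

    // Push (SAX): the parser drives, and calls us back for every start tag.
    static List<String> elementNamesWithSax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    names.add(qName);
                }
            });
        return names;
    }

    // Pull (Stax Cursor API): we drive, and ask for the next event when ready.
    static List<String> elementNamesWithStax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    // Tree (DOM): build the whole tree first, then query it with XPath.
    static String firstItemWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath()
            .evaluate("/order/item[1]", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNamesWithSax(XML));  // [order, item, item]
        System.out.println(elementNamesWithStax(XML)); // [order, item, item]
        System.out.println(firstItemWithDom(XML));     // tea
    }
}
```

Both streaming variants see the same sequence of events; the only difference is who is in control of the loop. The tree variant pays for building the whole document up front, but in return allows random access.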

But these are for xml. How does that relate to Json? Well, it turns out this is one area where the format doesn't matter all that much: all three approaches are valid, useful and relevant with Json as well. And probably with other structured data formats as well. But I cannot think of a fourth one that would immediately make sense (feel free to prove my ignorance by pointing out something!).

And here is the good thing: Your Favorite Json Processor (quick! say Jackson!) implements all three of said methods. Yay!

  1. Core package (jackson-core) contains JsonParser and JsonGenerator, which allow iterating over tokens
  2. ObjectMapper implements data binding functionality: Objects in, Json out, Json in, Objects out.
  3. TreeMapper can grow trees (expressed as JsonNodes) from Json, and print Json out of a JsonNode (and its children).

Hence, I have proven that the number is Three; that Three is a Good Number; and that Jackson Does Three. So Jackson is All Good. QED.

About me

  • I am known as Cowtowncoder