Protowidget Preview with Object Graph Serialization
July 7th, 2006
I tend to be pretty long-winded. If you want to skip my explanation and just see the goods, download and run the attached demo rails app or browse to http://pwdemo.rcode.net/ and click on the “Server Shared State” section on the left (please be kind to this app, it is hosted on a shared account). Please note that everything here has been minimally tested on Firefox 1.0/1.5, IE 6, and Opera 9. Safari is known not to work at this time, and Firefox 1.5 sometimes shows visual oddities when pressing the ‘View Source’ button on the examples.
I wasn’t really planning to make another preview release of Protowidget until the Demo application had a fair number of examples in it, but over the past couple of days, I’ve put together a set of features that I think is pretty compelling and some of which are fairly stand-alone. As such, I’m going to throw it out there and see what people think. Be kind. This is still very alpha.
In my previous blog post I mused about the prospect of adding the ability to transfer true object graphs between JavaScript and a server-side framework (Rails in this case). Read that post for some background, but in short, I kept running into limitations with JSON, specifically as it related to transferring graphs of objects with cycles. In addition, even if I could keep my graphs cycle-free (it only takes one slipup to throw your browser into an infinite loop), it was fairly awkward to manipulate and pass the JSON objects back and forth between my application in the browser and my controllers in Rails. There are a whole host of class/object hierarchies that I would like to be able to access from either the Rails controllers or the JavaScript application in the browser. As the concept started to take form in my mind, I realized that I wanted a kind of shared object-space that straddled the browser and the server. Anything put into or manipulated in this space from either side would be accessible to the other side. By drawing a logical box around this shared space, we can control the scope of what is shared and also have some well-defined points for implementing silly things like security.
This is conceptually pretty similar to the way that Rails makes instance variables defined in the controller available to its views, forming a bounded, shared space between controller and view. The difference is that the sharing goes both ways and the “view” in my case is very long lived (and remote) whereas the controller instances come and go.
So here’s what I did.
The Rails Side
I created a plugin (called protowidget, though it’s not very Protowidget-specific) that primarily defines the following:
- BrowserController — This is a custom sub-class of ActionController::Base. If your application controller extends BrowserController, then it will be imbued with some special powers.
- JavaScriptObject — This class simulates a JavaScript object in that the set of attributes it has is open and not defined at class definition time.
- JavaScriptSerializer — Takes Ruby objects and produces a sequence of JavaScript statements necessary to create/update the graph on the browser
- YAMLDeserializer — Deserializes YAML as passed from the browser to Rails into appropriate Ruby objects. There is some extra work that needs to be done which means that we can’t just use the stock YAML deserialization support as-is.
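To make the JavaScriptObject idea concrete, here is a minimal sketch of an object whose attribute set is open at runtime. The class name OpenObject and its methods are hypothetical stand-ins for illustration; the plugin’s real JavaScriptObject may well be built differently.

```ruby
# Hypothetical stand-in for the plugin's JavaScriptObject (assumption, not the real class).
class OpenObject
  def initialize(attrs = {})
    @attrs = {}
    attrs.each { |key, value| @attrs[key.to_sym] = value }
  end

  # "obj.foo = x" defines or updates an attribute; "obj.foo" reads it.
  # Reading an attribute that was never set returns nil, much like undefined in JS.
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=") && args.length == 1
      @attrs[key.chomp("=").to_sym] = args.first
    elsif args.empty?
      @attrs[name]
    else
      super
    end
  end

  # Report setters and already-defined attributes as supported.
  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?("=") || @attrs.key?(name) || super
  end

  def attribute_names
    @attrs.keys
  end
end

person = OpenObject.new(name: "Ann")
person.age = 30
person.parent = person   # nothing stops the graph from containing a cycle
puts person.attribute_names.inspect
```

Nothing about the attributes is declared at class definition time, which is the property the serializers have to cope with.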
There is really a lot of plumbing, but the end result is that if your controller extends BrowserController, then it loses the ability to render templates as a result of its actions (except for “index” — but that’s not important in this discussion). Instead, every action is assumed to be part of an interaction with the browser through the shared object-space. When the action is executed, a @browser variable is defined on the instance. Initially it contains any objects that the browser application sent across. When the action is done, any changes made to @browser will be pushed back to the browser application (currently the transmission of just the deltas is severely limited). It’s really that simple, but there are a couple more details:
- There is the concept of types but this isn’t fully implemented yet
- Since my JavaScript programs tend to use CamelCase whereas Rails uses underscore_separation, the serialization/deserialization converts between these. Therefore, I work in CamelCase in JavaScript and with underscores in Rails. Otherwise the objects are reflected back and forth unchanged.
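The name conversion can be sketched in a few lines. These helper names are made up for illustration, and the plugin’s actual rules (for example, how it treats acronyms or leading capitals) are not described here, so treat this as one plausible reading.

```ruby
# Hypothetical helpers; the plugin's real conversion rules may differ.

# "firstName" -> "first_name"
def camel_to_underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# "first_name" -> "firstName"
def underscore_to_camel(name)
  name.gsub(/_([a-z\d])/) { $1.upcase }
end

# Convert every key of an incoming attribute hash to the Rails convention.
def underscore_keys(hash)
  hash.each_with_object({}) do |(key, value), out|
    out[camel_to_underscore(key)] = value
  end
end

puts camel_to_underscore("sharedObjectSpace")
puts underscore_to_camel("shared_object_space")
```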
To see an example, look at the example app and/or the controller app/controllers/pw_demo/serializer_controller.rb
The JavaScript Side
I created a JavaScript library (object_serializer.js) which defines a YAMLSerializer and a JSDeserializer. These complement the classes on the Ruby side. The YAMLSerializer generates very JSON-looking YAML (using the YAML inline syntax), but it contains type tags as well as anchors and references for any cycles in the object graph.
I then extended the Protowidget data model classes (protowidget_data.js) to leverage the object_serializer classes in order to invoke remote actions, transferring the shared objects in both directions as needed. Don’t look too closely at this part. There is still a ton of cruft in these classes left over from the old approach, where I was jumping through hoops to make sure that I transferred cycle-free JSON to/from the server.
The End Result
The end result can be seen by downloading and running the demo rails app or browsing to http://pwdemo.rcode.net/ and clicking on the “Server Shared State” section on the left (please be kind to this app, it is hosted on a shared account).
Next Steps
There is still a lot more I want to do with this approach. Even lacking some of these things such as being able to toss ActiveRecord instances back and forth, I think that conceptually the approach of sharing object graphs between the browser and server provides a lot of power. It’s always a step forward when we lower the bar for having a rich conversation between two disparate pieces of technology. And I’m not sure that the bar could be lowered much more than just setting variables in the respective environment and having the objects magically appear on the other side of the void.
See the attached archive for the Rails project that has everything discussed here (I’m making it all available under an MIT-style license). Alternatively, it’s in SVN at https://rcode.devguard.com/svn/protowidget/trunk/protowidget . Note that in order to run most of the tests in the project you will need to have a JVM installed, since the Rails tests invoke the Rhino JavaScript interpreter (which requires Java). For anyone who is interested, there are also the beginnings of a manual under public/doc/protowidget_ref.pdf or here.
Serialization with JavaScript
July 5th, 2006
In any new environment I try to avoid writing generic marshaling tools for as long as possible, but eventually the kludges of trying to get an object graph from point A in tool X to point B in tool Y add up to an amount that justifies creating them.
The problem that always seems to come up is that object graphs are, well, graphs. They often have circular references. A lot of times this shows up as simple backreferences to parents, but it could be anything. Real object marshaling mechanisms deal with this by remembering what has already been sent and if an object is encountered subsequent times, it will be replaced with a reference to the first instance.
Uh-oh. I used the word “real”. That must mean that I was using something that I no longer categorize as “Real Object Marshaling”. It turns out that what I was using was JSON. JSON is fine and good, but getting it to deal with true graphs is beyond its charter. Also, typical tools don’t behave very well when loops are accidentally encountered… at least browsers usually let you break out of an infinite loop if it runs for more than 2 seconds.
I also had some other gripes with JSON. Chief among them were the hacks that I had to go through in order to deserialize a JSON string into instances of typed classes instead of generic objects. All of these things made me bite the bullet and start something else.
On the server side, I’m working with Rails, but the approach should work for any environment. Basically, what I did was write a JavascriptSerializer that can traverse graphs of objects (typically JavaScriptObject — another class I created which mimics the open behavior of JS objects — an adapter for ActiveRecord instances is coming). What it produces is really tight JavaScript. The generated JavaScript implicitly takes care of the loops by storing important references for later and then pulling them back when duplicates are encountered. It can successfully preserve the identity of any duplicate Object or Array, including those that directly reference themselves. The Rails side wasn’t too hard, and since it produces JavaScript, there is very little extra JavaScript coding that needs to be done in order to unmarshal a graph.
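One way such a serializer could work is sketched below: every container is first created empty in a numbered slot, and properties are assigned afterwards, so a duplicate or self-reference just points at an existing slot. The method names, the `r` array, and the exact shape of the output are assumptions for illustration, not the plugin’s actual generated code.

```ruby
require "json"

# Hypothetical emitter. Returns the JavaScript expression for `node`,
# appending any statements needed to build it to `lines`.
def emit(node, ids, lines)
  # Scalars are written as JSON literals (strings, numbers, true/false/null).
  return node.to_json unless node.is_a?(Hash) || node.is_a?(Array)

  # A container seen before is referred to by its slot, preserving identity.
  slot = ids[node.object_id]
  return "r[#{slot}]" if slot

  slot = ids.size
  ids[node.object_id] = slot
  literal = node.is_a?(Hash) ? "{}" : "[]"
  lines << "r[#{slot}] = #{literal};"

  if node.is_a?(Hash)
    node.each do |key, value|
      expr = emit(value, ids, lines)
      lines << "r[#{slot}][#{key.to_s.to_json}] = #{expr};"
    end
  else
    node.each_with_index do |value, index|
      expr = emit(value, ids, lines)
      lines << "r[#{slot}][#{index}] = #{expr};"
    end
  end
  "r[#{slot}]"
end

def to_javascript(root)
  lines = ["var r = [];"]
  emit(root, {}, lines)
  lines.join("\n")
end

parent = { "name" => "root" }
child  = { "name" => "leaf", "parent" => parent }
parent["children"] = [child]

js = to_javascript(parent)
puts js
```

Evaluating the generated statements in the browser would leave the root object in `r[0]`, with the child’s parent property pointing at that same object rather than a copy.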
The JavaScript side is a bit trickier and I’m still working on it. The first problem is that JavaScript does not have a native construct for storing a handle to an Object in a hash or something. In Ruby I use the object_id, and in Java I would use the IdentityHashMap. Without some approach like this, it becomes very computationally expensive to detect cycles in the graph (basically having to resort to a list of all known objects where each check has to iterate over the entire list looking for a match). Since I’m working with my own domain classes, and since I can make any rule I want to, I “solved” this problem by dynamically computing and attaching a unique object id to each instance as it is encountered by the serialization logic. A look-aside object is used to map these ids to the instances.
The second problem is how to represent the object graph so that it can be efficiently reconstituted on the server. We can’t take the approach that we did before and generate Ruby code, for example. I was ok generating JavaScript and sending it to the browser because we were going from a trusted environment to an untrusted one. Going the other way and having the browser send back code that will execute on the server is a big no-no. Anyway, I wanted to be reasonably cross-platform, necessitating a neutral representational format that has the concept of graph cycles built into it. If the format had a native idea regarding object typing, that would be a boon as well.
I settled on YAML which has both of these things and is very easy to digest from Ruby. I hooked into the YAML parser directly instead of using its default mode where the document is converted directly to Ruby types. This gives me a lot of control over how things are instantiated (the reasons for needing this I’ll cover later).
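To illustrate what working at the parser level exposes, here is a small sketch using the Psych parser that ships with current Ruby (an assumption on my part; it is not the YAML library that was available when this was written). The sample document, its `!person` tag, and its field names are made up.

```ruby
require "yaml"

# Hand-written sample in YAML's inline (flow) style: an anchor (&a),
# a type tag (!person), and an alias (*a) that points back at the root.
source = "--- &a !person {name: Ann, self: *a}\n"

# YAML.parse returns the node tree without converting it to Ruby objects,
# so anchors, tags, and aliases are all still visible.
document = YAML.parse(source)
root = document.root

puts "anchor: #{root.anchor}"
puts "tag: #{root.tag}"

# A mapping node stores its children as alternating key and value nodes.
root.children.each_slice(2) do |key, value|
  if value.is_a?(Psych::Nodes::Alias)
    puts "#{key.value}: alias to #{value.anchor}"
  else
    puts "#{key.value}: scalar #{value.value}"
  end
end
```

With the tree in hand, the deserializer decides for itself which class to instantiate for each tag and how to resolve each alias, rather than accepting the default conversion.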
The end result (which is not quite done yet) is a Rails Controller subclass that exposes a @browser attribute. When an action is invoked on the controller, the @browser collection will be populated with the shared object references from the client and any changes to it will be sent to the client when the action completes. This type of approach bypasses the usual Rails views in many cases when the client encapsulates a long-running application. By providing this shared whiteboard of sorts where object graphs are automatically kept up to date between client and server, a very robust model for communication between the browser-resident views and the server-resident controllers can be achieved.
To be continued…