Scaling and Web 2.0
January 9th, 2007
Just found these somewhat aged articles from Tim O’Reilly:
Web 2.0 and Databases: Part 1
I couldn’t find later parts.
Embedded SVG
August 30th, 2006
http://pwdemo.rcode.net/svg_tester
I know I’m probably late to the game on this one, but I had not realized until recently how well SVG was supported in a lot of browsers. I actually didn’t realize it until I needed it for a visually rich application prototype I’m working on. I really didn’t want to use Flash, so I checked out SVG again. To my surprise, it is fully integrated into Firefox, Opera 9 and some (pre-release?) versions of Safari.
So I started with the SVG example from Mozilla.org. It draws three circles with primary colors that are semi-transparent:
<svg xmlns="http://www.w3.org/2000/svg"
     version="1.1" width="500" height="500"
     baseProfile="full">
  <g fill-opacity="0.7" stroke="black" stroke-width="0.1cm">
    <circle cx="6cm" cy="2cm" r="100" fill="red"
            transform="translate(0,50)" />
    <circle cx="6cm" cy="2cm" r="100" fill="blue"
            transform="translate(70,150)" />
    <circle cx="6cm" cy="2cm" r="100" fill="green"
            transform="translate(-70,150)" />
  </g>
</svg>
I thought a modest goal would be to adapt Protowidget so that it could attach widgets to some SVG elements. I could then use the property bindings to make the circles move around. It’s dumb and pretty simple but forms the basis for creating more advanced SVG widgets.
This turned out to be harder than I had expected. The main culprit was that Protowidget wasn’t fully compatible with real XHTML. There’s the normal stuff: no document.write (it was used to bootstrap the system by writing script elements to the header), createElement must use namespaces, etc. It took a little longer than necessary to fix because I still wanted to preserve compatibility with namespaced xml documents and IE when the advanced features were not in use.
So I fixed all of that and added real namespace support so that the Protowidget attributes are now part of their own namespace if using an XHTML document. If working in normal HTML, the pw.* syntax can still be used. There’s even a hack, so that if you use the “pw:” prefix for your namespace, the parser will be able to work around IE’s deficiencies. I then added an SvgWidget to mirror the HTML DOMWidget (which in hindsight should have been named HTMLWidget). It’s basically a light version of the DOMWidget base class which leaves off the CSS class and style processing that does not apply (at least in the same way) to SVG elements.
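To make the namespace point concrete, here is a rough sketch of what creating an SVG element looks like once plain createElement is off the table. This is not Protowidget code: `createSvgElement` is a made-up helper name, and the only real APIs it relies on are the DOM's `createElementNS` and `setAttribute`.

```javascript
// Hypothetical helper for illustration; not part of Protowidget.
const SVG_NS = "http://www.w3.org/2000/svg";

function createSvgElement(doc, tagName, attrs) {
  // In a real XHTML document, createElement("circle") would not give us
  // an element in the SVG namespace, so the namespace is passed explicitly.
  const el = doc.createElementNS(SVG_NS, tagName);
  Object.keys(attrs).forEach((name) => {
    el.setAttribute(name, String(attrs[name]));
  });
  return el;
}
```

In a browser you would pass `document` as the first argument and append the returned element to an existing `svg` element.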
The end result was to change the example SVG to include some Protowidget attributes:
<svg xmlns="http://www.w3.org/2000/svg"
     version="1.1" width="500" height="500"
     baseProfile="full">
  <g fill-opacity="0.7" stroke="black" stroke-width="0.1cm">
    <circle cx="6cm" cy="2cm" r="100" fill="red"
            transform="translate(0,50)"
            pw:type='Svg.SvgWidget'
            pw:element.r='#{`Jitter1`}'/>
    <circle cx="6cm" cy="2cm" r="100" fill="blue"
            transform="translate(70,150)"
            pw:type='Svg.SvgWidget'
            pw:element.r='#{`Jitter2`}'/>
    <circle cx="6cm" cy="2cm" r="100" fill="green"
            transform="translate(-70,150)"
            pw:type='Svg.SvgWidget'
            pw:element.r='#{`Jitter3`}'/>
  </g>
</svg>
What this does is declare the three circle elements to be of Protowidget type “Svg.SvgWidget”. As I mentioned before, this is intended to be a base class for more advanced SVG widgets, but it provides some features that make it useful on its own. For one, you can establish bindings between Protowidget models and attributes of the elements. This is what is done with the pw:element.r attributes. They bind each circle’s radius to the value of “Jitter1”, “Jitter2” and “Jitter3” respectively.
What makes this go is a little chunk of code in the header that sets up a timer to set Jitter1, Jitter2 and Jitter3 to random values:
PwLoader.inlineExecute(function() {
  Protowidget.beforeStartup(updateJitter);
});

// Set each bound model value to a random radius between 100 and 150,
// then schedule the next update 100ms later.
function updateJitter() {
  Protowidget.RootWidget.setAttribute('Jitter1', 100 + Math.random() * 50);
  Protowidget.RootWidget.setAttribute('Jitter2', 100 + Math.random() * 50);
  Protowidget.RootWidget.setAttribute('Jitter3', 100 + Math.random() * 50);
  setTimeout(updateJitter, 100);
}
(The couple of lines at the top are necessary for inline scripts that act outside of the module system. They ensure that the script is executed at the proper time in the startup sequence.)
Pretty neat, huh!
Here’s the URL again: http://pwdemo.rcode.net/svg_tester
I’ve tested it with Firefox 1.5 and Opera 9. It should work on Mozilla builds with SVG enabled.
For anyone who’s interested, you can get all of this from anonymous svn: https://dev.rcode.net/svn/protowidget/trunk/protowidget
You can also visit the Protowidget Wiki.
IE7 Beta 3 JavaScript DOM Speed
August 15th, 2006
For the past several months I don’t think I ever had a positive thing to say in the same sentence as “Internet Explorer”. Well, that has changed now. I’ve been delaying pulling down the IE 7 betas for fear of how difficult my life would become in trying to adapt my JavaScript solutions to the new browser.
I have to say that I was pleasantly surprised. I mean not surprised enough to switch, but surprised enough to cheer for the launch of IE7 as a high priority update so that I can actually dream of the day when IE 6 is no more. I don’t have any solid numbers for exactly WHAT is faster, but the whole thing is a LOT snappier. Total operations that I had timed on IE6 to take 7-10 seconds are now taking 300-400ms, which is on par with Firefox and Opera. These operations consist of very heavy DOM manipulation and a good deal of JavaScript parsing.
Maybe it’s time to take some of those “IE SUCKS” comments out of the body of my if statements that have to do something different for IE6 or subject the user to interminable delays.
So, I am pleasantly surprised, and I haven’t been pleasant or surprised about anything out of Redmond for quite some time!
Somewhat ironically, I did notice that none of the CSS glitches that plagued the site I was testing on IE seem to have been corrected by the upgrade. And this after months of hearing that CSS fixes were the priority and that about all we could expect for JavaScript enhancements was the elimination of the dreaded closure memory leaks. Oh well… CSS I can fix. JavaScript that runs 10-20 times slower on IE than anything else is another matter.
The Effects of JavaScript Compression
August 6th, 2006
I’m getting ready to take my first Protowidget application to production, and it’s time to address one of those things that sometimes makes me wake up at night in a cold sweat: I’m writing all of this JavaScript code and it’s getting huge. What impact is that going to have on perceived site performance when run over slower links?
Protowidget depends on Prototype; there’s 54KB of JavaScript source right off the top. Protowidget plus the logger adds another 174KB or so. That leaves a total of 228KB of JavaScript to shove across the wire.
Protowidget is divided into a number of modules that are dynamically loaded as needed. In a dev environment, this works fine, but the overhead of having the browser run out to the server a dozen times for JavaScript source files can quickly outstrip the cost of having the browser go out once to fetch all of the core modules combined into one big file.
So with this information in hand, I set out to optimize the problem. My plan of attack had three prongs:
- Pack prototype + logger + all of the core modules into one big JavaScript bootstrap file (the loader stub is still separate for now - it was modified to try loading this bootstrap module before dynamically loading anything else)
- Run JSMin on the bootstrap file and the loader
- Generate gzip pre-compressed versions of the files that mod_gzip or mod_deflate can serve to HTTP 1.1 browsers
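The first and third prongs can be sketched in a few lines. This is only an illustration under assumptions: it uses Node.js and its built-in `fs` and `zlib` modules (my choice for the sketch, not what the original build used), the function names are made up, and the JSMin step is left out entirely.

```javascript
// Sketch for Node.js; all function names here are hypothetical.
const fs = require("fs");
const zlib = require("zlib");

// Join several source strings into one bootstrap script.
// The ";\n" separator guards against files that lack a trailing semicolon.
function concatSources(sources) {
  return sources.join(";\n");
}

// Produce the bytes for a pre-compressed ".gz" sibling file.
function gzipSource(source) {
  return zlib.gzipSync(Buffer.from(source, "utf8"), { level: 9 });
}

// Read the given files, then write <outPath> and <outPath>.gz side by side.
function buildBootstrap(inputPaths, outPath) {
  const combined = concatSources(
    inputPaths.map((p) => fs.readFileSync(p, "utf8"))
  );
  fs.writeFileSync(outPath, combined);
  fs.writeFileSync(outPath + ".gz", gzipSource(combined));
  return combined.length;
}
```

Writing the .gz file next to the plain one means the web server can hand out the compressed copy without compressing on every request.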
The results were pretty astounding. Here is a partial directory listing after the procedure:
-rw-r--r-- 1 pactimo pactimo   8201 Aug 6 09:06 protowidget.js
-rw-r--r-- 1 pactimo pactimo   1461 Aug 6 09:27 protowidget.js.gz
-rw-r--r-- 1 pactimo pactimo 136890 Aug 6 09:27 protowidget_bootstrap.js
-rw-r--r-- 1 pactimo pactimo  33599 Aug 6 09:27 protowidget_bootstrap.js.gz
-rw-r--r-- 1 pactimo pactimo 224963 Aug 6 09:27 protowidget_bootstrap_full.js
-rw-r--r-- 1 pactimo pactimo   4492 Aug 6 09:27 protowidget_minify.js
Here is the legend of what’s what:
- protowidget.js - Un-minified Protowidget loader (sets up the module system and dynamically pulls in prototype + logger + core modules)
- protowidget_minify.js - Minified version of the loader
- protowidget.js.gz - Minified and GZIP compressed version of the loader
- protowidget_bootstrap_full.js - Concatenated source file of prototype + logger + core modules
- protowidget_bootstrap.js - Minified version of prototype + logger + core modules
- protowidget_bootstrap.js.gz - Minified and GZIP compressed version of prototype + logger + core modules
You can correlate the numbers from the directory listing, but here they are in brief:
- JSMin reduced the aggregate bootstrap file by 39% (from 220KB to 134KB)
- GZIP reduced the minified aggregate bootstrap file by a further 75% (from 134KB to 33KB)
- For browsers that can accept GZIP compressed content, this is a total savings of 85% (the original version is 6.7 TIMES larger)
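If you want to recheck those percentages, they fall straight out of the byte counts in the directory listing. A tiny sketch (`percentSaved` is just an illustrative helper name):

```javascript
// Byte counts copied from the directory listing above.
const sizes = {
  full: 224963,     // protowidget_bootstrap_full.js
  minified: 136890, // protowidget_bootstrap.js
  gzipped: 33599    // protowidget_bootstrap.js.gz
};

// Percentage saved going from `before` bytes to `after` bytes, rounded.
function percentSaved(before, after) {
  return Math.round((1 - after / before) * 100);
}

const minifySavings = percentSaved(sizes.full, sizes.minified);  // JSMin alone
const gzipSavings = percentSaved(sizes.minified, sizes.gzipped); // gzip on top of JSMin
const totalSavings = percentSaved(sizes.full, sizes.gzipped);    // both combined
const sizeRatio = (sizes.full / sizes.gzipped).toFixed(1);       // original vs. final size
```

These come out to 39, 75 and 85 percent, with a size ratio of 6.7, matching the figures in the list.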
The loader file, which remained separate, compressed down with similar results. Overall, this means that the total transfer required to load Protowidget dropped from 228KB down to 34KB for browsers capable of receiving GZIP encodings. Further, the individual number of files to load dropped from 12 to 2. This optimization has reduced the time to load from scratch for some parts of the app from a 10+ second ordeal down to 1 or 2 seconds. Even under dialup a 34KB download is reasonable, especially considering that it is cached and serves a very long-lived part of the application.
One thing to note is that while all modern browsers CAN support GZIP encodings, not all are configured to do so. In particular, I have observed that some corporate installs of IE disable HTTP 1.1 through proxies, thus eliminating GZIP as well. From there, it’s anyone’s guess as to whether the proxy itself requests resources via HTTP 1.1 and can accept the GZIP encoding.
In conclusion, it’s nice to know that only about every 6-7th character I type in a JavaScript source file actually contributes to the total transfer cost. There is still a cost on the browser for pawing through all of that source, though, so the need to write concise JavaScript is still as present as ever.
Protowidget Preview with Object Graph Serialization
July 7th, 2006
I tend to be pretty long-winded. If you want to skip my explanation and just see the goods, download and run the attached demo rails app or browse to http://pwdemo.rcode.net/ and click on the “Server Shared State” section on the left (please be kind to this app, it is hosted on a shared account). Please note that everything here has been minimally tested on Firefox 1.0/1.5, IE 6, and Opera 9. Safari is known to not work at this time and there are some visual oddities sometimes in Firefox 1.5 when pressing the ‘View Source’ button on the examples.
I wasn’t really planning to make another preview release of Protowidget until the Demo application had a fair number of examples in it, but over the past couple of days, I’ve put together a set of features that I think is pretty compelling and some of which are fairly stand-alone. As such, I’m going to throw it out there and see what people think. Be kind. This is still very alpha.
In my previous blog post I mused about the prospect of adding the ability to transfer true object graphs between JavaScript and a server side framework (Rails in this case). Read that post for some background, but in short, I kept running into limitations with JSON, specifically as it related to transferring graphs of objects with cycles. In addition, even if I could keep my graphs cycle-free (it only takes one slipup to throw your browser into an infinite loop), it was fairly awkward to manipulate and pass the JSON objects back and forth between my application in the browser and my controllers in Rails. There are a whole host of class/object hierarchies that I would like to be able to access from either the Rails controllers or the JavaScript application in the browser. As the concept started to take form in my mind, I realized that I wanted a kind of shared object-space that straddled the browser and the server. Anything put into or manipulated in this space from either side would be accessible to the other side. By drawing a logical box around this shared space, we can control the scope of what is shared and also have some well defined points for implementing silly things like security.
This is conceptually pretty similar to the way that Rails makes instance variables defined in the controller available to its views, forming a bounded, shared space between controller and view. The difference is that the sharing goes both ways and the “view” in my case is very long lived (and remote) whereas the controller instances come and go.
So here’s what I did.
The Rails Side
I created a plugin (called protowidget, though it’s not very Protowidget specific) that primarily defines the following:
- BrowserController — This is a custom sub-class of ActionController::Base. If your application controller extends BrowserController, then it will be imbued with some special powers.
- JavaScriptObject — This class simulates a JavaScript object in that the set of attributes it has is open and not defined at class definition time.
- JavaScriptSerializer — Takes Ruby objects and produces a sequence of JavaScript statements necessary to create/update the graph on the browser
- YAMLDeserializer — Deserializes YAML as passed from the browser to Rails into appropriate Ruby objects. There is some extra work that needs to be done which means that we can’t just use the stock YAML deserialization support as-is.
There is really a lot of plumbing, but the end result is that if your controller extends BrowserController, then it loses the ability to render templates as a result of its actions (except for “index” — but that’s not important in this discussion). Instead, every action is assumed to be part of an interaction with the browser through the shared object-space. When the action is executed, a @browser variable is defined on the instance. Initially it contains any objects that the browser application sent across. When the action is done, any changes made to @browser will be pushed back to the browser application (currently the transmission of just the deltas is severely limited). It’s really that simple, but there are a couple more details:
- There is the concept of types but this isn’t fully implemented yet
- Since my JavaScript programs tend to use CamelCase whereas Rails uses underscore_separation, the serialization/deserialization converts between these. Therefore, I work in CamelCase in JavaScript and with underscores in Rails. Otherwise the objects are reflected back and forth unchanged.
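The name conversion in that last point is simple enough to sketch. The helper names below are made up, and I'm assuming property names that start with a lowercase letter (firstName rather than FirstName); the plugin's actual rules may well differ.

```javascript
// Hypothetical helpers for illustration; not taken from the plugin.

// "orderLineItem" -> "order_line_item"
function camelToUnderscore(name) {
  return name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// "order_line_item" -> "orderLineItem"
function underscoreToCamel(name) {
  return name.replace(/_([a-z0-9])/g, (match, ch) => ch.toUpperCase());
}
```

For names made of simple lowercase-initial words the two functions are inverses of each other, which is what lets objects be reflected back and forth unchanged. Runs of capitals such as "XMLParser" are not handled by this sketch.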
To see an example, look at the example app and/or the controller app/controllers/pw_demo/serializer_controller.rb
The JavaScript Side
I created a JavaScript library (object_serializer.js) which defines a YAMLSerializer and a JSDeserializer. These complement the classes on the Ruby side. The YAMLSerializer generates very JSON looking YAML (using the YAML inline syntax) but it contains type tags as well as anchors and references for any cycles in the object graph.
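To give a feel for what JSON-looking YAML with anchors means, here is a minimal sketch. It is not the code from object_serializer.js: `toInlineYaml` is a made-up name, it puts an anchor on every object and array rather than only on shared ones, it leaves out type tags altogether, and it uses a modern `Map` for bookkeeping.

```javascript
// Hypothetical sketch: emits flow-style (inline) YAML with an anchor on
// each object/array and an alias wherever one is encountered again.
function toInlineYaml(root) {
  const anchors = new Map(); // object -> anchor name
  let counter = 0;

  function emit(value) {
    if (value === null || value === undefined) {
      return "null";
    }
    if (typeof value !== "object") {
      return JSON.stringify(value); // strings, numbers, booleans
    }
    if (anchors.has(value)) {
      return "*" + anchors.get(value); // seen before: emit an alias
    }
    const name = "id" + (++counter);
    anchors.set(value, name);
    if (Array.isArray(value)) {
      return "&" + name + " [" + value.map(emit).join(", ") + "]";
    }
    const pairs = Object.keys(value).map(
      (key) => JSON.stringify(key) + ": " + emit(value[key])
    );
    return "&" + name + " {" + pairs.join(", ") + "}";
  }

  return emit(root);
}
```

A child object that points back at its parent comes out as an alias such as `*id1` instead of sending the emitter into an infinite loop.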
I then extended the Protowidget data model classes (protowidget_data.js) to leverage the object_serializer classes in order to invoke remote actions, transferring the shared objects in both directions as needed. Don’t look too closely at this part. There is still a ton of cruft in these classes that relates to the old way that I was jumping through hoops to make sure that I transferred cycle-free JSON to/from the server.
The End Result
The end result can be seen by downloading and running the demo rails app or browsing to http://pwdemo.rcode.net/ and clicking on the “Server Shared State” section on the left (please be kind to this app, it is hosted on a shared account).
Next Steps
There is still a lot more I want to do with this approach. Even lacking some of these things such as being able to toss ActiveRecord instances back and forth, I think that conceptually the approach of sharing object graphs between the browser and server provides a lot of power. It’s always a step forward when we lower the bar for having a rich conversation between two disparate pieces of technology. And I’m not sure that the bar could be lowered much more than just setting variables in the respective environment and having the objects magically appear on the other side of the void.
See the attached archive for the Rails project that has everything discussed here (I’m making it all available under an MIT-style license). Alternatively, it’s in SVN at https://rcode.devguard.com/svn/protowidget/trunk/protowidget . Note that in order to run most of the tests in the project you will need to have a JVM installed, since the Rails tests invoke the Rhino JavaScript interpreter (which requires Java). For anyone who is interested, there are also the beginnings of a manual under public/doc/protowidget_ref.pdf or here.
Serialization with JavaScript
July 5th, 2006
In any new environment I try to avoid it for as long as possible, but eventually the kludges of trying to get an object graph from point A in tool X to point B in tool Y add up to an amount that justifies creating some generic marshaling tools.
The problem that always seems to come up is that object graphs are, well, graphs. They often have circular references. A lot of times this shows up as simple backreferences to parents, but it could be anything. Real object marshaling mechanisms deal with this by remembering what has already been sent and if an object is encountered subsequent times, it will be replaced with a reference to the first instance.
Uh-oh. I used the word “real”. That must mean that I was using something that I no longer categorize as “Real Object Marshaling”. It turns out that what I was using was JSON. JSON is fine and good, but getting it to deal with true graphs is beyond its charter. Also, typical tools don’t behave very well when loops are accidentally encountered… at least browsers usually let you break out of an infinite loop if it runs for more than 2 seconds.
I also had some other gripes with JSON. Chief among them were the hacks that I had to go through in order to deserialize a JSON string into instances of typed classes instead of generic objects. All of these things made me bite the bullet and start something else.
On the server side, I’m working with Rails, but the approach should work for any environment. Basically, what I did was write a JavascriptSerializer that can traverse graphs of objects (typically JavaScriptObject — another class I created which mimics the open behavior of JS objects — an adapter for ActiveRecord instances is coming). What it produces is really tight JavaScript. The generated JavaScript implicitly takes care of the loops by storing important references for later and then pulling them back when duplicates are encountered. It can successfully preserve the identity of any duplicate Object or Array, including those that directly reference themselves. The Rails side wasn’t too hard, and since it produces JavaScript, there is very little extra JavaScript coding that needs to be done in order to unmarshal a graph.
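Here is a rough idea of what generating JavaScript that rebuilds a graph can look like. Big caveats: the real serializer is Ruby and its output format isn't shown here, so this is a JavaScript sketch of the general technique only, and `toJavaScriptSource` is a made-up name. The trick is to declare every object first and wire up the references afterwards, so cycles need no special casing.

```javascript
// Hypothetical sketch; not the output format of the Rails plugin.
function toJavaScriptSource(root) {
  const names = new Map(); // object -> generated variable name
  const declarations = [];
  const assignments = [];

  // Returns a source expression for `value`, registering objects as needed.
  function ref(value) {
    if (value === null || typeof value !== "object") {
      return JSON.stringify(value);
    }
    if (names.has(value)) {
      return names.get(value); // duplicate or cycle: reuse the variable
    }
    const name = "o" + names.size;
    names.set(value, name);
    declarations.push(
      "var " + name + " = " + (Array.isArray(value) ? "[]" : "{}") + ";"
    );
    Object.keys(value).forEach((key) => {
      const target = name + "[" + JSON.stringify(key) + "]";
      assignments.push(target + " = " + ref(value[key]) + ";");
    });
    return name;
  }

  const rootName = ref(root);
  return declarations
    .concat(assignments, ["return " + rootName + ";"])
    .join("\n");
}
```

The receiving side only has to wrap the generated text in a function and call it, for example with `new Function(source)()`, which is why so little unmarshaling code is needed in the browser.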
The JavaScript side is a bit trickier and I’m still working on it. The first problem is that JavaScript does not have a native construct for storing a handle to an Object in a hash or something. In Ruby I use the object_id, and in Java I would use the IdentityHashMap. Without some approach like this, it becomes very computationally expensive to detect cycles in the graph (basically having to resort to a list of all known objects where each check has to iterate over the entire list looking for a match). Since I’m working with my own domain classes, and since I can make any rule I want to, I “solved” this problem by dynamically computing and attaching a unique object id to each instance as it is encountered by the serialization logic. A look-aside object is used to map these ids to the instances.
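The id-tagging approach can be sketched like this. The property name `__pwId` and both function names are assumptions made for the example, not what the serializer actually uses, and the sketch would fail on frozen objects since it has to add a property to each instance.

```javascript
// Hypothetical sketch of tagging objects with ids plus a look-aside map.
let nextObjectId = 1; // shared counter so ids never collide across passes

function createIdRegistry() {
  const byId = {}; // look-aside object: id -> instance seen in this pass

  return {
    // Tags obj if needed; `seen` is true when this pass met it before.
    register(obj) {
      if (obj.__pwId === undefined) {
        // Non-enumerable, so the tag is skipped when walking properties.
        Object.defineProperty(obj, "__pwId", { value: nextObjectId++ });
      }
      const id = obj.__pwId;
      const seen = byId[id] === obj;
      byId[id] = obj;
      return { id: id, seen: seen };
    },
    lookup(id) {
      return byId[id];
    }
  };
}

// Walks a graph and returns the ids of objects reached more than once.
function findRepeatedIds(root) {
  const registry = createIdRegistry();
  const repeated = [];
  (function walk(value) {
    if (value === null || typeof value !== "object") return;
    const entry = registry.register(value);
    if (entry.seen) {
      if (!repeated.includes(entry.id)) repeated.push(entry.id);
      return; // do not descend again, which is what breaks the cycle
    }
    Object.keys(value).forEach((key) => walk(value[key]));
  })(root);
  return repeated;
}
```

Each check is a single property read plus a keyed lookup, instead of a scan over a list of every object seen so far.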
The second problem is how to represent the object graph so that it can be efficiently reconstituted on the server. We can’t take the approach that we did before and generate Ruby code, for example. I was ok generating JavaScript and sending it to the browser because we were going from a trusted environment to an untrusted one. Going the other way and having the browser send back code that will execute on the server is a big no-no. Anyway, I wanted to be reasonably cross platform, necessitating a neutral representational format that has the concept of graph cycles built into it. If the format had a native idea regarding object typing, that would be a boon as well.
I settled on YAML which has both of these things and is very easy to digest from Ruby. I hooked into the YAML parser directly instead of using its default mode where the document is converted directly to Ruby types. This gives me a lot of control over how things are instantiated (the reasons for needing this I’ll cover later).
The end result (which is not quite done yet) is a Rails Controller subclass that exposes a @browser attribute. When an action is invoked on the controller, the @browser collection will be populated with the shared object references from the client and any changes to it will be sent to the client when the action completes. This type of approach bypasses the usual Rails views in many cases when the client encapsulates a long-running application. By providing this shared whiteboard of sorts where object graphs are automatically kept up to date between client and server, a very robust model for communication between the browser-resident views and the server-resident controllers can be achieved.
To be continued…
Ajax file uploading
June 26th, 2006
I was messing around today with trying to duplicate the way that gmail supports file uploading in the background. I didn’t reverse engineer anything but worked solely from what I figured they must be doing based on what I saw of their interface.
I was able to duplicate the gmail behavior in Firefox and Opera 9 (<9 has some weird IFRAME support that I have never been able to figure out). Basically, you drop an input type=’file’ anywhere on your page and then in response to an onchange, do a cloneNode(true) and insert the clone into a form on a hidden iframe. You can then submit the iframe. The cloneNode support allows us to work around the problem where the individual filename/path cannot be set on this control due to security restrictions. It works pretty well.
Internet Explorer is another issue. When trying to insert the cloned node into the iframe document, it would always signal an “Illegal Argument” (RANT: I just love Internet Explorer’s error messages). I finally took this to mean that it does not support inserting an element from one DOM document into another. I tried the importNode method just to find out it isn’t there in IE. I was pretty frustrated at this point and decided to browse to gmail with IE to see if Google had been able to solve the problem. No dice. Their file upload mechanism for IE is completely different than it is for non-IE browsers. I figure that they must be using some barely documented IE method to do their magic, but I don’t know what it is.
Anyway, what I finally came up with was putting the file input control directly in an iframe on the page and forgoing the copying. I apply some dynamic formatting to make sure the iframe is sized to just slightly larger than its contents and you can’t tell there is an iframe there. When the onchange event of the file input is raised, I hide the iframe and unhide the status/cancel buttons. I then hook the iframe’s onload event and submit the encapsulated form. While the iframe form is submitting, a poller kicks off to run out to the server periodically and update the displayed progress information on the main page. It’s a little kludgy but is nicely encapsulated in a widget and it works pretty much without change on the major browsers (sans Safari for the moment; I’m having some other problems on the Mac front).
So my final analysis: If you’re the size of Google, go ahead and support a different mechanism for IE, but I would rather make it the same all around. Therefore, I think this approach is a good middle of the road.
Introducing Protowidget
May 29th, 2006
I’d like to introduce a new JavaScript/Ajax framework that my company is creating. We’re calling it Protowidget because it uses Prototype and does things with widgets. And yes, we know it’s 2006 and this is something like the 800th Ajax framework released this year. We think this one is new and different, though (of course, every parent thinks their baby is beautiful). It is being created out of real needs while building applications for customers and is helping us create better stuff.
Head over to the dedicated page on this blog to find out more.
Firefox View Source Trick
April 19th, 2006
I might be the last one on the planet to discover this one, but I doubt it, so I thought I’d share.
I’m used to right-clicking on a page and selecting “View Page Source” from the context menu. This has some advantages over the same command in the View menu because it allows you to narrow down which frame you want to view.
Last night, however, I accidentally selected some text and right clicked. I noticed that the familiar “View Page Source” menu item had changed to “View Selection Source”. It seems that if a straightforward subset of the document is spanned by the selection, the resulting view source window will only display that portion. If, however, the selection spans disjoint elements in the tree, the whole document will be displayed with the relevant pieces highlighted.
This in and of itself is really helpful for comprehending a complex HTML structure, but there is one more thing I noticed that is even better:
The output of the View Selection Source command seems to be the HTML representation of the current structure of the document as opposed to the raw text obtained at page load time. Put another way, the changes made via JavaScript to the HTML DOM are displayed by View Selection Source.
I know there is an extension that lets you view the live document structure but I’ve never installed it and this is an easy trick that seemingly produces the same result.