Anatomy of a JavaScript object

November 17th, 2008

This post is about how the JavaScript engine represents JS objects in memory. I’m afraid a lot of it will seem obscure and opaque unless you already know a bit about SpiderMonkey internals, or you have the perseverance to click some of the links below and read the documentation.

First of all, what is a JavaScript object? Paraphrasing the documentation for JSObject, objects are made up of the following parts:

  • Most objects have a prototype. An object inherits properties, including methods, from its prototype (which is another object).
  • Most objects have a parent. An object’s parent is another object, usually either the global object or an object that represents an activation record. The JavaScript engine uses this relationship to implement lexical scoping.
  • Almost every object can have any number of its own properties. The term own property refers to a property of an object that is not inherited from its prototype. Each property has a name, a getter, a setter, and property attributes. Most properties also have a stored value.
  • Every object is associated with a JSClass and a JSObjectOps. These are C/C++ hooks that implement details of the object’s behavior. An object may also have other private fields, depending on its JSClass.

So you might imagine a JavaScript object would look something like this:

    struct JSObject {
        JSObject *proto;
        JSObject *parent;
        map<jsid, JSProperty> ownProperties;
        JSClass *cls;
        ...
    };

This being SpiderMonkey, none of the details are quite as straightforward as that, but the real struct JSObject does in fact contain a word that points to the object’s prototype, one that points to its parent, and one that points to its class. The part that represents an object’s own properties isn’t like a C++ map at all, and that’s the part I’d like to focus on here. Certainly a JavaScript engine could store each object’s own properties in a map or hash table. What we actually do is quite different and rather clever.

How properties are stored

First of all, and mostly just to get this out of the way, there is an abstraction layer around properties: JSObjectOps. By implementing this interface, an object can store its own properties in its own custom, non-default way. Apart from arrays and maybe XPCOM wrapper objects, I think this is very rare.

A nice, gentle way to learn about JSObject is by reading the source code of js_DumpObject, a debugging function that walks the data structures I’m about to describe and prints out some of the details.

In SpiderMonkey, properties are divided into two parts, which are stored in separate data structures: the stored value and everything else. Each object has a growable array of jsvals. That’s where the stored values live. Each object also has a pointer to an object map, which contains the property names, getters, setters, and attributes. For native objects, this object map is a linked list of property descriptors (of type struct JSScopeProperty). Each property descriptor also tells whether the property has a stored value, and if so, its offset in the stored value array. When a new property is created on an object, the object’s property descriptor pointer is changed to point to a new head of the list. That new head is a property descriptor containing information about the new property; the tail of the list is the old list, unchanged, so it still contains all the same information about existing properties as before.
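To make the two-part layout concrete, here is a minimal C++ sketch of it. It is not SpiderMonkey’s actual code. The names PropertyDescriptor, SimpleObject, DescriptorArena, addProperty, and findProperty are all made up for illustration, a double stands in for a jsval, a string stands in for a jsid, and getters, setters, and attributes are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for JSScopeProperty: everything about a property
// except its value. (Getter, setter, and attributes are omitted here.)
struct PropertyDescriptor {
    std::string name;               // assumption: the real engine uses a jsid
    bool hasSlot;                   // does this property have a stored value?
    std::size_t slot;               // offset into the object's value array
    const PropertyDescriptor *next; // tail of the list: the older properties
};

// Hypothetical stand-in for a native object.
struct SimpleObject {
    std::vector<double> slots;                    // growable array of stored values
    const PropertyDescriptor *lastProp = nullptr; // head of the descriptor list
};

// Owns the descriptors. A std::deque never moves existing elements on
// push_back, so pointers into it stay valid as the arena grows.
using DescriptorArena = std::deque<PropertyDescriptor>;

// Adding a property creates a new list head whose tail is the old list.
void addProperty(DescriptorArena &arena, SimpleObject &obj,
                 const std::string &name, double value) {
    arena.push_back(PropertyDescriptor{name, true, obj.slots.size(), obj.lastProp});
    obj.lastProp = &arena.back();
    obj.slots.push_back(value);
}

// Looking up a property walks the list from the newest property to the oldest.
const PropertyDescriptor *findProperty(const SimpleObject &obj, const std::string &name) {
    for (const PropertyDescriptor *p = obj.lastProp; p != nullptr; p = p->next) {
        if (p->name == name) return p;
    }
    return nullptr;
}
```

After adding "x" and then "y" to an object, lastProp points at the descriptor for "y", whose next pointer is the untouched descriptor for "x". The value of "x" is read with obj.slots[findProperty(obj, "x")->slot].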

Got it? OK. Now for the fun part.

Why

This turns out to be a nice way to store properties for a few reasons.

  • It turns out we can share the hefty property descriptors among objects. If many objects have the same properties, in the same order, they have the same linked list of property descriptors. So the per-object cost of a property in this case is just one word—the stored value itself. (By contrast, in a hash table implementation, a property would have to cost at least two words—plus a few extra bytes of hash table overhead, depending on the load factor of the table.) The mechanism by which these lists are shared is called the property tree, and it’s a pretty sweet idea. It’s what I originally set out to describe in this post, actually. The details, though, are mind-bogglingly intricate. An epic comment in jsscope.h dishes the dirt.
  • When we share property descriptor lists among similar objects, what we’re really doing is classifying objects by the properties they have. This is such a useful concept that the SpiderMonkey developers have a succinct term for “everything about an object’s properties except their values”: shape. Objects of the same shape all have the same own properties, and their values are stored at the same offsets. It turns out we can cache the results of property lookups by shape, so in many cases the interpreter doesn’t have to walk the linked list of properties at all. SpiderMonkey’s property cache is another nice topic for a post sometime.
  • Another consequence of objects of the same shape having the same layout is that you can treat them like structs. This is useful if you’re a JIT. A property access could be as little as a single machine instruction. (In practice, JITted code has to check the type of every property value it reads, so we don’t normally achieve this ideal. Still, eliminating the overhead of groping about for a property in a hash table or linked list is a huge win.)
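The first two points can be sketched in C++ as well. This is a toy model under heavy assumptions, not the real property tree, which handles deletion, attribute changes, garbage collection, and much more. ShapeNode, ShapedObject, PropertyCache, addProperty, and getProperty are hypothetical names, and a double again stands in for a jsval. The idea shown is only this: objects that add the same property names in the same order end up pointing at the same descriptor node, and a lookup cache keyed on that node lets later lookups skip the list walk.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node in a simplified property tree. The path from the root
// down to a node is one shape: property names in insertion order.
struct ShapeNode {
    std::string name;            // property added at this step ("" for the root)
    std::size_t slot = 0;        // offset of its value in the slot array
    ShapeNode *parent = nullptr; // previous property, or nullptr at the root
    std::map<std::string, std::unique_ptr<ShapeNode>> kids; // next property -> child
};

struct ShapedObject {
    ShapeNode *shape;          // shared descriptor list; costs one pointer per object
    std::vector<double> slots; // per-object stored values, one word per property
};

// Adding a property follows the existing child edge for that name, or
// creates it the first time any object takes this path.
void addProperty(ShapedObject &obj, const std::string &name, double value) {
    std::unique_ptr<ShapeNode> &kid = obj.shape->kids[name];
    if (!kid) {
        kid = std::make_unique<ShapeNode>();
        kid->name = name;
        kid->slot = obj.slots.size();
        kid->parent = obj.shape;
    }
    obj.shape = kid.get();
    obj.slots.push_back(value);
}

// Toy property cache: (shape, name) -> slot offset.
struct PropertyCache {
    std::map<const ShapeNode *, std::map<std::string, std::size_t>> entries;
    std::size_t hits = 0;
};

// Returns true and stores the value in `out` if the property exists.
bool getProperty(const ShapedObject &obj, const std::string &name,
                 PropertyCache &cache, double &out) {
    auto byShape = cache.entries.find(obj.shape);
    if (byShape != cache.entries.end()) {
        auto hit = byShape->second.find(name);
        if (hit != byShape->second.end()) { // fast path: no list walk
            ++cache.hits;
            out = obj.slots[hit->second];
            return true;
        }
    }
    // Slow path: walk from the newest property back toward the root.
    for (const ShapeNode *n = obj.shape; n->parent != nullptr; n = n->parent) {
        if (n->name == name) {
            cache.entries[obj.shape][name] = n->slot;
            out = obj.slots[n->slot];
            return true;
        }
    }
    return false;
}
```

If two objects both get "x" and then "y", their shape pointers compare equal, so a lookup of "y" on the first object fills the cache and the same lookup on the second object is a hit. An object that gets "y" and then "x" lands on a different node, because the order of insertion is part of the shape.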

Any questions?

7 Responses to “Anatomy of a JavaScript object”

  1. mixedpuppy Says:

    I think I kind of get what’s going on. How do I actually take advantage of this in JavaScript itself? When would it be advantageous to do so?

  2. jorendorff Says:

    You don’t need to know anything about this stuff to write good, fast JavaScript code. This is important only if you want to work on the JavaScript engine itself.

  3. Blake Kaplan Says:

    mixedpuppy, the brilliance of this stuff is that it’s based on the JavaScript code that people already write. For example, the property tree (and by extension, the property cache) takes advantage of the fact that even though JavaScript has no named types or structures, existing code already creates many similar objects.

  4. Malte Says:

    Most JS engines (a notable exception being Rhino) iterate over properties in insertion order (when iterated through for(var foo in object)). Is this a coincidence of early JS implementations, and does it place constraints on the storage of properties?

  5. Ilya Says:

    About the last bullet, Chrome’s V8 also already “classifies” JS objects _and_ benefits from it on the JIT level. In fact, Google recommends that developers always create their objects with the same layout (e.g. by initializing them in a constructor) to reap the benefits of it.

    Are we going to see the same thing once Mozilla goes JIT?

  6. jorendorff Says:

    “Are we going to see the same thing once Mozilla goes JIT?”

    Yes, we’re already doing this in existing beta releases of Firefox 3.1. (The JIT is off by default in those releases. It will be on by default in Firefox 3.1 beta 2, which is coming soon.)

  7. jorendorff Says:

    Malte: I think it was originally a coincidence, but retaining the insertion order is now considered a feature. Occasionally there is talk of standardizing it. In any case I doubt we will ever drop it.

    This is a constraint on storage. In the case of SpiderMonkey, the linked list of property descriptors is in exactly the reverse of property insertion order.
