Motivation
DOM is the API that allows JavaScript to interact with HTML documents. The object that is a part of the DOM API is usually implemented as heap allocated C++/Rust object for most browsers. For JavaScript to interact with these objects, it needs to go through a wrapper called a reflector that gives JavaScript code access to the DOM object. For example, the DOM objects and reflectors on Servo is shown in Figure 1. The reflector is a JSObject managed by SpiderMonkey. There’s a pointer stored in one of the inline slots which point back to the heap allocated DOM object.
There is a couple downside for this implementation, but the biggest problem is performance:
- Type conversion For most browsers that employ this design, the API is implemented in native code. Since native code and JavaScript use different types, expensive conversions have to perform to convert JavaScript values into equivalent native types. For function return, native types then need to be converted to JS values. One extreme example is converting Rust Vec to/from JavaScript Array. Each conversion requires a deep copy of the data structure which is very costly.
- JIT Optimization Native function is opaque to JavaScript JIT, which makes it difficult to do optimizations specific to the calling code.
These problems can be mitigated to some degree. Conversions can be made cheaper. Hints can be given to the JIT to perform some limited optimization, for example the pure annotation on WebIDL for the DOM accessors.
MagicDOM
Based on the observation, instead of storing the fields of DOM in a separate heap allocated object, MagicDOM stores all of the fields in the reflector (Figure 2).
From Figure 2, all of the fields are stored in the reserved slots in the JSObject. There are 16 inline slots which store at the struct of JSObject1. If more slots are needed, data will be stored in dynamic slots. In total, at most 255 slots can be used.
Data stores in the slots would be in JSValues rather than native types provide opportunities for optimizations. But native code needs to convert JSValues back to native types when working with DOM objects. This is offset by the following:
- Conversions JSValues to/from native types conversions are highly optimized. The JSValues that are stored in slots are well defined which doesn’t need to account for the edge cases. It’s very fast compared to the JSValues got from arbitrary scripts.
- DOM API DOM interfaces can be implemented partially or entirely in JavaScript which avoids the cost of conversions and allows JIT optimizations which are limited with native DOM interfaces.
Implementation
MagicDOM utilizes many Rust features to reduce the engineering efforts and makes the implementation easy to adapt to use cases other than DOM. The code is open source at https://github.com/fitzgen/mozjs/tree/magic-dom
Mix native and JavaScript data types
MagicDOM allows fields to have either Rust type or JavaScript side type as
shown in the code snippet below. Node has both Rust type fields like u16 and
Vec
- ToFromJSConvertible This trait defines methods that convert the Rust type to/from JavaScript type. Most of the Rust primitive types have implemented it. Rust Vec also implements this trait to convert to the corresponding JavaScript type, JSArray and it’s doing a deep copy for conversion.
- ToFromJSSlots This trait defines methods that stores data into and retrieve data back from the slot. All of the primitive type in Rust implements this trait. Unlike the above trait, the fat pointer of Vec (raw pointer, length, and capacity) is stored in three separate slots which avoid the deep copy.
struct Node_struct {
node_type: u16,
node_name: *mut JSString,
base_uri: *mut JSString,
is_connected: bool,
node_value: *mut JSString,
text_content: *mut JSString,
child_nodes: Vec<Node>,
}
Inheritance
To implement inheritance, MagicDOM embeds all the slots from parent class into the current class. Below is an example of a class that has inheritance. The parent class inheritance has to be embedded in the first field of the current class. All the slots that belong to the current class come after the parent class slots.
struct Element_spec {
_inherit: node::Node,
local_name: *mut JSString,
tag_name: *mut JSString,
namespace: *mut JSString,
prefix: *mut JSString,
id: *mut JSString,
attrs: Vec<attr::Attr>,
};
Code Generation using procedure macro and macro rules
For the DOM accessors, it needs to know where the fields are stored in the slots. Not all fields occupy the same number of slots. With object inheritance, the developer needs to keep track of the number of slots to shift. It’s very tedious for a human to write all the accessors and most of these methods have similar structures.
MagicDOM uses Rust’s procedure macro and macro rules to solve this problem. Given a spec like the following code snippet,
#[derive(MagicDOM)]
struct Element_spec {
_inherit: node::Node,
local_name: *mut JSString,
tag_name: *mut JSString,
namespace: *mut JSString,
prefix: *mut JSString,
id: *mut JSString,
attrs: Vec<attr::Attr>,
};
the #[derive(MagicDOM)] transform the above code into the following code snippet.
pub struct Element {
object: *mut JSObject,
}
and generates the following implementations:
- Getter (Rust and self-hosted JavaScript)
- Setter (Rust and self-hosted JavaScript)
- Constructor (Rust)
- Trait implementations that stores and retrieves MagicDOM from slots
The slot number is calculated when generating the getter and setter. To help to calculate the slot number in face of inheritance, all types including MagicDOM implements a Trait, NumSlots, that returns the number of slots occupied by this type. The MagicDOM implementation of this trait is also auto generated. There are also other helper methods and trait implementation generated to aid code generation or better operate with SpiderMonkey GC. More details are available in the source code.
Evaluation
To evaluate the performance of MagicDOM, a few Dromeao DOM Core tests are ported as MagicDOM currently doesn’t implement all of the DOM API and not integrated with any browser yet. MagicDOM is evaluated with three implementations.
- Selfhosted: Self-hosted JavaScript storing JavaScript values
- JS native: Rust native code storing JavaScript values
- Rust: Rust native code storing Rust values
Not all tests have all three implementations and some test has more test cases that will be described in the corresponding section.
GetId
This method is tested by calling 102400 times in a loop and measure the total time. The normal js test case is the following:
var a = {id: "jaz"};
for ( var i = 0; i < num; i++) {
ret = a.id;
}
The normal js is not equivalent to the self-hosted JavaScript but it’s a lower bound to see how well the JIT can optimize the code. From Figure 3, we can see that the self-hosted JavaScript is done better than the js native. Here the id is stored as JSString.
SetId
This method is tested by calling 102400 times in a loop and measure the total time. The normal js test case is the following:
var a = {id: "jaz"};
for ( var i = 0; i < num; i++) {
a.id = "caz";
}
From Figure 4, SetId is also doing better. After showing benefits for simple methods, let’s look at some complex methods.
GetAttributes
This method is tested by calling 102400 times in a loop and measure the total time. Selfhosted method performs pretty poor. In this method, the attr array is implemented using Vec for Rust type and JSArrayObject for JavaScript type. For this method, the cost of VM call from the JavaScript side to native side is amortized by the fast native implementation.
SetAttributes
This method is tested by calling 102400 times in a loop and measure the total time. Self-hosted JavaScript implementation is not available for this test. SetAttributes needs to create a new Attr when there’s no attr matched with the name. There’s no way in SpiderMonkey now to support this. But there are two patches in review trying to address this problem.
AppendChilds
This method is tested by appending 2000 elements on a node in one trial, running 1024 trials in a loop and measuring the total time. From Fiture 7, the js native implementation is very slow. After benchmarking the individual SpiderMonkey function it calls, the JS_SetElement is the root cause. The time it takes to return from this function is increasing with the length of the array which is quiet surprise. From its name, it should be O(1) not O(n). Following the call train, it traces to the NativeSetProperty function. Ideally, we should use a JSAPI function that eventually leads to setDenseElement which is O(1). Right now, there’s no JSAPI for it.
Discussion
Based on the above experiments, we can see that MagicDOM shows benefits for simple get/set methods. We still need more investigation on JIT to speed up the self-hosted JavaScript. For the complex method, self-hosted JavaScript doesn’t show too much edge there. But this may due to my unfamiliar with the JIT and JavaScript features and I think it could be improved.
This framework could be also used to define a general data structure to use in the non-browser use cases.
Using self-hosted JavaScript code is not easy. Those code needs to compile with the SpiderMonkey and the generated JavaScript code also needs to manually copy and paste into the Utilities.js. The flow of generating JavaScript code could also be improved. Currently, all the generated JavaScript code are in one line and before putting those in SpiderMonkey, a sed script needs to run to separate the lines.