The race for speed part 3: JavaScript compiler strategies


The JavaScript language is hugely popular for a number of reasons. Broad reach is certainly a factor, and from a developer’s perspective it’s very quick and flexible. Almost everything in the language is an Object, so creating structures on the fly is simple, and there’s no need to declare datatypes because they are determined at runtime. It is, however, exactly this versatility that makes JavaScript a challenge to compile.

Hidden classes
Although Objects and hierarchies are very easy to build in JavaScript, navigating these dynamic structures can be slow for a compiler. A common way for an engine (typically written in C or C++) to store and retrieve properties and their values is a hashtable or Dictionary: an Array-like structure where a property’s value is looked up by searching for the unique String identifying its name. The problem is that hashing a property name and searching the table on every access is far slower than reading from a fixed memory offset, and on very large objects it gets slower still.

To speed this up, both V8 and SpiderMonkey implement what are called hidden classes: a behind-the-scenes representation of your JavaScript objects. Google calls them maps and Mozilla calls them shapes, but they’re pretty much the same thing. These structures are much quicker to find and work with than a standard Dictionary lookup.
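
As a rough illustration (the transitions described in the comments happen inside the engine and aren’t observable from script, and real engines differ in the details), objects that gain the same properties in the same order end up sharing a hidden class:

```javascript
// Two objects built with properties added in the same order share one
// hidden class (V8's "map", SpiderMonkey's "shape"), so the engine can find
// x and y at fixed offsets instead of doing a dictionary lookup each time.
function Point(x, y) {
  this.x = x; // transition: empty hidden class -> hidden class with x
  this.y = y; // transition: hidden class with x -> hidden class with x, y
}

const a = new Point(1, 2);
const b = new Point(3, 4); // same hidden class as a

// Adding the same properties in a different order produces a different
// hidden class, even though the object looks logically identical.
const c = {};
c.y = 6;
c.x = 5;
```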

Type inference
Dynamic typing in JavaScript is what allows the same property to hold a Number in one place and a String in the next. Unfortunately this versatility forces the compiler to generate many more type-checking conditions, and conditional code is much larger and slower than type-specific code.

The solution to this problem is called type inference, and all modern JavaScript compilers do it. The compiler inspects your code and makes an assumption about the datatype of a property. If that inference proves correct, the typed JIT emits a fast, datatype-specific machine code stub. If the type is not what was expected, the code “misses” and the work is handed off to an un-typed JIT that handles it with slower, conditional code.
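
A minimal sketch of the idea; the fast and slow paths described in the comments live inside the engine rather than in anything you can see from JavaScript:

```javascript
// While add() only ever sees two Numbers, the compiler can assume that
// type and emit a stub that performs a plain numeric addition.
function add(a, b) {
  return a + b;
}

for (let i = 0; i < 100000; i++) {
  add(i, i + 1); // always (Number, Number): the typed, fast path
}

// The assumption no longer holds: this call "misses" the typed stub and is
// handled by slower, generic code that has to check the operand types.
add("race for ", "speed");
```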

Inline caches
The most common optimisation in modern JavaScript compilers is inline caching. This is not a new technique (it was first implemented roughly 30 years ago for the Smalltalk compiler), but it is very useful.

To work, inline caching requires both the techniques we’ve just looked at: type inference and hidden classes. When the compiler encounters a new object it caches its hidden class along with any inferred datatypes. If that structure is encountered again anywhere else in the code, it can be quickly compared to the cached version. If a match is found, the previously generated stub of optimised machine code can be reused. If the structure or type has changed, the compiler may fall back to slower generic code, or in some modern compilers even perform polymorphic inline caching, i.e. generate additional stubs for the same lookup, one for each structure or datatype seen.
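
A rough sketch of the kind of call site an inline cache speeds up; the caching itself happens behind the scenes, so the comments describe engine behaviour rather than anything visible in the script:

```javascript
// Every call to getX() performs a property lookup at the same call site.
function getX(point) {
  return point.x;
}

const p1 = { x: 1, y: 2 };
const p2 = { x: 3, y: 4 }; // built the same way: same hidden class as p1

getX(p1); // first call: the hidden class and the offset of x are cached here
getX(p2); // same hidden class: the cached, optimised stub is reused

const p3 = { x: 5, y: 6, z: 7 }; // a different hidden class
getX(p3); // the cache no longer matches: fall back to generic code, or add
          // another stub for this shape (polymorphic inline caching)
```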

If you are interested in learning more about inline caching in JavaScript then I would recommend reading this article from Google V8 engineer Vyacheslav Egorov. In it he builds a Lua parser in JavaScript and explains inline caches in a lot more detail.

Once the compiler has understood the structure of your code and the datatypes within it, it becomes possible to perform a whole range of other optimisations. Here are just a few examples:

inline expansion. aka “inlining”
Function calls are computationally expensive: the callee has to be looked up, arguments passed, and execution has to branch to another location and back. The idea behind inline expansion is to take the body of the function being called and drop that code in at the call site. This avoids branching and produces faster code, but at the expense of additional memory.
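
A hand-written before-and-after sketch; the “inlined” version shows roughly what the optimising compiler produces internally, not actual compiler output:

```javascript
function square(n) {
  return n * n;
}

// Before inlining: every iteration pays the overhead of calling square().
function sumOfSquares(values) {
  let total = 0;
  for (const v of values) {
    total += square(v);
  }
  return total;
}

// After inlining: the body of square() is dropped in at the call site,
// removing the call overhead at the cost of slightly larger code.
function sumOfSquaresInlined(values) {
  let total = 0;
  for (const v of values) {
    total += v * v;
  }
  return total;
}
```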

loop-invariant code motion. aka “hoisting”
Loops are a prime candidate for optimisation. Moving unnecessary computation out of a loop can greatly improve performance. The most common example is a for-loop iterating over the contents of an Array: it is computationally redundant to recalculate the length of the Array on every iteration. The length can be computed once, stored in a variable, and “hoisted” out of the loop.
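
The classic example, written out by hand (a modern engine will often perform this transformation itself, but it shows the idea):

```javascript
const items = [1, 2, 3, 4];
let total = 0;

// Before: the loop-invariant expression items.length is evaluated on
// every iteration.
for (let i = 0; i < items.length; i++) {
  total += items[i];
}

// After: the length is computed once and "hoisted" out of the loop.
const len = items.length;
for (let i = 0; i < len; i++) {
  total += items[i];
}
```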

constant folding
This is the practice of pre-computing, at compile time, expressions whose values are constant and do not change throughout the lifetime of the program.
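
A small sketch of what gets folded:

```javascript
// The right-hand side involves only constants, so it can be evaluated once
// at compile time rather than every time the code runs.
const SECONDS_PER_DAY = 60 * 60 * 24;

// Effectively what the compiler works with after folding:
const SECONDS_PER_DAY_FOLDED = 86400;
```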

common-subexpression elimination
Similar to constant folding, this optimisation scans your code for expressions that evaluate to the same value and replaces them with a single variable holding the pre-computed result.
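
A hand-written sketch of the transformation; the compiler performs it on its internal representation rather than your source, and the names here are invented for illustration:

```javascript
// The subexpression (a + b) appears twice but always evaluates to the same
// value within a single call.
function sumTwice(a, b, h) {
  return (a + b) * h + (a + b);
}

// Roughly what the optimiser produces: compute the shared value once.
function sumTwiceOptimised(a, b, h) {
  const sum = a + b;
  return sum * h + sum;
}
```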

dead-code elimination
Dead code is defined as unused or unreachable code. If there is a function in your program that is never called, there is no point optimising it; in fact it can be safely eliminated, which also reduces the overall size of the program.
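
A small sketch; both the unused helper and the unreachable branch can be removed without changing the program’s behaviour:

```javascript
const DEBUG = false;

// Never called from anywhere: dead code that can be eliminated entirely.
function debugLog(message) {
  console.log(message);
}

function compute(x) {
  if (DEBUG) { // always false, so this branch is unreachable
    debugLog("computing " + x);
  }
  return x * 2;
}
```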

These kinds of optimisations are really just the start of what has become an ambitious objective: to run JavaScript code as fast as native C code. Is that really an achievable goal? In the last part of this series we’ll look at some projects at the bleeding edge of fast JavaScript.