Nikkei Electronics Asia -- January 2009

Why Is the New Google V8 Engine So Fast? [Part 2]

Jan 14, 2009 18:02 Nikkei Electronics Asia

Continued from Why Is the New Google V8 Engine So Fast? [Part 1].

Problems with Language

The specification for the JavaScript language does not especially stress performance. This is quite evident in, for example, the way the type of a variable is determined.

Mainstream languages such as C++ and Java use static typing. In this approach, variable types are declared in the source code and fixed when the code is compiled. Because there is no need to check the type during execution, static typing enjoys performance advantages.

In typical implementations of languages like C++ and Java, the contents of fields*, methods*, etc, are stored in an array, with each field or method name corresponding 1:1 to an offset in that array (Fig 2). The locations at which individual variables, methods, etc, are stored are defined for each class. In C++, Java, etc, the type (class) of the accessed variable is known in advance, so the language implementation merely accesses the field, method, etc, using the array and offset. The offset makes it possible to access fields, call methods or execute other tasks with only a few machine language instructions.

* Field: variable belonging to an object; called a member variable in C++.

* Method: procedure belonging to an object; called a member function in C++.
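The offset-based access described above can be imitated as a rough sketch. The names `PointLayout`, `makePoint` and `getY` are hypothetical, and a plain array stands in for the object's memory. A real C++ or Java compiler resolves the offset ahead of time, whereas here it is merely stored in a constant.

```javascript
// Hypothetical sketch: a per-class layout maps each field name to a fixed slot.
// In C++ or Java these offsets are resolved when the code is compiled.
const PointLayout = { x: 0, y: 1 };

// An "object" is just an array of slots laid out according to its class.
function makePoint(x, y) {
  const slots = new Array(2);
  slots[PointLayout.x] = x;
  slots[PointLayout.y] = y;
  return slots;
}

// Because the class is known in advance, the offset can be a constant.
const OFFSET_Y = PointLayout.y;

function getY(point) {
  return point[OFFSET_Y]; // one indexed read, no lookup by name
}

const p = makePoint(3, 4);
console.log(getY(p)); // prints 4
```

The point of the sketch is that `getY` never looks at the name "y" when it runs; it only uses the precomputed slot number.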

JavaScript, on the other hand, uses dynamic typing. JavaScript variables do not have types, and the type of the object assigned to a variable is not known until execution: in other words, it is determined dynamically. Every time a property* is accessed or a method is called, etc, in JavaScript, the object type must be checked and processing executed accordingly.

* Property: JavaScript properties are object-owned variables. In JavaScript, properties can hold not only standard values, but also methods.
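A short illustration of these points, using made-up names (`value`, `counter`, `describe`), might look like this:

```javascript
// The same variable can hold values of different types over its lifetime.
let value = 42;
const first = typeof value;  // "number"
value = "forty-two";
const second = typeof value; // "string"

// A property can hold an ordinary value or a function (that is, a method).
const counter = {
  count: 0,
  increment() {
    this.count += 1;
    return this.count;
  },
};

// What describe() does depends on the type it finds at execution time.
function describe(x) {
  if (typeof x === "number") return "number " + x;
  if (typeof x === "string") return "string " + x;
  return "object with keys " + Object.keys(x).join(",");
}

console.log(first, second);       // prints: number string
console.log(counter.increment()); // prints: 1
console.log(describe(counter));   // prints: object with keys count,increment
```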

Many JavaScript engines use a hash table* to access properties, call methods, etc. In other words, every time a property is accessed or a method called, for example, the name of the property or method (a character string) is used as a key to search the object's hash table (Fig 3).

* Hash table: Data structure that returns the value corresponding to a specified key. Internally it holds an array, and the hash value generated from the key is used as an offset to select a list at a particular location in that array. If the hash value for a different key happens to select the same location, that list will store multiple entries, which means a check must always be made that the keys match before returning a value.

Searching a hash table is a sequence that involves determining the position within the array from the hash value, then checking to see if the key at that location matches or not. Compared to an array where the offset can be used to directly read the data, access using this approach takes more time. 
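The lookup sequence can be sketched as a small chained hash table. `PropertyTable` and its hash function are assumptions made for illustration only; they are not taken from any actual JavaScript engine.

```javascript
// Hypothetical sketch of a chained hash table keyed by property name.
class PropertyTable {
  constructor(size = 8) {
    // One list ("bucket") per array position.
    this.buckets = Array.from({ length: size }, () => []);
  }

  // Simple string hash (an assumption; real engines use other functions).
  hash(key) {
    let h = 0;
    for (let i = 0; i < key.length; i++) {
      h = (h * 31 + key.charCodeAt(i)) >>> 0;
    }
    return h % this.buckets.length;
  }

  set(key, value) {
    const bucket = this.buckets[this.hash(key)];
    const entry = bucket.find((e) => e.key === key);
    if (entry) entry.value = value;
    else bucket.push({ key, value });
  }

  get(key) {
    // Step 1: the hash value selects a position in the array.
    const bucket = this.buckets[this.hash(key)];
    // Step 2: the key itself must be compared, since keys can collide.
    for (const entry of bucket) {
      if (entry.key === key) return entry.value;
    }
    return undefined;
  }
}

const table = new PropertyTable();
table.set("x", 3);
table.set("y", 4);
console.log(table.get("y")); // prints 4
console.log(table.get("z")); // prints undefined
```

Even in this simplified form, each `get` computes a hash over the whole name and then compares strings, which is considerably more work than a single indexed read.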

Other programming languages using dynamic typing include Smalltalk and Ruby. These languages also basically search hash tables, but they utilize classes to reduce the time required for a search. JavaScript, however, has no classes. With the exception of "Number" for numeric values, "String" for character strings and a few others, all objects are of the "Object" type. The programmer cannot declare the type (class), so classes cannot be used to speed up processing.

JavaScript does have flexibility in adding properties, methods, etc, to objects, or deleting them, at any time (see p32, "Consideration for Programmers Familiar with Classes"). The JavaScript language specification is extremely dynamic, and the general opinion in the industry is that a dynamic language is much harder to accelerate than a static one like C++ or Java. V8 uses a number of techniques to achieve faster speeds in spite of this difficulty, though, as outlined below.
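As a brief illustration of this flexibility (the object `point` and its properties are invented for the example):

```javascript
// Start with an object that has a single property.
const point = { x: 3 };

// Properties and methods can be added at any time...
point.y = 4;
point.norm = function () {
  return Math.hypot(this.x, this.y);
};

const keysAfterAdd = Object.keys(point).join(",");
const length = point.norm();

// ...and deleted again just as freely.
delete point.x;
const hasX = "x" in point;

console.log(keysAfterAdd); // prints x,y,norm
console.log(length);       // prints 5
console.log(hasX);         // prints false
```

Because the set of properties can change like this at any moment, an engine cannot simply fix a layout for the object once and rely on it forever.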

1) Just-In-Time Compile:
Machine Language without Bytecode

From the viewpoint of performance, V8 has four key features. First, it generates machine language at execution in what is called just-in-time (JIT) compiling. This is a commonly used method of speeding up execution compared with pure interpretation, and is also found in platforms such as Java and .NET. V8 implemented this technology in advance of competing engines like the SpiderMonkey JavaScript engine in the Firefox browser developed by the Mozilla Foundation of the US or JavaScriptCore in Safari.

The V8 JIT compiler does not create intermediate code when generating machine language (Fig 4). For example, in Java the compiler first converts the source code into a class file expressed in a virtual intermediate language called bytecode. Such compilers, known as bytecode compilers, generate bytecode, not machine language. Java VMs interpret the class file bytecode sequentially at execution. This execution model is called the bytecode interpreter. Firefox's SpiderMonkey has an internal bytecode compiler and bytecode interpreter, converting JavaScript source code into its own flavor of bytecode for execution.
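The bytecode model can be sketched with a toy example. The syntax tree shape, the opcodes (`PUSH`, `ADD`, `MUL`) and the function names are all hypothetical; they do not correspond to the bytecode of Java, SpiderMonkey or any other real engine.

```javascript
// Hypothetical syntax tree nodes: { type: "num", value } and
// { type: "add" | "mul", left, right }.

// Bytecode compiler: flattens the tree into instructions for a stack machine.
function compileToBytecode(node, code = []) {
  if (node.type === "num") {
    code.push(["PUSH", node.value]);
  } else {
    compileToBytecode(node.left, code);
    compileToBytecode(node.right, code);
    code.push([node.type === "add" ? "ADD" : "MUL"]);
  }
  return code;
}

// Bytecode interpreter: executes the instructions one after another.
function interpret(code) {
  const stack = [];
  for (const [op, arg] of code) {
    switch (op) {
      case "PUSH":
        stack.push(arg);
        break;
      case "ADD": {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case "MUL": {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      default:
        throw new Error("unknown opcode " + op);
    }
  }
  return stack.pop();
}

// (1 + 2) * 3
const ast = {
  type: "mul",
  left: {
    type: "add",
    left: { type: "num", value: 1 },
    right: { type: "num", value: 2 },
  },
  right: { type: "num", value: 3 },
};

const bytecode = compileToBytecode(ast);
console.log(bytecode.map((ins) => ins.join(" ")).join("; "));
// prints PUSH 1; PUSH 2; ADD; PUSH 3; MUL
console.log(interpret(bytecode)); // prints 9
```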

In fact, current Java VMs use a JIT compiler based on HotSpot. The VM interprets the bytecode with a bytecode interpreter, and converts only frequently executed code segments into machine language before executing them: a hybrid model.
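A rough sketch of the hybrid idea follows, assuming a made-up bytecode and a simple call-count threshold. JavaScript cannot emit machine language directly, so a function built with `new Function` stands in for the generated machine code. None of this reflects how HotSpot is actually implemented.

```javascript
// Bytecode for f(x) = x * x + 1, using hypothetical opcodes.
const squarePlusOne = [["ARG"], ["ARG"], ["MUL"], ["PUSH", 1], ["ADD"]];

// Slow path: interpret the bytecode instruction by instruction.
function runBytecode(code, x) {
  const stack = [];
  for (const [op, arg] of code) {
    if (op === "ARG") stack.push(x);
    else if (op === "PUSH") stack.push(arg);
    else {
      const b = stack.pop();
      const a = stack.pop();
      stack.push(op === "ADD" ? a + b : a * b);
    }
  }
  return stack.pop();
}

// Stand-in for machine code generation: translate the bytecode once
// into a JavaScript expression and wrap it in a function.
function compileHot(code) {
  const stack = [];
  for (const [op, arg] of code) {
    if (op === "ARG") stack.push("x");
    else if (op === "PUSH") stack.push(String(arg));
    else {
      const b = stack.pop();
      const a = stack.pop();
      stack.push("(" + a + (op === "ADD" ? " + " : " * ") + b + ")");
    }
  }
  return new Function("x", "return " + stack.pop() + ";");
}

// Interpret until the code has run `threshold` times, then switch over.
function makeHybrid(code, threshold) {
  let calls = 0;
  let compiled = null;
  const fn = (x) => {
    if (compiled) return compiled(x);
    calls += 1;
    if (calls >= threshold) compiled = compileHot(code);
    return runBytecode(code, x);
  };
  fn.isCompiled = () => compiled !== null;
  return fn;
}

const hybridFn = makeHybrid(squarePlusOne, 3);
const results = [1, 2, 3, 4].map((x) => hybridFn(x));
console.log(results.join(","));     // prints 2,5,10,17
console.log(hybridFn.isCompiled()); // prints true
```

The first three calls go through the interpreter; the fourth uses the compiled function. Only code that proves to be executed often pays the cost of compilation.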

The bytecode interpreter, hybrid model, etc, offer advantages such as simple implementation and excellent portability. As long as the source code for the engine itself can be compiled, the bytecode can be run on any central processing unit (CPU) architecture, which is exactly why the scheme is called a "virtual machine". Even in the hybrid model, which generates machine code, development can begin by writing the bytecode interpreter, and then implementing the machine language generator. Because the bytecode is simple, it is much easier to optimize the output when the machine code is generated.

V8 does not convert the source into this intermediate language, instead generating machine language directly from the abstract syntax tree generated by the JavaScript parser, and executing it. There is no virtual machine, and because no intermediate representation is needed, program execution begins much sooner. On the flip side, however, it loses out on the benefits of the virtual machine, such as high portability and simple optimization, gained through the bytecode interpreter, hybrid model, etc.
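For contrast with the bytecode approach, the direct route can be sketched as follows. This is only an analogy: V8 emits real machine instructions, while here nested JavaScript closures stand in for the generated code. The syntax tree shape and the name `compileDirect` are hypothetical.

```javascript
// Hypothetical syntax tree nodes: { type: "num", value }, { type: "arg" },
// and { type: "add" | "mul", left, right }.

// Walks the tree once and returns directly executable code,
// with no bytecode and no interpreter loop in between.
function compileDirect(node) {
  switch (node.type) {
    case "num": {
      const v = node.value;
      return () => v;
    }
    case "arg":
      return (x) => x;
    case "add": {
      const l = compileDirect(node.left);
      const r = compileDirect(node.right);
      return (x) => l(x) + r(x);
    }
    case "mul": {
      const l = compileDirect(node.left);
      const r = compileDirect(node.right);
      return (x) => l(x) * r(x);
    }
    default:
      throw new Error("unknown node type " + node.type);
  }
}

// x * 2 + 1
const doubledPlusOneAst = {
  type: "add",
  left: {
    type: "mul",
    left: { type: "arg" },
    right: { type: "num", value: 2 },
  },
  right: { type: "num", value: 1 },
};

const doubledPlusOne = compileDirect(doubledPlusOneAst);
console.log(doubledPlusOne(5)); // prints 11
```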

To be continued
