

Understanding the performance difference between dynamic and weak typing

It seems like every few months someone throws out an argument such as "Dynamically typed languages can never be as fast as statically typed languages." This particular statement came from an otherwise well-written rant by one of the core Mono developers:

even if you implement the same VM technology in something like Parrot as you have in Mono, dynamic typing will always be slower;


While in practice I do not disagree with this assertion, I think it is dangerous if left unqualified, so I thought I'd clear up some "static vs. dynamic" misconceptions. The main obstacle to building a fast runtime is not dynamic typing; it is largely a matter of weak versus strong typing.

Dynamic languages can have strong typing; Common Lisp and Python are fairly good examples of this. However, dynamic languages are more frequently associated with weak typing (à la JavaScript or PHP).

In a strongly typed language the type may still be dynamic, but the runtime enforces type rules. This means that the following Python code will throw an error:

  >>> 1 + "1"
  TypeError: unsupported operand type(s) for +: 'int' and 'str'

In a weakly typed language, instead of enforcing type rules, the runtime attempts to find a common denominator between the two types. For example, the same code run under JavaScript:

  >>> 1 + "1"
  "11"
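A few more coercions make the point (these can be run in any JavaScript engine):

```javascript
// Implicit coercion in JavaScript: '+' concatenates if either operand is
// a string, while '-' coerces strings to numbers.
console.log(1 + "1");   // "11"  (number coerced to string, then concatenated)
console.log("5" - 1);   // 4     (string coerced to number)
console.log(1 == "1");  // true  (loose equality coerces before comparing)
```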

It is this weak typing that the major performance issues stem from. To explain why, I'll need to explain how language virtual machines (VMs) work.

Every virtual machine worth its salt, performance-wise, uses a technique called Just-In-Time (JIT) compilation.

In very simple terms, when a VM executes a block of code, it first passes that code to a JIT compiler, whose job is to convert it into native, platform-specific machine code. A pointer to the newly compiled code is then stored so that the VM can invoke the native code directly the next time the function is called.
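The compile-once-then-cache step can be sketched in a toy model (purely illustrative, not a real VM; here `new Function` stands in for native code generation):

```javascript
// Toy model of a JIT cache: "compile" a snippet to a callable on first
// use, then reuse the compiled version on every later call.
const codeCache = new Map();

function execute(source, ...args) {
    let compiled = codeCache.get(source);
    if (!compiled) {
        // First call: compile (new Function stands in for real native
        // code generation) and remember the result.
        compiled = new Function("a", "b", source);
        codeCache.set(source, compiled);
    }
    return compiled(...args); // later calls skip compilation entirely
}

execute("return a + b;", 1, 2); // compiles, then runs → 3
execute("return a + b;", 3, 4); // cache hit, runs directly → 7
```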

A JITing VM is drastically faster than an interpreted VM for any operation that is repeated multiple times. However, weak typing throws a huge monkey wrench into this trick.

Take this C# example:

int addInts(int a, int b) {
    return a + b;
}

The JIT will see that both "a" and "b" are integer types and will compile that function body down to a single x86 ADD instruction. However, the equivalent JavaScript is not so lucky:

function add(a, b) {
     return a + b;
}

In this situation, the types of "a" and "b" are not known until run time. A naive runtime has no idea which add operator to invoke, so what ends up generated is something along these lines:

function genericAdd(a, b) {
    if (typeof a === "string" && typeof b === "string")
        return addStrings(a, b);
    else if (typeof a === "number" && typeof b === "number")
        return addNumbers(a, b);
    else if (typeof a === "string" || typeof b === "string")
        return addStrings(String(a), String(b));
    // ... plus branches for booleans, objects, null, undefined, and so on.
}

The code emitted for this function ends up being 99% type checks and 1% actual work: hundreds of CPU operations compared to the single instruction the C# version needed.

Because of this overhead, even with a JIT compiler the emitted code is so bloated with type checks that you will never match the performance of a strongly typed language, and this is the basis of the quote I mentioned above.

However, it is still possible to get reasonable performance out of a weakly typed language, thanks to exceedingly clever VM designs.

In a weakly typed language it is impossible to know the types of variables ahead of time. However, it is possible to observe the types of variables after the first execution of a function.

Any computer program, even a fancy object-oriented or functional one, breaks down into a sequence of instructions that the VM/CPU executes. Regardless of the number of branches in the code, execution only goes down one path at a time.

A tracing VM observes these execution paths and builds what are known as "traces": essentially records of the variable types and execution flow discovered during each run.

It is easiest to understand using code:

var a = 5;
for (var b = 0; b < 100; b++) {
    a += b;
}

In this JavaScript snippet, the JIT code produced for the loop body will initially be the verbose type-checking code shown above; at the same time, however, the tracing engine will note that both "a" and "b" are integer types.

On the second iteration of the loop the VM now has two options:

  1. Continue to run the loop contents, complete with the very verbose type checking code.
  2. Re-compile the inner contents of the loop into a single ADD instruction, much like a static compiler would.

The VM will look at how "hot" the code is to decide what action to take. If the code is going to be repeated beyond a certain threshold, the cost of recompilation is estimated to be less than the total execution cost, and the VM will recompile the code using its discovered knowledge of the types.

However, if the code is not very hot (i.e. the loop only runs a few iterations), then the VM may choose not to waste time recompiling and continue executing the inefficient code. This tuning is a fine balancing act.
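The hot-path idea can be caricatured in plain JavaScript (purely illustrative; real VMs specialize compiled traces, not closures, and the threshold value here is invented):

```javascript
// Toy hot-path specialization: after THRESHOLD calls with stable number
// arguments, install a guarded fast path; fall back if the guard fails.
function makeAdd() {
    const THRESHOLD = 10; // invented "hotness" threshold
    let numberCalls = 0;
    let fastPath = null;
    return function add(a, b) {
        const bothNumbers = typeof a === "number" && typeof b === "number";
        if (fastPath && bothNumbers) {
            return fastPath(a, b);      // guarded specialized path
        }
        if (bothNumbers && ++numberCalls >= THRESHOLD) {
            fastPath = (x, y) => x + y; // types look stable: specialize
        }
        // Generic slow path with full type dispatch.
        if (typeof a === "string" || typeof b === "string") {
            return String(a) + String(b);
        }
        return a + b;
    };
}
```

Note the guard in front of the fast path: if a string ever shows up, the specialized code is skipped and the generic path handles it, which is roughly how real tracing JITs bail out of an invalidated trace.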

Tracing cannot solve every problem. Real code is rarely as simple as that loop, and, much like function inlining, some things have too many external inputs for the VM to trace reliably.

For example, what if a third variable were introduced in the loop, receiving its value from a function call? If the VM cannot trace the function to ensure it always returns the same type, then it cannot perform any optimization on the loop.
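For instance, a hypothetical helper whose return type flips between calls makes the loop impossible to specialize:

```javascript
// Hypothetical type-unstable function: returns a number for even inputs
// and a string for odd ones, so no single trace covers the loop body.
function unstable(i) {
    return i % 2 === 0 ? i : String(i);
}

let a = 0;
for (let b = 0; b < 6; b++) {
    a += unstable(b); // '+' alternates between addition and concatenation
}
// Once the first string arrives, every later '+' is concatenation,
// so 'a' ends up a string.
```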

However, even with these limitations, tracing VMs are incredibly fast compared to standard naive runtimes. Google's V8 JavaScript engine and Mozilla's TraceMonkey are finally fast enough for JavaScript to be considered a worthy application language, even though they are still considerably slower than strongly typed languages.

To summarize: it is weak typing, not dynamic typing, that makes a language slower than a strongly typed one, and that unfortunately won't change until we invent quantum computers :-)

Posted by Jonathan Holland on 9/28/2009.

Tags: Javascript   Virtual-Machines   Compilers   Tracing   JIT

Comments:

I don't think your distinction between weak and strong typing matters very much for the performance of dynamic languages. Consider the expression "a+b". In both Python and JavaScript, the types of a and b solely determine what "a+b" means, and the types are not known until run time. The fact that JavaScript defines non-excepting behaviors for more combinations of built in types doesn't matter that much. A tracing JIT is equally capable on both, and a non-tracing JIT is equally paralyzed. A non-tracing JIT in both cases has the option to either switch among the possible types for a and b to find the right code to run, or have the a object contain a pointer to the function used to do addition.

The one example you provided where a non-tracing JIT would be able to directly implement a function in assembly was from a statically typed language.

Posted by Mike on 9/29/2009.

The picture for dynamic languages is improved if you look at Double Dispatch. Your large set of if-statements (n^2, where n is the size of the set of types that need to be handled) collapses to 2*n.

http://en.wikipedia.org/wiki/Double_dispatch

Posted by Arne Dehli Halvorsen on 9/29/2009.

Comments are closed on this post.