Last week, Scott Petersen from Adobe gave a talk at Mozilla on a toolchain he's been creating (soon to be open-sourced) that allows C code to be targeted to the Tamarin virtual machine. Aside from being a really interesting piece of technology, it has implications for the web that I found pretty impressive.

Before reading this post, readers who aren't familiar with Tamarin may want to read Frank Hecker's excellent "Adobe, Mozilla, and Tamarin" post from 2006 for some background on its goals and why it's relevant to Mozilla and the open-source community in general.

If I followed his presentation right, Petersen's toolchain works something like this:

1. A special version of the GNU C Compiler (possibly llvm-gcc) compiles C code into instructions for the Low Level Virtual Machine (LLVM).
2. The LLVM instructions are converted into opcodes for a custom virtual machine that runs in ActionScript, a variant of ECMAScript and sibling of JavaScript.
3. The ActionScript is automatically compiled into Tamarin bytecode by Adobe Flash, which may be further compiled into native machine language by Tamarin's Just-in-Time (JIT) compiler.
The toolchain includes lots of other details, such as a custom POSIX system call API and a C multimedia library that provides access to Flash. And there are some things that Petersen had to add to Tamarin, such as a native byte array that maps directly to RAM, thereby allowing the VM's emulation of memory to have only a minor overhead over the real thing.
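To make the byte-array idea a bit more concrete, here is a minimal sketch in Python of how a VM might emulate C's flat address space on top of a single contiguous byte array. This is purely my own illustration: the class name `FlatMemory` and its methods are hypothetical, and nothing here is taken from Petersen's toolchain or from Tamarin's actual API.

```python
import struct


class FlatMemory:
    """Hypothetical emulated C address space backed by one byte array."""

    def __init__(self, size: int) -> None:
        # The whole "RAM" of the emulated program is a single buffer.
        self.ram = bytearray(size)

    def store_i32(self, addr: int, value: int) -> None:
        # Write a little-endian signed 32-bit integer at an "address",
        # which is just an offset into the buffer.
        struct.pack_into("<i", self.ram, addr, value)

    def load_i32(self, addr: int) -> int:
        # Read the 32-bit integer back from the same offset.
        return struct.unpack_from("<i", self.ram, addr)[0]

    def store_bytes(self, addr: int, data: bytes) -> None:
        # Copy raw bytes into the buffer, e.g. a C string literal.
        self.ram[addr:addr + len(data)] = data

    def load_cstring(self, addr: int) -> bytes:
        # Read bytes up to (not including) the terminating NUL.
        end = self.ram.index(0, addr)
        return bytes(self.ram[addr:end])


mem = FlatMemory(64)
mem.store_i32(8, -42)            # roughly: *(int *)8 = -42;
mem.store_bytes(16, b"hi\x00")   # roughly: strcpy((char *)16, "hi");
print(mem.load_i32(8), mem.load_cstring(16))
```

The point of the sketch is that a pointer dereference becomes little more than an indexed read or write into one buffer, which is presumably why a natively backed byte array keeps the overhead low.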

The end result is the ability to run a wide variety of existing C code in Flash at acceptable speeds. Petersen demonstrated a version of Quake running in a Flash app, as well as a C-based Nintendo emulator running Zelda; both were eminently playable, and included sound effects and music.

So, once Petersen's modifications to Tamarin make their way into the next version of Adobe Flash, we can expect to see older commercial games running in the browser. Even more impressive, though, is the sheer volume of existing code that can be made to run inside the browser: Petersen showed us the C-compiled versions of Lua, Ruby, Perl, and Python all running on the web in secure Flash sandboxes.

What this means for Python

The potential implications this has for Python are particularly interesting to me. The ability to run Python on the web is exciting, to say the least; also interesting is the fact that by sandboxing CPython in a virtual machine, we solve a lot of the security issues that currently face the language when it comes to running untrusted code.

Petersen's work also resonates with a few goals of another project called PyPy. I'm going to try to explain the idea behind PyPy in a later post; for the time being, the slides from my April 2007 ChiPy presentation on PyPy may serve as a passable introduction.

In a nutshell, the difference in mindset between PyPy and Petersen's work is that the former is radically innovative in scope and mission, while the latter is pragmatic. PyPy's goal is essentially to move the canonical implementation of Python from C to Python itself, and then use a pluggable toolchain to translate the Python interpreter to any platform with a configurable set of language and implementation features. In one fell swoop, this modularizes the composition of the Python interpreter in such a way that innovating and maintaining different ports and variants of Python like IronPython, Jython, and Stackless no longer requires either writing an entire copy of the same interpreter in a different language or branching the CPython source code and making pervasive changes to it.

Rather than focusing on innovation, Petersen's work focuses on code reuse. Instead of moving a canonical interpreter implementation from C to a dynamic language, his strategy is to simply compile the existing C code to run in a virtual machine that's implemented in a dynamic language. Both approaches aim to obviate the necessity of ports of interpreters to different platforms, and as such their purposes intersect at a common subset of functionality. But Petersen's work can't be used to facilitate the innovation of the Python language and its implementation, while PyPy offers few or no tools to reuse existing non-Python code. Perhaps it's possible to combine the best of both worlds by taking PyPy's generated C interpreter and using Petersen's toolchain to allow it to be usable on the web and other places that Tamarin runs.
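For readers who haven't seen the "virtual machine implemented in a dynamic language" idea before, here is a toy sketch in Python of what such a thing looks like at its very simplest. The opcode names (`push`, `add`, `mul`) and the stack-based design are my own assumptions, chosen for brevity; LLVM itself is register-based, and I don't know what Petersen's actual opcode set looks like.

```python
from typing import List, Tuple

# A hypothetical instruction is a tuple: an opcode name plus its operands.
Instruction = Tuple


def run(program: List[Instruction]) -> List[int]:
    """Interpret a list of toy opcodes and return the final stack."""
    stack: List[int] = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# What a compiler might emit for the C expression (2 + 3) * 4.
PROGRAM = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
print(run(PROGRAM))
```

The host language only ever sees a loop over opcodes, so the compiled C program never touches the machine directly. As I understand it, that indirection is where both the portability and the sandboxing come from.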

What this means for the Open Web

To be honest, I'm not quite sure where the dividing line is between what of Petersen's work is Flash-specific and what can be reused to benefit the Open Web. Since ActionScript is a sibling language to JavaScript, it's possible that the custom VM he created can be run in a browser with relatively few modifications, albeit much more slowly in Firefox for the time being, since SpiderMonkey-Tamarin integration is not yet complete. Once that's further along, though, I imagine it should be possible to create C libraries that can be used in the toolchain to allow sandboxed C code to interact with web pages rather than Flash apps. Should this be feasible, I think it could be the ultimate in a relatively recent string of next-generation JavaScript virtual machines that allow existing code to run safely in browsers.

Also, in the context of the web, download size is a significant concern because applications are essentially streamed to clients. While Petersen's toolchain means that it's possible to instantly inherit most of CPython's benefits on the web, it also means that we get all of its flaws along with it, such as the fact that the standard CPython distribution is a few megabytes in size. But there are ways to get around this.

In any case, I'm really excited to see how both Petersen's work and PyPy proceed. I just hope I haven't misrepresented either one of them here due to a lack of understanding; I'll try to correct this blog post as I become aware of my mistakes.
