InvokeDynamic

From APIDesign

Revision as of 16:32, 12 September 2014 by JaroslavTulach

When I was younger I used to believe that having an invokeDynamic instruction in the HotSpot VM could be beneficial. I even argued that the instruction should not be used just for languages like Ruby, but also by core Java to implement lambdas. Now, after spending time implementing lambdas in my Bck2Brwsr VM and seeing things from the other side, I have to admit I was wrong. invokeDynamic is the wrong idea (especially for the implementation of lambdas).

Benefits

Implementing different languages on top of the HotSpot virtual machine varies in complexity. When John Rose pushed forward his invokeDynamic vision, he claimed that the most problematic part is dispatching method calls properly and efficiently. Not every language uses the Java rules. Some support type conversions or implicit arguments. Some can dynamically alter the existing dispatch targets or strategies. More about that can be found in the excellent summary Bytecodes meet Combinators. I really liked that paper and I continue to like it. It matches my functional heart: with MethodHandle, a method invocation is finally a first-class citizen in the VM. One can do currying & co. - all the goodies functional languages have had for ages.

What is a MethodHandle? A pointer to a method of some signature (for example, plus would take two ints and return their sum as an int) together with an object - a receiver to call the method on. However, this is nothing other than a closure.
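
To make the "method pointer plus receiver" idea concrete, here is a minimal sketch using the java.lang.invoke API. The class name MethodHandleSketch and the helpers plusViaHandle, addTen and greet are hypothetical names introduced only for this illustration; the lookup and transformation calls (findStatic, findVirtual, insertArguments, bindTo) are the standard JDK ones.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class, not part of any library.
final class MethodHandleSketch {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    private static final MethodType INT_INT_TO_INT =
            MethodType.methodType(int.class, int.class, int.class);

    // The "plus" from the text: takes two ints, returns their sum as an int.
    private static int plus(int a, int b) {
        return a + b;
    }

    // Look the method up by name and signature, then call it through the handle.
    static int plusViaHandle(int a, int b) {
        try {
            MethodHandle plus = LOOKUP.findStatic(MethodHandleSketch.class, "plus", INT_INT_TO_INT);
            return (int) plus.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Currying: fix the first argument to 10, leaving a handle of type (int)int.
    static int addTen(int x) {
        try {
            MethodHandle plus = LOOKUP.findStatic(MethodHandleSketch.class, "plus", INT_INT_TO_INT);
            MethodHandle addTen = MethodHandles.insertArguments(plus, 0, 10);
            return (int) addTen.invokeExact(x);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Binding a receiver: the handle now carries the object it will be called on,
    // which is what makes it behave like a closure.
    static String greet(String name) {
        try {
            MethodHandle concat = LOOKUP.findVirtual(String.class, "concat",
                    MethodType.methodType(String.class, String.class));
            MethodHandle greeter = concat.bindTo("Hello, ");
            return (String) greeter.invokeExact(name);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(plusViaHandle(2, 3)); // 5
        System.out.println(addTen(5));           // 15
        System.out.println(greet("VM"));         // Hello, VM
    }
}
```

Note that every lookup identifies its target by a string name and a MethodType; nothing in the handle itself is checked by the compiler.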

The proposed improvements to the HotSpot virtual machine may help the JDK to support different languages, but first and foremost they open the door to an efficient implementation of closures.

Getting Dynamic

The primary goal of John Rose was to support dynamic languages - i.e. languages where one knows (almost) no type information until the program actually runs. That means one can effectively type (in this JVM context: effectively generate bytecode) only once the actual types are known. To address all these "deferred" needs, the new invokeDynamic bytecode instruction was introduced. It does not hardcode the actual invocation. Instead, the first time it is executed, it calls back to let the "supervising" software (like your JRuby implementation) analyse the call and generate a sequence of MethodHandle transformations that efficiently matches the actual types of the arguments.
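
The callback mechanism described above can be sketched in plain Java. Java source cannot emit an arbitrary invokeDynamic instruction, so the sketch below calls the bootstrap method by hand and uses dynamicInvoker() in place of the real call site; that substitution is an assumption of this example. The class BootstrapSketch, the operation name "describe" and the two describeInt/describeText implementations are hypothetical stand-ins for what a language runtime would provide. Only Integer and String arguments are handled.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical demo class, not part of any library.
final class BootstrapSketch {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    // Two hypothetical type-specific implementations of one dynamic operation.
    private static String describeInt(Integer i) { return "int:" + i; }
    private static String describeText(String s) { return "text:" + s; }

    // Usual bootstrap shape: (Lookup, String, MethodType) -> CallSite.
    // The site starts out pointing at a fallback that links it lazily.
    static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type) {
        MutableCallSite site = new MutableCallSite(type);
        site.setTarget(fallbackFor(site));
        return site;
    }

    private static MethodHandle fallbackFor(MutableCallSite site) {
        try {
            MethodHandle relink = LOOKUP.findStatic(BootstrapSketch.class, "relink",
                    MethodType.methodType(Object.class, MutableCallSite.class, Object.class));
            return relink.bindTo(site).asType(site.type());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs when the site sees a type it is not linked for: inspect the actual
    // argument, pick an implementation, and install it behind a type guard.
    private static Object relink(MutableCallSite site, Object arg) throws Throwable {
        boolean isInt = arg instanceof Integer;
        Class<?> argType = isInt ? Integer.class : String.class;
        MethodHandle impl = LOOKUP.findStatic(BootstrapSketch.class,
                isInt ? "describeInt" : "describeText",
                MethodType.methodType(String.class, argType)).asType(site.type());
        MethodHandle typeCheck = LOOKUP.findVirtual(Class.class, "isInstance",
                MethodType.methodType(boolean.class, Object.class)).bindTo(argType);
        site.setTarget(MethodHandles.guardWithTest(typeCheck, impl, fallbackFor(site)));
        return impl.invoke(arg);
    }

    // Stands in for the invokeDynamic instruction in the caller's bytecode.
    private static final MethodHandle DESCRIBE = bootstrap(LOOKUP, "describe",
            MethodType.methodType(Object.class, Object.class)).dynamicInvoker();

    static Object describe(Object arg) {
        try {
            return DESCRIBE.invoke(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(42));      // int:42
        System.out.println(describe("hello")); // text:hello
    }
}
```

The point to notice is that the decision about which method to call is made at run time, from the observed argument type, and the call site is rewritten accordingly.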

Drawbacks

The major problem with invokeDynamic is, well, that it is dynamic! Java is a statically typed language, and all variable, field, method and parameter types are known to JavaC before it emits the bytecode. Yet (as JavaC from JDK8 emulates lambdas with invokeDynamic) it forgets all the derived type information and generates invokeDynamic - which is supposed to do late binding, i.e. find out the right types at invocation time.

One of the key ideas that I had in mind when advocating the use of MethodHandles for the implementation of lambdas was a reduction in the size of the constant pool - you know, the list of referenced symbols like Ljava/lang/String; which generally need to be repeated in every Java class. If lambdas were simulated by inner classes, the total size of the constant pools might get enormous, because each inner class carries its own. With invokeDynamic I was hoping for the pools to be reduced to one shared pool per source file (with as many lambdas as needed).

However, the JDK8 lambdas generate inner classes behind the scenes and on the fly! So the main benefit is, in my opinion, gone.
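
The run-time class generation can be observed by performing the linkage step by hand. The sketch below calls LambdaMetafactory.metafactory, the bootstrap method that JavaC references from the invokeDynamic sites it emits for lambdas, with the same kind of arguments the VM would pass. The class LambdaLinkSketch and its plus method are hypothetical; plus stands in for the synthetic method JavaC would generate for a lambda body such as (a, b) -> a + b.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntBinaryOperator;

// Hypothetical demo class, not part of any library.
final class LambdaLinkSketch {
    // Stand-in for the synthetic method holding the lambda body.
    private static int plus(int a, int b) {
        return a + b;
    }

    // Does by hand what linking a lambda's invokeDynamic site does:
    // ask the metafactory for a call site that produces an IntBinaryOperator.
    static IntBinaryOperator linkPlus() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType signature = MethodType.methodType(int.class, int.class, int.class);
            MethodHandle body = lookup.findStatic(LambdaLinkSketch.class, "plus", signature);
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "applyAsInt",                                   // interface method to implement
                    MethodType.methodType(IntBinaryOperator.class), // factory type: () -> IntBinaryOperator
                    signature,                                      // erased interface method type
                    body,                                           // implementation to delegate to
                    signature);                                     // instantiated method type
            return (IntBinaryOperator) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        IntBinaryOperator op = linkPlus();
        System.out.println(op.applyAsInt(2, 3)); // 5
        // The implementing class did not exist in any .class file on disk;
        // its name is chosen by the VM and varies between JDK versions.
        System.out.println(op.getClass().getName());
    }
}
```

The object returned by the call site is an instance of a class that the metafactory created while the program was running, which is exactly the cost a restricted VM would like to avoid.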

The Problem

The unnecessary loss of types is problematic for VMs that are supposed to run in restricted environments - e.g. Bck2Brwsr or (as far as I have heard) Java ME 8. We are running in a restricted environment and cannot afford to spend its resources on generating new classes at run time. Just-in-time compilation may be too expensive there; it is easier to compile ahead of time.

Another issue is related to reflection. Method handles are (due to their dynamic nature) a specific form of reflection. While doing a method lookup, one identifies the desired method (or field getter, or setter) by name. One can reference public or private methods. It is not known in advance which methods will be requested - one needs to invoke the bootstrap method to find that out. As such, it is really hard to do compile-time optimizations (like shortening method names). Again, this is a problem for small, limited environments.

Summary

As a result, we have an implementation of lambdas that needlessly forgets the type information gained during compilation, re-creates it during each startup, and generates bytecode on the fly. It is even surprising that it performs acceptably.
