Closures

From APIDesign

Closures (nicknamed lambdas in JDK8) are a classical OOP approach to representing a block of code that can be invoked with a few parameters. Closures are a typical building block of Smalltalk systems. The origin of closures (called lambdas back then) can however be traced back to the prehistoric age of computer science. The λ-calculus in fact knows nothing other than invocable blocks of code.

Since their invention, closures have become an almost mandatory syntactic element for any new language. Not having them has become a faux pas. As the profound history of programming languages puts it: Java made them popular by not having them.

Looks like things are about to change. Sun recently announced its intention to implement closures for JDK7 (since corrected by Oracle to JDK8). To join the overwhelming ecstasy in the Java community I decided to write this page and provide a few insights from less conventional angles.

The Da Vinci Closures

Everyone who has read or seen The Da Vinci Code knows that public statements of those who rule the world need to be taken with a grain of salt. Public speech is there to influence public behaviour, not to describe the real state of things. The real motivations need to remain hidden behind multiple meanings. The goal is to make the trustworthy untrustworthy and the untrustworthy real. Using irony and self-denying references is good. Outsiders who discover part of the truth can never be sure whether it is the grail or just a fake layer around it with a completely opposite meaning.

Only the paranoid survive. Especially when dealing with Da Vinci and things related to him. One needs to be ready to reveal hidden and surprising meanings. One needs to seek them out.

That is why I was quite surprised that the attempt to extend Java bytecode with invokeDynamic has been named the Da Vinci Machine project. What does that mean? Is the proclaimed goal of supporting multiple languages just a layered public statement, with the real goal somewhere else? What does Sun want the public to think? Why?

Now that Sun has adopted the idea of closures for JDK7, everything is clear. There was a hidden agenda behind the publicly stated goal. Let me take you through the plot!

Method Handle

When I was younger I believed that having an invokeDynamic instruction in the HotSpot VM could be beneficial. I even argued that the instruction should not be used just for languages like Ruby, but also by core Java to implement lambdas. Now, after spending time implementing lambdas in my Bck2Brwsr VM and seeing things from the other side, I have to admit I was wrong. invokeDynamic is the wrong idea (especially for the implementation of lambdas).

Benefits

Implementing different languages on top of the HotSpot virtual machine varies in complexity. The most problematic thing is to dispatch method calls properly and efficiently. Not every language uses the Java rules. Some support type conversions or implicit arguments. Some can dynamically alter the existing dispatch targets or strategies. More about that in the excellent summary Bytecodes meet Combinators.

To address all these different needs the new invokeDynamic bytecode instruction does not hardcode the actual invocation, but delegates it to software-controllable MethodHandles. What is a method handle? A pointer to a method of some signature (for example plus would take two ints and return their sum as an int) and an object - a receiver to call the method on. However this is nothing other than a closure.
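To make the "pointer to a method plus a receiver" idea concrete, here is a minimal sketch against the java.lang.invoke API as it eventually shipped. The class name PlusHandle and the helper sum are made up for illustration; only plus comes from the text above.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class PlusHandle {
    // The method the handle points to: takes two ints, returns their sum.
    int plus(int a, int b) {
        return a + b;
    }

    // Hypothetical helper: looks up plus, binds a receiver and invokes the result.
    static int sum(int a, int b) {
        try {
            // The looked-up handle has type (PlusHandle,int,int)int.
            MethodHandle plus = MethodHandles.lookup().findVirtual(
                PlusHandle.class, "plus",
                MethodType.methodType(int.class, int.class, int.class));
            // Binding the receiver leaves a handle of type (int,int)int.
            MethodHandle bound = plus.bindTo(new PlusHandle());
            return (int) bound.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The bound handle carries both the code to run and the object to run it on, which is what makes it behave like a closure.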

The proposed improvements to the HotSpot virtual machine may help the JDK to support different languages, but first and foremost they open the door to an efficient implementation of closures.


Drawbacks

The major problem with invokeDynamic is, well, that it is dynamic! Java is a statically typed language and all variable, field, method and parameter types are known to JavaC before it emits the bytecode. Yet (as JavaC from JDK8 implements lambdas with invokeDynamic) it forgets all the derived type information and generates invokeDynamic - which is supposed to do late binding, i.e. find out the right types at invocation time.

One of the key ideas that I had in mind when advocating the use of MethodHandles for the implementation of lambdas was a reduction in the size of the constant pool - you know, the list of referenced symbols like Ljava/lang/String which generally needs to be repeated in every Java class. If lambdas were simulated by inner classes, the constant pool might get enormous. With invokeDynamic I was hoping for the pool to be reduced to one shared pool for a single source file (with as many lambdas as needed).

However, the JDK8 lambdas generate inner classes behind the scenes and on the fly! So the main benefit is, in my opinion, gone.
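One way to observe this is to ask a lambda for its run-time class. The sketch below assumes a HotSpot-based JDK 8 or later; the class name LambdaShape and the helper describe are made up for illustration, and the exact name of the generated class is an implementation detail that differs between JDK versions.

```java
class LambdaShape {
    // JavaC compiles this lambda into an invokeDynamic call site.
    static Runnable lambda() {
        return () -> System.out.println("hello");
    }

    // Hypothetical helper: describes the class that backs the lambda at run time.
    static String describe() {
        Class<?> c = lambda().getClass();
        return c.getName() + " synthetic=" + c.isSynthetic();
    }
}
```

No class file for the reported class is produced by JavaC; the class comes into existence only when the call site is first linked.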

The Problem

The unnecessary loss of types is problematic for VMs that are supposed to run in restricted environments - e.g. Bck2Brwsr or (as far as I have heard) Java ME 8. When running in a restricted environment, we cannot spend resources on generating new classes. Just-in-time compilation may be too expensive; it is easier to compile ahead of time.

Another issue is related to reflection. Method handles are (due to their dynamic nature) a specific form of reflection. While doing method lookup one identifies the desired method (or field, or setter) by name. One can reference public or private methods. It is not known in advance which methods will be requested - one needs to invoke the bootstrap method to find that out. As such it is really hard to do compile-time optimizations (like shortening method names). Again a problem for small, limited environments.
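The by-name nature of the lookup can be illustrated with a small sketch. The class name LookupByName and the helper canFind are made up for illustration; the lookup calls are the standard java.lang.invoke ones.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LookupByName {
    // Hypothetical helper: true when String has a no-argument method
    // of the given name that returns int.
    static boolean canFind(String methodName) {
        try {
            MethodHandles.lookup().findVirtual(
                String.class, methodName, MethodType.methodType(int.class));
            return true;
        } catch (NoSuchMethodException | IllegalAccessException ex) {
            // The mistake is detected only here, at run time.
            return false;
        }
    }
}
```

Because the method name is an ordinary string, a tool that renames or removes methods ahead of time cannot easily tell which names will be asked for: canFind("length") succeeds, while a misspelled canFind("lenght") compiles just as well and fails only when executed.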

Summary

As a result we have an implementation of lambdas that needlessly forgets the type information gained during compilation, re-creates it during each startup, and generates bytecode on the fly. It is even surprising that it performs acceptably.

Closures as inner classes

The typical expectation for implementing closures (for example the 0.6a version) seems to envision a closure as an inner class with simplified syntax (if it says anything at all about how closures shall be implemented). This is indeed possible, yet inefficient. The overhead of defining a new (inner) class in Java is high. Each class occupies a single .class file and these files are self-contained. They contain not only their code, but also all their static linking information (e.g. the constant pool). This information gets copied with each inner class. Splitting one class into three does not keep the final size proportional to the original one. Imagine you want to rewrite the following code:

class SayHello {
  public void sayHello(String to) {
    String hello = "hello";
 
    System.out.println(hello);
    System.out.println(to);
  }
}

so that each of the printlns runs under some lock (let's assume there is some static method withLock(Runnable) and that we can use some form of closures):

class SayHelloSafely {
  public void sayHello(String to) {
    String hello = "hello";
 
    withLock(() -> System.out.println(hello));
    withLock(() -> System.out.println(to));
  }
}

Due to the power of closures this code is as simple as the original one (just the call to withLock is added, but that was the intended change to satisfy our goal). However, if we stick with the originally planned implementation of closures as inner classes, then the above code in fact means:

class SayHelloSafely {
  public void sayHello(final String to) {
    final String hello = "hello";
    withLock(new Runnable() { 
      public void run() {
        System.out.println(hello);
      }
    });
    withLock(new Runnable() { 
      public void run() {
        System.out.println(to);
      }
    });
  }
}

Even this simple example shows how inefficient a trivial implementation of closures can be. Instead of one class, we have three, each of them having significant overlaps in their constant pools. Given the expected proliferation of closure-based APIs (as they are easy to use, much easier than inner classes), this can lead to an enormous and unnecessary waste of memory. As one who watches over the performance of NetBeans I cannot silently let this happen.
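The "three instead of one" effect is easy to check. In the following sketch (the class name SayHelloCounted and its two factory methods are made up for illustration) each anonymous Runnable is compiled by JavaC into its own class, stored in its own .class file with its own constant pool.

```java
class SayHelloCounted {
    // Compiled into a separate class whose binary name ends with $1.
    static Runnable first(final String hello) {
        return new Runnable() {
            public void run() {
                System.out.println(hello);
            }
        };
    }

    // Compiled into another separate class whose binary name ends with $2.
    static Runnable second(final String to) {
        return new Runnable() {
            public void run() {
                System.out.println(to);
            }
        };
    }
}
```

Calling getClass().getName() on the two results reports two distinct classes next to SayHelloCounted itself.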

Closures and Method Handles

Thankfully there is a cure. It is possible to write a well-performing implementation of closures using invokeDynamic and its method handles. Imagine that the above code is rewritten to use method handles (and that the withLock method now takes a MethodHandle):

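import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SayHelloEffectively {
  private static MethodHandle first;
  private static MethodHandle second;
 
  // withLock(MethodHandle) is the assumed helper mentioned above
  public void sayHello(final String to) {
    // bind the receiver; the resulting handle takes no arguments
    MethodHandle addThis = MethodHandles.insertArguments(first, 0, this);
    withLock(addThis);
    // bind the receiver and the captured variable "to"
    MethodHandle applyToAndThis = 
      MethodHandles.insertArguments(second, 0, this, to);
    withLock(applyToAndThis);
  }
 
  private void firstRunnable() {
    final String hello = "hello";
    System.out.println(hello);
  }
  private void secondRunnable(String to) {
    System.out.println(to);
  }
 
  static {
    try {
      // the reflective lookup happens once, when the class is initialized
      MethodHandles.Lookup lookup = MethodHandles.lookup();
      first = lookup.findSpecial(
        SayHelloEffectively.class, "firstRunnable",
        MethodType.methodType(void.class), SayHelloEffectively.class
      );
      second = lookup.findSpecial(
        SayHelloEffectively.class, "secondRunnable",
        MethodType.methodType(void.class, String.class), SayHelloEffectively.class
      );
    } catch (ReflectiveOperationException ex) {
      throw new ExceptionInInitializerError(ex);
    }
  }
}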

Please accept my apology for the above use of the method handle API. It is just a sketch of the implementation. I have not found the javadoc to verify or even compile my code against it. Anyway, it is clear that this conversion of closures is very constant pool friendly. Regardless of the number of closures in a class, just one shared constant pool is used. This avoids useless duplication of its entries in the inner classes.

The method handle solution should also perform well. Method handle combinators are supposed to be efficient. The only slow operation is the reflective binding, but that happens just once, when the class is loaded (or the method is first used).

Also notice that this approach really supports closures. If a piece of code references some variable from an outer block, it is easy (because of method handle combinators) to pass such a variable into the closure method via an argument. Sort of like partially applied functions in high-level languages.
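The capturing step on its own might look like the following minimal sketch. The names Capture, greet and demo are made up for illustration; MethodHandles.insertArguments is the combinator from java.lang.invoke that does the partial application.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class Capture {
    // Stands in for a closure body that uses the outer variable "hello".
    static String greet(String hello, String to) {
        return hello + " " + to;
    }

    // Hypothetical helper: captures "hello" first, supplies "to" later.
    static String demo(String hello, String to) {
        try {
            MethodHandle greet = MethodHandles.lookup().findStatic(
                Capture.class, "greet",
                MethodType.methodType(String.class, String.class, String.class));
            // Partially apply: the result has type (String)String.
            MethodHandle partial = MethodHandles.insertArguments(greet, 0, hello);
            return (String) partial.invokeExact(to);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The captured value travels inside the handle, so no extra class is needed to hold it.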

Also, in all the closures-for-Java specifications, this is treated differently than in inner classes. It is supposed to mean the outer class, which would require certain compiler transformations if closures were implemented as hidden inner classes. If closures are implemented as method handles, the meaning of this naturally stays the same, as the methods are really methods of the proper class.

Just Implementation

The above example shows only the internals of the closures implementation. It does not prescribe at all what the actual closures syntax is going to look like. It does not matter. If closures can be expressed as anonymous inner classes, it will be possible to express them via method handles too.

In particular, the following is unaffected: the classical closures proposals usually operate with a closure conversion, i.e. a closure can be converted into an existing interface. So one can call:

void sayHello(java.util.concurrent.Executor ex) {
    ex.execute(() -> System.out.println("hello"));
}

This still remains possible. It is just necessary to know how to convert a MethodHandle into a Runnable. But one factory method can indeed do it (including the arity and parameter type checks). So the expressiveness of closures is in no way affected by using method handles.
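In the java.lang.invoke API as it eventually shipped, such a factory method exists: MethodHandleProxies.asInterfaceInstance. The sketch below shows one way to use it; the class HandleToRunnable, its LOG field and the hello method are made up for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleProxies;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.Executor;

class HandleToRunnable {
    // Records what the closure body did, so the effect can be inspected.
    static final StringBuilder LOG = new StringBuilder();

    // Stands in for the closure body.
    static void hello() {
        LOG.append("hello");
    }

    // Wraps a ()void method handle into a Runnable.
    static Runnable asRunnable() {
        try {
            MethodHandle h = MethodHandles.lookup().findStatic(
                HandleToRunnable.class, "hello", MethodType.methodType(void.class));
            return MethodHandleProxies.asInterfaceInstance(Runnable.class, h);
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    static void sayHello(Executor ex) {
        ex.execute(asRunnable());
    }
}
```

Passing a same-thread executor, for example sayHello(Runnable::run), runs the handle immediately through the Runnable interface.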

Declination

At the beginning I accused Sun of having a hidden agenda. I did not mean it. Given all the problems Sun's JDK team had keeping even to simple plans after open sourcing the JDK, I cannot imagine how it could execute something so complex. Pretending that invokeDynamic is for other languages on top of the HotSpot virtual machine and doing all of this just because of closures is too complex to be real. Also, when I explained the usefulness of invokeDynamic for closures to Peter von Ahe a few years ago (when he was working for Sun and responsible for the JavaC compiler), he was pleasantly surprised and slightly interested in the topic. I do not think the compiler team knew about the emerging effects of these two parallel efforts. Anyway, this is not that interesting.

What is important is that there is an efficient way to implement closures for Java. It is also good that most of the work (on runtime performance) has already been done by the Da Vinci Machine project. And last, but not least - it is very good that the same infrastructure used by non-Java languages on top of the JVM will now be shared by core Java. This will make it a primary focus for wider groups of engineers, making invokeDynamic and the method handle combinators more and more efficient in the future.
