
Modular Java SE

From APIDesign

Revision as of 21:24, 21 June 2009 by JaroslavTulach (Talk | contribs)

I like puzzles that tease my mind (a bit). Last week I was introduced to one: modularize the JDK (as described on Mark Reinhold's blog). This page captures my thoughts on the topic.

There can be many reasons for having a modular JDK, but to simplify things, let's stick with one: we want to reach a point in the future when it will be enough to download just a limited subset of the JDK to execute an applet, application, server, etc.


False Expectations

Sometimes people expect to get better performance just by modularizing their application. This is probably a false expectation, at least in the initial step.

By trivially splitting a JAR into ten smaller ones, one can only increase the amount of work done by the system. Instead of opening just one JAR and reading the list of its entries, one needs to do this operation ten times, and this is obviously slower, especially when operating system caches are empty (e.g. after boot). Also, the communication between the newly separated pieces of the application will require some overhead. Not much, but certainly something.

I faced this when I split the monolithic openide.jar, a JAR with the majority of the NetBeans APIs, back in 2005 (see the project page for more details). When I divided the big JAR into fifteen smaller ones, the start time of the NetBeans IDE increased by 5%. I spent the whole release looking for the reasons for the slowdown and managed to eliminate it somehow, but the proper fix was still waiting to be discovered.

These days the NetBeans IDE starts faster than it did before its modularization: we improved the infrastructure and made it more module friendly. We created various caches (for the content of the META-INF/MANIFEST.MF files of all modules, for META-INF/services, for classes loaded during start, for the layout of files on disk, NetBeansLayers, etc.) and these days (NetBeans IDE 6.5, or 6.7) we don't open the modularized JARs at all. Thus we have the deployment benefits claimed in the manifesto of modular programming, while during runtime the system behaves like a monolithic application.

Modularization really pays off (we can easily deprecate obsolete modules and make the system smaller), but it may take a little while. If you are seeking immediate improvements in terms of milliseconds spent loading a Hello World! application, you'd better refactor your code and classes. Modularization is not going to help you. Modularization is for those who seek long-term improvements in deployment and the ability to deprecate and more rapidly evolve the framework.
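The extra work of opening several JARs instead of one can be observed with a small experiment. The following is only a rough sketch, not the measurement done for NetBeans: the class JarSplitCost, its method names and the entry counts are all made up for illustration, and the timings it prints will vary with the machine and the state of the file system caches.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical experiment: one JAR with 1000 entries versus ten JARs with 100 entries each.
public class JarSplitCost {
    /** Writes a temporary JAR with the given number of empty entries (at least one). */
    static File writeJar(String prefix, int entries) {
        try {
            File file = File.createTempFile(prefix, ".jar");
            file.deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(file))) {
                for (int i = 0; i < entries; i++) {
                    out.putNextEntry(new JarEntry(prefix + "/Class" + i + ".class"));
                    out.closeEntry();
                }
            }
            return file;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Opens every JAR, walks its entry list and returns the total number of entries seen. */
    static int countEntries(File... jars) {
        int total = 0;
        for (File jar : jars) {
            try (JarFile jf = new JarFile(jar)) {
                total += Collections.list(jf.entries()).size();
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        File monolith = writeJar("mono", 1000);
        File[] pieces = new File[10];
        for (int i = 0; i < pieces.length; i++) {
            pieces[i] = writeJar("piece" + i, 100);
        }

        long t0 = System.nanoTime();
        int inMonolith = countEntries(monolith);
        long t1 = System.nanoTime();
        int inPieces = countEntries(pieces);
        long t2 = System.nanoTime();

        System.out.println("one JAR:  " + inMonolith + " entries in " + (t1 - t0) / 1000 + " us");
        System.out.println("ten JARs: " + inPieces + " entries in " + (t2 - t1) / 1000 + " us");
    }
}
```

Both runs read the same number of entries; the difference is only in how many files have to be opened. A single run like this is not a serious benchmark (no warm-up, no cold-cache control), so treat the numbers as an illustration of where the cost comes from rather than as a measurement.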

Infrastructure

Motto: I don't like to do useless work. As a result, I always look for some test that will ensure my work really leads somewhere (like the guards described in section Path of a lost warrior in Chapter 11).

What is the foremost check to ensure your code is split into pieces? Well, each piece needs to compile separately. Thus, before modularization, I tweak the build infrastructure to make sure it really compiles the pieces and not everything at once.


To do this, you very likely don't want to meddle with the location of your sources in your version control system. This would be premature, as the final division of the units is not yet known and your version history would be full of useless moves and renames. Luckily, Ant offers a powerful enough way to define sets of files and feed them into the compiler.

The following part of build.xml defines three groups of sources: applet, beans and the rest, called base.

<!-- this is the core of the separation - definition
      of what classes belong into what compilation group.
    -->
    <selector id="applet">
        <or>
            <filename name="java/beans/AppletInitializer*"/>
            <filename name="java/applet/**"/>
            <filename name="sun/applet/**"/>
            <filename name="META-INF/services/sun.beans.AppletProxy"/>
        </or>
    </selector>
    <selector id="beans">
        <and>
            <or>
                <filename name="java/beans/**"/>
                <filename name="sun/beans/**"/>
                <filename name="com/sun/beans/**"/>
            </or>
            <none>
                <selector refid="applet"/>
            </none>
        </and>
    </selector>
 
    <selector id="base">
        <none>
            <selector refid="applet"/>
            <selector refid="beans"/>
        </none>
    </selector>

Please note that the selectors refer to each other. The beans group explicitly says it wants nothing from the applet group, and the base group is defined solely as everything not included in the previous groups.

Then you need to run the Java compiler on each of these groups. An important step is to disable the search functionality of javac. By default the compiler looks for additional classes in the sourcepath and loads them as necessary. This needs to be prevented, as it might accidentally pull in classes from some other group of sources. To do this, use the sourcepath="" parameter:
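The way the three selectors depend on each other can also be read as plain boolean predicates. The sketch below is only an illustration of that logic in Java; the class SourceGroups and its methods are hypothetical, and the string checks merely approximate Ant's pattern matching (a trailing ** is treated as "anything under this directory", a trailing * as "any file name with this prefix in this directory").

```java
// Hypothetical illustration of the applet/beans/base selectors as predicates.
public class SourceGroups {
    /** Mirrors the <or> of the four filename patterns in the applet selector. */
    static boolean isApplet(String path) {
        return matchesFilePrefix(path, "java/beans/AppletInitializer")
            || path.startsWith("java/applet/")
            || path.startsWith("sun/applet/")
            || path.equals("META-INF/services/sun.beans.AppletProxy");
    }

    /** Mirrors <and><or>...</or><none>applet</none></and>: beans packages minus applet files. */
    static boolean isBeans(String path) {
        boolean inBeansPackages = path.startsWith("java/beans/")
            || path.startsWith("sun/beans/")
            || path.startsWith("com/sun/beans/");
        return inBeansPackages && !isApplet(path);
    }

    /** Mirrors <none>applet, beans</none>: everything the other two groups do not claim. */
    static boolean isBase(String path) {
        return !isApplet(path) && !isBeans(path);
    }

    /** Names the single group a source path falls into. */
    static String groupOf(String path) {
        if (isApplet(path)) {
            return "applet";
        }
        return isBeans(path) ? "beans" : "base";
    }

    /** Approximates a trailing '*': the prefix must match and no further directory may follow. */
    private static boolean matchesFilePrefix(String path, String prefix) {
        return path.startsWith(prefix) && path.indexOf('/', prefix.length()) < 0;
    }

    public static void main(String[] args) {
        String[] samples = {
            "java/beans/AppletInitializer.java",
            "java/beans/Beans.java",
            "java/applet/Applet.java",
            "java/lang/Object.java"
        };
        for (String path : samples) {
            System.out.println(path + " -> " + groupOf(path));
        }
    }
}
```

Because beans excludes applet and base excludes both, every source file lands in exactly one group, which is what makes it possible to compile the groups separately.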

<javac
  bootclasspath="${build.dir}/base.jar"
  sourcepath=""
  destdir="${build.dir}/classes/${module}"
  classpath="${module.cp}"
  includejavaruntime="false"
  includeantruntime="false"
>
  <src refid="src.path"/>
  <selector refid="${module}"/>
</javac>

With infrastructure like this one, you can start splitting your project apart.

Hudson Builder

There is a Hudson job to build the system, which basically does the following:

# to build the system do:
# 1. get the OpenJDK tree and build it all
      hg fclone http://hg.openjdk.java.net/jdk7/jdk7
# 2. point the jdk subtree at our repository by setting, in .hg/hgrc:
#      default = http://source.apidesign.org/hg/jdk/
      cd jdk
      vi .hg/hgrc
# 3. update to new version
      hg pull -u
# 4. build it
      ANT_OPTS=-mx900M ant clean all

Feel free to repeat the build on your computers as well.

java.applet and java.beans

The biggest obstacle preventing the creation of limited parts of the JDK that really work is defining such limited pieces, making them independent, and yet keeping binary compatibility for the whole Java SE. Let's look at one such problem and seek a solution.

Obviously you may be interested in using the JavaBeans specification without wanting to know anything about applets. Or you may want to use applets and not care about the introspection and BeanInfos provided by JavaBeans. Is this possible?

Well, there is a problem: the java.beans.AppletInitializer interface. It resides in beans, yet its signature contains java.applet.Applet. This means that the java.beans package has a compile-time dependency on java.applet. Does that mean that whoever uses the JavaBeans module in the future needs to install the applet module as well?

No. I have a solution: let's use CodeInjection! Let's change the Beans code to not talk directly to Applet, but rather create a code slot that can be filled by the applet module. Here is the diff against our openjdk repository:

The diff is here.

The idea is that when the applet module is not installed, there is no AppletProxy provider, meaning that the application does not reference any types in the applet module. When the applet module is installed, it installs the provider and updates META-INF/services/sun.beans.AppletProxy, and thereafter the service loader will find it.
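The lookup side of such a code slot might look roughly like the sketch below. This is not the code from the diff: the interface AppletSlot, its method and the class AppletSlotSketch are hypothetical stand-ins for sun.beans.AppletProxy, chosen so that the example compiles on its own. Only java.util.ServiceLoader is the real JDK API.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Hypothetical sketch of a code slot that an optional module can fill.
public class AppletSlotSketch {
    /**
     * Stand-in for the slot interface. It uses only Object in its signature,
     * so the beans side never mentions a type from the applet module.
     */
    public interface AppletSlot {
        boolean initialize(Object bean, Object initializer);
    }

    /** Returns the first registered provider, or null when no module has filled the slot. */
    static AppletSlot findSlot() {
        Iterator<AppletSlot> providers = ServiceLoader.load(AppletSlot.class).iterator();
        return providers.hasNext() ? providers.next() : null;
    }

    /** Describes what the beans side would see at runtime. */
    static String describe() {
        AppletSlot slot = findSlot();
        return slot == null
            ? "no applet support installed"
            : "applet support: " + slot.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

With no provider registered under META-INF/services, the loader finds nothing and the beans side simply carries on without applet support. A module that wants to fill the slot would ship an implementation together with a META-INF/services file named after the interface's binary name (here AppletSlotSketch$AppletSlot) listing the implementing class.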


So things are looking good, with just one problem: there is a static method in the Beans class that takes an AppletInitializer parameter. Right now it is commented out, but for the sake of BackwardCompatibility I need to bring it back. Another puzzle! What shall I do now?
