
Modular Java SE

From APIDesign


I like puzzles that tease my mind (a bit). Last week I was introduced to one: modularize the JDK (as described on Mark Reinhold's blog). This page captures my thoughts and experiments on this topic. I don't know how much my work influenced the actual Jigsaw work, but I know that at least Mandy has seen my results.

There can be many reasons for having a modular JDK, but to simplify things, let's stick with one: we want to reach a point in the future when it will be enough to download just a limited subset of the JDK to execute an applet, application, server, etc.


False Expectations

Sometimes people expect to get better performance just by modularizing their application. This is probably a false expectation, at least in the initial step.

By trivially splitting a JAR into ten smaller ones, one can only increase the amount of work done by the system. Instead of opening just one JAR and reading the list of its entries, one needs to do this operation ten times, and this is obviously slower, especially when the operating system caches are empty (e.g. after boot). Also, the communication between the newly separated pieces of the application will require some overhead. Not much, but certainly something.
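The extra work can be observed with a small experiment. The following is only a sketch: the class name `JarSplitCost`, the entry names and the counts are made up for illustration, and because the JARs are freshly written the operating system caches are warm, so the measured difference will be much smaller than after a boot. It writes the same number of entries once into a single JAR and once into fifteen JARs, then times how long it takes to open each set and read the entry lists.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

final class JarSplitCost {
    /** Creates a scratch directory for the generated JARs. */
    static Path tempDir() {
        try {
            return Files.createTempDirectory("jar-split");
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Spreads totalEntries one-byte entries evenly over jarCount JAR files. */
    static List<Path> makeJars(Path dir, String prefix, int jarCount, int totalEntries) {
        List<Path> jars = new ArrayList<>();
        int perJar = totalEntries / jarCount;
        for (int j = 0; j < jarCount; j++) {
            Path jar = dir.resolve(prefix + j + ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                for (int i = 0; i < perJar; i++) {
                    jos.putNextEntry(new JarEntry("demo/C" + (j * perJar + i) + ".class"));
                    jos.write(0);
                    jos.closeEntry();
                }
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
            jars.add(jar);
        }
        return jars;
    }

    /** Opens every JAR and reads its list of entries; returns the total entry count. */
    static int countEntries(List<Path> jars) {
        int total = 0;
        for (Path jar : jars) {
            try (JarFile jf = new JarFile(jar.toFile())) {
                total += (int) jf.stream().count();
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        }
        return total;
    }

    /** Measures one pass of opening and listing the given JARs. */
    static long timeNanos(List<Path> jars) {
        long start = System.nanoTime();
        countEntries(jars);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Path dir = tempDir();
        List<Path> mono = makeJars(dir, "mono", 1, 1500);
        List<Path> split = makeJars(dir, "split", 15, 1500);
        System.out.println("1 JAR:   " + timeNanos(mono) + " ns");
        System.out.println("15 JARs: " + timeNanos(split) + " ns");
    }
}
```

Both runs see the same entries; only the number of files that have to be opened differs. A single pass like this is not a rigorous benchmark, so treat the numbers as an indication only.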

I faced this when I split the monolithic openide.jar, a JAR with the majority of the NetBeans APIs, back in 2005 (see the Modularization of NetBeans Platform page for more details). When I divided the big JAR into fifteen smaller ones, the start time of the NetBeans IDE increased by 5%. I spent the whole release looking for the reasons for this slowdown and managed to eliminate it somehow, but the proper fix was still waiting to be discovered.

These days the NetBeans IDE starts faster than it did before its modularization: we improved the infrastructure and made it more module friendly. We created various caches (for the content of the META-INF/MANIFEST.MF files of all modules, for META-INF/services, for classes loaded during start, for the layout of files on disk, NetBeansLayers, etc.) and these days (NetBeans IDE 6.5 or 6.7) we don't open the modularized JARs at all during start. Thus we have the deployment benefits claimed in the manifesto of modular programming, while at runtime the system behaves like a monolithic application.
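To illustrate the idea behind such a cache (this is not the NetBeans implementation, just a minimal sketch with a hypothetical class `ManifestCache` and a made-up cache file format), the code below copies the main manifest attributes of all JARs into one properties file. When that file exists, it is read instead of opening any JAR. A real cache also has to notice when a JAR changes and throw the cached data away; that part is left out here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class ManifestCache {
    /** Creates a scratch directory for the demo files. */
    static Path tempDir() {
        try {
            return Files.createTempDirectory("manifest-cache");
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Demo helper: writes a JAR whose manifest carries one extra main attribute. */
    static Path writeJar(Path jar, String header, String value) {
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().putValue(header, value);
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, mf)) {
            // the manifest is the only content needed for this demo
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return jar;
    }

    /** Slow path: opens every JAR and collects its main manifest attributes. */
    static Properties scan(List<Path> jars) {
        Properties all = new Properties();
        for (Path jar : jars) {
            try (JarFile jf = new JarFile(jar.toFile())) {
                Manifest mf = jf.getManifest();
                if (mf == null) {
                    continue;
                }
                for (Map.Entry<Object, Object> e : mf.getMainAttributes().entrySet()) {
                    // key format "file.jar!Header" is an assumption of this sketch
                    all.setProperty(jar.getFileName() + "!" + e.getKey(), e.getValue().toString());
                }
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        }
        return all;
    }

    /** Fast path first: reads the cache file if present, otherwise scans and stores it. */
    static Properties load(List<Path> jars, Path cacheFile) {
        try {
            Properties result = new Properties();
            if (Files.exists(cacheFile)) {
                try (InputStream in = Files.newInputStream(cacheFile)) {
                    result.load(in);
                }
                return result;
            }
            result = scan(jars);
            try (OutputStream out = Files.newOutputStream(cacheFile)) {
                result.store(out, "cached manifest attributes");
            }
            return result;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Path dir = tempDir();
        Path jar = writeJar(dir.resolve("demo.jar"), "OpenIDE-Module", "org.example.demo");
        Path cache = dir.resolve("manifests.properties");
        System.out.println(load(List.of(jar), cache)); // first start: scans the JAR
        System.out.println(load(List.of(jar), cache)); // later starts: cache file only
    }
}
```

The second call to `load` never touches the JAR, which is the behaviour described above: the modules stay separate on disk, but the start sequence reads one file.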

Modularization really pays off (we can easily deprecate obsolete modules and make the system smaller), but it may take a little while. If you are seeking immediate improvements in terms of milliseconds spent while loading a Hello World! application, you'd better refactor your code and classes. Modularization is not going to help you. Modularization is for those who seek long-term improvements in deployment and the ability to deprecate and more rapidly evolve the framework.


Motto: I don't like to do useless work. As a result, I always look for some test that will ensure my work really leads somewhere (like the guards described in the section Path of a lost warrior in Chapter 11).

What is the foremost check to ensure your code is split into pieces? Well, each piece needs to compile separately. Thus, before modularization, I tweak the build infrastructure to make sure it really compiles the pieces and not everything at once.

To do this, you very likely don't want to meddle with the location of your sources in your version control system. This would be premature, as the final division of the units is not yet known and your version history would be full of useless moves and renames. Luckily, Ant offers a powerful enough way to define sets of files and feed them into the compiler.
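The principle of grouping files by name pattern, without moving them, can be sketched in plain Java. Everything here is an assumption made for illustration: the class `SourceGroups`, the simplified pattern syntax (`*` within one directory, `**` across directories) and the two `/**` patterns. Only the group names and the `java/beans/AppletInitializer*` pattern come from the build script discussed on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class SourceGroups {
    /** Ordered rules: the first group with a matching pattern wins. */
    private static final Map<String, List<Pattern>> RULES = new LinkedHashMap<>();
    static {
        RULES.put("applet", List.of(
            glob("java/applet/**"),                  // assumed pattern
            glob("java/beans/AppletInitializer*"))); // pattern from the build script
        RULES.put("beans", List.of(glob("java/beans/**"))); // assumed pattern
    }

    /** Translates a simplified Ant-like pattern into a regular expression. */
    static Pattern glob(String pattern) {
        StringBuilder re = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '*') {
                boolean twice = i + 1 < pattern.length() && pattern.charAt(i + 1) == '*';
                re.append(twice ? ".*" : "[^/]*");
                if (twice) {
                    i++;
                }
            } else {
                re.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(re.toString());
    }

    /** Returns the compilation group of a source path; unmatched files fall into "base". */
    static String groupOf(String relativePath) {
        for (Map.Entry<String, List<Pattern>> rule : RULES.entrySet()) {
            for (Pattern p : rule.getValue()) {
                if (p.matcher(relativePath).matches()) {
                    return rule.getKey();
                }
            }
        }
        return "base";
    }

    /** Splits a flat list of source paths into per-group lists, keeping input order. */
    static Map<String, List<String>> partition(List<String> paths) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String path : paths) {
            groups.computeIfAbsent(groupOf(path), k -> new ArrayList<>()).add(path);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(partition(List.of(
            "java/lang/String.java",
            "java/applet/Applet.java",
            "java/beans/AppletInitializer.java",
            "java/beans/PropertyChangeListener.java")));
    }
}
```

Each resulting list can then be handed to the compiler separately, with only the already compiled groups on its classpath. The sources stay where they are, so changing one's mind about the division means editing a pattern, not the version history.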

The following part of build.xml defines three groups of sources: applet, beans and the rest - called base.

<!-- this is the core of the separation: the definition
      of which classes belong to which compilation group.
    -->
    <selector id="applet">
            <filename name="java/beans/AppletInitializer*"