Modular Java SE

From APIDesign


Revision as of 16:29, 22 June 2009

I like puzzles that tease my mind (a bit). Last week I was introduced to one: modularize the JDK (as described at Mark Reinhold's blog). This page will capture my thoughts on this topic.

There can be many reasons for having a modular JDK, but to simplify things, let's stick with one: we want to reach a point in the future when it will be enough to download just a limited subset of the JDK to execute an applet, application, server, etc.

False Expectations

Sometimes people expect to get better performance just by modularizing their application. This is probably a false expectation, at least in the initial step.

By trivially splitting a JAR into ten smaller ones, one can only increase the amount of work done by the system. Instead of opening just one JAR and reading the list of its entries, one needs to do this operation ten times, and this is obviously slower, especially when operating system caches are empty (e.g. after boot). Also, the mutual communication between the newly separated pieces of the application will require some overhead. Not much, but certainly something.

I faced this when I split the monolithic openide.jar - a JAR with the majority of NetBeans APIs - back in 2005 (see the project page for more details). When I divided the big JAR into fifteen smaller ones, the start time of NetBeans IDE increased by 5%. I spent the whole release looking for the reasons for this slowdown and managed to eliminate it somehow, but the proper fix was still waiting to be discovered.

These days the NetBeans IDE starts faster than it did before its modularization - we improved the infrastructure and made it more module friendly. We created various caches (for the content of META-INF/MANIFEST.MF files of all modules, for META-INF/services, for classes loaded during start, for the layout of files on disk, NetBeansLayers, etc.) and these days (NetBeans IDE 6.5, or 6.7) we don't open the modularized JARs at all. Thus we have the deployment benefits claimed in the manifesto of modular programming, while during runtime the system behaves like a monolithic application.

Modularization really pays off (we can easily deprecate obsolete modules and make the system smaller), but it may take a little while. If you are seeking immediate improvements in terms of milliseconds spent while loading a Hello World! application, you'd better refactor your code and classes. Modularization is not going to help you. Modularization is for those who seek long term improvements in deployment, the ability to deprecate, and the ability to evolve the framework more rapidly.

Infrastructure

Motto: I don't like to do useless work. - As a result I always look for some test that will ensure my work really leads somewhere (like the guards described in section Path of a lost warrior in Chapter 11).

What is the foremost check to ensure your code is split into pieces? Well, each piece needs to compile separately. Thus, before modularization, I tweak the build infrastructure to make sure it really compiles the pieces and not everything at once.


To do this, you very likely don't want to meddle with the location of your sources in your version control system. This would be premature, as the final division of the units is not yet known and your version history would be full of useless moves and renames. Luckily Ant offers a powerful enough way to define sets of files and feed them into the compiler.

The following part of build.xml defines three groups of sources: applet, beans and the rest - called base.

<!-- this is the core of the separation - definition
      of what classes belong into what compilation group.
    -->
    <selector id="applet">
        <or>
            <filename name="java/beans/AppletInitializer*"/>
            <filename name="java/applet/**"/>
            <filename name="sun/applet/**"/>
            <filename name="META-INF/services/sun.beans.AppletProxy"/>
        </or>
    </selector>
    <selector id="beans">
        <and>
            <or>
                <filename name="java/beans/**"/>
                <filename name="sun/beans/**"/>
                <filename name="com/sun/beans/**"/>
            </or>
            <none>
                <selector refid="applet"/>
            </none>
        </and>
    </selector>
 
    <selector id="base">
        <none>
            <selector refid="applet"/>
            <selector refid="beans"/>
        </none>
    </selector>

Please note that the selectors refer to each other. The beans group explicitly says it wants nothing from the applet group, and the base group is defined solely as everything not included in the previous groups.

Then you need to run the Java compiler on each of these groups. An important step is to disable the search functionality of javac. By default the compiler looks for additional classes in the sourcepath and loads them as necessary. This needs to be prevented, as it might accidentally load classes from some other group of sources. To do this, use the sourcepath="" parameter:
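To make the partitioning rule concrete, here is a minimal sketch that models the three selectors as Java predicates over source-relative paths. Only the path patterns come from the build.xml above; the SourceGroups class, its groupOf method and the prefix matching (a rough approximation of Ant's pattern matching) are assumptions made for illustration.

```java
import java.util.function.Predicate;

// Hypothetical model of the three Ant selectors; not part of the real build.
final class SourceGroups {
    // applet: <or> of the four filename patterns (approximated by prefix checks)
    static final Predicate<String> APPLET = p ->
            p.startsWith("java/beans/AppletInitializer")
            || p.startsWith("java/applet/")
            || p.startsWith("sun/applet/")
            || p.equals("META-INF/services/sun.beans.AppletProxy");

    // beans: the beans packages <and> <none> of the applet group
    static final Predicate<String> BEANS =
            ((Predicate<String>) p ->
                    p.startsWith("java/beans/")
                    || p.startsWith("sun/beans/")
                    || p.startsWith("com/sun/beans/"))
            .and(APPLET.negate());

    // base: <none> of applet and beans, i.e. everything else
    static final Predicate<String> BASE = APPLET.or(BEANS).negate();

    // Because the groups exclude each other, every path lands in exactly one.
    static String groupOf(String path) {
        if (APPLET.test(path)) return "applet";
        if (BEANS.test(path)) return "beans";
        return "base";
    }

    public static void main(String[] args) {
        String[] samples = {
            "java/beans/AppletInitializer.java",
            "java/beans/Beans.java",
            "java/lang/String.java",
        };
        for (String path : samples) {
            System.out.println(path + " -> " + groupOf(path));
        }
    }
}
```

The point of the mutual references is that no file can be compiled into two groups: AppletInitializer sits in the java/beans directory, yet it is claimed by the applet group and therefore excluded from beans.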

<javac
  bootclasspath="${build.dir}/base.jar"
  sourcepath=""
  destdir="${build.dir}/classes/${module}"
  classpath="${module.cp}"
  includejavaruntime="false"
  includeantruntime="false"
>
  <src refid="src.path"/>
  <selector refid="${module}"/>
</javac>

With infrastructure like this one, you can start splitting your project apart.

Hudson Builder

There is a Hudson job that builds the system. It basically does the following:

# to build the system do:
# 1. get the OpenJDK tree and build it all
      hg fclone http://hg.openjdk.java.net/jdk7/jdk7
# 2. change the subtree repository to ours
#      default = http://source.apidesign.org/hg/jdk/ 
      cd jdk
      vi .hg/hgrc
# 3. update to new version
      hg pull -u
# 4. build it
      ANT_OPTS=-mx900M ant clean all

Feel free to repeat the build on your computers as well.

java.applet and java.beans

The biggest obstacle to creating limited parts of the JDK that really work is defining such limited pieces, making them independent, and yet keeping binary compatibility for the whole Java SE. Let's look at one such problem and seek a solution.

Obviously you may be interested in using the JavaBeans specification without wanting to know anything about applets. Or you may want to use applets and not care about the introspection and BeanInfos provided by JavaBeans. Is this possible?

Well, there is a problem: the java.beans.AppletInitializer interface. It resides in beans, yet its signature contains java.applet.Applet. This means that the java.beans package has a compile time dependency on java.applet. Does that mean whoever uses the JavaBeans module in the future needs to install the applet module as well?

No. I have a solution: Let's use CodeInjection! Let's change the Beans code to not talk directly to Applet, but rather create a code slot that can be filled by the applet module. Here is the diff against our openjdk repository:

The diff is here.

The idea is that when the applet module is not installed, there is no AppletProxy provider, meaning that the application does not reference any types in the applet module. When the applet module is installed, it registers its provider in META-INF/services/sun.beans.AppletProxy, and thereafter the service loader will find it.
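The lookup described above can be sketched with java.util.ServiceLoader. This is only an illustration under stated assumptions: the shape of the AppletProxy interface and the BeansSlot class are hypothetical, as the real sun.beans.AppletProxy signature is visible only in the linked diff.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Hypothetical slot interface; the real sun.beans.AppletProxy may look different.
interface AppletProxy {
    // Returns true if the proxy recognized and initialized the bean as an applet.
    boolean initialize(Object bean);
}

// Hypothetical stand-in for the beans-side code that consults the slot.
final class BeansSlot {
    // Returns the first registered provider, or null when no applet module
    // has registered one under META-INF/services.
    static AppletProxy findProxy() {
        Iterator<AppletProxy> providers =
                ServiceLoader.load(AppletProxy.class).iterator();
        return providers.hasNext() ? providers.next() : null;
    }

    static String instantiate(Object bean) {
        AppletProxy proxy = findProxy();
        if (proxy == null) {
            // No provider installed: no applet type is ever touched.
            return "plain bean";
        }
        return proxy.initialize(bean) ? "applet bean" : "plain bean";
    }

    public static void main(String[] args) {
        System.out.println(instantiate(new Object()));
    }
}
```

The beans side compiles against the slot interface only. Whether any applet code runs is decided at runtime by the presence of a provider registration, which is what lets the two modules be compiled and installed separately.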

Sneaking Simplicity In

So things are looking good, with just one problem: there is a static method in the Beans class that takes an AppletInitializer parameter. Right now it is commented out, but for the sake of BackwardCompatibility I need to bring it back. Another puzzle! What shall I do now?

Well, the basic trick is to sneak in simplicity. Of course simplicity can have various meanings, but in this context it means the number of outgoing dependencies. The Beans class is not simple, because it depends on beans as well as applet classes. If we can replace it with some other class that does not depend on applet, then we will simplify the API. Sometimes this is called conceptual surface - the amount of concepts one needs to understand when dealing with an API. By removing the need for users of a class to know anything about applet, we simplify its surface. Not only that, we also allow it to be compiled without applet being around (which is actually the most important goal when modularizing an API).

The only question is how to simplify the Beans class. Of course, the simplest way is to remove the one static method that references Applet - however this is horribly backward incompatible, and compatibility for existing clients is our highest priority. Thus the only compile time option is to deprecate the whole Beans class and replace it with some other, simplified one.

I did that by creating a new BeanFactory class that does not reference applet at all and otherwise contains the same methods as the Beans class.
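A rough sketch of that arrangement follows. All names and signatures here (AppletInitializerLike, BeanFactorySketch, LegacyBeans, and the simplified instantiate methods) are hypothetical stand-ins, not the real java.beans API; the actual change is in the linked diff.

```java
// Hypothetical stand-in for an applet-dependent type such as AppletInitializer.
interface AppletInitializerLike {
    void initialize(Object bean);
}

// The new, "simple" class: it references no applet type at all,
// so it can be compiled without the applet module.
final class BeanFactorySketch {
    static Object instantiate(String beanName) {
        return "bean:" + beanName; // placeholder for real instantiation
    }
}

// The old class, kept only for backward compatibility. It still carries
// the applet-dependent overload and delegates the rest to the new class.
@Deprecated
final class LegacyBeans {
    static Object instantiate(String beanName) {
        return BeanFactorySketch.instantiate(beanName);
    }

    static Object instantiate(String beanName, AppletInitializerLike initializer) {
        Object bean = BeanFactorySketch.instantiate(beanName);
        if (initializer != null) {
            initializer.initialize(bean);
        }
        return bean;
    }
}
```

Existing clients keep calling the deprecated class unchanged, while new code depends only on the factory and therefore only on the beans module.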

There Will Be Victims

When modularizing an API, get ready to have some victims - some classes that will need to be deprecated. Regardless of how well designed your API is, it will contain classes with not enough simplicity, like the Beans class above. Prepare for that and create a trash bin for them - a deprecated module.

The purpose of such a module is to keep BackwardCompatibility only. It will have dependencies on all other modules in your system, and as such it can contain classes that do not fit anywhere else. Users of the previous version of your API should see this module by default, so their previous dependencies are satisfied.

Users of your new version, on the other hand, need not care and should use other classes in properly modularized APIs that have a smaller conceptual surface and fewer compile time and runtime dependencies.

For those interested, here is the final diff of the java.beans and java.applet separation: read it all at http://source.apidesign.org/hg/jdk/rev/57914fd9382f
