APITypes
From APIDesign
Revision as of 05:55, 19 November 2008
As Chapter 3 of TheAPIBook, Determining What Makes a Good API, argues, there are many more types of API than just signatures of classes and methods. Some of them are already discussed in TheAPIBook itself, but as our understanding of this topic grows every day, this page collects additional and more recent observations about the nature of API types.
Compiler Optimizations
Do you remember the time when we were still coding in C++ and used real compilers, producing not just ByteCode but real machine code executed by the target CPU? I do remember, at least a bit. For a while during my university studies I was developing an implementation of an SQL database for Novell.
That was the time when compilers needed to perform optimizations. Those days are gone: JavaC just emits ByteCode, and only later, when the code is really executed, does the HotSpot virtual machine perform optimizations. DynamicCompilation makes this possible. JavaC does not need to optimize anything; everything can be done later, with much greater knowledge about the execution environment.
Yet in the old days, when compiler output was consumed directly by the hardware CPU and there was no chance to optimize anything later, everything had to be done during compilation. At that time various C++ compilers competed to produce the fastest, most optimized code. The competition must have been quite hard, as they often tried to optimize too much and sometimes even overoptimized. I remember that from time to time I was getting some mysterious error in my program that vanished as soon as I realized (usually after many hours of debugging) what the cause could be and disabled some optimization switches.
For a while I believed that problems of this kind cannot happen with JavaC; however, I was probably wrong. Recently I needed to prevent an object from being garbage collected and wrote the following code:
Code from CompilerSurprisesTest.java:
    public class CompilerSurprisesTest {
        Reference<String> cache;

        public String factory() {
            String value = new String("Can I disappear?");
            cache = new WeakReference<String>(value);
            return value;
        }

        @Test
        public void checkThatTheValueCanDisapper() {
            String retValue = factory();
            retValue = null;
            assertGC("Nobody holds the string value anymore." + "It can be GCed.", cache);
        }
    }

The assertGC method comes from our JUnit extension library called NbJUnit; it tries as hard as it can to remove the object pointed to by the reference from memory. In the previous code snippet it works fine. In the following code snippet the GC cannot succeed, as the local strong reference is not cleared:
Code from CompilerSurprisesTest.java:
    @Test
    public void obviouslyWithoutClearingTheReferenceItCannotBeGCed() {
        String retValue = factory();
        // commented out: retValue = null;
        assertNotGC("The reference is still on stack." + "It cannot be GCed.", cache);
    }

So far, so good. Both tests behave exactly as expected. This leads to the conclusion that if a variable defined in a method body holds a reference to your object, the object cannot be garbage collected until the method execution ends. OK, now guess: will the following test succeed or fail?
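For readers who do not know NbJUnit, the idea behind the assertGC and assertNotGC checks used in these two tests can be sketched roughly as follows. This is only an illustration, not the NbJUnit implementation: the class GcProbe and its method wasCollected are hypothetical names, and the real library works considerably harder to force a collection.

```java
import java.lang.ref.Reference;

// Hypothetical helper, not part of NbJUnit: a rough approximation of what an
// assertGC-style check has to do.
class GcProbe {
    /**
     * Requests garbage collection up to 'attempts' times and reports whether
     * the referent of 'ref' has been cleared. A 'false' result only means the
     * object was not collected during these attempts.
     */
    static boolean wasCollected(Reference<?> ref, int attempts) {
        for (int i = 0; i < attempts; i++) {
            if (ref.get() == null) {
                return true;
            }
            System.gc();                          // only a request, the VM may ignore it
            byte[] pressure = new byte[1 << 20];  // allocate some garbage to encourage a collection
            pressure[0] = 1;
        }
        return ref.get() == null;
    }
}
```

As long as some strong reference to the object is still in use, wasCollected has to return false; once no strong reference remains, it is likely, though not guaranteed, to return true after a few attempts.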
Code from CompilerSurprisesTest.java:
    boolean yes = true;

    @Test
    public void canItBeGCedSurprisingly() {
        String retValue;
        if (yes) {
            retValue = factory();
        }
        assertGC("Can be GCed, as retValue is not on stack!!!!", cache);
    }

To my great surprise the object can really be garbage collected, even though there is a local variable pointing to it! This is an example of a surprising (over)optimization by JavaC or HotSpot. It turns out that, in spite of being declared for the whole method, the variable is not used outside of the if block, and as such JavaC allocates its space on the stack only for the execution of the if branch. This is quite surprising behaviour. It is easy to fix, yet surprising:
Code from CompilerSurprisesTest.java:
    boolean ok = true;

    @Test
    public void canItBeGCedIfInitialized() {
        String retValue = null;
        if (ok) {
            retValue = factory();
        }
        assertNotGC("Cannot be GCed as retValue is not stack", cache);
    }

The fix is easy; however, the consequences of my finding are really horrible. It means that compiler optimizations are not as invisible as they should be. People can rely on them or be hurt by them. They can influence the predictability of our code, and they can make our code do something other than what the programmer would expect. This may be a flaw of the compiler or of the language design, yet NetBeans probably relies on the expected behaviour (e.g. that having an uninitialized local variable is enough to hold a reference while a method is being executed) quite a lot. We know that from time to time our tests fail unexpectedly and inexplicably, and it may be due to this randomness. Usually everything is OK, but from time to time, on machines with too powerful virtual machines, too many cores, too little memory, etc., the GC can kick in while the method is running and release the reference, causing our tests to fail because of an unexpected situation.
Liam noted in ljnelson's blog note that it is enough to make the variable final and the problem goes away. True, final helps; I have just tried that:
Code from CompilerSurprisesTest.java:
    @Test
    public void properUseOfFinalFixesTheProblem() {
        final String retValue;
        if (yes) {
            retValue = factory();
        } else {
            retValue = null;
        }
        assertNotGC("Cannot be GCed, now the retValue is on stack", cache);
    }

However, the same code without final works as well. It is enough to initialize the variable in both branches of the if statement to prevent garbage collection of the object referenced by the retValue variable:
Code from CompilerSurprisesTest.java:
    @Test
    public void properInitializationFixesTheProblem() {
        String retValue;
        if (yes) {
            retValue = factory();
        } else {
            retValue = null;
        }
        assertNotGC("Cannot be GCed, now the retValue is on stack", cache);
    }

This very likely means that the compiler puts the variable onto the stack of the topmost block where it is guaranteed to be fully initialized. That is why we need a hint to warn developers about declarations of non-primitive variables that are not fully initialized, as those can be a source of memory leaks.
I believe that the original motivation for compiler optimizations is to speed up program execution without affecting its behaviour. However, this is often just a distant dream: from time to time the optimizations change execution semantics, and as soon as that happens they become part of the API of our languages and their libraries!
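On newer Java versions there is also an explicit way to state that a local reference must stay alive up to a certain point, independent of how the variable is initialized. The sketch below assumes Java 9 or later, where java.lang.ref.Reference.reachabilityFence is available; the class KeepAlive and its method names are made up for this illustration and do not come from CompilerSurprisesTest.java.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Hypothetical example class, assuming Java 9+ for Reference.reachabilityFence.
class KeepAlive {
    static WeakReference<Object> cache;

    static Object factory() {
        Object value = new Object();
        cache = new WeakReference<>(value);
        return value;
    }

    /** Returns true if the cached object was still reachable after a GC request. */
    static boolean survivesGcWhileFenced() {
        Object retValue = factory();
        System.gc();                           // request a collection while retValue is in scope
        boolean alive = cache.get() != null;   // check whether the weak reference was cleared
        // Keeps the object strongly reachable until this point, so neither the
        // compiler nor the VM may treat retValue as dead before the check above.
        Reference.reachabilityFence(retValue);
        return alive;
    }
}
```

With the fence in place, the check does not depend on where the variable was declared or in how many branches it was assigned.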
Visual Aspects
The usual consensus is that visual aspects presented just to the end user are not part of the API of an application. This is usually well justified and correct, especially in a multi-platform framework like Java. Programmers who rely on some library to render a button 8px from the right border, with a certain text painted in a dedicated RGB color, could be successful with their application at one screen resolution, yet fail horribly on small monitors with limited gray scale. Common sense suggests that writing this kind of check goes against good habits of using APIs.
However, recently I had an opportunity to face this kind of rendering bug. The stylesheets of this website were reported to be broken on Firefox 3.0, while working fine on other browsers and older versions of Firefox itself. The text in the navigation and toolbox areas was supposed to be black on a yellow background, but for some reason Firefox 3.0 rendered it without the desired background. Some users reported that reading black text on a black background is not really pleasant.
I was not sure where the bug was, so I asked the Mozilla guys for help. To my surprise they reacted pretty quickly, verified that this is the behaviour of Opera and other browsers as well, and even suggested how to fix my CSS files. Thanks guys, my website is looking much better now. However, this leads me to two API observations:
- If I used an API in some version and it used to work, I consider it a bug if it does not work in new releases. I guess many programmers feel the same. In some situations this applies even to visual outcomes.
- Even rendering can sometimes become part of the API. Especially if you accidentally start to render black text on a black background, many people will complain about the behaviour of your rendering engine.
Still, I'd like to apologize to the Mozilla guys and thank them for the quick resolution and help. Solving incompatibilities between versions of a product is definitely much easier with the great support I got as part of issue 449911. Thanks.
--JaroslavTulach 13:55, 11 August 2008 (UTC)
Dependencies
Not many APIs can live alone, without support from other parts of the system. Like any library, an API has its own environment, which defines what needs to be available around it for the API to function properly. Each user of a shared library needs to recreate the proper environment, that is, to satisfy the library's dependencies, before the API can be used. This implies that the dependencies of a library form an important API of the shared library itself.
These dependencies may not even be visible in external signatures! They may only be needed internally, at runtime; still, changing them constitutes an API change. Imagine that users of your API are using some version of your library and it works fine with just the plain JDK. Suddenly, in a newer release, you decide to change the library internals and depend on some other library, for example Jakarta Commons. That immediately means every user who migrates to the new version of your library needs to include a version of Jakarta Commons in their own application as well. This may or may not be a problem, but it is quite an externally visible change.
As Chapter 3 defines APIs as everything that is externally visible, it makes sense to include shared library dependencies in the family of API types. Despite being a very hidden kind of API, it is in fact one of the highest-level kinds, and the API that we deal with the most during our day-to-day work.
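One way to make such a hidden runtime dependency visible is to check for it explicitly. The following is a minimal sketch under stated assumptions: DependencyCheck and isOnClasspath are hypothetical names, and org.apache.commons.lang.StringUtils is used only as an example of a class that a library might start to need internally.

```java
// Hypothetical helper that reports whether a class a library depends on
// internally can be found at runtime.
class DependencyCheck {
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class, but do not run its static initializers.
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Always available with the plain JDK.
        System.out.println(isOnClasspath("java.lang.String"));
        // Available only if the application also ships the extra library.
        System.out.println(isOnClasspath("org.apache.commons.lang.StringUtils"));
    }
}
```

An application that worked with the old release answers true only to the first check. After the library starts to depend on the second class internally, the same application fails at runtime until the extra JAR is added, although no public signature has changed.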
--JaroslavTulach 10:47, 16 June 2008 (UTC)
[Category:APITypes]