CompilerOptimizations

From APIDesign


JaroslavTulach (Talk | contribs)

Revision as of 15:58, 16 October 2008

Do you remember the time when we were still coding in C++ and used real compilers that produced not just ByteCode, but real machine code executed by the target CPU? I do remember, at least a bit: for a while I made my living developing an SQL database while attending my University.

That was the time when compilers needed to perform optimizations. Those days are gone now: JavaC just emits ByteCode, and only later, when the code is really executed, does the HotSpot virtual machine perform optimizations. DynamicCompilation makes this possible. JavaC does not need to optimize anything; everything can be done later, with much greater knowledge about the execution environment.

Yet in the old days, when compiler output was consumed directly by the hardware CPU, there was no chance to optimize something later; everything had to be done during compilation. At that time various C++ compilers competed to produce the fastest, most optimized code. The competition must have been quite hard, as everyone tried to optimize as much as possible and sometimes even overoptimized. I remember that from time to time I got some mysterious error in my program that vanished as soon as (usually after many hours of debugging) I realized what the cause could be and disabled some optimization switches.

For a while I believed that problems of this kind could not happen with JavaC; however, I was probably wrong. Recently I needed to prevent an object from being garbage collected and wrote the following code:

Code from CompilerSurprisesTest.java:

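import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import org.junit.Test;

// assertGC comes from the NbJUnit library described below; how it is
// brought into scope is not shown in this excerpt.
public class CompilerSurprisesTest {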
    Reference<String> cache;
 
    public String factory() {
        String value = new String("Can I disappear?");
        cache = new WeakReference<String>(value);
        return value;
    }
 
    @Test
    public void checkThatTheValueCanDisapper() {
        String retValue = factory();
        retValue = null;
        assertGC("Nobody holds the string value anymore. " +
                "It can be GCed.", cache);
    }
}
 

The assertGC method comes from our JUnit extension library called NbJUnit and tries as hard as it can to remove the object pointed to by the reference from memory. In the previous code it works fine. In the following snippet the object cannot be collected, because the local strong reference is not cleared, so the test asserts the opposite:

Code from CompilerSurprisesTest.java:

    @Test
    public void obviouslyWithoutClearingTheReferenceItCannotBeGCed() {
        String retValue = factory();
// commented out:        retValue = null;
        assertNotGC("The reference is still on stack. " +
                "It cannot be GCed.", cache);
    }
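
To give a feeling for what "tries as hard as it can" may mean, here is a minimal sketch of such a helper. It is not the NbJUnit implementation: the class GcProbe, the method tryToCollect and the strongHolder field are hypothetical names made up for this illustration, and the real assertGC may work quite differently. The sketch simply asks the VM to collect garbage several times, adds a bit of allocation pressure, and reports whether the reference was cleared.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the NbJUnit code.
class GcProbe {
    // A static field is a GC root, so whatever it points to stays reachable.
    static Object strongHolder;

    /**
     * Requests garbage collection up to the given number of times and
     * reports whether the referent of ref has been cleared.
     */
    static boolean tryToCollect(Reference<?> ref, int attempts) {
        List<byte[]> pressure = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            if (ref.get() == null) {
                return true; // the referent is gone
            }
            System.gc(); // only a hint; the VM is free to ignore it
            pressure.add(new byte[64 * 1024]); // modest allocation pressure
            try {
                Thread.sleep(10); // give a concurrent collector some time
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        strongHolder = new Object();
        Reference<Object> ref = new WeakReference<>(strongHolder);
        // Held by the static field, so this reports false.
        System.out.println("cleared while held: " + tryToCollect(ref, 5));
        strongHolder = null;
        // Now the object may be collected; the result depends on the VM.
        System.out.println("cleared after release: " + tryToCollect(ref, 50));
    }
}
```

Because System.gc() is only a request, a helper like this can prove that an object was collected, but a negative answer only means "not collected within the given number of attempts".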
 

So far, so good. This code behaves exactly as expected. It leads to the conclusion that if a variable defined in a method body holds a reference to your object, the object cannot be garbage collected until the method execution ends. OK, now guess: will the following test succeed or fail?

Code from CompilerSurprisesTest.java:

boolean yes = true;
@Test
public void canItBeGCedSurprisingly() {
    String retValue;
    if (yes) {
        retValue = factory();
    }
    assertGC("Can be GCed, as retValue is not on stack!!!!", cache);
}
 

To my great surprise, the referenced object really can be garbage collected, even though there is a local variable pointing to it! This is an example of a surprising (over)optimization by JavaC or HotSpot. It finds out that, in spite of being declared for the whole method, the variable is not used outside of the if block, and it allocates its space on the stack only for the execution of the if branch. This is quite surprising behaviour. Easy to fix, yet surprising:

Code from CompilerSurprisesTest.java:

boolean ok = true;
@Test
public void canItBeGCedIfInitialized() {
    String retValue = null;
    if (ok) {
        retValue = factory();
    }
    assertNotGC("Cannot be GCed as retValue is on stack", cache);
}
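
Initializing the variable worked in my test, but the Java Language Specification does not promise that a local variable keeps an object reachable until the method returns, so it is safer not to treat this as a guarantee. For readers on Java 9 or newer (which did not exist when this page was written), Reference.reachabilityFence states the intent explicitly. The following is only a sketch; the class KeepAliveSketch and the method survivesWithFence are hypothetical names used for illustration.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Hypothetical example class; requires Java 9 or newer.
class KeepAliveSketch {
    /** Returns true if the object was still present after GC was requested. */
    static boolean survivesWithFence() {
        Object value = new Object();
        WeakReference<Object> weak = new WeakReference<>(value);
        for (int i = 0; i < 5; i++) {
            System.gc(); // only a hint; the VM may or may not collect
        }
        boolean alive = weak.get() != null;
        // Keeps value strongly reachable at least up to this call,
        // no matter what liveness analysis concludes about the variable.
        Reference.reachabilityFence(value);
        return alive;
    }

    public static void main(String[] args) {
        // prints "alive before fence: true"
        System.out.println("alive before fence: " + survivesWithFence());
    }
}
```

The design choice here is to name the point up to which the object must stay alive, instead of relying on how a particular compiler or virtual machine treats a variable that is no longer read.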
 

The fix is easy; however, the consequences of my finding are really horrible. NetBeans may rely on the expected behaviour (i.e. that having an uninitialized local variable is enough) quite a lot. From time to time our tests fail, and it may be due to this randomness. Usually everything is OK, but occasionally, on machines with too powerful virtual machines, the GC can kick in while the method is running and release the reference, causing our tests to fail because of the unexpected situation.

Maybe we will need to perform a complete audit of the NetBeans sources to eliminate the use of uninitialized local variables. And all of this just because compiler optimizations seem to have become something that external API users can depend on. They seem to be part of the API of our libraries.
