Java Monitor

From APIDesign

(Difference between revisions)
Jump to: navigation, search
Current revision (08:45, 7 September 2009) (edit) (undo)
 
(3 intermediate revisions not shown.)
Line 1: Line 1:
-
The [[Java]] synchronization model with all '''synchronized''', '''notify''' and '''wait''' its is commonly thought of as an example of [[monitor]]s. However with closer look one finds out that it is far different to the model used in [[wikipedia::Concurrent_Pascal|Concurrent Pascal]]. It is not that robust, it is not well enough integrated to the language and sort of feels like an assembly language with synchronization primitives than high-level abstraction provided by real [[monitor]]. However, it has yet another hidden flaw, kind of example of [[Copy Based Design]] problem: the [[Java Monitor]]s implant the [[monitor]] into [[wikipedia::Object_oriented_language|object oriented language]], in an object oriented style and that does not work.
+
The [[Java]] synchronization model with all its '''synchronized''', '''notify''' and '''wait''' is commonly thought of as an example of [[monitor]]s. However with closer look one finds out that it is far different to the model used in [[wikipedia::Concurrent_Pascal|Concurrent Pascal]]. It is not that robust, it is not well enough integrated to the language and sort of feels like an assembly language with synchronization primitives than high-level abstraction provided by real [[monitor]]. However, it has yet another hidden flaw, kind of example of [[Copy Based Design]] problem: the [[Java Monitor]]s implant the [[monitor]] into [[wikipedia::Object_oriented_language|object oriented language]], in an object oriented style and that does not work.
Imagine there is an '''Cache''' support class in some [[API]]:
Imagine there is an '''Cache''' support class in some [[API]]:
Line 5: Line 5:
<source lang="java" snippet="monitor.pitfalls.Cache"/>
<source lang="java" snippet="monitor.pitfalls.Cache"/>
-
Its synchronization is sufficiently well designed. It never calls foreign code while holding own lock, which is one of necessary [[Deadlock Condition]]s. As such one would expect that there is no way to [[deadlock]] on the '''Cache'''s lock. However that would be false assumption. Simple call like:
+
Its synchronization is sufficiently well designed. It never calls foreign code while holding own lock, which is one of necessary [[Deadlock Condition]]s. As such one would expect that there is no way to [[deadlock]] on the '''Cache''''s own lock. However that would be false assumption. Simple call like:
<source lang="java" snippet="monitor.pitfalls.brokencall"/>
<source lang="java" snippet="monitor.pitfalls.brokencall"/>
Line 18: Line 18:
</source>
</source>
-
Moreover this all is achievable using the ''best'' [[Java]] synchronization practices. Just imagine there is a subclass of the '''Cache''' class which defines few properties and for better or worse guards them as [[Java Monitor]]:
+
Moreover this all is achievable using the ''best'' [[Java]] synchronization practices. How's that possible?
 +
 
 +
Just imagine there is a subclass of the '''Cache''' class which defines few properties and for better or worse guards them as [[Java Monitor]]:
<source lang="java" snippet="monitor.pitfalls.subclass"/>
<source lang="java" snippet="monitor.pitfalls.subclass"/>
-
This code is not bulletproof. It notifies its listeners while holding own lock and as such this can lead to deadlocks. However it is common that people do these sort of mistakes when using an [[API]]. The [[API]] shall be ready to handle [[cluelessness|clueless]] clients. It is acceptable to let the clients of an [[API]] deadlock on their own locks, yet there is no reason to block the [[API]] itself. Yet this is the case:
+
This code is not bulletproof. It notifies its listeners while holding own lock and as such this can lead to deadlocks. However it is common that people do these sort of mistakes when using an [[API]]. The [[API]] shall be ready to handle such [[cluelessness|clueless]] clients. It is acceptable to let them deadlock on their own locks, yet there is no reason to block the [[API]] itself. It shall continue to work for other clients if possible. Yet this is the case in this example:
<source lang="java" snippet="monitor.pitfalls.block.propertychange"/>
<source lang="java" snippet="monitor.pitfalls.block.propertychange"/>
-
Simple call to '''get("123")''' deadlocks. Why? Because the [[Java Monitor]]s implant the [[monitor]] concept into object oriented world and in the process of doing so break the most fundamental assumption: the [[monitor]] shall be encapsulated. One shall always define all internal data structures and operations working on them in one monitor. This is not the case of [[Java]]'s synchronized methods at all. By default, the '''Cache''' and its subclass '''MultiplyCache''' share the same [[monitor]] lock!
+
Simple call to '''get("123")''' deadlocks. Why? Because the [[Java Monitor]]s implant the [[monitor]] concept into object oriented world and in the process of doing so break the most fundamental assumption: the [[monitor]] shall be encapsulated.
 +
 
 +
One shall always define all internal data structures and all their operations working on them in one monitor. This is not the case of [[Java]]'s synchronized methods at all. By default, the methods in '''Cache''' and its subclass '''MultiplyCache''' share the same [[monitor]] lock!
=== Implications for [[API]] Design ===
=== Implications for [[API]] Design ===
-
TBD: Use internal locks
+
Now, after blaming [[Java]] of having wrong synchronization model, it is our turn to explain what implications does this have for [[API]] design. Are the above described problems reason why one cannot write good [[API]] in [[Java]]? No, not at all. One just cannot do it [[Cluelessness|cluelessly]], one cannot use the most obvious tool - '''synchronized''' methods - when writing a shared library.
 +
 
 +
Exposing locks in an [[API]] has its own obligations. Other people can obtain them and the [[API]] has to know what to do then. Having some global, publicly accessible locks exposed to unknown clients is a way to ask for problems with [[deadlock]]s.
 +
 
 +
Using ''synchronized'' methods in classes makes the locks accessible to [[API]] clients for public use. Moreover this often happens unconsciously, which is even more dangerous. The fix is simple. Just hide the locks as private objects:
 +
 
 +
<source lang="java" snippet="monitor.pitfalls.CacheOK"/>
 +
 
 +
With a change like this the call
 +
 
 +
<source lang="java" snippet="monitor.pitfalls.brokencall"/>
 +
 
 +
can never [[deadlock]] again, regardless how bad or clueless [[API]] clients one has.
 +
 
 +
=== The way out ===
 +
 
 +
{{:Synchronized}}
 +
 
 +
[[Category:APIDesignPatterns]]
 +
[[Category:APIDesignPatterns:Anti]]

Current revision

The Java synchronization model with all its synchronized, notify and wait is commonly thought of as an example of monitors. However with closer look one finds out that it is far different to the model used in Concurrent Pascal. It is not that robust, it is not well enough integrated to the language and sort of feels like an assembly language with synchronization primitives than high-level abstraction provided by real monitor. However, it has yet another hidden flaw, kind of example of Copy Based Design problem: the Java Monitors implant the monitor into object oriented language, in an object oriented style and that does not work.

Imagine there is an Cache support class in some API:

Code from Cache.java:
See the whole file.

public abstract class Cache<From,To> {
    private Map<From,To> cache;
 
    protected abstract To createItem(From f);
 
    public final To get(From f) {
        To t = inspectValue(f);
        if (t != null) {
            return t;
        }
        To newT = createItem(f);
        return registerValue(f, newT);
    }
 
 
    private synchronized To inspectValue(From f) {
        if (cache == null) {
            cache = new HashMap<From, To>();
        }
        return cache.get(f);
    }
 
    private synchronized To registerValue(From f, To newT) {
        To t = cache.get(f);
        if (t == null) {
            cache.put(f, newT);
            return newT;
        } else {
            return t;
        }
    }
}
 

Its synchronization is sufficiently well designed. It never calls foreign code while holding own lock, which is one of necessary Deadlock Conditions. As such one would expect that there is no way to deadlock on the Cache's own lock. However that would be false assumption. Simple call like:

Code from CacheTest.java:
See the whole file.

int value =  cache.get("123");
assertEquals("3*10=30", 30, value);
 

can lead to deadlock while waiting to enter internal(!) Cache's monitor:

Thread Test Watch Dog: testDeadlockViaAPI
 org.apidesign.javamonitorflaws.Cache.inspectValue:32
 org.apidesign.javamonitorflaws.Cache.get:22
 org.apidesign.javamonitorflaws.CacheTest.testDeadlockViaAPI:134

Moreover this all is achievable using the best Java synchronization practices. How's that possible?

Just imagine there is a subclass of the Cache class which defines few properties and for better or worse guards them as Java Monitor:

Code from MultiplyCache.java:
See the whole file.

public class MultiplyCache extends Cache<String,Integer>
implements CacheTest.CacheToTest {
    private PropertyChangeSupport pcs;
    private int multiply;
    public static final String PROP_MULTIPLY = "multiply";
 
    public synchronized int getMultiply() {
        return multiply;
    }
    public synchronized void setMultiply(int multiply) {
        int oldMultiply = this.multiply;
        this.multiply = multiply;
        pcs.firePropertyChange(PROP_MULTIPLY, oldMultiply, multiply);
    }
 
    public synchronized void addPropertyChangeListener(
        PropertyChangeListener listener
    ) {
        if (pcs == null) {
            pcs = new PropertyChangeSupport(this);
        }
        pcs.addPropertyChangeListener(listener);
    }
    public synchronized void removePropertyChangeListener(
        PropertyChangeListener listener
    ) {
        if (pcs != null) {
            pcs.removePropertyChangeListener(listener);
        }
    }
 
    @Override
    protected Integer createItem(String f) {
        return f.length() * multiply;
    }
}
 

This code is not bulletproof. It notifies its listeners while holding own lock and as such this can lead to deadlocks. However it is common that people do these sort of mistakes when using an API. The API shall be ready to handle such clueless clients. It is acceptable to let them deadlock on their own locks, yet there is no reason to block the API itself. It shall continue to work for other clients if possible. Yet this is the case in this example:

Code from CacheTest.java:
See the whole file.

private static void testDeadlockViaAPI(final CacheToTest cache)
throws Exception {
    class ToDeadlock implements Runnable, PropertyChangeListener {
        int lastMultiply;
 
        public void run() {
            cache.setMultiply(10);
        }
        public void propertyChange(PropertyChangeEvent evt) {
            try {
                storeMultiply();
            } catch (InterruptedException ex) {
                // ok
            }
        }
 
        private synchronized void storeMultiply()
        throws InterruptedException {
            lastMultiply = cache.getMultiply();
            // simulates "starvation"
            wait();
        }
 
        public void assertMultiplyByTen() {
        }
    }
    ToDeadlock toDeadlock = new ToDeadlock();
    cache.addPropertyChangeListener(toDeadlock);
    Thread t = new Thread(toDeadlock, "Deadlock using API");
    t.start();
 
    Thread.sleep(100);
 
    int value =  cache.get("123");
    assertEquals("3*10=30", 30, value);
}
 

Simple call to get("123") deadlocks. Why? Because the Java Monitors implant the monitor concept into object oriented world and in the process of doing so break the most fundamental assumption: the monitor shall be encapsulated.

One shall always define all internal data structures and all their operations working on them in one monitor. This is not the case of Java's synchronized methods at all. By default, the methods in Cache and its subclass MultiplyCache share the same monitor lock!

Implications for API Design

Now, after blaming Java of having wrong synchronization model, it is our turn to explain what implications does this have for API design. Are the above described problems reason why one cannot write good API in Java? No, not at all. One just cannot do it cluelessly, one cannot use the most obvious tool - synchronized methods - when writing a shared library.

Exposing locks in an API has its own obligations. Other people can obtain them and the API has to know what to do then. Having some global, publicly accessible locks exposed to unknown clients is a way to ask for problems with deadlocks.

Using synchronized methods in classes makes the locks accessible to API clients for public use. Moreover this often happens unconsciously, which is even more dangerous. The fix is simple. Just hide the locks as private objects:

Code from CacheOK.java:
See the whole file.

public abstract class CacheOK<From,To> {
    private final Object LOCK = new Object();
 
    private Map<From,To> cache;
 
    protected abstract To createItem(From f);
 
    public final To get(From f) {
        To t = inspectValue(f);
        if (t != null) {
            return t;
        }
        To newT = createItem(f);
        return registerValue(f, newT);
    }
 
 
    private To inspectValue(From f) {
        synchronized (LOCK) {
            if (cache == null) {
                cache = new HashMap<From, To>();
            }
            return cache.get(f);
        }
    }
 
    private To registerValue(From f, To newT) {
        synchronized (LOCK) {
            To t = cache.get(f);
            if (t == null) {
                cache.put(f, newT);
                return newT;
            } else {
                return t;
            }
        }
    }
}
 

With a change like this the call

Code from CacheTest.java:
See the whole file.

int value =  cache.get("123");
assertEquals("3*10=30", 30, value);
 

can never deadlock again, regardless how bad or clueless API clients one has.

The way out

Synchronization is getting more and more important in applications and libraries written these days. However synchronization is hard. The primitives available in Java (or other languages), are ... well, are primitive. Higher level abstractions are available, but still they don't guarantee completely deadlock prone system. This has all been discussed in Chapter 11, Runtime Aspects of APIs.

Java Monitors just aren't what they supposed to be (read why). Thus I am glad to see that the project Lombok's @Synchronized seems to successfully replace the synchronized keyword with annotation (vivat annotations!).


In the name of cluelessness of your Java API users, don't forget to prefer private locks to synchronized methods. Or switch to the beautiful @Synchronized annotation.

Personal tools
buy