ImpossibleThreading

From APIDesign

Revision as of 19:46, 3 January 2015

Here is another story about the problems of explaining that something is impossible. This time it draws on my own experience with threading (and you don't have to understand finite state automata to read it).

NetBeans Threading

Once upon a time, probably shortly after the year 2000, NetBeans had enormous problems with deadlocks. Not surprisingly: Swing is single-threaded, but we were running a lot of tasks in the background and they were competing for resources (like the Swing dispatch thread, their own locks, etc.). My boss asked me to fix this.

Yes, I was the expert - I knew the deadlock conditions and was aware that it is enough to make sure that just one of them does not hold to get a deadlock-free system. Yet I also remembered my lectures at MatFyz, where we were told that there is no coherent theory to drive the development of deadlock-free systems, especially for a system composed of independent modules. Each of them may be deadlock-free on its own, but when you assemble them together a deadlock can still appear.
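To make the remark about breaking a single deadlock condition concrete, here is a minimal sketch that removes the circular-wait condition by always taking locks in one global order. The `OrderedLocks` and `RankedLock` names are hypothetical, invented for this illustration; they are not NetBeans APIs. The sketch also assumes every lock has a unique rank, which is exactly the kind of global agreement that independent modules usually cannot provide.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (not a NetBeans API): always takes two locks in one global order. */
final class OrderedLocks {
    /** A lock with a fixed rank; ranks are assumed to be unique across the system. */
    static final class RankedLock {
        final int rank;
        final ReentrantLock lock = new ReentrantLock();
        RankedLock(int rank) { this.rank = rank; }
    }

    /** Runs the action while holding both locks, acquired lowest rank first. */
    static void withBoth(RankedLock a, RankedLock b, Runnable action) {
        RankedLock first = a.rank <= b.rank ? a : b;
        RankedLock second = first == a ? b : a;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                action.run();
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /** Two threads ask for the locks in opposite order; returns the number of completed actions. */
    static int runOpposingThreads(int iterations) {
        RankedLock x = new RankedLock(1);
        RankedLock y = new RankedLock(2);
        int[] counter = {0}; // only touched while both locks are held
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) withBoth(x, y, () -> counter[0]++);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) withBoth(y, x, () -> counter[0]++);
        });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(10_000);
            t2.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println("completed actions: " + runOpposingThreads(1000));
    }
}
```

Because `withBoth` reorders its arguments, the caller's argument order no longer matters and no cycle of waiting threads can form. The rule only helps as long as every piece of code in the system obeys the same ranking.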

Is It Impossible?

I did what experts do. I said: "It's impossible!" and explained my reasoning. Looking back, and recalling the finite-state automaton story, it is no surprise my boss didn't listen. I lost my credibility as an expert and he selected somebody else to make NetBeans deadlock-free!

As a result we got a detailed write-up describing the state of locking at that time (it was really bad) and a suggestion to modify state under a write-lock and deliver events under a read-lock. For a while it seemed to work OK (it takes a while before people report deadlocks in new code), but in the end it turned out that this style is actually a source of major, hard-to-solve deadlocks and of long pauses when rendering the UI (because a few global locks prevented rendering while a long modification was running).
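The text does not say which deadlocks this style produced in NetBeans. As a hedged illustration of one well-known hazard, the sketch below uses `java.util.concurrent.locks.ReentrantReadWriteLock`, which does not allow upgrading a read lock to a write lock: a listener that is called under the read lock and tries to modify the model can never get the write lock. The `ReadLockEventSketch` class is hypothetical, and `tryLock()` is used instead of `lock()` only so that the example reports the problem rather than hanging.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical model (not NetBeans code): state changes under write lock, events under read lock. */
final class ReadLockEventSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int value;

    /** Tries to modify the state under the write lock; returns false if the lock is unavailable. */
    boolean tryUpdate(int newValue) {
        if (!lock.writeLock().tryLock()) {
            return false; // a plain lock() would block forever if this thread holds the read lock
        }
        try {
            value = newValue;
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Reads the state under the read lock. */
    int read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Delivers an "event" by calling the listener while the read lock is held. */
    void fireUnderReadLock(Runnable listener) {
        lock.readLock().lock();
        try {
            listener.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Returns whether a listener notified under the read lock managed to modify the model. */
    static boolean listenerCanWrite() {
        ReadLockEventSketch model = new ReadLockEventSketch();
        boolean[] updated = {true};
        model.fireUnderReadLock(() -> updated[0] = model.tryUpdate(42));
        return updated[0];
    }

    public static void main(String[] args) {
        System.out.println("listener could write: " + listenerCanWrite());
    }
}
```

A listener that reacts to a change by changing something else is a natural thing to write, which is one reason such a scheme can look fine on paper and still block in practice.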

Now it is easy to see that the whole idea suffered from the typical syndrome of designing a "solution" to something that is impossible to solve. But at the beginning? My manager and many others treated it as a real cure.

Seeking the Flaws

From the beginning I knew the problem had no simple solution. Thus I acted as experts do: I tried to find out why the offered solution was completely stupid. Maybe I was too proud, or more likely I just didn't want to rewrite most of the NetBeans APIs (and change them incompatibly) to a new threading scheme without a guaranteed result - those who follow this website may know I honour backward compatibility a lot and I don't want to sacrifice it for a fictitious dream.

So I sought ways to eliminate our deadlocks - not through a new and unproven master plan, but with as few changes as possible (i.e. trying not to shake the Amoeba of NetBeans needlessly). In the end I learned how to simulate any deadlock in a unit test (see the FlowControllingTest page for details) and from then on everything was easy. Just apply test driven development: Have a bug describing a deadlock? Write a test that fails. Fix the code to make the test pass. Repeat!
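The sketch below shows the general idea of reproducing a deadlock in a test. It is not the technique from the FlowControllingTest page; it is a simplified stand-in that uses a `CountDownLatch` to steer both threads into the dangerous interleaving and a timed `tryLock` so that the test fails instead of hanging. All names (`DeadlockReproSketch`, `takeBoth`, `bothComplete`) are hypothetical, and the wait time is an assumed value that must be long enough for both threads to start.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical test helper: steers two threads into the interleaving that would deadlock. */
final class DeadlockReproSketch {
    /** Takes 'first', waits for the other thread to take its first lock, then tries 'second'. */
    static boolean takeBoth(ReentrantLock first, ReentrantLock second,
                            CountDownLatch bothHoldFirst, long waitMillis) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            // Pause until the other thread holds its first lock too, or give up waiting.
            bothHoldFirst.await(waitMillis, TimeUnit.MILLISECONDS);
            if (!second.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
                return false; // a plain lock() would hang here: this is the deadlock
            }
            second.unlock();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    /** Returns true only if both threads obtained both locks. */
    static boolean bothComplete(boolean consistentOrder, long waitMillis) {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        boolean[] done = new boolean[2];
        Thread t1 = new Thread(() -> done[0] = takeBoth(a, b, bothHoldFirst, waitMillis));
        Thread t2 = new Thread(() -> done[1] = consistentOrder
                ? takeBoth(a, b, bothHoldFirst, waitMillis)   // "fixed" code: same order
                : takeBoth(b, a, bothHoldFirst, waitMillis)); // "buggy" code: opposite order
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(10 * waitMillis);
            t2.join(10 * waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return done[0] && done[1];
    }

    public static void main(String[] args) {
        System.out.println("opposite order completes: " + bothComplete(false, 300));
        System.out.println("consistent order completes: " + bothComplete(true, 300));
    }
}
```

In the test driven spirit, `bothComplete(false, …)` plays the role of the failing test written from the bug report, and `bothComplete(true, …)` shows the same test passing once the lock order has been fixed.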

As a result the number of deadlocks in critical areas started to decrease. It took a few years, the lay-off of my former manager and one chapter in TheAPIBook before it became clear that threading cannot be fixed by a vision, but only by hard work. The expert truth may eventually reveal itself, but it takes time.

Don't Say It is Impossible

The expert in me had the right feeling: it is impossible to fix threading with a great vision alone. Even my favourite way to break one of the deadlock conditions - never call foreign code while holding a lock - is not enough, as Chapter 11 analyses. Hard work is necessary, but how to sell it?
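For readers unfamiliar with the rule, here is a minimal sketch of what "never call foreign code while holding a lock" can look like for listener notification. The `OpenCallSupport` class is hypothetical and not taken from NetBeans or from Chapter 11: listeners are copied under the lock and called only after the lock has been released.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener support (not a NetBeans API): foreign code runs without our lock. */
final class OpenCallSupport {
    private final Object lock = new Object();
    private final List<Runnable> listeners = new ArrayList<>();

    void addListener(Runnable listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    /** Copies the listeners under the lock, then notifies them after releasing it. */
    void fireChange() {
        List<Runnable> snapshot;
        synchronized (lock) {
            snapshot = new ArrayList<>(listeners);
        }
        for (Runnable listener : snapshot) {
            listener.run(); // foreign code: our lock is not held here
        }
    }

    /** Reports whether the calling thread currently holds the internal lock. */
    boolean isLockHeldByCurrentThread() {
        return Thread.holdsLock(lock);
    }

    /** Returns whether a listener observed the internal lock being held during notification. */
    static boolean listenerSeesLockHeld() {
        OpenCallSupport support = new OpenCallSupport();
        boolean[] held = {true};
        support.addListener(() -> held[0] = support.isLockHeldByCurrentThread());
        support.fireChange();
        return held[0];
    }

    public static void main(String[] args) {
        System.out.println("lock held during callback: " + listenerSeesLockHeld());
    }
}
```

The price of the snapshot is that a listener may be notified shortly after it was removed, or see state that has already changed again. This is one small hint at why the rule alone does not settle the matter.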

Given the few years of struggle I had to go through, I would reply differently if I heard the original question again: rather than saying that fighting deadlocks is impossible, I'd say we need to create a process that helps our developers fight deadlocks properly (e.g. they have to write a test before fixing a deadlock). The result would be the same and I would go through less suffering. Moreover, such an answer might have suited my manager better, as he was famous for mixing technical and human factors by saying: "we have a technical issue, we need somebody to ..."

Should we think twice before claiming something is impossible? Yes. Sometimes it is hard to explain why something is impossible, and there are always plenty of dummies ready to offer half-baked solutions. But explaining impossibility is necessary from time to time! By the way, I still have one more topic about impossibility to cover - to be continued...
