FlowControllingTest

From APIDesign

(Difference between revisions)
Jump to: navigation, search
(Execution Flow Control using Logging)
(Advanced usage of Logging)
Line 75: Line 75:
</source>
</source>
-
It says that logging of all message levels shall be collected and as such if any of the test methods fails, the reported exception is going to contain output of all messages logged during execution of the program in format like: [name.of.the.logger] THREAD: threadName MSG: The message Of course the text in the exception message is truncated, to be of reasonable length, but the full log report is also available. In the NbTestCase.getWorkDir() which would usually be /tmp/tests/testLogCollectingTest/testMethodOne is a file called testMethodOne.log which contains last 1Mb of logged output. That should be enough for cases when one needs deeper analysis of some complex logging code, as most of the recently logged information shall be there. It shall also be mentioned that if one is using [http://xtest.netbeans.org xtest] to execute the tests, the generated HTML report includes the copy of the workdir including the output log file. Sometimes the logged messages can get pretty long and especially in case of random failures it may not be completely easy to analyse what is going on in the test. The best strategy to fight with such situation we have discoveder so far, is to add fail("Ok"); line into the place in the test where the execution randomly fails. This is going to generate a log file as well and the two log files - the one from real failure - and the one from the correct run can then be compared and diffed against each other. Of course it is wise to replace all the @543hac5f messages in the output with something more neutral, as each run is going to have these memory locations different. After eliminating this difference, it is generally possible to get a reasonable diff and understand what is the difference between the those two runs and what can be the root cause of the failure. There maybe two possible reasons why such an understanding to the differences between log files might not be possible. The first is that there may not be enough logs. The information recorded in the log file is just to coarse to provide a meaningful information of what has been happening during the test run. Indeed, the fix is to insert more logging messages into the test code. Using the NbTestCase.logLevel method one can add more logging on various levels like Level.FINE, Level.FINER and Level.FINEST and in different test cases enable different test messages. In case of very suspicious random failure which can potentially influence a critical piece of code, it may even make sense to log infomarmation about each line of the code that gets executed, possibly with arguments send to various methods and state of local variables. That shall generate enough information to allow detailed understanding of the reasons of random failures. On the other hand, there is a hidden catch: The more logging is put into the application, the more is its behaviour going to be different compared to the execution of the application without the logging enabled. Every call to Logger.log is going to add additional delays, formating of recorded messages is going to take a while, etc., etc. That is why a failure that might occur in one third of cases, may nearly disappear, due to all the logging code around. Indeed such a failure gets harder and harder to repeat and thus evaluate. One can nearly give up on trying to emulate it in debugger, or in the worse case, one cannot even face such failure on own computer. However from time to time the failure appears on someones machine. In NetBeans we have farm of testing machines running tests every day in various configurations, and this really helps to track such rare errors down. Althrough we may not even face the bug on the developer's workstations, from time to time we get a report from the testing infrastructure with full logging information and we can start to hunt the bug down. The other side of the problem is that there is already too much logging messages and it is hard to get it right from where they come from. For example if the test executes a series of steps which result in repeated calls to the same place of the application code that of course prints the same log messages, one can easy get lost. This is especially hard if the applications does some computation in a loop repeating the same or at least similar messages on and on and on. In such case the really helpful advice is to put logging messages into the testing code itself. Those messages are the anchor that can lead the base orientation in the log output. When comparing the differences between the successful and unsuccessful run, one can first of all locate the test messages and compare the difference of application messages inserted among them. Basically the best advice to fight with random failures is to do a lot of logging, capture it in tests and print enough of logging messages in the application and also in the test code itself. Then the likehood of understanding the root cause of the failure increases dramatically. A bit different direction of logging is testing that something really has been logged. For example when a new method in one's API that shall for some reason be overriden by subclasses, it might be desirable to let already existing classes compiled against previous version of the class to know about it during runtime: protected boolean overideMePlease() { Logger.getLogger("my.logger").warning("subclasses are supposed to override overideMePlease() method!"); // some default return true; } Of course everyone who is dead set on testing wants to be sure that the warning really get's printed in the right situations. To do that one can register is own logger or in NetBeans prior to 5.0 its own ErrorManager (like in the [http://www.netbeans.org/source/browse/openide/test/unit/src/org/openide/util/actions/AsynchronousTest.java?rev=1.1&content-type=text/x-cvsweb-markup AsynchronousTest.java]) and makes sure that its warning method is called. However as this was way too common pattern in NetBeans the there is a special utility method in '''NbJUnit''' that will handle everything itself. So one can just do: CharSequence log = Log.enable("my.logger", Level.ALL); // do the necessary actions if (log.toString().indexOf("warned") == -1) { fail("Warning shall be printed: " + log); } This is just a little utility method, but it helps to easily access and analyze what gets logged from inside the test case. Together with the rest of the support in org.netbeans.junit.NbTestCase and org.netbeans.junit.Log classes this should provide good enough support for fighting with the unwanted shaking of the amoeba's edge using logging.
+
It says that logging of all message levels shall be collected and as such if any of the test methods fails, the reported exception is going to contain output of all messages logged during execution of the program in format like: [name.of.the.logger] THREAD: threadName MSG: The message Of course the text in the exception message is truncated, to be of reasonable length, but the full log report is also available. In the NbTestCase.getWorkDir() which would usually be /tmp/tests/testLogCollectingTest/testMethodOne is a file called testMethodOne.log which contains last 1Mb of logged output. That should be enough for cases when one needs deeper analysis of some complex logging code, as most of the recently logged information shall be there. It shall also be mentioned that if one is using [http://xtest.netbeans.org xtest] to execute the tests, the generated HTML report includes the copy of the workdir including the output log file. Sometimes the logged messages can get pretty long and especially in case of random failures it may not be completely easy to analyse what is going on in the test. The best strategy to fight with such situation we have discoveder so far, is to add fail("Ok"); line into the place in the test where the execution randomly fails. This is going to generate a log file as well and the two log files - the one from real failure - and the one from the correct run can then be compared and diffed against each other. Of course it is wise to replace all the @543hac5f messages in the output with something more neutral, as each run is going to have these memory locations different. After eliminating this difference, it is generally possible to get a reasonable diff and understand what is the difference between the those two runs and what can be the root cause of the failure. There maybe two possible reasons why such an understanding to the differences between log files might not be possible. The first is that there may not be enough logs. The information recorded in the log file is just to coarse to provide a meaningful information of what has been happening during the test run. Indeed, the fix is to insert more logging messages into the test code. Using the NbTestCase.logLevel method one can add more logging on various levels like Level.FINE, Level.FINER and Level.FINEST and in different test cases enable different test messages. In case of very suspicious random failure which can potentially influence a critical piece of code, it may even make sense to log infomarmation about each line of the code that gets executed, possibly with arguments send to various methods and state of local variables. That shall generate enough information to allow detailed understanding of the reasons of random failures. On the other hand, there is a hidden catch: The more logging is put into the application, the more is its behaviour going to be different compared to the execution of the application without the logging enabled. Every call to Logger.log is going to add additional delays, formating of recorded messages is going to take a while, etc., etc. That is why a failure that might occur in one third of cases, may nearly disappear, due to all the logging code around. Indeed such a failure gets harder and harder to repeat and thus evaluate. One can nearly give up on trying to emulate it in debugger, or in the worse case, one cannot even face such failure on own computer. However from time to time the failure appears on someones machine. In NetBeans we have farm of testing machines running tests every day in various configurations, and this really helps to track such rare errors down. Althrough we may not even face the bug on the developer's workstations, from time to time we get a report from the testing infrastructure with full logging information and we can start to hunt the bug down. The other side of the problem is that there is already too much logging messages and it is hard to get it right from where they come from. For example if the test executes a series of steps which result in repeated calls to the same place of the application code that of course prints the same log messages, one can easy get lost. This is especially hard if the applications does some computation in a loop repeating the same or at least similar messages on and on and on. In such case the really helpful advice is to put logging messages into the testing code itself. Those messages are the anchor that can lead the base orientation in the log output. When comparing the differences between the successful and unsuccessful run, one can first of all locate the test messages and compare the difference of application messages inserted among them. Basically the best advice to fight with random failures is to do a lot of logging, capture it in tests and print enough of logging messages in the application and also in the test code itself. Then the likehood of understanding the root cause of the failure increases dramatically. A bit different direction of logging is testing that something really has been logged. For example when a new method in one's API that shall for some reason be overriden by subclasses, it might be desirable to let already existing classes compiled against previous version of the class to know about it during runtime:
 +
 
 +
<source lang="java">
 +
protected boolean overideMePlease() {
 +
Logger.getLogger("my.logger").warning("subclasses are supposed to override overideMePlease() method!");
 +
// some default
 +
return true;
 +
}
 +
</source>
 +
 
 +
Of course everyone who is dead set on testing wants to be sure that the warning really get's printed in the right situations. To do that one can register is own logger or in NetBeans prior to 5.0 its own '''ErrorManager''' (like in the [http://www.netbeans.org/source/browse/openide/test/unit/src/org/openide/util/actions/AsynchronousTest.java?rev=1.1&content-type=text/x-cvsweb-markup AsynchronousTest.java]) and makes sure that its warning method is called. However as this was way too common pattern in NetBeans the there is a special utility method in '''NbJUnit''' that will handle everything itself. So one can just do:
 +
 
 +
<source lang="java">
 +
CharSequence log = Log.enable("my.logger", Level.ALL); // do the necessary actions
 +
if (log.toString().indexOf("warned") == -1) {
 +
fail("Warning shall be printed: " + log);
 +
}
 +
</source>
 +
 
 +
This is just a little utility method, but it helps to easily access and analyze what gets logged from inside the test case. Together with the rest of the support in org.netbeans.junit.NbTestCase and org.netbeans.junit.Log classes this should provide good enough support for fighting with the unwanted shaking of the amoeba's edge using logging.
==Execution Flow Control using Logging==
==Execution Flow Control using Logging==

Revision as of 08:39, 18 June 2020

TBD needs proper formatting and splitting to multiple pages.

Contents

Deadlock Test

Fighting with deadlocks is a sad destiny of any multithreaded application. The problem field has been under extensive research because it causes huge problems for every writer of an operating system. Most of the applications are not as complex as operating systems, but as soon as you allow foreign code to run in your application, you basically have to fight with the same set of problems. In spite of the huge research efforts, there was no simple answer / solution found. We know that there are four necessary and sufficient deadlock conditions for a deadlock to be created: # Mutual exclusion condition - there has to be a resource (lock, execution queue, etc.) that can be owned by just one thread # Non-preemptive scheduling condition - it is not possible to take away or release a resource already assigned by anyone else than its owner # Hold and wait condition - a thread can wait for a resource indefinitely and can hold it indefinitely # Resources can be acquired incrementally - one can ask for a new resource (lock, execution queue), while already holding another one But we do not know how the code that prevents at least one condition to appear shall look like and definitely we do not know how to do a static analysis over source code to check whether a deadlock can or cannot appear. The basic and in fact very promising advice for a programmer in a language with threads and locks like Java has is: do not hold any lock while calling foreign code. By following this rule one eliminates the fourth condition and as all four must be satisfied to create a deadlock, we may believe we found the ultimate solution to deadlocks. But in fact, it is sometimes very hard to satisfy such restriction. Can the following code deadlock?

private HashSet allCreated = new HashSet (); 
public synchronized JLabel createLabel () { 
  JLabel l = new JLabel (); 
  allCreated.add (l); 
  return l; 
}

It feels safe as the only real call is to HashSet.add and HashSet does not use synchronized at all. But in fact there is a lot of room for failures. The first problem is that JLabel extends JComponent and somewhere in its constructor one acquires the awt tree lock (JComponent.getTreeLock()). And if someone writes a component that overrides: public Dimension getPreferredSize () { JLabel sampleLabel = createLabel (); return sampleLabel.getPreferredSize (); } we are in danger of deadlock as the getPreferredSize is often called when a component is painted and while the awt tree lock is held. So even though we tried really hard to not call foreign code, we did it.

The second and even less visible problem is the implementation of HashSet. It uses Object.hashCode() and Object.equals which again can call virtually anywhere (any object can override them) and if the implementation acquires another lock, we can get into similar, but even less expected, problems. Talking about possible solutions for deadlocks would provide enough materials for its own article, so let us return back to topic of this one - writing tests. In Java, the solution to deadlocks are often easy. Whenever the application freezes, the user can produce a thread dump and from that we can get the description of the problem. From there, there is just a step to fix, just lock on another lock, or use SwingUtilities.invokeLater to reschedule the code in question from the dangerous section sometime later. We used this style for a few years and the result is that our code started to be unpredictable and we have not really fixed much of the deadlocks when we modified the code to fix one, we often created a new one. My favourite example are changes made in our classes on Jun 26, 2000 and Feb 2, 2004. Both tried to fix a deadlock and the second one effectively returned the state back, prior to the first integration. That means we have successfully shifted the [#amoeba amoeba] shape of our application code to fix a deadlock in the year 2000, and four years later we just shifted it once more to improve in one part, but regress with respect to the 2000's fix. This would have never happened if together with the first fix, we also integrated a test! A test for a deadlock!? Yes, a test for a deadlock. However surprising that may sound, it is possible and often not that hard (we often write a test for deadlock in about two hours, we never needed more than a day). Beyond the automated nature of such test, it also gives the developer confidence that he really fixed something, which is not usually fully obvious due to the esoteric nature of deadlock fixes, as they cannot be usually reproduced and thus verified by anyone. Also, when there is a test, one can choose a simpler solution that fixes the problem than to invent intellectually elegant, but in fact complicated, one. The result is that the art of deadlock fixing turns into regular engineering work. And we all want our applications to be developed by engineers, do we not? Writing a test for a deadlock is not that hard. In our imaginary situation with createLabel we could do that by writing a component, overriding getPreferredSize, stopping the thread and waiting while another one locks the resources in the opposite way:

public class CreateLabelTest extends TestCase { 
  public void testSimulateTheDeadlock () { 
    MyComponent c = new MyComponent (); 
    c.validate (); 
  } 
 
  private static class MyComponent extends JComponent implements Runnable { 
    public synchronized Dimension getPreferredSize () { 
      JLabel sampleLabel = createLabel (); 
      new Thread (this).start (); 
      wait (1000); 
      assertNotNull ("Also can create label", createLabel ()); 
      return sampleLabel.getPreferredSize ();
    } 
 
    public void run () { 
      assertNotNull ("We can create the label", createLabel ()); 
      synchronized (this) { 
        notifyAll (); 
      } 
    } 
}

The test works with two threads, one creates a component and validates it, which results in a callback to getPreferredSize under the awt tree lock, at this moment we start another thread and wait a while for it to acquire the createLabel lock. Under the current implementation this blocks in the JLabel constructor and as soon as our thread continues (after 1000ms) we create the deadlock. There can be a lot of fixes, but the simplest one is very likely to synchronize on the same lock as the JLabel constructor does: public JLabel createLabel () { synchronized (JLabel.getTreeLock ()) { JLabel l = new JLabel (); allCreated.add (l); return l; } } The fix is simple - much simpler than the test - but without the test, we would not fix the shape of our [#amoeba amoeba]. So the time spent writing the test is likely to get paid back. Often the test can be written by using an already existing api, like in our case the getPreferredSize method (for example our test. Only in special situations does one need to introduce a special method that helps the test to simulate the problem (we used that in our howToReproduceDeadlock40766(boolean) called from PositionRef.java . Anyway deadlock tests are pure regression tests - one writes them when a bug is reported, nobody is going to write them in advance. At the beginning it is much wiser to invest in good design, but as we explained earlier, as there is no really universal theory on how to prevent deadlocks, one should know what he wants to do when a deadlock appears, for that we suggest that testing is the best way with respect to our [#amoeba amoeba] shape.

Testing Race Conditions

There is no such thing as 100% CodeCoverage, but that doesn't mean we shouldn't strive to increase BugFixCoverage. The FlowControllingTest test pattern make it possible to even test race conditions.

Test to Simulate Race conditions

While certain problems with multiple threads and their synchronization are hard to anticipate, as deadlocks mentioned earlier, sometimes it is possible and useful to write a test to verify that various problems with parallel execution are correctly handled.

We have faced such problem when asked to write a startup lock for NetBeans. The goal was to solve a situation when a user starts the NetBeans IDE for the second time and warn him that another instance of the program is already running and then exit. This is similar to the behaviour of Mozilla or Open Office. We decided to allocate a socket server and create a file in a well known location with the port number written to it. Then each newly started NetBeans IDE could verify whether a previously running instance is active or not (by reading the port number and trying to communicate with it).

The major problem we had to optimize for was a situation when the user starts more NetBeans IDE processes at once. This can happen by extra clicks on the icon on the desktop or by dragging and dropping more files on the desktop icon of the IDE. Then more processes are started and they [[Race conditions|compete] to lock the user directory. The sequence of one process looks like this:

if (lockFile.exists ()) {
  // read the port number and connect to it 
  if (alive) {
    // exit
    return; 
  } 
} 
// otherwise try to create the file yourself
lockFile.createNewFile(); 
DataOutputStream os = new DataOutputStream(new FileOutputStream(lockFile)); 
SocketServer server = new SocketServer(); 
int p = server.getLocalPort(); 
os.writeInt(p);
os.close()

The above code can be interrupted at any time by the system. Instead of executing all the code as an atomic operation, OS can suspend the process and the control can be passed to a competing process which races to perform the same actions.

  • What happens when one process creates the file, and another tries to read it meanwhile, before a port number is written to it?
  • What if there is a file left from a previous (killed) execution?
  • What happens when a test for file existence fails, but when trying to create it the file already exists?

All these questions have to be asked when one wants to have really good confidence in the application code.

Simple FlowControllingTest written manually

In order to get the confidence we wanted, we inserted a lot of check points into our implementation of locking so the code became a modified version of the previous snippet:

enterState(10, block); 
if (lockFile.exists ()) { 
  enterState(11, block); 
  // read the port number and connect to it if (alive) {
   // exit 
   return; 
  } 
} 
// otherwise try to create the file yourself 
enterState(20, block); 
lockFile.createNewFile(); 
DataOutputStream os = new DataOutputStream(new FileOutputStream(lockFile)); 
SocketServer server = new SocketServer(); 
enterState(21, block); 
int p = server.getLocalPort(); 
enterState(22, block); 
os.writeInt(p); 
enterState(23, block);
os.close();

The enterState method does nothing in real production environment, but in test it can be instructed to block at a specific check point. So we can write a test which starts two threads and instruct one of them to stop at 22 and then let the second one run and observe how it handles the case when a file already exists, but the port is not yet written in.

This approach worked pretty well and despite the skeptical opinions we heard when we tried to solve this problem, we got about 90% of the behaviour right before we integrated the first version. Yes, there was still more work to do and bugs to be fixed, but because we had really good automated tests for the behaviour we really implemented, our amoeba edge was well stiffened and we had enough confidence that we can fix all outstanding problems.

Analyzing Random Failures

Those 10% of random failures mentioned in the previous part emerged, as usual, into more work than just the next 10% of additional tests and a few fixes. They inspired this whole new part, as dealing with failures that happen just from time to time and usually on a computer that you do not own, requires more sophisticated techniques be used for their tracking. The problem with parallel execution is that there is really not much help anyone can get if he wants to use it correctly. The methodology is either weak or missing or just too concentrated on specific cases, the debuggers are not really ready to push the debugged applications to their parallel limits, so in order to really move somewhere, people resort to the oldest solution - to println and logging. The old approach is pretty simple - add log messages into your code, run it a few times, wait until it starts to misbehave and then try to figure out from the log file what went wrong and fix it. In the case of automated tests a similar approach can be used. Enhance the application code and also the tests with logging, and if the test fails, output all the collected log messages as part of the failure report. We have achieved this by writing our own implementation of ErrorManager (which is a NetBeans class used for logging and error reporting), but one can do this in any test by using java.util.logging and implementing its Handler. The implementation has to be registered at the beginning of the test and has to capture all logged messages and in the case of failure make them part of the failure message: public class MyTest extends NbTestCase { static { System.setProperty ("org.openide.util.Lookup", "MyTest$Lkp"); } public MyTest (String name) { super (name); } protected void runTest () { ErrManager.messages.clear (); try { super.runTest (); } catch (AssertionFailedError err) { throw new AssertionFailedError (err.getMessage() + " Log:\n" + ErrManager.messages); } } public void testYourTest() throws Exception { // invoke some code ErrorManager.getDefault().log ("Do some logging"); // another code ErrorManager.getDefault().log ("Yet another logging"); } public static final class Lkp extends org.openide.util.lookup.AbstractLookup { private Lkp (org.openide.util.lookup.InstanceContent ic) { super (ic); ic.add (new MyErr ()); } } private static final class ErrManager extends org.openide.ErrorManager { public static final StringBuffer messages = new StringBuffer (); public void log (int severity, String s) { messages.append (s); messages.append ('\n'); } public void notify (int severity, Throwable t) { messages.append (t.getMessage ()); messages.append ('\n'); } } } The logging can be done by the test to mark important sections of its progress, but the main advantage is that your code shall be full of ErrorManager.log or (if you use the standard Java logging) java.util.logging.Logger.log. The test then collects messages from all places and in the case of failure, provides a complete and detailed (well, depending on how often and useful the log messages are delivered) description of the failure which then can be analyzed and either fixed or the logging made more detailed to help to track the problem down (as we did for example in ChildrenKeysIssue30907Test.java). Sometimes people are reluctant to analyze random failures in tests as something that does not affect the production code. In fact, it may turn out that the problem is in the test and the code is ok, but relying on this is usually false hope. Without a deeper understanding of the problem, it can be in the application code and even if it is not reproducible all the time, if it occurs it can have huge consequences. If we want to have enough trust in the behaviour of our application and make its [#amoeba amoeba] shape less amoebic, logging in application code and logging friendly tests turn out to be a very useful tool.

Advanced usage of Logging

The NetBeans project started to introduce tests that collect logged output slowly and only in the most exposed places, where the failures were often due to overly complicated usage of multiple threads. However one should remember that every application written in Java is non-deterministic, even if it consciously uses just one thread. There is always the garbage collector that starts in random moments depending on conditions external to the tested program and also when dealing with visual components, there is at least the AWT event dispatch thread, which can also receive events in not that strictly defined order. As a result it turned out that the logging support is pretty useful in almost any test. Because of that NetBeans decided to support it in its NbJUnit extensions. The simplest add-on is to support collecting of messages logged during run of a test. To get this, one can just override the logLevel method in any subclass of NbTestCase. Let's look at following example:

public class LogCollectingTest extends NbTestCase {
  public LogCollectingTest(String n) { 
    super(n); 
  }
 
  protected Level logLevel() { 
    return Level.ALL; 
  } 
 
  public void testMethodOne() { /* .... */ } 
 
  public void testMethodTwo() { /* .... */ } 
 
  public void testMethodThree() { /* .... */ }
}

It says that logging of all message levels shall be collected and as such if any of the test methods fails, the reported exception is going to contain output of all messages logged during execution of the program in format like: [name.of.the.logger] THREAD: threadName MSG: The message Of course the text in the exception message is truncated, to be of reasonable length, but the full log report is also available. In the NbTestCase.getWorkDir() which would usually be /tmp/tests/testLogCollectingTest/testMethodOne is a file called testMethodOne.log which contains last 1Mb of logged output. That should be enough for cases when one needs deeper analysis of some complex logging code, as most of the recently logged information shall be there. It shall also be mentioned that if one is using xtest to execute the tests, the generated HTML report includes the copy of the workdir including the output log file. Sometimes the logged messages can get pretty long and especially in case of random failures it may not be completely easy to analyse what is going on in the test. The best strategy to fight with such situation we have discoveder so far, is to add fail("Ok"); line into the place in the test where the execution randomly fails. This is going to generate a log file as well and the two log files - the one from real failure - and the one from the correct run can then be compared and diffed against each other. Of course it is wise to replace all the @543hac5f messages in the output with something more neutral, as each run is going to have these memory locations different. After eliminating this difference, it is generally possible to get a reasonable diff and understand what is the difference between the those two runs and what can be the root cause of the failure. There maybe two possible reasons why such an understanding to the differences between log files might not be possible. The first is that there may not be enough logs. The information recorded in the log file is just to coarse to provide a meaningful information of what has been happening during the test run. Indeed, the fix is to insert more logging messages into the test code. Using the NbTestCase.logLevel method one can add more logging on various levels like Level.FINE, Level.FINER and Level.FINEST and in different test cases enable different test messages. In case of very suspicious random failure which can potentially influence a critical piece of code, it may even make sense to log infomarmation about each line of the code that gets executed, possibly with arguments send to various methods and state of local variables. That shall generate enough information to allow detailed understanding of the reasons of random failures. On the other hand, there is a hidden catch: The more logging is put into the application, the more is its behaviour going to be different compared to the execution of the application without the logging enabled. Every call to Logger.log is going to add additional delays, formating of recorded messages is going to take a while, etc., etc. That is why a failure that might occur in one third of cases, may nearly disappear, due to all the logging code around. Indeed such a failure gets harder and harder to repeat and thus evaluate. One can nearly give up on trying to emulate it in debugger, or in the worse case, one cannot even face such failure on own computer. However from time to time the failure appears on someones machine. In NetBeans we have farm of testing machines running tests every day in various configurations, and this really helps to track such rare errors down. Althrough we may not even face the bug on the developer's workstations, from time to time we get a report from the testing infrastructure with full logging information and we can start to hunt the bug down. The other side of the problem is that there is already too much logging messages and it is hard to get it right from where they come from. For example if the test executes a series of steps which result in repeated calls to the same place of the application code that of course prints the same log messages, one can easy get lost. This is especially hard if the applications does some computation in a loop repeating the same or at least similar messages on and on and on. In such case the really helpful advice is to put logging messages into the testing code itself. Those messages are the anchor that can lead the base orientation in the log output. When comparing the differences between the successful and unsuccessful run, one can first of all locate the test messages and compare the difference of application messages inserted among them. Basically the best advice to fight with random failures is to do a lot of logging, capture it in tests and print enough of logging messages in the application and also in the test code itself. Then the likehood of understanding the root cause of the failure increases dramatically. A bit different direction of logging is testing that something really has been logged. For example when a new method in one's API that shall for some reason be overriden by subclasses, it might be desirable to let already existing classes compiled against previous version of the class to know about it during runtime:

protected boolean overideMePlease() { 
    Logger.getLogger("my.logger").warning("subclasses are supposed to override overideMePlease() method!"); 
    // some default 
    return true; 
  }

Of course everyone who is dead set on testing wants to be sure that the warning really get's printed in the right situations. To do that one can register is own logger or in NetBeans prior to 5.0 its own ErrorManager (like in the AsynchronousTest.java) and makes sure that its warning method is called. However as this was way too common pattern in NetBeans the there is a special utility method in NbJUnit that will handle everything itself. So one can just do:

CharSequence log = Log.enable("my.logger", Level.ALL); // do the necessary actions 
  if (log.toString().indexOf("warned") == -1) { 
    fail("Warning shall be printed: " + log); 
  }

This is just a little utility method, but it helps to easily access and analyze what gets logged from inside the test case. Together with the rest of the support in org.netbeans.junit.NbTestCase and org.netbeans.junit.Log classes this should provide good enough support for fighting with the unwanted shaking of the amoeba's edge using logging.

Execution Flow Control using Logging

This section is going to build on top of the content of previous description of logging support and on sections that were discussing testing for race conditions and tests that try to simulate deadlocks. Both of these tests need some kind of hooks in the code that allow the testing environment to simulate some obscure situations and either corrupt some internal data structure or stop the thread in a place that is going to cause a deadlock with other executing thread. The previous parts suggested to put special methods accessible from the testing code that allow such kinds of execution control. Either to have code full of inserted statements like enterState(10, block) or to provide some overridable method like howToReproduceDeadlock40766(boolean) and make it do something insane in the test. This is possible, but actually there is a small improvement which is not easy to find but when discovered it seems so natural and easy to use, that one just has to wonder why it has not come to the minds earlier: Instead of putting special methods in the code, one can use logging! Logging? Yes, logging! The beauty of such solution is in the fact that logging is or at least should be natural part of every at least a bit complicated program anyway, so one does not have to obscure the code with enterState and howToReproduceDeadlock40766(boolean), instead one can just use: logger.finest("Enter state 20"); // or logger.finest("reproduceDeadlock40766now"); The first log is fully natural, the second is a bit suspicious because of the name of its message, but still it is not that much strange to be seen as alien piece of code not related to the program at all. Logging just belongs to programs. Now the testing code can register its own Handler and in its publish method do all the wild things:

class Reproduce40766 extends Handler { 
  public void publish(LogRecord rec) { 
    if ("reproduceDeadlock40766now".equals(rec.getMessage())) { 
      // block and let other thread run as shown 
      // in [http://www.netbeans.org/source/browse/openide/test/unit/src/org/openide/text/Deadlock40766Test.java?rev=1.2.2.2&content-type=text/x-cvsweb-markup Deadlock40766Test.howToReproduceDeadlock40766(boolean)] 
    } 
  } 
} 
 
Logger.getLogger("").addHandler(new Reproduce40766());

So instead of inseminating the application code with special hacks, one can achieve the same by adding a logging handler and analyzing messages passed to it during execution of the test. The application code stays clean and the test gets as powerful as is the number of messages that are logged in the application code. Because each logged message is a chance for the test to influence the behaviour of the application. Taken to an extreme, it shall be possible to fully control the behaviour a multithreaded program by suspeding all other threads than the actual one that shall be executed. Imagine following program:

class Parael implements Runnable { 
  public void run() { 
    Random r = new Random(); 
    for (int i = 0; i < 10; i++) { 
      try { 
        Thread.sleep(r.nextInt(100)); 
      } catch (InterruptedException ex) {
      } 
      Logger.global.log(Level.WARNING, "cnt: {0}", new Integer(i)); 
    } 
  } 
 
  public static void main(String[] args) throws InterruptedException { 
    Thread t1 = new Thread(new Parael(), "1st"); 
    Thread t2 = new Thread(new Parael(), "2nd"); 
    t1.start(); 
    t2.start(); 
    t1.join(); 
    t2.join(); 
  }
}

The program runs two threads, each of them counting by one to ten while pausing random number of miliseconds. Obviously the threads are going to be run in parael and the speed of their counting is going to be random. This can be easily verified by a simple NbTestCase with enabled logging:

public class ParaelTest extends NbTestCase { 
  public ParaelTest(String testName) { 
    super(testName); 
  } 
 
  protected Level logLevel() { 
    return Level.WARNING; 
  } 
 
  public void testMain() throws Exception { 
    Parael.main(null); 
    fail("Ok, just print logged messages"); 
  } 
}

When executed once, the output can be for example like the one shown in the left, next time it can look like the one shown on right:

[global] THREAD: 2nd MSG: cnt: 0 
[global] THREAD: 1st MSG: cnt: 0 
[global] THREAD: 2nd MSG: cnt: 1 
[global] THREAD: 2nd MSG: cnt: 2 
[global] THREAD: 2nd MSG: cnt: 3 
[global] THREAD: 2nd MSG: cnt: 4 
[global] THREAD: 1st MSG: cnt: 1 
[global] THREAD: 1st MSG: cnt: 2 
[global] THREAD: 2nd MSG: cnt: 5 
[global] THREAD: 2nd MSG: cnt: 6 
[global] THREAD: 1st MSG: cnt: 3 
[global] THREAD: 1st MSG: cnt: 4 
[global] THREAD: 2nd MSG: cnt: 7 
[global] THREAD: 1st MSG: cnt: 5 
[global] THREAD: 2nd MSG: cnt: 8 
[global] THREAD: 2nd MSG: cnt: 9 
[global] THREAD: 1st MSG: cnt: 6 
[global] THREAD: 1st MSG: cnt: 7 
[global] THREAD: 1st MSG: cnt: 8 
[global] THREAD: 1st MSG: cnt: 9 

another random execution:

[global] THREAD: 2nd MSG: cnt: 0 
[global] THREAD: 1st MSG: cnt: 0 
[global] THREAD: 2nd MSG: cnt: 1 
[global] THREAD: 1st MSG: cnt: 1 
[global] THREAD: 2nd MSG: cnt: 2 
[global] THREAD: 1st MSG: cnt: 2 
[global] THREAD: 2nd MSG: cnt: 3 
[global] THREAD: 2nd MSG: cnt: 4 
[global] THREAD: 1st MSG: cnt: 3 
[global] THREAD: 2nd MSG: cnt: 5 
[global] THREAD: 2nd MSG: cnt: 6 
[global] THREAD: 1st MSG: cnt: 4 
[global] THREAD: 1st MSG: cnt: 5 
[global] THREAD: 2nd MSG: cnt: 7 
[global] THREAD: 1st MSG: cnt: 6 
[global] THREAD: 2nd MSG: cnt: 8 
[global] THREAD: 1st MSG: cnt: 7 
[global] THREAD: 2nd MSG: cnt: 9 
[global] THREAD: 1st MSG: cnt: 8 
[global] THREAD: 1st MSG: cnt: 9

Of course, when the program is executed again and again the output is unlikely to look exactly same as the previous one. That is fine, when two threads run in parael, the result is obviusly non-deterministic. But imagine that one of the execution orders is somehow special. For example that it is known to cause some race condition or deadlock. Indeed such wrong behaviour should be fixed, but aligned with the intention of this whole paper, it shall also be simulated, otherwise the fight with amoeba is never going to end. That is why one should try to write a test that simulates the order of execution that is known to be broken. For example let us try to generate higly unlikely sequence where each thread increments and prints one number, goes to sleep and lets the other thread run. It is highly unprobable that such kind of output happened randomly, so it is reasonable to question whether such test can be writen at all. But it can! With a little help from logging, here is the code that forces the threads to behave deterministicaly and always output just one number from the sequence:

public class ParaelTest extends NbTestCase {
  public ParaelTest(String testName) { 
    super(testName); 
  } 
 
  protected Level logLevel() { return Level.WARNING; } 
 
  public void testMain() throws Exception { 
    Logger.global.addHandler(new BlockingHandler()); 
    Parael.main(null); 
    fail("Ok, just print the logged output"); 
  } 
 
  private static final class BlockingHandler extends Handler { 
    boolean runSecond; 
    public synchronized void publish(LogRecord record) { 
      if (!record.getMessage().startsWith("cnt")) return; 
 
      if (runSecond == Thread.currentThread().getName().equals("2nd")) { 
        notify(); 
        runSecond = !runSecond; 
      } 
 
      try { 
        wait(500); 
      } catch (InterruptedException ex) {
      } 
    } 
 
    public void flush() {
    } 
 
    public void close() {
    } 
  } 
}
When the test is executed it can really be seen that each thread adds one to its counter and gives execution control to the other thread. And this behaves more or less deterministicly - e.g. nearly every execution of the test yields:
[global] THREAD: 1st MSG: cnt: 0 
[global] THREAD: 2nd MSG: cnt: 0 
[global] THREAD: 1st MSG: cnt: 1 
[global] THREAD: 2nd MSG: cnt: 1 
[global] THREAD: 1st MSG: cnt: 2 
[global] THREAD: 2nd MSG: cnt: 2 
[global] THREAD: 1st MSG: cnt: 3 
[global] THREAD: 2nd MSG: cnt: 3 
[global] THREAD: 1st MSG: cnt: 4 
[global] THREAD: 2nd MSG: cnt: 4 
[global] THREAD: 1st MSG: cnt: 5 
[global] THREAD: 2nd MSG: cnt: 5 
[global] THREAD: 1st MSG: cnt: 6 
[global] THREAD: 2nd MSG: cnt: 6 
[global] THREAD: 1st MSG: cnt: 7 
[global] THREAD: 2nd MSG: cnt: 7 
[global] THREAD: 1st MSG: cnt: 8 
[global] THREAD: 2nd MSG: cnt: 8 
[global] THREAD: 1st MSG: cnt: 9 
[global] THREAD: 2nd MSG: cnt: 9

The basic trick that allows such miracle to happen is intercepting of the output messages. The special BlockingHandler registered by the test before the application code gets started analyzes the messages sent to it by the application threads and suspends and resumes them to simulate the execution order where each thread adds one number and then gets suspended by the operating system in favor of the other thread. The beauty of such solution is that the application code does not really know that its execution flow is going to be controled by the test. If one looked at the application code itself, one could not guess that there is a test that does such wild things to the code, because the actual application code looks natural. Just due to the possibility to intercept logging the test gets the chance to influence the execution flow and as a result simulate even the wildest and unprobable execution order. Indeed writing the BlockingHandler correctly may not be easy, especially if there is more than two threads that need to interact and if the messages that needs to be analyzed are not that simple. That is why the NetBeans JUnit extensions library contains a support method called Log.controlFlow which registers the handler itself and does all the thread manipulation itself. The only thing that is needed is to specify the expected order of messages. A very nice thing is that the format of expected messages is THREAD: name MSG: message which exactly matches the output reported by the NbTestCase when capturing of log messages is enabled (by overriding the logLevel method). So one can just copy the output and feed it to the controlFlow method possibly without any modifications. However as the real world messages sent to logger often contain some content specific to each run (like @af52h442 that identify location of the object in memory) it is possible to use regular expressions to describe the expected messages. So here is one possible rewrite of our test that uses the Log.controlFlow method to simulate the one by one order of execution:

public class ParaelTest extends NbTestCase { 
  public ParaelTest(String testName) { 
    super(testName);
  } 
 
  protected Level logLevel() { 
    return Level.WARNING; 
  } 
 
  public void testMain() throws Exception { 
    org.netbeans.junit.Log.controlFlow(Logger.global, null, 
      "THREAD: 1st MSG: cnt: 0" + 
      "THREAD: 2nd MSG: .*0" + "THREAD: 1st MSG: ...: 1" + 
      "THREAD: 2nd MSG: cnt: 1" + 
      "THREAD: 1st MSG: cnt: 2" + 
      "THREAD: 2nd MSG: cnt: 2" + 
      "THREAD: 1st MSG: cnt: 3" + 
      "THREAD: 2nd MSG: cnt: 3" + 
      "THREAD: 1st MSG: cnt: 4" + 
      "THREAD: 2nd MSG: cnt: 4" + 
      "THREAD: 1st MSG: cnt: 5" + 
      "THREAD: 2nd MSG: cnt: 5" + 
      "THREAD: 1st MSG: cnt: 6" + 
      "THREAD: 2nd MSG: cnt: 6" + 
      "THREAD: 1st MSG: cnt: 7" + 
      "THREAD: 2nd MSG: cnt: 7" + 
      "THREAD: 1st MSG: cnt: 8" + 
      "THREAD: 2nd MSG: cnt: 8" + 
      "THREAD: 1st MSG: cnt: 9" + 
      "THREAD: 2nd MSG: cnt: 9", 500
    ); 
    Parael.main(null); 
    fail("Ok, just print the logged output"); 
  } 
}

Indeed this may not look like a big simplification, but usually the important order of execution just affects some local place and then the size of the script to replay the execution flow can be simplified much very. For example imagine that we just want the thread 1st to print 5 first, then thread 2nd print 2 and then the thread 1st continue with 6. Here would be the script:

public class ParaelTest extends NbTestCase { 
  public ParaelTest(String testName) { 
    super(testName); 
  } 
 
  protected Level logLevel() { 
    return Level.WARNING; 
  } 
 
  public void testMain() throws Exception { 
    org.netbeans.junit.Log.controlFlow(Logger.global, null, 
      "THREAD: 1st MSG: cnt: 5" + 
      "THREAD: 2nd MSG: cnt: 2" + 
      "THREAD: 1st MSG: cnt: 6", 
      5000 
    ); 
    Parael.main(null); 
    fail("Ok, just print the logged output"); 
  } 
}

As can be seen the log messages inside the application can became very useful and valuable tool for preventing unwanted moves of amoeba's edge. Not only that such messages can be enabled during execution of the application itself, but they should even be on and be captured when the code is executed in testing environment. Not only that having the log for failed tests is going to help analyzing its root cause, but if needed, the log can also serve as a script that can control the execution flow and thus became very useful in simulating extreme situations the application code can get to. Indeed this style of logging control can be taken even further. One possible style of enhancements is in the direction of better log generation and the second in better replaying of such log files. On the side of log capturing, an interesting thing to know is when a context switch - e.g. transfer of execution from one thread to another is happening - for that possible tool like dtrace could be very useful. The interesting implication of knowing about context switches and being able to replay them is that this in fact can turn a parael, non-deterministic application into a deterministic one (of course only on single processor machines). Other possible improvements might include use of AOP or any other bytecode patching techniques to insert logging code everywhere and thus getting info about the application even if it is not fully ready to provide it. Everything is possible, but just a future is going to tell which of such improvements is useful. For now it is enough to say, that NetBeans is successfully using the logging and control flow based on logging to fight with the amoeba effect in its APIs.

Personal tools
buy