
OrderOfElements

From APIDesign

Current revision (07:55, 24 February 2012)

Runtime BackwardCompatibility can be significantly influenced by changing the order of elements in a Collection or in an Array. Sure, some people would argue that depending on the order of items as served by HashSet.iterator() is unreasonable (and there is no doubt about that). However, there are less obvious cases where this dependence is relevant.

JUnit and Switch to JDK7

NetBeans uses JUnit a lot for its own internal testing. We have invested a lot of time in stabilizing our tests over the last four years. More than 8,000 NetBeans tests are known to pass reliably and about 1,000 others are known to be stable in more than 99% of cases (we mark these as randomly failing tests).

Anyway, with JDK7 being out, we started to execute the tests on both JDKs (6 and 7) to see how many regressions we have. Not being naive, we expected that there would be some. For example, JDK7 defines a broken property editor for all enum types, which takes precedence over our own property editor used on JDK6, so the tests verifying the correct behavior rightly fail. But that is OK: one cannot improve a framework without making its amoeba shape shiver a bit.

However, there is a much bigger problem: basically none of our test suites is able to pass reliably on JDK7 (while they pass in 99% of cases on JDK6). To make things even worse, the failures are random!

Random Order of Tests

After a week of investigation, I realized and proved (by reading the log files) that the order of executed testXYZ methods is random on JDK7. As the version of JUnit remains unchanged and as it only calls Class.getMethods(), the only framework to blame for shaking like an amoeba is the JDK!

Sure, reading the Javadoc of the getMethods method makes it clear that the order of returned methods can be random. But normal API users are known to be clueless! Nobody reads Javadoc while things work. And things used to work for the last four years! JDK6 (and possibly also JDK5) returns the methods in a stable order, defined by the order of methods in the source.
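The order that reflection reports can be inspected directly, without any test framework. The following is only a sketch: the MethodOrderDemo and SampleTest names are made up for illustration, and the filter merely approximates how a JUnit 3 style runner might pick testXYZ methods; it is not JUnit's actual code.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class MethodOrderDemo {
    /** Hypothetical stand-in for a JUnit 3 style test class. */
    public static class SampleTest {
        public void testZero() {}
        public void testOne() {}
        public void testTwo() {}
        public void helper() {}
    }

    /**
     * Collects the names of public, no-argument, void methods whose name
     * starts with "test", in exactly the order Class.getMethods() reports them.
     */
    public static List<String> testMethodNames(Class<?> clazz) {
        List<String> names = new ArrayList<String>();
        for (Method m : clazz.getMethods()) {
            boolean looksLikeTest = m.getName().startsWith("test")
                    && m.getParameterTypes().length == 0
                    && m.getReturnType() == void.class
                    && Modifier.isPublic(m.getModifiers());
            if (looksLikeTest) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The set of names is fixed; their order is whatever the JDK returns.
        System.out.println(testMethodNames(SampleTest.class));
    }
}
```

Running the main method several times on different JDKs shows whether the reported order matches the source order on a given runtime.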

I'm not sure what motivated the change, but if this is not a violation of BackwardCompatibility with respect to the specification, it is clearly a BackwardCompatibility problem with respect to runtime and good habits! Just execute the following test a few times on JDK6 and JDK7 and compare the results:

package org.bar;
 
import org.junit.Assert;
import org.junit.Test;
 
public class OrderedTest {
    private static int counter;
 
    @Test public void testZero() {
        Assert.assertEquals(0, counter);
        counter++;
    }
 
    @Test public void testOne() {
        Assert.assertEquals(1, counter);
        counter++;
    }
}

What can be done about JDK7 incompatibility?

We want our tests to pass on JDK7. NetBeans usually supports multiple JDK releases, and for the near future we are going to continue to compile with JDK6 and can run our tests primarily with JDK6. However, one day we will drop support for JDK6, and then we need to have stable, not randomly failing tests running on JDK7. What can we do?

Fix JDK7

Obviously, the simplest way to fix the problem is to change the JDK code to behave as it used to in JDK6. However, given that the specification explicitly permits a random order of returned methods, I don't believe there is a force on the planet that would make the JDK team make such a change. Moreover, there may even be legitimate reasons why this change had to be done (performance comes to mind).

Learn to Write Proper Tests!

I am sure that the progenitors of JUnit have been ready to advise me to learn to write proper unit tests ever since they started reading this text. Yes, they are right. Proper unit tests are not supposed to silently depend on their execution order. We should make them more robust!

How should we have known? On JDK6 the execution order is always the same. Thus there is almost no chance to run into a problem. On JDK7 the order is random, so we may be getting random failures for months before we eliminate them all.

Moreover, even if there is a failure on the server, everything works when I re-run the test locally!

Random vs. RandomizedTests

Usage of a RandomizedTest is fully acceptable. However, random does not mean randomized! The most important property of RandomizedTests is reproducibility. As soon as the execution fails, we should have a way to reproduce the failure. The JUnit+JDK7 combination is quite deadly: it provides randomness, but does not help with easy reproducibility. Could something be done about that?

First of all, assuming reproducibility is of the greatest value, JUnit3 could sort all the test methods into a fixed order (at least when running on JDK7 and future releases). This might cause one-time failures, but from then on all the tests would remain stable. Fixing the one-time failures would be straightforward anyway, as they would be naturally reproducible.
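A fixed order could be obtained by sorting whatever reflection returns. The sketch below is an assumption about how such sorting might look, not the code of JUnit or of any patch; SortedTestMethods, SampleTest and the key helper are hypothetical names. It sorts by method name followed by parameter types, so the result no longer depends on the JDK.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortedTestMethods {
    /** Hypothetical test class; the declaration order is deliberately not alphabetical. */
    public static class SampleTest {
        public void testB() {}
        public void testC() {}
        public void testA() {}
    }

    /** Sort key: the method name followed by its parameter types. */
    static String key(Method m) {
        return m.getName() + Arrays.toString(m.getParameterTypes());
    }

    /** Returns the public testXYZ methods in an order that does not depend on the JDK. */
    public static List<Method> sortedTestMethods(Class<?> clazz) {
        List<Method> tests = new ArrayList<Method>();
        for (Method m : clazz.getMethods()) {
            if (m.getName().startsWith("test")) {
                tests.add(m);
            }
        }
        Collections.sort(tests, new Comparator<Method>() {
            public int compare(Method a, Method b) {
                return key(a).compareTo(key(b));
            }
        });
        return tests;
    }

    public static void main(String[] args) {
        for (Method m : sortedTestMethods(SampleTest.class)) {
            System.out.println(m.getName()); // prints testA, testB, testC, one per line
        }
    }
}
```

Including the parameter types in the key keeps the order stable even when a class has overloaded methods with the same name.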

In case the randomness is a feature rather than a defect, JUnit could be enhanced to support reproducibility. For example, it could print the order of tests that led to a failure as part of the failure message or (as advocated at the end of the RandomizedTests overview) it could even generate code to execute the tests in the proper order.
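Reporting the order as part of the failure message might look roughly like this. It is a minimal sketch of the idea only: OrderReportingRunner is a hypothetical class, tests are modelled as plain Runnable objects, and nothing here reflects how JUnit is actually implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderReportingRunner {
    /**
     * Runs the tests in the iteration order of the map. When one fails, the
     * failure is rethrown with the list of tests executed so far, so that the
     * same sequence can be replayed later.
     */
    public static void runAll(Map<String, Runnable> tests) {
        List<String> executed = new ArrayList<String>();
        for (Map.Entry<String, Runnable> entry : tests.entrySet()) {
            executed.add(entry.getKey());
            try {
                entry.getValue().run();
            } catch (AssertionError original) {
                AssertionError wrapped = new AssertionError(
                        entry.getKey() + " failed; execution order was " + executed
                        + ": " + original.getMessage());
                wrapped.initCause(original);
                throw wrapped;
            }
        }
    }

    public static void main(String[] args) {
        final int[] counter = {0};
        Map<String, Runnable> tests = new LinkedHashMap<String, Runnable>();
        // Deliberately the "wrong" order for tests that share the counter.
        tests.put("testOne", new Runnable() {
            public void run() {
                if (counter[0] != 1) throw new AssertionError("expected 1 but was " + counter[0]);
                counter[0]++;
            }
        });
        tests.put("testZero", new Runnable() {
            public void run() {
                if (counter[0] != 0) throw new AssertionError("expected 0 but was " + counter[0]);
                counter[0]++;
            }
        });
        try {
            runAll(tests);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the order in the message, a failure seen only on the server can be replayed locally by running the listed tests in the same sequence.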

Maybe randomness is essential. Then JUnit could even mix the methods before execution to increase the likelihood of a failure (I still don't know when JDK7 mixes the methods; most of the time the order is stable on my local machine). Then it could be enough to print the seed as part of the assert and support some special mode to run with a fixed seed (again as advocated at RandomizedTests).
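The seed-based variant could be sketched as follows. Everything named here is an assumption made for illustration: the SeededOrder class and the test.order.seed system property do not exist in JUnit. The important detail is that the shuffle starts from a fixed (here: sorted) list, otherwise the same seed would not reproduce the same order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededOrder {
    /** Hypothetical system property that pins the order for a re-run. */
    static final String SEED_PROPERTY = "test.order.seed";

    /** Uses the seed from the system property when present, otherwise picks a fresh one. */
    public static long chooseSeed() {
        String fixed = System.getProperty(SEED_PROPERTY);
        return fixed != null ? Long.parseLong(fixed) : System.nanoTime();
    }

    /** Shuffles a copy of the names; the same input list and seed always give the same order. */
    public static List<String> shuffled(List<String> names, long seed) {
        List<String> copy = new ArrayList<String>(names);
        Collections.sort(copy); // start from a fixed order so the seed alone decides the result
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("testZero", "testOne", "testTwo", "testThree");
        long seed = chooseSeed();
        // Printing the seed is what makes a random failure reproducible.
        System.out.println("Test order seed: " + seed
                + " (re-run with -D" + SEED_PROPERTY + "=" + seed + ")");
        System.out.println(shuffled(names, seed));
    }
}
```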

Summary

Clearly, the order of elements returned from an API may be very significant, especially when they depend in some way on each other, like (improperly isolated) JUnit tests.

Changing an API that used to return objects in a particular order to return them randomized is a huge incompatible change in the runtime behavior of your API.


Happy End

I'll continue fixing the NetBeans JUnit tests, but to be good open source members, we also donated a patch (https://github.com/KentBeck/junit/pull/293) that makes the execution order repeatable, but not predictable (as in unit tests one should not depend on the order of tests, right?). The patch was accepted in February 2012.


mbien said ...

looks like a RFE for junit to me. I agree the execution order of test methods should not change between runs. something like @After("test5") or @Test("2") would solve it in most cases IMO. I always had the habit to sort the result of getMethods() alphabetically (method name+parameters concatenated) in my libs (mostly code generators) since i knew this might happen some time in the future.

thanks for the heads up!

--mbien 01:15, 23 August 2011 (CEST)

Hello Michael, I've been told that TestNG supports ordering of tests. Kent replied to me that methods in a JUnit test class should rather be independent. As far as the RFE for JUnit goes - right, Jesse created one. Actually, what Jesse suggests is to sort test methods alphabetically.

--JaroslavTulach 16:14, 27 August 2011 (UTC)
