'. '

RangeDependenciesAnalysed

From APIDesign

(Difference between revisions)
Jump to: navigation, search
(Avoiding NP-Completeness)
(Avoiding NP-Completeness)
Line 47: Line 47:
From the things described so far it seems that [[RangeDependencies]] may not be as bad, if we could eliminate the [[NP-Complete]] problems [[RangeDependencies|associated with them]]! If we could effectively find out whether dependencies of a module can be satisfied! Following lines show a way to do it.
From the things described so far it seems that [[RangeDependencies]] may not be as bad, if we could eliminate the [[NP-Complete]] problems [[RangeDependencies|associated with them]]! If we could effectively find out whether dependencies of a module can be satisfied! Following lines show a way to do it.
-
[[TBD]]
+
=== Module Repository with Ranges ===
 +
 
 +
Let <math>A, B, C, ...</math> denote various modules and their [[API]]s.
 +
 
 +
Let <math>A_{1.0}, A_{1.1}, A_{1.7}, A_{1.11}, A_{2.1}, A_{2.9}, A_{3.0}</math> denote versions of module <math>A</math>. It is not really important what format the versions use, the only condition is that there is a [[wikipedia:Linear_orderlinear order]] among the versions.
 +
 
 +
Let <math>A_{x} \rightarrow B_{[u,v)}</math> denote the fact that version ''x'' of module A depends on version range <math>[u, v)</math> of module ''B''. E.g. any version of module ''B'' between u and less than v can be used to satisfy A's need.
 +
 
 +
Let ''Repository'' <math>R=(M,D)</math> be any set of modules with their versions and their dependencies on other modules.
 +
 
 +
=== Complete Module Repository with Ranges ===
 +
 
 +
A repository <math>R=(M,D)</math> is said to be ''complete'' if '''the transitive dependencies are explicitly enumerated''':
 +
 
 +
<math>A_{x} \rightarrow B_{[u,v)} \in D \wedge B_{u} \rightarrow C_{[p.q)} \in D \Rightarrow \exists r \exist s [r,s) \subseteq [p,q) A_{x} \rightarrow C_{[r,s)} \in D</math>

Revision as of 09:32, 8 January 2012

When analysing dependencies it can be easily proven that RangeDependencies (as used by OSGi) are too flexible and may lead to NP-Complete problems. However recently I started to ask following question - it is known that LibraryReExportIsNPComplete and that the problem can be fixed by generating so called complete repositories. Can't the same trick be applied to repositories using RangeDependencies as well?

My current feeling is that it can. This page is my attempt to formalize the feeling to something more material.

Contents

Lower Bound

Each dependency has a lower bound, a minimal required version of a library that we can work against. When producing our application, it is in our interest to verify that we can really compile and create our application against the lower bound. To make sure we are using only classes and methods available in this such lower bound version. The easiest way to ensure this is to compile against the lower bound version of each library we define a dependency on.

Or from the opposite perspective: Compiling against newer version than the lower bound is too errorprone. NetBeans allows this and almost always this leads to LinkageErrors that render the whole application unusable at random and unexpected moments.

Lower bound is the version that we compile against.

Upper bound

Upper bound on the other hand is the version that we know we can still work with.

There are just two vague terms in the previous sentence. What does that mean to know?

  • Know can mean to be sure. E.g. we have verified that we really work with that version. We have downloaded that version of a library, executed all our tests and certified it. We really know our application works with that version. In some situations this kind of knowledge is required.
  • Know can however mean to believe. Often we can trust the producer of the library, that they will conform to some versioning scheme (like Semantic versioning) and based on such trust we can estimate the upper bound.

What does it mean to work?

  • Work can mean really runs as expected and this kind of verification is hard, tough to automate. It requires work of a quality department and is more a process decision than something scientifically definable.
  • Work can be reduced to links, this is definitely much less satisfying answer (but at least it prevents the linkage errors, so it gives the application a chance to recover), but it is easily measurable. BinaryCompatibility can be verified by tools like Sigtest.

In case we want to believe our application will sort of work and for sure at least link to new version, it is better to use predefined, widely accepted RangeDependencies - e.g. when using a library with lower bound 1.5, the upper bound would automatically be 2.0. The range would be [1.5,2.0).

If one wants to be sure application really runs as expected, it is wiser to specify narrow range. If version 1.5 is the newest version, one can even use [1.5,1.5] and only when new versions are released, release new versions of the application and expand the range to [1.5,1.6].

The latter, paranoiac approach, is likely to be used more often in end user applications. The first, optimistic one, seems more suitable for reusable libraries.

Stability of Published API

How one specifies how stable just published API is? The semantic versioning suggests to treat all releases up to changed major version (e.g. up to 2.0.0) as being BinaryCompatible. The NetBeans API Stability proposal allows each API to be attributed as stable, devel or private. The netbeans:NbmPackageStability goes even further and suggests to use different ranges for different types of the API. Is this the right approach?

In ideal, rationalistic world, it is. Under the assumption that API designers are rationalistic and understand the versioning issues and the importance of proper dependency management, it is fine to expect that they will do everything to follow principles of semantic versioning (or netbeans:NbmPackageStability guidelines) and either keep BackwardCompatibility or bump the major version. In rationalistic world, it is enough to believe our application will sort of work and at least link and the rest will be achieved by the good work of the other, rationalistic people.

The trouble is that we live in clueless world and most of the people (including API designers) don't have time to fully evaluate negative influence of their changes. When producers of up-stream library make a mistake, there should be a way for the consumers of such API to recover and shield themselves. From this thought there is just a small step to the idea of giving the users of an API right to classify its stability.

Should there be a repository of modules and their dependencies including module M in versions 1.1,1.2,1.5,2.0,2.5,3.0. If majority of other modules having a dependency on M uses range [1.1,2.0) (e.g. the classical range for semantic versioning), then we can deduce that the module M is doing good job keeping BackwardCompatibility. If many dependencies on M use for example [1.1,1.5), then we can deduce that something unnatural (from the point of semantic versioning happened in version 1.5) and that the owner of the module M underestimated the impact of changes in version 1.5. On the other hand, if many dependencies on M use range like [1.1,3.0) then the incompatible change was probably not as incompatible as it might have seen.

The beauty of analysing the overall dependencies on a module is that it clearly expresses the statistical probability of BackwardCompatibility as seen by all users. It also allows the users to be sure application really runs as expected if the dependency is satisfied. Of course, the author of such API should set the expectations up by using stability category, but the power relies in hands of the users.

Avoiding NP-Completeness

From the things described so far it seems that RangeDependencies may not be as bad, if we could eliminate the NP-Complete problems associated with them! If we could effectively find out whether dependencies of a module can be satisfied! Following lines show a way to do it.

Module Repository with Ranges

Let A,B,C,... denote various modules and their APIs.

Let A1.0,A1.1,A1.7,A1.11,A2.1,A2.9,A3.0 denote versions of module A. It is not really important what format the versions use, the only condition is that there is a wikipedia:Linear_orderlinear order among the versions.

Let A_{x} \rightarrow B_{[u,v)} denote the fact that version x of module A depends on version range [u,v) of module B. E.g. any version of module B between u and less than v can be used to satisfy A's need.

Let Repository R = (M,D) be any set of modules with their versions and their dependencies on other modules.

Complete Module Repository with Ranges

A repository R = (M,D) is said to be complete if the transitive dependencies are explicitly enumerated:

A_{x} \rightarrow B_{[u,v)} \in D \wedge B_{u} \rightarrow C_{[p.q)} \in D \Rightarrow \exists r \exist s [r,s) \subseteq [p,q) A_{x} \rightarrow C_{[r,s)} \in D

Personal tools
buy