LibraryReExportIsNPComplete
From APIDesign
Reusing libraries produced by others is essential aspect of DistributedDevelopment. It simplifies Time To Market, it reduces long term cost of ownership and leads to creation of good technologies. However it does not come for free. Read details or directly jump to the implications that shall improve your every day development habits.
This page starts by describing a way to convert any 3SAT problem to a solution of finding whether there is a way to satisfy all dependencies of a library in a repository of libraries. Thus proving that the later problem is NP-Complete. Then it describes the importance of such observations on our development practices.
Contents |
3SAT
The problem of satisfying a logic formula remains NP-complete even if all expressions are written in wikipedia::conjunctive normal form with 3 variables per clause (3-CNF), yielding the 3SAT problem. This means the expression has the form:
- Failed to parse (unknown error): (x_{11} \vee x_{12} \vee x_{13}) \wedge
- Failed to parse (unknown error): (x_{21} \vee x_{22} \vee x_{23}) \wedge
- Failed to parse (unknown error): (x_{31} \vee x_{32} \vee x_{33}) \wedge
- Failed to parse (unknown error): ...
- Failed to parse (unknown error): (x_{n1} \vee x_{n2} \vee x_{n3})
where each Failed to parse (unknown error): x_{ab}
is a variable Failed to parse (unknown error): v_i or a negation of a variable Failed to parse (unknown error): \neg v_i
. Each variable Failed to parse (unknown error): v_i
can appear multiple times in the expression.
Library Versioning Terminology
Let Failed to parse (unknown error): A, B, C, ...
denote various modules and their APIs.
Let Failed to parse (unknown error): A_{1.0}, A_{1.1}, A_{1.7}, A_{1.11}
denote compatible versions of module Failed to parse (unknown error): A
.
Let Failed to parse (unknown error): A_{1.0}, A_{2.0}, A_{3.1}
denote incompatible versions of module Failed to parse (unknown error): A
.
Let Failed to parse (unknown error): A_{x.y} > B_{u.v}
denote the fact that version x.y of module A depends on version u.v of module B.
Let Failed to parse (unknown error): A_{x.y} \gg B_{u.v}
denote the fact that version x.y of module A depends on version u.v of module B and that it re-exports module B's API to users of own API.
Let Repository Failed to parse (unknown error): R=(M,D)
be any set of modules with their various versions and their dependencies on other modules with or without re-export.
Let C be a Configuration in a repository Failed to parse (unknown error): R=(M,D) , if Failed to parse (unknown error): C \subseteq M , where following is satisfied:
- Failed to parse (unknown error): \forall A_{x.y} \in C, \forall A_{x.y} \gg B_{u.v} \in D \Rightarrow \exists w >= v \wedge B_{u.w} \in C
- each re-exported dependency is satisfied with some compatible version
- Failed to parse (unknown error): \forall A_{x.y} \in C, \forall A_{x.y} > B_{u.v} \in D \Rightarrow \exists w >= v \wedge B_{u.w} \in C
- each dependency is satisfied with some compatible version
- Let there be two chains of re-exported dependencies Failed to parse (unknown error): A_{p.q} \gg ... \gg B_{x.y}
and Failed to parse (unknown error): A_{p.q} \gg ... \gg B_{u.v} then Failed to parse (unknown error): x = u \wedge y = v - this guarantees that each class has just one, exact meaning for each importer
Module Dependency Problem
Let there be a repository Failed to parse (unknown error): R=(M,D)
and a module Failed to parse (unknown error): A \in M
. Does there exist a configuration Failed to parse (unknown error): C
in the repository Failed to parse (unknown error): R
, such that the module Failed to parse (unknown error): A \in C , e.g. the module can be enabled?
Conversion of 3SAT to Module Dependencies Problem
Let there be 3SAT formula with with variables Failed to parse (unknown error): v_1, ..., v_m
as defined above.
Let's create a repository of modules Failed to parse (unknown error): R . For each variable Failed to parse (unknown error): v_i
let's create two modules Failed to parse (unknown error): M^i_{1.0} and Failed to parse (unknown error): M^i_{2.0}
, which are mutually incompatible and put them into repository Failed to parse (unknown error): R .
For each formula Failed to parse (unknown error): (x_{i1} \vee x_{i2} \vee x_{i3})
let's create a module Failed to parse (unknown error): F^i
that will have three compatible versions. Each of them will depend on one variable's module. In case the variable is used with negation, it will depend on version 2.0, otherwise on version 1.0. So for formula
- Failed to parse (unknown error): v_a \vee \neg v_b \vee \neg v_c
we will get:
- Failed to parse (unknown error): F^i_{1.1} \gg M^a_{1.0}
- Failed to parse (unknown error): F^i_{1.2} \gg M^b_{2.0}
- Failed to parse (unknown error): F^i_{1.3} \gg M^c_{2.0}
All these modules and dependencies are added into repository Failed to parse (unknown error): R
Now we will create a module Failed to parse (unknown error): T_{1.0}
that depends all formulas:
- Failed to parse (unknown error): T_{1.0} \gg F^1_{1.0}
- Failed to parse (unknown error): T_{1.0} \gg F^2_{1.0}
- ...
- Failed to parse (unknown error): T_{1.0} \gg F^n_{1.0}
and add this module as well as its dependencies into repository Failed to parse (unknown error): R .
Claim: There Failed to parse (unknown error): \exists C
(a configuration) of repository Failed to parse (unknown error): R and Failed to parse (unknown error): T_{1.0} \in C Failed to parse (unknown error): \Longleftrightarrow there is a solution to the 3SAT formula.
Proof
"Failed to parse (unknown error): \Leftarrow ": Let's have an evaluation of each variable to either true or false that evaluates the whole 3SAT formula to true. Then
- Failed to parse (unknown error): C = \{ T_{1.0} \} \bigcup
- Failed to parse (unknown error): \{ M^i_{1.0} : v_i \} \bigcup \{M^i_{2.0} : \neg v_i \} \bigcup
- Failed to parse (unknown error): \{ F^i_{1.1} : x_{i1} \} \bigcup \{ F^i_{1.2} : \neg x_{i1} \wedge x_{i2} \} \bigcup \{ F^i_{1.3} : \neg x_{i1} \wedge \neg x_{i2} \wedge x_{i3} \}
It is clear from the definition that each Failed to parse (unknown error): M^i
and Failed to parse (unknown error): F^i can be in the Failed to parse (unknown error): C just in one version. Now it is important to ensure that each module is present always at least in one version. This is easy for Failed to parse (unknown error): M^i as its Failed to parse (unknown error): v_i needs to be true or false, and that means one of Failed to parse (unknown error): M^i_{1.0} or Failed to parse (unknown error): M^i_{2.0} will be included. Can there be a Failed to parse (unknown error): F^i which is not included? Only if Failed to parse (unknown error): \neg x_{i1} \wedge \neg x_{i2} \wedge \neg x_{i3} but that would mean the whole 3-or would evaluate to false and as a result also the 3SAT formula would evaluate to false. This means that dependencies of Failed to parse (unknown error): T_{1.0} on Failed to parse (unknown error): F^i modules are satisfied. Are also dependencies of every Failed to parse (unknown error): F^i_{1.q} satisfied? From all the three versions, there is just one Failed to parse (unknown error): F^i_{1.q}
, the one its Failed to parse (unknown error): x_{iq}
evaluates to true. However Failed to parse (unknown error): x_{iq} can either be without negation, and as such Failed to parse (unknown error): F^i_{1.q} depends on Failed to parse (unknown error): M^j_{1.0} which is included as Failed to parse (unknown error): v_j is true. Or Failed to parse (unknown error): x_{iq} contains negation, and as such Failed to parse (unknown error): F^i_{1.q} depends on Failed to parse (unknown error): M^j_{2.0} which is included as Failed to parse (unknown error): v_j is false.
"Failed to parse (unknown error): \Rightarrow ": Let's have a Failed to parse (unknown error): C
configuration satisfies all dependencies of Failed to parse (unknown error): T_{1.0}
. Can we also find positive valuation of 3SAT formula?
For Failed to parse (unknown error): i -th 3-or there is Failed to parse (unknown error): T_{1.0} \gg F^i_{1.0}
dependency which is satisfied. At least by one from Failed to parse (unknown error): F^i_{1.0} or Failed to parse (unknown error): F^i_{1.1} or Failed to parse (unknown error): F^i_{1.2}
. The one Failed to parse (unknown error): F^i
that has the satisfied dependency reexports Failed to parse (unknown error): M^j_{1.0} (which means Failed to parse (unknown error): v_j = true
) or Failed to parse (unknown error): M^j_{2.0}
(which means Failed to parse (unknown error): v_j = false
). Anyway each Failed to parse (unknown error): i
3-or evaluates to Failed to parse (unknown error): true
.
The only remaining question is whether a Failed to parse (unknown error): C
configuration can force truth variable Failed to parse (unknown error): v_j to be true in one 3-or and false in another. However that would mean that there is re-export via Failed to parse (unknown error): T_{1.0} \gg F^i_{1.x} \gg M^j_{1.0} and also another one via Failed to parse (unknown error): T_{1.0} \gg F^p_{1.u} \gg M^j_{2.0}
. However those two chain of dependencies ending in different versions of Failed to parse (unknown error): M^j
cannot be in one Failed to parse (unknown error): C as that breaks the last condition of configuration definition. Thus each Failed to parse (unknown error): M^j is represented only by one version and each Failed to parse (unknown error): v_j is evaluated either to true or false, but never both.
The 3SAT formula's evaluation based on the configuration Failed to parse (unknown error): C
is consistent and satisfies the formula.
qed.
Implications
If there is a repository of modules in various (incompatible) versions, with mutual dependencies and re-export of their APIs, then deciding whether some of them can be enabled is NP-complete problem.
As NP-complete problems are hard to solve, it is usually our best desire to avoid them in real life situations. What does that mean in case one decides to practise DistributedDevelopment (and it is inevitable that software for 21st century needs this development style)? If you want to avoid headaches with finding the right configuration of various version of the libraries that allows to execute them together, then stick with following simple rules.
Be Compatible!
If you develop your own libraries in backward compatible way, you can always select the most recent version of each library. That is the configuration you are looking for. It is easy to find (obviously) and also it is the most desirable, as it delivers the most modern features and bugfixes that users of such libraries want.
Reuse with Care!
If you happen to reuse libraries (and you should because reuse lowers Time_To_Market, just like it did for me when I was publishing my first animated movie), then choose such libraries that can be trusted to evolve compatibly.
Hide Incompatibilities!
If you happen to reuse a library that cannot be trusted to keep its BackwardCompatibility, then do whatever you can to not re-export its APIs! This has been discussed in Chapter 10, Cooperating with Other APIs, but in short: If you hide such library for internal use and do not export any of its interfaces, you can use whatever version of library you want (even few years older) and nobody shall notice. Moreover in many module systems there can even be multiple versions of the same library in case they are not re-exported.
Conclusion
Avoid complexities and NP-complete problems. Learn to develop in backward compatible way. Reading TheAPIBook is a perfect entry point into such compatible software design of the 21st century.
External Links
- discussion at Lambda the Ultimate
- LtU guys pointed out that the proof has already been published: D. Burrows, Modelling and Resolving Software Dependencies