

From APIDesign

Revision as of 08:51, 6 June 2017 by JaroslavTulach (Talk | contribs)

Recently the Truffle project decided to change its repository structure. Rather than keeping Truffle and its API in its own repository, the project decided to merge the code into a common repository with the Graal compiler (currently available here). This was a simple change from the point of view of the code - merging two Git repositories into one is a matter of pull and merge. However, a working system isn't just about the code: there are other services around it, and they may break when the code gets moved elsewhere.

The Truffle repository was primarily the API repository, with its own release cycles, evolution rules and special tools associated with it. Most importantly there was the TruffleSigtest setup - the netbeans:Sigtest tool is something everyone who thinks seriously about API design should include in their workflow. Some parts of TruffleSigtest rely just on the code - i.e. they still work - and it is still possible to execute:

$ mx sigtest --generate
$ mx sigtest --check binary
$ mx sigtest --check all

However, other parts of the system (especially the Notification of Daily Changes) are more cloud-like - i.e. they require properly orchestrated Jenkins builders to be useful. Obviously the builders stopped working once the original Truffle repository changed its location. Last Friday I was asked to fix that.


Just the Code

Of course, I could probably fix the repository URLs in the Jenkins jobs and everything would work again, but there is something attractive in the JustCode approach!

Continuous integration as done by Jenkins builds on the code, but also allows one to keep a significant amount of configuration on the server - the selection of the JDK, Ant and Maven is done on the server. One can also pass in additional parameters and even execute custom build scripts. All this configuration is hidden from the people who check the repository out. As a result it may be hard, or even impossible, to fully reproduce the continuous build.

Compare this approach with the Travis one. Travis (the de facto standard continuous integration tool on GitHub) keeps its configuration file, ```.travis.yml```, in the root of the Git repository. As such, everything that gets executed during the continuous build is visible to every user of the code - and can be more easily reproduced. I often start by looking at the content of the ```.travis.yml``` file when working with foreign repositories - this file contains the description of the Truth - it describes what will happen to validate the changes in your pull request.

Bugs Belong to the Code

Another example is related to the NetBeans Apache donation. We are almost ready to donate the code, but NetBeans isn't just about the code. NetBeans is an open source project with its own issue tracker, Bugzilla. While it is easy to convert our Hg repositories to Git and give them to Apache, converting the Bugzilla content (and keeping the issue numbers) is much harder.

Maybe it would be much easier if the bugs were also stored in Git next to the code. We have a distributed version control system for code - why not have one for bugs as well? With the bug tracker next to the code, every bugfix would immediately associate changes in the code with changes in the bug database - something every real project needs and currently solves with various external tools that synchronize changesets with bugs and vice versa.

Maybe everything belongs to the code!? Including continuous builder setup as well as bug repositories. Maybe also the TruffleSigtest infrastructure should be fully incorporated in the Git repository!


One problem with BinaryCompatibility is that it isn't only a function of the current state, but also of the previous one. This is similar to running benchmarks - you aren't concerned just about the current state, but also about the changes. Did we speed something up since yesterday? Did we regress? Such an approach needs to step out of the code and create a system that maintains the history. Usually teams create a database and store the results in it.

The same can be done with APIs, and the current Truffle API check was doing exactly that. It remembered the previous signatures of the API and after each commit compared them with their new state. If there was a difference, it sent an email. The hope was to give people an early alert about changes, so they could fix their (potential) mistakes before the release.

In the Code

However, a similar effect could be achieved by storing the exact signatures in the Git repository and failing the build if they differ. That would require everyone who makes an API change to perform an additional step: to update the .sigtest files stored in the repository. That is somewhat annoying, but the benefit is clear - the knowledge of what is and what isn't an API is encoded directly in the code - i.e. all we need is JustCode. There is no need for additional Jenkins builders or for keeping the history - the snapshot of the API is part of the Git state and always in sync with the code.
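To make the idea concrete, here is a minimal sketch of such a check in Python. It is not how the Sigtest tool works internally: it assumes, purely for illustration, that a signature snapshot can be treated as plain text with one signature per line, and the names `check_snapshot` and `build_status` are hypothetical.

```python
import difflib
from typing import List


def check_snapshot(committed: str, generated: str) -> List[str]:
    """Return unified-diff lines between the committed snapshot and the
    freshly generated one. An empty list means the API did not change.

    Assumption: one signature per line; ordering and blank lines are ignored.
    """
    old = sorted(line.strip() for line in committed.splitlines() if line.strip())
    new = sorted(line.strip() for line in generated.splitlines() if line.strip())
    return list(
        difflib.unified_diff(old, new, fromfile="committed", tofile="generated", lineterm="")
    )


def build_status(committed: str, generated: str) -> int:
    """Print the difference (if any) and return a process exit code:
    0 when the snapshot matches the code, 1 when the build should fail."""
    diff = check_snapshot(committed, generated)
    for line in diff:
        print(line)
    return 1 if diff else 0
```

A build script would read the committed snapshot from the working tree, regenerate the signatures from the compiled classes, and exit with the value returned by `build_status`. Anyone who changes the API then has to commit the updated snapshot together with the code change to get a green build.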


Probably one still wants a binary compatibility check as well as the change check: changing some API should yield a warning, but changing something incompatibly (with respect to an already released version) needs to trigger an alert. Such a system would result in two .sigtest files being in the repository - one capturing the last release and one capturing the current state. How to instruct developers to freely adjust the current-state file and almost never touch the last-released one is an additional organizational problem. But it is similar to the need to update the .sigtest content once a release is done - some things can't be JustCode; some things require a certain level of developer discipline.
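The two-file arrangement could be sketched as follows. This is a deliberately simplified model, not the Sigtest algorithm: it assumes signatures are plain strings and treats "a released signature disappeared" as the only kind of incompatible change. The names `Verdict` and `classify` are hypothetical.

```python
from enum import Enum
from typing import Set


class Verdict(Enum):
    OK = "ok"            # code matches the current-state snapshot
    WARNING = "warning"  # API changed, but nothing released was broken
    ALERT = "alert"      # a signature from the last release is gone


def classify(released: Set[str], snapshot: Set[str], actual: Set[str]) -> Verdict:
    """Compare the signatures found in the code (`actual`) against both files.

    released: signatures captured at the last release (rarely edited)
    snapshot: signatures captured for the current state (freely edited)
    """
    if released - actual:
        # Simplifying assumption: removing a released signature is the
        # only incompatible change this sketch detects.
        return Verdict.ALERT
    if snapshot != actual:
        return Verdict.WARNING
    return Verdict.OK


released = {"A.foo()"}
snapshot = {"A.foo()", "A.bar()"}
print(classify(released, snapshot, {"A.foo()", "A.bar()"}))  # Verdict.OK
print(classify(released, snapshot, {"A.foo()"}))             # Verdict.WARNING
print(classify(released, snapshot, {"A.bar()"}))             # Verdict.ALERT
```

Checking the released file first means an incompatible change is always reported as an alert, even when the developer has already updated the current-state snapshot to match.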
