Upcoming nanojit-central, import logic and graph structure
This is another super-long email about the nanojit merge. Sorry.
We're approaching the end of the nanojit merge, in the sense that
there are no (or very few) cross-nanojit features remaining on
either fork of the nanojit-central synthetic history, and the
only substantial diffs are in the ARM backend, which several
contributors on both teams are now looking at.
In light of this, I've been making some preparations for a
"final" transition to working out of a single nanojit-central
branch, with a "reversed flow" of changes from nanojit-central
(NC) to tamarin-redux (TR) and tracemonkey (TM). When we are
finished, I will announce a flag day very visibly; that is not
this email. This email is about preparation, not pulling the
trigger.
This has involved two separate issues:
1. Preparing a repository for standalone work.
2. Preparing mercurial scripts for reversing the flow of changes.
I'll describe what I've done here on both counts. In the first
case I don't expect much objection; in the second there's
actually a question that each team will have to resolve for
themselves.
1. Preparing a repository for standalone work.
I've made a configure.in and Makefile.in that build nanojit on, I
hope, a representative assortment of platforms. Surely not all,
but enough to run some tests, I hope.
To test it, I've brought the old "lirasm" tool back to life, and
asked njn to spend some time fleshing out a --random mode that
fuzz-tests the public API of nanojit. The lirasm tool is a simple
assembler (with an optional assemble-and-go execution mode) for a
textual description of LIR.
My goal in doing so was to provide a "common yet trivial client"
in which both teams can encode all their nanojit-API usage
patterns, so that any change to nanojit can be accompanied by a
corresponding change to the client, and validated and tested
there before being committed to NC. This, I hope, might reduce
the chance of
landing something on NC that can't sensibly flow to TR and TM, or
that causes a functional regression. We'll have to see how
successful it is in that role. I'd encourage both teams to become
familiar with lirasm though, and customize it to (somewhat
artificially) "do what your client does" in any additional exotic
ways it doesn't already.
2. Preparing mercurial scripts for reversing the flow of changes.
Here we get into ugly territory. Recall that the plan is to use
the 'convert' extension to repeatedly copy revisions from NC to
TR and TM. Every time changes land in NC, developers of TR and
TM will have to independently run a convert job that copies them
from the NC history into the larger respective TR and TM
histories (and in TM's case, rewrites the paths to embed within
js/src). I'll be wiring the convert job here into the TM
makefile, so it's a simple matter of saying "make import-nanojit".
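As a rough sketch of what such a convert job involves (this is
not the actual makefile rule): the convert extension accepts a
--filemap file whose "rename" directives rewrite paths during the
copy. The repository locations and the nanojit -> js/src/nanojit
mapping below are assumptions made up for illustration; the code
only builds the filemap text and the command line, it does not
run anything.

```python
def make_filemap(renames):
    """Render convert-extension 'rename' directives, one per line."""
    lines = ["rename {} {}\n".format(src, dst)
             for src, dst in sorted(renames.items())]
    return "".join(lines)


def convert_command(source_repo, dest_repo, filemap_path):
    """Build the argv for one 'hg convert' run (does not execute it)."""
    return [
        "hg",
        "--config", "extensions.convert=",  # enable the bundled extension
        "convert",
        "--filemap", filemap_path,
        source_repo,
        dest_repo,
    ]


if __name__ == "__main__":
    # Hypothetical layout: NC keeps its sources under nanojit/ and the
    # TM tree wants them under js/src/nanojit/.
    print(make_filemap({"nanojit": "js/src/nanojit"}), end="")
    # Hypothetical repository paths and filemap filename.
    print(" ".join(convert_command("../nanojit-central", ".",
                                   "nanojit-filemap")))
```

TR would presumably use the same command with a different (or
empty) filemap, since only TM needs the paths rewritten.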
Where this gets tricky is "where in the destination history-graph
to put the copied changes". You might think this has a simple
answer -- "at the end" -- but it turns out there are two sensible
answers. So I've been exploring both.
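It's easiest to explain the two as a pair of scenarios:
(A) append the changes temporally, clobbering current state on
the imported files with import state;
(B) put the changes on a parallel lineage of their own, rooted
at the tip as of the first import, and merge that lineage
into the trunk after each import.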
In these two scenarios, we're developing very different history
graphs. In (A), it is linear, just like the existing TM graph. In
(B) we're developing a parallel lineage that represents *only*
the contiguous NC changes, with regular merge lines running over
to the trunk it's being merged with.
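To make the difference in shape concrete, here is a toy model of
the two graphs as child -> parents maps. It is only a sketch: the
node names (t0, n1, "merge") are invented, and it assumes a
single batch of NC changes arriving after the trunk has moved on
from the point of the first import.

```python
def chain(nodes, start=None):
    """Link nodes into a linear chain, optionally hanging off 'start'."""
    parents = {}
    prev = start
    for node in nodes:
        parents[node] = [prev] if prev is not None else []
        prev = node
    return parents


def import_linear(trunk, nc_changes):
    """Scenario (A): NC changes are chained onto the current trunk tip."""
    return chain(trunk + nc_changes)


def import_parallel(trunk, nc_changes, first_import_tip):
    """Scenario (B): NC changes form their own lineage, then get merged."""
    parents = chain(trunk)
    parents.update(chain(nc_changes, start=first_import_tip))
    # One merge node joins the trunk tip and the end of the NC lineage.
    parents["merge"] = [trunk[-1], nc_changes[-1]]
    return parents


def merge_nodes(parents):
    """Return the nodes that have two parents."""
    return sorted(n for n, ps in parents.items() if len(ps) == 2)


if __name__ == "__main__":
    trunk, nc = ["t0", "t1", "t2"], ["n1", "n2"]
    print(import_linear(trunk, nc))
    print(import_parallel(trunk, nc, first_import_tip="t0"))
```

In the (A) map every node has at most one parent; in the (B) map
n1 hangs off the old t0 and the only link back to the live trunk
is the merge node.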
These scenarios have advantages and disadvantages:
- In (A), the graph looks simpler. Bisect is possibly more
likely to work non-confusingly. Changes, in any case, look
like part of the regular work: you will be able to test them
- In (B), the graph is more honest, and you can see the NC
changes in their parallel form, with each NC change following
the previous NC change. But it's also more *dishonest*
because it appears to be development "derived from" the
initial point of divergence, rather than only being
circumstantially associated with that point (because that was
the tip when the *first* import happened).
In particular: it's likely that the nodes along the parallel
"imported NC" lineage won't actually build or work as
composite views of TM, because the non-NC TM portions of them
(eg. jstracer.cpp) are rooted in a very old snapshot (the
point of the first import). As time passes, this will render
those "honest" nodes less and less useful; it will also
clutter the graph, which, in the case of mozilla, is already
cluttered due to multiple project branches. Such graphs can get
pretty confusing-looking in 'hg view'.
Mercurial's 'convert' extension, by default, does (B). It does so
presumably because, in cases with a small number of conversions
or a non-repeating conversion, it's more honest. But it's not
necessarily what we're going to want.
So I've conducted an experiment in writing mercurial code. I
wrote a script that uses the mercurial API to query the
import-target graph, find the last conversion in the source, work
through the descendants of the source to isolate the first
new-incoming-conversion, and write a 'splicemap' that connects
the incoming change to the current tip of the target graph. The
'convert' extension already has support for splicemaps, and this
works. It gets us (A) if we want it. I'll post that to the wiki.
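The core of that logic can be sketched without the mercurial API
at all. This is not the script itself, just an illustration under
simplifying assumptions: the source history is linear and given
oldest-first, the set of already-converted source revisions is
known, and the revision IDs are made up. A splicemap line is the
source revision ID, a space, and the parent(s) to give it.

```python
def first_unconverted(source_revs, converted):
    """Return the first source revision after the last converted one.

    source_revs is oldest-first; converted is a set of source IDs.
    Returns None when there is nothing new to import.
    """
    last = -1
    for index, rev in enumerate(source_revs):
        if rev in converted:
            last = index
    if last + 1 < len(source_revs):
        return source_revs[last + 1]
    return None


def build_splicemap(source_revs, converted, target_tip):
    """Return splicemap text that reparents the first new revision
    onto the current tip of the target repository."""
    incoming = first_unconverted(source_revs, converted)
    if incoming is None:
        return ""
    return "{} {}\n".format(incoming, target_tip)


if __name__ == "__main__":
    # Hypothetical IDs: nc1..nc4 in NC, tm9 as the current TM tip.
    print(build_splicemap(["nc1", "nc2", "nc3", "nc4"],
                          {"nc1", "nc2"}, "tm9"), end="")
```

Only the first incoming revision needs an entry; the later ones
keep their natural parents and so follow it onto the tip.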
The question each team has to answer for themselves is: do you
want (A) or (B)? I'm going to propose wiring up a makefile rule
in TM that does (A). But I'm open to disagreement on this if you