Upcoming nanojit-central, import logic and graph structure

Graydon Hoare-3
Hi,

This is another super-long email about the nanojit merge. Sorry.

We're Approaching The End of the nanojit merge, in the sense that
there are no (or very few) cross-nanojit features remaining on
either fork of the nanojit-central synthetic history, and the
only substantial diffs are in the ARM backend, which several
contributors on both teams are now looking at.

In light of this, I've been making some preparations for a
"final" transition to working out of a single nanojit-central
branch, with a "reversed flow" of changes from nanojit-central
(NC) to tamarin-redux (TR) and tracemonkey (TM). When we are
finished, I will announce a flag day very visibly; that is not
this email. This email is about preparation, not pulling the
trigger.

This has involved two separate issues:

   1. Preparing a repository for standalone work.
   2. Preparing mercurial scripts for reversing the flow of
      changes.

I'll describe what I've done here on both counts. In the first
case I don't expect much objection; in the second there's
actually a question that each team will have to resolve for
itself.


1. Preparing a repository for standalone work.
----------------------------------------------

I've made a configure.in and Makefile.in that build nanojit on
what I hope is a representative assortment of platforms. Surely
not all of them, but enough to run some tests.

To test it, I've brought the old "lirasm" tool back to life, and
asked njn to spend some time fleshing out a --random mode that
fuzz-tests the public API of nanojit. The lirasm tool is a simple
assembler (with an optional assemble-and-go execution mode) for a
textual description of LIR.

My goal in doing so was to provide a "common yet trivial client"
in which both teams can encode all their nanojit-API usage
patterns, so that any change to nanojit can be accompanied by a
corresponding change to the client, and validated and tested
against it, before being committed to NC. This, I hope, might
reduce the chance of
landing something on NC that can't sensibly flow to TR and TM, or
that causes a functional regression. We'll have to see how
successful it is in that role. I'd encourage both teams to become
familiar with lirasm though, and customize it to (somewhat
artificially) "do what your client does" in any additional exotic
ways it doesn't already.


2. Preparing mercurial scripts for reversing the flow of changes.
-----------------------------------------------------------------

Here we get into ugly territory. Recall that the plan is to use
the 'convert' extension to repeatedly copy revisions from NC to
TR and TM. Every time changes land in NC, developers of TR and
TM will have to independently run a convert job that copies them
from the NC history into the larger respective TR and TM
histories (and in TM's case, rewrites the paths to embed within
js/src). I'll be wiring the convert job here into the TM
makefile, so it's a simple matter of saying "make import-nanojit"
or such.
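
To make the shape of that convert job concrete, here is a minimal
sketch of how the command line might be assembled. It is not the
actual makefile rule. The repository paths and the
"js/src/nanojit" prefix are assumptions for illustration, as is
the use of a "rename . PREFIX" filemap line to move the whole
source tree under a prefix (check 'hg help convert' before relying
on it). The convert extension has to be enabled for any of this to
run.

```python
def filemap_text(dest_prefix):
    """Build a one-line filemap that relocates every source path.

    Assumption: 'rename . PREFIX' moves the whole source tree
    under PREFIX in the destination repository.
    """
    return "rename . %s\n" % dest_prefix


def convert_command(source_repo, dest_repo,
                    filemap_path=None, splicemap_path=None):
    """Return the argv list for one 'hg convert' run.

    Only the --filemap and --splicemap options are used here;
    both are optional.
    """
    argv = ["hg", "convert"]
    if filemap_path is not None:
        argv += ["--filemap", filemap_path]
    if splicemap_path is not None:
        argv += ["--splicemap", splicemap_path]
    argv += [source_repo, dest_repo]
    return argv


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(filemap_text("js/src/nanojit"), end="")
    print(" ".join(convert_command("../nanojit-central", ".",
                                   filemap_path="nanojit.filemap")))
```

A makefile target such as "import-nanojit" would then only need to
write the filemap and run the resulting command.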

Where this gets tricky is "where in the destination history-graph
to put the copied changes". You might think this has a simple
answer -- "at the end" -- but it turns out there are two sensible
answers. So I've been exploring both.

It's easiest to explain the two with a pair of diagram
sequences:


(A) append the changes temporally, clobbering current state on
     the imported files with import state:

    ...--+--TM-trunk--+----+----*

       ||
       ||  import causes
       \/

    ...--+--TM-trunk--+----+----[+--imported-NC--+----*]

       ||
       ||  further development on trunk causes
       \/

    ...--+--TM-trunk--+----+----[+--imported-NC--+----]+----+----*



(B) append the changes logically, picking up from the last import
     point, and merge-forward the changes with the trunk:

    ...--+--TM-trunk--+----+----*

       ||
       ||  first import causes
       \/

    ...--+--TM-trunk--+----+----[+--imported-NC--+----*]

       ||
       ||  further development + merge on trunk causes
       \/

    ...--+--TM-trunk--+----+----[+--imported-NC--+----+]
                            \                           \
                             \                           \
                              \+----+----+----+----+------*

       ||
       ||  subsequent import + merge causes
       \/

    ...--+--TM-trunk--+----+----[+--imported-NC--+----+----+]
                            \                           \    \
                             \                           \    \
                              \+----+----+----+----+------+----*



In these two scenarios, we're developing very different history
graphs. In (A), it is linear, just like the existing TM graph. In
(B) we're developing a parallel lineage that represents *only*
the contiguous NC changes, with regular merge lines running over
to the trunk it's being merged with.

These scenarios have advantages and disadvantages:

   - In (A), the graph looks simpler. Bisect is possibly more
     likely to work non-confusingly. Changes, in any case, look
     like part of the regular work: you will be able to test them
     in context.

   - In (B), the graph is more honest, and you can see the NC
     changes in their parallel form, with each NC change following
     the previous NC change. But it's also more *dishonest*
     because it appears to be development "derived from" the
     initial point of divergence, rather than only being
     circumstantially associated with that point (because that was
     the tip when the *first* import happened).

     In particular: it's likely that the nodes along the parallel
     "imported NC" lineage won't actually build or work as
     composite views of TM, because the non-NC TM portions of them
     (eg. jstracer.cpp) are rooted in a very old snapshot (the
     point of the first import). As time passes, this will render
     those "honest" nodes less and less useful; it will also
     clutter the graph, which in the case of mozilla, is already
     cluttered due to multiple project branches. They can get
     pretty confusing-looking in 'hg view'.


Mercurial's 'convert' extension, by default, does (B). It does so
presumably because, in cases with a small number of conversions
or a non-repeating conversion, it's more honest. But it's not
necessarily what we're going to want.

So I've conducted an experiment in writing mercurial code. I
wrote a script that uses the mercurial API to query the
import-target graph, find the last conversion in the source, work
through the descendants of the source to isolate the first
new-incoming-conversion, and write a 'splicemap' that connects
the incoming change to the current tip of the target graph. The
'convert' extension already has support for splicemaps, and this
works. It gets us (A) if we want it. I'll post that to the wiki
presently.
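
Until that script is posted, here is a rough sketch of the
splicing step only. It is not the script itself: it does not use
the mercurial API, and it assumes the caller already knows the
source revisions in oldest-first order, the set of revisions that
have been converted, and the current tip of the target. All the
function names are hypothetical. The splicemap entry format (a
key, a space, then one or two comma-separated parents) is the one
described in 'hg help convert'.

```python
def first_new_revision(source_order, converted):
    """Return the oldest source revision not yet converted.

    source_order lists source revision ids oldest first;
    converted is the set of ids already imported. Returns None
    when there is nothing new to import.
    """
    for rev in source_order:
        if rev not in converted:
            return rev
    return None


def splicemap_line(child, parents):
    """Format one splicemap entry: 'key parent1[, parent2]'."""
    if not 1 <= len(parents) <= 2:
        raise ValueError("a revision has one or two parents")
    return "%s %s\n" % (child, ", ".join(parents))


def splice_onto_tip(source_order, converted, target_tip):
    """Build splicemap text that gives scenario (A).

    The first new incoming revision is reparented onto the
    current tip of the target, so the import is appended
    temporally. Returns an empty string if nothing is new.
    """
    child = first_new_revision(source_order, converted)
    if child is None:
        return ""
    return splicemap_line(child, [target_tip])


if __name__ == "__main__":
    # Made-up ids; real entries would use full changeset hashes.
    print(splice_onto_tip(["nc1", "nc2", "nc3"], {"nc1"}, "tmtip"),
          end="")
```

Only the first new revision needs an entry. The revisions after it
keep their natural parents, so they follow it in a line on top of
the trunk tip. Leaving the splicemap out gives the default
behaviour, (B).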

The question each team has to answer for themselves is: do you
want (A) or (B)? I'm going to propose wiring up a makefile rule
in TM that does (A). But I'm open to disagreement on this if you
feel strongly.

-Graydon
_______________________________________________
dev-tech-js-engine mailing list
https://lists.mozilla.org/listinfo/dev-tech-js-engine