Re: Raw MIME API proposal

Joshua Cranmer-2
On 07/31/2011 03:58 AM, Andrew Sutherland wrote:
> The JS engine has available/can have exposed to it typed arrays,
> https://developer.mozilla.org/en/JavaScript_typed_arrays
Typed arrays are nice, but our I/O layer can't spit out a typed array
yet, to my knowledge, so we at least have to put a shell around it in C++.
>> Finally, a C++
>> implementation gives us better leeway in dealing with multithreaded
>> APIs.
>
> I presume the benefit in this case would be that if we parse on one
> thread and consume on another thread, then copying costs could be
> avoided through use of reference-counted global-heap-managed objects?
Well, this is my knowledge of the current state of multithreading in JS:
1. JS XPCOM components cannot be accessed off the main thread
2. If you want to use multiple threads, you have to use workers
3. In worker threads, you can only access interfaces that are explicitly
marked as threadsafe via nsIClassInfo
4. Necko interfaces claim to only work on the main thread

In other words, if it were implemented in JS, to my knowledge, you could
only access the MIME tree via the main thread. I don't particularly care
about parsing and consuming on different threads, but I do want to make
sure that the MIME is not inaccessible to other threads.
> A relevant question would also be whether the (XPCOM) cycle collector
> is multi-thread-aware?  While the MIME hierarchy is indeed a tree and
> so can be represented without cycles, that kind of invariant is easy
> to screw up, especially if one of the goals is to let third-party
> extensions augment the tree, we don't want them to be able to regress
> memory behavior.
The cycle collector is not really thread-aware: it can only handle
refcount changes on the main thread (or the cycle collector thread).
> And along those lines, one might argue that it's much easier for
> third-parties to write and ship pure JS code than build and link it on
> all supported platforms given the release cadence.  I'll cite rkent's
> unfortunate build problems he has been asking about to that end :(
I think the most appropriate answer here is jsctypes, particularly if
bug 593484 can be fixed.

Even if that's not the case, I still wonder how extensible raw MIME
parsing needs to be. It seems sufficiently rare to me that I think we
can handle the pain of binary issues until jsctypes or similar give us a
better story, especially if we expose it via a raw C ABI (which we need
to do for jsctypes anyways).
_______________________________________________
dev-apps-thunderbird mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-apps-thunderbird

Re: Raw MIME API proposal

Andrew Sutherland-3
On 07/31/2011 12:14 PM, Joshua Cranmer wrote:

> Well, this is my knowledge of the current state of multithreading in JS:
> 1. JS XPCOM components cannot be accessed off the main thread
> 2. If you want to use multiple threads, you have to use workers
> 3. In worker threads, you can only access interfaces that are explicitly
> marked as threadsafe via nsIClassInfo
> 4. Necko interfaces claim to only work on the main thread
>
> In other words, if it were implemented in JS, to my knowledge, you could
> only access the MIME tree via the main thread. I don't particularly care
> about parsing and consuming on different threads, but I do want to make
> sure that the MIME is not inaccessible to other threads.

Yes, if exposed exclusively via XPConnect to C++ code, that could be
troublesome.  But it's not that hard to spin up a JS runtime on another
thread and expose a lightweight C++ wrapper class that is based on using
the JS API to traverse the JS MIME object hierarchy.

Note that my value of "that hard" is of course a relative thing given
that you are already talking about reimplementing the parser in C++ in
an XPCOM/multi-threaded happy way.  If the comparison were using an
existing and known/proven already very working with active users (I
believe Jonathan Kamens is interested in such a strategy) via jsctypes,
that is indeed a different issue.


> Even if that's not the case, I still wonder how extensible raw MIME
> parsing needs to be. It seems sufficiently rare to me that I think we
> can handle the pain of binary issues until jsctypes or similar give us a
> better story, especially if we expose it via a raw C ABI (which we need
> to do for jsctypes anyways).

I agree that it does not seem like the type of thing that needs to be
endlessly extensible.  If we don't start using any such rewrite until we
have implemented PGP support and TNEF support in the core, then it
becomes much less of a concern.

However, my concern is that we would be unlikely to block on that, and
the pain is not experienced by the Thunderbird Core but instead by
extension developers.  It's understandable why the current situation is
so painful for extension developers trying to replace or extend core C++
functionality that was not really designed for extensions.  It would be
less understandable why a ground-up rewrite would leave them in almost
the same exact situation (and one which might still require some degree
of rewrite on their part.)


Andrew

Re: Raw MIME API proposal

Andrew Sutherland-3
On 07/31/2011 02:08 PM, Andrew Sutherland wrote:
> If the comparison were using an
> existing and known/proven already very working with active users (I
                                                 ^ C library
> believe Jonathan Kamens is interested in such a strategy) via jsctypes,
> that is indeed a different issue.

(left out a word)

Andrew

Re: Raw MIME API proposal

Jonathan Protzenko
In reply to this post by Joshua Cranmer-2
This might sound naïve, but why not use coroutines in JS to write an
async libmime parser? The load would still be on the main thread but at
least it would be asynchronous and frankly speaking, with all the effort
that's been put into JS performance lately (as Andrew pointed out),
performance might not be that much of a concern anymore. libmime was
great 20 years ago when we couldn't afford to hold a message all at once
in memory; I tend to think that the constraints have relaxed since then.
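To make the coroutine idea concrete, here is a minimal sketch using a generator as the coroutine. It only splits a flat message into header and body lines, it assumes CRLF line endings, and it does not attempt nested multipart parsing. The names `parseInSlices`, `runAsync` and `linesPerSlice` are made up for illustration and are not part of any existing API.

```javascript
// Hypothetical sketch: parse a message a few lines at a time, yielding
// control back to the caller after every `linesPerSlice` lines.
function* parseInSlices(text, linesPerSlice) {
  const result = { headers: [], bodyLines: [] };
  let inHeaders = true;
  let count = 0;
  for (const line of text.split("\r\n")) {
    if (inHeaders && line === "") {
      inHeaders = false; // the first blank line ends the header block
    } else if (inHeaders) {
      result.headers.push(line);
    } else {
      result.bodyLines.push(line);
    }
    count++;
    if (count % linesPerSlice === 0) {
      yield; // hand control back to whoever drives the generator
    }
  }
  return result;
}

// Drives the generator one slice per event-loop turn, so the main thread
// stays responsive between slices.
function runAsync(gen, onDone) {
  const step = gen.next();
  if (step.done) {
    onDone(step.value);
  } else {
    setTimeout(() => runAsync(gen, onDone), 0);
  }
}
```

A caller would write something like `runAsync(parseInSlices(messageText, 500), tree => display(tree))`, where `display` stands in for whatever consumes the result.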

jonathan

Re: Raw MIME API proposal

Jonathan Protzenko
Of course, what I have in mind for point 1. you mentioned on IRC
(message display) is Conversations, which could (on a first approach)
use a pure JS implementation of libmime without having to ensure that it
is accessible for C++. I'm talking about message display here, not
downloading attachments and stuff.

jonathan

On 07/31/2011 04:30 PM, Jonathan Protzenko wrote:

> This might sound naïve, but why not use coroutines in JS to write an
> async libmime parser? The load would still be on the main thread but
> at least it would be asynchronous and frankly speaking, with all the
> effort that's been put into JS performance lately (as Andrew pointed
> out), performance might not be that much of a concern anymore. libmime
> was great 20 years ago when we couldn't afford to hold a message all
> at once in memory; I tend to think that the constraints have relaxed
> since then.
>
> jonathan

Re: Raw MIME API proposal

Robert Kaiser
In reply to this post by Joshua Cranmer-2
Andrew Sutherland wrote:
> On 07/28/2011 05:57 PM, Joshua Cranmer wrote:
>> Finally, a C++
>> implementation gives us better leeway in dealing with multithreaded APIs.
>
> I presume the benefit in this case would be that if we parse on one
> thread and consume on another thread, then copying costs could be
> avoided through use of reference-counted global-heap-managed objects?

Could chrome workers help us there if we were completely in JS land?

Robert Kaiser


--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Re: Raw MIME API proposal

Joshua Cranmer-2
In reply to this post by Joshua Cranmer-2
On 07/31/2011 07:30 PM, Jonathan Protzenko wrote:
> This might sound naïve, but why not use coroutines in JS to write an
> async libmime parser? The load would still be on the main thread but
> at least it would be asynchronous and frankly speaking, with all the
> effort that's been put into JS performance lately (as Andrew pointed
> out), performance might not be that much of a concern anymore. libmime
> was great 20 years ago when we couldn't afford to hold a message all
> at once in memory; I tend to think that the constraints have relaxed
> since then.

JS doesn't allow practical coroutines if you're also playing around with
recursion (<https://bugzilla.mozilla.org/show_bug.cgi?id=666396> would
fix a lot of the issues).

I don't doubt that JS is fast enough to implement a MIME parser, but
that wasn't necessarily my concern. My concern with a JS implementation
is the following:

1. I want to allow access from C++. This implies an XPCOM JS component,
which is inaccessible from non-main threads, or techniques that have us
create our own JS runtimes, which I'm hesitant to do noting the severe
churn SpiderMonkey has in their APIs.
2. We need this to be accessible from JS workers (the current preferred
model for multithreaded JS, it appears). This implies that we can't
implement it as a JS component and must instead implement it as a script
that can work in either context (which means I lose a lot of XPCOM:
anything that isn't explicitly thread-safe, and that includes most of necko).
3. We need to deal with potentially binary data (i.e., EAI, inane bad
charsets, or 8-bit MIME). JavaScript lacks a lot of effective APIs in
this regard. Uint8Array or ctypes.char.array are about the closest, but
these have issues (in particular, I highly doubt that String-like APIs
work on them). We also need to be able to call Unicode decoders for RFC
2047 support (although that currently being an auxiliary C++ API means
it's not imperative right now).
4. We need a way to asynchronously deliver data to the parser. XHR is
out because it can't deliver on the "asynchronous" part (I see nothing
in XHR nor XHR2). Necko is out because it's inaccessible from JS chrome
workers (nsIOService fails to implement nsIClassInfo, let alone claim to
be threadsafe). That leaves either creating our own thunking
implementation or doing magic calls to JS, neither of which feel
particularly good to me.
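As an illustration of point 3, here is a minimal sketch of working on the raw bytes directly, with no String-like API involved: it locates the blank line that separates headers from body in a Uint8Array. The function name `findBodyStart` is hypothetical, and the sketch assumes CRLF line endings and a complete buffer.

```javascript
// Returns the index of the first body byte, or -1 if the CRLF CRLF
// separator has not been seen. Operates on bytes only, so 8-bit data
// and odd charsets pass through untouched.
function findBodyStart(bytes) {
  const CR = 13;
  const LF = 10;
  for (let i = 0; i + 3 < bytes.length; i++) {
    if (bytes[i] === CR && bytes[i + 1] === LF &&
        bytes[i + 2] === CR && bytes[i + 3] === LF) {
      return i + 4;
    }
  }
  return -1;
}

// Example: a header containing a raw 0xE9 byte, then a binary body.
const sample = Uint8Array.from([0x41, 0x3a, 0x20, 0xe9, 13, 10, 13, 10, 0xff, 0x00]);
const bodyStart = findBodyStart(sample);
// subarray() creates views onto the same buffer, so nothing is copied.
const headerBytes = sample.subarray(0, bodyStart);
const bodyBytes = sample.subarray(bodyStart);
```

Anything beyond byte comparisons (case-insensitive header matching, charset decoding) would still need helper code on top of this, which is the gap described above.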

asuth recently argued to me over IRC that it may be better to prototype
this in JS with a hacky layer for now and then migrate it to better APIs
as they are added. While I see the goal of that philosophy, I am
concerned that the platform community will not be forthcoming with the
necessary APIs and we will be forced to maintain a layer of hacky APIs.

In short, here is what it would take to convince me that a JS
implementation is the best way forward:
1. An assurance that an asynchronous, binary I/O API for JS that is
usable from multiple threads is coming.
2. An assurance that there will be a suitable API that allows JS
implementations to both be usable from multiple threads in JS and in C++.
3. Preliminary guidelines for these APIs to minimize churn for when we
remove any necessary preliminary hacky APIs.

Re: Raw MIME API proposal

Jean-Marc Desperrier-4
In reply to this post by Joshua Cranmer-2
Jonathan Protzenko wrote:
> with all the effort that's been put into JS performance lately (as
> Andrew pointed out), performance might not be that much of a concern
> anymore.
> libmime was great 20 years ago when we couldn't afford to hold a message
> all at once in memory;

In the context of newsgroups and mailing lists, there can be tens of
thousands of messages, so the performance of the raw part of decoding
still needs to be top-notch.

Re: Raw MIME API proposal

Ludovic Hirlimann-4
In reply to this post by Joshua Cranmer-2
On 29/07/11 02:57, Joshua Cranmer wrote:

> Open issues:
> * Part numbering -- Libmime numbers parts differently from IMAP. In
> particular, libmime requires a part number of `1' to get to the body of
> a message, so 1.4 in libmime is really IMAP part 4, and (if IMAP part 3
> were a message/rfc822) 1.3.1.2 would be IMAP part 3.2. Numbering
> non-MIME decapsulated parts would require a different separator, e.g.,
> 3-1.3-3 if we chose to paint the bikeshed `-'.
> * Non-MIME decapsulation -- It's not quite non-MIME, but we sometimes
> need to represent synthesized MIME trees not in the tree. The cases I
> know of:
> - multipart/encrypted (both S/MIME and PGP, although I think PGP doesn't
> actually use this Content-Type?)
> - message/external-body
> - uuencode and yenc
> - TNEF
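
The part-numbering difference quoted above can be sketched as a small conversion function. This is one reading of the two examples in the proposal (1.4 becomes 4, and 1.3.1.2 becomes 3.2 when part 3 is a message/rfc822), not a complete treatment of IMAP numbering. The function name and the `isEmbeddedMessage` callback are hypothetical.

```javascript
// Converts a libmime-style part number to an IMAP-style one.
// `isEmbeddedMessage(libmimePath)` is a hypothetical callback that says
// whether the part at that libmime path is a message/rfc822 part.
function libmimeToImapPart(libmimePart, isEmbeddedMessage) {
  const imap = [];
  const seen = [];
  // libmime needs a "1" to step into the body of a message, and the
  // top-level message is itself a message, so the first component is
  // such a marker.
  let expectBodyMarker = true;
  for (const component of libmimePart.split(".")) {
    seen.push(component);
    if (expectBodyMarker) {
      if (component !== "1") {
        throw new Error("expected body marker '1' at " + seen.join("."));
      }
      expectBodyMarker = false;
      continue; // the marker has no IMAP counterpart
    }
    imap.push(component);
    // After an embedded message, libmime again needs a "1" to reach its body.
    expectBodyMarker = isEmbeddedMessage(seen.join("."));
  }
  return imap.join(".");
}
```

With a callback that reports libmime part 1.3 as an embedded message, `libmimeToImapPart("1.3.1.2", p => p === "1.3")` gives "3.2". Decapsulated non-MIME parts with a different separator are not handled here.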

The partly good news is that we have plenty of test cases for the MIME
formats we don't support properly, so it would be quite easy to have a
large number of tests available and maybe use an Extreme Programming
approach to make sure that this new implementation wouldn't regress.

Ludo
--
Ludovic Hirlimann MozillaMessaging QA lead
https://wiki.mozilla.org/Thunderbird:Testing
http://www.spreadthunderbird.com/aff/79/2

Re: Raw MIME API proposal

Jonathan Protzenko
In reply to this post by Jean-Marc Desperrier-4
Well let's write it in assembly then! And let's leave it to the
programmers to deal with readability and maintainability, they should
know how to do that.

More seriously, we're not parsing thousands of messages *at once*. The
only thing I can think of that parses lots of messages is the indexing,
and it's asynchronous, in the background, and can be easily configured
to not chew all the CPU.

jonathan

On 08/02/2011 02:20 AM, Jean-Marc Desperrier wrote:

> Jonathan Protzenko wrote:
>> with all the effort that's been put into JS performance lately (as
>> Andrew pointed out), performance might not be that much of a concern
>> anymore.
>> libmime was great 20 years ago when we couldn't afford to hold a message
>> all at once in memory;
>
> In the context of newsgroups and mailing lists, there can be tens of
> thousands of messages, so the performance of the raw part of decoding
> still needs to be top-notch.

Re: Raw MIME API proposal

Jonathan Protzenko
In reply to this post by Joshua Cranmer-2


On 08/01/2011 11:39 PM, Joshua Cranmer wrote:

> On 07/31/2011 07:30 PM, Jonathan Protzenko wrote:
>> This might sound naïve, but why not use coroutines in JS to write an
>> async libmime parser? The load would still be on the main thread but
>> at least it would be asynchronous and frankly speaking, with all the
>> effort that's been put into JS performance lately (as Andrew pointed
>> out), performance might not be that much of a concern anymore.
>> libmime was great 20 years ago when we couldn't afford to hold a
>> message all at once in memory; I tend to think that the constraints
>> have relaxed since then.
>
> JS doesn't allow practical coroutines if you're also playing around
> with recursion (<https://bugzilla.mozilla.org/show_bug.cgi?id=666396>
> would fix a lot of the issues).
I was thinking more of the TCO bug being fixed
(https://bugzilla.mozilla.org/show_bug.cgi?id=445363), but well... (the
lack of TCO is what makes my coroutines grow the stack in
https://github.com/protz/thunderbird-stdlib/blob/master/tests/test_SimpleStorage.js).
>
> I don't doubt that JS is fast enough to implement a MIME parser, but
> that wasn't necessarily my concern. My concern with a JS
> implementation is the following:
>
> 1. I want to allow access from C++. This implies an XPCOM JS
> component, which is inaccessible from non-main threads, or techniques
> that have us create our own JS runtimes, which I'm hesitant to do
> noting the severe churn SpiderMonkey has in their APIs.
Why? I'm sorry but I'm not getting that part at all. Could you be very
clear in how and why you need this to be done? How is that not solved
with a component on the main thread written in JS that delegates to some
other thread, if you really insist on having this multithreaded?

Cheers,

jonathan

Re: Raw MIME API proposal

Jean-Marc Desperrier-4
In reply to this post by Jean-Marc Desperrier-4
Jonathan Protzenko wrote:
> Well let's write it in assembly then! And let's leave it to the
> programmers to deal with readability and maintainability, they should
> know how to do that.
>
> More seriously, we're not parsing thousands of messages *at once*.

What I mean is that if you start by saying we don't need to care about
performance, it drops a lot (far more than cleaner code actually
requires), and there are some use cases here where it is critical.

I'm convinced you can do almost all the processing in JS, and maybe even
*all* the processing in JS, and still have good performance for those
specific cases, but for that to happen you need to check and test these
critical cases from the start.

> The only thing I can think of that parses lots of messages is the indexing

mozilla.support.firefox contains around 52,000 messages. That's not much
of an anomaly for a newsgroup folder.
Switching the view to ordered by "From" takes around 6 seconds on my
Core2 Duo E7500 @3GHz, almost the same as going from threaded to
unthreaded, but unthreaded to threaded takes only about 2 seconds.

I know how deeply inefficient the code behind it is. In fact, this
threaded/unthreaded asymmetry cries out "hey, look at how inefficient I
am, I can even run the easier case 3 times slower", so I wouldn't be that
surprised if well-thought-out, smartly written pure-JS code were to
end up being faster. But I'm also convinced that code written with the
assumption that nothing there needs to be optimized or algorithmically
well written will be slower.

So I don't believe for one second that you need to go to assembly
language; you just need to check the algorithmic complexity and make
sure your inner loops are very efficient (and with type inference, etc.,
that could still be full JS).

Re: Raw MIME API proposal

Joshua Cranmer-2
On 8/2/2011 9:45 AM, Jean-Marc Desperrier wrote:

> mozilla.support.firefox contains around 52,000 messages. That's not much
> of an anomaly for a newsgroup folder.
> Switching the view to ordered by "From" takes around 6 seconds on my
> Core2 Duo E7500 @3GHz, almost the same as going from threaded to
> unthreaded, but unthreaded to threaded takes only about 2 seconds.
>
> I know how deeply inefficient the code behind it is. In fact, this
> threaded/unthreaded asymmetry cries out "hey, look at how inefficient I
> am, I can even run the easier case 3 times slower", so I wouldn't be
> that surprised if well-thought-out, smartly written pure-JS code
> were to end up being faster. But I'm also convinced that code written
> with the assumption that nothing there needs to be optimized or
> algorithmically well written will be slower.

That's not about MIME parsing, that's about our database efficiencies or
lack thereof. And possibly nsITreeView overhead.

Re: Raw MIME API proposal

Andrew Sutherland-3
In reply to this post by Joshua Cranmer-2
On 08/01/2011 11:39 PM, Joshua Cranmer wrote:
> In short, here is what it would take to convince me that a JS
> implementation is the best way forward:

It sounds very much like you are very determined to go the C++ route.
As long as you are willing to do the work and see it through to
completion, that's fantastic and I will be happy to see the backside of
libmime.

If anyone else is looking to take this on, I suggest using JS and I
would be happy to provide pointers on what would be required to expose
the JS implementation to C++ without using XPCOM/XPConnect.

Andrew

Re: Raw MIME API proposal

Jean-Marc Desperrier-4
In reply to this post by Joshua Cranmer-2
Andrew Sutherland wrote:
> node.js also has a pretty good example of a byte-buffer abstraction that
> can do string encoding tricks.

In the MIME case, I think the best option is indeed to keep the data as
a raw byte buffer, and convert it to what will be displayed on the fly.

And if it's included in a new message, copy the raw byte version, not
the interpreted one. You might need to keep a cache of the sanitized
version when your comparisons are based on it, and not just the display.

It would be a great step forward if two Thunderbird clients responding
to each other were *guaranteed* to keep the content of the subject the
same (when the user did not manually edit it), whatever their charset
decoding is set to, or that if the subject is incorrect it doesn't get
more and more broken each time.
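
A minimal sketch of that idea: keep the header value as raw bytes, decode only for display, and hand out the untouched bytes when the value goes into a reply. The class name `RawHeaderValue` and its methods are hypothetical, only two charsets are handled, and RFC 2047 encoded words are ignored entirely.

```javascript
// Hypothetical container for one header value, stored as raw bytes.
class RawHeaderValue {
  constructor(bytes) {
    this.raw = Uint8Array.from(bytes); // private copy of the wire bytes
  }

  // Decodes on the fly for display. The stored bytes are never modified,
  // so a wrong charset guess only affects what is shown.
  display(charset) {
    if (charset === "iso-8859-1") {
      // ISO-8859-1 maps each byte to the code point of the same value.
      return Array.from(this.raw, b => String.fromCharCode(b)).join("");
    }
    return new TextDecoder("utf-8").decode(this.raw);
  }

  // Bytes to put into a reply: a copy of exactly what was received.
  forReply() {
    return this.raw.slice();
  }
}

// "Café" encoded as UTF-8.
const subject = new RawHeaderValue([0x43, 0x61, 0x66, 0xc3, 0xa9]);
const shownRight = subject.display("utf-8");
const shownWrong = subject.display("iso-8859-1"); // mojibake on screen only
const replyBytes = subject.forReply();            // still the original bytes
```

Whatever the display charset is set to, `forReply()` returns the same bytes, so a misconfigured client shows the subject wrongly without damaging it further.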

> While MIME parsing involves more encodings than node knows about,
> it's not a hard set of tricks to teach it.

You make me slightly worried that you underestimate how hard it is to
correctly handle the edge cases of MIME encoding (decoding is a little
bit easier, but not much).

99% of the libraries around get it wrong.
I remember some specific cases where Mozilla's MIME code had it wrong
for a very long time, resulting, for example, in the incorrect insertion
of tab characters inside the subject.

I didn't look at every detail, but there's a related bug that's still
open and active (created ten years ago, in 2001):
"inconsistent display of TAB characters in subjects & thread pane"
https://bugzilla.mozilla.org/show_bug.cgi?id=64948

Re: Raw MIME API proposal

Joshua Cranmer-2
In reply to this post by Andrew Sutherland-3
On 8/2/2011 10:06 AM, Andrew Sutherland wrote:
> On 08/01/2011 11:39 PM, Joshua Cranmer wrote:
>> In short, what it would take to convince that a JS implementation is the
>> best way forward:
>
> It sounds very much like you are very determined to go the C++ route.
> As long as you are willing to do the work and see it through to
> completion, that's fantastic and I will be happy to see the backside of
> libmime.

Not so much "determined" as I am "resigned". I've also had, through a
few of my other projects, experience with keeping something in sync with
SpiderMonkey code; that experience has left a bad taste in my mouth, as
it seems practically every new release modifies the APIs needed to run a
script. The other concern I have is maintaining the globals that people
expect.

After thinking about this some more, I realized what you probably meant
earlier about driving the JS runtime ourselves, after independently
coming up with something similar. Some poking around xpconnect leaves me
assured that the globals issue can be solved
(nsIXPConnect::InitClassesWithNewWrappedGlobal), but I'd still rather
see a method which says "give me a global object and context for this
script".

Re: Raw MIME API proposal

Andrew Sutherland-3
On 08/02/2011 10:53 AM, Joshua Cranmer wrote:
> Not so much "determined" as I am "resigned". I've also had, through a
> few of my other projects, experience with keeping something in sync with
> SpiderMonkey code; that experience has left a bad taste in my mouth, as
> it seems practically every new release modifies the APIs needed to run a
> script. The other concern I have is maintaining the globals that people
> expect.

Even if they do change the API to run a script every release, we are
talking about two functions with simple signatures, and honestly, these
look pretty familiar to me from many revisions ago:

    JSObject *JS_CompileFile(JSContext *cx, JSObject *obj,
                             const char *filename);
    JSBool JS_ExecuteScript(JSContext *cx, JSObject *obj,
                            JSObject *scriptObj, jsval *rval);

https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_CompileFile
https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_ExecuteScript
https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#Compiled_scripts


If you are referring to the bit-rot JSHydra experienced, I don't think
it's surprising that the parser internals would change.  (And now there
is an explicit AST intended for consumption! :)


> After thinking about this some more, I realized what you probably meant
> earlier about driving the JS runtime ourselves, after independently
> coming up with something similar. Some poking around xpconnect leaves me
> assured that the globals issue can be solved
> (nsIXPConnect::InitClassesWithNewWrappedGlobal), but I'd still rather
> see a method which says "give me a global object and context for this
> script".

How about:
https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_NewContext
followed by:
https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_NewCompartmentAndGlobalObject

Andrew

Re: Raw MIME API proposal

David Bienvenu
In reply to this post by Andrew Sutherland-3
On 8/2/2011 10:06 AM, Andrew Sutherland wrote:
>
> If anyone else is looking to take this on, I suggest using JS and I would be happy to provide pointers on what would be required to expose the JS implementation to C++
> without using XPCOM/XPConnect.
Roughly what does that entail?

- David

Re: Raw MIME API proposal

Andrew Sutherland-3
On 08/02/2011 01:27 PM, David Bienvenu wrote:
> On 8/2/2011 10:06 AM, Andrew Sutherland wrote:
>> If anyone else is looking to take this on, I suggest using JS and I
>> would be happy to provide pointers on what would be required to expose
>> the JS implementation to C++ without using XPCOM/XPConnect.
> Roughly what does that entail?

The control flow could vary a lot depending on the needs of the C++ code.
In general, the three big things that would need to happen are:
1) Spin up a JS runtime on the thread and load the JS mime parsing code
into it.
2) Cause I/O to be fed to the JS mime parsing code, producing a JS
object representation as a byproduct.
3) Traverse/consume the resulting JS object hierarchy.

For #1, see this for a good idea of what the boilerplate looks like.
https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#A_minimal_example
It may also be possible to piggyback on the web/chrome workers
mechanism, depending.


For #2, there are a variety of ways this could happen, such as:
- Expose the I/O mechanisms that web/chrome workers can use into the JS
runtime, tell the JS code a mailnews URI, and let it use those I/O
mechanisms to get the data.  For example, an initial hack could just use
XHR1.

- Expose a custom C++ object to JS to provide I/O services, it would
look something like this:
https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#Defining_objects_and_properties

- Use a C++ I/O mechanism to get the data to our thread, then feed it
into the JS code either by direct invocation or by posting an event.
Direct calls look like so and posted events would also look similar,
it's just a question of how much complexity is on the stack when you
call into JS space and how much of a chance you give the trace JIT to
get into a groove by providing it with lots of data and longer loops
that avoid returning into C++ space too much:
https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#Calling_functions
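
On the JS side, the "direct invocation" option could look roughly like the sketch below: the embedding code calls `feed()` with each chunk of bytes as it arrives, and the parser is handed complete lines even when a CRLF is split across chunks. `LineFeeder`, `feed` and `finish` are hypothetical names, not an existing interface.

```javascript
// Hypothetical push-style entry point for a JS MIME parser.
class LineFeeder {
  constructor(onLine) {
    this.onLine = onLine;              // called with a Uint8Array per line
    this.pending = new Uint8Array(0);  // bytes after the last complete line
  }

  // Called by the I/O layer with each chunk (a Uint8Array).
  feed(chunk) {
    // Join leftover bytes with the new chunk so lines can span chunks.
    const data = new Uint8Array(this.pending.length + chunk.length);
    data.set(this.pending, 0);
    data.set(chunk, this.pending.length);

    let start = 0;
    for (let i = 0; i + 1 < data.length; i++) {
      if (data[i] === 13 && data[i + 1] === 10) {
        this.onLine(data.subarray(start, i)); // line without its CRLF
        start = i + 2;
        i++; // skip the LF
      }
    }
    this.pending = data.slice(start);
  }

  // Called once at end of input to flush an unterminated last line.
  finish() {
    if (this.pending.length > 0) {
      this.onLine(this.pending);
    }
    this.pending = new Uint8Array(0);
  }
}
```

Feeding larger chunks means fewer crossings between C++ and JS and longer loops inside `feed()`, which is the trade-off described above.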


For #3, you basically translate how you would traverse the object from
JS into C++:
https://developer.mozilla.org/En/SpiderMonkey/JSAPI_Phrasebook#Object_properties


It's worth noting that there's a very limited set of things that the
current C++ back-end needs from the parser *if we assume that most of
the display bits will be done in JS*.  Specifically, at the bottom-most
layer, I suspect the mbox parsing logic only needs a very simple
understanding of message parsing.  Then when we go up a layer to the
things nsIMsgDBHdrs need to know about, they have a constrained set of
data that could easily be postMessaged across from a worker thread and
then set on an nsIMsgDBHdr via traditional XPCOM.  Of course, you
obviously know much more about the needs of the existing C++ code than I!
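
To illustrate the "constrained set of data" idea, here is a sketch of a worker-side function that reduces parsed headers to a plain object of strings, which is the kind of value that can be posted across threads. The input shape (a Map keyed by lower-cased header name) and the function name are assumptions made for this example, and the field list is a guess at what a header row needs, not a survey of the existing C++ code.

```javascript
// Hypothetical: reduce parsed headers to a plain, copyable summary.
// `headers` is assumed to be a Map from lower-cased header names to
// already-decoded string values.
function summarizeForHdr(headers) {
  const get = name => {
    const value = headers.get(name.toLowerCase());
    return value === undefined ? "" : value;
  };
  return {
    subject: get("Subject"),
    author: get("From"),
    recipients: get("To"),
    // Strip the surrounding angle brackets from the Message-ID.
    messageId: get("Message-ID").replace(/^<|>$/g, ""),
    // Left as the raw header string; conversion is up to the receiver.
    date: get("Date"),
  };
}
```

In a worker this would end with something like `postMessage(summarizeForHdr(headers))`, and the main-thread listener would copy the fields onto the header object through XPCOM.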

Andrew

Re: Raw MIME API proposal

Jonathan Protzenko
If that's the route we decide to go, maybe
https://bugzilla.mozilla.org/show_bug.cgi?id=649537 would be a source
of inspiration (esp. given the "Lose XPConnect part").

jonathan

On Tue 02 Aug 2011 06:14:34 PM PDT, Andrew Sutherland wrote:

> On 08/02/2011 01:27 PM, David Bienvenu wrote:
>> On 8/2/2011 10:06 AM, Andrew Sutherland wrote:
>>> If anyone else is looking to take this on, I suggest using JS and I
>>> would be happy to provide pointers on what would be required to expose
>>> the JS implementation to C++ without using XPCOM/XPConnect.
>> Roughly what does that entail?
>
> The control-flow could vary a lot depending on the need of the C++ code.
> In general, the three big things that would need to happen are:
> 1) Spin up a JS runtime on the thread and load the JS mime parsing code
> into it.
> 2) Cause I/O to be fed to the JS mime parsing code, producing a JS
> object representation as a byproduct.
> 3) Traverse/consume the resulting JS object hierarchy.
>
> For #1, see this for a good idea of what the boilerplate looks like.
> https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#A_minimal_example 
>
> It may also be possible to piggyback on the web/chrome workers
> mechanism, depending.
>
>
> For #2, there are a variety of ways this could happen, such as:
> - Expose the I/O mechanisms that web/chrome workers can use into the JS
> runtime, tell the JS code a mailnews URI, and let it use those I/O
> mechanisms to get the data. For example, an initial hack could just use
> XHR1.
>
> - Expose a custom C++ object to JS to provide I/O services, it would
> look something like this:
> https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#Defining_objects_and_properties 
>
>
> - Use a C++ I/O mechanism to get the data to our thread, then feed it
> into the JS code either by direct invocation or by posting an event.
> Direct calls look like so and posted events would also look similar,
> it's just a question of how much complexity is on the stack when you
> call into JS space and how much of a chance you give the trace JIT to
> get into a groove by providing it with lots of data and longer loops
> that avoid returning into C++ space too much:
> https://developer.mozilla.org/En/SpiderMonkey/JSAPI_User_Guide#Calling_functions 
>
>
>
> For #3, you basically translate how you would traverse the object from
> JS into C++:
> https://developer.mozilla.org/En/SpiderMonkey/JSAPI_Phrasebook#Object_properties 
>
>
>
> It's worth noting that there's a very limited set of things that the
> current C++ back-end needs from the parser *if we assume that most of
> the display bits will be done in JS*. Specifically, at the bottom-most
> layer, I suspect the mbox parsing logic only needs a very simple
> understanding of message parsing. Then when we go up a layer to the
> things nsIMsgDBHdrs need to know about, they have a constrained set of
> data that could easily be postMessaged across from a worker thread and
> then set on an nsIMsgDBHdr via traditional XPCOM. Of course, you
> obviously know much more about the needs of the existing C++ code than I!
>
> Andrew