Choosing a programming language

Joshua Cranmer 🐧

This is something I've been thinking about for a while, and I want to
solicit feedback and hopefully consensus from our wonderful programming
community at large on the matter. I want to consider what programming
language(s) we should be moving towards in new code in TB. The key here
is "new code"; it is infeasible, and probably ill-advised, for us to try
a massive rush to rewrite all of our code into whatever target language
we want just for the sake of rewriting it, but we should think carefully
about whether it is worth migrating code to a better language as we
improve or modernize it.

Within the Mozilla build system, we essentially have a choice of
adopting 4 languages. I'll summarize their advantages and disadvantages
here:

JavaScript
*Advantages*

  * Dynamically typed
  * We can very aggressively adopt new language features, due to our
    reliance on tip-of-trunk SpiderMonkey
  * No need to recompile after editing source files
  * Good tooling for debugging, testing, benchmarking--if our code can
    run in the environments that those tools want

*Disadvantages*

  * Dynamically typed
  * Workers have a cumbersome model, which makes it hard to move to
    multithreaded code
  * System APIs (e.g., filesystem, networking) are generally unavailable
    unless there's an XPIDL or DOM interface exposing them
  * A lot of third-party packages tend to assume a Node.js
    environment, which requires shims to use them outside of Node.js
  * A lot of dumb programmer errors (e.g., fat-fingered a name) can only
    be caught at runtime
  * Support for binary parsing is kind of sucky, and support for
    quasi-binary is really poor
  * Performance can be dicey or unpredictable

XPIDL-based C++
*Advantages*

  * Our code is already written in this format
  * XPIDL is the most flexible FFI system we have at the moment. The
    only missing path is calling XPIDL from JS workers

*Disadvantages*

  * The API style is outdated, and it requires lots of macros and other
    magic incantations to get stuff done
  * Mozilla's long-term commitment to XPIDL is questionable (but XPIDL
    is basically a way of enforcing a common ABI--with the exception of
    xpconnect, it's pretty trivial to maintain ourselves)
  * Mozilla considers XPIDL to be, at the very least, soft-deprecated
  * Promise-style async API has pretty much no support whatsoever here

Modern C++
*Advantages*

  * Modern C++ is actually quite an ergonomic language, at least if
    you're not attempting to wrap everything into a std::move and
    &&-based logic
  * Mozilla's done a fairly decent job at providing a useful library of
    standard ADT and some system API bits in MFBT and de-COM'd xpcom code
  * Compared to XPIDL, being able to chain method calls or not have to
    assume that every function can potentially fail is a big win

*Disadvantages*

  * Using lambdas for callbacks can easily create use-after-free bugs
  * Exposing this to anything else is generally difficult unless there's
    already a bindings framework to use.
  * Mozilla generally prohibits the use of the STL
  * There's a lag between new features being added to the standard and
    our ability to pick them up
  * No consistent API for handling error propagation, either in
    non-Mozilla projects or in Mozilla code

Rust
*Advantages*

  * Rust's handling of strings-versus-binary-versus-ASCII-but-maybe-not
    is the best of the set of possible languages
  * The error handling and propagation is pretty sane, safe, and
    ergonomic. Least likely to have errors get dropped on the floor with
    no one noticing that an error ever happened
  * The borrow checker allows for enforcing quite a few invariants in
    the type system
  * Rust can easily compile to WebAssembly, which allows an extra way to
    call from JS code for computation-heavy code
  * Cargo is probably the friendliest package system I've tried dealing with

*Disadvantages*

  * Because Rust is a newer language, we're less likely to see
    knowledgeable contributors
  * Assuaging the borrow checker can be challenging for novices
  * Rust<->JS calls are particularly challenging
  * All of the vendoring of crates happens in mozilla-central--it could
    be a challenge if we start using libraries that m-c doesn't use
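
As a rough illustration of the first two Rust advantages above (bytes
versus text, and error propagation), here is a minimal sketch of splitting
a raw header line. The names HeaderError, split_header and find_header are
made up for this example and are not taken from any existing TB or m-c
code; real header parsing (folding, encoded words, and so on) is much more
involved.

```rust
use std::fmt;
use std::str;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum HeaderError {
    MissingColon,
    NonUtf8Name,
}

impl fmt::Display for HeaderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HeaderError::MissingColon => write!(f, "header line has no ':'"),
            HeaderError::NonUtf8Name => write!(f, "header name is not valid UTF-8"),
        }
    }
}

impl std::error::Error for HeaderError {}

/// Skips leading spaces and tabs without assuming the bytes are text.
fn skip_leading_ws(bytes: &[u8]) -> &[u8] {
    let start = bytes
        .iter()
        .position(|&b| b != b' ' && b != b'\t')
        .unwrap_or(bytes.len());
    &bytes[start..]
}

/// Splits "Name: value" into a text name and a raw byte value.
fn split_header(line: &[u8]) -> Result<(&str, &[u8]), HeaderError> {
    let colon = line
        .iter()
        .position(|&b| b == b':')
        .ok_or(HeaderError::MissingColon)?;
    // &str is guaranteed to be UTF-8, so this conversion is explicit and can fail.
    let name = str::from_utf8(&line[..colon]).map_err(|_| HeaderError::NonUtf8Name)?;
    Ok((name.trim(), skip_leading_ws(&line[colon + 1..])))
}

/// Returns the raw value of the first header named `wanted`, if any.
/// Each `?` hands a parse failure to the caller, who receives a Result
/// and has to deal with it explicitly.
fn find_header<'a>(lines: &[&'a [u8]], wanted: &str) -> Result<Option<&'a [u8]>, HeaderError> {
    for &line in lines {
        let (name, value) = split_header(line)?;
        if name.eq_ignore_ascii_case(wanted) {
            return Ok(Some(value));
        }
    }
    Ok(None)
}

fn main() {
    let lines: [&[u8]; 2] = [b"From: someone@example.invalid", b"Subject: caf\xe9"];
    match find_header(&lines, "subject") {
        // The value is Latin-1, not UTF-8, so decoding is a separate, visible step.
        Ok(Some(raw)) => println!("subject: {}", String::from_utf8_lossy(raw)),
        Ok(None) => println!("no subject"),
        Err(e) => println!("bad header: {}", e),
    }
}
```

The point of the sketch is only that the header value stays a byte slice
until someone decides how to decode it, and that a malformed line surfaces
as an Err value rather than being dropped on the floor.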


I don't think there is a great benefit for enforcing that we have to
pick one language to implement the entirety of TB in. The reality is
that we don't have the bandwidth to rewrite everything. Even if we could
magically wave that away, the reality of system integration is that
systems require us to have support for native languages--which include
C++, Objective-C, even Java for Android--to implement necessary
features. Furthermore, I don't see any realistic way of cutting loose our
dependency on the Mozilla stack, and I think that people are too
optimistic when looking at the challenges of shimming all of the system
APIs if we were to try to support multiple stacks. At the end of the
day, I don't see multiple languages as the biggest barrier, or even one
of the biggest barriers, to development. From that perspective, then, it
makes sense to break down our components into smaller pieces to figure
out which languages ought to be used. Here are the components as I see them:


        UI and frontend

This will be implemented in JS. There are a few pieces right now which
are not (nsMsgDBTreeView anybody?), but I expect that even these would
likely eventually move to JS. And I doubt there's any room for
discussion on the matter.


        Protocols, including formats such as MIME, TNEF or PGP

As I've discussed in my last thread, this is the sort of stuff where JS
suffers the most. It's also where Rust tends to shine the brightest. Of
course, there are complications. For complex things, particularly IMAP,
you generally want a procedural structure for the code; this suggests
off-main-thread synchronous I/O or an async/await implementation. C++
and Rust are both /getting/ coroutine support in some form, but neither
of them has it yet: Rust could probably see async/await stabilized by
the end of this year, and C++20 officially merged coroutine support only
a month ago (so wait at least three or four years for minimum-supported
compilers to get it).
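
To make the "off-main-thread synchronous I/O" option a bit more concrete,
here is a minimal sketch using only Rust's standard library. The server
side is faked with an in-memory buffer, and Event, run_session and
collect_events are hypothetical names for this example; a real IMAP
implementation would be far more involved.

```rust
use std::io::{self, BufRead, Cursor};
use std::sync::mpsc::{self, Sender};
use std::thread;

/// What the worker reports back to the thread that started it.
#[derive(Debug, PartialEq)]
enum Event {
    Greeting(String),
    Line(String),
    Closed,
}

/// Procedural protocol code: each read blocks, but only the worker thread waits.
fn run_session<R: BufRead>(mut conn: R, events: Sender<Event>) -> io::Result<()> {
    let mut line = String::new();

    // Step 1: block until the server greeting arrives.
    if conn.read_line(&mut line)? > 0 {
        let _ = events.send(Event::Greeting(line.trim_end().to_string()));

        // Step 2: keep reading responses until the connection closes.
        loop {
            line.clear();
            if conn.read_line(&mut line)? == 0 {
                break;
            }
            let _ = events.send(Event::Line(line.trim_end().to_string()));
        }
    }
    let _ = events.send(Event::Closed);
    Ok(())
}

/// Runs the session on a worker thread and gathers everything it reports.
fn collect_events(server_output: &'static [u8]) -> Vec<Event> {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || run_session(Cursor::new(server_output), tx));
    // The iterator ends once the worker has dropped its sender.
    let events: Vec<Event> = rx.iter().collect();
    worker.join().expect("worker panicked").expect("I/O error");
    events
}

fn main() {
    for event in collect_events(b"* OK ready\r\n* 3 EXISTS\r\n") {
        println!("{:?}", event);
    }
}
```

In a real client the receiving side would be polled from the main event
loop rather than drained in one go; the sketch only shows that the
protocol logic itself can stay straight-line code.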

My personal opinion here is that we should stick with the status quo for
now, but we should explore the viability of Rust implementations. If
Rust doesn't pan out, then I would suggest looking for a modern C++
implementation instead. I don't think we should choose to implement this
sort of stuff in JS.


        Database (which includes both the msgdb and the mailbox store)

It would be daft of us to try to implement a database (as in something
like Mork or SQLite) ourselves (and yes, I'm aware of the irony that we
actually do exactly this right now). Building high-performance, durable
databases is a specialized skillset that is rather orthogonal to the
main challenges of building an email client, and I don't think we have
any community members with adequate expertise in that skillset. It's
much easier just to reuse an off-the-shelf implementation. Of all the
components I discuss, this is the one that most needs invasive
modifications, and the necessary modifications are going to have to
amount to a complete rewrite before long: the present API forces us to
do lots of stuff as synchronous, on-main-thread disk access, which is
a recipe for performance issues and one whose results have already been
observed.

The database API obviously is going to be most heavily used by the UI.
The connections to the protocol implementations--particularly if we
rewrite them to separate the protocol from the consumer logic--are
much weaker and can be encapsulated in a few smaller details (basically,
there's a method to convert a MIME message to a header object, some
potentially-batched add operations, flag toggles, message deletion by
key, and a list-keys operation, and I think that's it, although I
haven't audited IMAP code in great detail). I don't have a strong
opinion here; it depends on what database implementation best suits our
needs.
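
The narrow protocol-facing surface listed above could be sketched roughly
as follows. Everything here (MessageStore, MemoryStore, Header,
header_from_raw and the method names) is hypothetical and only mirrors the
operations named in the paragraph; it is not the existing msgdb API, and
the in-memory map merely stands in for whatever database is chosen.

```rust
use std::collections::BTreeMap;

type MessageKey = u32;

/// A deliberately tiny "header object".
#[derive(Debug, Clone, PartialEq)]
struct Header {
    subject: String,
    read: bool,
}

/// Toy conversion from a raw message to a header object.
/// Assumes the text is already decoded and matches "Subject:" case-sensitively.
fn header_from_raw(raw: &str) -> Header {
    let subject = raw
        .lines()
        .take_while(|line| !line.is_empty()) // the header block ends at the first blank line
        .find_map(|line| line.strip_prefix("Subject:"))
        .map(|value| value.trim().to_string())
        .unwrap_or_default();
    Header { subject, read: false }
}

/// The operations a protocol implementation would need from the store.
trait MessageStore {
    fn add_batch(&mut self, headers: Vec<Header>) -> Vec<MessageKey>;
    fn set_read(&mut self, key: MessageKey, read: bool) -> bool;
    fn delete(&mut self, key: MessageKey) -> bool;
    fn list_keys(&self) -> Vec<MessageKey>;
}

/// In-memory stand-in for a real database backend.
#[derive(Default)]
struct MemoryStore {
    next_key: MessageKey,
    messages: BTreeMap<MessageKey, Header>,
}

impl MessageStore for MemoryStore {
    fn add_batch(&mut self, headers: Vec<Header>) -> Vec<MessageKey> {
        let mut keys = Vec::with_capacity(headers.len());
        for header in headers {
            let key = self.next_key;
            self.next_key += 1;
            self.messages.insert(key, header);
            keys.push(key);
        }
        keys
    }

    fn set_read(&mut self, key: MessageKey, read: bool) -> bool {
        match self.messages.get_mut(&key) {
            Some(header) => {
                header.read = read;
                true
            }
            None => false,
        }
    }

    fn delete(&mut self, key: MessageKey) -> bool {
        self.messages.remove(&key).is_some()
    }

    fn list_keys(&self) -> Vec<MessageKey> {
        self.messages.keys().copied().collect()
    }
}

fn main() {
    let mut store = MemoryStore::default();
    let keys = store.add_batch(vec![
        header_from_raw("Subject: one\r\n\r\nbody"),
        header_from_raw("Subject: two\r\n\r\nbody"),
    ]);
    store.set_read(keys[0], true);
    store.delete(keys[1]);
    println!("remaining keys: {:?}", store.list_keys());
}
```

A surface this small is what would let the protocol code stay indifferent
to which database implementation sits behind it.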


        The smorgasbord of everything else

What I called out above is the code that is likely to be performance
sensitive. Everything else that falls into this category of "other" is
likely not to be (search is the possible exception, but if the database
exposes a search API, most of the critical performance pieces are going
to be in the database bits rather than the wrapping search code). The
code that deals with allowing for multiple backends--that's for both
mail protocols and address books--is going to face the reality that some
implementations will want to be JS and some will want to be native (see
my earlier note that system integration is going to invariably involve
native code somewhere). But the orchestration code itself could
reasonably be in any language, so long as the cross-language bindings
are possible.

Thoughts/questions/comments/concerns?

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist

_______________________________________________
dev-apps-thunderbird mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-apps-thunderbird

Re: Choosing a programming language

ISHIKAWA, Chiaki
Thank you for raising these points, Joshua.

Maybe the next Council will take notice and may want to produce a
whitepaper/guideline or something.

On 2019/03/19 11:54, Joshua Cranmer 🐧 wrote:

> [...]
I have a couple of questions and a wish as an occasional patch contributor.

Q-1: What is the difference between XPIDL-based C++ and modern C++?
I am not sure what you are describing here.

I thought following XPIDL was necessary in order to publish an API that
is used by JavaScript, etc., i.e., outside the C++ domain in the Mozilla
framework. Writing unconstrained code in C++ is a sure way to fail to
provide APIs that can be used in other languages, I think.

I am not saying that writing an XPIDL-based API is a sure way to provide
APIs that can be used successfully in other languages.
We can certainly still write APIs that are difficult to use and
error-prone (say, with memory allocation/release issues, etc.).
But at least the use of an XPIDL interface heightens the awareness that we
are writing APIs to be used in other languages. And we can focus our
debugging efforts on a few selected published APIs instead of myriads of
exported symbols.


Q-2: What do you see as the definitive specification for the Rust language?
I hate to invest in a language until a clear-cut formal specification of
its syntax and semantics is given in an easily available document.
(I am not sure if the language has been formally defined before: I read
somewhere that a formal definition, or an attempt to define the
language semantics written in a dynamically executable function language,
finally appeared last year. The article referred to the following PDF at
arXiv: https://arxiv.org/pdf/1804.10806.pdf
"KRust: A Formal Executable Semantics of Rust", arXiv:1804.10806v1
[cs.PL], 28 Apr 2018.)

I think Haskell and a few other languages with an established history are
wonderful, although I am not advocating writing a mail client in Haskell.
(Come to think of it, though, it may be a good idea to build a
long-lasting program with low maintenance overhead in Haskell or in a
few other such languages IFF we can provide external libraries that can
be proven "correct" in the framework of Haskell and friends.)

W-1: One point I would like to see in the language runtime support or
environment is as follows.

The difficulty of hooking into a good debugging framework during bulk
test suite execution (I am talking about the |make mozmill| test suite)
is a big deficiency under the Linux environment, IMHO. I mean that when I
see dubious behavior (warnings/errors/assertions), it is not entirely
clear how to debug it easily. Well, after some trial and error, we can
begin checking what is going on. But under Linux, I often have to
insert some probes (dumping variables), recompile the source, and off
I go with gdb, etc., if I am lucky enough to figure out the sequence of
operations that leads to those errors/warnings/assertions.

Maybe I am not using the toolchain to its best effect, but then I
can't find a good reference or web page about that.

If I could set a breakpoint on a given named function and then run TB
inside |make mozmill| under that setting, so that I get a gdb prompt at
the console from which I invoked |make mozmill| at the time the
execution reaches the selected function within the executed TB binary,
that might be a starting point.
There may be a hook built in, but I am not sure if this works today, or
whether it has ever been documented well.

TIA

Chiaki

PS: BTW, I hate JS due to its dynamically typed system. The recent
addition of ESLint and other static analysis checks is wonderful for
lowering the long-term maintenance burden.
Yes, I like small DSLs, but we have to recognize their limitations in
terms of long-term maintenance and should resist writing a large body of
code that is to be maintained for a long time. Famous last words. I have
done similar things, but they were NOT released to the wider public, and
it is only me who suffers.





Re: Choosing a programming language

Mark Rousell-3
On 19/03/2019 08:59, ISHIKAWA,chiaki wrote:
> PS: BTW, I hate JS due to its dynamically typed system.

Have you looked at Typescript? It seems to be designed especially to
address issues like this.


--
Mark Rousell
 
 
 


Re: Choosing a programming language

ISHIKAWA, Chiaki
On 2019/03/20 1:21, Mark Rousell wrote:
> On 19/03/2019 08:59, ISHIKAWA,chiaki wrote:
>> PS: BTW, I hate JS due to its dynamically typed system.
> Have you looked at Typescript? It seems to be designed especially to
> address issues like this.
>
I know it exists. But given that Mozilla uses a built-in JS interpreter,
the chance of using TypeScript is nil unless the built-in JS interpreter
is modified to support TypeScript, isn't it?
(If that happens, it will be very nice for maintenance reasons alone.)

Joshua probably didn't raise the prospect of better JS-like scripting
languages precisely for this reason.
But I may be wrong.

Chiaki



Re: Choosing a programming language

Mark Rousell-3
On 19/03/2019 18:59, ISHIKAWA,chiaki wrote:

> On 2019/03/20 1:21, Mark Rousell wrote:
>> On 19/03/2019 08:59, ISHIKAWA,chiaki wrote:
>>> PS: BTW, I hate JS due to its dynamically typed system.
>> Have you looked at Typescript? It seems to be designed especially to
>> address issues like this.
>>
> I know it exists. But given that mozilla uses built-in JS interpreter,
> the chance of using Typescript is nil unless the built-in JS
> interpreter is modified to support Typescript, isn't it?
> (If it happens, it will be very nice for maintenance reason alone.)
>
> Joshua probably didn't raise the prospect of better JS-like script
> languages precisely because of this reason.
> But I may be wrong.

Typescript compiles to standard Javascript[1] so there's no need for
Typescript-specific infrastructure within Mozilla/Thunderbird.

How practical it would actually be to write Thunderbird UI in Typescript
and compile to Javascript I don't know but, in theory, it should
certainly work.


Footnote:-
1: From https://www.typescriptlang.org/index.html: "TypeScript compiles
to clean, simple JavaScript code which runs on any browser, in Node.js,
or in any JavaScript engine that supports ECMAScript 3 (or newer)."

--
Mark Rousell
 
 
 


Re: Choosing a programming language

ISHIKAWA, Chiaki
On 2019/03/20 4:50, Mark Rousell wrote:

> On 19/03/2019 18:59, ISHIKAWA,chiaki wrote:
>> On 2019/03/20 1:21, Mark Rousell wrote:
>>> On 19/03/2019 08:59, ISHIKAWA,chiaki wrote:
>>>> PS: BTW, I hate JS due to its dynamically typed system.
>>> Have you looked at Typescript? It seems to be designed especially to
>>> address issues like this.
>>>
>> I know it exists. But given that mozilla uses built-in JS interpreter,
>> the chance of using Typescript is nil unless the built-in JS
>> interpreter is modified to support Typescript, isn't it?
>> (If it happens, it will be very nice for maintenance reason alone.)
>>
>> Joshua probably didn't raise the prospect of better JS-like script
>> languages precisely because of this reason.
>> But I may be wrong.
>
> Typescript compiles to standard Javascript[1] so there's no need for
> Typescript-specific infrastructure within Mozilla/Thunderbird.
>
> How practical it would actually be to write Thunderbird UI in Typescript
> and compile to Javascript I don't know but, in theory, it should
> certainly work.
>

I see. This means we have to run the TypeScript code through the
TypeScript compiler, sort of, to produce clean JavaScript code and place
it inside the source tree BEFORE creating the testable/release versions.

I think it is doable: whether the majority of developers would like the
extra step or not is another question.
I, for one, would go for TypeScript or other variants that enforce
static type checking better than the current JS.

>
> Footnote:-
> 1: From https://www.typescriptlang.org/index.html: "TypeScript compiles
> to clean, simple JavaScript code which runs on any browser, in Node.js,
> or in any JavaScript engine that supports ECMAScript 3 (or newer)."
>

It is ironic that Node.js was, the last time I checked, a typical
example of sloppy JS code in terms of typing. ESLint would have barfed
on it. Of course, there was a reason. The developers of Node.js wanted
to reduce the number of characters in the code as much as possible, and
there are enough eyeballs to keep it in good shape.
The latter is quite the opposite of the TB development community scene.

Chiaki


Re: Choosing a programming language

Mark Rousell-3
On 19/03/2019 20:04, ISHIKAWA, Chiaki wrote:
> I see. This means we have to run through typescript code via
> TypeScript compiler, sort of, to produce clean JavaScript code and
> place it inside the source tree BEFORE creating the testable/release
> versions.

Yes, that is my understanding.

> I, for one,  would go for TypeScript and other variants that enforce
> static type checking better than the current JS.

I understand that that is indeed its appeal for many people.


--
Mark Rousell
 
 
 


Re: Choosing a programming language

Joshua Cranmer 🐧
In reply to this post by ISHIKAWA, Chiaki
On 3/19/2019 4:59 AM, ISHIKAWA,chiaki wrote:

> Q-1: What is the difference between XPIDL-based C++ and modern C++?
> I am not sure what you are describing here.
>
> I thought following XPIDL was the necessity to publish API that is
> used by JavaScript, etc., i.e., outside the C++ domain in the mozilla
> framework. Writing unconstrained code in C++ is a sure way to fail to
> provide APIs that can be used in other languages, I think.
>
> I am not saying writing in XPIDL-based API is the sure way to provide
> APIs that can be used in other languages very successfully.
> We can certainly write APIs difficult to use and error-prone (say,
> memory allocation/release issues, etc.).
> But at least, the use of XPIDL interface heightens the awareness that
> we are writing APIs to be used in other languages. And, we can focus
> our debugging efforts to a few published selected APIs instead of
> myriads of exported symbols.

When I said XPIDL-based C++, I meant to refer specifically to the style
of coding where to use another object, you create an nsIFrozz interface,
fill it out with everything you want, then go to an nsFrozz instance,
have it implement nsIFrozz, and interact with nsFrozz exclusively
through nsIFrozz. Furthermore, it tends to rely on methods that look
like nsresult GetFrozz(nsIFrozz **instance) instead of
already_AddRefed<nsFrozz> GetFrozz() (or RefPtr<> in some cases). Where
this really tends to hurt is callbacks; you have to create an interface
whose sole purpose is to be the callback function, and then refactor
stuff to actually implement that interface through the entire XPCOM
infrastructure. By contrast, modern C++ can represent such a callback
with a lambda function, or even just a RefPtr + pointer-to-member
function. You can look at the APIs we have for proxying individual
function calls off-main-thread for an example.
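
The callback contrast is not specific to C++. As a loose analogy only,
here is the same shape sketched in Rust rather than with the actual
XPCOM/XPIDL machinery: a dedicated single-method interface on one side, a
closure on the other. FetchListener, KeyCollector and the two fetch
functions are hypothetical names invented for this sketch.

```rust
/// Style 1: an interface whose sole purpose is to be the callback.
trait FetchListener {
    fn on_fetched(&mut self, message_key: u32);
}

/// ...plus a type that exists only to implement it.
struct KeyCollector {
    keys: Vec<u32>,
}

impl FetchListener for KeyCollector {
    fn on_fetched(&mut self, message_key: u32) {
        self.keys.push(message_key);
    }
}

fn fetch_with_listener(keys: &[u32], listener: &mut dyn FetchListener) {
    for &key in keys {
        listener.on_fetched(key);
    }
}

/// Style 2: the callback is just a closure supplied at the call site.
fn fetch_with_closure(keys: &[u32], mut on_fetched: impl FnMut(u32)) {
    for &key in keys {
        on_fetched(key);
    }
}

fn main() {
    let mut collector = KeyCollector { keys: Vec::new() };
    fetch_with_listener(&[7, 8, 9], &mut collector);

    let mut seen = Vec::new();
    fetch_with_closure(&[7, 8, 9], |key| seen.push(key));

    println!("listener saw {:?}, closure saw {:?}", collector.keys, seen);
}
```

Both styles do the same work; the second simply avoids declaring an extra
interface and an extra type for each callback.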

Right now, I would say that mailnews code definitely errs way too far on
make-an-interface-for-everything. I don't see compelling needs for
nsI*Protocol to be exposed outside of script, and things like
nsINNTPArticleList and nsINNTPNewsgroupList are completely extraneous.

> Q-2: What do you see as the definitive specification for the language Rust?
> I hate to invest in a language until a clearcut formal specification
> of its syntax and semantics is given in an easily available document.

The Rust Reference
<https://doc.rust-lang.org/reference/introduction.html> is a reasonably
complete document of syntax and generally sufficiently complete for
semantics. It's not a formal semantics, and it's definitely vaguer than
the level of the C++ standard, but my experience of Rust versus C++ is
that unless you're dealing with unsafe code in Rust, the compiler will
prevent you from accidentally writing code with the wrong semantics.

I will also point out that I'm not aware of any attempts to build an
executable semantics for C++; most people give up after tackling C,
which itself is quite difficult. C and C++ are also underspecified in
some annoying ways--I had a fun time discovering that volatile register
_Atomic int x; is a legal declaration and I've been unable to find any
definitive clue as to what the legal semantics of such an abomination are.

> W-1: One point I would like to see in the language runtime support or
> environment is as follows.
>
> Difficulty to hook to a good debugging framework during the bulk test
> suite execution ( I am talking about |make mozmill| test suite ) is a
> big deficiency under linux environment IMHO.  I mean when I see a
> dubious behavior (warnings/errors/assertions), it is not entirely
> clear how to debug that easily. Well, after a few trials and errors,
> we can begin checking what is going on. But under linux, often times,
> I have to insert some probes (dumping variables) and recompile the
> source and off I go with gdb, etc. if I am lucky enough to figure out
> the sequences of operations that lead to those errors/warnings/assertions.

Mach makes dealing with most of the testsuites pretty easy, but the
mozmill testsuite is definitely a hole since it's not integrated with
mach at all. (That's one of the reasons I avoid that testsuite whenever
possible).

In general, the best way to debug code is often to put code in a
restricted, smaller environment. Mozmill, since it goes through the full
UI stack, is therefore generally a poor way to test many of the backend
pieces that I personally tend to touch.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist


Re: Choosing a programming language

ISHIKAWA, Chiaki
Dear Joshua,

Thank you for the explanation.

My short comments inline below.

On 2019-03-20 13:27, Joshua Cranmer 🐧 wrote:

> On 3/19/2019 4:59 AM, ISHIKAWA,chiaki wrote:
>> Q-1: What is the difference between XPIDL-based C++ and modern C++?
>> I am not sure what you are describing here.
>>
>> I thought following XPIDL was the necessity to publish API that is used by
>> JavaScript, etc., i.e., outside the C++ domain in the mozilla framework.
>> Writing unconstrained code in C++ is a sure way to fail to provide APIs
>> that can be used in other languages, I think.
>>
>> I am not saying writing in XPIDL-based API is the sure way to provide APIs
>> that can be used in other languages very successfully.
>> We can certainly write APIs difficult to use and error-prone (say, memory
>> allocation/release issues, etc.).
>> But at least, the use of XPIDL interface heightens the awareness that we
>> are writing APIs to be used in other languages. And, we can focus our
>> debugging efforts to a few published selected APIs instead of myriads of
>> exported symbols.
>
> When I said XPIDL-based C++, I meant to refer specifically to the style of
> coding where to use another object, you create an nsIFrozz interface, fill
> it out with everything you want, then go to an nsFrozz instance, have it
> implement nsIFrozz, and interact with nsFrozz exclusively through nsIFrozz.
> Furthermore, it tends to rely on methods that look like nsresult
> GetFrozz(nsIFrozz **instance) instead of already_AddRefed<nsFrozz>
> GetFrozz() (or RefPtr<> in some cases).

That was more or less my understanding.

> Where this really tends to hurt is
> callbacks; you have to create an interface whose sole purpose is to be the
> callback function, and then refactor stuff to actually implement that
> interface through the entire XPCOM infrastructure.

Ditto.

> By contrast, modern C++
> can represent such a callback with a lambda function, or even just a RefPtr
> + pointer-to-member function. You can look at the APIs we have for proxying
> individual function calls off-main-thread for an example.

I did not realize this.
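
As an illustration of the three callback shapes mentioned in the quoted text, here is a minimal sketch in plain standard C++. It is not the actual Mozilla API: IFetchListener, CollectingListener and Fetcher are hypothetical names, std::function stands in for whatever callable wrapper the real code would use, and std::shared_ptr stands in for RefPtr. The callbacks are invoked synchronously, which is the only reason capturing a local by reference in a lambda is safe here.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Interface style: a type whose sole purpose is to carry one callback.
class IFetchListener {
 public:
  virtual ~IFetchListener() = default;
  virtual void OnFetched(const std::string& aBody) = 0;
};

// Hypothetical listener that records everything it is told.
class CollectingListener final : public IFetchListener {
 public:
  void OnFetched(const std::string& aBody) override { mSeen.push_back(aBody); }
  std::vector<std::string> mSeen;
};

// Hypothetical producer that offers all three API shapes.
class Fetcher {
 public:
  // 1. The caller must define and implement a listener interface.
  void FetchWithListener(IFetchListener& aListener) {
    aListener.OnFetched("hello");
  }

  // 2. The caller passes any callable, typically a lambda.
  void FetchWithCallback(
      const std::function<void(const std::string&)>& aCallback) {
    aCallback("hello");
  }

  // 3. The caller passes a reference-counted target plus a
  //    pointer-to-member function.
  template <typename T>
  void FetchWithMethod(const std::shared_ptr<T>& aTarget,
                       void (T::*aMethod)(const std::string&)) {
    assert(aTarget);
    ((*aTarget).*aMethod)("hello");
  }
};
```

A caller using the second shape would write something like fetcher.FetchWithCallback([&seen](const std::string& body) { seen.push_back(body); }); with no extra interface to declare. If the callback could run after seen goes out of scope, the capture would have to own its data instead.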

> Right now, I would say that mailnews code definitely errs way too far on
> make-an-interface-for-everything.

I thought this was the "mozilla" way: I am afraid that I have been
braindamaged by the legacy code.

> I don't see compelling needs for
> nsI*Protocol to be exposed outside of script, and things like
> nsINNTPArticleList and nsINNTPNewsgroupList are completely extraneous.

I think, for now, as long as we can call C++ functions and access/modify C++
objects from JS, that would be fine. I don't really see the need for non-JS
language bindings. Oh wait, we are now discussing the introduction of Rust
here, but I suppose a Rust binding is not difficult and not much different
from a JS binding over the XPCOM infrastructure.

>> Q-2: What do you see as the definitive specification for the language Rust?
>> I hate to invest in a language until a clearcut formal specification of
>> its syntax and semantics is given in an easily available document.
>
> The Rust Reference <https://doc.rust-lang.org/reference/introduction.html>
> is a reasonably complete document of syntax and generally sufficiently
> complete for semantics.

The PDF about KRust which I referenced contained a statement as follows:
" As a witness, although the Rust’s community provides some syntax in EBNF
[12], it is still far away from complete. This makes the formalization of
Rust much difficult, as mentioned in [20].  "

I had a tough time understanding this, but when I noticed the following
statement at this URL,
https://doc.rust-lang.org/reference/index.html
I think I understood the elusive nature of an actively developed language.

Quote: "For now, this reference is a best-effort document. We strive for
validity and completeness, but are not yet there. In the future, the docs
and lang teams will work together to figure out how best to do this. Until
then, this is a best-effort attempt. If you find something wrong or missing,
file an issue or send in a pull request. "

But I take it that, for practical purposes,
https://doc.rust-lang.org/reference/introduction.html
gives the necessary information about the language (?).


> It's not a formal semantics, and it's definitely
> vaguer than the level of the C++ standard, but my experience of Rust versus
> C++ is that unless you're dealing with unsafe code in Rust, the compiler
> will prevent you from accidentally writing code with the wrong semantics.

OK.


> I will also point out that I'm not aware of any attempts to build an
> executable semantics for C++; most people give up after tackling C, which
> itself is quite difficult. C and C++ are also underspecified in some
> annoying ways--I had a fun time discovering that volatile register _Atomic
> int x; is a legal declaration and I've been unable to find any definitive
> clue as to what the legal semantics of such an abomination are.

The PDF about KRust which I referenced contains some pointers to executable
semantics of C and C++, which seem to be in POPL proceedings. Those are [7]
and [11] in the PDF; the latter deals with "undefined" behavior in C11.
Obviously, C and C++ have had more years behind them for such executable
semantic modeling to appear.


>> W-1: One point I would like to see in the language runtime support or
>> environment is as follows.
>>
>> The difficulty of hooking up a good debugging framework during bulk test
>> suite execution (I am talking about the |make mozmill| test suite) is a big
>> deficiency under Linux, IMHO. I mean that when I see dubious
>> behavior (warnings/errors/assertions), it is not entirely clear how to
>> debug it easily. Well, after some trial and error, we can begin
>> checking what is going on. But under Linux, I often have to insert
>> some probes (dumping variables), recompile the source, and off I go with
>> gdb, etc., iff I am lucky enough to figure out the sequence of operations
>> that leads to those errors/warnings/assertions.
>
> Mach makes dealing with most of the testsuites pretty easy, but the mozmill
> testsuite is definitely a hole since it's not integrated with mach at all.
> (That's one of the reasons I avoid that testsuite whenever possible).

This |make mozmill| test suite is exactly where I often find issues in TB, and
not being able to hook a debugger to the running TB inside |make mozmill|
has been a constant headache. Mozmill can invoke gdb, a debugger under
Linux, when a crash occurs, but I want to run the debugger for a
warning/error/assertion.

This issue is orthogonal to the language chosen, but TB needs to make
debugging under |make mozmill| easier in order to survive in the long term.
The selection of the language is an important factor for survival, but we
need to pay attention to debugging issues, too.
I think we probably spend 20-30 times as much time debugging as coding, or
even more.

> In general, the best way to debug code is often to put code in a restricted,
> smaller environment. Mozmill, since it goes through the full UI stack, is
> generally a poor way to test much of the backend pieces that I personally
> tend to touch as a result.

Unfortunately, many tests for UI interaction and the behavior of TB are
written in mozmill.

Mozmill does not even use a full UI stack for debugging: I mean, if I am not
mistaken, there is no notion of UI component naming. We can't seem to
name a dialog, say, and wait for that particular named dialog to appear,
etc. There are times when the test code expects a dialog but, in reality,
a different dialog for an unexpected error is shown; the test code
thinks it is seeing the dialog it was expecting, and all hell breaks
loose. But I digress.

Chiaki


cf. The KRust PDF that I am referring to above is the following paper on arXiv:
https://arxiv.org/pdf/1804.10806.pdf
arXiv:1804.10806v1 [cs.PL] 28 Apr 2018, "KRust: A Formal Executable
Semantics of Rust"
_______________________________________________
dev-apps-thunderbird mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-apps-thunderbird

Re: Choosing a programming language

Tito-12
In reply to this post by ISHIKAWA, Chiaki
I am curious why the WebAssembly format was left out of this list. In my
opinion, we could also add any language that can compile to WebAssembly. So
far I have done only very trivial tests, like adding integers and printing a
UTF string with WebAssembly in Thunderbird, and so far it has worked.


Tito






Re: Choosing a programming language

Joshua Cranmer 🐧
On 3/21/2019 4:42 AM, Tito wrote:
> I am curious why the WebAssembly format was left out of this list. In my
> opinion, we could also add any language that can compile to WebAssembly. So
> far I have done only very trivial tests, like adding integers and printing
> a UTF string with WebAssembly in Thunderbird, and so far it has worked.

You need to compile into WebAssembly somehow, which requires build
system integration. To my knowledge, build system integration for
WebAssembly source files doesn't exist.

Also, people don't write WebAssembly code directly; it has to be
compiled from some other language first. And the main contenders for
those languages are C++ and Rust, and I did call out Rust's support for
WebAssembly as a target as a benefit.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist


Re: Choosing a programming language

Tito-12
Hello Joshua,

You are right, I missed that sentence, i.e.

"
* Rust can easily compile to WebAssembly, which allows an extra way to
    call from JS code for computation-heavy code
"

In my opinion, we need to converge around WebAssembly, since it is already
supported by all browsers and is thus platform independent. The
next release of WebAssembly will receive a garbage collection capability,
which will mean we will be able to compile Java, C#, etc. to it.

Tito






Re: Choosing a programming language

Joshua Cranmer 🐧
In reply to this post by ISHIKAWA, Chiaki
On 3/20/2019 7:28 AM, ishikawa wrote:
> I think, for now, as long as we can call C++ functions and access/modify C++
> objects from JS, that would be fine. I don't really see the need for non-JS
> language binding. Oh wait, we now discuss the introduction of Rust here, but
> I suppose the Rust binding is not difficult and not much different from
> JS-binding over the XPCOM infrastructure.

Right now, we rely pretty much exclusively on XPIDL (or, more specifically,
XPConnect) for JS bindings. But XPConnect doesn't actually follow the JS
object model very well--you can look at the magic stuff that happens
when you call QueryInterface (or instanceof, which does the same thing
under the hood). Especially with Mozilla pivoting to WebExtensions, I
doubt Mozilla has a long-term commitment to XPConnect, and I don't
believe we have the capability to truly maintain such a versatile and
complex piece ourselves. It should also be noted that there are some
features that have already been desupported by Mozilla--Worker thread
access to XPConnect is forbidden, for example. Most of the focus for JS
bindings to C++ code in Gecko happens via WebIDL, which is much more
heavily tuned to support the JS object model.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist


Re: Choosing a programming language

ISHIKAWA, Chiaki
On 2019-03-24 09:35, Joshua Cranmer 🐧 wrote:

> On 3/20/2019 7:28 AM, ishikawa wrote:
>> I think, for now, as long as we can call C++ functions and access/modify C++
>> objects from JS, that would be fine. I don't really see the need for non-JS
>> language binding. Oh wait, we now discuss the introduction of Rust here, but
>> I suppose the Rust binding is not difficult and not much different from
>> JS-binding over the XPCOM infrastructure.
>
> Right now, we rely pretty much exclusively on XPIDL (or, more specifically,
> XPConnect) for JS bindings. But XPConnect doesn't actually follow the JS
> object model very well--you can look at the magic stuff that happens when
> you call QueryInterface (or instanceof, which does the same thing under the
> hood). Especially with Mozilla pivoting to WebExtensions, I doubt Mozilla
> has a long-term commitment to XPConnect, and I don't believe we have the
> capability to truly maintain such a versatile and complex piece ourselves.
> It should also be noted that there are some features that have already been
> desupported by Mozilla--Worker thread access to XPConnect is forbidden, for
> example. Most of the focus for JS bindings to C++ code in Gecko happens via
> WebIDL, which is much more heavily tuned to support the JS object model.
>

I see.
For TB, what we need to consider is how much we depend on Gecko (the web
browser engine) for UI interaction, etc., and how much we bypass Gecko and do
the underlying network/filesystem I/O, etc. in TB's own code now.

Depending on the amount of JS (or Rust or whatever) that needs to be
maintained over the long term along with the underlying C++ code,
we have to figure out which inter-domain and inter-language interface
mechanisms to choose. But, as you say, we may not be able to support complex
interface mechanisms ourselves any more, and in that case we have to go with
the flow, as long as the choice is not something everybody abhors.

Chiaki

Re: Choosing a programming language

Noel Grandin-2
In reply to this post by Joshua Cranmer 🐧
As a datapoint, I will note that when I tried to hack on Thunderbird (a long time ago), the combination of JavaScript and C++ code made for a truly awful debugging experience.

C++ and Rust should play fairly nicely together in the debugger, since they don't have any kind of fancy JIT/GC/etc.