A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

Jack Works

Just like DOMParser in HTML and Houdini's parser API in CSS, a built-in parser for ECMAScript itself is quite useful in many ways.

Check out https://github.com/Jack-Works/proposal-ecmascript-parser for details (and also, finding champions!)



_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

David Rajchenbach-Teller-2
Out of curiosity, what is the expected benefit wrt Esprima, Babel or
Shift? In particular since there is no standard AST for ECMAScript yet [1]?

Cheers,
 David

[1] Ok, that's a subset of https://github.com/tc39/proposal-binary-ast,
which is in the pipes.

On 14/09/2019 07:46, Jack Works wrote:

> Just like DOMParser <http://mdn.io/DOMParser> in HTML and Houdini's
> parser API in CSS
> <https://github.com/WICG/CSS-Parser-API/blob/master/README.md>, a
> built-in parser for ECMAScript itself is quite useful in many ways.
>
> Check out https://github.com/Jack-Works/proposal-ecmascript-parser for
> details (and also, finding champions!)

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

Jack Works
This proposal is not part of the Binary AST proposal, because that proposal wants a binary representation and will not generate an AST directly from the ECMAScript spec.
Running parsers such as Esprima, Babel or Shift in the browser is pretty slow. Since the JS engine can already parse JavaScript code, just exposing those interfaces would make things easier.
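
For the syntax-checking use case alone, a rough approximation is already possible without a parser API. The sketch below relies on the `Function` constructor, which parses its argument and throws a `SyntaxError` for invalid code without running it. The helper name `checkSyntax` is made up for illustration. It only accepts code that is valid as a function body (so a top-level `return` passes and `import`/`export` fail), it produces no AST, and it will not work under a Content Security Policy that forbids `eval`-like calls.

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
// Nothing in `source` is executed; the Function constructor only compiles it.
function checkSyntax(source) {
  try {
    new Function(source);
    return { ok: true, error: null };
  } catch (err) {
    if (err instanceof SyntaxError) {
      return { ok: false, error: err.message };
    }
    throw err; // anything else (e.g. dynamic code blocked by CSP) is not a syntax problem
  }
}

console.log(checkSyntax("let x = 1; x += 2;").ok); // true
console.log(checkSyntax("let x = ;").ok);          // false
```

This answers "does it parse?" but nothing more, which is why it covers neither the sandbox nor the tooling use cases.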


Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

Gareth Heyes
In reply to this post by Jack Works
I had a few goes with making a JS sandbox. I also created a safe DOM environment that allowed safe manipulation of innerHTML etc

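JS sandbox with regular expressions
http://www.businessinfo.co.uk/labs/jsreg/jsreg.html

JS sandbox and safe DOM environment
http://businessinfo.co.uk/labs/MentalJS/MentalJS.html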

It would be great to have a parser in JS!

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

Isiah Meadows-2
I do want to note a couple things here, as someone familiar with the
implementation aspect of JS and programming languages in general:

1. The HTML and CSS parsers (for inline style sheets) have to build
full DOM trees for each anyway just to conform to spec, so they can't
just, say, parse `.foo { display: block; color: red; }` as `.foo {
display: block; } .foo { color: red }` with a cached selector (which
*would* be easier to process later on). In this case, they're
basically just exposing the same parsers they'd have to use in
practice anyway, so it's literally trivial for them to add.
2. No JS engine parses nodes the way the spec processes them; they just
do it in a way that's unobservable, modulo timing. They internally parse
`1` and `1.0` as different types, and they will do things like constant
folding - `3 * 5` gets parsed as `15` usually, and `"a" + "b"`
will usually get read as `"ab"` by some engines. Furthermore, browser
engines lazily parse functions where they can, only validating them
for early errors and storing the source code to reparse them on first
call, because it helps them start up faster with less memory. And of
course, `typeof value === "string"` is often not simply compiled to
`%IsString(value)` but literally parsed as such if `value` is defined
in that scope. And finally, engines typically merge the steps of AST
generation and scope detection, not only to detect `let`/`const`
errors but also to speed up bytecode generation.

So although it sounds like JS engines could reuse their logic, they
really couldn't. This is further evidenced by SpiderMonkey's parser
API (the predecessor to the ESTree spec) not sharing the same
implementation as the core language parser. Generating an AST for
tooling and generating an AST to execute are two vastly different
concerns. In the former, you want as much info as
possible readily available. In the latter, you just want to have the
bare minimum to compile to bytecode with relevant source locations for
stack traces, and anything else is literally just unnecessary
overhead.
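
To make the difference concrete, here is a minimal sketch of the kind of folding an execution-oriented parser might apply, written against hand-built ESTree-style nodes. The `fold` function and the node literals are illustrative assumptions, not any engine's actual code. A tooling AST keeps the original `BinaryExpression` with both operands and their raw text, while the folded result keeps only what is needed to run.

```javascript
// ESTree-style node for the source text `3 * 5`, as a tooling parser would report it.
const toolingAst = {
  type: "BinaryExpression",
  operator: "*",
  left: { type: "Literal", value: 3, raw: "3" },
  right: { type: "Literal", value: 5, raw: "5" },
};

// Illustrative folder: collapses `*` on two numbers and `+` on two numbers
// or two strings; every other node is returned with its children folded.
function fold(node) {
  if (node.type !== "BinaryExpression") return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (left.type === "Literal" && right.type === "Literal") {
    const lt = typeof left.value;
    const rt = typeof right.value;
    if (node.operator === "*" && lt === "number" && rt === "number") {
      return { type: "Literal", value: left.value * right.value };
    }
    if (node.operator === "+" && lt === rt && (lt === "number" || lt === "string")) {
      return { type: "Literal", value: left.value + right.value };
    }
  }
  return { ...node, left, right };
}

console.log(fold(toolingAst)); // a single Literal node whose value is 15
```

After folding, nothing records that the source said `3 * 5`, which is fine for execution and useless for a linter or a code formatter.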

-----

Isiah Meadows
[hidden email]
www.isiahmeadows.com

On Sat, Sep 14, 2019 at 9:41 AM Gareth Heyes
<[hidden email]> wrote:

>
> I had a few goes with making a JS sandbox. I also created a safe DOM environment that allowed safe manipulation of innerHTML etc
>
> JS sandbox with regular expressions
> http://www.businessinfo.co.uk/labs/jsreg/jsreg.html
>
> JS sandbox and safe DOM environment
> http://businessinfo.co.uk/labs/MentalJS/MentalJS.html
>
> It would be great to have a parser in JS!

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

David Rajchenbach-Teller-2
In reply to this post by Jack Works
Before you can have a standard parser, you need a standard AST. There is
no such thing at the moment, so the V8 parser, the SpiderMonkey parser
and the JSCore parser, etc. all use distinct internal ASTs, each of
which changes every so often, either because the language changes or
because the VM needs to attach different information to help with
compilation.

That's the main reason for which there hasn't been a standard
user-accessible ECMAScript parser in ECMAScript.

As Binary AST relies upon having a standard AST, standardizing the
AST is part of the Binary AST proposal. You may find the latest version
of this AST online at
https://github.com/binast/binjs-ref/blob/master/spec/es6.webidl

Cheers,
 David

On 14/09/2019 10:10, Jack Works wrote:

> This proposal is not part of the Binary AST proposal, because that
> proposal wants a binary representation and will not generate an AST
> directly from the ECMAScript spec.
> Running parsers such as Esprima, Babel or Shift in the browser is pretty
> slow. Since the JS engine can already parse JavaScript code, just exposing
> those interfaces would make things easier.

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

Jack Works
Happy to see a standard AST in the Binary AST proposal.

A compiler could have a "slow" mode when parsing with this parser API and still use fast code generation in other cases. But unfortunately it seems there is much more work than I thought to provide such an API.

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

David Rajchenbach-Teller-2
In theory, it should be possible to have both modes, if the parser is
designed for it. Unfortunately, that's not the case at the moment.

Mozilla has recently started working on a new parser which could be used
both by VMs and by JS/wasm devs. It might help towards this issue, but
it's still early days.

Cheers,
 David

On 15/09/2019 13:09, Jack Works wrote:
> Happy to see a standard AST in the Binary AST proposal.
>
> A compiler could have a "slow" mode when parsing with this parser
> API and still use fast code generation in other cases. But unfortunately
> it seems there is much more work than I thought to provide such an API.

Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

kai zhu
adding datapoint on application in code-coverage.

a builtin parser-api would be ideal (and appreciate the insight on implementation difficulties).
lacking that, the next best alternative i've found is acorn (based on esprima),
available as a single, embeddable file runnable in browser:

```shell
curl https://registry.npmjs.org/acorn/-/acorn-6.3.0.tgz | tar -O -xz package/dist/acorn.js > acorn.rollup.js
ls -l acorn.rollup.js
-rwxr-xr-x 1 root root 191715 Sep 15 16:49 acorn.rollup.js
```

i recently added es9 syntax-support to an in-browser-variant of istanbul by replacing its aging esprima-parser with acorn [1].
ideally, i hope a standardized ast will be available someday, so we can get rid of acorn/babel/shift altogether (or maybe acorn can become that standard?).
even better would be if [cross-compatible] instrumentation becomes a common builtin-feature in engines, so we can get rid of istanbul too.

chrome/puppeteer's instrumentation-api is not yet ideal for my use-case because it currently lacks code-coverage-info on branches (which istanbul-instrumentation provides).
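
As a rough illustration of why branch coverage needs an AST at all, the sketch below walks an ESTree-style tree (the output format shared by esprima and acorn) and collects the nodes an instrumenter would typically treat as branch points. The tree is hand-written so the snippet runs without any dependency. The `collectBranches` name and the choice of node types are assumptions for illustration, not istanbul's actual implementation.

```javascript
// Hand-written ESTree-style AST for: if (a) { b(); } else { c ? d() : e(); }
const call = (name) => ({
  type: "CallExpression",
  callee: { type: "Identifier", name },
  arguments: [],
});
const ast = {
  type: "Program",
  body: [{
    type: "IfStatement",
    test: { type: "Identifier", name: "a" },
    consequent: {
      type: "BlockStatement",
      body: [{ type: "ExpressionStatement", expression: call("b") }],
    },
    alternate: {
      type: "BlockStatement",
      body: [{
        type: "ExpressionStatement",
        expression: {
          type: "ConditionalExpression",
          test: { type: "Identifier", name: "c" },
          consequent: call("d"),
          alternate: call("e"),
        },
      }],
    },
  }],
};

// Node types treated as branch points in this sketch (an assumption).
const BRANCH_TYPES = new Set([
  "IfStatement", "ConditionalExpression", "LogicalExpression", "SwitchCase",
]);

// Depth-first walk over anything that looks like a node (has a string `type`).
function collectBranches(node, found = []) {
  if (Array.isArray(node)) {
    for (const child of node) collectBranches(child, found);
  } else if (node && typeof node === "object" && typeof node.type === "string") {
    if (BRANCH_TYPES.has(node.type)) found.push(node.type);
    for (const key of Object.keys(node)) collectBranches(node[key], found);
  }
  return found;
}

console.log(collectBranches(ast)); // IfStatement first, then ConditionalExpression
```

A real instrumenter would also need source locations on each node so it can insert counters and map hits back to the original file.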

[1] istanbul-lite - embeddable, es9 browser-variant of istanbul code-coverage
https://kaizhu256.github.io/node-istanbul-lite/build..beta..travis-ci.org/app/



Re: A new proposal for syntax-checking and sandbox: ECMAScript Parser proposal

Isiah Meadows-2
Nit: Acorn's *output* is based on Esprima. Its code is *not* and
hasn't been for a few years now. It started as a fork of Esprima, but it
wasn't long before it was rewritten the first time.

-----

Isiah Meadows
[hidden email]
www.isiahmeadows.com

On Mon, Sep 16, 2019 at 1:58 AM kai zhu <[hidden email]> wrote:

>
> adding datapoint on application in code-coverage.
>
> a builtin parser-api would be ideal (and appreciate the insight on implementation difficulties).
> lacking that, the next best alternative i've found is acorn (based on esprima),
> available as a single, embeddable file runnable in browser:
>
> ```shell
> curl https://registry.npmjs.org/acorn/-/acorn-6.3.0.tgz | tar -O -xz package/dist/acorn.js > acorn.rollup.js
> ls -l acorn.rollup.js
> -rwxr-xr-x 1 root root 191715 Sep 15 16:49 acorn.rollup.js
> ```
>
> i recently added es9 syntax-support to an in-browser-variant of istanbul by replacing its aging esprima-parser with acorn [1].
> ideally, i hope a standardized ast will be available someday, so we can get rid of acorn/babel/shift altogether (or maybe acorn can become that standard?).
> even better would be if [cross-compatible] instrumentation becomes a common builtin-feature in engines, so we can get rid of istanbul too.
>
> chrome/puppeteer's instrumentation-api is not yet ideal for my use-case because it currently lacks code-coverage-info on branches (which istanbul-instrumentation provides).
>
> [1] istanbul-lite - embeddable, es9 browser-variant of istanbul code-coverage
> https://kaizhu256.github.io/node-istanbul-lite/build..beta..travis-ci.org/app/