JSON.stringify </script>

Michał Wadas

Idea: require implementations to stringify "</script>" as "<\uxxxxscript>".

Benefits: remove XSS vulnerability when injecting JSON as content of <script> tag (quite common antipattern).

Backward compatible: yes, unless binary equality is required and this string is used.


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss

Re: JSON.stringify </script>

Mike Samuel
I think defining an easy way to produce embeddable JSON is a great
idea, but it's not quite that simple.

https://github.com/OWASP/json-sanitizer#output captures some
requirements that I came up with for embedding JSON in HTML:

"""
The output is well-formed JSON as defined by RFC 4627. The output
satisfies these additional properties:

* The output will not contain the substring (case-insensitively)
"</script" so can be embedded inside an HTML script element without
further encoding.
* The output will not contain the substring "]]>" so can be embedded
inside an XML CDATA section without further encoding.
* The output is a valid Javascript expression, so can be parsed by
Javascript's eval builtin (after being wrapped in parentheses) or by
JSON.parse. Specifically, the output will not contain any string
literals with embedded JS newlines (U+2028 Line separator or
U+2029 Paragraph separator).
* The output contains only valid Unicode scalar values (no isolated
UTF-16 surrogates) that are allowed in XML unescaped.
"""

These apply equally well to RFC 7159 IIUC.  The latter few constraints
are required to allow embedding of JSON in HTML in a foreign content
context ( https://www.w3.org/TR/html5/syntax.html#cdata-sections ).

Those rules are sufficient to allow embedding in HTML without breaking
token boundaries in the embedding language.

To preserve semantics when embedding in HTML you also need to escape '&'.
To prevent exfiltration via external entities in SVG & other XML
variants, you should probably also escape '%'.
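
A minimal sketch of what such an embeddable serializer could look like, assuming it is enough to post-process the output of JSON.stringify. The name stringifyEmbeddable and the escape table are hypothetical and are not taken from the OWASP library or from any spec. It relies on the fact that in well-formed JSON the characters `<`, `>` and `&` can only occur inside string literals, where a \uXXXX escape is always allowed. It does not cover '%' or characters that XML disallows.

```javascript
// Hypothetical helper, not part of any standard or of json-sanitizer.
// Maps each character that matters to the embedding language to a
// JSON \uXXXX escape, which JSON.parse decodes back to the original.
const EMBED_ESCAPES = {
  "<": "\\u003c",      // covers "</script", "<script" and "<!--"
  ">": "\\u003e",      // covers "]]>"
  "&": "\\u0026",      // keeps HTML/XML entity decoding from changing meaning
  "\u2028": "\\u2028", // line separator
  "\u2029": "\\u2029", // paragraph separator
};

function stringifyEmbeddable(value) {
  const json = JSON.stringify(value);
  // JSON.stringify returns undefined for values such as functions.
  if (json === undefined) return undefined;
  return json.replace(/[<>&\u2028\u2029]/g, (ch) => EMBED_ESCAPES[ch]);
}
```

The output is still plain JSON, so a consumer calling JSON.parse gets the original strings back unchanged.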




Re: JSON.stringify </script>

Alexander Jones
In reply to this post by Michał Wadas
That's awful. As you say, it's an antipattern, no further effort should be spent on this. JSON produced by JavaScript has far more general uses than slapping directly into a script tag unencoded, so no-one else should have to see this. Also, there are many other producers of JSON than JavaScript.

Instead, use XHTML and CDATA (which has a straightforward encoding mechanism that doesn't ruin the parseability of the code or affect it in any way) if you really want to pull stunts like this.

Alex


Re: JSON.stringify </script>

Michał Wadas

Actually, CDATA suffers from the same issue with the string "]]>". Mike Samuel has a very strong point here.

And saying "it's an antipattern, don't do this" will not make old vulnerable code go away. We have a very good way to stop people from shooting themselves in the foot, for free.



Re: JSON.stringify </script>

Alexander Jones
Hi Michał

Embedding a JSON literal into HTML involves first encoding to JSON then encoding that into HTML. Two stages which must not be confused. The 'encoding into HTML' part is best done in XHTML with CDATA, and the encoding method is taken care of by whichever XML-generating library you're using. If you hint it to use CDATA for such a text node, or if for any other reason it chooses to use CDATA, rather than merely converting every `<` to `&lt;`, etc., then it will (or should) "escape" `]]>` as `]]]]><![CDATA[>` or whatever equivalent. See https://en.wikipedia.org/wiki/CDATA#Nesting for more info. Crucially, this works for encoding ANY text data into a text node in an XML document, not just JSON.
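
A rough sketch of the CDATA step described above, assuming plain string handling rather than a real XML library. The function name toCdata is hypothetical. It splits every `]]>` so that the first section ends after `]]` and a new section starts before `>`.

```javascript
// Hypothetical helper: wrap arbitrary text in one or more CDATA sections.
// Each "]]>" in the input is split across two adjacent sections, so an XML
// parser that concatenates them recovers the original text.
function toCdata(text) {
  return "<![CDATA[" + text.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

const wrapped = toCdata('var s = "]]>";');
```

Nothing here is specific to JSON or to JavaScript; the same function applies to any text placed in an XML text node.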

Having the specified JSON algorithm in ECMAScript deal with concerns of embedding into legacy, non XML-based HTML (oh yes, I totally went there! ;) ) is a classic layer violation, which I would guarantee offends 99 out of 100 experienced programmers' sensibilities. :)

Aside, I'll repeat again that this would be largely ineffective - a lot of JSON that might be dumbly pasted into a text stream of HTML would be generated by implementations other than that specified by ECMAScript.

Hope this clears it up

Alex


Re: JSON.stringify </script>

Kris Siegel
ECMAScript, while highly used in web browsers, should really not care about HTML constructs. That's where WHATWG and W3C come in. I suggest this type of feature should come from one of those groups, not ECMA.


Re: JSON.stringify </script>

Mike Samuel
In reply to this post by Alexander Jones

I agree it's subideal which is why I work to address problems like this in template systems but ad-hoc string concatenation happens and embeddable sub-languages provide defense-in-depth without sacrificing correctness.

CDATA sections solve no problems because they cannot contain any string that has "]]>" as a substring so you still have to s/\]\]>/]]]]><![CDATA[>/g.



Re: JSON.stringify </script>

Alexander Jones
They do solve the problem. You encode your entire JS *before* pasting it, encoding `]]>` and nothing more, and the XML document's text node contains the unadulterated text, which the JS parser also sees. It's perfect layer isolation. Ye olde HTML can't do that because there is no escaping mechanism for `</script>` that actually allows the JS parser to see the text (code) content unmodified.

Viva la `<xhtml:revolución />` ;)


Re: JSON.stringify </script>

Mike Samuel

Without CDATA you have to encode script bodies properly.  With CDATA you have to encode script bodies properly.  What problem did CDATA solve?



Re: JSON.stringify </script>

Alexander Jones
In XHTML, CDATA allows a 'more' verbatim spelling of text node content. But the end token has to be escaped, as discussed. Despite this escaping, the text node can contain arbitrary strings.

In XHTML, you *can* achieve the same effect without CDATA, just by escaping XML entities. Again, and crucially, the text node can contain arbitrary strings.

In HTML without CDATA, using HTML entities within the script tag is wrong specifically because they are *not* interpreted. The text node in the HTML document CANNOT contain arbitrary strings, and there is no further decode step before the JS parser hits your code, so you're forced to take other measures to ensure that `</script>` does not appear in your code. There are a few places this can appear, only one of which is embedded in string literals, so the method of avoiding this is actually sensitive to the context and not practical to specify.

I hope you can appreciate how ridiculous this problem is for HTML - I don't believe CDATA support in HTML 5 can solve this due to forward compatibility - which is why it's an antipattern. Just don't do it, or use XHTML. It's not cool to hate on XML anymore. ;)

Alex



Re: JSON.stringify </script>

Simon Pieters-3
In reply to this post by Michał Wadas
On Wed, 28 Sep 2016 19:06:31 +0200, Michał Wadas <[hidden email]>  
wrote:

> Idea: require implementations to stringify "</script>" as  
> "<\uxxxxscript>".
>
> Benefits: remove XSS vulnerability when injecting JSON as content of
> <script> tag (quite common antipattern).
>
> Backward compatible: yes, unless binary equality is required and this
> string is used.

You would also need to escape "<!--" and "<script" for HTML. See  
https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements
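
As a rough illustration, assuming the three sequences named in this thread are the ones of interest: all of them start with `<`, so escaping every `<` in the serialized JSON removes them at once. The names below are hypothetical and the check is a sketch, not the full rule set from the linked section.

```javascript
// Hypothetical check for "<!--", "<script" and "</script",
// matched case-insensitively.
const RISKY_IN_SCRIPT = /<!--|<\/?script/i;

function isRiskyInScriptElement(text) {
  return RISKY_IN_SCRIPT.test(text);
}

// In well-formed JSON, "<" can only appear inside string literals,
// so replacing it with a \u003c escape keeps the parsed value identical.
function escapeLessThan(json) {
  return json.replace(/</g, "\\u003c");
}

const raw = JSON.stringify({ html: "<!-- <SCRIPT>x</script>" });
const safe = escapeLessThan(raw);
```

Here isRiskyInScriptElement(raw) reports a problem while isRiskyInScriptElement(safe) does not, and JSON.parse gives the same object for both.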

--
Simon Pieters
Opera Software

Re: JSON.stringify </script>

Mike Samuel
In reply to this post by Alexander Jones
On Thu, Sep 29, 2016 at 2:09 AM, Alexander Jones <[hidden email]> wrote:
> In XHTML, CDATA allows a 'more' verbatim spelling of text node content. But
> the end token has to be escaped, as discussed. Despite this escaping, the
> text node can contain arbitrary strings.



> In XHTML, you *can* achieve the same effect without CDATA, just by escaping
> XML entities. Again, and crucially, the text node can contain arbitrary
> strings.

So, <script><![CDATA[...]]></script> has a complete escaping process,
whereas, since CDATA sections were taken out of HTML foreign element
content disallowing
  <svg><script><![CDATA[...]]></script></svg>
HTML does not, so to figure out how to embed

  alert("</script>");
  if (a < /script>/.exec(myString)) ...

you have to do scripting language specific analysis.

Is that about right?


> In HTML without CDATA, using HTML entities within the script tag is wrong
> specifically because they are *not* interpreted. The text node in the HTML
> document CANNOT contain arbitrary strings, and there is no further decode
> step before the JS parser hits your code, so you're forced to take other
> measures to ensure that `</script>` does not appear in your code. There are
> a few places this can appear, only one of which is embedded in string
> literals, so the method of avoiding this is actually sensitive to the
> context and not practical to specify.



> I hope you can appreciate how ridiculous this problem is for HTML - I don't
> believe CDATA support in HTML 5 can solve this due to forward compatibility
> - which is why it's an antipattern. Just don't do it, or use XHTML. It's not
> cool to hate on XML anymore. ;)

Yes.  I've written hardened DOM tree serializers.  I appreciate these problems.
No-one is hating on XML.

We're talking about JSON serializers.  Every JSON serializer produces
a subset of the output language. Choices about that sublanguage affect
how easy/hard it is to use that serializer with other tools.

That "if everyone wrote software with property P, we would not have
problem Q" is a great argument that we should prefer stacks with
property P, but does not mean we should not take the prevalence of
problem Q into account when designing elements of software stacks.
You seem to actually be arguing that we should not do our best to
prevent problem Q by other means, but real systems need
defense-in-depth.

So I concede your point about CDATA sections but don't see that these
arguments about antipatterns and the benefits of XHTML are all that
relevant.




Re: JSON.stringify </script>

Oriol _
In reply to this post by Kris Siegel
> ECMAScript, while highly used in web browsers, should really not care about HTML constructs. That's where WHATWG and W3C come in. I suggest this type of feature should come from one of those groups, not ECMA.

That applies to escaping things like `</script>` or `]]>`, and I agree. But as Mike Samuel mentioned, JSON strings containing U+2028 or U+2029 are not valid JS expressions. I think it would make sense for `JSON.stringify` to escape these.
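
A small sketch of that escaping, assuming it is done as a post-processing step on the result of `JSON.stringify`. The function name `stringifyJsSafe` is hypothetical.

```javascript
// Hypothetical wrapper: JSON.stringify emits U+2028 and U+2029 as raw
// characters inside strings; replace them with their \uXXXX escapes so
// the output contains no JS line terminators.
function stringifyJsSafe(value) {
  const json = JSON.stringify(value);
  if (json === undefined) return undefined;
  return json.replace(/\u2028/g, "\\u2028").replace(/\u2029/g, "\\u2029");
}

const text = "line\u2028separator and paragraph\u2029separator";
const escaped = stringifyJsSafe(text);
```

The result is still valid JSON, and `JSON.parse(escaped)` returns the original string.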

-Oriol



Re: JSON.stringify </script>

Mike Samuel
On Thu, Sep 29, 2016 at 8:45 AM, Oriol Bugzilla
<[hidden email]> wrote:
>> ECMAScript, while highly used in web browsers, should really not care
>> about HTML constructs. That's where WHATWG and W3C come in. I suggest this
>> type of feature should come from one of those groups, not ECMA.
>
> That applies to escaping things like `</script>` or `]]>`, and I agree. But
> as Mike Samuel mentioned, JSON strings containing U+2028 or U+2029 are not
> valid JS expressions. I think it would make sense for `JSON.stringify` to
> escape these.

What is it that you're saying is not in TC-39's bailiwick?

Is it that w3c/whatwg should define what constitutes "embeddable JSON"?

Or is it that if it's worth defining a function that produces
embeddable JSON from an EcmaScript object, that w3c/whatwg should
include that in some set of EcmaScript APIs that it defines?

If you agree with my earlier claim
"""
We're talking about JSON serializers.  Every serializer produces
a subset of the output language. Choices about that sublanguage affect
how easy/hard it is to use that serializer with other tools.
"""
then it seems that TC-39 might take embeddability into account when
crafting the subset of JSON that JSON.stringify produces.

Re: JSON.stringify </script>

Alexander Jones
Maybe we should just make U+2028 and U+2029 valid in JS then? What other productions in JSON are invalid syntax in JS?


Re: JSON.stringify </script>

Mike Samuel
On Thu, Sep 29, 2016 at 9:25 AM, Alexander Jones <[hidden email]> wrote:
> Maybe we should just make U+2028 and U+2029 valid in JS then? What other
> productions in JSON are invalid syntax in JS?

I don't think any other productions in JSON are invalid syntax in an
Expression context.

JSON places no limit on size of numeric literals, and other languages
ban unrepresentably large ones, but IIRC ES does not.

Obviously if you start parsing JSON in a statement context, you run
into problems where a JSON object with one or more properties is an
invalid BlockStatement and the ExpressionStatement production is not
reached because of the negative lookahead.
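
To make the statement-versus-expression point concrete, here is a small sketch. The sample text and variable names are illustrative and not from the thread. It evaluates the same JSON text first in a statement context, where the leading brace opens a block, and then wrapped in parentheses, where it is parsed as an expression.

```javascript
// Illustrative JSON text with more than one property.
const jsonText = '{"a": 1, "b": 2}';

// Statement context: the leading "{" opens a BlockStatement, so the
// contents are not parsed as an object literal and eval throws.
let statementError = null;
try {
  eval(jsonText);
} catch (e) {
  statementError = e;
}
console.log(statementError instanceof SyntaxError); // true

// Expression context: wrapping in parentheses forces an Expression.
const value = eval('(' + jsonText + ')');
console.log(value.a + value.b); // 3

// A numeric literal too large for a double is still accepted; it
// evaluates to Infinity rather than being rejected.
console.log(eval('(1e999)') === Infinity); // true
```

This is why the RFC 4627 text and the json-sanitizer requirements both talk about wrapping the text in parentheses before handing it to eval.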

Re: JSON.stringify </script>

Mike Samuel
In reply to this post by Michał Wadas
On Wed, Sep 28, 2016 at 10:06 AM, Michał Wadas <[hidden email]> wrote:
> Idea: require implementations to stringify "</script>" as "<\uxxxxscript>".
>
> Benefits: remove XSS vulnerability when injecting JSON as content of
> <script> tag (quite common antipattern).
>
> Backward compatible: yes, unless binary equality is required and this string
> is used.

TL;DR: I'm against this.

I've pushed back against several lines of argument in this thread, so
I want to avoid leaving the impression that I support this proposal.

I think this is a bad idea, so let me try to pull together the various
threads and address them in one place.


Should EcmaScript or any other standards body define "embeddable JSON"?
============================================================
No.  The main argument for this feature is to make it easier to write
more secure code, and to transparently make existing code more secure.

Standards bodies move too slowly for that.  Library code can roll out
quickly in response to zero-days or emerging threats, but standards
cannot.

For example, client-side templates using mustaches ( goo.gl/eztprF )
are an emerging threat.

Standards have a poor history here, even with JSON.  Crockford's RFC 4627 said
"""
    A JSON text can be safely passed into JavaScript's eval() function
   (which compiles and executes a string) if all the characters not
   enclosed in strings are in the set of characters that form JSON
   tokens.  This can be quickly determined in JavaScript with two
   regular expressions and calls to the test and replace methods.

      var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
             text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
         eval('(' + text + ')');
"""
which is not in the latest JSON RFC: it was found to be false in a
dozen ways before RFC 7158 (since obsoleted) removed that language.
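
As one illustration of how that check can go wrong (a sketch of a single case, not a catalogue of the known problems): the character class admits enough letters to spell identifiers, so a text that is not JSON can pass. The helper names and the sample text below are hypothetical.

```javascript
// The check quoted above from RFC 4627, wrapped in a helper.
// The name passesRfc4627Check is hypothetical.
function passesRfc4627Check(text) {
  return !/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
    text.replace(/"(\\.|[^"\\])*"/g, ''));
}

// "user" uses only letters the character class allows (e, r, s, u),
// so this text passes the check even though it is not JSON.
const suspectText = '[user]';
console.log(passesRfc4627Check(suspectText)); // true

// JSON.parse rejects it.
let parsedAsJson = true;
try {
  JSON.parse(suspectText);
} catch (e) {
  parsedAsJson = false;
}
console.log(parsedAsJson); // false

// eval, by contrast, resolves the identifier against whatever is in scope.
function evalWithLocal(text) {
  const user = 'value of a local variable';
  return eval('(' + text + ')');
}
console.log(evalWithLocal(suspectText)[0]); // "value of a local variable"
```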

The only way to deal with emerging threats is to have a quickly
patchable system.  But patching serializers causes spurious test
failures, the broken-hearts problem, where a test that pins exact
output breaks once the serializer starts escaping "<":
   assertEquals("I <3 u", serializeHtml("I <3 u"))
I suspect that the best we will ever be able to do re emerging threats
is to allow those who care about security to patch and fix tests, and
to ignore the maintenance cost to unmaintained projects.
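
A minimal sketch of the broken-hearts problem, assuming two hypothetical versions of a serializeHtml function: v1 passes text through unchanged, and v2 is a patched version that escapes markup characters. Neither is a real library function.

```javascript
// Two hypothetical versions of an HTML text serializer. v1 passes text
// through unchanged; v2 is a "patched" version that escapes markup.
function serializeHtmlV1(text) {
  return text;
}
function serializeHtmlV2(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// A test that pins the exact output string.
function exactOutputTest(serializeHtml) {
  return serializeHtml('I <3 u') === 'I <3 u';
}

console.log(exactOutputTest(serializeHtmlV1)); // true
console.log(exactOutputTest(serializeHtmlV2)); // false
console.log(serializeHtmlV2('I <3 u'));        // I &lt;3 u
```

Both outputs render as the same text in a browser, so the failure under v2 is spurious: the test breaks even though nothing a user sees has changed.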


Is there any value in embeddable sanitizers?
=================================
I think embeddable serializers can provide defense-in-depth against
faults in code that composes network messages, which is why I wrote
https://github.com/OWASP/json-sanitizer to do just that.


Is this backwards compatible?
=======================
No.  JSON strings are used as keys in persisted tables because we have
de-facto defined a canonical subset of JSON.
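
A sketch of how that breaks, assuming a hypothetical stringifyEmbeddable function that escapes "<" in the spirit of the original proposal, and an in-memory Map standing in for a persisted table:

```javascript
// Hypothetical variant of JSON.stringify that escapes "<" inside strings,
// in the spirit of the original proposal.
function stringifyEmbeddable(value) {
  return JSON.stringify(value).replace(/</g, '\\u003c');
}

const record = { tag: '</script>' };

// A table persisted using today's JSON.stringify output as the key.
const persisted = new Map();
persisted.set(JSON.stringify(record), 'row 1');

const oldKey = JSON.stringify(record);
const newKey = stringifyEmbeddable(record);

// Both keys decode to the same value...
console.log(JSON.stringify(JSON.parse(newKey)) === oldKey); // true
// ...but they are different strings, so the lookup misses.
console.log(persisted.has(oldKey)); // true
console.log(persisted.has(newKey)); // false
```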

This kind of thing can be discouraged by randomizing, the way Java is
doing with builtin map implementations in Java 9, which also helps
avoid broken-hearts problems.  Java is a large-API language, so it can
provide umpteen variants of x in a way that wouldn't fit well in ES,
and providing an alternate API loses a lot of the benefit of the
original proposal.


Are embeddable serializers an anti-pattern?
========================================
No.  The anti-pattern is that trustworthy and untrustworthy content
are mixed using naive string concatenation to produce a trusted
output.

Even if the real anti-pattern were not endemic within distributed
systems, composing trustworthy network messages is hard and embeddable
serializers provide useful defense-in-depth for message composing
code.
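
To illustrate both halves of that answer, here is a sketch that first composes a script element by naive concatenation and then runs the same value through an embeddable serializer. The function stringifyForScript and the sample payload are hypothetical, and the set of escaped characters is an assumption loosely based on the json-sanitizer requirements quoted earlier in the thread.

```javascript
// Hypothetical helper: escape the characters that matter for the
// embedding contexts discussed in this thread. In JSON.stringify output
// (called without an indent argument) these characters can only occur
// inside string literals, so replacing them with \uXXXX escapes leaves
// the JSON value unchanged.
function stringifyForScript(value) {
  return JSON.stringify(value).replace(/[<>\u2028\u2029]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

const untrusted = { comment: '</script><script>alert(1)</script>\u2028' };

// Naive concatenation: the payload's "</script>" ends the element early.
const naive = '<script>var data = ' + JSON.stringify(untrusted) + ';</script>';
console.log(naive.indexOf('</script>') < naive.lastIndexOf('</script>')); // true

// The embeddable form contains neither "</script" nor a raw U+2028,
// and still parses back to the same value.
const embedded = stringifyForScript(untrusted);
console.log(/<\/script/i.test(embedded)); // false
console.log(embedded.includes('\u2028')); // false
console.log(JSON.parse(embedded).comment === untrusted.comment); // true
```

The serializer here is only a second line of defense: the concatenation is still the underlying problem, and the escaping limits the damage when it happens.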


Is XHTML more easily secured than HTML?
======================
Yes.  XML is much more easily statically analyzed, and mistaken
assumptions in a serializer much more frequently manifest as parse
failures, so it fails safe more often.  When the embedding language
fails safe, the whole is more secure than when an embedded language
that fails safe sits inside an embedding language that does not, as
is the case with JSON in HTML.

This is why, when I write an HTML sanitizer or hardened DOM
serializer, I try to make the output the intersection of HTML &
vanilla XML+namespaces.  (Incidentally, this prevents use of CDATA
sections, so serializers have included JS rewriters.)

At the risk of FUD though, XHTML-specific parsing branches might be
simpler but have been much less heavily tested and fuzzed, so it might
actually be easier to craft a buffer overflow to take over the
renderer for an origin that serves XHTML than one that serves HTML
exclusively.

The security of XHTML is not relevant though, because XHTML isn't used.

To anyone who is passionate about the benefits of making HTML more
XML-like, I would be happy to help with a proposal to the
content-security-policy team or similar body to add a switch that
tells the parser to halt as soon as it finds that the content is not
syntactically valid XML, to get the fail-safe benefits of XML.

Re: JSON.stringify </script>

Mark S. Miller-2
In reply to this post by Alexander Jones


On Thu, Sep 29, 2016 at 9:25 AM, Alexander Jones <[hidden email]> wrote:
> Maybe we should just make U+2028 and U+2029 valid in JS then? What other productions in JSON are invalid syntax in JS?

IIRC, Doug Crockford, possibly Mike Samuel, and I (and perhaps others) advocated such a change to EcmaScript back during the transition from ES3 to ES3.1/ES5. ES differed enough between platforms in other ways that, some of us felt, it would have been worth the experiment to see if we could get away with it -- without breaking the web. We were not able to convince people to engage in that experiment then. Such an experiment would be much more expensive now, with a much lower probability of success, and with a lower payoff. I don't see it happening.

 

On Thursday, 29 September 2016, Mike Samuel <[hidden email]> wrote:
On Thu, Sep 29, 2016 at 8:45 AM, Oriol Bugzilla
<[hidden email]> wrote:
>> ECMAScript, while highly used in web browsers, should really not care
>> about HTML constructs. That's where WHATWG and W3C come in. I suggest this
>> type of feature should come from one of those groups, not ECMA.
>
> That applies to escaping things like `</script>` or `]]>`, and I agree. But
> as Mike Samuel mentioned, JSON strings containing U+2028 or U+2029 are not
> valid JS expressions. I think it would make sense for `JSON.stringify` to
> escape these.

What is it that you're saying is not in TC-39's bailiwick?

Is it that w3c/whatwg should define what constitutes "embeddable JSON"?

Or is it that if it's worth defining a function that produces
embeddable JSON from an EcmaScript object, that w3c/whatwg should
include that in some set of EcmaScript APIs that it defines?

If you agree with my earlier claim
"""
We're talking about JSON serializers.  Every serializer produces
a subset of the output language. Choices about that sublanguage affect
how easy/hard it is to use that serializer with other tools.
"""
then it seems that TC-39 might take embeddability into account when
crafting the subset of JSON that JSON.stringify produces.


I agree that this issue belongs with TC39 much more than it belongs anywhere else. TC39's steering of JS is certainly influenced by how JS gets used in web browsers. When an issue touches both JS and browser-specific concerns, it can often be unclear whose "jurisdiction" it belongs in. This one is not unclear. It should be treated as a language issue by TC39.


 




--
    Cheers,
    --MarkM
