JSON.canonicalize()

Re: JSON.canonicalize()

Anders Rundgren-2
On 2018-03-16 20:09, Mike Samuel wrote:
>
>     Availability beats perfection anytime.  This is the VHS (if anybody remembers that old story) of canonicalization and I don't feel too bad about that :-)
>
>
> Perhaps.  Any thoughts on my question about the merits of "Hashable" vs "Canonical"?

No, there was so much noise here that I may need a more condensed description, if possible.

Anders

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss

Re: JSON.canonicalize()

Richard Gibson
In reply to this post by Mike Samuel
Though ECMAScript JSON.stringify may suffice for certain JavaScript-centric use cases or otherwise restricted subsets thereof as addressed by JOSE, it is not suitable for producing canonical/hashable/etc. JSON, which requires a fully general solution such as [1]. Both its number serialization [2] and string serialization [3] specify aspects that harm compatibility (the former having arbitrary branches dependent upon the value of numbers, the latter being capable of producing invalid UTF-8 octet sequences that represent unpaired surrogate code points, which is unacceptable for exchange outside of a closed ecosystem [4]). JSON is a general language-agnostic interchange format, and ECMAScript JSON.stringify is not a JSON canonicalization solution.
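
The value-dependent branches in number serialization can be seen with a minimal sketch like the one below. It assumes any ES2015+ engine; the variable names are illustrative only and do not come from the referenced specifications.

```javascript
// Number-to-string branching in JSON.stringify (inherited from Number::toString).
// Values below 1e21 are written out in full; 1e21 and above use exponent form.
const big = [1e20, 1e21].map((n) => JSON.stringify(n));
console.log(big); // [ '100000000000000000000', '1e+21' ]

// Small magnitudes switch to exponent form below 1e-6.
const small = [0.000001, 0.0000001].map((n) => JSON.stringify(n));
console.log(small); // [ '0.000001', '1e-7' ]

// Integral doubles lose their decimal point, so "1.0" and "1" serialize alike.
const integral = JSON.stringify(JSON.parse('1.0'));
console.log(integral); // 1
```

A canonicalizer written in another language would have to reproduce exactly these thresholds to arrive at the same bytes.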


On Fri, Mar 16, 2018 at 3:09 PM, Mike Samuel <[hidden email]> wrote:


On Fri, Mar 16, 2018 at 3:03 PM, Anders Rundgren <[hidden email]> wrote:
On 2018-03-16 19:51, Mike Samuel wrote:


On Fri, Mar 16, 2018 at 2:43 PM, Anders Rundgren <[hidden email] <mailto:[hidden email]>> wrote:

    On 2018-03-16 19:30, Mike Samuel wrote:

        2. Any numbers with minimal changes: dropping + signs, normalizing zeros,
              using a fixed threshold for scientific notation.
              PROS: supports whole JSON value-space
              CONS: less useful for hashing
              CONS: risks loss of precision when decoders decide based on presence of
                 decimal point whether to represent as double or int.


    Have you actually looked into the specification?
    https://cyberphone.github.io/doc/security/draft-rundgren-json-canonicalization-scheme.html#rfc.section.3.2.2 <https://cyberphone.github.io/doc/security/draft-rundgren-json-canonicalization-scheme.html#rfc.section.3.2.2>
    ES6 has all it takes.


Yes, but other notions of canonical equivalence have been mentioned here
so reasons to prefer one to another seem in scope.

Availability beats perfection anytime.  This is the VHS (if anybody remembers that old story) of canonicalization and I don't feel too bad about that :-)

Perhaps.  Any thoughts on my question about the merits of "Hashable" vs "Canonical"? 


Re: JSON.canonicalize()

Mike Samuel
In reply to this post by Anders Rundgren-2


On Fri, Mar 16, 2018 at 3:23 PM, Anders Rundgren <[hidden email]> wrote:
On 2018-03-16 20:09, Mike Samuel wrote:

    Availability beats perfection anytime.  This is the VHS (if anybody remembers that old story) of canonicalization and I don't feel too bad about that :-)


Perhaps.  Any thoughts on my question about the merits of "Hashable" vs "Canonical"?

No, there was so much noise here that I may need a more condensed description, if possible.

In the email to which you responded "Have you actually looked ..." look for "If that is correct, Would people be averse to marketing this as "hashable JSON" instead of "canonical JSON?""


Re: JSON.canonicalize()

Anders Rundgren-2
In reply to this post by Richard Gibson
On 2018-03-16 20:24, Richard Gibson wrote:
Though ECMAScript JSON.stringify may suffice for certain JavaScript-centric use cases or otherwise restricted subsets thereof as addressed by JOSE, it is not suitable for producing canonical/hashable/etc. JSON, which requires a fully general solution such as [1]. Both its number serialization [2] and string serialization [3] specify aspects that harm compatibility (the former having arbitrary branches dependent upon the value of numbers, the latter being capable of producing invalid UTF-8 octet sequences that represent unpaired surrogate code points, which is unacceptable for exchange outside of a closed ecosystem [4]). JSON is a general language-agnostic interchange format, and ECMAScript JSON.stringify is not a JSON canonicalization solution.

It effectively depends on your objectives.

#2 is not really a problem: you would typically not output canonicalized JSON. It is only used internally, since there is no requirement that input be canonicalized.
#3 Yes, if you create bad data you can [always] screw up.  BTW, it sounds like a bug which will presumably get fixed some day.
#4 If you are targeting Node.js, Browsers, OpenAPI, and all other platforms compatible with those, JSON.stringify() seems to suffice.

The JSON.canonicalize() method proposal was intended for the systems specified in #4.

Perfection is often the enemy of good.

Anders




Re: JSON.canonicalize()

C. Scott Ananian
On Fri, Mar 16, 2018 at 4:07 PM, Anders Rundgren <[hidden email]> wrote:
Perfection is often the enemy of good.

So, to be clear: you don't plan on actually incorporating any feedback into your proposal, since it's already "good"?
  --scott 



Re: JSON.canonicalize()

Mike Samuel


On Fri, Mar 16, 2018 at 4:34 PM, C. Scott Ananian <[hidden email]> wrote:
On Fri, Mar 16, 2018 at 4:07 PM, Anders Rundgren <[hidden email]> wrote:
Perfection is often the enemy of good.

So, to be clear: you don't plan on actually incorporating any feedback into your proposal, since it's already "good"?

To restate my main objections:

I think any proposal to offer an alternative stringify instead of a string->string transform is not very good
and could be easily improved by rephrasing it as a string->string transform.

Also, presenting this as a better wire format is, I think, misleading, since it has no advantages as a wire format over JSON.stringify's
output, and recommending canonical JSON, except for the short duration needed to hash it, creates more problems than it solves.
 
  --scott 



Re: JSON.canonicalize()

Anders Rundgren-2
On 2018-03-16 21:41, Mike Samuel wrote:

>
>
> On Fri, Mar 16, 2018 at 4:34 PM, C. Scott Ananian <[hidden email] <mailto:[hidden email]>> wrote:
>
>     On Fri, Mar 16, 2018 at 4:07 PM, Anders Rundgren <[hidden email] <mailto:[hidden email]>> wrote:
>
>         Perfection is often the enemy of good.
>
>
>     So, to be clear: you don't plan on actually incorporating any feedback into your proposal, since it's already "good"?

I'm not going to incorporate Unicode Normalization because it is better addressed at the application level.
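
As a minimal sketch of what handling normalization at the application level could look like (the variable names are illustrative, and `String.prototype.normalize` is the standard ES2015 method):

```javascript
// Two spellings of the same visible character:
// precomposed U+00E9 versus "e" followed by combining acute accent U+0301.
const composed = '\u00e9';
const decomposed = 'e\u0301';

// JSON.stringify leaves both untouched, so the serialized texts differ.
const sameText = JSON.stringify(composed) === JSON.stringify(decomposed);
console.log(sameText); // false

// An application that wants them to compare equal can normalize before signing.
const sameAfterNfc = composed.normalize('NFC') === decomposed.normalize('NFC');
console.log(sameAfterNfc); // true
```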


> To restate my main objections:
>
> I think any proposal to offer an alternative stringify instead of a string->string transform is not very good
> and could be easily improved by rephrasing it as a string->string transform.

Could you give a concrete example of that?


> Also, presenting this as a better wire format I think is misleading

This was not my intention, I just expressed it poorly.  It was rather mixed with my objection to Unicode Normalization.


> since I think it has no advantages as a wire format over JSON.stringify's output,

Right, JSON.stringify() is much better for creating the external format since it honors "creation order".


> and recommending canonical JSON, except for the short duration needed to hash it creates more problems than it solves.

Wrong, this is exactly what I had in mind.  If the hashable/canonicalizable method works as described (it does not?) it solves the hashing problem.

Anders

Re: JSON.canonicalize()

Mathias Bynens-2
In reply to this post by Mike Samuel
On Fri, Mar 16, 2018 at 9:04 PM, Mike Samuel <[hidden email]> wrote:

The output of JSON.canonicalize would also not be in the subset of JSON that is also a subset of JavaScript's PrimaryExpression.

   JSON.canonicalize(JSON.stringify("\u2028\u2029")) === `"\u2028\u2029"`

Soon U+2028 and U+2029 will no longer be edge cases. A Stage 3 proposal (currently shipping in Chrome) makes them valid in ECMAScript string literals, making JSON a strict subset of ECMAScript: https://github.com/tc39/proposal-json-superset 
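
A small sketch of the edge case, assuming nothing beyond standard `JSON` and `eval`; the last result depends on whether the engine ships the JSON superset change, so no output is claimed for it:

```javascript
// JSON.stringify does not escape U+2028 / U+2029: the output is two quotes
// plus the two raw code points.
const text = JSON.stringify('\u2028\u2029');
console.log(text.length); // 4

// JSON.parse accepts the raw characters inside a string.
const parsed = JSON.parse(text);
console.log(parsed === '\u2028\u2029'); // true

// Evaluating the same text as JavaScript source only works in engines that
// treat these code points as valid inside string literals; others throw.
let evaluated;
try {
  evaluated = (0, eval)(text) === '\u2028\u2029';
} catch (e) {
  evaluated = false;
}
console.log(evaluated);
```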


Re: JSON.canonicalize()

C. Scott Ananian
In reply to this post by Mike Samuel
My main feedback is that since this topic has been covered so many times in the past, any serious standardization proposal should include a section surveying existing "canonical JSON" standards and implementations and comparing the proposed standard with prior work.  A standard should be a "best of breed" implementation, which adequately replaces existing work, not just another average implementation narrowly tailored to the proposer's own particular use cases.

I don't think Unicode Normalization should necessarily be a requirement of a canonical JSON standard.  But any reasonable proposal should at least acknowledge the issues raised, as well as the issues of embedded nulls, HTML safety, and the other points that have been raised in this thread (and the many other points addressed by the dozen other "canonical JSON" implementations I linked to).  If you're just going to say, "my proposal is good enough", well then mine is "good enough" too, and so are the other dozen, and none of them need to be the "official JavaScript canonical form".  What's your compelling argument that your proposal is better than any of the other dozen?  And why start the discussion on this list if you're not going to do anything with the information you learn?
 --scott



Re: JSON.canonicalize()

Mike Samuel
In reply to this post by Anders Rundgren-2


On Fri, Mar 16, 2018, 4:58 PM Anders Rundgren <[hidden email]> wrote:
On 2018-03-16 21:41, Mike Samuel wrote:
>
>
> On Fri, Mar 16, 2018 at 4:34 PM, C. Scott Ananian <[hidden email] <mailto:[hidden email]>> wrote:
>
>     On Fri, Mar 16, 2018 at 4:07 PM, Anders Rundgren <[hidden email] <mailto:[hidden email]>> wrote:
>

> To restate my main objections:
>
> I think any proposal to offer an alternative stringify instead of a string->string transform is not very good
> and could be easily improved by rephrasing it as a string->string transform.

Could you give a concrete example of that?



I've given three.  As written, the proposal produces invalid or low quality output given (undefined, objects with toJSON methods, and symbols as either keys or values).  These would not be problems for a real canonicalizer since none are present in a string of JSON.

In addition, two distant users of the canonicalizer who wish to check hashes need to agree on the ancillary arguments like the replacer if canonicalize takes the same arguments and actually uses them.  They also need to agree on implementation details of toJSON methods which is a backward compatibility hazard.

If you did solve the toJSON problem by incorporating calls to that method, you've now complicated cross-platform behavior.  If you phrase it in terms of string->string, it is much easier to disentangle the definition of canonical JSON from JS and make it language agnostic.

Finally, your proposal is not the VHS of canonicalizers.  That would be x=>JSON.stringify(JSON.parse(x)) since it's deployed and used.
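
The three cases above, and the string->string alternative, can be illustrated with plain JSON.stringify. This is a sketch, not the proposal's code; `roundTrip` is just a local name for the x=>JSON.stringify(JSON.parse(x)) expression.

```javascript
// Inputs that exist in JavaScript but not in JSON text.
const noText = JSON.stringify(undefined);
console.log(noText); // undefined (the value, not a string)

// Symbol keys and undefined/symbol values are silently dropped.
const dropped = JSON.stringify({ a: undefined, b: Symbol('v'), [Symbol('k')]: 1 });
console.log(dropped); // {}

// toJSON decides the output, so both sides must agree on its implementation.
const viaToJson = JSON.stringify({ when: new Date(0) });
console.log(viaToJson); // {"when":"1970-01-01T00:00:00.000Z"}

// A string->string transform only ever sees JSON text, so none of the above
// can occur. Note that it normalizes whitespace and numbers but not key order.
const roundTrip = (jsonText) => JSON.stringify(JSON.parse(jsonText));
const normalized = roundTrip('{ "b" : 1.0, "a" : [ ] }');
console.log(normalized); // {"b":1,"a":[]}
```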


Summary of Input. Re: JSON.canonicalize()

Anders Rundgren-2
In reply to this post by C. Scott Ananian
Scott A:
https://en.wikipedia.org/wiki/Security_level
"For example, SHA-256 offers 128-bit collision resistance"
That is, the claims that there are cryptographic issues w.r.t. Unicode Normalization are (fortunately) incorrect.
Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.

Richard G:
Is the [highly involuntary] "inspiration" for the JSON.canonicalize() proposal:
https://www.ietf.org/mail-archive/web/json/current/msg04257.html
Why not fork your go library? Then there would be three implementations!

Mike S:
Wants to build a 2000+ line standalone JSON canonicalizer working on string data.
That's great, but I think it will be a hard sell getting these guys to accept the Pull Request:
https://developers.google.com/v8/
JSON.canonicalize(JSON.parse("json string data to be canonicalized")) would IMHO do the same job.
My (working) code example was only provided to show the principle as well as being able to test/verify.


On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools.  A single line did 99% of the job:
https://github.com/cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928

for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {


Other mentioned issues like HTML safety, embedded nulls etc. would apply to JSON.stringify() as well.
JSON.canonicalize() would inherit all the features (and weaknesses) of JSON.stringify().
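
The key-sorting idea behind the Java line above could be sketched in JavaScript roughly as follows. `canonicalizeSketch` is a hypothetical name, not the proposed JSON.canonicalize(), and the sketch assumes its input came from JSON.parse, so it contains no undefined values, symbols, or toJSON methods.

```javascript
// Hypothetical helper (not the proposed API): serialize a value obtained from
// JSON.parse with object keys sorted, reusing JSON.stringify for primitives.
function canonicalizeSketch(value) {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value); // numbers, strings, booleans, null
  }
  if (Array.isArray(value)) {
    // Array order is significant, so only the elements are canonicalized.
    return '[' + value.map((item) => canonicalizeSketch(item)).join(',') + ']';
  }
  // The default sort() compares keys by UTF-16 code units.
  const members = Object.keys(value).sort().map(
    (key) => JSON.stringify(key) + ':' + canonicalizeSketch(value[key])
  );
  return '{' + members.join(',') + '}';
}

const input = '{ "b": [3, { "z": 1, "a": 2 }], "a": 1e2 }';
const canonical = canonicalizeSketch(JSON.parse(input));
console.log(canonical); // {"a":100,"b":[3,{"a":2,"z":1}]}
```

Everything other than key order is delegated to JSON.stringify, which is why the result inherits its features and weaknesses.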


thanx,
Anders

Re: JSON.canonicalize()

kai zhu
In reply to this post by Mike Samuel
stepping aside from the security aspect, having your code-base’s json-files normalized with sorted-keys is good-housekeeping, especially when you want to sanely maintain ones >1mb in size (e.g. large swagger json-documentations) [1].

and you can easily operationalize your build-process / pre-commit-checks to auto-key-sort json-files with the following simple shell-function [2].

[1] https://github.com/kaizhu256/node-swgg-github-all/blob/2018.2.2/assets.swgg.swagger.json
[2] https://github.com/kaizhu256/node-utility2/blob/2018.1.13/lib.utility2.sh#L1513



```shell
#!/bin/sh
# .bashrc
: '
# to install, copy-paste the shell-function shFileJsonNormalize below
# into your shell startup script (.bashrc, .profile, etc...)


# example shell-usage:

source ~/.bashrc
printf "{
    \"version\": \"0.0.1\",
    \"name\": \"my-app\",
    \"aa\": {
        \"zz\": 1,
        \"yy\": {
            \"xx\": 2,
            \"ww\": 3
        }
    },
    \"bb\": [
        3,
        2,
        1,
        null
    ]
}" > package.json
shFileJsonNormalize package.json
cat package.json


# key-sorted output:
{
    "aa": {
        "yy": {
            "ww": 3,
            "xx": 2
        },
        "zz": 1
    },
    "bb": [
        3,
        2,
        1,
        null
    ],
    "name": "my-app",
    "version": "0.0.1"
}
'


shFileJsonNormalize() {(set -e
# this shell-function will
# 1. read the json-data from $FILE
# 2. normalize the json-data
# 3. write the normalized json-data back to $FILE
    FILE="$1"
    node -e "
// <script>
/*jslint
    bitwise: true,
    browser: true,
    maxerr: 8,
    maxlen: 100,
    node: true,
    nomen: true,
    regexp: true,
    stupid: true
*/
'use strict';
var local;
local = {};
local.fs = require('fs');
local.jsonStringifyOrdered = function (jsonObj, replacer, space) {
/*
 * this function will JSON.stringify the jsonObj,
 * with object-keys sorted and circular-references removed
 */
    var circularList, stringify, tmp;
    stringify = function (jsonObj) {
    /*
     * this function will recursively JSON.stringify the jsonObj,
     * with object-keys sorted and circular-references removed
     */
        // if jsonObj is an object, then recurse its items with object-keys sorted
        if (jsonObj &&
                typeof jsonObj === 'object' &&
                typeof jsonObj.toJSON !== 'function') {
            // ignore circular-reference
            if (circularList.indexOf(jsonObj) >= 0) {
                return;
            }
            circularList.push(jsonObj);
            // if jsonObj is an array, then recurse its jsonObjs
            if (Array.isArray(jsonObj)) {
                return '[' + jsonObj.map(function (jsonObj) {
                    // recurse
                    tmp = stringify(jsonObj);
                    return typeof tmp === 'string'
                        ? tmp
                        : 'null';
                }).join(',') + ']';
            }
            return '{' + Object.keys(jsonObj)
                // sort object-keys
                .sort()
                .map(function (key) {
                    // recurse
                    tmp = stringify(jsonObj[key]);
                    if (typeof tmp === 'string') {
                        return JSON.stringify(key) + ':' + tmp;
                    }
                })
                .filter(function (jsonObj) {
                    return typeof jsonObj === 'string';
                })
                .join(',') + '}';
        }
        // else JSON.stringify as normal
        return JSON.stringify(jsonObj);
    };
    circularList = [];
    return JSON.stringify(typeof jsonObj === 'object' && jsonObj
        // recurse
        ? JSON.parse(stringify(jsonObj))
        : jsonObj, replacer, space);
};
local.fs.writeFileSync(process.argv[1], local.jsonStringifyOrdered(
    JSON.parse(local.fs.readFileSync(process.argv[1], 'utf8')),
    null,
    4
) + '\n');
// </script>
    " "$FILE"
)}
```


Re: JSON.canonicalize()

Isiah Meadows-2
With files frequently that size, it might be worth considering whether
you should use a custom format+validator\* instead. It'd take a lot
less memory, which could be helpful since the first row alone of [this
file][1] takes about 4-5K in Firefox when deserialized - I verified
this in the console (To be exact, 5032 the first time, 4128 the
second, and 4416 the third). Also, a megabyte is a *lot* to send down
the wire in Web terms.

\* In this case, you'd need a validator that uses minimal perfect
hashes and a compact binary data representation that doesn't rely on a
concrete start/end. That would avoid the mess of constantly having to
look things up in memory, while leaving your IR much smaller. Another
item of note: JS strings are 16-bit, which is wasteful in memory for
your entire object.

[1]: https://raw.githubusercontent.com/kaizhu256/node-swgg-github-all/2018.2.2/assets.swgg.swagger.json

-----

Isiah Meadows
[hidden email]

Looking for web consulting? Or a new website?
Send me an email and we can get started.
www.isiahmeadows.com


On Fri, Mar 16, 2018 at 11:53 PM, kai zhu <[hidden email]> wrote:

> stepping aside from the security aspect, having your code-base’s json-files
> normalized with sorted-keys is good-housekeeping, especially when you want
> to sanely maintain ones >1mb in size (e.g. large swagger
> json-documentations) [1].
>
> and you can easily operationalize your build-process / pre-commit-checks to
> auto-key-sort json-files with the following simple shell-function [2].
>
> [1]
> https://github.com/kaizhu256/node-swgg-github-all/blob/2018.2.2/assets.swgg.swagger.json
> [2]
> https://github.com/kaizhu256/node-utility2/blob/2018.1.13/lib.utility2.sh#L1513

Browser version on-line. Re: JSON.canonicalize()

Anders Rundgren-2
In reply to this post by Anders Rundgren-2
F.Y.I.

https://cyberphone.github.io/doc/security/browser-json-canonicalization.html

thanx,
Anders

Hashable vs Canonicalizable. Re: JSON.canonicalize()

Anders Rundgren-2
In reply to this post by Mike Samuel
A "Hashable" format does not have to comply with the original; the only requirement is that it is reproducible.
However, I have difficulties coming up with a good argument for not sticking to the original.
If you stick to the original, then the terms Hashable and Canonicalizable become fully interchangeable.

I could, though, imagine representing "Number" as IEEE-754 8-byte binary blobs instead of a textual format, but the availability of a useful definition and implementation in ES6 makes this less appetizing.

Note that the availability of canonicalization DOES NOT mean that you MUST use it as the "wire format".

In my own applications [*], I do not intend to use "JSON.canonicalize()" except internally for crypto related operations.
Why is that?  Because it breaks the "natural order" provided by JSON.stringify().
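
A minimal sketch of that split between wire format and internal hashing, assuming Node.js and its built-in `crypto` module. `sortedStringify` and `sha256Hex` are hypothetical helpers written for this illustration; they are not the proposed JSON.canonicalize().

```javascript
const nodeCrypto = require('crypto');

// Hypothetical helper: serialize JSON-compatible data with sorted object keys.
function sortedStringify(value) {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return '[' + value.map((item) => sortedStringify(item)).join(',') + ']';
  }
  return '{' + Object.keys(value).sort().map(
    (key) => JSON.stringify(key) + ':' + sortedStringify(value[key])
  ).join(',') + '}';
}

// Hypothetical helper: hex SHA-256 of the UTF-8 encoding of a string.
function sha256Hex(text) {
  return nodeCrypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// The same data built in two different creation orders.
const sent = { b: 1, a: 2 };
const rebuilt = { a: 2, b: 1 };

// On the wire, JSON.stringify keeps each creation order, so the texts differ.
const wireDiffers = JSON.stringify(sent) !== JSON.stringify(rebuilt);
console.log(wireDiffers); // true

// Internally, hashing the key-sorted form gives the same digest for both.
const hashesMatch =
  sha256Hex(sortedStringify(sent)) === sha256Hex(sortedStringify(rebuilt));
console.log(hashesMatch); // true
```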

thanx,
Anders

*] https://cyberphone.github.io/doc/saturn/

Re: Summary of Input. Re: JSON.canonicalize()

Mike Samuel
In reply to this post by Anders Rundgren-2


On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren <[hidden email]> wrote:
> Scott A:
> https://en.wikipedia.org/wiki/Security_level
> "For example, SHA-256 offers 128-bit collision resistance"
> That is, the claims that there are cryptographic issues w.r.t. to Unicode Normalization are (fortunately) incorrect.
> Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.
>
> Richard G:
> Is the [highly involuntary] "inspiration" to the JSON.canonicalize() proposal:
> https://www.ietf.org/mail-archive/web/json/current/msg04257.html
> Why not fork your go library? Then there would be three implementations!
>
> Mike S:
> Wants to build a 2000+ line standalone JSON canonicalizer working on string data.
> That's great but I think that it will be a hard sell getting these guys accept the Pull Request:
> https://developers.google.com/v8/
> JSON.canonicalize(JSON.parse("json string data to be canonicalized")) would IMHO do the same job.
> My (working) code example was only provided to show the principle as well as being able to test/verify.

I don't know where you get the 2000+ line number.
https://gist.github.com/mikesamuel/20710f94a53e440691f04bf79bc3d756 comes in at 80 lines.
That's roughly twice as long as your demonstrably broken example code, but far shorter than the number you provided.

If you're being hyperbolic, please stop.
If that was a genuine guesstimate, but you just happened to be off by a factor of 25, then I have less confidence that
you can weigh the design complexity tradeoffs when comparing yours to other proposals.


> On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools.  A single line did 99% of the job:
> https://github.com/cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928
> for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {
>
>
> Other mentioned issues like HTML safety, embedded nulls etc. would apply to JSON.stringify() as well.
> JSON.canonicalize() would inherit all the features (and weaknesses) of JSON.stringify().

Please, when you attribute a summary to me, don't ignore the summary that I myself wrote of my arguments.

You're ignoring the context.  JSON.canonicalize is not generally useful because it undoes safety precautions.
That tied into one argument of mine that you left out: JSON.canonicalize is not generally useful.  It should probably not
be used as a wire or storage format, and is entirely unsuitable for embedding into other commonly used web application
languages.

You also make no mention of backwards compatibility concerns when this depends on things like toJSON, which is hugely important
when dealing with long lived hashes.
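
The toJSON concern can be sketched as follows. Anything layered on JSON.stringify() calls an object's toJSON method before serializing it, so a hash over the output depends on whatever toJSON did at the time. The `Amount` class and its two toJSON versions are hypothetical, invented only for this illustration.

```javascript
// Hypothetical record type, used only for illustration.
class Amount {
  constructor(cents) {
    this.cents = cents;
  }
  // "Version 1" of an imagined library serializes as a plain number.
  toJSON() {
    return this.cents;
  }
}

const before = JSON.stringify({ total: new Amount(1999) });

// "Version 2" of the same imagined library switches to a labelled string.
Amount.prototype.toJSON = function () {
  return String(this.cents) + ' cents';
};

const after = JSON.stringify({ total: new Amount(1999) });

// Same logical data, different serialized bytes, hence different hashes.
console.log(before); // {"total":1999}
console.log(after);  // {"total":"1999 cents"}
```

A hash stored while version 1 was in use would no longer match after the upgrade, even though the data itself is unchanged.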

When I see that you've summarized my own thoughts incorrectly, even though I provided you with a summary of my own arguments,
I lose confidence that you've correctly summarized others' positions.



Re: Summary of Input. Re: JSON.canonicalize()

Mike Samuel
In reply to this post by Anders Rundgren-2


On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren <[hidden email]> wrote:


> On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools.  A single line did 99% of the job:
> https://github.com/cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928
>
> for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {

If this is what you want then can't you just use a replacer to substitute a record with sorted keys?

JSON.canonicalize = (value) => JSON.stringify(value, (_, value) => {
  if (value && typeof value === 'object' && !Array.isArray(value)) {
    const withSortedKeys = {}
    const keys = Object.getOwnPropertyNames(value)
    keys.sort()
    keys.forEach(key => withSortedKeys[key] = value[key])
    value = withSortedKeys
  }
  return value
})
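
A usage sketch of the replacer idea above, wrapped in a standalone function so that the JSON global is not patched. The name `canonicalizeSketch` is hypothetical, and Object.keys is substituted for Object.getOwnPropertyNames on the assumption that non-enumerable properties, which JSON.stringify() skips, should stay skipped.

```javascript
// Hypothetical standalone variant of the replacer-based approach.
function canonicalizeSketch(value) {
  return JSON.stringify(value, (_key, val) => {
    // Rebuild plain objects with their keys in sorted order;
    // arrays and primitives pass through unchanged.
    if (val && typeof val === 'object' && !Array.isArray(val)) {
      const sorted = {};
      for (const k of Object.keys(val).sort()) {
        sorted[k] = val[k];
      }
      return sorted;
    }
    return val;
  });
}

// Two structurally equal values built in different key orders.
const a = canonicalizeSketch({ b: 1, a: { d: [3, { z: 0, y: 1 }], c: 2 } });
const b = canonicalizeSketch({ a: { c: 2, d: [3, { y: 1, z: 0 }] }, b: 1 });

console.log(a); // {"a":{"c":2,"d":[3,{"y":1,"z":0}]},"b":1}
console.log(a === b); // true
```

The replacer runs on nested objects and on array elements too, which is why the inner records come out sorted as well.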


Re: Summary of Input. Re: JSON.canonicalize()

Anders Rundgren-2
In reply to this post by Mike Samuel
Hi Guys,

Pardon me if you think I was hyperbolic.
The discussion got derailed by the bogus claims about hash functions' vulnerability.

F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF.
https://www.ietf.org/mail-archive/web/jose/current/msg05716.html

This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE.
If this in your opinion is a bad idea, now is the right time to shoot it down :-)

This effort also exploits the ability of JSON.parse() and JSON.stringify() to honor object "Creation Order".

JSON.canonicalize() would be a "Sorting" alternative to "Creation Order", offering certain advantages, the most important being that deployment impact is limited to JSON serializers.
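
A small sketch of what "Creation Order" means in practice. For ordinary string keys a parse/stringify round trip keeps the order in which properties were created. Keys that look like array indices are an exception in ECMAScript objects: they are enumerated first, in ascending numeric order. The variable names are illustrative only.

```javascript
// Ordinary string keys: the round trip preserves the original order.
const roundTrip = JSON.stringify(JSON.parse('{"b":1,"a":2}'));

// Integer-like keys are enumerated first, in ascending numeric order,
// regardless of where they appeared in the input text.
const withIndexKeys = JSON.stringify(JSON.parse('{"b":1,"10":2,"9":3}'));

console.log(roundTrip);     // {"b":1,"a":2}
console.log(withIndexKeys); // {"9":3,"10":2,"b":1}
```

Any scheme that relies on creation order therefore has to account for integer-like property names.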

The ["completely broken"] sample code was only submitted as a proof-of-concept. I'm sure you JS gurus can do this way better than I :-)

Creating an alternative based on [1,2,3] seems like a rather daunting task.

Thanx,
Anders
https://github.com/cyberphone/json-canonicalization

1] http://wiki.laptop.org/go/Canonical_JSON
2] https://gibson042.github.io/canonicaljson-spec/
3] https://gist.github.com/mikesamuel/20710f94a53e440691f04bf79bc3d756

On 2018-03-17 22:29, Mike Samuel wrote:

>
>
> On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren <[hidden email]> wrote:
>
>     Scott A:
>     https://en.wikipedia.org/wiki/Security_level
>     "For example, SHA-256 offers 128-bit collision resistance"
>     That is, the claims that there are cryptographic issues w.r.t. to Unicode Normalization are (fortunately) incorrect.
>     Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.
>
>     Richard G:
>     Is the [highly involuntary] "inspiration" to the JSON.canonicalize() proposal:
>     https://www.ietf.org/mail-archive/web/json/current/msg04257.html
>     Why not fork your go library? Then there would be three implementations!
>
>     Mike S:
>     Wants to build a 2000+ line standalone JSON canonicalizer working on string data.
>     That's great but I think that it will be a hard sell getting these guys accept the Pull Request:
>     https://developers.google.com/v8/
>     JSON.canonicalize(JSON.parse("json string data to be canonicalized")) would IMHO do the same job.
>     My (working) code example was only provided to show the principle as well as being able to test/verify.
>
>
> I don't know where you get the 2000+ line number.
> https://gist.github.com/mikesamuel/20710f94a53e440691f04bf79bc3d756 comes in at 80 lines.
> That's roughly twice as long as your demonstrably broken example code, but far shorter than the number you provided.
>
> If you're being hyperbolic, please stop.
> If that was a genuine guesstimate, but you just happened to be off by a factor of 25, then I have less confidence that
> you can weigh the design complexity tradeoffs when comparing your's to other proposals.
>
>
>     On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools.  A single line did 99% of the job:
>     https://github.com/cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928
>
>     for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {
>
>
>     Other mentioned issues like HTML safety, embedded nulls etc. would apply to JSON.stringify() as well.
>     JSON.canonicalize() would inherit all the features (and weaknesses) of JSON.stringify().
>
>
> Please, when you attribute a summary to me, don't ignore the summary that I myself wrote of my arguments.
>
> You're ignoring the context.  JSON.canonicalize is not generally useful because it undoes safety precautions.
> That tied into one argument of mine that you left out: JSON.canonicalize is not generally useful.  It should probably not
> be used as a wire or storage format, and is entirely unsuitable for embedding into other commonly used web application
> languages.
>
> You also make no mention of backwards compatibility concerns when this depends on things like toJSON, which is hugely important
> when dealing with long lived hashes.
>
> When I see that you've summarized my own thoughts incorrectly, even though I provided you with a summary of my own arguments,
> I lose confidence that you've correctly summarized other's positions.
>


JSON.canonicalize()

Richard Gibson
In reply to this post by Richard Gibson
On Sunday, March 18, 2018, Anders Rundgren <[hidden email]> wrote:
> On 2018-03-16 20:24, Richard Gibson wrote:
>> Though ECMAScript JSON.stringify may suffice for certain Javascript-centric use cases or otherwise restricted subsets thereof as addressed by JOSE, it is not suitable for producing canonical/hashable/etc. JSON, which requires a fully general solution such as [1]. Both its number serialization [2] and string serialization [3] specify aspects that harm compatibility (the former having arbitrary branches dependent upon the value of numbers, the latter being capable of producing invalid UTF-8 octet sequences that represent unpaired surrogate code points—unacceptable for exchange outside of a closed ecosystem [4]). JSON is a general language-agnostic interchange format, and ECMAScript JSON.stringify is not a JSON canonicalization solution.


> Richard, I may be wrong but AFAICT, our respective canonicalization schemes are in fact principally IDENTICAL.

In that they have the same goal, yes. In that they both achieve that goal, no. I'm not married to choices like exponential notation and uppercase escapes, but a JSON canonicalization scheme MUST cover all of JSON.
 
> That the number serialization provided by JSON.stringify() is unacceptable, is not generally taken as a fact.  I also think it looks a bit weird, but that's just a matter of esthetics.  Compatibility is an entirely different issue.

I concede this point. The modified algorithm is sufficient, but note that a canonicalization scheme will remain static even if ECMAScript changes.
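
For readers unfamiliar with the value-dependent branches mentioned earlier in this exchange, a short sketch of how ECMAScript number serialization switches between plain decimal and exponential notation depending on magnitude. The sample values are chosen for illustration.

```javascript
// Values on either side of the thresholds where the notation changes.
const samples = [1e20, 1e21, 0.000001, 0.0000001, 1.0, -0];
const serialized = samples.map((n) => JSON.stringify(n));

console.log(serialized);
// [ '100000000000000000000', '1e+21', '0.000001', '1e-7', '1', '0' ]
```

A canonicalization scheme that adopts this algorithm has to reproduce the same thresholds in every other language that implements it.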

> Sorting on Unicode Code Points is of course "technically 100% right" but strictly put not necessary.

Certain scenarios call for different systems to _independently_ generate equivalent data structures, and it is a necessary property of canonical serialization that it yields identical results for equivalent data structures. JSON does not specify significance of object member ordering, so member ordering does not distinguish otherwise equivalent objects, so canonicalization MUST specify member ordering that is deterministic with respect to all valid data.
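
Why the ordering rule needs to be spelled out can be seen in a small sketch: JavaScript's default sort compares UTF-16 code units, which disagrees with code point order once characters outside the Basic Multilingual Plane are involved. The comparator `compareByCodePoint` is a hypothetical helper written for this example.

```javascript
// U+FF61 is a single UTF-16 code unit; U+1F600 is a surrogate pair
// whose lead surrogate is 0xD83D.
const keys = ['\uFF61', '\u{1F600}'];

// Default sort compares UTF-16 code units: 0xD83D < 0xFF61.
const byCodeUnit = [...keys].sort();

// Hypothetical comparator: compare strings code point by code point.
function compareByCodePoint(x, y) {
  const xs = Array.from(x, (c) => c.codePointAt(0));
  const ys = Array.from(y, (c) => c.codePointAt(0));
  const n = Math.min(xs.length, ys.length);
  for (let i = 0; i < n; i++) {
    if (xs[i] !== ys[i]) return xs[i] - ys[i];
  }
  return xs.length - ys.length;
}

// Code point order: 0xFF61 < 0x1F600.
const byCodePoint = [...keys].sort(compareByCodePoint);

console.log(byCodeUnit[0] === '\u{1F600}');  // true
console.log(byCodePoint[0] === '\uFF61');    // true
```

Either order is deterministic; what matters for independent implementations is that the specification picks one and states it.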

> Your claim about uppercase Unicode escapes is incorrect, there is no such requirement:
 
I don't recall ever making a claim about uppercase Unicode escapes, other than observing that it is the preferred form for examples in the JSON RFCs... what are you talking about?


Re: Summary of Input. Re: JSON.canonicalize()

Mike Samuel
In reply to this post by Anders Rundgren-2


On Sun, Mar 18, 2018 at 2:14 AM, Anders Rundgren <[hidden email]> wrote:
> Hi Guys,
>
> Pardon me if you think I was hyperbolic,
> The discussion got derailed by the bogus claims about hash functions' vulnerability.

I didn't say I "think" you were being hyperbolic.  I asked whether you were.

You asserted a number that seemed high to me.
I demonstrated it was high by a factor of at least 25 by showing an implementation that
used 80 lines instead of the 2000 you said was required.

If you're going to put out a number as a reason to dismiss an argument, you should own it
or retract it.
Were you being hyperbolic?  (Y/N)

Your claim and my counterclaim are in no way linked to hash function vulnerability.
I never weighed in on that claim and have already granted that hashable JSON is a
worthwhile use case.

 
> F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF.
> https://www.ietf.org/mail-archive/web/jose/current/msg05716.html
>
> This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE.
> If this is in your opinion is a bad idea, now is the right time to shoot it down :-)

Does this main author prefer your particular JSON canonicalization scheme to
others?
Is this an informed opinion based on flaws in the others that make them less suitable for
JOSE's needs that are not present in the scheme you back?

If so, please provide links to their reasoning.
If not, how is their backing relevant?

 
> This efforts also exploits the ability of JSON.parse() and JSON.stringify() honoring object "Creation Order".
>
> JSON.canonicalize() would be a "Sorting" alternative to "Creation Order" offering certain advantages with limiting deployment impact to JSON serializers as the most important one.
>
> The ["completely broken"] sample code was only submitted as a proof-of-concept. I'm sure you JS gurus can do this way better than I :-)

This is a misquote.  No-one has said your sample code was completely broken.
Neither your sample code nor the spec deals with toJSON.  At some point you're
going to have to address that if you want to keep your proposal moving forward.
No amount of JS guru-ry is going to save your sample code from a specification bug.

 
> Creating an alternative based on [1,2,3] seems like a rather daunting task.

Maybe if you spend more time laying out the criteria on which a successful proposal
should be judged, we could move towards consensus on this claim.

As it is, I have only your say-so, and I have reason to doubt your evaluation
of task complexity unless you were being hyperbolic before.

