Array#sort() implementations not interoperable

classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Array#sort() implementations not interoperable

Jussi Kalliokoski
Hello everyone,

Reposting, I think my previous attempt got stuck in a filter or something, because I somehow managed to have the code there in several copies.

I was thinking about sorting algorithms yesterday and I realized that ES implementations may have different sorting algorithms in use, and decided to try it out. Now, if you sort strings or numbers, it doesn't matter, but you may be sorting objects by a key and this is where things get nasty (think non-deterministic vs deterministic). Here's an example:

function shuffle (arr, depth) {
var tmp = []

var pi = String(Math.PI).substr(2)
var i = 0
var p = 0

while (arr.length) {
i = (i + +pi[p]) % arr.length
p = (p + 1) % pi.length
tmp.push(arr[i])
arr.splice(i, 1)
}

if (!depth) return tmp

return shuffle(tmp, depth - 1)
}

var unique = 'abcdefghijklmnopqrstu'
var sorter = 'deggeaeasemiuololizor'

var arr = Array.apply(null, Array(unique.length)).map(function (a, i) {
return {
unique: unique[i],
sorter: sorter.charCodeAt(i)
}
})

var original = shuffle(arr, 3)
var sorted = original.slice().sort(function (a, b) {
return a.sorter - b.sorter
})

console.log(original.map(function (item) { return item.unique }))
console.log(sorted.map(function (item) { return item.unique }))
function shuffle (arr, depth) {
  /* it's a silly way to shuffle, but at least it's deterministic. */
  var tmp = []
 
  var pi = String(Math.PI).substr(2)
  var i = 0
  var p = 0
 
  while (arr.length) {
    i = (i + +pi[p]) % arr.length
    p = (p + 1) % pi.length
    tmp.push(arr[i])
    arr.splice(i, 1)
  }
 
  if (!depth) return tmp
 
  return shuffle(tmp, depth - 1)
}

var unique = 'abcdefghijklmnopqrstu'
var sorter = 'deggeaeasemiuololizor'

var arr = Array.apply(null, Array(unique.length)).map(
function (a, i) {
  return {
    unique: unique[i],
    sorter: sorter.charCodeAt(i)
  }
})

var original = shuffle(arr, 3)
var sorted = original.slice().sort(function (a, b) {
  return a.sorter - b.sorter
})

console.log(original.map(function (item) { return item.unique }))
console.log(sorted.map(function (item) { return item.unique }))

In Firefox, you get:

["s", "m", "q", "l", "e", "b", "k", "i", "f", "g", "o", "j", "d", "t", "n", "c", "a", "p", "h", "r", "u"]
["f", "h", "a", "e", "b", "g", "j", "d", "c", "l", "r", "q", "o", "k", "t", "n", "p", "u", "i", "m", "s"]

In Chrome, you get:

["s", "m", "q", "l", "e", "b", "k", "i", "f", "g", "o", "j", "d", "t", "n", "c", "a", "p", "h", "r", "u"]
["f", "h", "a", "g", "e", "b", "j", "d", "c", "l", "r", "o", "q", "k", "n", "t", "p", "u", "i", "m", "s"]

Real world consequences of this may include:

 * A blog where posts are sorted by date ("YYYY/MM/DD"). Different browsers will show the posts in different order if Array#sort is used to accomplish this. Not a very severe consequence.
 * A spreadsheet application. If it has some order-dependent algorithm to calculate values, different browsers can give different results for the same research data.

Now I'm not sure what could be done to this, if anything even should be, just thought I'd bring it up.

Cheers,
Jussi

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Brendan Eich-2
Jussi Kalliokoski wrote:
> Hello everyone,
>
> Reposting, I think my previous attempt got stuck in a filter or
> something, because I somehow managed to have the code there in several
> copies.

You have three messages total on this topic at

https://mail.mozilla.org/pipermail/es-discuss/2012-December/


> I was thinking about sorting algorithms yesterday and I realized that
> ES implementations may have different sorting algorithms in use, and
> decided to try it out. Now, if you sort strings or numbers, it doesn't
> matter, but you may be sorting objects by a key and this is where
> things get nasty (think non-deterministic vs deterministic).

Have you read the language dating from ES3 on Array sort in the spec? In
particular Array#sort is not guaranteed to be stable. Perhaps it should be.

/be

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Jussi Kalliokoski
On Mon, Dec 3, 2012 at 8:46 PM, Brendan Eich <[hidden email]> wrote:
You have three messages total on this topic at

https://mail.mozilla.org/pipermail/es-discuss/2012-December/

 
Oh, sorry about the noise, I should've checked the archives!
 
Have you read the language dating from ES3 on Array sort in the spec? In particular Array#sort is not guaranteed to be stable. Perhaps it should be

Yes, I have, that's why I actually thought about trying this out it. I'm not sure if we should change it, but it'd be interesting to have the conversation, to see if there are any real world use cases that may benefit from doing so. Otherwise, I don't see a reason to change it. "Don't fix it if it ain't broke".

Cheers,
Jussi

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Fedor Indutny
In reply to this post by Brendan Eich-2
It's abort stability, and I think it's better to keep it un-stable for performance performance.



On Mon, Dec 3, 2012 at 10:46 PM, Brendan Eich <[hidden email]> wrote:
Jussi Kalliokoski wrote:
Hello everyone,

Reposting, I think my previous attempt got stuck in a filter or something, because I somehow managed to have the code there in several copies.

You have three messages total on this topic at

https://mail.mozilla.org/pipermail/es-discuss/2012-December/



I was thinking about sorting algorithms yesterday and I realized that ES implementations may have different sorting algorithms in use, and decided to try it out. Now, if you sort strings or numbers, it doesn't matter, but you may be sorting objects by a key and this is where things get nasty (think non-deterministic vs deterministic).

Have you read the language dating from ES3 on Array sort in the spec? In particular Array#sort is not guaranteed to be stable. Perhaps it should be.

/be

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Norbert Lindenberg-3
I haven't looked into sort algorithms in a while - how much slower are the fastest stable ones than the fastest non-stable ones?

I ran into the stability issue recently when implementing a function to interpret HTTP Accept-Language headers [1]. The language tags in these headers can have quality values, but can also omit them, in which case they're assumed to be 1. In normal processing you want to get rid of the quality values and just have an ordered list. To get to the ordered list, you have to sort by quality value, but not change the order of tags with the same quality value, such as all the ones with the default 1.0.

A workaround to ensure stability is to consider the original index of each array element in the comparison function, but a sort function that's guaranteed to be stable would be easier to use.

Norbert

[1] http://tools.ietf.org/html/rfc2616#section-14.4


On Dec 3, 2012, at 11:00 , Fedor Indutny wrote:

> It's abort stability, and I think it's better to keep it un-stable for performance performance.
>
>
>
> On Mon, Dec 3, 2012 at 10:46 PM, Brendan Eich <[hidden email]> wrote:
> Jussi Kalliokoski wrote:
> Hello everyone,
>
> Reposting, I think my previous attempt got stuck in a filter or something, because I somehow managed to have the code there in several copies.
>
> You have three messages total on this topic at
>
> https://mail.mozilla.org/pipermail/es-discuss/2012-December/
>
>
>
> I was thinking about sorting algorithms yesterday and I realized that ES implementations may have different sorting algorithms in use, and decided to try it out. Now, if you sort strings or numbers, it doesn't matter, but you may be sorting objects by a key and this is where things get nasty (think non-deterministic vs deterministic).
>
> Have you read the language dating from ES3 on Array sort in the spec? In particular Array#sort is not guaranteed to be stable. Perhaps it should be.
>
> /be
>
> _______________________________________________
> es-discuss mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/es-discuss
>
> _______________________________________________
> es-discuss mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/es-discuss

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Mathias Bynens-2
Bumping this old thread since V8 issue #90 (https://code.google.com/p/v8/issues/detail?id=90) has been getting lots of comments lately.

It appears that unstable sort, while perfectly spec-compliant, doesn’t match user expectations. It doesn’t help that some browsers/engines _do_ use a stable sorting algorithm, while others don’t — which surprises people and occasionally breaks (badly-written, but hey) code. (See the thread I linked to for examples.) Then, there’s V8, which uses stable sort for small arrays with 10 or fewer elements, but an unstable sorting algorithm for larger arrays, causing even more confusion.

Here’s a test case that tests arrays of varying sizes: http://ofb.net/~sethml/is-sort-stable.html The results in different browsers are listed, too.

IMHO it would be nice if ES would require a stable sorting algorithm: it would match user expectations, cause fewer issues in existing code, and improve operability in general.

What would be the best way to make TC39 consider this change?

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Kevin Gadd
Even if stable sorts don't get required, it would make sense to require that a given implementation is either always stable or always not stable. The current situation with V8 seems likely to result in subtly broken software shipping to the web, where it works in testing environments with small amounts of data and then breaks in the wild only on certain browsers and only if you have a certain amount of data. Yuck.

-kg


On Thu, Jun 13, 2013 at 6:05 AM, Mathias Bynens <[hidden email]> wrote:
Bumping this old thread since V8 issue #90 (https://code.google.com/p/v8/issues/detail?id=90) has been getting lots of comments lately.

It appears that unstable sort, while perfectly spec-compliant, doesn’t match user expectations. It doesn’t help that some browsers/engines _do_ use a stable sorting algorithm, while others don’t — which surprises people and occasionally breaks (badly-written, but hey) code. (See the thread I linked to for examples.) Then, there’s V8, which uses stable sort for small arrays with 10 or fewer elements, but an unstable sorting algorithm for larger arrays, causing even more confusion.

Here’s a test case that tests arrays of varying sizes: http://ofb.net/~sethml/is-sort-stable.html The results in different browsers are listed, too.

IMHO it would be nice if ES would require a stable sorting algorithm: it would match user expectations, cause fewer issues in existing code, and improve operability in general.

What would be the best way to make TC39 consider this change?

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Mark S. Miller-2
On Thu, Jun 13, 2013 at 7:01 AM, Kevin Gadd <[hidden email]> wrote:
Even if stable sorts don't get required, it would make sense to require that a given implementation is either always stable or always not stable.

How would such a requirement differ from the status quo? Doesn't the current v8 impl satisfy it, since a sort that happens to be stable still meets the requirements of an unstable sort? What does "always not stable" mean?


 
The current situation with V8 seems likely to result in subtly broken software shipping to the web, where it works in testing environments with small amounts of data and then breaks in the wild only on certain browsers and only if you have a certain amount of data. Yuck.

-kg


On Thu, Jun 13, 2013 at 6:05 AM, Mathias Bynens <[hidden email]> wrote:
Bumping this old thread since V8 issue #90 (https://code.google.com/p/v8/issues/detail?id=90) has been getting lots of comments lately.

It appears that unstable sort, while perfectly spec-compliant, doesn’t match user expectations. It doesn’t help that some browsers/engines _do_ use a stable sorting algorithm, while others don’t — which surprises people and occasionally breaks (badly-written, but hey) code. (See the thread I linked to for examples.) Then, there’s V8, which uses stable sort for small arrays with 10 or fewer elements, but an unstable sorting algorithm for larger arrays, causing even more confusion.

Here’s a test case that tests arrays of varying sizes: http://ofb.net/~sethml/is-sort-stable.html The results in different browsers are listed, too.

IMHO it would be nice if ES would require a stable sorting algorithm: it would match user expectations, cause fewer issues in existing code, and improve operability in general.

What would be the best way to make TC39 consider this change?

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss




--
    Cheers,
    --MarkM

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Kevin Gadd
I don't really care about the precise language or semantics. I just don't want applications to break in the wild because an Array.sort implementation's stability changes based on the number of elements. That feels like a much easier problem to solve than the problem of some browsers being unstable and some being stable. This is absolutely the sort of thing that would bite me as a JS dev and will bite every person who uses my compiler to convert an application. Why would they test with both small and large element counts?

-kg


On Thu, Jun 13, 2013 at 8:16 AM, Mark S. Miller <[hidden email]> wrote:
On Thu, Jun 13, 2013 at 7:01 AM, Kevin Gadd <[hidden email]> wrote:
Even if stable sorts don't get required, it would make sense to require that a given implementation is either always stable or always not stable.

How would such a requirement differ from the status quo? Doesn't the current v8 impl satisfy it, since a sort that happens to be stable still meets the requirements of an unstable sort? What does "always not stable" mean?


 
The current situation with V8 seems likely to result in subtly broken software shipping to the web, where it works in testing environments with small amounts of data and then breaks in the wild only on certain browsers and only if you have a certain amount of data. Yuck.

-kg


On Thu, Jun 13, 2013 at 6:05 AM, Mathias Bynens <[hidden email]> wrote:
Bumping this old thread since V8 issue #90 (https://code.google.com/p/v8/issues/detail?id=90) has been getting lots of comments lately.

It appears that unstable sort, while perfectly spec-compliant, doesn’t match user expectations. It doesn’t help that some browsers/engines _do_ use a stable sorting algorithm, while others don’t — which surprises people and occasionally breaks (badly-written, but hey) code. (See the thread I linked to for examples.) Then, there’s V8, which uses stable sort for small arrays with 10 or fewer elements, but an unstable sorting algorithm for larger arrays, causing even more confusion.

Here’s a test case that tests arrays of varying sizes: http://ofb.net/~sethml/is-sort-stable.html The results in different browsers are listed, too.

IMHO it would be nice if ES would require a stable sorting algorithm: it would match user expectations, cause fewer issues in existing code, and improve operability in general.

What would be the best way to make TC39 consider this change?

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss




--
    Cheers,
    --MarkM


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Mark S. Miller-2
How would any test verify that a sort is unstable?


On Thu, Jun 13, 2013 at 8:56 AM, Kevin Gadd <[hidden email]> wrote:
I don't really care about the precise language or semantics. I just don't want applications to break in the wild because an Array.sort implementation's stability changes based on the number of elements. That feels like a much easier problem to solve than the problem of some browsers being unstable and some being stable. This is absolutely the sort of thing that would bite me as a JS dev and will bite every person who uses my compiler to convert an application. Why would they test with both small and large element counts?

-kg


On Thu, Jun 13, 2013 at 8:16 AM, Mark S. Miller <[hidden email]> wrote:
On Thu, Jun 13, 2013 at 7:01 AM, Kevin Gadd <[hidden email]> wrote:
Even if stable sorts don't get required, it would make sense to require that a given implementation is either always stable or always not stable.

How would such a requirement differ from the status quo? Doesn't the current v8 impl satisfy it, since a sort that happens to be stable still meets the requirements of an unstable sort? What does "always not stable" mean?


 
The current situation with V8 seems likely to result in subtly broken software shipping to the web, where it works in testing environments with small amounts of data and then breaks in the wild only on certain browsers and only if you have a certain amount of data. Yuck.

-kg


On Thu, Jun 13, 2013 at 6:05 AM, Mathias Bynens <[hidden email]> wrote:
Bumping this old thread since V8 issue #90 (https://code.google.com/p/v8/issues/detail?id=90) has been getting lots of comments lately.

It appears that unstable sort, while perfectly spec-compliant, doesn’t match user expectations. It doesn’t help that some browsers/engines _do_ use a stable sorting algorithm, while others don’t — which surprises people and occasionally breaks (badly-written, but hey) code. (See the thread I linked to for examples.) Then, there’s V8, which uses stable sort for small arrays with 10 or fewer elements, but an unstable sorting algorithm for larger arrays, causing even more confusion.

Here’s a test case that tests arrays of varying sizes: http://ofb.net/~sethml/is-sort-stable.html The results in different browsers are listed, too.

IMHO it would be nice if ES would require a stable sorting algorithm: it would match user expectations, cause fewer issues in existing code, and improve operability in general.

What would be the best way to make TC39 consider this change?

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss




--
    Cheers,
    --MarkM




--
    Cheers,
    --MarkM

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Andreas Rossberg-4
In reply to this post by Kevin Gadd
On 13 June 2013 17:56, Kevin Gadd <[hidden email]> wrote:
> I don't really care about the precise language or semantics. I just don't
> want applications to break in the wild because an Array.sort
> implementation's stability changes based on the number of elements.

It does not change. Stable is a subset of unstable. And vice versa,
every unstable algorithm returns a stable result for some inputs.
Mark's point is that requiring "always unstable" has no meaning, no
matter what language you chose.

/Andreas
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Kevin Gadd
Like I said, I don't care about the language or semantics. If it's such a big problem for you to overlook the way I phrased it initially, I'll rephrase it:

If a stable sort is used by the browser to implement Array.sort in some scenarios, a stable sort should always be used to implement Array.sort.

And like I said, the specific problem is that if users fail to test on data sets that straddle both sides of whatever arbitrary boundary the JS implementor picked, they will fail to observe the browser's actual behavior, and failures will only occur in the wild. The threshold at which V8 switches between stable/unstable could change at any time if they decided the performance gains justified potentially breaking a few applications, and then suddenly more developers would be vulnerable to this.

-kg


On Thu, Jun 13, 2013 at 9:09 AM, Andreas Rossberg <[hidden email]> wrote:
On 13 June 2013 17:56, Kevin Gadd <[hidden email]> wrote:
> I don't really care about the precise language or semantics. I just don't
> want applications to break in the wild because an Array.sort
> implementation's stability changes based on the number of elements.

It does not change. Stable is a subset of unstable. And vice versa,
every unstable algorithm returns a stable result for some inputs.
Mark's point is that requiring "always unstable" has no meaning, no
matter what language you chose.

/Andreas


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

David Bruant-5
In reply to this post by Kevin Gadd
Le 13/06/2013 17:56, Kevin Gadd a écrit :
> I don't really care about the precise language or semantics.
Maybe you should. In my opinion, that would certainly help having your
case better understood and heard.

> I just don't want applications to break in the wild because an
> Array.sort implementation's stability changes based on the number of
> elements.
A stable sort is just a particular case of an unstable sort. So, if a
sort is "sometimes unstable", then it is "always unstable". The
impression of a stability for some cases is just a distraction.

It's also not like if "sort" was confusing like isNaN. "sort" does its job.

> That feels like a much easier problem to solve than the problem of
> some browsers being unstable and some being stable. This is absolutely
> the sort of thing that would bite me as a JS dev and will bite every
> person who uses my compiler to convert an application. Why would they
> test with both small and large element counts?
They can also read the spec and learn they can't rely on sort stability
(second sentence of ES5 - 15.4.4.11 !). Specs aren't just for
implementors. As a web developer, I feel it's a time-consuming yet very
healthy exercise to read specs to avoid pain later down the road. I
wouldn't have said that for ES3, but ES5 is decently developer friendly,
especially http://es5.github.io/#x15.4.4.11 with links and all that.

If people are unsatisfied with the language sort function, maybe they
should pick a different sort function, implement one that fits their
need, why not.
They can even monkeypatch array#sort!
Why not try a stackoverflow sort [1][2]? Try with "stable sort" ;-)

David

[1] http://xkcd.com/1185/
[2] http://gkoberger.github.io/stacksort/
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Kevin Gadd
I have read the ES specs multiple times, and still accidentally shipped an application that was broken by Array.sort's default behavior in the wild. I know other people who have had the same issues, and people who have read the spec and don't happen to have particular quirks defined in the spec memorized. People are not great at remembering spec details. Simply demanding that all JS developers in the wild read the spec will *not* address these issues. Modern application development occurs on multiple platforms, in multiple languages, using multiple libraries. No matter how many times the spec is read, if the developer is regularly writing and thinking in different languages using different primitives, the primitives that defy trends and act in unexpected ways will always be a stumbling block. The v8 issue and related issue reports against Underscore both serve to demonstrate this.

I don't understand why you would intentionally sidetrack a discussion about a simple problem with academic details. Yes, if your goal is to write proofs or rigorously demonstrate that your software is correct all the time, the exact definition of different sort algorithms and terminology really does matter, and yes, it is valuable for people to read the spec. But that is not remotely relevant to the original post in this discussion thread and was not suggested by my replies either. This thread *should* be about whether the ES spec can protect developers from subtle mistakes and errors by changing the specification of Array.sort. Is the point trying to be made here that it is impossible for the spec to clearly communicate that implementations should not do what V8 does, and this communication is impossible because of the academic definition? You haven't even once addressed the original core question of whether it would be possible to switch Array.sort to being stable, and what the obstacles to that would be.

There are examples out there in the wild of how difficult it is to write a performant sort in JS from scratch; you need only look at the Bugzilla bug about self-hosting Array.sort in Spidermonkey. Or we can look at the number of *broken* binary search implementations out in the wild caused by people copying from broken algorithms in textbooks that behave incorrectly in boundary cases. Please, for the love of $deity, do not just tell developers to type a query into stackoverflow and grab the top result. I don't necessarily think that it is automatically the right choice to say 'do it yourself' for a problem like this, though it could easily be correct in this specific case, since Underscore ships a stable sort function. Most developers probably use jQuery and/or Underscore already to make up for the small number of useful primitives in the JS standard library, and that's fine.

-kg


On Thu, Jun 13, 2013 at 9:50 AM, David Bruant <[hidden email]> wrote:
Le 13/06/2013 17:56, Kevin Gadd a écrit :

I don't really care about the precise language or semantics.
Maybe you should. In my opinion, that would certainly help having your case better understood and heard.


I just don't want applications to break in the wild because an Array.sort implementation's stability changes based on the number of elements.
A stable sort is just a particular case of an unstable sort. So, if a sort is "sometimes unstable", then it is "always unstable". The impression of a stability for some cases is just a distraction.

It's also not like if "sort" was confusing like isNaN. "sort" does its job.


That feels like a much easier problem to solve than the problem of some browsers being unstable and some being stable. This is absolutely the sort of thing that would bite me as a JS dev and will bite every person who uses my compiler to convert an application. Why would they test with both small and large element counts?
They can also read the spec and learn they can't rely on sort stability (second sentence of ES5 - 15.4.4.11 !). Specs aren't just for implementors. As a web developer, I feel it's a time-consuming yet very healthy exercise to read specs to avoid pain later down the road. I wouldn't have said that for ES3, but ES5 is decently developer friendly, especially http://es5.github.io/#x15.4.4.11 with links and all that.

If people are unsatisfied with the language sort function, maybe they should pick a different sort function, implement one that fits their need, why not.
They can even monkeypatch array#sort!
Why not try a stackoverflow sort [1][2]? Try with "stable sort" ;-)

David

[1] http://xkcd.com/1185/
[2] http://gkoberger.github.io/stacksort/


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Oliver Hunt-2
JSC switched to an always stable sort years ago due to compatibility problems with content targeting firefox and IE depending on it.

We also had issues with inconsistent comparison functions, but i can't recall exactly what the reasoning behind it was (nor the exact behavior we felt was necessary), but we ended up with an AVL tree being involved, so we may be attempting to only compare two elements with each other once.  Unfortunately this code is a little bit gnarly for me to read and understand today :-(

I believe that the spec should mandate a stable sort, but i'm not sure just how far we can go in trying to standardize exact behavior of the sort without tying implementations to a single implementation for all time.

--Oliver


On Jun 13, 2013, at 12:16 PM, Kevin Gadd <[hidden email]> wrote:

I have read the ES specs multiple times, and still accidentally shipped an application that was broken by Array.sort's default behavior in the wild. I know other people who have had the same issues, and people who have read the spec and don't happen to have particular quirks defined in the spec memorized. People are not great at remembering spec details. Simply demanding that all JS developers in the wild read the spec will *not* address these issues. Modern application development occurs on multiple platforms, in multiple languages, using multiple libraries. No matter how many times the spec is read, if the developer is regularly writing and thinking in different languages using different primitives, the primitives that defy trends and act in unexpected ways will always be a stumbling block. The v8 issue and related issue reports against Underscore both serve to demonstrate this.

I don't understand why you would intentionally sidetrack a discussion about a simple problem with academic details. Yes, if your goal is to write proofs or rigorously demonstrate that your software is correct all the time, the exact definition of different sort algorithms and terminology really does matter, and yes, it is valuable for people to read the spec. But that is not remotely relevant to the original post in this discussion thread and was not suggested by my replies either. This thread *should* be about whether the ES spec can protect developers from subtle mistakes and errors by changing the specification of Array.sort. Is the point trying to be made here that it is impossible for the spec to clearly communicate that implementations should not do what V8 does, and this communication is impossible because of the academic definition? You haven't even once addressed the original core question of whether it would be possible to switch Array.sort to being stable, and what the obstacles to that would be.

There are examples out there in the wild of how difficult it is to write a performant sort in JS from scratch; you need only look at the Bugzilla bug about self-hosting Array.sort in Spidermonkey. Or we can look at the number of *broken* binary search implementations out in the wild caused by people copying from broken algorithms in textbooks that behave incorrectly in boundary cases. Please, for the love of $deity, do not just tell developers to type a query into stackoverflow and grab the top result. I don't necessarily think that it is automatically the right choice to say 'do it yourself' for a problem like this, though it could easily be correct in this specific case, since Underscore ships a stable sort function. Most developers probably use jQuery and/or Underscore already to make up for the small number of useful primitives in the JS standard library, and that's fine.

-kg


On Thu, Jun 13, 2013 at 9:50 AM, David Bruant <[hidden email]> wrote:
Le 13/06/2013 17:56, Kevin Gadd a écrit :

I don't really care about the precise language or semantics.
Maybe you should. In my opinion, that would certainly help having your case better understood and heard.


I just don't want applications to break in the wild because an Array.sort implementation's stability changes based on the number of elements.
A stable sort is just a particular case of an unstable sort. So, if a sort is "sometimes unstable", then it is "always unstable". The impression of a stability for some cases is just a distraction.

It's also not like if "sort" was confusing like isNaN. "sort" does its job.


That feels like a much easier problem to solve than the problem of some browsers being unstable and some being stable. This is absolutely the sort of thing that would bite me as a JS dev and will bite every person who uses my compiler to convert an application. Why would they test with both small and large element counts?
They can also read the spec and learn they can't rely on sort stability (second sentence of ES5 - 15.4.4.11 !). Specs aren't just for implementors. As a web developer, I feel it's a time-consuming yet very healthy exercise to read specs to avoid pain later down the road. I wouldn't have said that for ES3, but ES5 is decently developer friendly, especially http://es5.github.io/#x15.4.4.11 with links and all that.

If people are unsatisfied with the language sort function, maybe they should pick a different sort function, implement one that fits their need, why not.
They can even monkeypatch array#sort!
Why not try a stackoverflow sort [1][2]? Try with "stable sort" ;-)

David

[1] http://xkcd.com/1185/
[2] http://gkoberger.github.io/stacksort/

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss


_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Bill Frantz
On 6/13/13 at 12:24 PM, [hidden email] (Oliver Hunt) wrote:

>I believe that the spec should mandate a stable sort, but i'm
>not sure just how far we can go in trying to standardize exact
>behavior of the sort without tying implementations to a single
>implementation for all time.

One possibility which will allow implementations to include a
more performant sort is to specify two sorts:

   sort - which is stable
   unstablestort - which is either an alias for sort or is a
faster unstable sort.

Cheers - Bill

---------------------------------------------------------------------------
Bill Frantz        |"Web security is like medicine - trying to
do good for
408-356-8506       |an evolved body of kludges" - Mark Miller
www.pwpconsult.com |

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

David Bruant-5
In reply to this post by Kevin Gadd
Le 13/06/2013 21:16, Kevin Gadd a écrit :
> I have read the ES specs multiple times, and still accidentally
> shipped an application that was broken by Array.sort's default
> behavior in the wild. I know other people who have had the same
> issues, and people who have read the spec and don't happen to have
> particular quirks defined in the spec memorized. People are not great
> at remembering spec details.
Agreed. The spec on the web. I re-read it the parts I have doubts about
regularly. I write and read doc because spec prose can be tiresome. I
just changed MDN to be very clear on the fact that sorts aren't expected
to be stable. Feel free to contribute to MDN (or WebPlatform at your
preference, both are wikis) whenever you feel that something should be
easily found and shouldn't have to be remembered by developers.
Modern development isn't a person against a programming language. The
web is part of modern development.

> Simply demanding that all JS developers in the wild read the spec will
> *not* address these issues. Modern application development occurs on
> multiple platforms, in multiple languages, using multiple libraries.
> No matter how many times the spec is read, if the developer is
> regularly writing and thinking in different languages using different
> primitives, the primitives that defy trends and act in unexpected ways
> will always be a stumbling block. The v8 issue and related issue
> reports against Underscore both serve to demonstrate this.
>
> I don't understand why you would intentionally sidetrack a discussion
> about a simple problem with academic details. Yes, if your goal is to
> write proofs or rigorously demonstrate that your software is correct
> all the time, the exact definition of different sort algorithms and
> terminology really does matter
I only cared about the words "stable" and "unstable" which are at the
heart of the debate. I'm not sure I understand what academic details
you're referring to and why you're talking about proofs.

A sort being stable is a property on the position of elements which are
considered equal by the sort algorithm. If people want equal elements to
not be moved, they should let the comparator believe that they are equal.
This is not about academics or proof. It's about understanding what
you're doing. Properly understanding the tools at your disposals, the
abstractions. In essence, it's asking the very skills that are required
to build any sort of software.

> and yes, it is valuable for people to read the spec. But that is not
> remotely relevant to the original post in this discussion thread and
> was not suggested by my replies either. This thread *should* be about
> whether the ES spec can protect developers from subtle mistakes and
> errors by changing the specification of Array.sort. Is the point
> trying to be made here that it is impossible for the spec to clearly
> communicate that implementations should not do what V8 does, and this
> communication is impossible because of the academic definition? You
> haven't even once addressed the original core question of whether it
> would be possible to switch Array.sort to being stable, and what the
> obstacles to that would be.
My (implicit, sorry about that) point was that there is no need to
change the sort function. Just for people to read the spec or doc. No
one is a hero and expected to remember everything, but reading the
second sentence of the Array.prototype.sort spec seems rather low-cost
to me.

> There are examples out there in the wild of how difficult it is to
> write a performant sort in JS from scratch; you need only look at the
> Bugzilla bug about self-hosting Array.sort in Spidermonkey. Or we can
> look at the number of *broken* binary search implementations out in
> the wild caused by people copying from broken algorithms in textbooks
> that behave incorrectly in boundary cases. Please, for the love of
> $deity, do not just tell developers to type a query into stackoverflow
> and grab the top result.
That was a joke obviously :-) But open source, robust, well-tested
algorithms exist.

Alternate proposal to forcing stable sorts in the spec based on the idea
that "equal" elements shouldn't be equal:
Have the original index of a value passed to the comparator function. A
stable sort can then be a few characters away:

     arr.sort(compare)

becomes:

     arr.sort((v1, v2, i1, i2) => { return compare(v1, v2) || i1 > i2; })

Where the compare function considers v1 and v2 to be equal (returns 0,
only falsy number), original indices are used to decide which is
greater. This should stabilize the output I think (by fully ordering
elements either by their value when one is greater than the other and by
original index when the 2 values are considered equal by the compare
function)

A simple and optimistic static analysis (no "arguments", no "eval", 3rd
and 4th arg undeclared, etc.) on the comparator body should leave perf
roughly intact.

David
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Brendan Eich-3
In reply to this post by Oliver Hunt-2
Just confirming: In ES1 days, the MS guy (Shon K.) suggested stability
but we all agreed not to require it, but I believe he implemented it.
This created a de-facto standard and SpiderMonkey and JSC matched.

I think V8 has a de-facto bug to fix. I'm ok with requiring stability as
a normative property of Array.prototype.sort given such a V8 bugfix.

I don't yet see enough value in adding an unstableSort (to Bill F's point).

/be

Oliver Hunt wrote:

> JSC switched to an always stable sort years ago due to compatibility
> problems with content targeting firefox and IE depending on it.
>
> We also had issues with inconsistent comparison functions, but i can't
> recall exactly what the reasoning behind it was (nor the exact
> behavior we felt was necessary), but we ended up with an AVL tree
> being involved, so we may be attempting to only compare two elements
> with each other once.  Unfortunately this code is a little bit gnarly
> for me to read and understand today :-(
>
> I believe that the spec should mandate a stable sort, but i'm not sure
> just how far we can go in trying to standardize exact behavior of the
> sort without tying implementations to a single implementation for all
> time.
>
> --Oliver
>
>
> On Jun 13, 2013, at 12:16 PM, Kevin Gadd <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>> I have read the ES specs multiple times, and still accidentally
>> shipped an application that was broken by Array.sort's default
>> behavior in the wild. I know other people who have had the same
>> issues, and people who have read the spec and don't happen to have
>> particular quirks defined in the spec memorized. People are not great
>> at remembering spec details. Simply demanding that all JS developers
>> in the wild read the spec will *not* address these issues. Modern
>> application development occurs on multiple platforms, in multiple
>> languages, using multiple libraries. No matter how many times the
>> spec is read, if the developer is regularly writing and thinking in
>> different languages using different primitives, the primitives that
>> defy trends and act in unexpected ways will always be a stumbling
>> block. The v8 issue and related issue reports against Underscore both
>> serve to demonstrate this.
>>
>> I don't understand why you would intentionally sidetrack a discussion
>> about a simple problem with academic details. Yes, if your goal is to
>> write proofs or rigorously demonstrate that your software is correct
>> all the time, the exact definition of different sort algorithms and
>> terminology really does matter, and yes, it is valuable for people to
>> read the spec. But that is not remotely relevant to the original post
>> in this discussion thread and was not suggested by my replies either.
>> This thread *should* be about whether the ES spec can protect
>> developers from subtle mistakes and errors by changing the
>> specification of Array.sort. Is the point trying to be made here that
>> it is impossible for the spec to clearly communicate that
>> implementations should not do what V8 does, and this communication is
>> impossible because of the academic definition? You haven't even once
>> addressed the original core question of whether it would be possible
>> to switch Array.sort to being stable, and what the obstacles to that
>> would be.
>>
>> There are examples out there in the wild of how difficult it is to
>> write a performant sort in JS from scratch; you need only look at the
>> Bugzilla bug about self-hosting Array.sort in Spidermonkey. Or we can
>> look at the number of *broken* binary search implementations out in
>> the wild caused by people copying from broken algorithms in textbooks
>> that behave incorrectly in boundary cases. Please, for the love of
>> $deity, do not just tell developers to type a query into
>> stackoverflow and grab the top result. I don't necessarily think that
>> it is automatically the right choice to say 'do it yourself' for a
>> problem like this, though it could easily be correct in this specific
>> case, since Underscore ships a stable sort function. Most developers
>> probably use jQuery and/or Underscore already to make up for the
>> small number of useful primitives in the JS standard library, and
>> that's fine.
>>
>> -kg
>>
>>
>> On Thu, Jun 13, 2013 at 9:50 AM, David Bruant <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>     Le 13/06/2013 17:56, Kevin Gadd a écrit :
>>
>>         I don't really care about the precise language or semantics.
>>
>>     Maybe you should. In my opinion, that would certainly help having
>>     your case better understood and heard.
>>
>>
>>         I just don't want applications to break in the wild because
>>         an Array.sort implementation's stability changes based on the
>>         number of elements.
>>
>>     A stable sort is just a particular case of an unstable sort. So,
>>     if a sort is "sometimes unstable", then it is "always unstable".
>>     The impression of a stability for some cases is just a distraction.
>>
>>     It's also not like if "sort" was confusing like isNaN. "sort"
>>     does its job.
>>
>>
>>         That feels like a much easier problem to solve than the
>>         problem of some browsers being unstable and some being
>>         stable. This is absolutely the sort of thing that would bite
>>         me as a JS dev and will bite every person who uses my
>>         compiler to convert an application. Why would they test with
>>         both small and large element counts?
>>
>>     They can also read the spec and learn they can't rely on sort
>>     stability (second sentence of ES5 - 15.4.4.11 !). Specs aren't
>>     just for implementors. As a web developer, I feel it's a
>>     time-consuming yet very healthy exercise to read specs to avoid
>>     pain later down the road. I wouldn't have said that for ES3, but
>>     ES5 is decently developer friendly, especially
>>     http://es5.github.io/#x15.4.4.11 with links and all that.
>>
>>     If people are unsatisfied with the language sort function, maybe
>>     they should pick a different sort function, implement one that
>>     fits their need, why not.
>>     They can even monkeypatch array#sort!
>>     Why not try a stackoverflow sort [1][2]? Try with "stable sort" ;-)
>>
>>     David
>>
>>     [1] http://xkcd.com/1185/
>>     [2] http://gkoberger.github.io/stacksort/
>>
>>
>> _______________________________________________
>> es-discuss mailing list
>> [hidden email] <mailto:[hidden email]>
>> https://mail.mozilla.org/listinfo/es-discuss
>
> _______________________________________________
> es-discuss mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/es-discuss
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Sean Silva
In reply to this post by Kevin Gadd
On Thu, Jun 13, 2013 at 12:16 PM, Kevin Gadd <[hidden email]> wrote:
I don't understand why you would intentionally sidetrack a discussion about a simple problem with academic details. 

He brings up a very real point, which is that you can't realistically have an "always unstable" sort. If I understand you correctly, what you want by an "always unstable" sort is that you want developers to "get bitten" in testing due to instability, even for small test cases, so that the issue is caught earlier. The problem with that desire is that there are no sorting algorithms AFAIK that guarantee that they will *always* rearrange equal-sorting elements (i.e., "always unstable"): even a pure recursive quicksort (no insertion-sort base case) will sometimes not rearrange equal-sorting elements (i.e., seem stable).

If I understand your desire correctly, then what you're asking for by "always unstable" is to require that if an implementation's sorting algorithm *might* rearrange equal-sorting elements, then it must *always* go out of its way to ensure that if equal-sorting elements are present, then it does not preserve their order; I haven't looked in detail at what this would mean from an implementation standpoint, but I'm pretty sure that it is unrealistic.

-- Sean Silva

_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
Reply | Threaded
Open this post in threaded view
|

Re: Array#sort() implementations not interoperable

Felix-54
Always-unstable is trivial: use a stable sort, and the first time you
encounter two elements that compare equal, swap them. The problem with
that is it isn't necessarily detectably unstable. The two elements you
swap might be equal in all ways. To be detectably unstable, you need
the sort function to know when two elements that compare equal are not
otherwise equal, so you have to pass it a second comparison function,
and if you can do that why are you bothering with an unstable sort in
the first place? I think I'm confused about the point of "always
unstable".

An alternate interpretation of "always unstable" is for sort to throw
an error anytime it encounters two elements that compare equal. Which
is pretty annoyingly useless. You might as well just use a stable sort
all the time.

On Thu, Jun 13, 2013 at 3:10 PM, Sean Silva <[hidden email]> wrote:

> On Thu, Jun 13, 2013 at 12:16 PM, Kevin Gadd <[hidden email]> wrote:
>>
>> I don't understand why you would intentionally sidetrack a discussion
>> about a simple problem with academic details.
>
>
> He brings up a very real point, which is that you can't realistically have
> an "always unstable" sort. If I understand you correctly, what you want by
> an "always unstable" sort is that you want developers to "get bitten" in
> testing due to instability, even for small test cases, so that the issue is
> caught earlier. The problem with that desire is that there are no sorting
> algorithms AFAIK that guarantee that they will *always* rearrange
> equal-sorting elements (i.e., "always unstable"): even a pure recursive
> quicksort (no insertion-sort base case) will sometimes not rearrange
> equal-sorting elements (i.e., seem stable).
>
> If I understand your desire correctly, then what you're asking for by
> "always unstable" is to require that if an implementation's sorting
> algorithm *might* rearrange equal-sorting elements, then it must *always* go
> out of its way to ensure that if equal-sorting elements are present, then it
> does not preserve their order; I haven't looked in detail at what this would
> mean from an implementation standpoint, but I'm pretty sure that it is
> unrealistic.
>
> -- Sean Silva
>
> _______________________________________________
> es-discuss mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/es-discuss
>
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss
12