forEach() trouble when an element is removed during execution

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

forEach() trouble when an element is removed during execution

Peter Michaux
Hi,

I ran the following snip of code on Firefox 2 with Firebug installed.

var arr = [0,1,2,3];
arr.forEach(function(el) {
  console.log(el);
  if (el == 1) {
    arr.splice(el, 1);
  }
});

// output
// 0
// 1
// 3


Note that 2 is skipped yet 2 is in the array both before and after the
the forEach() execution.

I understand why this is happening if the underlying iterator is a for
loop  but iterators are supposed to hide implementation.

If I say I want to iterate over the elements of an array then I think
at least all the elements that are in the array both before and after
the iteration should be iterated.

Elements that are removed during the forEach execution but have not
yet been iterated...that is debatable if the removed elements should
be iterated. (In the case of DOM2 event handlers these would not be
iterated. Two browsers do it one way and two do it the other way.)

Elements that are added during the forEach execution, I think those
should not be iterated since that could lead to infinite loops.

So does forEach() behave properly in the above example? (I don't think so.)

Peter
_______________________________________________
Es4-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es4-discuss
Reply | Threaded
Open this post in threaded view
|

Re: forEach() trouble when an element is removed during execution

Jeff Walden-3
Peter Michaux wrote:

> I ran the following snip of code on Firefox 2 with Firebug installed.
>
> var arr = [0,1,2,3];
> arr.forEach(function(el) {
>   console.log(el);
>   if (el == 1) {
>     arr.splice(el, 1);
>   }
> });
>
> // output
> // 0
> // 1
> // 3
>
>
> Note that 2 is skipped yet 2 is in the array both before and after the
> the forEach() execution.

(It's iterating over indices, and when el == 1 the next property is 2, which after the splice produces the value 3, for those who don't have splice memorized and didn't look it up.)


> iterators are supposed to hide implementation.

Not necessarily, although that may sometimes be a good thing.  An array has a well-defined ordering, and having iteration match that ordering seems better to me than slightly-gratuitous abstraction.


> If I say I want to iterate over the elements of an array then I think
> at least all the elements that are in the array both before and after
> the iteration should be iterated.

How do you propose doing this?  Any solution that comes to my mind is complicated and uses O(# props) memory.  Do you still wish for order of iteration to be preserved?  You'll have to specify these cases for interop.  There's a reason that for loops don't have exactly specified behavior in the presence of mutation of the object over which iteration occurs.  Mutation in general plays havoc with sequential execution, and this is just one case demonstrating its perils.  Functional style discourages mutation, and I think forEach should follow in that tradition.


> Elements that are removed during the forEach execution but have not
> yet been iterated...that is debatable if the removed elements should
> be iterated. (In the case of DOM2 event handlers these would not be
> iterated. Two browsers do it one way and two do it the other way.)

The DOM isn't a programming language, so they can to a degree "get away" with sloppiness in the behavior their specification promises, to the chagrin of implementers, with behaviors which are undefined.  ES4 is a programming language, and I for one very much want exactness.


> Elements that are added during the forEach execution, I think those
> should not be iterated since that could lead to infinite loops.

Many things can lead to infinite loops; this particular one doesn't seem like a huge problem if you avoid mutation, as the intent of forEach was not for use with a mutating method.


> So does forEach() behave properly in the above example?

It behaves according to the description of its behavior as given on <http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:Array:forEach>.  I happen to think this is the right behavior.  It allows the procedure to be written in JavaScript if you so desire (you might need |delete| hooks if you want to detect them), which at least for the web is an important property (and makes it more understandable to implementors and users who wish to fully understand the behavior).  The behavior's reasonably simple to understand, and attempting to handle mutation will make the behavior more difficult to understand.  The user who wishes to handle mutation won't have that behavior swept under the rug of forEach; it will be explicit in his code, and thus that much more understandable.  Finally, we do have precedent in doing this in other languages: Java's iterators over its collections attempt to throw a ConcurrentModificationException when they detect mutation and
 don't promise some special behavior that tries to handle mutation.

On a further note, forEach is in at least two shipping browsers with the precise behavior documented on that page, so inertia if nothing else suggests the existing definition is the one which should be in the spec.

Jeff
_______________________________________________
Es4-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es4-discuss
Reply | Threaded
Open this post in threaded view
|

Re: forEach() trouble when an element is removed during execution

Peter Michaux
Hi Jeff,

On 6/18/07, Jeff Walden <[hidden email]> wrote:

> Peter Michaux wrote:
> > I ran the following snip of code on Firefox 2 with Firebug installed.
> >
> > var arr = [0,1,2,3];
> > arr.forEach(function(el) {
> >   console.log(el);
> >   if (el == 1) {
> >     arr.splice(el, 1);
> >   }
> > });
> >
> > // output
> > // 0
> > // 1
> > // 3
> >
> >
> > Note that 2 is skipped yet 2 is in the array both before and after the
> > the forEach() execution.
>
> (It's iterating over indices, and when el == 1 the next property is 2, which after the splice produces the value 3, for those who don't have splice memorized and didn't look it up.)
>
>
> > iterators are supposed to hide implementation.
>
> Not necessarily, although that may sometimes be a good thing.  An array has a well-defined ordering, and having iteration match that ordering seems better to me than slightly-gratuitous abstraction.

I agree that iterating in order makes sense and what people would want.

> > If I say I want to iterate over the elements of an array then I think
> > at least all the elements that are in the array both before and after
> > the iteration should be iterated.
>
> How do you propose doing this?  Any solution that comes to my mind is complicated and uses O(# props) memory.

I don't know how arrays are implemented "under the hood" so perhaps
this would be complex. If each array element is an object then it
isn't so difficult but perhaps a little memory and time consuming.


For example, currently to implement a robust observer pattern one way
to do it is the following...

var myObj = {

  observers: [],

  addObserver: function(fn) {
    // wrap the function in an object
    // to give a way to watch when removed
    this.observers.push({fn:fn});
  },

  removeObserver: function(fn) {
    for (var i=this.observers.length; i--; ) {
      if (this.observers[i].fn == fn) {
        // make is obvious that the observer is removed
        // for the notifyObservers loop
        this.observers[i].fn = null;
        // do the actual removal
        this.observers.splice(i,1);
      }
    }
  },

  notifyObservers: function() {
    // make a copy of current observers
    var observers = this.observers.slice(0);
    for (var i=0; i<observers.length; i++) {
      // check the observer hasn't been removed
      if (observers[i].fn) {
        observers[i].fn();
      }
    }
  }
};

If forEach behaved the way I'd recommend with array mutations during
iteration this would be much simpler and make more sense

var myObj = {

  observers: [],

  addObserver: function(fn) {
    // just the function without wrapping
    this.observers.push(fn);
  },

  removeObserver: function(fn) {
    for (var i=this.observers.length; i--; ) {
      if (this.observers[i] == fn) {
        // just removal
        this.observers.splice(i,1);
      }
    }
  },

  notifyObservers: function() {
    // simple iteration without concern of array mutations
    this.observers.forEach(function(o) {o();});
  }
};


> Do you still wish for order of iteration to be preserved?  You'll have to specify these cases for interop.  There's a reason that for loops don't have exactly specified behavior in the presence of mutation of the object over which iteration occurs.  Mutation in general plays havoc with sequential execution, and this is just one case demonstrating its perils.  Functional style discourages mutation, and I think forEach should follow in that tradition.

Yes functional style would discourage mutation and, from what I
understand, a language like Haskell wouldn't encounter this problem.
However JavaScript is not a pure functional language and in fact the
nature of the forEach function requires side-effects to be useful in
anyway. If the function that is sent to forEach doesn't have
side-effects then the whole iteration isn't of much use.


> > Elements that are removed during the forEach execution but have not
> > yet been iterated...that is debatable if the removed elements should
> > be iterated. (In the case of DOM2 event handlers these would not be
> > iterated. Two browsers do it one way and two do it the other way.)
>
> The DOM isn't a programming language, so they can to a degree "get away" with sloppiness in the behavior their specification promises, to the chagrin of implementers, with behaviors which are undefined.  ES4 is a programming language, and I for one very much want exactness.

The DOM does specify what should happen when an even handler is
removed. If a handler has not run then it should not run. So if it is
removed while an event is being handled and it has not run then it
doesn't run even for that event.


> > Elements that are added during the forEach execution, I think those
> > should not be iterated since that could lead to infinite loops.
>
> Many things can lead to infinite loops; this particular one doesn't seem like a huge problem if you avoid mutation, as the intent of forEach was not for use with a mutating method.
>
> > So does forEach() behave properly in the above example?
>
> It behaves according to the description of its behavior as given on <http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:Array:forEach>.

I don't think that in the mutation case that it does behave the way
this statement implies

"forEach executes the provided function (callback) once for each
element present in the array."


> I happen to think this is the right behavior.  It allows the procedure to be written in JavaScript if you so desire (you might need |delete| hooks if you want to detect them), which at least for the web is an important property (and makes it more understandable to implementors and users who wish to fully understand the behavior).  The behavior's reasonably simple to understand, and attempting to handle mutation will make the behavior more difficult to understand.  The user who wishes to handle mutation won't have that behavior swept under the rug of forEach; it will be explicit in his code, and thus that much more understandable.  Finally, we do have precedent in doing this in other languages: Java's iterators over its collections attempt to throw a ConcurrentModificationException when they detect mutation and
>  don't promise some special behavior that tries to handle mutation.

If the JavaScript forEach won't handle the mutations well then perhaps
JavaScript should throw an error also.


> On a further note, forEach is in at least two shipping browsers with the precise behavior documented on that page, so inertia if nothing else suggests the existing definition is the one which should be in the spec.

I don't think that fixing this edge case to make a very robust forEach
would be a bad idea. I think having a very robust forEach (similar to
the DOM2 iteration over handlers) would be much better and make the
language more self-consistent.

Peter
_______________________________________________
Es4-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es4-discuss
Reply | Threaded
Open this post in threaded view
|

Re: forEach() trouble when an element is removed during execution

Jason Orendorff
On 6/18/07, Peter Michaux <[hidden email]> wrote:
> I don't think that fixing this edge case to make a very robust forEach
> would be a bad idea. I think having a very robust forEach (similar to
> the DOM2 iteration over handlers) would be much better and make the
> language more self-consistent.

For self-consistency, I think the most important thing is that
"forEach()" and "for each" behave the same way.  Do you agree?

I mean, these two things should work the same:

  this.observers.forEach(function(o) {o();});

  for each (let o in this.observers) o();

My point is, I think you're proposing a change to Array iterators,
not just a change to forEach().

-j
_______________________________________________
Es4-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es4-discuss
Reply | Threaded
Open this post in threaded view
|

Re: forEach() trouble when an element is removed during execution

Lars T Hansen-2
On 6/18/07, Jason Orendorff <[hidden email]> wrote:
> On 6/18/07, Peter Michaux <[hidden email]> wrote:
> > I don't think that fixing this edge case to make a very robust forEach
> > would be a bad idea. I think having a very robust forEach (similar to
> > the DOM2 iteration over handlers) would be much better and make the
> > language more self-consistent.
>
> For self-consistency, I think the most important thing is that
> "forEach()" and "for each" behave the same way.  Do you agree?

They don't behave the same way, though.  "ForEach" considers only
properties named by uint-like names below "length", whether enumerable
or not, whereas the latter touches anything enumerable in the object
and prototype chain.  The former is defined as iterating from low
indices toward high indices, whereas the latter enumerates in
insertion order (though how the object and prototype enumerations are
interleaved is not defined, to my knowledge).  Finally, JS1.6 and ES4
define forEach as computing the end index at the beginning of the
iteration, so forEach definitely does not touch properties added (at
the end of the array) during iteration, while "for each" may or may
not.

--lars

>
> I mean, these two things should work the same:
>
>   this.observers.forEach(function(o) {o();});
>
>   for each (let o in this.observers) o();
>
> My point is, I think you're proposing a change to Array iterators,
> not just a change to forEach().
>
> -j
> _______________________________________________
> Es4-discuss mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/es4-discuss
>
_______________________________________________
Es4-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es4-discuss