What is the status of Weak References?

What is the status of Weak References?

Kevin Gadd
A search shows some old discussions of the topic mentioning that they
might be going into future versions of the language, etc. But on the
other hand I've been told in response to this question before that
TC39 has a general policy against features that allow garbage
collection to be visible to applications.

There's still a strawman up on the wiki:
http://wiki.ecmascript.org/doku.php?id=strawman:weak_references and
while it appears to be a relatively simple, sane way to expose weak
references, it looks like the strawman hasn't been touched since late
2011. Is that because it's dead? Or was it deprioritized because weak
references are believed to not be needed by JS application developers?

I ask this because the lack of weak references (or any suitable
substitute mechanism) comes up regularly when dealing with the
challenge of porting native apps to JavaScript, and it leads people to
consider extremely elaborate workarounds just to build working
applications (like storing *all* their data in a virtual heap backed
by typed arrays and running their own garbage collector against it).
If there is really a firm reason why this must be so, so be it, but
seeing so many people do an end-run around the JS garbage collector
only to implement their own *in JavaScript* makes me wonder if perhaps
something is wrong. The presence of WeakMaps makes it clear to me that
solving this general class of problems is on the table.

People are certainly solving this problem in other JS related contexts:
https://github.com/TooTallNate/node-weak
Historically the lack of weak references has resulted in various
solutions in libraries like jQuery specifically designed to avoid
cycles being created between event listeners and DOM objects. Many of
these solutions are error-prone and require manual breaking of cycles.

To make a controversial statement, I would suggest that there are some
problems that cannot be solved in JavaScript unless it has an
approximation of weak references. I would love to be proven wrong here
because then I can use whatever your crazy computer science
alternative is. :D

Thanks,
-kg
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss

Re: What is the status of Weak References?

Tab Atkins Jr.
On Thu, Jan 31, 2013 at 1:48 PM, Kevin Gadd <[hidden email]> wrote:

> A search shows some old discussions of the topic mentioning that they
> might be going into future versions of the language, etc. But on the
> other hand I've been told in response to this question before that
> TC39 has a general policy against features that allow garbage
> collection to be visible to applications.
>
> There's still a strawman up on the wiki:
> http://wiki.ecmascript.org/doku.php?id=strawman:weak_references and
> while it appears to be a relatively simple, sane way to expose weak
> references, it looks like the strawman hasn't been touched since late
> 2011. Is that because it's dead? Or was it deprioritized because weak
> references are believed to not be needed by JS application developers?

I believe that proposal was dropped in favor of just always using WeakMaps.

~TJ

Re: What is the status of Weak References?

Erik Arvidsson
We have yet to find a solution that does not leak information between
two actors that are not supposed to be able to communicate. At this
point we have a lot of important work to do for ES6, and until someone
comes up with a solution to the security issues, WeakRefs are
postponed.

I'm not an expert in this field but I can try to explain the problem
as I understand it.

- Given a master actor M that can communicate with two independent
actors A and B (but A cannot communicate with B).
- M passes a frozen object O to A.
- A then creates a WeakRef for O.
- A also strongly holds on to O.
- M then passes the same object to B.
- B creates another weak ref to O.
- Now B will get notified when A stops holding O strongly, which
creates a communication channel between A and B.

It is possible that we can accept this information leak and just have
Caja etc. blacklist the use of WeakRefs, but this is a discussion that we
decided to postpone for now.
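
The scenario in the list above can be sketched with the `WeakRef` API that JavaScript gained much later (ES2021). That API did not exist when this thread was written, and the strawman's own interface may differ, so treat this as an illustration only. The names `actorA`, `actorB`, and `master` are hypothetical.

```javascript
// Hypothetical actors; none of these names come from the strawman.
const actorA = {
  strong: null,
  weak: null,
  receive(obj) {
    this.weak = new WeakRef(obj); // A creates a WeakRef for O
    this.strong = obj;            // A also strongly holds on to O
  },
  // A "sends a bit" simply by dropping its strong reference.
  signal() {
    this.strong = null;
  },
};

const actorB = {
  weak: null,
  receive(obj) {
    this.weak = new WeakRef(obj); // B keeps only a weak reference
  },
  // B "reads the bit" by polling whether O is still reachable.
  sawSignal() {
    return this.weak.deref() === undefined;
  },
};

// M creates O, freezes it, and passes it to both actors. M keeps no reference.
function master() {
  const O = Object.freeze({ label: 'shared' });
  actorA.receive(O);
  actorB.receive(O);
}

master();
// While A holds O strongly, B cannot observe anything.
const beforeSignal = actorB.sawSignal();
actorA.signal();
// From here on, actorB.sawSignal() may become true at some later,
// unspecified time. When, or whether, that happens is up to the collector.
```

A and B never exchange a message, and O is frozen, yet B's polling result depends on what A did. That dependency is the channel.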


--
erik

Re: What is the status of Weak References?

Kevin Gadd
Thank you for the detailed explanation of the information leak - I had
never seen an explanation of why GC visibility creates an issue.

Is there a page on the wiki somewhere that explains why information
leaks are such a huge concern in JavaScript? I've never been under the
impression that it is designed to be a secure/sandboxed language - I
always was under the impression that for such uses you would need a
layer like Caja.

Postponing work on Weak References due to an information leak feels to
me like prioritizing theoretical security concerns over the real
usability/implementation obstacles for existing applications that need
a weak reference equivalent. Those applications will stay outside the
domain of JS (in something like Native Client or the JVM instead)
until it is available, or at best, end up as worse-performing JS
applications that don't actually leverage any large part of the JS API
and spec that you guys work so hard on. I have to imagine that's not
the case though and these security concerns are a real pressing issue
for a subset of the JS userbase I don't know about?

-kg


Re: What is the status of Weak References?

Rick Waldron
In reply to this post by Kevin Gadd



On Thu, Jan 31, 2013 at 4:48 PM, Kevin Gadd <[hidden email]> wrote:

> People are certainly solving this problem in other JS related contexts:
> https://github.com/TooTallNate/node-weak
> Historically the lack of weak references has resulted in various
> solutions in libraries like jQuery specifically designed to avoid
> cycles being created between event listeners and DOM objects. Many of
> these solutions are error-prone and require manual breaking of cycles.

Indeed! In the spirit of Erik Arvidsson's response above, I personally reached out to Nate on two separate occasions asking for a document of the semantics and an implementation experience write-up. Still waiting, but I'm sure he's a busy guy.

Rick

 


Re: What is the status of Weak References?

Mark S. Miller-2
In reply to this post by Erik Arvidsson
Earlier today I discussed with Arv an approach for doing WeakRefs without compromising security. (FWIW, I also posted this in my message near the end of <https://groups.google.com/forum/m/?fromgroups#!topic/nodejs/fV8MDpkBauw>. Thanks to Rick for the pointer.)

When a WeakRef from Realm A points at an object from Realm A, it points weakly. When it points at an object from another Realm, it points strongly (or throws an error -- take your pick). Within a Realm, SES can do what E has always done -- treat the makeWeakRef function as privileged, not to be made available by default to non-privileged code within that Realm (much as we already do with document, window, XHR, etc). The problem is that SES is not in a position to police other Realms, and we had not known how to solve the cross-Realm problem. This restriction prevents the cross-Realm leak.

Arv and I came up with a nice refinement: when dealing with another Realm, if you can obtain the makeWeakRef for that Realm, then you can use it to point at its objects weakly. If you obtain makeWeakRefs for several Realms, perhaps it makes sense to provide a makeWeakRef combiner that makes a composite makeWeakRef function, which will point weakly at any of the Realms of the leaf makeWeakRef functions. However, I think this is a detail. Virtually all use cases of interest only need to point weakly within a Realm.
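
A minimal sketch of the Realm-restricted `makeWeakRef` and the combiner described above, written against the `WeakRef` API that was standardized later (ES2021). JavaScript has no standard way to ask which Realm created an object, so the predicate `isLocal` is an injected assumption, and the returned `{ get, weak }` shape is hypothetical rather than taken from any proposal.

```javascript
// `isLocal(obj)` answers "was obj created in this Realm?". It is an assumed,
// caller-supplied predicate; the language itself offers no such test.
function makeRealmWeakRefMaker(isLocal) {
  function makeWeakRef(target) {
    if (isLocal(target)) {
      const ref = new WeakRef(target);
      return { get: () => ref.deref(), weak: true };
    }
    // Foreign object: point strongly. (The other option mentioned above
    // would be to throw an error here instead.)
    return { get: () => target, weak: false };
  }
  makeWeakRef.isLocal = isLocal;
  return makeWeakRef;
}

// Combiner: the composite points weakly at objects from any Realm whose
// makeWeakRef was supplied, and strongly at everything else.
function combineWeakRefMakers(...makers) {
  return function makeWeakRef(target) {
    const owner = makers.find((maker) => maker.isLocal(target));
    return owner ? owner(target) : { get: () => target, weak: false };
  };
}

// Usage, with two Realms faked as WeakSets of the objects they "created".
const realmA = new WeakSet();
const realmB = new WeakSet();
const objA = {};
const objB = {};
realmA.add(objA);
realmB.add(objB);

const makeWeakRefA = makeRealmWeakRefMaker((o) => realmA.has(o));
const makeWeakRefB = makeRealmWeakRefMaker((o) => realmB.has(o));
const makeWeakRefAB = combineWeakRefMakers(makeWeakRefA, makeWeakRefB);

const localRef = makeWeakRefA(objA);      // weak: objA belongs to Realm A
const foreignRef = makeWeakRefA(objB);    // strong: objB is foreign to Realm A
const combinedRef = makeWeakRefAB(objB);  // weak: Realm B's maker was supplied
```

Because the foreign case never exposes collection of another Realm's objects, holding only Realm A's `makeWeakRef` gives no view into Realm B's garbage collection.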





--
    Cheers,
    --MarkM


Re: What is the status of Weak References?

Mark S. Miller
In reply to this post by Tab Atkins Jr.
Hi Tab, not quite. WeakMaps and WeakRefs address mostly disjoint use cases. WeakRefs cannot be used to fully emulate the GC virtues of WeakMaps on the one hand, and WeakMaps cannot be used to make an object accessible only until it is not *otherwise* referenced. We separated the two because the former (WeakMaps) is deterministic and can be made freely available to unprivileged code, while the latter (WeakRefs) makes non-deterministic GC decisions visible. Separating these abstractions clarifies both.
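
The difference between the two abstractions can be illustrated with a short sketch. It uses `WeakMap` as it shipped in ES6 and the `WeakRef` API standardized later in ES2021, which postdates this thread; the helper names `tag` and `infoFor` are hypothetical.

```javascript
// WeakMap: attach side data to an object without keeping that object alive.
const metadata = new WeakMap();

function tag(obj, info) {
  metadata.set(obj, info);
  return obj;
}

function infoFor(obj) {
  return metadata.get(obj);
}

const node = tag({ id: 1 }, { createdBy: 'example' });

// A lookup needs the key itself, so code can only ask about objects it
// already holds. The answer never depends on what the collector has done.
const info = infoFor(node);

// WeakRef: reach an object only while something else keeps it alive.
const cacheEntry = new WeakRef(node);

// deref() returns the object, or undefined once it has been collected.
// Which one you get depends on a GC decision, so the result is observable
// non-determinism. Here `node` is still strongly held, so it is returned.
const maybeNode = cacheEntry.deref();
```

A WeakMap offers no way to enumerate keys or to learn that an entry disappeared, which is why it can be handed to unprivileged code.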

I do not recall anyone deciding to drop weak refs, as opposed to just postponing them. I hope to see them in ES7.



--
Text by me above is hereby placed in the public domain

  Cheers,
  --MarkM


Re: What is the status of Weak References?

David Bruant-5
In reply to this post by Kevin Gadd
On 31/01/2013 22:48, Kevin Gadd wrote:
> I ask this because the lack of weak references (or any suitable
> substitute mechanism) comes up regularly when dealing with the
> challenge of porting native apps to JavaScript, and it leads people to
> consider extremely elaborate workarounds just to build working
> applications (like storing *all* their data in a virtual heap backed
> by typed arrays and running their own garbage collector against it).
> If there is really a firm reason why this must be so, so be it, but
> seeing so many people do an end-run around the JS garbage collector
> only to implement their own *in JavaScript* makes me wonder if perhaps
> something is wrong. The presence of WeakMaps makes it clear to me that
> solving this general class of problems is on the table.

I don't understand the connection between the lack of weak references and emulating a heap in a typed array.

> Historically the lack of weak references has resulted in various
> solutions in libraries like jQuery specifically designed to avoid
> cycles being created between event listeners and DOM objects. Many of
> these solutions are error-prone and require manual breaking of cycles.

Garbage collectors have evolved and cycles aren't an issue any longer, weak references or not.

> But on the
> other hand I've been told in response to this question before that
> TC39 has a general policy against features that allow garbage
> collection to be visible to applications.

I'm not part of TC39, but I'm largely opposed to anything that makes GC observable. It introduces a source of non-determinism, which is the kind of thing that causes bugs you observe in production but did not notice, and cannot reproduce, in a development environment. Or you observe them when running the program normally, but not in debugging mode.

David


Re: What is the status of Weak References?

Kevin Gadd
On Fri, Feb 1, 2013 at 2:06 AM, David Bruant <[hidden email]> wrote:
> I don't understand the connection between the lack of weak references and
> emulating a heap in a typed array.

For an algorithm that needs weak references to be correct, the only
way to implement that algorithm in JavaScript is to stop using the JS
garbage collector and write your own collector. This is basically the
model used by Emscripten applications compiled from C++ to JS - you can use a
C++ weak reference type like boost::weak_ptr, but only because the
entire application heap is stored inside of a typed array and not
exposed to the JS garbage collector. This is great from the
perspective of wanting near-native performance, because there are JS
runtimes that can turn this into incredibly fast native assembly, but
the resulting code barely looks like JavaScript and has other
disadvantages, so that is why I bring it up - weakref support in JS
would make it possible to express these algorithms in hand-written,
readable, debuggable JS.
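
To make the "heap in a typed array" idea concrete, here is a toy sketch. It is not Emscripten's actual memory layout, and `ToyHeap` and its methods are hypothetical names. Explicit `release` calls stand in for whatever collector or reference counting the compiled program would bring along. Weak handles work because the JS garbage collector never sees the individual slots.

```javascript
class ToyHeap {
  constructor(slots) {
    this.values = new Int32Array(slots);      // payload: one integer per slot
    this.generation = new Uint32Array(slots); // bumped each time a slot is freed
    this.free = [];                           // slot indices available for reuse
    for (let i = slots - 1; i >= 0; i--) this.free.push(i);
  }

  // A "strong pointer" is just a slot index.
  alloc(value) {
    if (this.free.length === 0) throw new Error('toy heap exhausted');
    const index = this.free.pop();
    this.values[index] = value;
    return index;
  }

  // Freeing a slot invalidates every weak handle that was taken on it.
  release(index) {
    this.generation[index]++;
    this.free.push(index);
  }

  // A "weak pointer" remembers the slot and the generation it was taken at.
  weak(index) {
    return { index, generation: this.generation[index] };
  }

  // Loosely modeled on weak_ptr::lock(): yields the slot index while the
  // original allocation is live, and -1 after it has been freed.
  lock(handle) {
    return this.generation[handle.index] === handle.generation
      ? handle.index
      : -1;
  }
}

const heap = new ToyHeap(4);
const pointer = heap.alloc(42);
const weakHandle = heap.weak(pointer);

const lockedWhileLive = heap.lock(weakHandle); // same value as `pointer`
heap.release(pointer);
const lockedAfterFree = heap.lock(weakHandle); // -1
```

The handle stays invalid even if the slot is reused by a later allocation, because the generation counter no longer matches.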

> Garbage collectors have evolved and cycles aren't an issue any longer, weak
> references or not.

Cycles are absolutely an issue, specifically because JS applications
can interact with systems that are not wholly managed by the garbage
collector. The problem in this case is a cycle being broken *too
early* because the application author has to manually break cycles. To
present a couple of simple examples:

I have a top-level application object that manages lower-level 'mode'
objects representing screens in the application. The screens, when
constructed, attach event listeners to the application object. Because
the application manages modes, it needs to have a list of all the
active modes.
* The event handler closures can accidentally (or intentionally)
capture the mode object, creating a real cycle involving a dead mode
that will not be collected by even the most sophisticated GC.
* If I am not extremely cautious, when a mode is destroyed I might
forget (or fail) to remove its associated event handlers from the
event handler list, causing the event handler lists to grow over time
and eventually degrade the performance of the entire application.
* I have to explicitly decide when a mode has become dead and manually
break cycles between the mode and the application, while also cleaning
up any running code (or callbacks on pending operations) that rely on
the mode.
In this scenario, weak references are less essential but still
tremendously valuable: An event handler list containing weak
references would never form a cycle, and would continue to work
correctly as long as the mode is alive. It is also trivial to prune
'dead' event handlers from a list of weak event handlers. The need to
explicitly tag a mode as dead and break cycles (potentially breaking
ongoing async operations like an XHR) goes away because any ongoing
async operations will keep the object itself alive (even if it has
been removed from the mode list), allowing it to be eventually
collected when it is safe (because the GC can prove that it is safe).
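
A minimal sketch of such a weak event handler list, using the `WeakRef` API that was standardized later (ES2021) rather than anything available at the time of this thread. `WeakListenerList` is a hypothetical name. The list stores a method name instead of a closure, on the assumption that a closure capturing the mode would itself keep the mode alive.

```javascript
class WeakListenerList {
  // `makeRef` is injectable so that pruning can be exercised without
  // waiting for a real garbage collection.
  constructor(makeRef = (obj) => new WeakRef(obj)) {
    this.makeRef = makeRef;
    this.entries = []; // each entry: { ownerRef, method }
  }

  add(owner, method) {
    this.entries.push({ ownerRef: this.makeRef(owner), method });
  }

  // Calls every handler whose owner is still alive and drops the rest.
  // Not re-entrant: handlers added during dispatch are lost in this sketch.
  dispatch(event) {
    const live = [];
    for (const entry of this.entries) {
      const owner = entry.ownerRef.deref();
      if (owner === undefined) continue; // owner was collected: prune entry
      owner[entry.method](event);
      live.push(entry);
    }
    this.entries = live;
  }
}

// Usage: the application object does not keep the mode alive.
const app = { resized: new WeakListenerList() };
const mode = {
  seen: [],
  onResized(event) {
    this.seen.push(event);
  },
};
app.resized.add(mode, 'onResized');
app.resized.dispatch('resize');
```

Pending async work that still references the mode keeps it alive on its own, so no explicit "dead" flag is needed for the list to stay correct.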

I decide to build a simple pool allocator for some frequently used JS
objects, because JS object construction is slow. This is what
optimization guides recommend. I pull an object instance out of the
pool and use it for a while, and return it to the pool.
* If I forget to return an object to the pool when I'm done with it,
it gets collected and eventually the pool becomes empty.
* If I mistakenly return an object to the pool when it actually
escaped into a global variable, object attribute, or closure, now the
state of the object may get trampled over if it leaves the pool again
while it's still in use.
* If I mess up my pool management code I might return the same object
to the pool twice.
In this scenario, weak references would allow you to make the pool
implementation wholly automatic (though that would require the ability
to resurrect collected objects - I'm not necessarily arguing for that
feature). I should point out that this scenario is complicated by JS's
lack of an equivalent to RAII lifetime management in C++ and the
'using' block in C# (you can vaguely approximate it with try/finally
but doing so has many serious downsides) - given RAII or a 'using'
equivalent, you could manually ref-count pool entries instead of using
weakrefs. But I hope you can see the general gist here of solving a
problem the GC should be solving?
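
The try/finally approximation of a `using` block mentioned above might look like the following sketch. `Pool` and its methods are hypothetical names. The `leased` set catches double releases, but it also holds leased objects strongly, so a forgotten object leaks instead of being collected. Nothing here can detect a reference that escaped before the object was returned, which is the part manual management cannot solve.

```javascript
class Pool {
  constructor(create, reset) {
    this.create = create;    // builds a new object when the pool is empty
    this.reset = reset;      // clears state before an object is reused
    this.idle = [];          // objects ready to be handed out
    this.leased = new Set(); // objects currently in use
  }

  acquire() {
    const obj = this.idle.pop() ?? this.create();
    this.leased.add(obj);
    return obj;
  }

  release(obj) {
    // Set.prototype.delete returns false if obj was not leased.
    if (!this.leased.delete(obj)) {
      throw new Error('object is not leased (double release?)');
    }
    this.reset(obj);
    this.idle.push(obj);
  }

  // Approximates a C# `using` block: the object goes back to the pool
  // even if `fn` throws.
  use(fn) {
    const obj = this.acquire();
    try {
      return fn(obj);
    } finally {
      this.release(obj);
    }
  }
}

const vectors = new Pool(
  () => ({ x: 0, y: 0 }),
  (v) => {
    v.x = 0;
    v.y = 0;
  },
);

const length = vectors.use((v) => {
  v.x = 3;
  v.y = 4;
  return Math.hypot(v.x, v.y);
});
```

This only works when the object's lifetime matches a single call, which is one of the downsides of the try/finally approach.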

These examples are simplified but are both based on real world
applications I've personally worked on where the listed issues caused
us real grief - crashes and leaks from buggy manual lifetime
management, inferior performance, etc.

> I'm not part of TC39, but I'm largely opposed to anything that makes GC
> observable. It introduces a source of non-determinism; that is the kind of
> things that brings bugs that you observe in production, but unfortunately
> didn't notice and can't reproduce in development environment. Or if you
> observe them when running the program, you don't observe it in debugging
> mode.

My argument here is not that non-determinism is good. My argument is
that an application that runs non-deterministically in every web
browser (because it's a JavaScript application) is superior to an
application that deterministically doesn't run in any web browser
because the application cannot be expressed accurately in JS. It is
possible that the set of these applications is a small set, but it
certainly seems of considerable size to me because I encounter these
problems on a regular basis. The developers that I speak to who are
building these applications are being forced to choose Native Client
or Emscripten because their applications are not expressible in JS.

I'm personally developing a compiler that targets JS and the lack of
weak references (or RAII/'using') dramatically limits the set of
programs I can actually convert to JS because there are lots of
applications out there that simply need this functionality. If this is
something that can't be done in JS, or isn't possible until ES7/ES8, I
understand, but I would be very disappointed if the only reasons for
it are the hypothetical dread spectres of non-determinism and
information leaks.

Thanks,
-kg

Re: What is the status of Weak References?

David Bruant-5
On 01/02/2013 12:21, Kevin Gadd wrote:

> On Fri, Feb 1, 2013 at 2:06 AM, David Bruant <[hidden email]> wrote:
>> I don't understand the connection between the lack of weak references and
>> emulating a heap in a typed array.
> For an algorithm that needs weak references to be correct, the only
> way to implement that algorithm in JavaScript is to stop using the JS
> garbage collector and write your own collector. This is basically the
> model used by Emscripten applications compiled from C++ to JS - you can use a
> C++ weak reference type like boost::weak_ptr, but only because the
> entire application heap is stored inside of a typed array and not
> exposed to the JS garbage collector. This is great from the
> perspective of wanting near-native performance, because there are JS
> runtimes that can turn this into incredibly fast native assembly, but
> the resulting code barely looks like JavaScript and has other
> disadvantages, so that is why I bring it up - weakref support in JS
> would make it possible to express these algorithms in hand-written,
> readable, debuggable JS.
Sorry for repeating myself, but I still don't see the connection between
the lack of weak references and emulating a heap in a typed array.
Phrased as a question:
Would it be possible to compile a C++ program to JS with weakrefs
without emulating a heap in a typed array? Because of pointer
arithmetic, I doubt it, but I'm curious to learn if that's the case.

>> Garbage collectors have evolved and cycles aren't an issue any longer, weak
>> references or not.
> Cycles are absolutely an issue, specifically because JS applications
> can interact with systems that are not wholly managed by the garbage
> collector. The problem in this case is a cycle being broken *too
> early* because the application author has to manually break cycles. To
> present a couple simple examples:
>
> I have a top-level application object that manages lower-level 'mode'
> objects representing screens in the application. The screens, when
> constructed, attach event listeners to the application object. Because
> the application manages modes, it needs to have a list of all the
> active modes.
> * The event handler closures can accidentally (or intentionally)
Last I heard, it's very difficult to accidentally capture a reference in
a closure because modern engines check which objects are actually used
(looking at variable names), so for an object to be captured in a
closure, it has to be used. So "intentionally".

> capture the mode object, creating a real cycle involving a dead mode
> that will not be collected by even the most sophisticated GC.
The problem is not about cycles. It's about improperly holding references
to objects.

> * If I am not extremely cautious, when a mode is destroyed I might
> forget (or fail) to remove its associated event handlers from the
> event handler list, causing the event handler lists to grow over time
> and eventually degrade the performance of the entire application.
> * I have to explicitly decide when a mode has become dead
Yes. I would say "understand" rather than "decide", but yes. And that's
a very important point that most developers ignore or forget. GC is an
undecidable problem, meaning that there will always be cases where a
human being needs to figure out at which point in its lifecycle an object
is no longer needed, and either free it in languages where that's possible
or make it collectable in languages with a GC. There will be such cases
even in languages that have weak references.
Nowadays, making an object collectable means cutting all references
(even if the object is not involved in a cycle!) that the mark-and-sweep
algorithm (as far as I know, all modern engines use this algorithm)
would traverse.
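
As an illustration of "cutting all references" by hand, here is a minimal sketch; `Emitter` and `Mode` are hypothetical names, not code from either poster. No cycle is involved: the emitter's handler set alone keeps the mode reachable until the handler is removed.

```javascript
class Emitter {
  constructor() {
    this.handlers = new Set();
  }

  // Returns a function that removes the handler again.
  on(handler) {
    this.handlers.add(handler);
    return () => this.handlers.delete(handler);
  }

  emit(event) {
    for (const handler of this.handlers) handler(event);
  }
}

class Mode {
  constructor(app) {
    this.events = [];
    // The closure uses `this`, so the application's handler set keeps the
    // mode reachable for as long as the handler stays registered.
    this.unsubscribe = app.on((event) => this.events.push(event));
  }

  // The developer has to know when the mode is finished and call this.
  destroy() {
    this.unsubscribe();
  }
}

const application = new Emitter();
const menuMode = new Mode(application);

application.emit('tick');
menuMode.destroy();
application.emit('tock'); // not delivered: the handler is gone
```

After `destroy()`, the application no longer references the mode, so it becomes collectable as soon as the caller drops `menuMode` too.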


> In this scenario, weak references are less essential but still
> tremendously valuable: An event handler list containing weak
> references would never form a cycle, and would continue to work
> correctly as long as the mode is alive. It is also trivial to prune
> 'dead' event handlers from a list of weak event handlers.
When does the GC decide to prune dead event handlers? Randomly? Or maybe
when you've performed some action that means the corresponding mode is
dead?

> The need to
> explicitly tag a mode as dead and break cycles (potentially breaking
> ongoing async operations like an XHR) goes away because any ongoing
> async operations will keep the object itself alive (even if it has
> been removed from the mode list), allowing it to be eventually
> collected when it is safe (because the GC can prove that it is safe).
>
> I decide to build a simple pool allocator for some frequently used JS
> objects, because JS object construction is slow. This is what
> optimization guides recommend.
Are these guides aware of bump allocators? Or that keeping objects alive
longer than they should be puts pressure on generational garbage collectors?

> I pull an object instance out of the
> pool and use it for a while, and return it to the pool.
> * If I forget to return an object to the pool when I'm done with it,
> it gets collected and eventually the pool becomes empty.
> * If I mistakenly return an object to the pool when it actually
> escaped into a global variable, object attribute, or closure, now the
> state of the object may get trampled over if it leaves the pool again
> while it's still in use.
> * If I mess up my pool management code I might return the same object
> to the pool twice.
I'm sorry, but all your examples are "if I forget, if I make a
mistake...". I don't think bugs are a good justification for adding
new features to a language. If you really care about memory, make your
algorithms right, spend the necessary time to understand the lifecycle
of your own objects to understand when to release them.
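
To make the failure modes in the quoted list concrete, here is a minimal sketch of such a pool with one cheap guard against the double-return case. `ObjectPool`, `acquire` and `release` are hypothetical names chosen for illustration, not taken from any library mentioned in this thread, and the guard does nothing for the escaped-reference case.

```javascript
// Hypothetical minimal object pool; all names are illustrative only.
class ObjectPool {
  constructor(factory, reset) {
    this.factory = factory;   // creates a fresh object when the pool is empty
    this.reset = reset;       // restores an object to its initial state
    this.free = [];           // objects ready to be handed out
    this.inUse = new Set();   // objects currently checked out
  }

  acquire() {
    const obj = this.free.length > 0 ? this.free.pop() : this.factory();
    this.inUse.add(obj);
    return obj;
  }

  release(obj) {
    // Guards against returning the same object twice, or a foreign object.
    if (!this.inUse.delete(obj)) {
      throw new Error("object is not checked out of this pool");
    }
    this.reset(obj);
    this.free.push(obj);
  }
}

const pool = new ObjectPool(
  () => ({ x: 0, y: 0 }),
  (p) => { p.x = 0; p.y = 0; }
);
const point = pool.acquire();
point.x = 5;
pool.release(point);
```

Note that the `inUse` set holds checked-out objects strongly, so a forgotten `release` now leaks instead of silently draining the pool. That trade-off is exactly the lifetime question being argued here.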

> In this scenario, weak references would allow you to make the pool
> implementation wholly automatic (though that would require the ability
> to resurrect collected objects - I'm not necessarily arguing for that
> feature). I should point out that this scenario is complicated by JS's
> lack of an equivalent to RAII lifetime management in C++ and the
> 'using' block in C# (you can vaguely approximate it with try/finally
> but doing so has many serious downsides) - given RAII or a 'using'
> equivalent, you could manually ref-count pool entries instead of using
> weakrefs. But I hope you can see the general gist here of solving a
> problem the GC should be solving?
>
> These examples are simplified but are both based on real world
> applications I've personally worked on where the listed issues caused
> us real grief - crashes and leaks from buggy manual lifetime
> management, inferior performance, etc.
>
>> I'm not part of TC39, but I'm largely opposed to anything that makes GC
>> observable. It introduces a source of non-determinism; that is the kind of
>> things that brings bugs that you observe in production, but unfortunately
>> didn't notice and can't reproduce in development environment. Or if you
>> observe them when running the program, you don't observe it in debugging
>> mode.
> My argument here is not that non-determinism is good. My argument is
> that an application that runs non-deterministically in every web
> browser (because it's a JavaScript application) is superior to an
> application that deterministically doesn't run in any web browser
> because the application cannot be expressed accurately in JS.
:-) Interesting argument.

> It is
> possible that the set of these applications is a small set, but it
> certainly seems of considerable size to me because I encounter these
> problems on a regular basis. The developers that I speak to who are
> building these applications are being forced to choose Native Client
> or Emscripten because their applications are not expressible in JS.
I don't know enough languages to tell, but I wonder up to what point JS
should import features from other languages for the sake of porting programs.
Where is the JS equivalent of Scala actors? There are probably some
very interesting Scala programs to port to the web?

> I'm personally developing a compiler that targets JS and the lack of
> weak references (or RAII/'using') dramatically limits the set of
> programs I can actually convert to JS because there are lots of
> applications out there that simply need this functionality.
ES6 introduces revokable proxies [1] which could be used to implement
"explicit weakrefs" (you need to say explicitly when you don't want to
use an object anymore).
One idea would be to add some source annotations to tell at a coarse
level when some object is guaranteed to be not needed anymore. It would
compile to revoking the proxy.
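
A small sketch of the "explicit weakref" idea, using `Proxy.revocable` as it was eventually standardized in ES2015. The object and variable names are illustrative assumptions; a compiler would emit the `revoke()` call at the point where the source annotation says the object is no longer needed.

```javascript
// The object we want to be able to "let go of" explicitly.
const target = { name: "texture-42" };

// Proxy.revocable returns the proxy plus a function that severs it.
const { proxy, revoke } = Proxy.revocable(target, {});

console.log(proxy.name); // forwarded to the target

// Explicitly declare that the object is no longer needed.
revoke();

let threw = false;
try {
  proxy.name; // any operation on a revoked proxy throws a TypeError
} catch (e) {
  threw = e instanceof TypeError;
}
```

Once revoked, the proxy no longer refers to its target, so holders of the proxy do not keep the target alive.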

> If this is
> something that can't be done in JS, or isn't possible until ES7/ES8, I
> understand, but I would be very disappointed if the only reasons for
> it are the hypothetical dread spectres of non-determinism and
> information leaks.
Each of these reasons seems to be valid to me.

David

[1] http://wiki.ecmascript.org/doku.php?id=strawman:revokable_proxies
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss

Re: What is the status of Weak References?

Brandon Benvie
In reply to this post by David Bruant-5
It's not possible to polyfill or emulate weak references (or WeakMaps for that matter) in JS without completely disengaging from using the host JS engine's object model. In Continuum, for example, I can get close but not quite all the way to emulating WeakMaps using a meta-interpretive approach to the object model (each object in the VM corresponds to one or more objects in the host engine). Eventually I'll switch to using my own typedarray-backed heap with a GC in order to fully realize the semantics of WeakMaps.




Re: What is the status of Weak References?

David Bruant-5
Le 01/02/2013 15:53, Brandon Benvie a écrit :
> It's not possible to polyfill or emulate weak references (or WeakMaps
> for that matter) in JS without completely disengaging from using the
> host JS engine's object model.
For all practical purposes, you've done such a thing yourself
https://github.com/Benvie/WeakMap ;-)

> In Continuum, for example, I can get close but not quite all the way
> to emulating WeakMaps using a meta-interpretive approach to the object
> model (each object in the VM corresponds to one or more objects in the
> host engine). Eventually I'll switch to switching to using my own
> typedarray-backed heap with a GC in order to fully realize the
> semantics of WeakMaps.
There are probably minor limitations to your above implementation, but I
wonder if they're worth the trouble to move to a typedarray heap. It's
your call not mine anyway :-)
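
For readers wondering how a WeakMap can be approximated in ES5 at all: one common strategy, sketched here as an assumption about the general approach rather than a description of the linked code, is to store each value on the key object itself under a hidden, non-enumerable property, so that the value lives and dies with the key. `ShimWeakMap` is a hypothetical name.

```javascript
// Hypothetical ES5-style WeakMap approximation: values are stored on the key.
var nextId = 0;

function ShimWeakMap() {
  // Each map gets its own hidden property name on the keys it touches.
  this._slot = "__shimWeakMap" + (nextId++);
}

ShimWeakMap.prototype.set = function (key, value) {
  if (Object(key) !== key) throw new TypeError("key must be an object");
  // Non-enumerable so the slot does not show up in for-in or Object.keys.
  Object.defineProperty(key, this._slot, {
    value: value, enumerable: false, configurable: true, writable: true
  });
  return this;
};

ShimWeakMap.prototype.has = function (key) {
  return Object(key) === key &&
    Object.prototype.hasOwnProperty.call(key, this._slot);
};

ShimWeakMap.prototype.get = function (key) {
  return this.has(key) ? key[this._slot] : undefined;
};

ShimWeakMap.prototype.delete = function (key) {
  return this.has(key) && delete key[this._slot];
};

var cache = new ShimWeakMap();
var node = {};
cache.set(node, "metadata");
```

In this sketch a value stays attached to its key even after the map itself is dropped, and non-extensible keys cannot be used at all. Those are the kinds of limitations a native WeakMap does not have.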

David

Re: What is the status of Weak References?

Brandon Benvie
Indeed, and Continuum currently uses the same strategy for implementing WeakMap. To my knowledge, the circumstances that make that implementation of WeakMap fail are unlikely to arise except in the case of a membrane. As a polyfill provided to developers developing with ES5 as a target, this is acceptable since a membrane requires an implementation of Proxy. But since Continuum implements Proxy, this sets up the one case where the two (the lack of native WeakMaps, and Proxy) meet.





Re: What is the status of Weak References?

Brandon Benvie
And by fail I mean result in a set of objects that never gets collected by the GC, despite being otherwise fit to be collected.






RE: What is the status of Weak References?

Nathan Wall
In reply to this post by David Bruant-5
David Bruant wrote:
> >> David Bruant wrote:
> >> Garbage collectors have evolved and cycles aren't an issue any longer, weak
> >> references or not.
> >
> > Kevin Gadd wrote:
> > Cycles are absolutely an issue, specifically because JS applications
> > can interact with systems that are not wholly managed by the garbage
> > collector. The problem in this case is a cycle being broken *too
> > early* because the application author has to manually break cycles. To
> > present a couple simple examples:
> >
> > I have a top-level application object that manages lower-level 'mode'
> > objects representing screens in the application. The screens, when
> > constructed, attach event listeners to the application object. Because
> > the application manages modes, it needs to have a list of all the
> > active modes.
> > * The event handler closures can accidentally (or intentionally)
>
> Last I heard, it's very difficult to accidentally capture a reference in
> a closure because modern engines check which objects are actually used
> (looking at variable names), so for an object to be captured in a
> closure, it has to be used. So "intentionally".

We had a situation recently where we needed to monitor an element with `setInterval` to get information about when it was resized or moved.  As library authors we wanted to encapsulate this logic into the module so that it would "just work".  We wanted someone to be able to call `var widget = new Widget();`, attach it to the document, and have it automatically size itself based on certain criteria. If a developer then moved its position in the document (using purely DOM means), we wanted it to resize itself automatically again. We didn't want to make a requirement to call a public `resize` method, nor did we want to impose `dispose` (it's an easy thing to forget to call and it doesn't feel like JavaScript).  Of course, strongly referencing the element in the `setInterval` keeps it alive in memory even after the developer using the library has long since discarded it.

In this case, we managed to come up with a solution to refer to elements "weakly" through selectors, retrieving them out of the document only when they're attached (we have a single `setInterval` that always runs, but it allows objects to be GC'd).  However, this solution is far from fool-proof, lacks integrity (any element can mimic our selectors and cause us grief), and is not performant.  In our case it's good enough, but I can imagine a case where it wouldn't be.  I can also imagine a case where you wouldn't have the luxury of using DOM traversal as a "weak" mechanism for referring to objects.  I think it could be useful internally in library components which make use of 3rd party components (in this case the DOM) to be able to monitor aspects of those components only when they're being consumed.

Having said that, I also understand the desire to keep the language deterministic and to not expose GC operations.

Nathan


Re: What is the status of Weak References?

David Bruant-5
Le 02/02/2013 06:41, Nathan Wall a écrit :
David Bruant wrote:
> >> David Bruant wrote:
> >> Garbage collectors have evolved and cycles aren't an issue any longer, weak
> >> references or not.
> >
> > Kevin Gadd wrote:
> > Cycles are absolutely an issue, specifically because JS applications
> > can interact with systems that are not wholly managed by the garbage
> > collector. The problem in this case is a cycle being broken *too
> > early* because the application author has to manually break cycles. To
> > present a couple simple examples:
> >
> > I have a top-level application object that manages lower-level 'mode'
> > objects representing screens in the application. The screens, when
> > constructed, attach event listeners to the application object. Because
> > the application manages modes, it needs to have a list of all the
> > active modes.
> > * The event handler closures can accidentally (or intentionally)
>
> Last I heard, it's very difficult to accidentally capture a reference in
> a closure because modern engines check which objects are actually used
> (looking at variable names), so for an object to be captured in a
> closure, it has to be used. So "intentionally".

> We had a situation recently where we needed to monitor an element with `setInterval` to get information about when it was resized or moved.  As library authors we wanted to encapsulate this logic into the module so that it would "just work".  We wanted someone to be able to call `var widget = new Widget();`, attach it to the document, and have it automatically size itself based on certain criteria. If a developer then moved its position in the document (using purely DOM means), we wanted it to resize itself automatically again. We didn't want to make a requirement to call a public `resize` method, nor did we want to impose `dispose` (it's an easy thing to forget to call and it doesn't feel like JavaScript).  Of course, strongly referencing the element in the `setInterval` keeps it alive in memory even after the developer using the library has long since discarded it.
Since we're discussing the addition of a new feature, let's first start to see how existing or about-to-exist features can help us solve the same problem.

In an ES6 world, new Widget() can return a proxy and you, as the widget library author, can track down anytime the element is moved and resized (the handler will probably have to do some unwrapping, function binding, etc, but that's doable).
DOM mutation observers [1] can be of some help to track down this, I think.

Hmm... For a couple of years now I've had the intuition that events should be considered part of an object's interface and not some sort of external thing, and I think the justification is right here.
Polling with setInterval, as you do, forces you to create a function unrelated to the object you want to observe and to keep a reference to that object in that function.
If you were able to attach an event listener to the object itself to be notified of just what you need, the observer function would die as soon as the object it's attached to died.

In your particular case, events at the object level would solve your problem, I think.
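
A sketch of what "events at the object level" could look like, outside the DOM. `Observable`, `on` and `emit` are hypothetical names. The point illustrated is only that the listener is stored on the observed object, so no timer or global list holds it.

```javascript
// Hypothetical sketch: listeners are stored on the observed object itself,
// so nothing outside the object keeps them (or the object) alive.
class Observable {
  constructor() {
    this.listeners = new Map(); // event name -> array of callbacks
  }

  on(type, callback) {
    if (!this.listeners.has(type)) this.listeners.set(type, []);
    this.listeners.get(type).push(callback);
  }

  emit(type, detail) {
    for (const callback of this.listeners.get(type) || []) {
      callback(detail);
    }
  }
}

class Widget extends Observable {
  constructor() {
    super();
    this.width = 0;
    // The widget reacts to its own "moved" event; no timer, no global list.
    this.on("moved", (parentWidth) => { this.width = parentWidth / 2; });
  }
}

const widget = new Widget();
widget.emit("moved", 800); // whoever moves the widget fires the event
```

The open question this sketch does not answer is who fires the event. For DOM elements that has to come from the platform.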


[answering separately]
> nor did we want to impose `dispose` (it's an easy thing to forget to call and it doesn't feel like JavaScript)
I'd like to repeat something I wrote in another message: "...a very important point that most developers ignore or forget. GC is an undecidable problem, meaning that there will always be cases where a human being needs to figure out when in the object lifecycle it is no longer needed and either free it in languages where that's possible or make it collectable in languages with a GC. There will be such cases even in languages where there are weak references."
And when such a case will be found, what will be the solution? Adding a new subtler language construct which exposes a bit more of the GC?

JavaScript has a history of being the language of the client side where a web page lives for a couple of minutes; leaks were largely unnoticeable because navigating or closing a tab would make the content collectable (well... except in crappy versions of IE in which JS content could make browser-wide leaks -_-#).
As soon as we have long-running JavaScript, we have to start caring more about our memory usage, we have to question what we assumed/knew of JavaScript. The GC does maybe 80-100% of the job in well-written complex code, but we must never forget that the GC only does an approximation of an undecidable problem.
In applications where memory matters a lot, maybe a protocol like .dispose will become necessary.
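
As a rough illustration of such a protocol applied to the polling widget discussed above, here is a sketch with hypothetical names (`PollingWidget`, `measure`). The `try`/`finally` is one way for the caller to make sure `dispose` runs.

```javascript
// Hypothetical widget that polls on a timer and exposes an explicit dispose().
class PollingWidget {
  constructor(measure, intervalMs) {
    this.lastSize = null;
    this.disposed = false;
    // The timer callback closes over `this`, so the timer keeps the widget
    // alive until clearInterval is called.
    this.timer = setInterval(() => { this.lastSize = measure(); }, intervalMs);
  }

  dispose() {
    if (this.disposed) return; // idempotent: safe to call more than once
    clearInterval(this.timer);
    this.timer = null;
    this.disposed = true;
  }
}

const polling = new PollingWidget(() => 42, 1000);
try {
  // ... use the widget ...
} finally {
  polling.dispose(); // the caller states when the widget's life ends
}
```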


> In this case, we managed to come up with a solution to refer to elements "weakly" through selectors, retrieving them out of the document only when they're attached (we have a single `setInterval` that always runs, but it allows objects to be GC'd).  However, this solution is far from fool-proof, lacks integrity (any element can mimic our selectors and cause us grief), and not performant.  In our case it's good enough, but I can imagine a case where it wouldn't be.  I can also imagine a case where you wouldn't have the luxury to use DOM traversal as a "weak" mechanism for referring to objects.  I think it could be useful internally in library components which make use of 3rd party components (in this case the DOM) to be able to monitor aspects of those components only when they're being consumed.
Do you mean events? :-)

> Having said that, I also understand the desire to keep the language deterministic and to not expose GC operations.
About weakrefs, I've read a little bit [2][3] and I'm puzzled by one thing: the return value of get is a strong reference, so if a misbehaving component keeps this strong reference around, having passed a weak reference was pointless.
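
The concern can be shown with the `WeakRef` class that was eventually standardized in ES2021, where the strawman's `get` is spelled `deref`. This is only a sketch; `misbehave` and `hoarded` are illustrative names.

```javascript
let resource = { payload: new Array(1000).fill(0) };
const weak = new WeakRef(resource);

// A "misbehaving component" that stores whatever deref() hands back.
const hoarded = [];
function misbehave(ref) {
  const strong = ref.deref(); // a normal, strong reference (or undefined)
  if (strong !== undefined) hoarded.push(strong);
}

misbehave(weak);
resource = null; // the owner lets go...

// ...but the object is still strongly reachable through `hoarded`,
// so the collector will not reclaim it while that array holds it.
const stillThere = hoarded[0] === weak.deref();
```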

David

[1] https://developer.mozilla.org/en-US/docs/DOM/MutationObserver
[2] http://wiki.ecmascript.org/doku.php?id=strawman:weak_references
[3] http://weblogs.java.net/blog/2006/05/04/understanding-weak-references


Re: What is the status of Weak References?

Tom Van Cutsem-3
2013/2/2 David Bruant <[hidden email]>
> About weakrefs, I've read a little bit [2][3] and I'm puzzled by one thing: the return value of get is a strong reference, so if a misbehaving component keeps this strong reference around, having passed a weak reference was pointless.

For use cases where you're passing a reference to some plug-in/component and want the referred-to object to be eventually collected, we have revocable proxies. Weak references aren't the right tool when you want to express the guarantee that the component can no longer hold onto the object.

Cheers,
Tom


Re: What is the status of Weak References?

David Bruant-5
Le 02/02/2013 15:32, Tom Van Cutsem a écrit :
> 2013/2/2 David Bruant <[hidden email]>
>> About weakrefs, I've read a little bit [2][3] and I'm puzzled by one thing: the return value of get is a strong reference, so if a misbehaving component keeps this strong reference around, having passed a weak reference was pointless.
>
> For use cases where you're passing a reference to some plug-in/component and want the referred-to object to be eventually collected, we have revocable proxies. Weak references aren't the right tool when you want to express the guarantee that the component can no longer hold onto the object.
Indeed, it makes weak references a tool only useful within a trust boundary (when you don't need to share the object reference with an untrusted 3rd party).

Interestingly, revocable proxies require their creator to think about the lifecycle of the object to the point where they know when the object shouldn't be used anymore by whoever they shared the proxy with. I feel this is exactly the same reflection that is needed to understand when an object isn't needed anymore within a trust boundary... seriously questioning the need for weak references.

David


Re: What is the status of Weak References?

Brendan Eich-3
David Bruant wrote:
> Interestingly, revocable proxies require their creator to think to the
> lifecycle of the object to the point where they know when the object
> shouldn't be used anymore by whoever they shared the proxy with. I
> feel this is the exact same reflections that is needed to understand
> when an object isn't needed anymore within a trust boundary...
> seriously questioning the need for weak references.

Sorry, but this is naive. Real systems such as COM, XPCOM, Java, and C#
support weak references for good reasons. One cannot do "data binding"
transparently without either making a leak or requiring manual dispose
(or polling hacks), precisely because the lifecycle of the model and
view data are not known to one another, and should not be coupled.

See http://wiki.ecmascript.org/doku.php?id=strawman:weak_refs intro, on
the observer and publish-subscribe patterns.
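
For illustration, a sketch of the publish-subscribe case with weakly held observers, written against the `WeakRef` class as later standardized in ES2021 (`deref` returns `undefined` once the target has been collected). `Subject`, `subscribe` and `notify` are hypothetical names, and pruning happens lazily on each notification.

```javascript
// Hypothetical subject that holds its observers weakly.
class Subject {
  constructor() {
    this.refs = []; // WeakRef-like objects exposing deref()
  }

  subscribe(observer) {
    this.refs.push(new WeakRef(observer));
  }

  notify(value) {
    const live = [];
    for (const ref of this.refs) {
      const observer = ref.deref();
      if (observer !== undefined) {
        observer.update(value);
        live.push(ref);       // keep only refs whose target still exists
      }
    }
    this.refs = live;         // dead entries are pruned as a side effect
  }
}

const model = new Subject();
const view = { seen: [], update(v) { this.seen.push(v); } };
model.subscribe(view);
model.notify("changed");
```

The model never needs to know when the view goes away, and the view never needs to unsubscribe. When an observer actually disappears from the list depends on the collector, which is the non-determinism debated in this thread.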

/be

Re: What is the status of Weak References?

David Bruant-5
Le 02/02/2013 20:02, Brendan Eich a écrit :
> David Bruant wrote:
>> Interestingly, revocable proxies require their creator to think to
>> the lifecycle of the object to the point where they know when the
>> object shouldn't be used anymore by whoever they shared the proxy
>> with. I feel this is the exact same reflections that is needed to
>> understand when an object isn't needed anymore within a trust
>> boundary... seriously questioning the need for weak references.
>
> Sorry, but this is naive.
It is, you don't need to apologize.

> Real systems such as COM, XPCOM, Java, and C# support weak references
> for good reasons. One cannot do "data binding" transparently without
> either making a leak or requiring manual dispose (or polling hacks),
> precisely because the lifecycle of the model and view data are not
> known to one another, and should not be coupled.
>
> See http://wiki.ecmascript.org/doku.php?id=strawman:weak_refs intro,
> on the observer and publish-subscribe patterns.
I guess manual dispose would make a lot of sense. A view knows its own
lifecycle, which involves adding observers in a bunch of places. When the
view lifecycle comes to an end for whatever reason, it only makes sense
that it removes the observers it added. My rule of thumb would be "clean
up the mess you made".
Memory leaks are bugs. Like off-by-ones. People should just fix their bugs.
Garbage collectors encourage the fantasy that people can forget about
memory. It is a fantasy. A convenient one, but a fantasy nonetheless. A
fantasy like "we can have a lifestyle that assumes oil is unlimited".
</naivety>
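
A sketch of the "clean up the mess you made" rule, with hypothetical `Model` and `View` classes: the view records an undo function for every observer it adds and runs them all when its lifecycle ends.

```javascript
// Hypothetical model with explicit observe/unobserve.
class Model {
  constructor() { this.observers = new Set(); }
  observe(fn) { this.observers.add(fn); }
  unobserve(fn) { this.observers.delete(fn); }
  change(value) { for (const fn of this.observers) fn(value); }
}

// Hypothetical view that records every observer it adds, then removes them all.
class View {
  constructor() { this.cleanups = []; this.values = []; }

  watch(model) {
    const fn = (value) => this.values.push(value);
    model.observe(fn);
    // Remember how to undo exactly what was just done.
    this.cleanups.push(() => model.unobserve(fn));
  }

  dispose() {
    for (const undo of this.cleanups) undo();
    this.cleanups.length = 0;
  }
}

const m1 = new Model();
const m2 = new Model();
const v = new View();
v.watch(m1);
v.watch(m2);
m1.change("a");
v.dispose();
m1.change("b"); // no longer delivered to the view
```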

<acceptance>
I guess it's just human nature, so weakrefs are pretty much unavoidable.

If a weakref to a function is passed to Object.observe, will it auto-get
the function and unobserve automatically if the .get returns null?

David