A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
Hi friends,

I'm working on a little issue (bug 1245059) and could use some feedback or
thoughts on how best to handle it. The problem is happening on Windows (right
now), but in theory it could happen similarly on other platforms too.

The ground rules:

1. We trigger an internal network change event on IP and network
    interface changes.

2. We make a checksum of all network "adapters" and their IP addresses to
    avoid duplicate events. We also coalesce events to not send them more often
    than once per second.

3. When a change is detected, we want to detect stalled HTTP connections to
    avoid "hangs" and to provide a snappier experience.

4. A "stalled" HTTP/1 connection is detected by not having traffic for N
    seconds. We cannot tell a stalled connection apart from a connection on
    which the server is just very slow to respond and thus leaves an N second
    pause. (N is 5 seconds by default.)
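
A minimal sketch of ground rules 1 and 2, in Python rather than the actual
Gecko code. The names `adapter_checksum` and `ChangeDetector` are made up for
illustration, and SHA-1 is just an assumed stand-in for whatever checksum is
really used:

```python
import hashlib
import time


def adapter_checksum(adapters):
    """Return a stable digest of adapter names and their IP addresses.

    `adapters` maps an adapter name to an iterable of address strings.
    Sorting makes the digest independent of enumeration order.
    """
    digest = hashlib.sha1()
    for name in sorted(adapters):
        digest.update(name.encode("utf-8") + b"\0")
        for address in sorted(adapters[name]):
            digest.update(address.encode("utf-8") + b"\0")
    return digest.hexdigest()


class ChangeDetector:
    """Hypothetical helper: fire at most one change event per interval."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_checksum = None
        self.last_event = None
        self.pending = False

    def observe(self, adapters):
        """Feed a snapshot; return True if a change event should fire now."""
        checksum = adapter_checksum(adapters)
        if self.last_checksum is None:
            # The first snapshot is only a baseline, not a change.
            self.last_checksum = checksum
            return False
        if checksum != self.last_checksum:
            self.last_checksum = checksum
            self.pending = True
        now = self.clock()
        due = self.last_event is None or now - self.last_event >= self.min_interval
        if self.pending and due:
            self.pending = False
            self.last_event = now
            return True
        return False
```

A change that arrives within the coalescing window stays pending and fires on
the next call to `observe()` after the window has passed.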

The problem:

1. The user uses a slow server that often takes more than N seconds to
    respond.

2. The same user has a (Microsoft Teredo tunneling) network adapter that
    appears and disappears every few minutes (at intervals of 60 to 200
    seconds, it seems), thus triggering network change events fairly often.

3. User gets sad face because Firefox keeps cutting off slow (but working)
    HTTP requests. (There are a few other downsides to these frequent network
    change events, but they're not as visible.)

Additionally:

- Yes, this seems like a broken/strange user setup, but it still happens, and
   it does not cause a (noticeable) problem for the user if Firefox is
   prevented from killing silent HTTP connections.

- We can detect Teredo tunnels by their IP address range, but how does that
   help?
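
For reference, the address-range check itself is small. A sketch in Python,
assuming only the Teredo prefix 2001::/32 from RFC 4380; the function name
`is_teredo` is made up, and this says nothing about what to do once a Teredo
address has been identified:

```python
import ipaddress

# Teredo addresses are allocated from 2001::/32 (RFC 4380).
TEREDO_PREFIX = ipaddress.ip_network("2001::/32")


def is_teredo(address):
    """Return True if the address string falls inside the Teredo prefix."""
    ip = ipaddress.ip_address(address)
    return ip.version == 6 and ip in TEREDO_PREFIX
```

Note that the documentation prefix 2001:db8::/32 is a different /32 and does
not match.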

The bug:

   https://bugzilla.mozilla.org/show_bug.cgi?id=1245059


Any bright ideas?

--

  / daniel.haxx.se
_______________________________________________
dev-tech-network mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-network

Re: A network-change detection and IPv6 quirk - on Windows

Honza Bambas-4
First, how are you detecting slow connections?  Just by not receiving
data?  If so, that is a wrong approach and always was.

You have out-of-band ways to detect this - TCP keep-alive.  I don't
remember if that has been put off the table as uncontrollable on all
platforms or too faulty.  If not, setting the TCP keep-alive timeout
_during the time you are receiving a response_ to something reasonable
(your 5 seconds) will do the job for you very nicely.  And I also think
that 5 seconds is too small an interval.  At least 10 would do better IMO.
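
A rough sketch of what switching keep-alive on only for the duration of a
response could look like at the socket level. This is plain Python, not Necko
code; the helper names and the probe interval/count values are assumptions,
and the TCP_KEEP* options are not available on every platform, hence the
guards:

```python
import socket


def enable_short_keepalive(sock, idle_seconds=10, interval_seconds=1, probes=5):
    """Turn on TCP keep-alive with a short idle time, where supported."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs below are platform-specific, so only set the ones
    # that this Python build exposes.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def disable_keepalive(sock):
    """Turn TCP keep-alive off again, e.g. once the response is complete."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 0)
```

The idea would be to call `enable_short_keepalive()` when the request has been
sent and `disable_keepalive()` when the response is complete.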

Another way is to rate the _response_ and probably also each
_connection_.  You can measure how fast they are receiving and adjust
your timeout accordingly.  Still not a perfect solution, and also neither
simple nor stateless.

In general, I'm very much against application-level timeouts.  I was
never able to find the right value, and whichever value you choose, you
always cut into both the good and the bad cases.


-hb-




Re: A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
On Thu, 25 Feb 2016, Honza Bambas wrote:

> First, how are you detecting slow connections?  Just by not receiving data?
> If so, that is a wrong approach and always was.

I'm not detecting slow connections at all. We attempt to detect stalled HTTP/1
connections where absolutely nothing gets transferred. It can happen when you
move your browser between networks. Like when changing between wifi access
points.

> You have out of bounds ways to detect this - TCP keep alive. I don't
> remember if that has been put off the table as uncontrollable on all
> platforms or too faulty.  If not, setting the TCP keep-alive timeout _during
> the time you are receiving a response_ to something reasonable (your 5
> seconds) will do the job for you very nicely.

Adaptive keep-alive! I like that. Just a tad bit worried it'll introduce other
problems as it'll then make _all_ transfers get that keepalive thing going at
a fairly high frequency and not only the rare ones where we get a network
change. For example the mobile use cases tend to not like keepalive. But maybe
I'm just overly cautious.

> And I also think that 5 seconds is a too small interval.  At least 10 would
> do better IMO.

It was mostly just a pick to get something that would still feel good enough
for truly stalled connections and yet have a fairly low risk of actually
interfering with a slow-to-respond server.

Anything in particular that makes you say that 10 is better? Why not 8 or 12?

> Other way is to rate the _response_ and probably also each _connection_. You
> can measure how fast they are receiving and adjust your timeout according
> it. Still not a perfect solution and also not simple and also stateful.

From my experience, the slowest responses take a long while for the server to
start responding and in those cases we'd have 0 bytes transferred for a long
while and thus nothing to adapt the timeout to at that point.

--

  / daniel.haxx.se

Re: A network-change detection and IPv6 quirk - on Windows

Honza Bambas-4
On 2/25/2016 14:41, Daniel Stenberg wrote:
> On Thu, 25 Feb 2016, Honza Bambas wrote:
>
>> First, how are you detecting slow connections?  Just by not receiving
>> data?  If so, that is a wrong approach and always was.
>
> I'm not detecting slow connections at all. We attempt to detect stalled
> HTTP/1 connections where absolutely nothing gets transferred.

OK, and how do you recognize it?  From the problem you described it
seemed like you are checking on intervals between chunks you receive in
socket transport or connection or transaction.  That is the "not
receiving data" approach I refer to.  That is the approach that never
works...

> It can happen when you move your browser between networks. Like when
> changing between wifi access points.
>
>> You have out of bounds ways to detect this - TCP keep alive. I don't
>> remember if that has been put off the table as uncontrollable on all
>> platforms or too faulty.  If not, setting the TCP keep-alive timeout
>> _during the time you are receiving a response_ to something
>> reasonable (your 5 seconds) will do the job for you very nicely.
>
> Adaptive keep-alive! I like that. Just a tad bit worried it'll
> introduce other problems as it'll then make _all_ transfers get that
> keepalive thing going at a fairly high frequency

It doesn't need to be a super high frequency.  You would only enable it
between sending the request and receiving the last bit of the response
(i.e. until it is complete).  You can also engage it on open connections
only after a network change has been detected and only for a short period
of time, but I would like to have such a mechanism engaged all the time.
Connections may drop not just because your adapter configuration changes.

> and not only the rare ones where we get a network change. For example
> the mobile use cases tend to not like keepalive. But maybe I'm just
> overly cautious.

On idle connections I would return to the normal timeout logic we have (I
think we don't use TCP keep-alive at all on idle connections).

>
>> And I also think that 5 seconds is a too small interval.  At least 10
>> would do better IMO.
>
> It was mostly just a pick to get something that would still feel good
> enough for truly stalled connections and yet have a fairly low risk of
> actually interfering with a slow-to-respond server.
>
> Anything in particular that makes you say that 10 is better? Why not 8
> or 12?

Yeah :)  that's it!  nobody knows what this should be...

>
>> Other way is to rate the _response_ and probably also each
>> _connection_. You can measure how fast they are receiving and adjust
>> your timeout according it. Still not a perfect solution and also not
>> simple and also stateful.
>
> From my experience, the slowest responses take a long while for the
> server to start responding and in those cases we'd have 0 bytes
> transferred for a long while and thus nothing to adapt the timeout to at
> that point.
>

Yep, imperfect solution, right.  TCP keep-alive would catch this, though.


-hb-



Re: A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
On Thu, 25 Feb 2016, Honza Bambas wrote:

>> Adaptive keep-alive! I like that. Just a tad bit worried it'll introduce
>> other problems as it'll then make _all_ transfers get that keepalive thing
>> going at a fairly high frequency
>
> It doesn't need to be a super high frequency.  You will only enable it
> between request sending and last response bit receive (complete).

Maybe we should then do this on each network-change-event?

1. read out the TCP keepalive state from the HTTP/1 connection. If it is
    "short-lived", we let the default keepalive handling take care of itself
    and we're done. If it is not short-lived, continue.

2. call nsHttpConnection::StartShortLivedTCPKeepalives() (which seems to have
    a default 'idle_time' set to 10 seconds)

3. set a timeout that is ShortLivedTCPKeepalive interval + some time

4. when the timeout triggers, restore the original keepalive state from (1)
    (which I presume is either long-lived or disabled)

It should at least kill fewer connections that are alive and actually in use,
although it will be slower to kill connections that are actually dead.
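
The four steps can be modelled roughly as below. This is a Python toy model,
not the real nsHttpConnection logic: the `Connection` class, the `Keepalive`
states and the `margin` value are all made up for illustration, and the timer
is left to the caller.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Keepalive(Enum):
    DISABLED = "disabled"
    SHORT_LIVED = "short-lived"
    LONG_LIVED = "long-lived"


@dataclass
class Connection:
    """Hypothetical stand-in for an HTTP/1 connection's keepalive state."""
    keepalive: Keepalive = Keepalive.DISABLED
    saved: Optional[Keepalive] = None


def on_network_change(conn, now, short_interval=10.0, margin=2.0):
    """Steps 1-3: return the restore deadline, or None if nothing to do."""
    if conn.keepalive is Keepalive.SHORT_LIVED:
        return None  # step 1: already short-lived, leave it alone
    conn.saved = conn.keepalive               # step 1: remember old state
    conn.keepalive = Keepalive.SHORT_LIVED    # step 2: go short-lived
    return now + short_interval + margin      # step 3: when to restore


def on_timeout(conn):
    """Step 4: restore the keepalive state saved in step 1."""
    if conn.saved is not None:
        conn.keepalive = conn.saved
        conn.saved = None
```

A connection that dies during the window is then reported by the TCP stack
through the keepalive probes instead of by an application-level timer.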

--

  / daniel.haxx.se

Re: A network-change detection and IPv6 quirk - on Windows

Patrick McManus
In reply to this post by Daniel Stenberg
Network link detection is a signal that drives a remedy, i.e. we detect that
the network has changed and we do something about it.

The discussion here should be about the signal (that the network has
changed) - if the remedy needs to be tweaked (which I'm not convinced of)
we can do that separately. Remember that there are quite a few things that
get done as part of the remedy not even mentioned here and the real issue
here is that the signal is too noisy. So let's talk about the signal.

I don't think we want to just ignore Teredo - a tunnel coming and going is
relevant to the general problem at hand, but it appears the tunnel isn't
relevant to the use case of the reporter.

Here's a basic question - is the address of the tunnel ever used by Necko
in the log? Perhaps we could only hash interfaces with addresses that have
seen use.
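
A sketch of that filtering idea in Python. The function name and the
`used_addresses` set are hypothetical, and how "use" would actually be tracked
is left open here:

```python
def used_adapters(adapters, used_addresses):
    """Keep only adapters that have at least one address seen in use.

    `adapters` maps adapter name -> iterable of address strings;
    `used_addresses` is a set of local addresses that carried traffic.
    """
    return {
        name: sorted(addrs)
        for name, addrs in adapters.items()
        if any(addr in used_addresses for addr in addrs)
    }
```

Hashing only the filtered result means an adapter that never carried traffic
can come and go without changing the hash.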





Re: A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
On Thu, 25 Feb 2016, Patrick McManus wrote:

> The discussion here should be about the signal (that the network has
> changed) - if the remedy needs to be tweaked (which I'm not convinced of) we
> can do that separately. Remember that there are quite a few things that get
> done as part of the remedy not even mentioned here and the real issue here
> is that the signal is too noisy. So let's talk about the signal.
>
> I don't think we want to just ignore teredo - a tunnel coming and going is
> relevant to the general problem at hand, but it appears the tunnel isn't
> relevant to the use case of the reporter.
>
> Here's a basic question - is the address of the tunnel ever used by necko in
> the log? Perhaps we could only hash interfaces with addresses that have seen
> use..

Very good point. The multiple triggers would still be bad even if the HTTP/1
connections survived.

The log I have doesn't reveal that detail, it was limited to detecting what
the network changes were. Since the user didn't even know there was a network
change, my guess is that it wasn't actually used. Of course I can ask about
it, or get more logs.

But then, just because it wasn't used the previous time the interface was
present (and I'm not even sure how we should decide if the interface was
used), I don't think we can assume that it won't be used this time...

--

  / daniel.haxx.se

Re: A network-change detection and IPv6 quirk - on Windows

Patrick McManus
The remedies are backward-looking (flush caches, close connections, etc.),
so the phrasing about the hash was probably too lazy, but perhaps the basic
idea has merit?


Re: A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
On Thu, 25 Feb 2016, Patrick McManus wrote:

> the remedies are backwards looking (flush caches, close connections etc)..
> so the phrasing about the hash was probably too lazy, but perhaps the basic
> idea has merit?

We can certainly use a hash to figure out that we're dealing with a "yo-yo
interface", but what to do with that information is the tricky part.

I'm currently thinking of a few different routes of exploration:

1. Check the routing table (updates) to better figure out when an adapter is
removed without affecting the routing, as it then shouldn't need to cause a
network change. We would then also have to not signal a network change for a
new adapter until it gets routing added to it. I don't really know what the
Teredo adapters get in terms of routing by default - and especially not for
these yo-yo setups.

2. Check data counters for the particular adapter so an unused adapter can be
ignored when removed. This will obviously not affect new adapters so it'll
only address half the problem.

3. Use of a hash to store added and removed adapters and keep them around for
N minutes (I'm thinking 300 seconds to start with). If an adapter is
added/removed and its name is already in the hash, it's a yo-yo and we ignore
it. It is a bit harsh and risky but... We could potentially work on a scheme
where the N value changes over time to adapt.

4. Short-term and the most abrupt: disable (behind a pref) the use of
NotifyIpInterfaceChange. It makes Firefox go back to the old
NotifyAddrChange() method which doesn't have this problem - mostly because it
doesn't support IPv6. Incidentally, Chrome only uses NotifyAddrChange to
detect network changes.
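
Route 3 could be sketched like this in Python. The class name `YoYoFilter` is
made up, the 300 second value is the starting point mentioned above, and
refreshing the timestamp on every event is an assumption about how the scheme
would work:

```python
import time


class YoYoFilter:
    """Hypothetical filter: ignore adapters that changed within `ttl` seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.seen = {}  # adapter name -> time of its last add/remove

    def should_signal(self, adapter_name):
        """Return True if an add/remove of this adapter should be signalled."""
        now = self.clock()
        # Forget adapters whose last change is older than the ttl.
        self.seen = {
            name: stamp
            for name, stamp in self.seen.items()
            if now - stamp < self.ttl
        }
        is_yoyo = adapter_name in self.seen
        # Refresh the timestamp so a persistent yo-yo stays suppressed.
        self.seen[adapter_name] = now
        return not is_yoyo
```

With this shape, an adapter that flaps every 60 to 200 seconds is signalled
once and then ignored until it has been quiet for the full 300 seconds.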

--

  / daniel.haxx.se

Re: A network-change detection and IPv6 quirk - on Windows

Christopher Barry
On Thu, 17 Mar 2016 09:32:28 +0100 (CET)
Daniel Stenberg <[hidden email]> wrote:

>On Thu, 25 Feb 2016, Patrick McManus wrote:
>
>> the remedies are backwards looking (flush caches, close connections
>> etc).. so the phrasing about the hash was probably too lazy, but
>> perhaps the basic idea has merit?  
>
>We can certainly use a hash to figure out that we're dealing with a
>"yo-yo interface", but what to do with that information is the tricky
>part.
>
>I'm currently thinking of a few different routes of exploration:
>
>1. check the routing table (updates) to better figure out when an
>adapter is removed but doesn't affect the routing as then it shouldn't
>need to cause a network change. Then it would also have to not signal
>a network change on a new adapter until it gets routing added to it. I
>don't really know what the Teredo adapters get in terms of routing by
>default - and especially not for these yo-yo setups.
>
>2. check data counters for the particular adapter so an unused adapter
>can be ignored when removed. This will obviously not affect new
>adapters so it'll only address half the problem.
>
>3. Use of a hash to store added and removed adapters and keep them
>around for N minutes (I'm thinking 300 seconds to start with). If an
>adapter is added/removed and its name is already in the hash, it's a
>yo-yo and we ignore it. It is a bit harsh and risky but... We could
>potentially work on a scheme where the N value changes over time to
>adapt.
>
>4. Short-term and the most abrupt: disable (behind a pref) the use of
>NotifyIpInterfaceChange. It makes Firefox go back to the old
>NotifyAddrChange() method which doesn't have this problem - mostly
>because it doesn't support IPv6. Incidentally, Chrome only uses
>NotifyAddrChange to detect network changes.
>

Should all network applications behave in this manner, detecting and
internally adjusting to network modification? Would this not lead to
myriad implementations and redundancies? Is it in fact the domain of
network *applications* to do this?

Isn't this a systems function that should be left to the system itself?

I seem to remember another out-of-scope foray where FF was using
built-in dns server addresses behind the user's back a while ago,
and *bypassing* the specific servers set by the administrator. Is that
still happening too? Why is FF going here? Why does it *need* to? Do
you see a problem not being addressed by the system, are frustrated
that your efforts to help change it there are not being addressed, and
thus are creating these homegrown workarounds? Why is this even an
issue? Why does it actually matter which interface you are
communicating through?

This kind of stuff is out-of-authorized-and-expected-scope for a user
program, and frankly is more than a little creepy. I know others will
share this concern if/when they are aware of it.


--
Regards,
Christopher

Re: A network-change detection and IPv6 quirk - on Windows

Patrick McManus
Hi Christopher,

On Thu, Mar 17, 2016 at 8:38 AM, Christopher Barry <[hidden email]> wrote:

> Should all network applications behave in this manner,

Only Firefox (and Gecko) is in scope here.

> myriad implementations and redundancies? Is it in fact the domain of
> network *applications* to do this?

When necessary to ensure security, performance and usability, applications
have always availed themselves of customizations beyond those provided by
the operating system. Firefox will do so on a case by case basis.


> Isn't this a systems function that should be left to the system itself?
>

That's always a judgment call. I mean we have our own set of PKI roots
bundled in firefox - so we aren't purists on the issue when we think we can
bring value.


> I seem to remember another out-of-scope foray where FF was using
> built-in dns server addresses behind the user's back a while ago,
>

I don't believe that is true - I would be interested in any citation you
had. That being said, we don't do that because at this point in time we
don't think it would be a good user experience. If we had a model where it
added value we would consider doing so openly and likely with user choice.


> still happening too? Why is FF going here? Why does it *need* to? Do
>

Sometimes the underlying services, and even the standards they implement,
have trouble at web scale - especially at the tail.

The issues Daniel is working on are correlated with network mobility -
that's why he is using changes in addressing as a proxy for mobility. This
can manifest itself in some odd ways: TCP can simply stall for minutes (or
hours) before figuring out the connection isn't working any more, and that's
a bad user experience. Your proxy selection may or may not be relevant after
such an event. Same thing with your DNS cache because of split DNS. Some
unauthenticated data that you used in one location might want to be flushed
from cache when you move, etc. We see this when you wake from a sleep with
an open H2 session - an attempt to reuse it can just hang for a long time
without any feedback. Daniel's code provides a framework for when more
aggressive timeouts are appropriate.

And we are actively working on this stuff in the standards space where
applicable.


> This kind of stuff is out-of-authorized-and-expected-scope for a user
> program, and frankly is more than a little creepy. I know others will
> share this concern if/when they are aware of it.
>
>
I don't share your concern that awareness of local addressing is out of
scope for a user space application. Indeed, enforcing security around that
kind of thing is the role of the operating system and we wouldn't attempt
to subvert that. The OS provides an unpriv'd interface to learn about
address changes and we're listening to it - not a big deal.

Re: A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
In reply to this post by Christopher Barry
On Thu, 17 Mar 2016, Christopher Barry wrote:

> Should all network applications behave in this manner, detecting and
> internally adjusting to network modification? Would this not lead to myriad
> implementations and redundancies? Is it in fact the domain of network
> *applications* to do this?

I don't think I can speak for all network applications, but I think all
applications that have long lived connections to remote servers, where the
application travels in and out of networks while running, where there are
network interfaces coming and going and for which networks expose different
"realities" or "views" of the world depending on which network you're on will
benefit from acting on network changes.

By being aware of network changes and adapting to them, we actually improve
the user experience quite significantly: there are fewer stalled connections
that require the user to either wait them out or shift-reload to fix, less
old/wrong content sticks around and survives into the new network when it was
only actually true in the former network, and we increase responsiveness for
proxies (when moving in/out of "proxy controlled" networks).

If we don't react to network changes appropriately, users suffer.

> Isn't this a systems function that should be left to the system itself?

Yes, if that worked it'd be awesome, but as it is now, the stack works in one
way and we need to give it some gentle pushes to act more in ways we like.

> I seem to remember another out-of-scope foray where FF was using built-in
> dns server addresses behind the user's back a while ago, and *bypassing* the
> specific servers set by the administrator. Is that still happening too?

Sorry, I don't know what you're referring to, so I can't comment on that
specific thing. But of course we're not strangers to trying out things in any
area if we think it improves functionality or user experience etc.
Experiments, by their nature, don't always pan out.

> Do you see a problem not being addressed by the system, are frustrated that
> your efforts to help change it there are not being addressed, and thus are
> creating these homegrown workarounds? Why is this even an issue? Why does it
> actually matter which interface you are communicating through?

It isn't important which network interface Firefox is using. We're detecting
that there is a *network change*. Any network change. And when such a change
has happened we act on that information. It is for example very common for
networks to have private DNS spaces, to show different views of the same site
and more depending on which network you are on. And it is also common that
connections get stalled in the actual process of switching networks and the
regular way to detect that is very slow and that slowness leads to frustrated
users.

This is just one of those tiny little details ordinary humans don't need to
bother or care about.

Firefox is by the way not the only browser detecting network changes and
acting on them.

> This kind of stuff is out-of-authorized-and-expected-scope for a user
> program, and frankly is more than a little creepy.

Sorry, I don't understand. Exactly how is this creepy and why? There's really
no magic involved here.

--

  / daniel.haxx.se

Re: A network-change detection and IPv6 quirk - on Windows

Christopher Barry
In reply to this post by Daniel Stenberg
On Thu, 17 Mar 2016 09:32:28 +0100 (CET)
Daniel Stenberg <[hidden email]> wrote:

>On Thu, 25 Feb 2016, Patrick McManus wrote:
>
>> the remedies are backwards looking (flush caches, close connections
>> etc).. so the phrasing about the hash was probably too lazy, but
>> perhaps the basic idea has merit?  
>
>We can certainly use a hash to figure out that we're dealing with a
>"yo-yo interface", but what to do with that information is the tricky
>part.
>
>I'm currently thinking of a few different routes of exploration:
>
>1. check the routing table (updates) to better figure out when an
>adapter is removed but doesn't affect the routing as then it shouldn't
>need to cause a network change. Then it would also have to not signal
>a network change on a new adapter until it gets routing added to it. I
>don't really know what the Teredo adapters get in terms of routing by
>default - and especially not for these yo-yo setups.
>
>2. check data counters for the particular adapter so an unused adapter
>can be ignored when removed. This will obviously not affect new
>adapters so it'll only address half the problem.
>
>3. Use of a hash to store added and removed adapters and keep them
>around for N minutes (I'm thinking 300 seconds to start with). If an
>adapter is added/removed and its name is already in the hash, it's a
>yo-yo and we ignore it. It is a bit harsh and risky but... We could
>potentially work on a scheme where the N value changes over time to
>adapt.
>
>4. Short-term and the most abrupt: disable (behind a pref) the use of
>NotifyIpInterfaceChange. It makes Firefox go back to the old
>NotifyAddrChange() method which doesn't have this problem - mostly
>because it doesn't support IPv6. Incidentally, Chrome only uses
>NotifyAddrChange to detect network changes.
>
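
To make option 3 in the quoted list concrete, here is a minimal sketch of
the "remember recently added/removed adapter names" idea. It is not code
from the bug or from Firefox: the class name `YoYoFilter`, the
`should_signal` method and the choice to refresh the timestamp on every
sighting are assumptions for illustration; only the 300 second starting
value comes from the message above.

```python
import time


class YoYoFilter:
    """Remember adapter names that were recently added or removed."""

    def __init__(self, window=300.0, clock=time.monotonic):
        self.window = window  # seconds to remember a name (the "N" above)
        self.clock = clock
        self.seen = {}        # adapter name -> time of its last add/remove

    def should_signal(self, adapter_name):
        """Return False if the adapter looks like a yo-yo, True otherwise."""
        now = self.clock()
        # Forget names that have been quiet for longer than the window.
        self.seen = {
            name: stamp
            for name, stamp in self.seen.items()
            if now - stamp < self.window
        }
        is_yoyo = adapter_name in self.seen
        # Assumption: every add/remove refreshes the entry, so an adapter
        # that keeps flapping stays suppressed until it has been quiet
        # for a full window.
        self.seen[adapter_name] = now
        return not is_yoyo
```

With these assumptions the first appearance of an adapter still signals a
change, while its removal 90 seconds later does not. That also shows the
"harsh and risky" side: a genuine removal shortly after a genuine add is
ignored too.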

From: Patrick McManus <[hidden email]>
To: Christopher Barry <[hidden email]>
Cc: dev-tech-network <[hidden email]>
Subject: Re: A network-change detection and IPv6 quirk - on Windows
Date: Thu, 17 Mar 2016 10:38:31 -0400

Hi Christopher,

On Thu, Mar 17, 2016 at 8:38 AM, Christopher Barry <
[hidden email]> wrote:  

> Should all network applications behave in this manner,  


Only firefox (and gecko) is in scope here.

> myriad implementations and redundancies? Is it in fact the domain of
> network *applications* to do this?
>
>  
When necessary to ensure security, performance and usability,
applications have always availed themselves of customizations beyond
those provided by the operating system. Firefox will do so on a
case-by-case basis.


> Isn't this a systems function that should be left to the system
> itself?

That's always a judgment call. I mean we have our own set of PKI roots
bundled in firefox - so we aren't purists on the issue when we think we
can bring value.


> I seem to remember another out-of-scope foray where FF was using
> built-in dns server addresses behind the user's back a while ago,
>  

I don't believe that is true - I would be interested in any citation you
had. That being said, we don't do that because at this point in time we
don't think it would be a good user experience. If we had a model where
it added value we would consider doing so openly and likely with user
choice.


> still happening too? Why is FF going here? Why does it *need* to? Do
>  

Sometimes the underlying services and even the standards they implement
have trouble at web scale - especially at the tail.

The issues Daniel is working with are correlated with network mobility
- that's why he is using changes in addressing as a proxy for mobility.
This can manifest itself in some odd ways: TCP can simply stall for
minutes (or hours) before figuring out the connection isn't working any
more, and that's a bad user experience. Your proxy selection may or may
not be relevant after such an event. Same thing with your DNS cache
because of split DNS. Some unauthenticated data that you used in one
location might want to be flushed from cache when you move, etc. We see
this when you wake from a sleep with an open H2 session - an attempt to
reuse it can just hang for a long time without any feedback. Daniel's
code provides a framework for when more aggressive timeouts are
appropriate.
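
As a rough illustration of the "more aggressive timeouts" idea, the sketch
below flags connections that have been silent for N seconds at the time a
network change is handled (N is 5 by default, per the first message in
this thread). The `Connection` type, the `stalled_after_change` helper and
the exact point at which idle time is measured are assumptions for the
example, not a description of the real Firefox code.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    name: str
    last_traffic: float  # timestamp of the last byte sent or received


def stalled_after_change(connections, now, idle_limit=5.0):
    """Return connections with no traffic for at least `idle_limit` seconds.

    Meant to be called when a network change event is being handled;
    whatever it returns would be treated as stalled and closed.
    """
    return [c for c in connections if now - c.last_traffic >= idle_limit]
```

The check cannot tell a dead connection from a slow server: a request that
has been waiting 8 seconds for a working but slow server is flagged just
like one stuck on a vanished network, which is the false positive behind
the original bug report.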

And we are actively working on this stuff in the standards space where
applicable.


> This kind of stuff is out-of-authorized-and-expected-scope for a user
> program, and frankly is more than a little creepy. I know others will
> share this concern if/when they are aware of it.
>
>  
I don't share your concern that awareness of local addressing is out of
scope for a user space application. Indeed, enforcing security around
that kind of thing is the role of the operating system and we wouldn't
attempt to subvert that. The OS provides an unpriv'd interface to learn
about address changes and we're listening to it - not a big deal.

From: Daniel Stenberg <[hidden email]>
To: Christopher Barry <[hidden email]>
cc: [hidden email]
Subject: Re: A network-change detection and IPv6 quirk - on Windows
Date: Thu, 17 Mar 2016 15:39:22 +0100 (CET)
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)

On Thu, 17 Mar 2016, Christopher Barry wrote:

> Should all network applications behave in this manner, detecting and
> internally adjusting to network modification? Would this not lead to
> myriad implementations and redundancies? Is it in fact the domain of
> network *applications* to do this?  

I don't think I can speak for all network applications, but I think all
applications that have long lived connections to remote servers, where
the application travels in and out of networks while running, where
there are network interfaces coming and going and for which networks
expose different "realities" or "views" of the world depending on which
network you're on will benefit from acting on network changes.

By being aware of network changes and adapting to them, we actually
improve the user experience quite significantly: there are fewer stalled
connections that the user has to either wait out or shift-reload to
fix, less old/wrong content sticks around and survives into the new
network when it was only actually true in the former network, and we
increase responsiveness for proxies (when moving in/out of "proxy
controlled" networks).

If we don't react to network changes appropriately, users suffer.

> Isn't this a systems function that should be left to the system
> itself?  

Yes, if that worked it'd be awesome, but as it is now, the stack works
in one way and we need to give it some gentle pushes to act more in
ways we like.

> I seem to remember another out-of-scope foray where FF was using
> built-in dns server addresses behind the user's back a while ago, and
> *bypassing* the specific servers set by the administrator. Is that
> still happening too?  

Sorry, I don't know what you're referring to so I don't know and can't
comment on that specific thing. But of course we're not strangers to
trying out things in any area, if we think it improves functionality or
user experience etc. Experiments, of course, by their nature don't
always pan out.

> Do you see a problem not being addressed by the system, are
> frustrated that your efforts to help change it there are not being
> addressed, and thus are creating these homegrown workarounds? Why is
> this even an issue? Why does it actually matter which interface you
> are communicating through?  

It isn't important which network interface Firefox is using. We're
detecting that there is a *network change*. Any network change. And
when such a change has happened we act on that information. It is for
example very common for networks to have private DNS spaces, to show
different views of the same site and more depending on which network
you are on. And it is also common that connections get stalled in the
actual process of switching networks and the regular way to detect that
is very slow and that slowness leads to frustrated users.

This is just one of those tiny little details ordinary humans don't
need to bother with or care about.

Firefox is by the way not the only browser detecting network changes
and acting on them.

> This kind of stuff is out-of-authorized-and-expected-scope for a user
> program, and frankly is more than a little creepy.  

Sorry, I don't understand. Exactly how is this creepy and why? There's
really no magic involved here.

--

  / daniel.haxx.se



========================================================================

I've had to cut and paste the thread parts into position above to help
provide some context here. Their nesting is of course broken now. It's
generally acceptable, and indeed appreciated, to continue to use the
complete message thread, keeping it intact, replying inline if desired,
but it's generally not helpful to cherry pick parts of a thread and
still reply to the original thread with most of it missing. It simply
makes it impossible to follow without opening every single response
individually. That has been the case for the twenty plus years I've
been using mailing lists anyway.

From my read of the first post that I saw, which I will admit was
incomplete and missing most of its context, you, Daniel, are
discussing adapters being removed from the system, analyzing routing
tables, counters, considering virtual tunneling adapters, etc. What's
creepy, is that a generic user-level network application is concerning
itself with low level system information. Strictly speaking, this data
is really none of FF's business.

I get you are concerned with your users' experience and you want to make
it as performant as possible, and while I laud your commitment to your
user base, I think your efforts could be more useful at the system
level.

I get also that you work for Mozilla, so you are seeing it through that
lens, that set of restrictions, and from that set of issues and
requirements.

However, I think your energies should probably be put toward helping to
fix the underlying part that's non-optimal, not hacking stuff on top of
a networking stack that's obviously inadequate to deal with roaming at
speed, or docking/un-docking/un-suspending - because apparently that's
become a huge issue.

Perhaps time better spent would be teaching users that changing network
adapters or state will negatively impact active network applications.
Hacking in workarounds to cater to the least aware and informed amongst
us is causing you to veer outside of your lane, and really should be
avoided.

Just because others (Chrome?) are taking the approach of cobbling system
stuff into a browser does not mean Mozilla should too.

As for the dns topic I mentioned, my concerns there were eerily similar
to these here. The dns thread is here Patrick, to which you responded a
few times:

https://groups.google.com/forum/#!topic/mozilla.dev.tech.network/D8UmrLZZh5k

The fact that the concern I mention is very similar for two very
different things says to me that straying into the system arena and
(attempting to) control and/or bypass system stuff is not an unusual
event, is not seen as off-limits, and in fact is seen as completely
acceptable if it serves your immediate purposes.

My opinion is that this basic mindset and approach is ineffective,
non-standard, somewhat irresponsible, will lead to multiple variously
hacky redundant implementations across similar applications, add even
more bloat, provide less security, and increase the likelihood of even
more privacy invasion. All in the name of 'fixing' something that
arguably is not actually broken.

That's what concerns me.


--
Regards,
Christopher

Re: A network-change detection and IPv6 quirk - on Windows

Patrick McManus
[christopher]

>
> > I seem to remember another out-of-scope foray where FF was using
> > built-in dns server addresses behind the user's back a while ago,
> >
>

[patrick]

>
> I don't believe that is true - I would be interested in any citation you
> had. That being said, we don't do that because at this point in time we
> don't think it would be a good user experience. If we had a model where
> it added value we would consider doing so openly and likely with user
> choice.
>
> [..]

[christopher]

>
> As for the dns topic I mentioned, my concerns there were eerily similar
> to these here. The dns thread is here Patrick, to which you responded a
> few times:
>
>
> https://groups.google.com/forum/#!topic/mozilla.dev.tech.network/D8UmrLZZh5k
>
>
I think my response is consistent - in that previous thread I said if we
saw value in that technique we would consider pursuing it though I was
skeptical the pros could outweigh the cons. The cons in that case were far
more concerning than in the issue at hand. The advocate didn't follow
through so my skepticism prevailed :) I guess I wouldn't call a random list
discussion a foray - but at least I know what you were referring to.

best regards.
-P

Re: A network-change detection and IPv6 quirk - on Windows

Daniel Stenberg
In reply to this post by Christopher Barry
On Thu, 17 Mar 2016, Christopher Barry wrote:

> What's creepy, is that a generic user-level network application is
> concerning itself with low level system information. Strictly speaking, this
> data is really none of FF's business.

Then I can only assume that you and I don't share the same vision for how
Firefox should act when the network changes. That's of course totally fine.

> However, I think your energies should probably be put toward helping to
> fix the underlying part that's non-optimal

Our efforts to make Firefox run fine on existing operating systems as well as
"old" operating systems that were released several years ago means we need to
work with network stacks and operating systems that already are deployed. But
then we of course also work within various communities to push networking
forward all over in ways we think it can improve. I think we can do both.

--

  / daniel.haxx.se