Potential privacy issues of not showing suggestions in certain contexts

Ed Lee
[ For context around Suggested Tiles, please read http://ed.agadak.net/2015/04/whys-and-hows-of-suggested-tiles ]

An important feature for some brands that want to provide content in the new tab page is the ability to not be shown in the context of certain sites. For example, a brand suggesting an upcoming movie trailer wouldn't want that suggestion shown next to a tile for an illegal file sharing site.

We're planning on supporting this feature, but it has some privacy nuances.

Similar to how Firefox has enough data to decide that it's appropriate to show a movie trailer suggestion to users who tend to visit movie related sites, Firefox could have enough data to decide when it's inappropriate. If a given Firefox instance does *not* report back that the movie trailer suggestion was shown, one could try to infer that it was because the user had a visible illegal file sharing site in the new tab page.

Another approach is to temporarily hide the illegal file sharing site so that it's acceptable to show the movie trailer. Once the suggestion no longer needs to be shown, the hidden tile can reappear. However, even in this situation, Firefox reports back the structure of the new tab page (but no URLs), and there could be enough data to infer something was hidden then shown again.

To be clear on both potential privacy leaks, we have aggressive data deletion policies and don't keep any unique identifiers that can be associated with our users. But this ability could be used by malicious entities to learn information about users.

Do people have thoughts on the privacy issues raised here and potential solutions?

Ed Lee
_______________________________________________
dev-planning mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-planning

Re: Potential privacy issues of not showing suggestions in certain contexts

Sebastian Hengst
-------- Original Message --------
Subject: Potential privacy issues of not showing suggestions in certain contexts
From: Ed Lee <[hidden email]>
Date: 2015-04-23 20:23
 > Similar to how Firefox has enough data to decide that it's
 > appropriate to show a movie trailer suggestion to users who tend to
 > visit movie related sites, Firefox could have enough data to decide
 > when it's inappropriate. If a given Firefox instance does *not*
 > report back that the movie trailer suggestion was shown, one could
 > try to infer that it was because the user had a visible illegal file
 > sharing site in the new tab page.

Is it feasible to show such paid tiles to only X% of the users, so that
not showing the tile doesn't imply anything?

Archaeopteryx

Re: Potential privacy issues of not showing suggestions in certain contexts

Ed Lee
On Friday, April 24, 2015 at 1:36:45 AM UTC-7, Archaeopteryx wrote:
> Is it feasible to show such paid tiles to only X% of the users, so that
> not showing the tile doesn't imply anything?
There is some fuzziness already around when a suggested tile is shown. We only show one suggestion at a time, so if a given tile isn't shown, it could be because another tile is being shown. We also frequency cap suggestions, i.e., stop showing them after some number of views.

We could add to that uncertainty by having Firefox show a suggestion only 50% of the time when it could have shown one.

Each of these makes it so that when a tile isn't shown, it's not guaranteed it's because the user has some illegal site visible.
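
A rough sketch of why the extra coin flip helps, in Python. The function names and the 10% blocked rate are made up for illustration and are not Firefox's actual logic; the 50% figure is the one mentioned above, and the other sources of fuzziness (one suggestion at a time, frequency caps) are ignored.

```python
import random


def should_show(eligible, blocked_by_context, show_probability=0.5, rng=random):
    """Return True if the suggestion is displayed on this new tab page."""
    if not eligible or blocked_by_context:
        return False
    # Even when nothing blocks the tile, only show it some of the time,
    # so a missing "shown" report has an innocent explanation.
    return rng.random() < show_probability


def prob_blocked_given_not_shown(blocked_rate, show_probability):
    """Bayes' rule: P(blocked | not shown) for an eligible user.

    blocked_rate is the (hypothetical) fraction of eligible users who
    have a blocking site visible on their new tab page.
    """
    not_shown = blocked_rate + (1 - blocked_rate) * (1 - show_probability)
    return blocked_rate / not_shown


# Hypothetical numbers: 10% of eligible users have a blocking site visible.
with_coin_flip = prob_blocked_given_not_shown(0.1, 0.5)
without_coin_flip = prob_blocked_given_not_shown(0.1, 1.0)
```

With these assumed numbers, an observer who sees no "shown" report from an eligible user can only conclude there is roughly an 18% chance a blocking site was visible, versus 100% if the tile were always shown when allowed.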

Re: Potential privacy issues of not showing suggestions in certain contexts

Andrew Sutherland-5
In reply to this post by Ed Lee
On Thu, Apr 23, 2015, at 02:23 PM, Ed Lee wrote:
> Do people have thoughts on the privacy issues raised here and potential
> solutions?

Use a probabilistic mechanism like bloom filters tuned to err on the
side of false positives to determine when to not show the suggested
tile?  (And which can be additionally permuted to further increase
false-positives.)

This could also be beneficial because if the brand has a very large list
of sites they don't want to be associated with, all of that information
doesn't need to be downloaded.  And the side effect of (don't show)
false positives is beneficial in that it decreases the information that
can be inferred from a tile not being shown.
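
A minimal sketch of the bloom filter idea in Python, assuming exact domain strings are what gets matched. The class, the `may_show_suggestion` helper, the example domains, and the filter size are all hypothetical; a deliberately small bit array is one way to "tune" the filter toward more false positives.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def may_show_suggestion(visible_domains, blocked_filter):
    """Suppress the tile if any visible domain *might* be on the brand's list."""
    return not any(domain in blocked_filter for domain in visible_domains)


# Hypothetical brand blocklist; a small filter raises the false-positive rate.
blocked = BloomFilter(num_bits=64, num_hashes=2)
for domain in ["filesharing.example", "warez.example"]:
    blocked.add(domain)
```

Only the bit array needs to ship to the client, so neither the full list nor its plaintext entries are downloaded, and a suppressed tile may equally be the result of an unrelated domain colliding in the filter.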

A possibly good/possibly bad side effect is that this could allow the
brands to not explicitly say which websites they don't want to be
associated with.  If this is a desirable characteristic, even Mozilla
potentially need not know what the list of sites was.  If this is not a
desirable characteristic, the Mozilla automation could automatically
derive the filter from the total list of domains and make that available
as part of the data.

I would think that letting brands avoid revealing with 100% certainty
which websites they don't want to be associated with is good
and acceptable since it provides parity with server-side solutions and
the historical nature of ads.  (Just because a company doesn't advertise
in a certain TV show/magazine doesn't mean they have explicitly decided
not to advertise there.)  And interested users could still run
brute-forces against the filters and assign probabilities to certain
sites or clusters of sites being intentionally excluded in a similar
fashion to how they could notice what sites a company is not running ad
campaigns on.

Andrew

Re: Potential privacy issues of not showing suggestions in certain contexts

Ed Lee
In reply to this post by Ed Lee
On Saturday, April 25, 2015 at 7:35:35 PM UTC-7, Andrew Sutherland wrote:
> Use a probabilistic mechanism like bloom filters tuned to err on the
> side of false positives to determine when to not show the suggested
> tile?
Hey, that's pretty clever. ;) I believe what you're getting at is that false positives for "negative matches" end up causing us to show suggestions less often than we otherwise would. That potentially leads to less money, but brands are protected and user data isn't leaked because there are so many possible false positives.

We wouldn't be able to use this directly for "positive matches" because a bloom filter matching "web developers" would accidentally include some non-web-developer sites. But if we had some metrics to figure out which false positive sites were leading to low engagement (more blocks / fewer clicks), we could just add those false positives to the "negative matches" bloom filter.

Ed Lee

Re: Potential privacy issues of not showing suggestions in certain contexts

Eric Rescorla
In reply to this post by Andrew Sutherland-5
On Sat, Apr 25, 2015 at 7:35 PM, Andrew Sutherland <[hidden email]> wrote:

> On Thu, Apr 23, 2015, at 02:23 PM, Ed Lee wrote:
> > Do people have thoughts on the privacy issues raised here and potential
> > solutions?
>
> Use a probabilistic mechanism like bloom filters tuned to err on the
> side of false positives to determine when to not show the suggested
> tile?  (And which can be additionally permuted to further increase
> false-positives.)
>

I love me some bloom filters, but if you want false positives, why not
just inject some directly?
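
A minimal sketch of injecting false positives directly, assuming the blocklist is delivered as a set of domains and that a pool of unrelated domains is available to draw decoys from. `pad_blocklist` and the example domains are hypothetical.

```python
import random


def pad_blocklist(real_blocked, decoy_pool, num_decoys, seed=None):
    """Return the real blocklist plus randomly chosen decoy entries.

    The decoys are unrelated domains that also suppress the tile, so a
    suppressed tile no longer implies a genuinely blocked site was visible.
    """
    rng = random.Random(seed)
    # Sort for reproducibility when a seed is given; never pick real entries.
    candidates = sorted(set(decoy_pool) - set(real_blocked))
    decoys = rng.sample(candidates, min(num_decoys, len(candidates)))
    return set(real_blocked) | set(decoys)


real = {"filesharing.example"}
pool = ["news.example", "mail.example", "shop.example", "video.example"]
shipped = pad_blocklist(real, pool, num_decoys=2, seed=1)
```

Unlike a bloom filter, this makes the false-positive rate an explicit parameter, but the shipped list would still need to be hashed or otherwise obscured if the decoys should not be distinguishable from real entries.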

-Ekr


Re: Potential privacy issues of not showing suggestions in certain contexts

Edward Lee
In reply to this post by Ed Lee
On Saturday, April 25, 2015 at 7:35:35 PM UTC-7, Andrew Sutherland wrote:
> A possibly good/possibly bad side effect is that this could allow the
> brands to not explicitly say which websites they don't want to be
> associated with.
This concept has come up in some discussions around whether it's okay for us to ship a list of adult/porn sites as clear text with Firefox. If we do want to prevent users from having a list of questionable sites on their computer, using a bloom filter can indeed avoid the problem.

Having a list could power other features such as a porn browser or, the opposite, a parental-control browser, although that's not quite the topic of discussion here.

Do people have thoughts on whether it's okay to have a plaintext list of sites either as source or packaged as part of Firefox?

Ed Lee

Re: Potential privacy issues of not showing suggestions in certain contexts

Eric Rescorla
On Fri, May 1, 2015 at 2:58 PM, <[hidden email]> wrote:

> On Saturday, April 25, 2015 at 7:35:35 PM UTC-7, Andrew Sutherland wrote:
> > A possibly good/possibly bad side effect is that this could allow the
> > brands to not explicitly say which websites they don't want to be
> > associated with.
> This concept has come up in some discussions around whether it's okay for
> us to ship a list of adult/porn sites as clear text with Firefox. If we do
> want to prevent users from having a list of questionable sites on their
> computer, using a bloom filter can indeed avoid the problem.
>
> Having a list could power other features such as a porn browser or, the
> opposite, a parental-control browser, although that's not quite the topic of
> discussion here.
>
> Do people have thoughts on whether it's okay to have a plaintext list of
> sites either as source or packaged as part of Firefox?


Is there a reason not to use the same techniques we use for safe browsing?

-Ekr

Re: Potential privacy issues of not showing suggestions in certain contexts

Ed Lee
In reply to this post by Edward Lee
On Saturday, May 2, 2015 at 4:24:17 AM UTC-7, Eric Rescorla wrote:
> Is there a reason not to use the same techniques we use for safe browsing?
That's an interesting possibility, but we've been trying to reduce risk by keeping code relatively self-contained within new tab modules. I could definitely see the code refactored to make use of safebrowsing for blacklisting as well as potentially for positive matches that trigger a suggested tile. This would definitely be more involved, as it would require coordinating multiple services on both the server and client sides.

Another tricky aspect is the longer term plans that don't necessarily involve matching at a site level. For example, we might want to trigger on search keywords, page titles, or path segments, independently of sites. Potentially we could augment safe browsing to allow for that, but that increases risk for other users of safe browsing, and it would be faster to stick with the current delivery mechanism for tiles data.

Re: Potential privacy issues of not showing suggestions in certain contexts

Eric Rescorla
On Sat, May 2, 2015 at 11:12 AM, Ed Lee <[hidden email]> wrote:

> On Saturday, May 2, 2015 at 4:24:17 AM UTC-7, Eric Rescorla wrote:
> > Is there a reason not to use the same techniques we use for safe browsing?
> That's an interesting possibility, but we've been trying to reduce risk by
> keeping code relatively self contained within new tab modules. I could
> definitely see the code refactored to make use of safebrowsing for
> blacklisting as well as potentially positive matches for triggering a
> suggested tile. This would definitely be more involved as there would need
> to be coordination of multiple services on both server and client pieces.
>
> Another tricky aspect is the longer term plans that don't necessarily
> involve matching at a site level. For example, we might want to trigger
> independently from sites on search keywords or page titles or path
> segments. Potentially we could augment safe browsing to allow for that, but
> that increases risk for other users of safe browsing, and it would be
> faster to keep with the current delivery mechanism of tiles data.


I was only considering a safe browsing-style mechanism for blacklisting.

Note: I'm not suggesting actually using the safe browsing list per se, just a
hash-block mechanism like the one used by safe browsing.
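
A simplified sketch of a hash-prefix blocklist in the style of safe browsing, assuming SHA-256 with 4-byte prefixes and plain lowercased domains as keys. The function names and example domain are hypothetical; real safe browsing also canonicalizes URLs and confirms prefix hits against full hashes from the server, both of which are left out here.

```python
import hashlib

PREFIX_BYTES = 4  # assumed prefix length for this sketch


def hash_prefix(domain, prefix_bytes=PREFIX_BYTES):
    """Return the first few bytes of the SHA-256 hash of a domain."""
    return hashlib.sha256(domain.lower().encode("utf-8")).digest()[:prefix_bytes]


def build_prefix_set(domains):
    """Server side: turn a plaintext blocklist into a set of hash prefixes."""
    return {hash_prefix(domain) for domain in domains}


def is_possibly_blocked(domain, prefix_set):
    """Client side: a prefix hit means 'possibly blocked', never 'certainly'."""
    return hash_prefix(domain) in prefix_set


# Only the prefixes would be shipped to the client, not the plaintext list.
prefixes = build_prefix_set(["filesharing.example"])
```

Because many domains share any given short prefix, a hit is inherently ambiguous, which gives the same kind of deniability as bloom filter false positives while keeping the plaintext list off the user's computer.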

-Ekr