Unicode domain names issue (Encrypting a "fake" domain name)

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Kai Engert-4


On 18 April 2017 23:35:46 GMT+02:00, Boris Zbarsky <[hidden email]> wrote:
>Will we display "Latin:" next to every non-punycode domain, so we're not
>making non-English-speakers second-class citizens?

Maybe it's sufficient to show "Latin:" for users who have configured a default language that isn't based on Latin characters.

Kai
_______________________________________________
dev-security mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-security

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Kai Engert-4
In reply to this post by Boris Zbarsky


On 18 April 2017 23:35:46 GMT+02:00, Boris Zbarsky <[hidden email]> wrote:
>On 4/18/17 5:32 PM, Kai Engert wrote:
>> IIUC, Dan said, mixed scripts are always shown as xn--, never rendered.
>
>Who said anything about mixed scripts?

Dan Veditz did.

I repeated his statement to support my conclusion.

Kai

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Kyle Hamilton
In reply to this post by L. David Baron
How did the algorithm in
https://bugzilla.mozilla.org/show_bug.cgi?id=722299 (which points to
https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm ) fail to
help in this instance?

Are there other instances in which it could be expected to fail?

If there are, the hypothesis set forth in
https://bugzilla.mozilla.org/show_bug.cgi?id=843689 (that the new IDN
display algorithm was sufficient to prevent IDN weirdnesses, so
that the whitelist could be removed) is shown to be false, and Mozilla
needs to either find a better solution or go back to the
whitelist.

-Kyle H


On Tue, Apr 18, 2017 at 6:35 AM, L. David Baron <[hidden email]> wrote:

> On Tuesday 2017-04-18 10:29 +0100, Gervase Markham wrote:
>> Neither browsers nor CAs have a database of all domain names, such that
>> they can see that one is visually confusable with another. Registries
>> have this data, and it is their responsibility to deal with this problem.
>
> So we used to have a whitelist of registries that had sensible
> policies for dealing with this, but we stopped using it in
> https://bugzilla.mozilla.org/show_bug.cgi?id=843689 .
>
> Should we enable the whitelist approach again?
>
> (One of the big issues with it was that some of the most prominent
> domains, like .com, had policies that we saw as unacceptable.)
>
> -David
>
> --
> 𝄞   L. David Baron                         http://dbaron.org/   𝄂
> 𝄢   Mozilla                          https://www.mozilla.org/   𝄂
>              Before I built a wall I'd ask to know
>              What I was walling in or walling out,
>              And to whom I was like to give offense.
>                - Robert Frost, Mending Wall (1914)

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Boris Zbarsky
In reply to this post by L. David Baron
On 4/18/17 9:13 PM, Kyle Hamilton wrote:
> How did the algorithm in
> https://bugzilla.mozilla.org/show_bug.cgi?id=722299 (which points to
> https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm ) fail to
> help in this instance?

All the characters are from a single script.
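
A minimal sketch of why a mixed-script check has nothing to object to here (an illustration, not the Firefox implementation). It assumes the first word of each Unicode character name is a good enough stand-in for the real Script property, and the helper names `scripts_in_label` and `is_mixed_script` are hypothetical:

```python
import unicodedata

def scripts_in_label(label):
    """Approximate the set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name
    ("LATIN", "CYRILLIC", ...) stands in for the real Script property.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():  # digits and hyphens are treated as script-neutral
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """True if the letters of the label come from more than one script."""
    return len(scripts_in_label(label)) > 1

spoof = "\u0430\u0440\u0440\u04cf\u0435"  # all-Cyrillic lookalike of "apple"
mixed = "\u0430pple"                       # Cyrillic "a" followed by Latin "pple"

print(is_mixed_script(spoof))  # False: single script, so the check passes it
print(is_mixed_script(mixed))  # True: this is the case the check can catch
```

The real display algorithm also permits certain script combinations, which this sketch ignores.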

-Boris

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Igor Bukanov-2
In reply to this post by Daniel Veditz-2
On 18 April 2017 at 17:27, Daniel Veditz <[hidden email]> wrote:
> Are there no legitimate Russian words made only of the 11 or so letters
> that look like Latin script?

If mixed scripts are not allowed, then the browser should show the
type of language of the script, perhaps using native abbreviations,
like Lat for Latin, Кир for Cyrillic etc. That should be sufficient to
cover this case.

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Anne van Kesteren
On Wed, Apr 19, 2017 at 9:38 AM, Igor Bukanov <[hidden email]> wrote:
> If mixed scripts are not allowed, then the browser should show the
> type of language of the script, perhaps using native abbreviations,
> like Lat for Latin, Кир for Cyrillic etc. That should be sufficient to
> cover this case.

No, we should strive to offer less and simpler UI to users, not more,
and certainly not something as complicated as that.


--
https://annevankesteren.nl/

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Igor Bukanov-2
In reply to this post by Kai Engert-4
On 18 April 2017 at 23:41, Kai Engert <[hidden email]> wrote:
> Could the browser use the configured default language, to know the expected usual script, and use special highlighting (looking like a warning) whenever the domain uses a non-matching script?

The default language does not work for countries using Cyrillic
script. The vast majority of domains there are in Latin. That makes
phishing attacks more effective, as domains are not expected to be typed
at all. They come either from search engines or from links in email or
social media.

Re: Unicode domain names issue (Encrypting a "fake" domain name)

craig.francis
For those who use Latin characters "most of the time" (US, UK, etc.), why not apply a highlight to any non-Latin characters, i.e. characters you would not normally expect to see?

As per the screenshot attached, or if attachments get removed, at this URL:

https://www.krang.org.uk/misc/unicode-domain.jpg

Notes:

- I only highlighted the first character, in this case it should have been the whole word.

- I am a little unsure about this approach from an accessibility point of view (which might not be as much of an issue for screen readers, e.g. VoiceOver says something like "yeris dot com").

- This does not consider that Chinese and Spanish are the most spoken languages.

> On 19 Apr 2017, at 08:47, Igor Bukanov <[hidden email]> wrote:
>
> On 18 April 2017 at 23:41, Kai Engert <[hidden email]> wrote:
>> Could the browser use the configured default language, to know the expected usual script, and use special highlighting (looking like a warning) whenever the domain uses a non-matching script?
>
> The default language does not work for countries using Cyrillic
> script. The vast majority of domains there are in Latin. That makes
> phishing attacks more effective as domains are not expected to be typed
> at all. They either come from search engines or links in email or
> social media.

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Gervase Markham
In reply to this post by Gervase Markham
On 18/04/17 14:35, L. David Baron wrote:

> On Tuesday 2017-04-18 10:29 +0100, Gervase Markham wrote:
>> Neither browsers nor CAs have a database of all domain names, such that
>> they can see that one is visually confusable with another. Registries
>> have this data, and it is their responsibility to deal with this problem.
>
> So we used to have a whitelist of registries that had sensible
> policies for dealing with this, but we stopped using it in
> https://bugzilla.mozilla.org/show_bug.cgi?id=843689 .
>
> Should we enable the whitelist approach again?

The reason we stopped using a whitelist is that it didn't scale when the
gTLD explosion happened. This is still the case - no-one has the time or
energy to keep track of 1000+ anti-spoofing policies.

We could perhaps have a blacklist, but see below.

> (One of the big issues with it was that some of the most prominent
> domains, like .com, had policies that we saw as unacceptable.)

The new mechanism has the advantage of allowing IDN domains in .com,
which users both want and use. If we returned to a white or blacklist,
would we whitelist .com? If so, we are no further forward. If not, we
break all those domains.

The newly-authored https://wiki.mozilla.org/IDN_Display_Algorithm_FAQ
sets out the position in what is hopefully a clear fashion.

Gerv

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Gervase Markham
In reply to this post by L. David Baron
On 19/04/17 02:13, Kyle Hamilton wrote:
> How did the algorithm in
> https://bugzilla.mozilla.org/show_bug.cgi?id=722299 (which points to
> https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm ) fail to
> help in this instance?

Because it is a known issue that it does not deal with whole-script
confusables. This was documented at the time we adopted it - see:
https://wiki.mozilla.org/IDN_Display_Algorithm#Downsides

> Are there other instances in which it could be expected to fail?

No.

> If there are, the hypothesis set forth in
> https://bugzilla.mozilla.org/show_bug.cgi?id=843689 (that the new IDN
> display algorithm was sufficient to prevent IDN weirdnesses, so
> that the whitelist could be removed) is shown to be false, and Mozilla
> needs to either find a better solution or go back to the
> whitelist.

That was not the hypothesis. As noted above, this edge case was a known
and accepted part of the solution, because all of the alternatives are
worse.

The argument is that the browser only has sufficient knowledge to solve
a part of this problem; we can't solve the entire thing using an
algorithm without privileging some scripts over others, which is not an
appropriate action for an organization which believes in a truly World
Wide web. Fixing whole-script spoofing is the responsibility of those
who have databases of all the existing registrations - i.e. registries.

See https://wiki.mozilla.org/IDN_Display_Algorithm_FAQ for more details.

Gerv

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Martin Heaps
In reply to this post by Martin Heaps
I have been away for a few days; otherwise I would have added the clarification below earlier:

My issue is NOT with the character sets used or the ambiguity of these character sets, but with the browser's complete inability to tell the human user that epic.com !== epic.com (the second being the lookalike xn--e1awd7f.com) at any point short of opening up and reading the TLS certificate itself.

Reading the certificate is relatively simple (albeit 3-4 clicks) in Firefox, but it is hidden away in Google Chrome, where the user needs to know where to find the certificate to reach it, rather than just exploring and clicking suitable-looking buttons (as seems to be the case with Firefox).

Some examples using the epic.com domain name follow, showing that ALL views of the security of the website, short of viewing the full certificate, show the rendered ("output") name rather than the "raw" name of the website.

THIS is the issue I am taking, and feel that should be fixed by browsers, by certificate providers and all parties in between.

I have NO issue with the character set of the certificate or the character set of the URL, this does not need to use the data held by registrars but simply a patch on the browsers to note that:

- A certificate for "xn--e1awd7f.com" can be interpreted as "epic.com"
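
As a hedged illustration of the "raw" versus "output" name distinction, the sketch below decodes the `xn--` labels of a hostname with Python's built-in `punycode` codec. The function name `to_display_form` is hypothetical, and a real browser applies full IDNA processing plus its display algorithm rather than a bare Punycode decode:

```python
def to_display_form(hostname):
    """Decode each xn-- label of a hostname into its Unicode form.

    Simplification: bare Punycode (RFC 3492) decoding only, with none of
    the validation or mapping steps that IDNA adds on top.
    """
    labels = []
    for label in hostname.lower().split("."):
        if label.startswith("xn--"):
            labels.append(label[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)

raw = "xn--e1awd7f.com"
shown = to_display_form(raw)

# The rendered name looks like "epic.com" but contains no ASCII letters
# in its first label; printing the code points makes that visible.
print(raw, "->", shown)
print([hex(ord(ch)) for ch in shown.split(".")[0]])
```

Showing both forms side by side, as in the last two lines, is the kind of disclosure the message above asks browsers to offer.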


Some screen shots of the core issue, please review:

https://www.imageupload.co.uk/images/2017/04/19/chrome-epic-dot-com.png

https://www.imageupload.co.uk/images/2017/04/19/chrome-epic-dot-com2.png

https://www.imageupload.co.uk/images/2017/04/19/Firefox-epic-dot-com.png

https://www.imageupload.co.uk/images/2017/04/19/Firefox-epic-dot-com3.png

https://www.imageupload.co.uk/images/2017/04/19/Firefox-epic-dot-com2.png

Thanks


Re: Unicode domain names issue (Encrypting a "fake" domain name)

Martin Heaps
In reply to this post by Martin Heaps
A possible solution is one that I envisage has the following practical effects:

If the name can appear valid in another character set (as xn--e1awd7f.com can be epic.com, Russian characters can be used for Latin characters, etc.), the specific name the certificate is for should be detailed in the browser, as shown in my edited screenshots below.

Please review this possible approach in clarifying the exact nature of which URL is being certified by a TLS certificate.

https://www.imageupload.co.uk/images/2017/04/19/certificated_example2.png

https://www.imageupload.co.uk/images/2017/04/19/certificated_example.png

Cheers

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Justin Dolske-2
In reply to this post by Martin Heaps
On 4/18/17 2:29 AM, Gervase Markham wrote:

>> The word "fake" is probably an imprecise word to use, but in the
>> context that the domain name as registered is purporting to be
>> another domain name, which it is not, this is fake. The SSL provider,
>> LetsEncrypt in this case, seems not to be able to ensure with browsers
>> that there is a clear distinction between the two names of the domain
>> certified by the CA.
>
> Neither browsers nor CAs have a database of all domain names, such that
> they can see that one is visually confusable with another. Registries
> have this data, and it is their responsibility to deal with this problem.

[As much as I hate to wade into this...]

Hmm. One thing browsers do have is the user's browsing history.

Half-baked thought for an imperfect mitigation:

When visiting a page, compute the normalized version of the domain, and see if
that exists as a history entry. If the entry exists, display the
punycode version of the domain. Otherwise, display the unicode version
of the domain.


That makes it more difficult to trick an existing user of a site. If
I've previously visited epic.com (ascii), visiting xn--e1awd7f.com will
show xn--e1awd7f.com instead of еріс.com (cyrillic). But it does nothing
for attacks against domains a user might recognize but hasn't visited.

To handle bz's case of two non-ascii domains that are homographs of each
other, I think you'd need to store the normalized domain in history too?

Bah, but visiting one such homograph would then cause both to display as
punycode (as both history entries exist). So the history check would
need to be a little more complex, to see which is the oldest site in the
user's history. (If the oldest site is the fake site, the user is
screwed for a while.)

So, I dunno. I'm sure there are other issues too. But maybe some kind of
imperfect mitigation (perhaps not this one) is better than nothing?
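
A rough sketch of the history check described above, including the "oldest site wins" refinement. Everything here is an assumption for illustration: `FOLD` is a hand-picked handful of Cyrillic-to-Latin pairs rather than the Unicode confusables data, and `first_visit` is a hypothetical stand-in for the browser's history store:

```python
# Hand-picked illustrative subset; a real fold would come from confusables.txt.
FOLD = {"\u0430": "a", "\u0435": "e", "\u0440": "p",
        "\u0441": "c", "\u0456": "i", "\u04cf": "l"}

def fold(domain):
    """Map each character to its lookalike so that homographs compare equal."""
    return "".join(FOLD.get(ch, ch) for ch in domain.lower())

def to_punycode(domain):
    """Render non-ASCII labels in their xn-- form."""
    out = []
    for label in domain.split("."):
        if label.isascii():
            out.append(label)
        else:
            out.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(out)

def display_form(domain, first_visit):
    """Choose the form to show for `domain`.

    first_visit maps Unicode-form domains to the time of the first visit.
    If an older history entry folds to the same string, the newcomer is
    shown as punycode; otherwise the Unicode form is kept.
    """
    if domain.isascii():
        return domain
    folded = fold(domain)
    mine = first_visit.get(domain, float("inf"))
    for other, seen in first_visit.items():
        if other != domain and fold(other) == folded and seen < mine:
            return to_punycode(domain)
    return domain

spoof = "\u0435\u0440\u0456\u0441.com"          # Cyrillic lookalike of epic.com
print(display_form(spoof, {"epic.com": 100}))   # real site visited earlier
print(display_form(spoof, {}) == spoof)         # no history: Unicode is shown
```

As the message notes, a spoof that reaches the user's history first still wins under this rule, so the sketch shares the proposal's stated weakness.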

Justin

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Boris Zbarsky
In reply to this post by Martin Heaps
On 4/19/17 8:40 PM, Justin Dolske wrote:
> When visiting a page, compute the normalized version of the domain

Normalized in what sense?

-Boris

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Gervase Markham
In reply to this post by Martin Heaps
On 20/04/17 02:56, Boris Zbarsky wrote:
> On 4/19/17 8:40 PM, Justin Dolske wrote:
>> When visiting a page, compute the normalized version of the domain
>
> Normalized in what sense?

Presumably, in a "fold it to a canonical form using (the subset used in
IDN of) ftp://ftp.unicode.org/Public/security/latest/confusables.txt" sense.

Gerv

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Eli the Bearded
In reply to this post by Martin Heaps
In mozilla.dev.security, Justin Dolske  <[hidden email]> wrote:
> [As much as I hate to wade into this...]
>
> Hmm. One thing browsers do have is the user's browsing history.

Objection. Configuration to not record history is trivial, and even
if not configured such, some confusables could easily be sites that
the user doesn't visit often enough to have in history.

> Half-baked thought for an imperfect mitigation:
>
> When visiting a page, compute the normalized version of the domain, and see if
> that exists as a history entry. If the entry exists, display the
> punycode version of the domain. Otherwise, display the unicode version
> of the domain.

More baked: Using the confusables list from Unicode, if a domain label
consists entirely of letters in one script that are "confusable" to
another (single) script, start raising red flags.

Probably special case things that can be confused with a FULL STOP for
attacks that attempt to just confuse part of the DNS name.
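
One possible shape for that check, sketched under loud assumptions: `CONFUSABLE` is a hand-picked handful of pairs rather than the Unicode list, `DOT_LOOKALIKES` holds a single assumed example (ONE DOT LEADER) and is not a checked list, and script membership is approximated from the first word of each character name:

```python
import unicodedata

# Hand-picked illustrative subset of confusable pairs (assumption).
CONFUSABLE = {"\u0430": "a", "\u0435": "e", "\u0440": "p",
              "\u0441": "c", "\u0456": "i", "\u04cf": "l"}
# A single assumed FULL STOP lookalike; a real list would need checking.
DOT_LOOKALIKES = {"\u2024"}

def script_of(ch):
    """Approximate a character's script by the first word of its name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def whole_script_confusable(label):
    """Return (label_script, lookalike_script, lookalike) when every letter
    of a single-script label is confusable with one other script, else None."""
    letters = [ch for ch in label if ch.isalpha()]
    if not letters or any(ch not in CONFUSABLE for ch in letters):
        return None
    sources = {script_of(ch) for ch in letters}
    targets = {script_of(CONFUSABLE[ch]) for ch in letters}
    if len(sources) != 1 or len(targets) != 1 or sources == targets:
        return None
    lookalike = "".join(CONFUSABLE.get(ch, ch) for ch in label)
    return sources.pop(), targets.pop(), lookalike

def has_dot_lookalike(hostname):
    """Flag characters that could pass for a label-separating full stop."""
    return any(ch in DOT_LOOKALIKES for ch in hostname)

spoof = "\u0430\u0440\u0440\u04cf\u0435"
print(whole_script_confusable(spoof))               # red flag raised
print(whole_script_confusable("\u043c\u0438\u0440"))  # ordinary Cyrillic word
```

A label gets flagged only when all of its letters have lookalikes, so an ordinary word containing even one unambiguous letter passes untouched.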

Elijah
------
has not checked to see what can be confused with a FULL STOP

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Justin Dolske-2
In reply to this post by Martin Heaps
On 4/20/17 2:54 AM, Gervase Markham wrote:
> On 20/04/17 02:56, Boris Zbarsky wrote:
>> On 4/19/17 8:40 PM, Justin Dolske wrote:
>>> When visiting a page, compute the normalized version of the domain
>>
>> Normalized in what sense?
>
> Presumably, in a "fold it to a canonical form using (the subset used in
> IDN of) ftp://ftp.unicode.org/Public/security/latest/confusables.txt" sense.

Correct. To be more concrete, with the simple case of apple.com (real)
vs аррӏе.com (spoof): both "normalize" to apple.com (plain ascii). If
that's in the user's history, the spoofed domain would be displayed as
xn--80ak6aa92e.com.

Justin

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Justin Dolske-2
In reply to this post by Martin Heaps
On 4/20/17 3:54 PM, Eli the Bearded wrote:
> In mozilla.dev.security, Justin Dolske  <[hidden email]> wrote:
>> [As much as I hate to wade into this...]
>>
>> Hmm. One thing browsers do have is the user's browsing history.
>
> Objection. Configuration to not record history is trivial, and even
> if not configured such, some confusables could easily be sites that
> the user doesn't visit often enough to have in history.

Yep.

I don't think "browsing history disabled" is necessarily common enough
to worry about (for an imperfect mitigation), but in any case
not-yet-visited is certainly an issue. Hence, again, "imperfect
mitigation". :-)

> More baked: Using the confusables list from Unicode, if a domain label
> consists entirely of letters in one script that are "confusable" to
> another (single) script, start raising red flags.

Sure, but the angle I found interesting here was to make a guess as to
which one is the legitimate site for the user, based solely on local
user data and avoiding favoring a particular script.

Justin

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Kyle Hamilton
In reply to this post by Gervase Markham
Perhaps, only display non-punycode from codepoint sets used in
languages already installed on the computer?

i.e., if the Russian language is installed on the computer, it might
be a strong indicator that Cyrillic codepoints should be shown as
Cyrillic.  Otherwise, it's someone who probably can't even read it,
and so the commitment to displaying non-punycode probably can only be
damaging.
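
A minimal sketch of that rule, with every name assumed for illustration: `LANGUAGE_SCRIPTS` is a hypothetical table from installed language codes to scripts, and the script of a character is approximated from the first word of its Unicode name:

```python
import unicodedata

# Hypothetical mapping from installed language codes to the scripts they use.
LANGUAGE_SCRIPTS = {
    "en": {"LATIN"},
    "ru": {"CYRILLIC"},
    "el": {"GREEK"},
}

def label_scripts(label):
    """Approximate the scripts of a label's letters from their character names."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in label if ch.isalpha()}

def display_label(label, installed_languages):
    """Show a label as Unicode only if an installed language uses its script(s)."""
    if label.isascii():
        return label
    readable = set()
    for lang in installed_languages:
        readable |= LANGUAGE_SCRIPTS.get(lang, set())
    if label_scripts(label) <= readable:
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

spoof = "\u0430\u0440\u0440\u04cf\u0435"        # Cyrillic lookalike of "apple"
print(display_label(spoof, ["en"]))             # shown as punycode
print(display_label(spoof, ["en", "ru"]) == spoof)  # shown as Cyrillic
```

The second call shows the limit raised earlier in the thread: a user with Russian installed still sees the lookalike rendered in Cyrillic.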

-Kyle H

On Wed, Apr 19, 2017 at 4:40 AM, Gervase Markham <[hidden email]> wrote:

> On 19/04/17 02:13, Kyle Hamilton wrote:
>> How did the algorithm in
>> https://bugzilla.mozilla.org/show_bug.cgi?id=722299 (which points to
>> https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm ) fail to
>> help in this instance?
>
> Because it is a known issue that it does not deal with whole-script
> confusables. This was documented at the time we adopted it - see:
> https://wiki.mozilla.org/IDN_Display_Algorithm#Downsides
>
>> Are there other instances in which it could be expected to fail?
>
> No.
>
>> If there are, the hypothesis set forth in
>> https://bugzilla.mozilla.org/show_bug.cgi?id=843689 (that the new IDN
>> display algorithm was sufficient to prevent IDN weirdnesses, so
>> that the whitelist could be removed) is shown to be false, and Mozilla
>> needs to either find a better solution or go back to the
>> whitelist.
>
> That was not the hypothesis. As noted above, this edge case was a known
> and accepted part of the solution, because all of the alternatives are
> worse.
>
> The argument is that the browser only has sufficient knowledge to solve
> a part of this problem; we can't solve the entire thing using an
> algorithm without privileging some scripts over others, which is not an
> appropriate action for an organization which believes in a truly World
> Wide web. Fixing whole-script spoofing is the responsibility of those
> who have databases of all the existing registrations - i.e. registries.
>
> See https://wiki.mozilla.org/IDN_Display_Algorithm_FAQ for more details.
>
> Gerv

Re: Unicode domain names issue (Encrypting a "fake" domain name)

Eli the Bearded
In reply to this post by Martin Heaps
In mozilla.dev.security, Justin Dolske  <[hidden email]> wrote:
> On 4/20/17 3:54 PM, Eli the Bearded wrote:
>> Objection. Configuration to not record history is trivial, and even
> Yep.
> I don't think "browsing history disabled" is necessarily common enough

Common or not, it's broken to ignore that case.

>> More baked: Using the confusables list from Unicode, if a domain label
>> consists entirely of letters in one script that are "confusable" to
>> another (single) script, start raising red flags.
> Sure, but the angle I found interesting here was to make a guess as to
> which one is the legitimate site for the user, based solely on local
> user data and avoiding favoring a particular script.

I'm not proposing favoring any particular script, just highlighting to the
user that an IDN is composed entirely of confusables to a single
different script. There may be false positives, particularly on short
hostnames, but I suspect that will be unlikely in practice.

    This site, https://www.xn--80ak6aa92e.com/, uses the Cyrillic
    alphabet to create a URL that resembles the Latin alphabet
    "www.apple.com". Do you wish to continue?

Elijah
------
bonus for defaulting to "No"