[cap-talk] Firefox breaks the principle of identifiability

Ka-Ping Yee cap-talk at zesty.ca
Mon Feb 7 18:11:27 EST 2005

> >Unfortunately, so far the response to this announcement has only
> >been "Oh well.  Too bad!"  No one can see any other way to make
> >IDNs work.  The only solution is to turn off IDNs altogether.
> Where have you seen this response? Over on the crypto + security at
> mozilla groups there has been quite a bit of chit chat on the problem,
> although I grant that nobody who claims to be a member of a security
> team has said anything yet.

I admit that my characterization of the response comes from various
blog entries about the vulnerability, not the Mozilla newsgroups.  So
perhaps it is not fair for me to say that there is no response.  Rumour
has it that Opera is claiming there is nothing wrong with their
implementation, which, if true, is quite depressing.

I am very disappointed that the implementors of IDNs in Firefox did not
anticipate this problem.  The problem is well known and well documented.
See http://www.icann.org/committees/idn/idn-codepoint-paper.htm or
http://www.cs.technion.ac.il/~gabr/papers/homograph.html for instance.
RFC 3454 (Stringprep) specifically points out:

    The Unicode and ISO/IEC 10646 repertoires have many characters that
    look similar.  In many cases, users of security protocols might do
    visual matching, such as when comparing the names of trusted third
    parties.  Because it is impossible to map similar-looking characters
    without a great deal of context such as knowing the fonts used,
    stringprep does nothing to map similar-looking characters together
    nor to prohibit some characters because they look like others.  User
    applications can help disambiguate some similar-looking characters by
    showing the user when a string changes between scripts.

Even if no one on the Firefox team read this paragraph, half a second
of thought about security and usability should have been enough to
realize that using Unicode in the location bar would create a
security-damaging source of ambiguity.
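
To illustrate the RFC's suggestion of showing the user when a string
changes between scripts, here is a minimal sketch in Python.  It is not
what Firefox or any other browser does.  The function names are made up
for this example, and it approximates a character's script by the first
word of its Unicode name, which is a rough heuristic rather than the
real Unicode Script property.

```python
import unicodedata

def scripts_in_label(label):
    """Return the set of rough script names used by the letters in label.

    Assumption: the first word of a character's Unicode name (for example
    "LATIN" or "CYRILLIC") is a good enough stand-in for its script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "UNKNOWN")
            scripts.add(name.split()[0])
    return scripts

def mixes_scripts(label):
    """True if the letters of label appear to come from more than one script."""
    return len(scripts_in_label(label)) > 1

# Hypothetical labels: the second one replaces the first "a" with
# U+0430 CYRILLIC SMALL LETTER A, which many fonts draw identically.
latin = "paypal"
spoof = "p\u0430ypal"

print(mixes_scripts(latin))   # False
print(mixes_scripts(spoof))   # True
```

A check like this only flags labels that mix scripts.  It does nothing
about a label written entirely in one script that happens to resemble a
label in another.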

> >However, i'm inclined to think that Unicode domain names are just
> >inherently insecure and should not be used.  Even if users learn
> >to identify sites with pet names, they are still vulnerable to
> >confusion if they look at the location bar, read the name there,
> >and type it into the location bar later.
> By this logic we should stop using language!

I think you're overstating my case.  My argument is that Unicode is
much too large a character set for humans to be able to reliably tell
its characters apart.

> The IDN situation has always existed in the domain system via
> PayPa1.com.  Should we not accept digits in domains?

It seems feasible to me to choose (or design) a font that would make
all of the 7-bit printable ASCII characters clearly distinguishable.
For domain recognition we don't even need all the characters -- just
the letters, digits, hyphen, and period.

This does not seem feasible to me for a character set as large as
Unicode, and it is flatly impossible for Unicode in particular because
Unicode includes characters that are *defined* to be invisible or
to combine with other characters to yield perfect homographs.
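
As a small illustration of both cases, the following Python sketch builds
strings that differ as code point sequences yet would normally render
the same.  The sample strings and the helper name are invented for this
example.  It shows only raw string behaviour and says nothing about what
any particular IDN processing step maps or rejects.

```python
import unicodedata

plain = "example"
with_zwsp = "exam\u200bple"    # contains U+200B ZERO WIDTH SPACE

precomposed = "caf\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
combining = "cafe\u0301"       # "e" followed by U+0301 COMBINING ACUTE ACCENT

def invisible_or_combining(s):
    """List (index, name) for format characters (Cf) and combining marks (Mn, Mc, Me)."""
    return [
        (i, unicodedata.name(ch, "UNNAMED"))
        for i, ch in enumerate(s)
        if unicodedata.category(ch) in {"Cf", "Mn", "Mc", "Me"}
    ]

print(plain == with_zwsp)                  # False
print(invisible_or_combining(with_zwsp))   # [(4, 'ZERO WIDTH SPACE')]

print(precomposed == combining)            # False
print(invisible_or_combining(combining))   # [(4, 'COMBINING ACUTE ACCENT')]

# Canonical normalization unifies the accented pair, but it is not a
# general answer to characters from different scripts that look alike.
print(unicodedata.normalize("NFC", combining) == precomposed)   # True
```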

-- ?!ng
