[cap-talk] use of hashcodes?
Jack Lloyd
lloyd at randombit.net
Fri Feb 19 13:14:28 PST 2010
On Fri, Feb 19, 2010 at 12:27:17PM -0800, Raoul Duke wrote:
> On Fri, Feb 19, 2010 at 12:19 PM, Tyler Close <tyler.close at gmail.com> wrote:
> > The ETag values in Waterken are a SHA-256 HMAC of all state and code
> > used during a query. You'd have a pretty good paper if you could
> > generate a collision.
>
> just because something is a low probability...
>
> that's what i am curious about -- i get the impression (confirmed
> above :-) that most folks aren't kept up at night by the chance of
> collisions. nobody seems to talk about the implications of there ever
> being a collision.
I don't think it's really a matter of probability, except insofar in
many cases the best known algorithms are blind guessing or other brute
force techniques (as with SHA-256 HMAC I believe). It's possible
sometime later this afternoon someone will publish a paper on eprint
showing how to break HMAC with random unknown keys in constant
time. Or maybe it will not happen for another 100 years, or ever. I
don't have a good handle on the probability distribution for an event
like that, and I'd be surprised if anyone else did either. (Though it
would be interesting to try and derive an approximation for it from
historical evidence with other algorithms.) About the best we can say
is that it is low probability with the current state of public
knowledge, and that the odds of anyone discovering a catastrophic
weakness seem lower over time as more effort is spent on studying the
algorithm in question and the theory surrounding it (and I would
assess HMAC as among the better studied of crypto algorithms at this
point).
Even so, thinking about the implications of such a failure seems
worthwhile because it can expose the fragility of the system in
question. There are systems where a violation of a design assumption
(such as that the hash does not have known collisions, or that a nonce
is never reused) causes the system to break entirely, where another
system might suffer milder damage in the same situation. A canonical
example might be DSA versus RSA/PSS signatures: if you reuse a nonce
with a DSA signature, you reveal the private key, whereas if you reuse
a nonce with an RSA/PSS signature, there is almost no damage. Since
assumptions are 'made to be broken', it's worth figuring out what the
effects will be in such cases so one can keep an eye out for warning
signs and take precautions.
For instance if you know a reused nonce will utterly break your
system, you can know to spend extra time thinking about how to prevent
that from happening. If you just assume it won't happen and then don't
consider it further you can leave yourself open for problems later. Or
if you know a hash collision will break your system, you can keep this
in mind when you see a paper showing how to do collisions in 2^100
time; sure, impractical, but you know it's a hint to start moving to
something else so you don't end up still relying on it later when it's
trivially breakable.
-Jack
More information about the cap-talk
mailing list