[cap-talk] use of hashcodes?

David Wagner daw at cs.berkeley.edu
Mon Feb 22 12:15:09 PST 2010


David Barbour wrote:
>I'm curious as to why you believe this to be a 'use case for
>cryptography'. Precisely which attack does having a 'big'
>cryptographically-secure ETag prevent?

My understanding is that the mechanism Tyler chose helps to ensure the
following property to a high degree of confidence: If two GET requests
receive the same ETag, then their responses would be the same.
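
To make that property concrete, here is a minimal Python sketch of a
conditional GET. It is an illustration under assumptions, not Waterken's
actual code: SECRET_KEY, make_etag, render and handle_get are
hypothetical names, and I assume the ETag is an HMAC-SHA256 tag computed
over the state that the response depends on.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real server would generate and
# store this privately.
SECRET_KEY = b"server-side secret (hypothetical)"


def make_etag(state: bytes) -> str:
    """Assumed scheme: ETag = HMAC-SHA256 over the relevant state."""
    tag = hmac.new(SECRET_KEY, state, hashlib.sha256).hexdigest()
    return '"' + tag + '"'


def render(state: bytes) -> bytes:
    """Stand-in for whatever response the server derives from its state."""
    return b"response for " + state


def handle_get(state: bytes, if_none_match=None):
    """Return (status, etag, body); body is None when the ETag matches."""
    etag = make_etag(state)
    if if_none_match == etag:
        # Client's cached copy is still valid: 304 Not Modified.
        return 304, etag, None
    return 200, etag, render(state)


# First GET: full response plus an ETag.
status, etag, body = handle_get(b"S")
# Second GET with the ETag: the state is unchanged, so no body is sent.
print(handle_get(b"S", if_none_match=etag)[0])
# After the state changes, the same ETag no longer matches.
print(handle_get(b"S'", if_none_match=etag)[0])
```

The client may reuse its cached response only when the server answers
304, so the scheme is correct only if equal ETags imply equal responses.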

If you want to talk about concrete attacks, that's probably easier to
do in the context of a specific counterproposal for what you would use
in place of SHA256-HMAC.

To illustrate why some kind of cryptography may be needed, let's look
at an example of what could happen if we used some non-cryptographic
hash function to compute ETags.  For instance, using CRC32 to compute
ETags might not be a good choice, because with CRC32 checksums, it is
plausible that two responses to a GET query could have different content
but be associated with the same ETag value.  As an example, consider
a scenario with Andy the Attacker, Wally the Waterken Web service,
and Vicky the Victim.  Andy is malicious, and wants to attack Vicky.
Wally is a credulous well-intentioned web service; Wally is not malicious.
Vicky relies upon Wally and invokes Wally's public interface.  Andy has
access to Wally and can issue queries to Wally that affect Wally's state,
and thus can get partial control over Wally's state, and Andy wants to
use that to manipulate Vicky.  It may be the case that Andy has enough
control over Wally's internal state to mount the following attack:
Wally's initial state is S.  Vicky issues a GET request to Wally, and
gets a response (which depends upon S) along with an ETag E = CRC32(S).
Andy issues a POST request to Wally, which is carefully chosen to
modify Wally's state to some new state S' with the property that S != S'
but CRC32(S) = CRC32(S').  Now Vicky issues her GET request to Wally a
second time, along with the ETag E.  Note that if we were using CRC32,
the server might respond saying that the ETag has not changed, since the
CRC32 of Wally's new internal state S' matches the ETag E sent by Vicky.
As a result Vicky might conclude that Wally's response to her 2nd GET
request would be the same as Wally's response to her 1st GET request --
which is a failure of the caching protocol, as the actual response that
Wally would have provided (in the absence of the ETag optimization) might
be different.  In short, this attack on the caching mechanism causes
Vicky to receive a response other than what she should have received.
That might in turn enable some attacks against Vicky, depending upon
the details of the scenario.  That's a kind of attack we might like to
defend against.
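
The step where Andy picks S' can be sketched in Python. The sketch
assumes, purely for illustration, that Andy controls the last four bytes
of Wally's state; the state strings, the key, and the names forge_suffix
and hmac_etag are hypothetical. It relies on CRC32 being an affine
function over GF(2), so the four bytes can be solved for with linear
algebra rather than searched for.

```python
import hashlib
import hmac
import zlib


def forge_suffix(prefix: bytes, target_crc: int) -> bytes:
    """Return 4 bytes s such that zlib.crc32(prefix + s) == target_crc."""
    base = zlib.crc32(prefix + bytes(4))
    # Build an XOR basis from the CRC change caused by each suffix bit.
    basis = {}  # highest set bit -> (crc delta, suffix bits producing it)
    for j in range(32):
        vec = zlib.crc32(prefix + (1 << j).to_bytes(4, "little")) ^ base
        mask = 1 << j
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = (vec, mask)
                break
            vec ^= basis[top][0]
            mask ^= basis[top][1]
    # Write the wanted CRC change as an XOR of basis vectors.
    vec, mask = target_crc ^ base, 0
    while vec:
        bvec, bmask = basis[vec.bit_length() - 1]
        vec ^= bvec
        mask ^= bmask
    return mask.to_bytes(4, "little")


def hmac_etag(key: bytes, state: bytes) -> str:
    """ETag as an HMAC-SHA256 tag, for comparison with CRC32."""
    return hmac.new(key, state, hashlib.sha256).hexdigest()


# Hypothetical states: S is what Vicky saw, S' is what Andy installs.
s = b"owner=vicky;limit=100"
prefix = b"owner=andy;limit=999;pad="
s_forged = prefix + forge_suffix(prefix, zlib.crc32(s))

key = b"hypothetical server key"
print(zlib.crc32(s) == zlib.crc32(s_forged))            # CRC32 ETags
print(hmac_etag(key, s) == hmac_etag(key, s_forged))    # HMAC ETags
```

With CRC32 the two states share an ETag even though they differ. With a
keyed HMAC, Andy has no known practical way to do the same without the
key.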

At a minimum, it makes it harder to reason about the correctness of the
system if the ETags mechanism fails to satisfy the property I mentioned
above.  To the extent that the security of some principals depends upon
the correctness of the system, this could potentially be a security risk.

>To be clear: Raould asked about use of hash codes for security. While
>he did mention ETags, his question was not limited to them. Saying
>"this has nothing to do with ETags" repeatedly, as you have been doing
>from the very beginning, isn't of much help in answering Raould's OP
>question.

OK, perhaps I misunderstood, then.  My apologies.

I thought the main thrust of Raould's original question was about
collisions, and I think that question has been answered -- but OK,
I see you are commenting more generally on the use of hash functions
in secure systems.

>> That goal has nothing to do with the very narrow question of what is the
>> best way to generate ETags values.
>
>Who asked this very narrow question?
>
>It seems you dodged this "very narrow question" when you answered
>"Compared to not using ETags" when I asked how using HMAC for ETags
>compares for performance, so I don't think you're actually interested
>in answering it.

Perhaps I misunderstood what question you were asking; my apologies.
If you're asking whether Waterken would be more efficient if it used
public-key signatures instead of SHA256-HMAC to compute its ETags, I
think it's pretty clear the answer is no, it would not.  If I've still
misunderstood the question, I apologize again.

Previously I wrote:
  I doubt that SHA256-HMAC is the performance bottleneck in Waterken [...]

  Moreover, caching using ETags has important performance benefits, so I'd
  expect that the net performance impact of Waterken's use of SHA256-HMAC
  for ETags is a win for performance, not a loss.
and you wrote:
  Compared to what, exactly?
to which my response was intended to elaborate that my "Moreover..."
paragraph was a comparison to the alternative of not using ETags at all.
It was not my intent to dodge any question, but rather to clarify what I
was trying to say in that paragraph.

