Happy Happy Joy Joy (was: On to Hydro)

Bill Frantz frantz@communities.com
Thu, 24 Aug 2000 10:59:58 -0700


At 04:09 AM 8/24/00 -0400, Tyler Close wrote:
>Jonathan Shapiro wrote:
>> I am currently building a system in which I rely on being
>> able to compute
>> cryptographic hashes of the content using an algorithm that
>> I specify.
>
>It is an unfortunate thing that cryptographic hash functions and
>hashtable hash functions are both called hash functions. They are two
>very different activities with very different purposes. Hashtable hash
>functions are used only for searching, whereas cryptographic hash
>functions are used for integrity checking and identification.
>
>It might be better if you think of cryptographic hashs as "message
>digests", or MACs (Message Authentication Codes), instead of hashs.
>This makes it clear that there are additional requirements for these
>hashs than for hashtable hashs (requirements that a hashtable does not
>need to meet).

Just for pedantic clarity:

A "hash" does not have to have cryptographic properties.  There are
algorithms (I am told) for building "perfect hashes", where each input
generates a different hash output.  These are useful when building hash
tables of fixed, known entries.  An example is the dispatch table for an
interpreted language.

A "message digest" has two characteristics which distinguish it from a
"hash".  (1) It is hard to build two inputs which hash to the same value.
(2) It is hard to go from the hash to the input.  Message digests can make
good hash table hashes.

A "message authentication code" (MAC) uses a secret key to allow
authentication of messages.  MACs can be built from symmetric cyphers and
message digests.  The HMAC* algorithm is a popular way of building a MAC
from a message digest because with weak security assumptions about the
underlying message digest, it is provably secure.

Note that all these hashes have the "birthday problem", where two inputs
produce the same output.  Message digests and MACs overcome this problem by
producing a long (128-160 bit) output.  When used with a hash table, the
birthday problem causes some chains to have several members while other
chains are empty.  Perfect hashes are a solution to this problem.

* See "Keying Hash Functions for Message Authentication", Bellare, Canetti,
and Krawiczyk (RFC-2104, and
http://www.research.ibm.com/security/keyed-md5.html; and "Message
Authentication using Hash Functions - The HMAC Construction" by the same
authors, which appears in CryptoBytes Vol 2, Number 1, Spring 1996
<http://www.rsalabs.com/cryptobytes/>.