
Samuel Neves wrote:
Furthermore, the way extended output works breaks pseudorandomness. Suppose I have a keyed MD5 instance and want to generate several blocks of output. The expectation here is that _all_ of the output is indistinguishable from a random string of the same length. But that is not the case here. What we have instead is:
    first_block  = MD5(k || m || padding)
    second_block = MD5(k || m || padding || more padding)
    ...
An attacker who observes this can easily distinguish it from random output: take first_block, which is exactly the internal MD5 state, use it as the chaining value to hash the extra padding, and check whether the result equals second_block.
This only works if the message length is known to the attacker, because the "more padding" part includes the message length, which is incremented by each call to result().
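
To make the check concrete, here is a rough Python sketch of the distinguisher. It rests on an assumption about how result() behaves that I have not verified against the implementation under discussion: that a second call appends standard MD5 padding again, computed over the already-padded length. The helper names (md5_compress, md5_padding, looks_like_extended_output) are made up for this example, and hashlib is used only to simulate the two output blocks.

```python
import hashlib
import math
import struct

# Per-step rotation amounts and sine-derived constants from RFC 1321.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def md5_compress(state, block):
    """Apply the MD5 compression function to one 64-byte block."""
    a, b, c, d = state
    w = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + K[i] + w[g]) & 0xFFFFFFFF
        a, d, c = d, c, b
        b = (b + rotl(f, S[i])) & 0xFFFFFFFF
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d)))


def md5_padding(length):
    """Standard MD5 padding for a message of `length` bytes."""
    zeros = (55 - length) % 64
    bit_length = (length * 8) & 0xFFFFFFFFFFFFFFFF
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", bit_length)


def looks_like_extended_output(first_block, second_block, total_len):
    """Attacker's check; total_len is the assumed byte length of k || m."""
    # The 16-byte digest is the internal state, serialized little-endian.
    state = struct.unpack("<4I", first_block)
    # Assumption: the second result() call pads over the already-padded length.
    padded_len = total_len + len(md5_padding(total_len))
    extra = md5_padding(padded_len)  # exactly one 64-byte block
    guess = struct.pack("<4I", *md5_compress(state, extra))
    return guess == second_block


# Simulate the victim under the assumed model (the key is unknown to the attacker).
key, msg = b"secret key", b"message"
first = hashlib.md5(key + msg).digest()
second = hashlib.md5(key + msg + md5_padding(len(key + msg))).digest()

print(looks_like_extended_output(first, second, len(key + msg)))
```

The attacker never touches the key, only the two output blocks and a length. If the length guess is wrong by enough to change the length field in the extra padding, the recomputed block no longer matches, which is the caveat above.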