The cryptography world has been buzzing with the news that researchers at Google and CWI Amsterdam have succeeded in generating a ‘hash collision’ for two different documents using the SHA1 hash algorithm, rendering the algorithm ‘broken’ according to cryptographic standards.
But what does this mean in plain language, and what are the implications for the bitcoin network?
As laid out in a recent CoinDesk explainer, a hash function (of which SHA1 is an example) is used to take a piece of data of any length, process it, and return another piece of data – the ‘hash digest’ – with a fixed length.
One way that hash functions are used in computing is to check whether the contents of files are identical: as long as a hash function is secure, two files which hash to the same value can be assumed, for all practical purposes, to have the same contents.
However, a hash collision occurs when two different files hash to the same value.
Given the mathematical laws that govern hash functions, it is inevitable that hash collisions will occur for some values of input data (because the range of data you could put into the hash function is potentially infinite, but the output length is fixed).
For a secure hash function, the probability of a collision should be so small that, in practice, it is not possible to make enough calculations to find one.
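To see why collisions are unavoidable but normally out of reach, it can help to shrink the output. The sketch below is a toy, not the researchers' method: it keeps only the first two bytes (16 bits) of each SHA1 digest, so there are just 65,536 possible outputs and a collision turns up almost immediately. The names `truncated_sha1` and `find_collision` are invented for this example.

```python
import hashlib
from itertools import count

def truncated_sha1(data: bytes, n_bytes: int = 2) -> bytes:
    """Toy hash: only the first `n_bytes` bytes of the SHA1 digest."""
    return hashlib.sha1(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Hash the messages b"0", b"1", b"2", ... until two share a digest."""
    seen = {}  # truncated digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        tag = truncated_sha1(msg, n_bytes)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg

a, b = find_collision()
print(a, b, truncated_sha1(a).hex())
```

With a 16-bit output the loop is certain to stop within 65,537 messages, because there are only 65,536 distinct outputs to go around. The full 160-bit SHA1 output is what is supposed to push the same search beyond practical reach.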
The significance of the Google/CWI team’s results is in the fact that they were able to create a hash collision by finding a much more efficient method – 100,000 times more efficient in fact – than simply guessing every possible value of data.
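A rough back-of-the-envelope calculation puts the 100,000-times figure in context. It assumes the baseline is the generic 'birthday' collision search, which for a 160-bit digest needs on the order of 2^80 hash computations; that baseline is an assumption of this sketch rather than a figure taken from the article.

```python
import math

DIGEST_BITS = 160                          # SHA1 output size
birthday_work = 2 ** (DIGEST_BITS // 2)    # generic collision search, about 2^80 (assumed baseline)
speedup = 100_000                          # efficiency gain quoted for the new method

attack_work = birthday_work / speedup
print(f"baseline: 2^{math.log2(birthday_work):.0f} hash computations")
print(f"with the speedup: roughly 2^{math.log2(attack_work):.1f} hash computations")
```

Under that assumption the remaining work is still in the region of 2^63 hash computations, which is why the attack is practical for a well-resourced team but not yet for a casual attacker.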
It’s the efficiency of this method that means SHA1 is now officially broken. (These results are outlined in more depth on SHAttered.io, with an explanation of systems affected.)
On 23rd February, a sharp-eyed Redditor on the /r/bitcoin page made a post pointing out that a long-standing bounty for discovering just such a SHA1 collision has now been claimed.
The bounty – intended to encourage the discovery of vulnerabilities in the algorithm – was originally announced by cryptography researcher Peter Todd in a post on the Bitcoin Talk forum in September 2013, but remained unclaimed until this week.
The challenge consisted of a script, written by Todd, which would allow anyone to move the bitcoins from the bounty address to an address of their choice if they could submit two messages which were not equal in value, but resulted in the same digest when hashed.
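The condition the script enforces can be sketched in a few lines. This is a Python analogue for illustration only: Todd's actual script is written in bitcoin's own scripting language and is not reproduced here, and the function name `bounty_condition_met` is hypothetical.

```python
import hashlib

def bounty_condition_met(msg_a: bytes, msg_b: bytes) -> bool:
    """True only if the two messages differ but share a SHA1 digest."""
    messages_differ = msg_a != msg_b
    digests_match = hashlib.sha1(msg_a).digest() == hashlib.sha1(msg_b).digest()
    return messages_differ and digests_match

# Ordinary inputs fail one of the two checks.
print(bounty_condition_met(b"same", b"same"))    # identical messages: rejected
print(bounty_condition_met(b"one", b"another"))  # different digests: rejected
```

Only a genuine collision pair, such as the contents of the two documents published by the researchers, would make both checks pass at once.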
Todd and other contributors donated to the bounty fund, raising a total of 2.5 bitcoins.
According to the researcher, the timing of the claim – slightly after publication of the collision attack – suggests that the reward was taken by a third party who had read the Google team’s research and made use of the results, rather than by one of the original researchers.
Todd said:
“If it was the authors themselves, we would have expected the bounty to be claimed just prior to the announcement being published. As it happened, that wasn’t the case.”
It’s important to stress that the cryptography underpinning the bitcoin network, which makes use of the more secure SHA256 algorithm, is not directly affected by the discovery.
But, besides enriching the mystery bounty recipient, the SHA1 collision vulnerability does pose a concern for the bitcoin development community, since the Git version control system it relies on uses SHA1 to generate the hash digest for commits.
“The consequences aren’t that we have to stop using Git immediately,” Todd said, “but it will make it more important to review other people’s work, because a third party could try to push a malicious commit in.”
The vulnerability here is that an attacker could theoretically create two different versions of a code commit that would appear to be the same when hash values were compared – though for now, given the vast number of computations still needed to find a collision, it’s highly unlikely that could happen.
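For readers curious how Git derives these identifiers, the sketch below reproduces the scheme: Git prepends a short header (the object type, the content length and a zero byte) to the content and takes the SHA1 of the result. The function name `git_object_id` is made up for this example; commit objects are hashed the same way, with `commit` as the type.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    """Compute the ID Git assigns to an object of the given type and content."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

# The ID Git gives to an empty file, and to a file containing "hello world".
print(git_object_id("blob", b""))
print(git_object_id("blob", b"hello world\n"))
```

Because the ID depends on nothing but the bytes being hashed, two different objects that collide under SHA1 would be indistinguishable to anyone comparing IDs alone, which is the scenario Todd describes.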
As well as SHA1, Todd has placed similar bounties on the RIPEMD160 and SHA256 hash functions. Both are necessary for the integrity of the bitcoin standard, so a compromise of either would be calamitous for the network.
Todd concluded:
“If you claim that bounty, you better go spend your bitcoins pretty quick.”