Google performs first successful collision attack on SHA-1 security algorithm

Saturday, February 25, 2017

This Thursday, Google announced that it had performed a successful collision attack on the popular SHA-1 cryptographic hash function for the first time — that they know of. The collision attack demonstrated an algorithm for making two distinct inputs map to the same hash output, putting at risk the usage of SHA-1 for verifying data integrity. Google published a blog post and made a website about the collision attack.

An unofficial icon for group SHA1, SHA2, SHA3 family.
Image: Krauss.

A successful attacker would be able to add a malicious file to the system to damage a backup relying on SHA-1 for checking data integrity, deliver a malicious update to clients using SHA-1 to verify the update file, attack and decrypt an encrypted connection to a website where the user's browser is using SHA-1 to identify the connection certificate, replace a file revision history using SHA-1 to identify commits, and perform other actions that would substitute for valuable files with something seemingly identical but misleading or malicious in practice. Similar systems not using SHA-1 would not be affected.

To demonstrate the success of the algorithm, Google published two distinct Portable Document Format (PDF) files with identical SHA-1 hash. Google recommended everyone who uses SHA-1 to switch to SHA-256 or SHA-3, stronger cryptographic hash functions.

The group started with a paper published by Marc Stevens in 2013 in which the authors proposed a theoretical algorithm for the collision attack. They chose PDF files as the basis for their attack. They wrote a PDF file prefix on both files and PDF contents which hashed to the same SHA-1 digest. Then they used the Google computing infrastructure to perform more than nine quintillion (upwards of 9,223,372,036,854,775,808) SHA1 computations. They described this as 6,500 years of CPU time distributed in the first phase, then 110 years of graphics processing unit (GPU) time total in the second phase of the attack to reach full collision. They said this new algorithm was 100,000 times faster than the brute-force attack and 50 times faster than a collision attack algorithm proposed in 2005.

The cryptology group at the Centrum Wiskunde & Informatica (CWI) institute, Netherlands, collaborated with the Google Research Security, Privacy and Anti-abuse Group to achieve the collision. According to a press release, Marc Stevens and Elie Bursztein were the initial leaders of the initiative, Ange Albertini developed the PDF attack code, Pierre Karpman developed the cryptoanalysis, Yarik Markov and Pierre Karpman developed the distributed graphical processing unit (GPU) code, and Clement Baisse checked the reliability of the programs.

Google also provided a collision detector online, where users can upload a PDF file to test whether it was tampered with for reaching a collision. Alex Petit Bianco developed the online file collision detector. Following its security policy, Google set a 90-day delay before the release of the source code for the attack.

Operating systems and software rely on SHA-1 for verifying the file integrity when distributing updates to its users and in ISO checksums. Developers use SHA-1 in their file revision control systems, such as git, to verify the files' integrity. Many people install programs that use SHA-1 for detecting duplicate files on storage media and for verifying backups' integrity. People surfing the web see SHA-1 used for verifying the integrity of HTTPS certificates to verify the users' connection with the website is not subject to a man-in-the-middle attack. People also use SHA-1 in email PGP/GPG signatures.

Since January, Google Chrome does not trust SHA-1 certificates. Mozilla Firefox stopped trusting them yesterday.

SHA-1 was introduced more than twenty years ago.


Sources