PROBLEM LINKS
DIFFICULTY
CHALLENGE
PREREQUISITES
Ad-Hoc, Brute Force
PROBLEM
You are given N servers, the vast majority of which are honeypots. You have to guess the passwords of these servers. You know that each server mixes the password with a random salt and then compares the SHA-1 hashes of the two strings.
The score is the sum, over the servers, of the square of the number of correctly matched bits. Honeypot servers do not affect the score at all.
EXPLANATION
Firstly, you can try to find which servers are honeypot servers.
This can be done by first getting a reference score with some initial set of passwords; then
- Change a single password to some other string
- If the score changes, then that server is not a honeypot
- Otherwise the server is a honeypot
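A minimal sketch of the per-server check above. It assumes a hypothetical `query(passwords)` function that submits one full set of guesses and returns the total score; the actual judge interface is not described here, and the strings "a" and "b" are arbitrary placeholders.

```python
from typing import Callable, List


def find_honeypots(n: int, query: Callable[[List[str]], int]) -> List[bool]:
    """Return a list whose entry i is True if server i looks like a honeypot.

    `query` is a hypothetical stand-in for one scoring attempt.
    """
    base = ["a"] * n            # arbitrary reference passwords
    ref_score = query(base)     # one attempt to get the reference score
    is_honeypot = []
    for i in range(n):
        trial = list(base)
        trial[i] = "b"          # change only server i
        # An unchanged score suggests the server does not contribute to it.
        is_honeypot.append(query(trial) == ref_score)
    return is_honeypot
```

This uses N + 1 attempts. A real server can, by chance, match the same number of bits for both strings and would then be misclassified; retesting suspected honeypots with a third string lowers that risk at the cost of more attempts.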
Since we know that the density of honeypot servers is high, we can be smarter and test whole blocks of servers at once. If the score does not change when every password in a block is changed, the whole block is treated as honeypots. If the score does change for a block we hoped was all honeypots, we then test its servers one by one.
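The block idea can be sketched as follows, again assuming a hypothetical `query(passwords)` scoring function. The block size of 8 is an illustrative guess, not a value taken from the setter's or tester's solution.

```python
from typing import Callable, List


def find_honeypots_blocked(
    n: int, query: Callable[[List[str]], int], block: int = 8
) -> List[bool]:
    """Classify servers block by block, falling back to single tests."""
    base = ["a"] * n
    ref_score = query(base)
    is_honeypot = [True] * n
    for start in range(0, n, block):
        indices = range(start, min(start + block, n))
        trial = list(base)
        for i in indices:
            trial[i] = "b"      # change every password in the block
        if query(trial) == ref_score:
            continue            # whole block looks like honeypots
        # The block contains at least one real server: test one by one.
        for i in indices:
            single = list(base)
            single[i] = "b"
            if query(single) != ref_score:
                is_honeypot[i] = False
    return is_honeypot
```

With mostly honeypots, most blocks cost a single attempt. One caveat: the changes of two real servers in the same block could cancel out exactly and leave the score unchanged, so a block reported as all honeypots is a strong hint rather than a proof.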
After finding the honeypot servers, we must now try to find the passwords.
The chances of finding the exact password for any server in this problem are extremely slim. Because of the nature of SHA-1, a slightly changed string produces an unrelated hash, so you cannot hope to tweak a string a little and get a slightly better result.
Your best bet is to generate random strings and see which one gets you closer to better and better scores.
One method is to generate a corpus of random strings and then, for each server that is not a honeypot, try every string in the corpus while keeping the passwords of all other servers constant, keeping whichever string maximizes the score.
This is what the tester's solution does. You have to be careful not to exceed the number of attempts you have.
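A rough sketch of this corpus search with an attempt budget. It is not the tester's actual code: `query`, `budget`, the corpus size, and the string length are all assumptions made for illustration.

```python
import random
import string
from typing import Callable, List, Tuple


def greedy_search(
    n: int,
    query: Callable[[List[str]], int],
    is_honeypot: List[bool],
    budget: int,                # assumed >= 1: total attempts allowed
    corpus_size: int = 20,
    seed: int = 0,
) -> Tuple[List[str], int]:
    """Improve one server at a time using a fixed corpus of random strings."""
    rng = random.Random(seed)
    corpus = [
        "".join(rng.choices(string.ascii_lowercase, k=8))
        for _ in range(corpus_size)
    ]
    current = [corpus[0]] * n
    best = query(current)
    used = 1
    for i in range(n):
        if is_honeypot[i]:
            continue            # honeypots never change the score
        for cand in corpus[1:]:
            if used >= budget:
                return current, best   # stop before exceeding the limit
            trial = list(current)
            trial[i] = cand     # all other passwords stay constant
            score = query(trial)
            used += 1
            if score > best:
                best, current = score, trial
    return current, best
```

Skipping honeypots is what makes the earlier detection step pay off: every attempt saved there can be spent on a server that actually affects the score.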
Note that we cannot try to use plaintext correlations between the passwords because the score is being calculated upon SHA-1 hashes that have little correlation to the plaintext that is being hashed.
Thus, it is only possible to try and exploit the correlations between two strings in their calculated SHA-1 hashes and the score difference caused by them. What this means is
- On one server, keeping all other servers' passwords the same, changing the password changes the score.
- Try to estimate how many bits of the SHA-1 hash are correct for the old password and for the new password
- Since we know the difference in scores is caused by the change of only one string
- We can try and find x and y in $x^2 - y^2 = score_1 - score_2$
- x and y are the number of correct bits in the SHA-1 hashes of the two strings
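The equation in the steps above usually has several solutions, which a short enumeration makes visible. In this sketch `delta` stands for score1 - score2, x is taken to belong to the string that produced score1, and both counts are limited to 0..160 because a SHA-1 hash has 160 bits. The function name is made up for illustration.

```python
import math
from typing import List, Tuple


def bit_count_pairs(delta: int, bits: int = 160) -> List[Tuple[int, int]]:
    """List all (x, y) with x*x - y*y == delta and 0 <= x, y <= bits."""
    pairs = []
    for y in range(bits + 1):
        x_squared = y * y + delta
        if x_squared < 0:
            continue
        x = math.isqrt(x_squared)
        if x <= bits and x * x == x_squared:
            pairs.append((x, y))
    return pairs


# A score difference of 825 is consistent with 85 vs 80 correct bits,
# but also with four other pairs.
print(bit_count_pairs(825))
```

Since a random string is expected to match about half of the 160 bits, candidate pairs near 80 are the most plausible, though a single difference rarely pins the pair down uniquely.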
Once we find a string that has too few correct bits, we can ignore any string whose SHA-1 hash matches many bits of this string's hash. This way, we should be able to make a more informed attack on the SHA-1 hash based scoring.
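A sketch of that pruning step, under a loud assumption: it hashes the bare strings locally, which only mirrors the server's comparison if the salt can be reproduced on your side. The helper names and the threshold of 90 matching bits are hypothetical.

```python
import hashlib
from typing import List


def sha1_int(s: str) -> int:
    """SHA-1 digest of s as a 160-bit integer (no salt: an assumption)."""
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big")


def matching_bits(a: str, b: str) -> int:
    """Number of bit positions where the two SHA-1 hashes agree."""
    differing = bin(sha1_int(a) ^ sha1_int(b)).count("1")
    return 160 - differing


def prune(candidates: List[str], bad: str, threshold: int = 90) -> List[str]:
    """Drop candidates whose hash is too similar to a known-bad string."""
    return [c for c in candidates if matching_bits(c, bad) < threshold]
```

The check costs no attempts, since it runs entirely offline, so it can be applied to a large pool of random strings before any of them is submitted.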
SETTER'S SOLUTION
Can be found here.
TESTER'S SOLUTION
Can be found here.