homeworks

homeworks Solutions to Homework 3
"The HALL of FAME"

Problem 4 (Part B) was solved by:
Uri Barenholz, Roman Dovgard, Omer Holz, Leonid Karlinski, Emmanuel Lanzmann, Itsik Mantin, Yaniv Meoded, Eran Ofek, Eduard Oks.

This was a "relax" homework with an exception of problem 4 (iterative cryptosystems) and 5 (grandpre, hill climbing contest which nobody has solved so far). The average grade was 85. Below are several best solutions. Again, some people submitted excellent works by e-mail or in hand writing, so they can't be shown here.

Solution by Roman Dovgard (PS file) [1, 3, 4A, 4B,5B]
Solution by Paul Koning (text file) [1, 2, 3, 4A, 5B]
Solution by Omer Golz (in Hebrew) [1, 3, 4A, 4B]
Solution by Uri Barenholz (text file) [4A,4B]
Solution by Emmanuel Lanzmann [1, 2, 4A,4B]
Solution by Itsik Mantin (PS file) [1, 4A,4B]

Brief Comments

1. Optimal Trial and Error
A famous problem and a good training for cryptanalyst.

Each weighing has the power of splitting the coin-space into three parts. Single weighing provides us with a "trit" of information (log_3 (S)-- log to the base 3 of the number of possible states).
Even when the weight of the fake coin is known 2 weighings can work only up to 9 coins. Since we have 12 and 13 coins we need at least 3 weighings.
The system can be in 2*12 = 24 or in 2*13 = 26 states, thus log_3 (26) <3 and 3 weighings should be enough for full determination of the state. This is however not the case since we can weight only integer number of coins each time.
Note that in case of 12 coins we can obtain full information (the weight of the fake coin), however in the case of 13 coins our best split is, 4 4 5 and if there is a balance between the 4-coin groups we are left with the case of 5 coins. In this case there is no additional information about the possible weight of the fake coin. There are 2*5 = 10 > 9 possible states, and thus two weighings will not always reveal the weight of the fake coin.

It is interesting to show that for 14 coins 3 weighings are not enough, although information theoretically it is not obvious. See Paul Koning's work for detailed analysis.

2. Unicity Distance
A. one can use D = R - R1 = 2.7 since transposition cipher preserves letter frequencies.

The aim of this exercise was to give some crude estimates for the Grandpre contest. However, not much can be done in the cases B, C except to apply random cipher formula. Nobody knows how estimate UD even for simple substitution case. Friedman's claimed 25 letters is hard to believe (without spaces), and Grandpre is not even a pure cipher.

B. The answer is: 410/3.2 ~ 128
C. The answer is: 381/3.2~119
this is clearly misleading, since UD in the second case must be larger and not smaller. Interesting analysis is given in Paul Koning's work.

3. Pure Ciphers
a) Pure; residue class is exactly as in a simple substitution cipher
(T^{-1}_j T_k is a simple substitution).

b) Not pure (one of possible counterexamples):

T_k: AA BB -> GXGX XGXG->GXGXGXGX
T^{-1}_j: GXGXGXGX ->GXGXGXGX -> CCCC
T_i: CCCC -> AAAAAAAA -> AAAAAAAA
no T_l can transform AABB into AAAAAAAA

c) Not pure (one of the possible counterexamples):

T_k:     A -> 12, B-> 40; AB -> 12 40
T^{-1}_j:12,40 -> C;   12 40 -> C C
T_i:     C -> 11; C C -> 11 11
no T_l can transform AB into 11 11.

4. Iterative ciphers
A. Playfair
Seven queries in adaptive chosen plaintext attack (i.e. our next query depends on the results of the previous queries) are enough if Playfair matrix is random. Less if it is built with a keyword.

In this cipher substitution and transposition commute. Thus one cancels out the effect of the transposition which being raised to the fourth power becomes an identity.
Except for the letters that are in the same row/column Playfair^4 is and identity. This happens in 1-8/25 cases. Thus about 2/3 of the plaintext is not encrypted. However contrary to popular belief this is not sufficient for the text to be readable. Note that one time pad also "leaks" about half of the plaintext bits to the ciphertext.

You are welcome to prove the opposite (there are 12 errors in the 37-letter text below): Testlitesolanyqrhoroismtitlhirmtobead.
Note that iterating our cipher 5 times gives cleartext.

4B. Generic
338 queries, 338 steps to recover full S-box. This is about 4.7 times more than UD of this cipher.

Information theoretically there are: 676! ~ 2^{5380} keys. Each ciphertext block provides us with log(26^{16}) ~ 75 bits of information. Thus Unicity Distance is about 5380/75 ~ 72 blocks.
This cipher has small invariant subset (of size 676): {abababababababab, for any ab}. The action of the cipher on this subset leaks information about the secret key (S-box) for any number of rounds.
Weak avalanche (propagation of changes): if abababababababab->efefefefefefefef and if cdcdcdcdcdcdcdcd->ghghghghghghghghthen ababababcdcdcdcd->......ef......gh .Information is leaked from plaintext to the ciphertext.
Denote S-box by S and swap of two letters by T, then with just 676/2=338 queries we get a full structure of permutation: Q= (TS)^4. Our aim is to find a quartic root of this permutation: (TS).
Given a cycle of length k in TS , if gcd(k,4)=1 then it corresponds to a cycle of the same length in Q. If gcd(k,4)=2 , there should be a pair of cycles of length k/2 in Q corresponding to a single cycle of length k in TS. If gcd(k,4)=4, there are four cycles of length k/4 in Q.
If S is a random permutation, then it has about log(676)~6.5 cycles. 62% of elements are in a single cycle. With probability 1/2 the length of this cycle is odd and thus we recover about 419 entries in the S-box table. Note that Q has a skewed statistic since it is a fourth power of a random permutation. For example it tends to have twice as many cycles as S (denote by Nc the number of cycles in S): #{gcd(k,4)=1}+#{gcd(k,4)=2}+#{gcd(k,4)=4} = Nc/2+2*Nc/4+4*Nc/4 = 2*Nc).
If cycle splitting has happened we may have two problems: a) finding correct pairs or quartets of cycles (not a major problem in the case of few cycles that we have); b) finding relative shift of the cycles: for two cycles at most 338 guesses are needed, for four cycles at most 3!(676/4)^3 = 2^{24} steps are needed. We need 1-2 additional known plaintext queries in order to check the guesses. Wrong guess has the following probability to survive each query: 1/26^{16} ~ 2^{-75}.

Here we assumed that the number of cycles is small, which is plausible since the variance of the number of cycles is about ln(n)~6.5 (see here). However the rare case of many small cycles quickly comes out of control. So here is another method to attack the problem:

Slide Attacks

Suppose that we know the outcome of a single round (TS(P)) for a certain plaintext P. Consider a pair of plaintexts P and P'=TS(P). Denoteby C and C' their corresponding ciphertexts. Since all rounds perform the same function, the two encryptions will produce the same results with a slide by one round. Note that the ciphertext of the first encryption is just the input to the fourth round in the second encryption. Since ciphertexts are known we receive an equation C' = TS(C). This gives us 8 entries in the S table. Few slid queries will provide us with necessary cycle binding information.

So the attack will be as follows:

Make 338 queries to map the structure of Q. Find largest odd cycles in Q and thus recover corresponding cycles of S.
Use recovered information about S to construct few slid pairs to fill the gaps.

Refinements are possible (even a short odd cycle is sufficient to start the second (sliding) phase of the attack).

Other Attacks
This cipher allows many efficient attacks. One can consider:

Differential attack.
Meet in the middle?

5. Grandpre

While many solved the second test grandpre, nobody has solved the 1st test and the contest problems. You are welcome to try, the stakes are increased.

Here is the program for testing the grandpre matrix written by
Dani Halevi and Raanan Wald.

Last modified on 12.06.2000.