homeworks
Solutions to Homework 3
"The HALL of FAME"
Problem 4 (Part B) was solved by:
Uri Barenholz, Roman Dovgard, Omer Holz, Leonid Karlinski, Emmanuel
Lanzmann, Itsik Mantin, Yaniv Meoded, Eran Ofek, Eduard Oks.
This was a "relax" homework with an exception of problem
4 (iterative cryptosystems) and 5 (grandpre, hill climbing contest which
nobody has solved so far). The average grade was 85.
Below are several best solutions. Again, some people submitted excellent
works by e-mail or in hand writing, so they can't be shown here.
-
Solution
by Roman Dovgard (PS file) [1, 3, 4A, 4B,5B]
-
Solution
by Paul Koning (text file) [1, 2, 3,
4A, 5B]
-
Solution
by Omer Golz (in Hebrew) [1, 3, 4A, 4B]
-
Solution
by Uri Barenholz (text file) [4A,4B]
-
Solution
by Emmanuel Lanzmann [1, 2, 4A,4B]
-
Solution
by Itsik Mantin (PS file) [1, 4A,4B]
Brief Comments
1. Optimal Trial and Error
A famous problem and a good
training for cryptanalyst.
-
Each weighing has the power of splitting
the coin-space into three parts. Single weighing provides us with a "trit"
of information (log_3 (S)-- log to the base 3 of the number of possible
states).
-
Even when the weight of the fake coin
is known 2 weighings can work only up to 9 coins. Since we have 12
and 13 coins we need at least 3 weighings.
-
The system can be in 2*12 = 24 or in
2*13 = 26 states, thus log_3 (26) <3 and 3 weighings should be enough
for full determination of the state. This is however not the case since
we can weight only integer number of coins each time.
-
Note that in case of 12 coins we can
obtain full information (the weight of the fake coin), however in the case
of 13 coins our best split is, 4 4 5 and if there is a balance between
the 4-coin groups we are left with the case of 5 coins. In this case there
is no additional information about the possible weight of the fake coin.
There are 2*5 = 10 > 9 possible states, and thus two weighings will not
always reveal the weight of the fake coin.
It is interesting to show that for 14 coins 3 weighings are
not enough, although information theoretically it is not obvious. See Paul
Koning's work for detailed analysis.
2. Unicity Distance
A. one can use
D = R - R1 =
2.7 since transposition cipher preserves letter frequencies.
The aim of this exercise was to
give some crude estimates for the Grandpre contest. However, not much can
be done in the cases B, C except to apply random cipher formula. Nobody
knows how estimate UD even for simple substitution case. Friedman's claimed
25 letters is hard to believe (without spaces), and Grandpre is not even
a pure cipher.
B. The answer is: 410/3.2
~ 128
C. The answer is: 381/3.2~119
this is clearly misleading, since
UD in the second case must be larger and not smaller. Interesting analysis
is given in Paul
Koning's work.
3. Pure Ciphers
a) Pure; residue class is exactly
as in a simple substitution cipher
(T^{-1}_j T_k is a simple
substitution).
b) Not pure (one of possible counterexamples):
T_k:
AA BB -> GXGX XGXG->GXGXGXGX
T^{-1}_j: GXGXGXGX ->GXGXGXGX
-> CCCC
T_i:
CCCC -> AAAAAAAA -> AAAAAAAA
no T_l can transform AABB into AAAAAAAA
c) Not pure (one of the possible
counterexamples):
T_k:
A -> 12, B-> 40; AB -> 12 40
T^{-1}_j: 12,40 -> C;
12 40 -> C C
T_i:
C -> 11; C C -> 11 11
no T_l can
transform AB into 11 11.
4. Iterative ciphers
A. Playfair
Seven queries in adaptive
chosen plaintext attack (i.e. our next query depends on the results of
the previous queries) are enough if Playfair matrix is random. Less if
it is built with a keyword.
-
In this cipher substitution and transposition
commute. Thus one cancels out the effect of the transposition which being
raised to the fourth power becomes an identity.
-
Except for the letters that are in
the same row/column Playfair^4 is and identity. This happens in 1-8/25
cases. Thus about 2/3 of the plaintext is not encrypted. However contrary
to popular belief this is not sufficient for the text to be readable. Note
that one time pad also "leaks" about half of the plaintext bits to the
ciphertext.
You are welcome to prove the opposite
(there are 12 errors in the 37-letter text below): Testlitesolanyqrhoroismtitlhirmtobead.
Note that iterating our cipher
5 times gives cleartext.
4B. Generic
338 queries, 338 steps to recover full S-box. This is
about 4.7 times more than UD of this cipher.
-
Information theoretically there are:
676! ~ 2^{5380} keys. Each ciphertext block provides us with log(26^{16})
~ 75 bits of information. Thus Unicity Distance is about 5380/75
~ 72 blocks.
-
This cipher has small
invariant subset (of size 676):
{abababababababab, for any ab}. The action of the cipher
on this subset leaks information about the secret key (S-box) for any number
of rounds.
-
Weak avalanche (propagation of changes):
if abababababababab->efefefefefefefef and if cdcdcdcdcdcdcdcd->ghghghghghghghgh
then
ababababcdcdcdcd->......ef......gh
.
Information is leaked from plaintext to the ciphertext.
-
Denote S-box by S and
swap of two letters by T, then with just 676/2=338 queries we get
a full structure of permutation: Q= (TS)^4. Our aim is to
find a quartic root of this permutation: (TS).
-
Given a cycle of length k in
TS
, if gcd(k,4)=1 then it corresponds to a cycle of the same
length in Q. If gcd(k,4)=2 , there should be
a pair of cycles of length k/2 in Q corresponding to a single
cycle of length k in TS. If gcd(k,4)=4, there are
four cycles of length k/4 in Q.
-
If S is a random permutation,
then it has about log(676)~6.5 cycles. 62% of elements are in a single
cycle. With probability 1/2 the length of this cycle is odd and thus we
recover about 419 entries in the S-box table. Note that Q has a
skewed statistic since it is a fourth power of a random permutation. For
example it tends to have twice as many cycles as S (denote by Nc
the number of cycles in S): #{gcd(k,4)=1}+#{gcd(k,4)=2}+#{gcd(k,4)=4}
= Nc/2+2*Nc/4+4*Nc/4 = 2*Nc).
-
If cycle splitting has happened we
may have two problems: a) finding correct pairs or quartets of cycles (not
a major problem in the case of few cycles that we have); b) finding
relative shift of the cycles: for two cycles at most 338 guesses are needed,
for four cycles at most 3!(676/4)^3 = 2^{24} steps are needed. We need
1-2 additional known plaintext queries in order to check the guesses. Wrong
guess has the following probability to survive each query: 1/26^{16} ~
2^{-75}.
Here we assumed that the number of
cycles is small, which is plausible since the variance of the number of
cycles is about ln(n)~6.5 (see here).
However the rare case of many small cycles quickly comes out of control.
So here is another method to attack the problem:
Slide Attacks
Suppose that we know the outcome
of a single round (TS(P)) for a certain plaintext P.
Consider a pair of plaintexts P and P'=TS(P). Denoteby
C and C' their corresponding ciphertexts. Since all rounds
perform the same function, the two encryptions will produce the same results
with a slide by one round. Note that the ciphertext of the first
encryption is just the input to the fourth round in the second encryption.
Since ciphertexts are known we receive an equation C' = TS(C). This
gives us 8 entries in the S table. Few slid queries will
provide us with necessary cycle binding information.
So the attack will be as follows:
-
Make 338 queries to map the structure
of Q. Find largest odd cycles in Q and thus recover
corresponding cycles of S.
-
Use recovered information about S
to construct few slid pairs to fill the gaps.
Refinements are possible (even a short
odd cycle is sufficient to start the second (sliding) phase of the attack).
Other Attacks
This cipher allows many efficient
attacks. One can consider:
-
Differential attack.
-
Meet in the middle?
5. Grandpre
While many solved the second test
grandpre, nobody has solved the 1st test and the contest
problems. You are welcome to try, the stakes are increased.
Here is the program
for testing the grandpre matrix written by
Dani Halevi and Raanan Wald.
Last modified on 12.06.2000.