DIT250-TDA352 Home assignment 1-Cryptography course Solved

35.00 $

Category:

Description

5/5 - (2 votes)

The assignment consists of two largely independent parts. In the first (and main) part you will study a well-known attack on an SSL channel and answer some questions. In the second part, you will encrypt your solution using gpg before submitting it.

Part A. An attack against SSL.

0. Introduction

In this part we will explore a relatively recent (2003) attack on widely used cryptographic software, discovered by Serge Vaudenay and co-workers at EPFL in Switzerland and reported in [1]. The attack is against an SSL/TLS channel; one example instantiation could be to find the password used by an email client to get email from an IMAP mail server. In reported experiments, the attack could recover the password of a user in less than an hour.

The attack is trivial to prevent by a simple change in the SSL implementation and the OpenSSL software implements this change from version 0.9.7a, February 2003 (see [2]). This fix was implemented before [1] was announced (thanks to communication between the discoverers and the OpenSSL developers), so the attack was never significant in practice, but it is anyhow interesting.

In this document we will describe the attack in some detail; at some points you will find questions that you are supposed to answer in your submission. The questions are generally simple when you have understood the explanations up to that point. Thus they serve mainly to check your understanding of what is presented.

Note added October 21, 2010: Variants of this attack has been subject to a lot of publicity in the last few months, under the catchy name of Padding Oracle Attacks. This is due to the discovery by Rizzo and Duong that many web frameworks (ASP.NET, JavaServer Faces, Ruby on Rails, OWASP ESAPI) are vulnerable to the attack. We describe this in more detail in the Appendix to this assignment. The appendix is independent of the assignment as such, but we hope that it provides an interesting case study.

1. Overview of SSL

SSL (Secure Socket Layer) is a security protocol that runs below application-layer protocols like HTTP or IMAP, but on top of transport-layer protocols such as TCP. It provides both confidentiality and data integrity, thus providing a secure channel to the communicating applications. An overview of how the protocol operates is given in the course textbook. For the purposes of this assignment it is only necessary to know that when a connection is established, secret keys are exchanged between the parties using public-key techniques. The parties also agree on which algorithms to use for secret-key encryption and Message Authentication Code (MAC) computations.

A Message Authentication Code (MAC) is a short piece of information used to authenticate a message; in other words, to confirm that the message came from the stated sender (its authenticity) and has not been changed in transit (its integrity). When the connection has been established, subsequent communication during the session is encrypted using the agreed methods and keys.

When a message MES is to be sent, first its MAC is computed, using the agreed method. Then the MAC is appended to MES and padding PAD added so that the full message MES || MAC || PAD makes up a whole number of blocks. The padded message is then encrypted using a block cipher in CBC mode. The IV for encryption is not sent as part of the ciphertext but is agreed on in other ways, not detailed here.

The protocol is typically used in a client/server setting. On the server side we will find e.g. a web server or a mail server. The attack will be performed against the server by an active adversary, who intercepts client messages, modifies them and sends the modified message to the server. It is thus a chosen ciphertext attack.

The server, on receipt of a ciphertext, performs the following steps:

  1. The message is decrypted, using the CBC decryption algorithm.
  2. The padding is checked to be correct and removed.
  3. The MAC is checked to be correct and removed.
  4. The remaining message MES is handed over to the application.

If either of the checks in steps 2 or 3 fails, an error message is sent (over the secure channel, i.e. MACed, PADded and encrypted) and the session is aborted. This behaviour is typical for cryptographic protocols: any failed check is an indication of an attack and the protocol should be immediately aborted.

2. Padding

In this assignment, for concreteness we make the assumption that the block size is 64 bits = 8 bytes (as is the case e.g. for 3DES).

We will need to be specific about how padding is done. Let the length in bytes of MES || MAC be n. The padding then consists of 8 − (n mod 8) bytes, each with the value 7−(n mod 8) (as an eight-bit integer). So if n is a multiple of 8, padding consists of eight bytes, each with value 7 (= 000001112).

We now introduce some notation: With capital letter variables (S,C,) we will always refer to blocks. We will need to describe blocks also as sequences of eight bytes, for which we will use the notation < b1,b2,b3,b4,b5,b6,b7,b8 >.

The last block of a full message, i.e. the one that contains the padding, we will call the pad block. The pad block thus has one of the following eight forms, where the ?:s represent the last bytes of the unpadded message:

<?,?,?,?,?,?,?,0 > <?,?,?,?,?,?,1,1 > <?,?,?,?,?,2,2,2 > <?,?,?,?,3,3,3,3 > <?,?,?,4,4,4,4,4 > <?,?,5,5,5,5,5,5 > <?,6,6,6,6,6,6,6 >

< 7,7,7,7,7,7,7,7 >

(Remark: SSL actually allows longer padding than necessary, in order to hide message length, but we will ignore that.)

We will later construct random blocks and need to have an idea about the probability that a random block is actually a valid pad block. So, let’s for a moment just consider randomly constructed blocks of 64 bits. Note that in the next three questions we look just at randomly constructed blocks and some probabilities when such blocks are chosen; you do not need to think about the wider context to answer these questions.

Question 1: What is the probability that such a random block (uniform distribution over all 64 bit blocks) is a pad block of type 0 (i.e., of the form in the first line above)?

Question 2: What is the probability that a random block is a valid pad block (of any of the above forms)?

Question 3: What is the conditional probability that a random block is of type 0, given that it is a valid pad block? To avoid unnecessary resubmissions, we give the answer: around 99.6 %. You must anyhow answer, showing how this is computed. Hint: check Bayes Theorem (Wikipedia)

We will later make use of these facts; when a random block is a pad block, we will assume that it is of type 0, since the probability for this is very high.

3. A side channel attack

In the attack to be described, the attacker will intercept a ciphertext in transit and modify it, adding some random data before forwarding it to the server.

When the modified message is received, it is processed as described above. It is almost certain that this will lead to abortion; even if the padding check succeeds, it is extremely unlikely that the MAC check will succeed. The attacker will not be able to comprehend the error message, since it is encrypted. Instead the attack is based on measuring the time elapsed before the session aborts. Checking the padding is a quick process, while recomputing a MAC for a lengthy message takes noticeably longer time. The attacker thus uses a side channel; he cannot break encryption but gets useful information by other means. Other forms of side channels that have been exploited in attacks are e.g. variations in power consumption for small devices.

We can now see the simple fix that prevents the attack: just perform the MAC check even if the padding check fails! A better solution, which requires a change to the protocol, would be to apply the MAC after padding and encryption. Then all the attacker’s modified messages will have an invalid MAC and decryption will never be done and thus padding never checked.

4. The IMAP protocol

As mentioned above, one special case of the attack could be to find the password of a user, checking mail from an email server running IMAP (Internet Mail Access Protocol). We do not need to discuss the IMAP protocol in detail. It is enough to know that a common set-up is that the client that wants to check mail opens an SSL channel to the server. After channel establishment at the SSL level, the first application-level command from the client is

XXXX LOGIN “username” “password

where XXXX is a message sequence number and username and password are replaced by actual data. Of course, the message is MAC’ed, padded and encrypted as above by the SSL channel.

The important properties here are

  • A session starting with this command is initiated several times per hour by the client, sometimes as frequent as once per minute. In some cases multiple mailboxes in the client lead to independent logins, resulting in dozens or even hundreds of sessions per hour.
  • This login message is the same every time for a given user except for the sequence number. In particular, the password is at a fixed position in the message.

The attack proceeds by passive listening to the channel establishment of each session followed by active interception of the login message, which is modified before it is forwarded to the server. This leads to session abortion, but also to more timing information for the adversary. After many sessions (in the order of 1000), the password will be recovered, as detailed in the next section.

For the duration of the attack, the mail client will try to get mail many times, each time with successful channel establishment followed by immediate abortion after the login message. It is interesting that some common mail clients accept this without alerting the user. Of course, if the user is present at her machine, she might notice that no new mails arrive, so an attack undisturbed overnight could be more convenient.

5. An attack against CBC mode

Finally, we are ready to describe the heart of the attack, which is really an attack against block cipher encryption in CBC mode. We consider the following scenario:

  • A sender repeatedly sends messages that contain the same, fixed plaintext block P of critical importance at the same, fixed position in every message. Other parts of the messages may differ.
  • Messages are MAC-ed, padded and encrypted using a block cipher with block size 64 bits in CBC mode. Different messages may be encrypted and MAC’ed with different keys (in the mail example, each session has different keys).
  • The adversary can intercept and modify messages.
  • The adversary can distinguish whether messages he sends have invalid padding or invalid MAC, for example by measuring time differences in server response. The probability of both valid padding and valid MAC is negligible, so that case is ignored.

Most of the discussion so far has tried to convey to the reader that such a scenario is quite realistic. We just note that in the mail example, the password may cross block boundaries, so that two blocks will be of interest, but we ignore that here.

For simplicity, let us assume that the critical block is the third block in the message, so that a cleartext message has the form ??P?, where the question-marks stand for uninteresting blocks. The corresponding ciphertext has the form ?C0C?, where again ? blocks are uninteresting. Both C0 and C are important, however, since CBC encryption means that C = E(P ⊕ C0), where E is the block cipher encryption function.

In order to gain information about P, the adversary constructs blocks of the form Ri =< r1,r2,r3,r4,r5,r6,r7,i >, where all the rk are random bytes and i is systematically changed by the adversary. The messages sent to the server are (Ri ⊕ C0)||C.

The server first decrypts the message; the last block (the pad block) will then be decrypted to be D(C) ⊕ (Ri ⊕ C0), which turns out to be P ⊕ Ri.

Question 4: Show the calculations that verify this fact.

Now, the server checks the padding. If padding is correct, we assume that the block is a pad block of type 0, i.e. P ⊕ Ri has last byte 0.

Question 5: What is then the last byte of P?

If padding is not correct, the adversary was unlucky and will have to wait for the next message, where he will try another value of i. After at most 255 messages, he will have determined the last byte of P.

Question 6: The message sent is only two blocks long, so MAC computation does not take noticeable time. In the real attack the message is made considerably longer in a way that does not affect the above reasoning, but will slow down the MAC computation. Suggest (with motivation) a way to do this. There is considerable freedom here, so nothing clever is needed, but you must make a concrete suggestion.

After the last byte of P has been decided to be, say, b8, the attack proceeds to find the second to last byte. To this end, the adversary constructs blocks of the form Si =< r1,r2,r3,r4,r5,r6,i,b8 ⊕ 1 >, where again the ri are random. The message sent is (Si ⊕

C0)||C.

Question 7: If padding is correct, which is byte b7 of P?

In this way, after at most a further 255 messages, b7 will be determined and the attack proceeds to find b6.

Question 8: Describe the form of messages used to discover b6.

Proceeding similarly, the whole block P can be found in at most 8 · 255 = 2040 messages, or in ca 1000 messages on average.

Using dictionary lookup techniques, trying values of i in order of frequency, this bound on the number of messages can be considerably tightened. (Just by restricting to printable ASCII characters, disregarding frequencies among these, the average number of messages will be reduced to around 400.)

We end with a voluntary question, which you do not need to answer to pass the assignment.

Question 9: We assumed above, when trying to find the last byte, that a correct pad block had type 0. This is wrong in 0.4 % of the cases. Suggest a way to improve the technique and actually find the type of pad block, at the expense of further messages (but also finding more bytes of P).

Does this refinement apply also to finding the other bytes of P?

  • Assignment-1-5gsbba.zip