|
| 1 | +Chapter 6. Transport Layer Security (TLS, SSL, HTTPS) |
| 2 | +======================================================= |
| 3 | + |
| 4 | +To understand the design goals and requirements for the Transport Layer |
| 5 | +Security (TLS) standard and the Secure Socket Layer (SSL) on which TLS |
| 6 | +is based, it is helpful to consider one of the main problems that they |
| 7 | +are intended to solve. As the World Wide Web became popular and |
| 8 | +commercial enterprises began to take an interest in it, it became clear |
| 9 | +that some level of security would be necessary for transactions on the |
| 10 | +Web. The canonical example of this is making purchases by credit card. |
| 11 | +There are several issues of concern when sending your credit card |
| 12 | +information to a computer on the Web. First, you might worry that the |
| 13 | +information would be intercepted in transit and subsequently used to |
| 14 | +make unauthorized purchases. You might also worry about the details of a |
| 15 | +transaction being modified, such as changing the purchase amount. And |
| 16 | +you would certainly like to know that the computer to which you are |
| 17 | +sending your credit card information is in fact one belonging to the |
| 18 | +vendor in question and not some other party. Thus, we immediately see a |
| 19 | +need for confidentiality, integrity, and authentication in Web |
| 20 | +transactions. The first widely used solution to this problem was SSL, |
| 21 | +originally developed by Netscape and subsequently the basis for the |
| 22 | +IETF’s TLS standard. |
| 23 | + |
| 24 | +The designers of SSL and TLS recognized that these problems were not |
| 25 | +specific to Web transactions (i.e., those using HTTP) and instead built |
| 26 | +a general-purpose protocol that sits between an application protocol |
| 27 | +such as HTTP and a transport protocol such as TCP. The reason for |
| 28 | +calling this “transport layer security” is that, from the application’s |
| 29 | +perspective, this protocol layer looks just like a normal transport |
| 30 | +protocol except for the fact that it is secure. That is, the sender can |
| 31 | +open connections and deliver bytes for transmission, and the secure |
| 32 | +transport layer will get them to the receiver with the necessary |
| 33 | +confidentiality, integrity, and authentication. By running the secure |
| 34 | +transport layer on top of TCP, all of the normal features of TCP |
| 35 | +(reliability, flow control, congestion control, etc.) are also provided |
| 36 | +to the application. This arrangement of protocol layers is depicted in |
| 37 | +:numref:`Figure %s <fig-tls-stack>`. |
| 38 | + |
| 39 | +.. _fig-tls-stack: |
| 40 | +.. figure:: figures/f08-15-9780123850591.png |
| 41 | + :width: 300px |
| 42 | + :align: center |
| 43 | + |
| 44 | + Secure transport layer inserted between application and TCP layers. |
| 45 | + |
| 46 | +When HTTP is used in this way, it is known as HTTPS (Secure HTTP). In |
| 47 | +fact, HTTP itself is unchanged. It simply delivers data to and accepts |
| 48 | +data from the SSL/TLS layer rather than TCP. For convenience, a default |
| 49 | +TCP port has been assigned to HTTPS (443). That is, if you try to |
| 50 | +connect to a server on TCP port 443, you will likely find yourself |
| 51 | +talking to the SSL/TLS protocol, which will pass your data through to |
| 52 | +HTTP provided all goes well with authentication and decryption. Although |
| 53 | +standalone implementations of SSL/TLS are available, it is more common |
| 54 | +for an implementation to be bundled with applications that need it, |
| 55 | +primarily web browsers. |
| 56 | + |
| 57 | +In the remainder of our discussion of transport layer security, we focus |
| 58 | +on TLS. Although SSL and TLS are unfortunately not interoperable, they |
| 59 | +differ in only minor ways, so nearly all of this description of TLS |
| 60 | +applies to SSL. |
| 61 | + |
| 62 | + |
| 63 | +6.1 Handshake Protocol |
| 64 | +----------------------- |
| 65 | + |
| 66 | +A pair of TLS participants negotiate at runtime which cryptography to |
| 67 | +use. The participants negotiate a choice of: |
| 68 | + |
| 69 | +- Data integrity hash (MD5, SHA-1, etc.), used to implement HMACs |
| 70 | + |
| 71 | +- Secret-key cipher for confidentiality (among the possibilities are |
| 72 | + DES, 3DES, and AES) |
| 73 | + |
| 74 | +- Session key establishment approach (among the possibilities are |
| 75 | + Diffie-Hellman and public-key authentication protocols using DSS) |
| 76 | + |
| 77 | +Interestingly, the participants may also negotiate the use of a |
| 78 | +compression algorithm, not because this offers any security benefits, |
| 79 | +but because it’s easy to do when you’re negotiating all this other stuff |
| 80 | +and you’ve already decided to do some expensive per-byte operations on |
| 81 | +the data. |
| 82 | + |
| 83 | +In TLS, the confidentiality cipher uses two keys, one for each |
| 84 | +direction, and similarly two initialization vectors. The HMACs are |
| 85 | +likewise keyed with different keys for the two participants. Thus, |
| 86 | +regardless of the choice of cipher and hash, a TLS session requires |
| 87 | +effectively six keys. TLS derives all of them from a single shared |
| 88 | +*master secret*. The master secret is a 384-bit (48-byte) value that in |
| 89 | +turn is derived in part from the “session key” that results from TLS’s |
| 90 | +session key establishment protocol. |
| 91 | + |
| 92 | +The part of TLS that negotiates the choices and establishes the shared |
| 93 | +master secret is called the *handshake protocol*. (Actual data transfer |
| 94 | +is performed by TLS’s *record protocol*.) The handshake protocol is at |
| 95 | +heart a session key establishment protocol, with a master secret instead |
| 96 | +of a session key. Since TLS supports a choice of approaches to session |
| 97 | +key establishment, these call for correspondingly different protocol |
| 98 | +variants. Furthermore, the handshake protocol supports a choice between |
| 99 | +mutual authentication of both participants, authentication of just one |
| 100 | +participant (this is the most common case, such as authenticating a |
| 101 | +website but not a user), or no authentication at all (anonymous |
| 102 | +Diffie-Hellman). Thus, the handshake protocol knits together several |
| 103 | +session key establishment protocols into a single protocol. |
| 104 | + |
| 105 | +:numref:`Figure %s <fig-tls-hand>` shows the handshake protocol at a |
| 106 | +high level. The client initially sends a list of the combinations of |
| 107 | +cryptographic algorithms that it supports, in decreasing order of |
| 108 | +preference. The server responds, giving the single combination of |
| 109 | +cryptographic algorithms it selected from those listed by the |
| 110 | +client. These messages also contain a *client nonce* and a *server |
| 111 | +nonce*, respectively, that will be incorporated in generating the |
| 112 | +master secret later. |
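The negotiation step described above can be sketched as follows. This is a minimal illustration, not TLS's wire format: the dictionary-based messages and the function names are hypothetical, and the server policy shown (take the client's most-preferred combination that the server also supports) is only one possible choice, since the text says only that the server selects a single combination from the client's list.

```python
import os
from typing import Optional, Sequence


def select_suite(client_prefs: Sequence[str],
                 server_supported: Sequence[str]) -> Optional[str]:
    """Return the first client-preferred suite the server also supports."""
    supported = set(server_supported)
    for suite in client_prefs:  # client list is in decreasing preference
        if suite in supported:
            return suite
    return None


def client_hello(prefs: Sequence[str]) -> dict:
    # Hypothetical message layout: a fresh client nonce plus the suite list.
    return {"nonce": os.urandom(32), "suites": list(prefs)}


def server_hello(hello: dict, server_supported: Sequence[str]) -> dict:
    suite = select_suite(hello["suites"], server_supported)
    if suite is None:
        raise ValueError("no cryptographic combination in common")
    # The reply carries the server nonce and the single selected suite.
    return {"nonce": os.urandom(32), "suite": suite}


if __name__ == "__main__":
    hello = client_hello([
        "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
        "TLS_RSA_WITH_AES_128_CBC_SHA",
        "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
    ])
    reply = server_hello(hello, ["TLS_RSA_WITH_3DES_EDE_CBC_SHA",
                                 "TLS_RSA_WITH_AES_128_CBC_SHA"])
    print(reply["suite"])
```

Both nonces are kept by each side because they are mixed into the master secret later in the handshake.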
| 113 | + |
| 114 | +.. _fig-tls-hand: |
| 115 | +.. figure:: figures/f08-16-9780123850591.png |
| 116 | + :width: 300px |
| 117 | + :align: center |
| 118 | + |
| 119 | + Handshake protocol to establish TLS session. |
| 120 | + |
| 121 | +At this point, the negotiation phase is complete. The server now sends |
| 122 | +additional messages based on the negotiated session key establishment |
| 123 | +protocol. That could involve sending a public-key certificate or a set |
| 124 | +of Diffie-Hellman parameters. If the server requires authentication of |
| 125 | +the client, it sends a separate message indicating that. The client then |
| 126 | +responds with its part of the negotiated key exchange protocol. |
| 127 | + |
| 128 | +Now the client and server each have the information necessary to |
| 129 | +generate the master secret. The “session key” that they exchanged is not |
| 130 | +in fact a key, but instead what TLS calls a *pre-master secret*. The |
| 131 | +master secret is computed (using a published algorithm) from this |
| 132 | +pre-master secret, the client nonce, and the server nonce. Using the |
| 133 | +keys derived from the master secret, the client then sends a message |
| 134 | +that includes a hash of all the preceding handshake messages, to which |
| 135 | +the server responds with a similar message. This enables them to detect |
| 136 | +any discrepancies between the handshake messages they sent and received, |
| 137 | +such as would result, for example, if a man in the middle modified the |
| 138 | +initial unencrypted client message to weaken its choices of |
| 139 | +cryptographic algorithms. |
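As a rough sketch of the derivation just described, the code below follows the pseudorandom function that TLS 1.2 defines (an HMAC-SHA-256 expansion); earlier versions combine MD5 and SHA-1, and TLS 1.3 uses a different construction, so treat this as one instance rather than the only algorithm. The function names, the key sizes, and the dictionary layout of the key block are assumptions made for illustration.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand (secret, seed) into `length` bytes by iterating HMAC-SHA-256."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_sha256(secret, label + seed, length)


def master_secret(pre_master: bytes, client_nonce: bytes,
                  server_nonce: bytes) -> bytes:
    # 48 bytes (384 bits), derived from the pre-master secret and both nonces.
    return prf(pre_master, b"master secret", client_nonce + server_nonce, 48)


def key_block(master: bytes, client_nonce: bytes, server_nonce: bytes,
              mac_len: int = 20, key_len: int = 16, iv_len: int = 16) -> dict:
    """Split one PRF output into the six per-direction values (sizes assumed)."""
    sizes = [("client_mac", mac_len), ("server_mac", mac_len),
             ("client_key", key_len), ("server_key", key_len),
             ("client_iv", iv_len), ("server_iv", iv_len)]
    total = sum(n for _, n in sizes)
    block = prf(master, b"key expansion", server_nonce + client_nonce, total)
    keys, offset = {}, 0
    for name, n in sizes:
        keys[name] = block[offset:offset + n]
        offset += n
    return keys


def finished_verify_data(master: bytes, label: bytes,
                         handshake_messages: bytes) -> bytes:
    # Hash of all preceding handshake messages, bound to the master secret.
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    return prf(master, label, transcript_hash, 12)
```

If an attacker alters any earlier handshake message, the two sides hash different transcripts, so the value each computes with `finished_verify_data` (using a label such as `b"client finished"`) no longer matches what the peer sent.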
| 140 | + |
| 141 | +6.2 Record Protocol |
| 142 | +------------------- |
| 143 | + |
| 144 | +Within a session established by the handshake protocol, TLS’s record |
| 145 | +protocol adds confidentiality and integrity to the underlying transport |
| 146 | +service. Messages handed down from the application layer are: |
| 147 | + |
| 148 | +1. Fragmented or coalesced into blocks of a convenient size for the |
| 149 | + following steps |
| 150 | + |
| 151 | +2. Optionally compressed |
| 152 | + |
| 153 | +3. Integrity-protected using an HMAC |
| 154 | + |
| 155 | +4. Encrypted using a secret-key cipher |
| 156 | + |
| 157 | +5. Passed to the transport layer (normally TCP) for transmission |
| 158 | + |
| 159 | +The record protocol uses an HMAC as an authenticator. The HMAC uses |
| 160 | +whichever hash algorithm (MD5, SHA-1, etc.) was negotiated by the |
| 161 | +participants. The client and server use different keys when computing |
| 162 | +HMACs, so a record cannot be reflected back to its sender. Furthermore, each |
| 163 | +record protocol message is assigned a sequence number, which is included |
| 164 | +when the HMAC is computed—even though the sequence number is never |
| 165 | +explicit in the message. This implicit sequence number prevents replays |
| 166 | +or reorderings of messages. This is needed because, although TCP can |
| 167 | +deliver sequential, unduplicated messages to the layer above it under |
| 168 | +normal assumptions, those assumptions do not include an adversary that |
| 169 | +can intercept TCP messages, modify messages, or send bogus ones. On the |
| 170 | +other hand, it is TCP’s delivery guarantees that make it possible for |
| 171 | +TLS to rely on a legitimate TLS message having the next implicit |
| 172 | +sequence number in order. |
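The role of the implicit sequence number can be illustrated with a small sketch. It assumes HMAC-SHA-256 and a TLS 1.2 style MAC input (sequence number, content type, version, length, then the fragment); the class name `MacState` is hypothetical, and compression and encryption are omitted so that only the integrity step is shown.

```python
import hashlib
import hmac
import struct

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes
APPLICATION_DATA = 23                   # record content type
TLS_1_2 = 0x0303                        # protocol version field


class MacState:
    """One direction of a connection: a MAC key plus an implicit counter."""

    def __init__(self, mac_key: bytes) -> None:
        self.mac_key = mac_key
        self.seq = 0  # never transmitted; both ends count records themselves

    def _mac(self, seq: int, fragment: bytes) -> bytes:
        header = struct.pack("!QBHH", seq, APPLICATION_DATA, TLS_1_2,
                             len(fragment))
        return hmac.new(self.mac_key, header + fragment,
                        hashlib.sha256).digest()

    def protect(self, fragment: bytes) -> bytes:
        tag = self._mac(self.seq, fragment)
        self.seq += 1
        return fragment + tag

    def verify(self, record: bytes) -> bytes:
        fragment, tag = record[:-TAG_LEN], record[-TAG_LEN:]
        expected = self._mac(self.seq, fragment)
        if not hmac.compare_digest(tag, expected):
            raise ValueError("bad record MAC")
        self.seq += 1  # advance only after a record is accepted
        return fragment
```

A replayed or reordered record is checked against the receiver's current counter rather than the one it was sent with, so its HMAC fails even though the bytes are otherwise unmodified.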
| 173 | + |
| 174 | +Another interesting feature of the TLS protocol is the ability to resume |
| 175 | +a session. To understand the original motivation for this, it is helpful |
| 176 | +to understand how HTTP originally made use of TCP connections. (The |
| 177 | +details of HTTP are presented in the next chapter.) Each HTTP operation, |
| 178 | +such as getting a page from a server, required a new TCP connection to |
| 179 | +be opened. Retrieving a single page with a number of embedded graphical |
| 180 | +objects might take many TCP connections. Opening a TCP connection |
| 181 | +requires a three-way handshake before data transmission can start. Once |
| 182 | +the TCP connection is ready to accept data, the client would then need |
| 183 | +to start the TLS handshake protocol, taking at least another two |
| 184 | +round-trip times (and consuming some amount of processing resources and |
| 185 | +network bandwidth) before actual application data could be sent. The |
| 186 | +resumption capability of TLS was designed to alleviate this problem. |
| 187 | + |
| 188 | +The idea of session resumption is to optimize away the handshake in |
| 189 | +those cases where the client and the server have already established |
| 190 | +some shared state in the past. The client simply includes the session ID |
| 191 | +from a previously established session in its initial handshake message. |
| 192 | +If the server finds that it still has state for that session, and the |
| 193 | +resumption option was negotiated when that session was originally |
| 194 | +created, then the server can reply to the client with an indication of |
| 195 | +success, and data transmission can begin using the algorithms and |
| 196 | +parameters previously negotiated. If the session ID does not match any |
| 197 | +session state cached at the server, or if resumption was not allowed for |
| 198 | +the session, then the server will fall back to the normal handshake |
| 199 | +process. |
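The server-side decision described above might look like the following sketch. The `SessionCache` class, the `SessionState` fields, and the string return values are hypothetical names chosen for illustration; a real server would also expire cached entries.

```python
import os
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SessionState:
    suite: str            # algorithms negotiated in the original handshake
    master_secret: bytes  # shared secret from the original handshake
    resumable: bool       # whether resumption was negotiated at creation


class SessionCache:
    def __init__(self) -> None:
        self._sessions: Dict[bytes, SessionState] = {}

    def store(self, state: SessionState) -> bytes:
        session_id = os.urandom(32)
        self._sessions[session_id] = state
        return session_id

    def lookup(self, session_id: Optional[bytes]) -> Optional[SessionState]:
        if not session_id:
            return None
        state = self._sessions.get(session_id)
        if state is not None and state.resumable:
            return state
        return None


def handle_client_hello(cache: SessionCache,
                        session_id: Optional[bytes]) -> str:
    """Resume if cached state allows it; otherwise do the normal handshake."""
    if cache.lookup(session_id) is not None:
        return "resume"
    return "full_handshake"
```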
| 200 | + |
| 201 | +The reason the preceding discussion emphasized the *original* |
| 202 | +motivation is that having to do a TCP handshake for every embedded |
| 203 | +object in a web page led to so much overhead, independent of TLS, that |
| 204 | +HTTP was eventually optimized to support *persistent connections* (also |
| 205 | +discussed in the next chapter). Because optimizing HTTP reduced the |
| 206 | +value of session resumption in TLS (and because reusing the same |
| 207 | +session IDs and master secret across a series of resumed sessions was |
| 208 | +recognized as a security risk), TLS changed its approach to resumption in |
| 209 | +the latest version (1.3). |
| 210 | + |
| 211 | +In TLS 1.3, the client sends an opaque, server-encrypted *session |
| 212 | +ticket* to the server upon resumption. This ticket contains all the |
| 213 | +information required to resume the session. The same master secret is |
| 214 | +used across handshakes, but the default behavior is to perform a session |
| 215 | +key exchange upon resumption. |
| 216 | + |
| 217 | +.. _key-layering: |
| 218 | +.. admonition:: Key Takeaway |
| 219 | + |
| 220 | + We call attention to this change in TLS because it illustrates the |
| 221 | + challenge of knowing which layer should solve a given problem. In |
| 222 | + isolation, session resumption as implemented in the earlier version |
| 223 | + of TLS seems like a good idea, but it needs to be considered in the |
| 224 | + context of the dominant use case, which is HTTP. Once the overhead of |
| 225 | + doing multiple TCP connections was addressed by HTTP, the equation |
| 226 | + for how resumption should be implemented by TLS changed. The bigger |
| 227 | + lesson is that we need to avoid rigid thinking about the right |
| 228 | + layer to implement a given function. The answer changes over time |
| 229 | + as the network evolves, so a holistic, cross-layer analysis is |
| 230 | + required to get the design right. |
| 231 | + |