Rick van Rein

Tue 15 August 2017


Dissecting TLS for Operational Flexibility

The TLS protocol is usually treated as a black box that somehow bestows security. But like any other protocol, it is a sequence of bits and bytes. This article explains how a slightly deeper look at the protocol shows that it can be split into two dramatically different components, and how useful that split can be from an operational perspective.

This article is part of a series of articles about TLS.

TLS is the protocol that wraps around an insecure protocol and, through the "magic" of cryptography, adds authentication and application data privacy. It is possible to understand how TLS can be dissected without knowing anything about cryptography.

Handshake and Bulk Traffic

There are two operational modes of concern in TLS, namely the handshake and bulk traffic handling. The handshake initiates a TLS connection, establishes identities insofar as desired, and derives keys for use in the second phase. In the second phase, application data such as web traffic or mail exchange is carried under a cloak of encryption and origin protection.

The first phase is complex, and has many variants. Handshakes are what typically drive application developers mad, in terms of the complexity they add to an application. This is the core reason why we designed the TLS Pool to separate application logic from security logic.

The second phase is pretty mundane, and has a limited number of variants, which are usually covered by straightforward libraries that map plaintext blocks to secure blocks, or vice versa. Although we initially made this part of the TLS Pool, it would be inefficient to pass this (potentially bulky) traffic through the TLS Pool, at least when the TLS Pool is reached over a remote connection.

The handshake phase derives so-called "session key" material, which is the input to the second phase. A session key is just a small series of bits that serves as parameters to the bulk security mechanisms.
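To make "a small series of bits" concrete, here is a minimal sketch of how session key material could be cut into the parameters used by the bulk phase. It assumes the TLS 1.2 key block layout from RFC 5246 (section 6.3), which the article itself does not spell out; the names `SessionKeys` and `split_key_block` are made up for illustration.

```python
from typing import NamedTuple


class SessionKeys(NamedTuple):
    """Parameters for the bulk phase, in TLS 1.2 key block order (assumed)."""
    client_mac: bytes
    server_mac: bytes
    client_key: bytes
    server_key: bytes
    client_iv: bytes
    server_iv: bytes


def split_key_block(key_block: bytes, mac_len: int, key_len: int,
                    iv_len: int) -> SessionKeys:
    """Slice a key block into six fields; the sizes depend on the cipher suite."""
    sizes = [mac_len, mac_len, key_len, key_len, iv_len, iv_len]
    if len(key_block) < sum(sizes):
        raise ValueError("key block too short for the requested sizes")
    parts = []
    offset = 0
    for size in sizes:
        parts.append(key_block[offset:offset + size])
        offset += size
    return SessionKeys(*parts)


# Example sizes for an AES-128-GCM suite in TLS 1.2:
# no separate MAC key, 16-byte keys, 4-byte implicit IVs (40 bytes in total).
keys = split_key_block(bytes(range(40)), mac_len=0, key_len=16, iv_len=4)
```

Together with the name of the negotiated cipher suite, a structure of this kind is all that the bulk phase needs from the handshake.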

Splitting up TLS

The question is now, can we split TLS into these two independent kinds of activity? The answer is yes, as far as we can see now. And it brings great operational convenience to do so.

Splitting off TLS Handshakes

TLS happens to consist of four different flows:

  • Handshake, the cryptographic and complex derivation of a session key and cipher suite
  • Change Cipherspec, to indicate that a handshake outcome moves over to the application data level
  • Application data, the payload of the wrapped connection, protected with a session key for a certain cipher suite
  • Alerts, to indicate when things go wrong

Each flow is marked with a content type in the header of every TLS record, and this can be used to direct one type of traffic one way, and another type another way. So indeed, it is possible to keep bulk handling local while passing the complex handshake to a backend module. As for the other two flows in the TLS protocol, we need to be a bit clever about those.
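The routing idea can be sketched in a few lines. The example below assumes TLS 1.2 or earlier, where every record starts with a 5-byte header (content type, two version bytes, 16-bit length) whose type byte is visible; TLS 1.3 disguises encrypted records as application data, so it would need a different approach. The function names and the routing labels "backend", "local" and "special" are illustrative assumptions, not part of the TLS Pool.

```python
import struct

# TLS record content types (RFC 5246, section 6.2.1).
CHANGE_CIPHER_SPEC = 20
ALERT = 21
HANDSHAKE = 22
APPLICATION_DATA = 23


def split_records(stream: bytes):
    """Yield (content_type, payload) for each complete TLS record in stream."""
    offset = 0
    while offset + 5 <= len(stream):
        ctype, _major, _minor, length = struct.unpack_from("!BBBH", stream, offset)
        end = offset + 5 + length
        if end > len(stream):
            break  # incomplete record; a real reader would wait for more bytes
        yield ctype, stream[offset + 5:end]
        offset = end


def route(ctype: int) -> str:
    """Hypothetical policy: handshake goes to the backend, bulk stays local."""
    if ctype == HANDSHAKE:
        return "backend"
    if ctype == APPLICATION_DATA:
        return "local"
    return "special"  # change cipher spec and alerts need extra care


def make_record(ctype: int, payload: bytes) -> bytes:
    """Helper for the example: wrap a payload in a TLS 1.2 record header."""
    return struct.pack("!BBBH", ctype, 3, 3, len(payload)) + payload


stream = make_record(HANDSHAKE, b"hello") + make_record(APPLICATION_DATA, b"data")
for ctype, payload in split_records(stream):
    print(route(ctype), len(payload))
```

The demultiplexer never looks inside a payload, which is why no cryptographic knowledge is needed at this point.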

Operationally, this is very attractive:

  • The handshake is relatively slow and rare; passing it to a backend is not as troublesome as doing it for the bulk application data
  • When the handshake is passed into the backend, the ability to protect the long-term credentials involved improves
  • Backends for handshakes can be centrally configured and controlled, simplifying operations and maintenance
  • Bulk application data handling near the application makes sense from an efficiency perspective
  • It is easy to have many nodes doing their own bulk TLS handling; only the handshake is easier to do centrally

So, what remains is the question of which carrier protocol to use for sending TLS handshakes to the backend. And this is the cause of a second surprise, because this protocol already exists, more or less!

The EAP-TLS and EAP-TTLS protocols are designed to carry TLS handshake packets over an EAP connection, which is customarily passed on to a backend over Diameter or RADIUS. Diameter in particular, when run over SCTP, is quite capable of carrying the semi-long fragments of TLS without adding round trips. When large TLS fragments need to be sent over EAP with a smaller MTU, they must be split up and extra round trips are needed, which is a disadvantage to efficiency. Both the Diameter and RADIUS protocols are capable of multiplexing a large number of ongoing connections.
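The cost of a small MTU can be illustrated with a sketch of EAP-TLS fragmentation as described in RFC 5216: every fragment except the last carries the "more fragments" flag and must be acknowledged by the other side before the next one is sent, so each extra fragment costs one round trip. The function names and the `max_data` parameter are assumptions made for this example, and it builds EAP request packets only.

```python
import struct

EAP_REQUEST = 1
EAP_TYPE_TLS = 13    # EAP method type for EAP-TLS (RFC 5216)
FLAG_LENGTH = 0x80   # L: a 4-byte TLS Message Length field is included
FLAG_MORE = 0x40     # M: more fragments follow


def fragment_eap_tls(tls_message: bytes, max_data: int, first_id: int = 1):
    """Split one TLS message into EAP-TLS request packets.

    max_data is a hypothetical cap on the TLS bytes carried per packet.
    """
    chunks = [tls_message[i:i + max_data]
              for i in range(0, len(tls_message), max_data)] or [b""]
    packets = []
    for n, chunk in enumerate(chunks):
        flags = 0
        body = b""
        if n == 0 and len(chunks) > 1:
            # The first fragment announces the total TLS message length.
            flags |= FLAG_LENGTH
            body += struct.pack("!I", len(tls_message))
        if n < len(chunks) - 1:
            flags |= FLAG_MORE
        body += chunk
        length = 4 + 1 + 1 + len(body)  # EAP header + type + flags + body
        header = struct.pack("!BBHBB", EAP_REQUEST, (first_id + n) % 256,
                             length, EAP_TYPE_TLS, flags)
        packets.append(header + body)
    return packets


def extra_round_trips(packets) -> int:
    """Each fragment before the last waits for an acknowledgement."""
    return max(0, len(packets) - 1)


packets = fragment_eap_tls(bytes(2500), max_data=1000)
print(len(packets), extra_round_trips(packets))
```

With a carrier that accepts the whole message in one packet, the same call yields a single packet and no extra round trips.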

One concern that is not covered by this setup is EAP-TLS or EAP-TTLS for clients. This is a reasonably straightforward extension however. It can initially be implemented in software, and later specified in a simple document describing best current practices. When an end point can play both the client and server roles, it also ought to be able to handle Symmetric TLS which can be useful in peer-to-peer networks.

Redesigning the TLS Pool

This realisation will reflect on our design of the TLS Pool. We intend to make it remotely accessible over Diameter/EAP, and specifically for the handshake flow. The result of this negotiation will then be passed back to a calling application, which will continue with the negotiated cipher suite and session key, and take care of the modest chores of adding and removing the secure wrappers protecting the application data.

This is a redesign of the TLS Pool idea in the sense that the responsibility for application data now returns to the application. This is possible because all the complexity of dealing with TLS is in the handshake. And it makes sense because the application now does not need to make as many context switches.

What led us to this innovation was precisely this wish: to be able to keep bulk session crypto in the application, while handing off the handshake to a TLS Pool component that may or may not run on the same host.

In many operational contexts, a front-end application and the TLS Pool in its backend will be connected by a network that is considered safe, thus requiring no further encryption and/or authentication. But even where such protection is considered useful, it is possible to encrypt the entire Diameter/EAP backend connection; this is done once for all the upcoming handshakes and so adds no real delay. As for authentication, Diameter is quite capable of interrupting its flow and asking for a credential.
