Tags: c, sockets, tcp, heartbleed-bug

Heartbleed bug: Why is it even possible to process the heartbeat request before the payload is delivered?


First, I am no C programmer and the OpenSSL codebase is huge, so forgive me for asking a question that I could probably answer myself if I had the time and skill to dig through the code.

TLS runs over TCP from what I can tell. TCP is stream-oriented, so the transport itself gives no indication of when a message has been fully delivered. You must know in advance how long the incoming message should be, or have a delimiter to scan for.

With that in mind, how is it possible for OpenSSL to process a heartbeat request before the full payload has been received?

If OpenSSL just starts processing the first chunk of data it reads from the TCP socket after the payload length is received, then OpenSSL would appear to be not just insecure, but broken under normal operation. Since the default maximum segment size of TCP is 536 bytes, any payload larger than that could span multiple TCP segments and therefore potentially span multiple socket reads.

So the question is: How/Why can OpenSSL start processing a message that is yet to be delivered?


Solution

  • This is the definition of a heartbeat packet.

    struct {
      HeartbeatMessageType type;
      uint16 payload_length;
      opaque payload[HeartbeatMessage.payload_length];
      opaque padding[padding_length];
    } HeartbeatMessage;
    

    Incorrect handling of the payload_length field is what caused the Heartbleed bug.

    However, this whole message is itself encapsulated within another record that has its own length field, looking roughly like this:

    struct {
      ContentType type;
      ProtocolVersion version;
      uint16 length;
      opaque fragment[TLSPlaintext.length];
    } TLSPlaintext;
    

    The HeartbeatMessage struct is placed inside the fragment field above.

    So one whole TLS "packet" (record) can be processed once the number of bytes given by the outer length field has arrived. The bug is in the inner Heartbeat message: OpenSSL failed to validate its payload_length against that outer length.

    Here's a screenshot of a packet capture. The outer length of 3 gives the size of the record, while the inner (bogus) payload length of 16384 is what triggers the exploit, because OpenSSL failed to validate it against the number of bytes actually received.

    [Image: Wireshark screenshot of the heartbeat packet]
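The missing validation can be sketched in a few lines of C. This is only an illustration, not OpenSSL's actual code: `heartbeat_payload_len` and the two constants are hypothetical names, and the 16-byte minimum padding is taken from RFC 6520. The idea is to compare the claimed payload_length with the number of bytes that really arrived in the record fragment before touching the payload.

```c
#include <stddef.h>
#include <stdint.h>

#define HB_HEADER_LEN  3   /* 1-byte type + 2-byte payload_length */
#define HB_MIN_PADDING 16  /* RFC 6520: padding is at least 16 bytes */

/*
 * Hypothetical helper (not part of OpenSSL's API).
 * fragment/fragment_len describe the bytes actually received in the
 * outer TLSPlaintext record. Returns 0 and stores the payload length
 * if the claimed length fits inside the fragment, -1 otherwise.
 */
int heartbeat_payload_len(const uint8_t *fragment, size_t fragment_len,
                          uint16_t *payload_len_out)
{
    /* Too short to hold even an empty heartbeat message. */
    if (fragment_len < HB_HEADER_LEN + HB_MIN_PADDING)
        return -1;

    /* payload_length is a big-endian uint16 after the type byte. */
    uint16_t claimed = (uint16_t)((fragment[1] << 8) | fragment[2]);

    /* The claimed payload plus header and padding must fit in what
     * was really received; otherwise the message is discarded. */
    if ((size_t)claimed > fragment_len - HB_HEADER_LEN - HB_MIN_PADDING)
        return -1;

    *payload_len_out = claimed;
    return 0;
}
```

For the record in the capture (outer length 3, inner payload_length 16384), this sketch returns -1 at the first check, so the payload would never be copied into a response.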

    Of course, similar care must be taken with the length field of the outer record: make sure you have actually received length bytes of data before you begin to process or parse the content of the record.

    Note also that there is no particular correlation between socket reads and TCP segments: one socket read can return the data of many segments, or just part of one. To the application, TCP is just a byte stream, so a single read might return only half of the length field of one TLSPlaintext record, or it might return several whole TLSPlaintext records.
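To make the framing point concrete, here is a minimal sketch of how a receiver can reassemble one record from a byte stream. It assumes a blocking POSIX file descriptor; `read_exact` and `read_tls_record` are hypothetical helpers written for this answer, not OpenSSL functions. The receiver first loops until the 5-byte record header is complete, then loops again until exactly `length` bytes of fragment have arrived, regardless of how many read() calls that takes.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define TLS_HEADER_LEN 5  /* type (1) + version (2) + length (2) */

/* Keep calling read() until exactly n bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int read_exact(int fd, uint8_t *buf, size_t n)
{
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);
        if (r > 0)
            got += (size_t)r;   /* partial read: keep going */
        else if (r == 0)
            return -1;          /* EOF before the full message */
        else if (errno != EINTR)
            return -1;          /* real error */
    }
    return 0;
}

/* Hypothetical helper: read one TLSPlaintext record into fragment
 * (capacity cap). Only returns 0 once the whole record is present. */
int read_tls_record(int fd, uint8_t *type, uint8_t *fragment,
                    size_t cap, size_t *fragment_len)
{
    uint8_t hdr[TLS_HEADER_LEN];

    if (read_exact(fd, hdr, sizeof hdr) != 0)
        return -1;

    /* The outer length is a big-endian uint16 at offset 3. */
    size_t len = ((size_t)hdr[3] << 8) | hdr[4];
    if (len > cap)
        return -1;              /* would overflow the caller's buffer */

    if (read_exact(fd, fragment, len) != 0)
        return -1;

    *type = hdr[0];
    *fragment_len = len;
    return 0;
}
```

Only after `read_tls_record` succeeds is it safe to parse the fragment, and any length field inside the fragment still has to be checked against `fragment_len`.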