Tue, May 12, 2015

Reading fixed length TCP packets in Go with chunkedreader


The library: chunkedreader


TCP guarantees packet delivery (by way of retransmissions), unlike UDP. It is important to note that "packet" here is used loosely, because TCP has no concept of framing a message or packing it into a unit. A message of 10 bytes can be received as one whole chunk, ten chunks of 1 byte each, or any combination in between. In reality, TCP streams data: one send may trigger many receives. When testing transmissions on localhost this is rarely apparent, and hence, it's a common misconception that TCP delivers complete, framed packets. (UDP, on the other hand, while unreliable, delivers each datagram either whole or not at all. There is never a partial delivery.)

Since TCP streams data, one almost always has to come up with some sort of a pseudo-protocol over TCP to transmit and receive meaningful data structures. If all the possible characters that would ever be sent in a message are known, then a character that doesn't belong to that set can be used as a delimiter to segregate messages. On the other hand, if a message can contain arbitrary characters, it becomes a pain to escape the delimiter.

The most common approach is probably a length-prefixed message format. The first few bytes (the header) of a message indicate the total length of the message along with other details. The receiver then buffers all incoming bytes until it meets the length criteria, at which point, it’s received one full packet, and this process continues. While this is robust, buffering partial messages until the length criteria is met can be tricky.

Fixed length messages

There are also cases when the length of all incoming messages is fixed. For instance, a server that receives commands that are always 12 bytes long, padded with empty bytes. Things become much easier when the length is constant. The reader can have a buffer where incoming bytes are appended, and when it reaches length n, the chunk can be moved elsewhere and the process continued.

Go’s bufio package provides a really nice buffering abstraction over any io.Reader and io.Writer. Since network connections implement these interfaces too, it works with the net package just as well as with the filesystem.

The icing on the cake, however, is bufio’s SplitFunc, which lets you specify a custom function for tokenizing (chunking) input, while not having to worry about buffering bytes manually or EOF encounters. Passed to Scanner.Split(), it’s generally used for spotting delimiters such as \n in a stream and calling that a chunk, but it is ideal for chunking streams based on lengths as well.

Assuming conn is an initialised and connected net.Conn instance, receiving fixed length messages using chunkedreader is as easy as this:

// Fixed length of messages.
length := 12

// Initialize the reader to read chunks of 12 bytes.
ch := chunkedreader.New(conn, length)

for ch.Read() {
    // msg is one `packet` of length 12 bytes.
    msg := ch.Bytes()

    _ = msg // process msg here
}

Get chunkedreader on Github.
