
Memory Security in Go

A few months ago, while working on dissident, I started looking around for guidance on how I should manage encryption keys. I found a few references here and there, but the best method I could patch together, with the limited information I had, was something like:

  1. Call mlock(2) (or VirtualLock) on any sensitive resources.
  2. Overwrite the resources when finished with them.
  3. Allow them to fall out of scope.
  4. Pray for the garbage-collector to clean them up, or force it with runtime.GC().

So, I patched together a little library that handled some of it for me, chucked it on HN, and moved on.

A short time later, I realised that beef was kicking off. As it turned out, the aforementioned approach was fundamentally flawed, because one thing hadn’t been accounted for: the garbage-collector. It goes around doing whatever it feels like: making a copy here, moving something around there. It’s a real pain from a security standpoint.

I really didn’t have a choice at that point: I had to dedicate all the time I could spare to the project, for the ten days it took to develop and release the fix.

A few people had mentioned wrapping libsodium, but I wanted a pure-Go solution, so that wasn’t ideal. Instead, a friend of mine, dotcppfile, and I began analysing how libsodium actually worked. Well, I say “we”; really, he pretty much audited the entire library while I checked the documentation and implemented the relevant system calls.

Within a few days, we had a pretty solid understanding of libsodium and we were ready with a new and improved plan. I think the best way to explain it is to introduce you to the end product: memguard.

Alright, now say you need to generate an encryption key and store it securely. What’s the process?

Well first we need some memory from the OS, so we need to determine the number of pages that we have to allocate. In this case the length of the buffer is 32 bytes and we can assume the system page-size to be 4096 bytes. The data is stored between two guard pages and is prepended with a random canary of length 32 bytes (more on these later). So, since the data and the canary together will comfortably fit into a single page, we need to allocate just three pages.
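
To make the arithmetic concrete, here’s a rough sketch (my own code, not memguard’s; the 4096-byte page size and the pagesNeeded name are just for illustration):

    package main

    import "fmt"

    // Assumed page size; in reality you would query the OS for it.
    const pageSize = 4096

    // pagesNeeded returns the number of pages to allocate for a buffer of
    // length bufLen: enough data pages to hold the canary plus the data,
    // with one guard page on either side.
    func pagesNeeded(bufLen, canaryLen int) int {
        dataPages := (canaryLen + bufLen + pageSize - 1) / pageSize // round up
        return dataPages + 2                                        // add the two guard pages
    }

    func main() {
        fmt.Println(pagesNeeded(32, 32)) // 3: guard | canary + data | guard
    }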

But we can’t ask the Go runtime for the memory—since then it is free to mess around with it—so how do we do it? Well, there are a few ways to accomplish this, but we decided to go with Joseph Richey’s suggestion of using mmap(2) (or VirtualAlloc on Windows), since the system-call is natively implemented, and that allowed us to avoid a dirty cgo solution.

But of course only the Unix system-calls were natively implemented; the Windows ones were not. Luckily, there was this library by Alex Brainman that we could vendor instead, and it proved invaluable. (I did later add the missing system-calls to the standard library to remove the dependency.)
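
On Unix, the allocation itself boils down to a single call. A minimal sketch (not memguard’s exact code, and Unix-only):

    // Ask the kernel for pages directly with mmap(2), so the memory is never
    // managed (or moved) by the Go runtime. On Windows the equivalent call
    // is VirtualAlloc.
    package memsketch

    import "syscall"

    // allocPages maps n fresh, zeroed, anonymous pages.
    func allocPages(n int) ([]byte, error) {
        return syscall.Mmap(
            -1, 0, // anonymous mapping: no backing file, no offset
            n*syscall.Getpagesize(),
            syscall.PROT_READ|syscall.PROT_WRITE,
            syscall.MAP_PRIVATE|syscall.MAP_ANON,
        )
    }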

Now that our pages are allocated, we should configure the guard pages. We tell the kernel to disallow all reads and writes to the first and last pages, so if anything does try to do so, a SIGSEGV access violation is thrown and the process panics. This way buffer overflows can be detected immediately, and it becomes almost impossible for other processes to locate and access the data.
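
A sketch of that step on Unix, assuming memory is the slice returned by the mmap call above (the Windows analogue would be VirtualProtect):

    // Revoke all access to the first and last pages so that any read from or
    // write to them faults immediately with SIGSEGV.
    package memsketch

    import "syscall"

    func protectGuardPages(memory []byte) error {
        pageSize := syscall.Getpagesize()
        if err := syscall.Mprotect(memory[:pageSize], syscall.PROT_NONE); err != nil {
            return err
        }
        return syscall.Mprotect(memory[len(memory)-pageSize:], syscall.PROT_NONE)
    }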

The remaining page, the one sandwiched between the guard pages, needs to be protected too. You see, as system memory runs out, the kernel swaps the memory of inactive processes out to disk, and that is something we would like to avoid. So, we call mlock(2) (or VirtualLock) to tell the kernel to leave this middle page alone.
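
Again as a Unix-only sketch, locking everything between the two guard pages:

    // Lock the inner (data) pages into physical memory so the kernel never
    // swaps them out to disk. On Windows this is VirtualLock.
    package memsketch

    import "syscall"

    func lockDataPages(memory []byte) error {
        pageSize := syscall.Getpagesize()
        return syscall.Mlock(memory[pageSize : len(memory)-pageSize])
    }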

The last thing is the canary: a random value placed just before the data. If it ever changes, we know that something went wrong—probably a buffer underflow. A global canary value is generated when the program first starts, so we just copy those bytes into place, and the container is pretty much ready for use.
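
A sketch of the canary handling; the names here are mine, not memguard’s:

    package memsketch

    import (
        "bytes"
        "crypto/rand"
    )

    // canary is generated once, when the program starts.
    var canary = func() []byte {
        c := make([]byte, 32)
        if _, err := rand.Read(c); err != nil {
            panic(err)
        }
        return c
    }()

    // placeCanary copies the global canary into the 32 bytes just before the data.
    func placeCanary(page []byte, dataOffset int) {
        copy(page[dataOffset-32:dataOffset], canary)
    }

    // canaryIntact reports whether those bytes are still untouched; if not,
    // something has written backwards over the buffer (an underflow).
    func canaryIntact(page []byte, dataOffset int) bool {
        return bytes.Equal(page[dataOffset-32:dataOffset], canary)
    }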

[Figure: the current state of our three pages.]

All that is left to do now is to handle the data itself. In our case the function NewImmutableRandom() was called, which fills the buffer with cryptographically-secure random bytes as soon as it is created. A read-only status was also requested, so once the buffer has been filled, we tell the kernel to allow only reads of the middle page. As before, any attempt to write to the buffer will trigger a SIGSEGV access violation and the process will panic.
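
As a usage sketch, something like the following; only NewImmutableRandom is mentioned above, so treat the rest (the error handling, the Destroy call) as assumptions and defer to the documentation for the actual API:

    // Hypothetical usage sketch; check the project's documentation for the
    // real, current API.
    package main

    import "github.com/awnumar/memguard"

    func main() {
        // Ask for a guarded, immutable buffer holding 32 cryptographically-secure
        // random bytes: our encryption key.
        key, err := memguard.NewImmutableRandom(32)
        if err != nil {
            panic(err)
        }
        defer key.Destroy() // assumed cleanup: wipe the data and unmap the pages

        // ... use the key ...
    }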

This project is under active development and so new security features may be added. You can view the official project page here, and full documentation can be found here.

Note that while memguard cannot absolutely guarantee the security of the data, it is about the best that you can reasonably hope to achieve. If you have a suggestion for an improvement, feel free to open a pull-request: contributions are welcome.

Quantum Key-Exchange

The perfect cryptographic primitive—does it exist?

You’ve probably heard of the one-time pad: a cipher almost as infamous for its drawbacks as it is famous for its absolute perfection. It is an example of an information-theoretically secure cipher—one that is mathematically proven to be unbreakable even in the face of infinite computing power.

So why don’t we use it? Well, the drawbacks of the scheme are in the key-exchange. The OTP requires a truly random key that is as long as the plain-text and that can only be used once. So if you want the benefits that it provides, you’re pretty much resigned to meeting up in person and exchanging an SSD full of random bytes every so often.
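
For completeness, the cipher itself is trivial; all of the pain is in the key. A minimal sketch:

    // One-time pad: ciphertext = plaintext XOR key, where the key is truly
    // random, exactly as long as the plaintext, and never reused.
    package main

    import (
        "crypto/rand"
        "fmt"
    )

    func xor(a, b []byte) []byte {
        out := make([]byte, len(a))
        for i := range a {
            out[i] = a[i] ^ b[i]
        }
        return out
    }

    func main() {
        plaintext := []byte("attack at dawn")

        key := make([]byte, len(plaintext)) // as long as the message, used once
        if _, err := rand.Read(key); err != nil {
            panic(err)
        }

        ciphertext := xor(plaintext, key)
        fmt.Printf("%x\n", ciphertext)
        fmt.Printf("%s\n", xor(ciphertext, key)) // decryption is the same operation
    }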

In effect, all you’re doing is having a face-to-face conversation that has been delayed until some time after you actually meet. Suffice it to say, this is a little inconvenient.

But it is necessary, right? I mean, if you exchanged keys conventionally, using RSA or ECC, you’d defeat the purpose of using an information-theoretically secure cipher. But what if there existed another way of exchanging keys, one that was also perfect?

Well, there is: quantum key-distribution.

I’m not just talking perfect as in infinite computing power cannot touch it, I’m talking so perfect that no one can even attempt to intercept the transmission without being detected. This is guaranteed by the no-cloning theorem, so if this turns out to be wrong, we would have to reconsider our most fundamental beliefs about the nature of the universe—not to mention that it would fly in the face of the most precise and accurate experimental tests in all of science.

how does it work tho?

Yeah I’m getting to it.

Consider two people—Alice and Bob—attempting to communicate securely, and a third person—Eve—attempting to intercept their communications. They have access to two insecure channels: a conventional one for bits, and a quantum one for qubits.

[Comic: xkcd, “Protocol”]

Alice has the option of using two different polarisation bases—rectilinear and diagonal—in each of which she can encode either a 0 or a 1. She arbitrarily decides that a 1 encoded in the rectilinear basis will be vertically (90°) polarised, a 1 encoded in the diagonal basis will be polarised at 45°, and so on. This is summarised in the table:

Basis            0     1
Rectilinear (+)  0°    90°
Diagonal (x)     135°  45°

She informs Bob of this scheme through the conventional channel, and now they are ready to exchange keys.
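
Her scheme, written out as a lookup table for illustration:

    // The agreed encoding: a (basis, bit) pair maps to a photon polarisation
    // angle in degrees.
    package main

    import "fmt"

    var polarisation = map[string][2]int{
        "rectilinear": {0, 90},   // bit 0 -> 0°, bit 1 -> 90°
        "diagonal":    {135, 45}, // bit 0 -> 135°, bit 1 -> 45°
    }

    func main() {
        fmt.Println(polarisation["diagonal"][1]) // a 1 in the diagonal basis: 45°
    }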

Alice begins by generating cryptographically-secure random pairings of bits and bases—huge amounts of them. For example,

1: diagonal,
0: diagonal,
0: rectilinear,
1: diagonal,
...

One at a time, she encodes the bits in their associated basis and sends the resulting polarised photons to Bob through the quantum channel. Now, the way this works is that any party that wishes to read these incoming qubits cannot tell which basis they were encoded in—so they just guess. But there’s another catch: if they measure the qubit in the wrong basis, the reading they get is purely random, and the qubit is destroyed in either case.

If you’re paying attention, you will have realised the importance of that property. Eve cannot read any qubits and remain undetected, as any attempted measurement on her part will result in the destruction of the original data.

After Alice has sent all of her data, Bob publicly informs Alice of his choices of basis for each bit. Alice then replies with the actual basis and they both discard any bits where Bob guessed incorrectly. Statistically, Bob will guess correctly around 50% of the time, so they are left with around half of the total bits sent.
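
Here’s a toy simulation of that flow, assuming no eavesdropper and a noiseless channel (and using math/rand purely for brevity; the real protocol needs cryptographically-secure randomness for these choices):

    // Toy simulation of the exchange and sifting steps: no real photons, and
    // measuring in the wrong basis just yields a random bit.
    package main

    import (
        "fmt"
        "math/rand"
    )

    func main() {
        const n = 16

        aliceBits := make([]int, n)
        aliceBases := make([]int, n) // 0 = rectilinear, 1 = diagonal
        bobBases := make([]int, n)
        bobBits := make([]int, n)

        for i := 0; i < n; i++ {
            aliceBits[i] = rand.Intn(2)
            aliceBases[i] = rand.Intn(2)
            bobBases[i] = rand.Intn(2) // Bob can only guess the basis

            if bobBases[i] == aliceBases[i] {
                bobBits[i] = aliceBits[i] // correct basis: faithful reading
            } else {
                bobBits[i] = rand.Intn(2) // wrong basis: purely random reading
            }
        }

        // Sifting: Bob announces his bases, Alice announces hers, and both keep
        // only the positions where Bob guessed correctly (about half of them).
        var key []int
        for i := 0; i < n; i++ {
            if bobBases[i] == aliceBases[i] {
                key = append(key, bobBits[i])
            }
        }
        fmt.Println("sifted key:", key)
    }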

Now all they need to do is check if Eve attempted to eavesdrop on the key-exchange. They take a random subset of bits from the exchanged data and compare them over the conventional channel. If this subset matches, they know—with a high degree of confidence—that their exchange is secure. Otherwise, they discard the data and start again.
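
A sketch of that check; sampleMatches is an illustrative name of mine, and in practice the sampled bits are discarded from the final key either way:

    // Sacrifice a random sample of the sifted bits and compare them in the
    // clear; a mismatch means the qubits were disturbed in transit.
    package main

    import (
        "fmt"
        "math/rand"
    )

    func sampleMatches(alice, bob []int, samples int) bool {
        for i := 0; i < samples; i++ {
            j := rand.Intn(len(alice))
            if alice[j] != bob[j] {
                return false // disturbance detected: discard everything, start again
            }
        }
        return true
    }

    func main() {
        alice := []int{1, 0, 0, 1, 1, 0, 1, 0}
        bob := []int{1, 0, 0, 1, 1, 0, 1, 0}
        fmt.Println(sampleMatches(alice, bob, 4))
    }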

This setup is provably secure against passive eavesdropping, but what about an active man-in-the-middle attack? Say Eve is able to modify the data sent on each communication channel. On the quantum channel this isn’t a problem—Eve cannot perform an MITM attack without being detected—but on the conventional channel we have to think about some kind of authentication.

Luckily, there exists an authentication scheme that provides unconditionally-secure authentication—similar to the OTP. It was invented in the late ’70s by J. Lawrence Carter and Mark N. Wegman, and so is commonly referred to as Wegman-Carter authentication. Without getting into the specific details—this post is meant to be about quantum key-exchange—it allows us to turn a pre-shared secret into a message authentication code that we can include with our messages.
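
To give a flavour of the idea, and nothing more, here is a toy single-use, single-block example built from a strongly universal hash family: with a uniformly random pre-shared pair (a, b) and a public prime p, the tag of a message m is (a·m + b) mod p. This is not the construction used in practice.

    // Toy one-time MAC in the Wegman-Carter spirit: tag = (a*m + b) mod p.
    // The pre-shared key (a, b) must be uniformly random, secret, and used for
    // a single message only; real schemes extend this to long messages.
    package main

    import "fmt"

    const p = 4294967291 // a prime just below 2^32, so a*m + b fits in a uint64

    func tag(a, b, m uint64) uint64 {
        return (a*m + b) % p
    }

    func main() {
        // The pre-shared secret; in reality these are cryptographically-secure
        // random values below p.
        a, b := uint64(2718281828), uint64(3141592653)

        m := uint64(42) // a single-block message to authenticate
        t := tag(a, b, m)
        fmt.Println("message:", m, "tag:", t)

        // The receiver recomputes the tag with the shared key and compares.
        fmt.Println("authentic:", tag(a, b, m) == t)
    }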

So, to authenticate the conventional channel, on the first exchange Alice and Bob simply agree on a small, cryptographically-secure key in advance. They use this key for Wegman-Carter authentication and save a subset of the exchanged bits for use as the key in the next exchange. This way, both active and passive attacks by Eve are impossible without detection, and the OTP is back in the game.