# The use and misuse of the XOR stream cipher

The XOR stream cipher is the foundation of the one-time pad cipher, as well as many other strong ciphers, but it can also be the foundation of a very weak cryptographic system, and it serves equally well as a tool for cracking itself. The devil is in the details.

The XOR stream cipher is the foundation of the one-time pad cipher, as well as many other strong ciphers, but it can also be the foundation of a very weak cryptographic system, and it serves equally well as a tool for cracking itself. The devil is in the details.

Some details of the one-time pad cipher -- including a summary of its history, an explanation of how it works, and some reasons that it is often not practical to use it -- were provided in the enduring cipher. The algorithm used to apply a one-time pad key is trivially implemented in most programming languages. For instance, in Ruby:

``````class String
def xor(key)
text = dup
text.length.times {|n| text[n] ^= key[n] }
text
end
end
``````

This algorithm can also be very easily generalized to perform transformations on data that do not require the difficulties of a one-time pad, by allowing the key to repeat. This generalized form is known as an XOR stream cipher:

``````class String
def xor(key)
text = dup
text.length.times {|n| text[n] ^= key[n.modulo key.size] }
text
end
end
``````

In the former case, the key must be the same length as the text that needs to be transformed. In the case of the XOR stream cipher, however, `n` modulo the size of the key is used instead of the unmodified value of `n` to select a specific character from the key to use. In case you are not familiar with the term "modulo", it is just an operation that returns the remainder of a division operation. For example, `10.modulo 6` returns a value of `4`, the remainder of the operation `10 / 6`. Using the modulo operator this way simply causes iteration through the key to loop around to the beginning of the key when the key runs out.

While a one-time pad itself is highly impractical, and an XOR stream cipher with a reused key (as in the case of using a key shorter than the plaintext you want to encrypt) is subject to a known-plaintext attack, many very strong ciphers over the years have used the XOR stream cipher as an essential part of a more complex encryption algorithm.

Unfortunately, there are a lot of people in the world who simply do not understand the difference between a one-time pad and a repeating XOR stream cipher, missing the importance of using a key that is the same length as the text to be transformed by the cipher, and must be discarded after it is used once, to avoid being vulnerable to a known-plaintext attack. Without satisfying this requirement, you would be better off using a cipher that is designed to be effective using a shorter, reusable key.

Worse, some of the people who do not understand how a simple XOR stream cipher that is not a true one-time pad cipher end up writing cryptographic software. In some cases, in fact, proprietary cipher implementations with their implementations kept secret are sometimes nothing more than an XOR stream cipher using a repeating key, perhaps with an additional layer or two of easily cracked obfuscation painted over it. A good hint that some ciphertext -- that is, encrypted text -- was produced by any XOR stream cipher is that the ciphertext is the same length as the plaintext. If the key is also the same length, an XOR stream cipher may be a one-time pad; otherwise, it is not.

Ciphertext and plaintext being the same length is no guarantee that the cipher used was a simple XOR stream cipher, but it is a pretty good bet. If it is, the key can be trivially cracked if you have access to at least as much plaintext as the key length. In fact, the exact same XOR stream cipher algorithm that is used to encrypt and decrypt text can be used to crack the key. Loading the above `String.xor` method into irb, the interactive Ruby interpreter, allows the following:

``````> plaintext = 'Four Score'
=> "Four Score"
> key = 'foo'
=> "foo"
> ciphertext = plaintext.xor key
=> " \000\032\024O<\005\000\035\003"
> keycrack = ciphertext.xor plaintext
=> "foofoofoof"
``````

It is pretty clear that three characters are repeating here, suggesting that `foo` is the entire key. Voila: you have successfully completed a known plaintext attack on an XOR stream cipher.

Now, take note of the fact that many files have some known bit of text in them. For instance, if you want to decrypt a file called `shopping_list.txt`, you might be right to guess that it contains the words "shopping list". If you are very lucky, that might be longer than the key. With some trivial effort, you can use the encrypted file and the known text from the plaintext version of the file to derive the key -- and, if the person is reusing that key for everything, you have cracked the key for all of it.

The moral of the story is that you should be careful whose cryptographic software you trust, and even more importantly be sure you do not try to implement cryptographic software yourself for real-world use without knowing what you are doing and getting some input from other people who know what they are doing.

A word of warning: I do not know enough about what I am doing that you should use my code to implement cryptographic software for real world use. I am not Bruce Schneier. I am not yet someone you want implementing your cryptosystems. You probably should not use my `String.xor` method as the basis of a one-time pad cipher implementation. Then again, if you need my help to implement a one-time pad program, you probably should not be writing cryptographic software anyway.

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.