Codes & Ciphers

Since I was a little kid I’ve been fascinated with codes and ciphers. (There is a difference between the two – technically a ‘code’ replaces whole words with symbols, whilst a ‘cipher’ replaces individual letters with other letters or numbers. What most people call codes are actually ciphers!)

I’ve always enjoyed sending secret messages to friends, and cracking the ones they send back. And in my work it’s surprising how often you run across hidden or coded messages that need breaking.

There’s a myth that you always need the key to crack a cipher. But the truth is very different – there are a few basic tricks you can use to break secret messages. I’ve outlined a few of them below, and taken a look at some simple ciphers…

Caesar Cipher (aka the Cycle or Shift or Rotation Cipher)

The Caesar Cipher is one of the earliest ciphers known – allegedly invented by Julius Caesar, although more likely only used by him. It’s a simple substitution cipher where each letter of the alphabet is ‘shifted’ by a certain amount. So if the required shift is 2 then the letter A would become C, B becomes D, C is E etc. And towards the end of the alphabet, the cipher ‘wraps around’ – so with the 2-shift example, the letter Y becomes A, and Z is B.
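If you'd like to see the shift in action, here is a minimal sketch in Python. The function name caesar_shift is my own invention for illustration, and it assumes the plain 26-letter English alphabet, leaving spaces and punctuation untouched.

```python
def caesar_shift(text, shift):
    """Shift each letter by `shift` places, wrapping around at the end of the alphabet."""
    result = []
    for ch in text:
        if 'A' <= ch <= 'Z':
            base = ord('A')
        elif 'a' <= ch <= 'z':
            base = ord('a')
        else:
            # Not a letter (space, digit, punctuation): leave it alone.
            result.append(ch)
            continue
        # The modulo 26 is what makes Y wrap round to A and Z to B.
        result.append(chr((ord(ch) - base + shift) % 26 + base))
    return ''.join(result)


print(caesar_shift("ABC XYZ", 2))   # CDE ZAB
print(caesar_shift("CDE ZAB", -2))  # ABC XYZ
```

Shifting by a negative amount undoes the cipher, so the same function both enciphers and deciphers.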

There will often be a clue somewhere to give a hint as to how to crack the cipher: references to movement or rotation, for example (or maybe even to Caesar!). Sometimes it’s useful to take a short section of the coded message and try shifting the letters up or down a few different places in the alphabet, to see if recognisable words suddenly appear. There are only 25 possible shifts, so it’s perfectly practical to try them all.
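Trying every shift by hand gets tedious, so here is a rough sketch that does it for you. The names shift_letters and all_shifts are made up for this example, and it assumes the message uses only the ordinary English alphabet. You still have to read down the list and spot the line that makes sense.

```python
def shift_letters(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if 'A' <= ch <= 'Z':
            result.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
        elif 'a' <= ch <= 'z':
            result.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        else:
            result.append(ch)
    return ''.join(result)


def all_shifts(ciphertext):
    """Return (shift, candidate) pairs, undoing every possible shift from 0 to 25."""
    return [(s, shift_letters(ciphertext, -s)) for s in range(26)]


for s, candidate in all_shifts("OGGV OG CV PQQP"):
    print(s, candidate)
```

In this example the line for shift 2 reads MEET ME AT NOON, and every other line is gibberish.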

Number Cipher

A number cipher is often a variation of a Caesar cipher. It looks more complicated, but it’s much the same. You give each letter of the alphabet a number (A is 1, B is 2, and so on), and then perform the shift. So in the 2-shift example, the letter A would become 3, B is 4 etc.
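As a sketch of how that might look in Python: the function names below are my own, and I'm assuming the numbers wrap around just like the Caesar cipher does (so with a shift of 2, Y becomes 1 and Z becomes 2). Other variations simply keep counting past 26. This version also drops spaces and punctuation.

```python
def number_cipher(text, shift):
    """Turn letters into numbers: A=1 ... Z=26, shifted by `shift` with wrap-around."""
    numbers = []
    for ch in text.upper():
        if 'A' <= ch <= 'Z':
            position = ord(ch) - ord('A')              # A=0 ... Z=25
            numbers.append((position + shift) % 26 + 1)  # back to 1..26
    return numbers


def number_decipher(numbers, shift):
    """Reverse number_cipher: undo the shift and turn numbers back into letters."""
    return ''.join(chr((n - 1 - shift) % 26 + ord('A')) for n in numbers)


print(number_cipher("ABZ", 2))         # [3, 4, 2]
print(number_decipher([3, 4, 2], 2))   # ABZ
```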

Mirror Cipher (aka the Atbash Cipher)

The Mirror Cipher is another ancient cipher, a simple letter substitution, originally used to encode messages in Hebrew. It involves taking the alphabet and ‘mirroring’ it, so that each letter swaps places with the one the same distance from the other end. So the letter A becomes Z, and Z is A, whilst the letter B becomes Y, and Y is B.
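Here is a small sketch of the mirror idea in Python, applied to the English alphabet rather than the Hebrew one. The function name atbash is just my label for it.

```python
def atbash(text):
    """Swap each letter with its mirror image: A<->Z, B<->Y, C<->X and so on."""
    result = []
    for ch in text:
        if 'A' <= ch <= 'Z':
            result.append(chr(ord('Z') - (ord(ch) - ord('A'))))
        elif 'a' <= ch <= 'z':
            result.append(chr(ord('z') - (ord(ch) - ord('a'))))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return ''.join(result)


print(atbash("ABYZ"))    # ZYBA
print(atbash("Hello"))   # Svool
print(atbash("Svool"))   # Hello
```

Because the alphabet is simply flipped, running the cipher a second time gives you the original message back.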

Substitution Cipher

A Substitution Cipher is more complex than just shifting the alphabet, or mirroring its halves. Here, letters will be reassigned at random, so the letter A could be H, and B could be R, and C could be X etc. There isn’t an obvious shift of alphabet, and the recipient of the message needs the key (the regular alphabet alongside its enciphered equivalent) to read the text.
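To make the idea of a key concrete, here is a minimal sketch that builds a random one and uses it. The names make_key, encipher and decipher are hypothetical, chosen for this example, and the optional seed is only there so you can get the same key twice when experimenting.

```python
import random
import string


def make_key(seed=None):
    """Pair each letter of the regular alphabet with a letter from a shuffled copy."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_uppercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_uppercase, shuffled))


def encipher(text, key):
    """Replace each letter using the key; anything that isn't a letter passes through."""
    return ''.join(key.get(ch, ch) for ch in text.upper())


def decipher(text, key):
    """Read the key backwards to recover the original letters."""
    reverse = {cipher: plain for plain, cipher in key.items()}
    return ''.join(reverse.get(ch, ch) for ch in text.upper())


key = make_key(seed=1)
secret = encipher("MEET ME AT NOON", key)
print(secret)
print(decipher(secret, key))  # MEET ME AT NOON
```

Note that a purely random shuffle can occasionally leave a letter standing for itself.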

Messages enciphered with letter substitution can look complex and intimidating to the would-be decoder. But don’t be down-hearted, there are some tricks you can pull to crack this kind of message, even if you don’t have access to the key...

Cracking a Substitution Cipher

You can take advantage of the hidden structure in language to break a cipher. The most basic tool in the art of deciphering is ‘frequency analysis’. Some letters appear much more frequently in regular text than others. In English, the letter E is the most frequently-occurring letter, followed by T, and then A. The least frequent letters are X, Q, J, and Z.

If you’ve got a decent length of enciphered text, you can count the number of times particular letters appear and then make an educated guess about the identities of the most common. The ten most common letters in English are: E, T, A, O, I, N, S, R, H, D. Identifying some of them will provide you with a crowbar to crack the rest of the message wide open.
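Counting letters by hand is slow, so here is a rough sketch of a letter counter in Python. The function names are my own, and first_guess is only a starting point: it blindly lines up the most common cipher letters with E, T, A, O, I, N, S, R, H, D in that order, which will rarely be right for every letter, especially in a short message.

```python
from collections import Counter

# The ten most common English letters, in the order given above.
COMMON_LETTERS = "ETAOINSRHD"


def letter_counts(ciphertext):
    """Return (letter, count) pairs, most frequent first, ignoring non-letters."""
    letters = [ch for ch in ciphertext.upper() if 'A' <= ch <= 'Z']
    return Counter(letters).most_common()


def first_guess(ciphertext):
    """Naively pair the most common cipher letters with the most common English ones."""
    ranked = [letter for letter, _ in letter_counts(ciphertext)]
    return dict(zip(ranked, COMMON_LETTERS))


print(letter_counts("XXXYYZ"))  # [('X', 3), ('Y', 2), ('Z', 1)]
print(first_guess("XXXYYZ"))    # {'X': 'E', 'Y': 'T', 'Z': 'A'}
```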

Once you’ve got a rough idea of what some of the more common letters in the message might represent, you can use another deciphering tool – ‘pattern recognition’...

Are there frequent patterns of 3 letters in the message? The most common 3-letter words in English are ‘the’ and ‘and’. So any frequent 3-letter sequence ending in what might be E is probably ‘the’. This clue can confirm your identification of E, and also give you the identity of T and H. Once you’ve spotted ‘the’, the next-most common 3-letter pattern is probably ‘and’, revealing the identity of a further 3 letters.
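Here is a sketch for spotting those 3-letter patterns. It assumes the enciphered message has kept its word spacing, which isn't always the case, and the function name is just an illustration.

```python
import re
from collections import Counter


def common_three_letter_words(ciphertext, top=5):
    """Count the 3-letter words in the message and return the most frequent ones."""
    words = re.findall(r'[A-Z]+', ciphertext.upper())
    return Counter(w for w in words if len(w) == 3).most_common(top)


print(common_three_letter_words("VJG ECV CPF VJG FQI"))
```

In this example VJG turns up twice, which makes it a good candidate for ‘the’.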

Are there double-letters in your enciphered text? They can be an excellent clue for identifying letters. The most common double-letters in English are LL, SS, and EE. They occur almost twice as often as the next set of ‘doubles’ which are OO and TT.
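Doubles are easy to hunt for with a few lines of Python. This is a minimal sketch with a made-up function name; note that a run of three identical letters would be counted as two overlapping doubles.

```python
from collections import Counter


def double_letters(ciphertext):
    """Count every place where the same letter appears twice in a row."""
    text = ciphertext.upper()
    pairs = [a + b for a, b in zip(text, text[1:])
             if a == b and 'A' <= a <= 'Z']
    return Counter(pairs).most_common()


print(double_letters("xllx ll ss"))  # [('LL', 2), ('SS', 1)]
```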

Lastly, if you know anything about the message’s context, you may have some clues to help break the cipher. Are people’s names, or place names, likely to show up in the text? If you can identify any of those then you’ve found yourself a skeleton key for the remainder of the message.

Once you’ve started to identify a decent number of the letters with these tools, the message will begin to reveal itself. Cracking ciphers in this way can be great fun, and is often easier than you might expect.
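To watch a message reveal itself, it helps to fill in the letters you think you know and leave blanks for the rest. Here is a rough sketch; the function name and the example guesses are hypothetical.

```python
def apply_partial_key(ciphertext, known):
    """Show guessed letters in lower case and the still-unknown ones as underscores."""
    out = []
    for ch in ciphertext.upper():
        if ch in known:
            out.append(known[ch].lower())
        elif 'A' <= ch <= 'Z':
            out.append('_')   # a letter we haven't identified yet
        else:
            out.append(ch)    # keep spaces and punctuation
    return ''.join(out)


guesses = {'V': 't', 'J': 'h', 'G': 'e'}
print(apply_partial_key("VJG ECV CPF VJG FQI", guesses))
# the __t ___ the ___
```

Each new letter you add to the guesses fills in more of the blanks, and wrong guesses tend to show up quickly as impossible-looking words.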

Next time you run into a mysterious message, be sure to put your new-found deciphering skills to the test.