Substitution ciphers are one of the simplest and oldest methods of encryption. They involve replacing plaintext letters or characters with others in a fixed pattern to create ciphertext.
They were probably the most widely used encryption system in antiquity. Despite the development of more complex cryptographic techniques, substitution remains a fundamental building block of modern encryption.
In today’s digital age, substitution ciphers continue to be relevant. Substitution steps show up in areas ranging from internet communication and computer security to secure messaging apps and video games. As such, understanding the basics of substitution ciphers is essential for anyone who wants to understand cryptography or computer security.
What’s their usage today?
Even though substitution ciphers on their own are far less secure than modern block ciphers, substitution is still in use:
- Data encryption: Modern block ciphers such as the Advanced Encryption Standard (AES) include a substitution step in every round. In AES, this step uses a fixed lookup table called the S-box, which replaces each byte of the internal state with another byte as part of a substitution–permutation network.
- Steganography: Someone who wants to hide a secret can combine a substitution cipher with steganography. Encoding the hidden message with a substitution cipher adds an additional layer of protection. We have seen an example of steganography in this article.
- Video games: Substitution ciphers are the basis for secret codes that players can use to unlock hidden features or levels.
- Code obfuscation: Because substitution is lightweight and easy to implement, developers use it to obfuscate their code and malware authors use it to obfuscate sections of their malware (XOR encryption with a fixed key is also a substitution cipher).
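To make the XOR remark concrete, here is a minimal sketch in Python. With a single-byte key, XOR maps every byte value to exactly one other byte value, so it behaves like a monoalphabetic substitution over a 256-symbol alphabet. The function name `xor_bytes`, the key value and the sample string are assumptions chosen only for illustration.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical example values, not taken from any real sample.
secret = b"http://example.invalid/payload"
key = b"\x5a"

obfuscated = xor_bytes(secret, key)
restored = xor_bytes(obfuscated, key)  # XOR with the same key undoes itself

print(obfuscated.hex())
print(restored.decode())
```

Since applying the same key twice restores the input, a single routine serves for both obfuscation and deobfuscation, which is part of why this trick is so common.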
History of Substitution Ciphers
Several populations have used substitution ciphers to protect sensitive information.
Here is an overview of their history:
The Origins of Substitution Ciphers
The origins of substitution ciphers can be traced back to ancient civilizations. One of the earliest known examples of a substitution cipher is the Atbash cipher, which was originally used with the Hebrew alphabet.
The Atbash cipher is a simple substitution cipher that replaces each letter with its corresponding letter at the opposite end of the alphabet. For example, applied to the Latin alphabet, it replaces “A” with “Z”, “B” with “Y”, and so on.
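The mirroring rule can be sketched in a few lines of Python. This is an illustration that assumes the 26-letter Latin alphabet rather than the original Hebrew one, and the function name `atbash` is my own.

```python
import string

ALPHABET = string.ascii_uppercase
# Map A..Z onto Z..A; characters outside A-Z are left untouched.
ATBASH_TABLE = str.maketrans(ALPHABET, ALPHABET[::-1])


def atbash(text: str) -> str:
    """Replace each letter with the one at the mirrored position."""
    return text.upper().translate(ATBASH_TABLE)


print(atbash("STACKZERO"))          # HGZXPAVIL
print(atbash(atbash("STACKZERO")))  # STACKZERO
```

Because the mapping is its own inverse, applying the function twice returns the original text, and there is no key at all, so the cipher offers no real secrecy once the method is known.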
The Caesar Cipher, also known as the shift cipher, is one of the simplest substitution ciphers. It was named after Julius Caesar, who allegedly used it to communicate with his generals.
The Caesar Cipher works by shifting each letter of the plaintext a fixed number of positions down the alphabet.
For example, if the shift is 4 the letter “A” becomes “E”, “B” becomes “F”, and so on. The key for the Caesar Cipher is the number of positions to shift the letters.
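This is the table that we can derive from the key 4:
Plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher: E F G H I J K L M N O P Q R S T U V W X Y Z A B C D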
Here’s an example of the Caesar Cipher in action:
- Plaintext: STACKZERO
- Key: 4
- Ciphertext: WXEGODIVS
To decrypt the ciphertext, the receiver simply shifts the letters back by the same number of positions. In this case, the receiver would shift the letters back by 4 positions to get the original plaintext: STACKZERO.
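The shift rule can be sketched as follows in Python. The function name `caesar` is an assumption for illustration; decryption is simply the same function called with the negated key.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text: str, shift: int) -> str:
    """Shift every letter A-Z by `shift` positions, wrapping around at Z."""
    k = shift % 26
    shifted = ALPHABET[k:] + ALPHABET[:k]
    return text.upper().translate(str.maketrans(ALPHABET, shifted))


ciphertext = caesar("STACKZERO", 4)
plaintext = caesar(ciphertext, -4)  # shifting back by the same key decrypts

print(ciphertext)
print(plaintext)
```

Building the shifted alphabet once and using `str.translate` keeps the code close to the lookup-table view of the cipher; characters outside A-Z pass through unchanged.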
The Caesar Cipher is a very weak cipher, as there are only 25 possible keys (since a shift of 26 would simply result in the original plaintext). Frequency analysis, which involves analyzing the frequency of each letter in the ciphertext to determine the most likely shift value, can easily break it.
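Both attacks are easy to sketch in Python. The code below is a minimal illustration, not a robust cracker: `brute_force` lists all 25 candidate decryptions, and `guess_shift` assumes that the most frequent ciphertext letter stands for “E”, which only tends to hold for reasonably long English text. All function names and the sample message are my own.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase


def caesar(text: str, shift: int) -> str:
    """Shift every letter A-Z by `shift` positions, wrapping around at Z."""
    k = shift % 26
    return text.upper().translate(
        str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    )


def brute_force(ciphertext: str) -> list:
    """Return (key, candidate plaintext) for every possible non-zero shift."""
    return [(k, caesar(ciphertext, -k)) for k in range(1, 26)]


def guess_shift(ciphertext: str) -> int:
    """Guess the key by assuming the most common letter is an encrypted E."""
    counts = Counter(c for c in ciphertext.upper() if c in ALPHABET)
    most_common_letter = counts.most_common(1)[0][0]
    return (ord(most_common_letter) - ord("E")) % 26


message = "MEET ME HERE BEFORE THE SECRET MEETING"  # hypothetical sample
intercepted = caesar(message, 4)

key = guess_shift(intercepted)
print(key, caesar(intercepted, -key))
```

On very short messages the frequency guess can be wrong, in which case scanning the 25 brute-force candidates by eye is still quick.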
Despite its weaknesses, the Caesar Cipher has been used throughout history for simple encryption tasks, as it is easy to understand and implement. It also serves as a basic building block for more complex substitution ciphers, which use multiple shifts or other substitution methods to make the cipher more secure.
The Vigenere Cipher is a polyalphabetic substitution cipher, which means that it uses multiple substitution alphabets instead of just one. Giovan Battista Bellaso described it in the 16th century, and in the 19th century it was misattributed to Blaise de Vigenere, whose name it still carries.
The Vigenere Cipher works by using a series of interwoven Caesar ciphers based on a keyword. The keyword is repeated as many times as necessary to match the length of the plaintext message. Each letter in the keyword is then used to shift the corresponding letter in the plaintext message.
Here’s an example of the Vigenere Cipher in action:
To encrypt the plaintext message “MALWAREANALYSIS” using the Vigenere Cipher with the keyword “STACKZERO”, we follow these steps:
- Write the keyword repeatedly until it matches the length of the plaintext message: STACKZEROSTACKZ
- Assign each letter in the keyword a number based on its position in the alphabet (A=0, B=1, C=2, and so on): S=18, T=19, A=0, C=2, K=10, Z=25, E=4, R=17, O=14
- Shift each letter of the plaintext message by the corresponding number in the keyword, using the Caesar Cipher: M shifted by 18 becomes E, A shifted by 19 becomes T, L shifted by 0 stays L, and so on.
The resulting ciphertext message is “ETLYKQIRBSEYUSR”.
To decrypt the ciphertext message, the receiver only needs to know the keyword. They repeat it along the ciphertext in the same way and shift each letter back by the number of the corresponding keyword letter to retrieve the plaintext message.
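The encryption and decryption steps can be sketched in Python as below. This is a minimal version that assumes both the message and the keyword contain only the letters A-Z; the function name `vigenere` and its `decrypt` flag are assumptions for illustration.

```python
from itertools import cycle


def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Shift each letter by the value of the matching keyword letter (A=0)."""
    sign = -1 if decrypt else 1
    result = []
    # cycle() repeats the keyword for as long as the text requires.
    for letter, key_letter in zip(text.upper(), cycle(keyword.upper())):
        shift = ord(key_letter) - ord("A")
        position = (ord(letter) - ord("A") + sign * shift) % 26
        result.append(chr(position + ord("A")))
    return "".join(result)


ciphertext = vigenere("MALWAREANALYSIS", "STACKZERO")
print(ciphertext)                                       # ETLYKQIRBSEYUSR
print(vigenere(ciphertext, "STACKZERO", decrypt=True))  # MALWAREANALYSIS
```

Note that the three A's in the plaintext become T, K and R here, because each one lines up with a different keyword letter.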
The Vigenere Cipher is more secure than the Caesar Cipher, as the use of multiple substitution alphabets makes it more difficult to crack with frequency analysis. However, it can still be vulnerable to other forms of cryptanalysis, especially if the keyword is short or if the same keyword is used repeatedly.
The Enigma machine was an electromechanical encryption device used by the Germans during World War II to encode their military communications.
Arthur Scherbius, a German engineer, invented this complex device, which implemented a polyalphabetic substitution cipher whose substitution alphabet changed with every keystroke.
Its structure consisted of:
- a keyboard
- a set of rotors that rotated with each keystroke
- a set of lights that displayed the encrypted letters.
Each rotor had 26 electrical contacts, one for each letter of the alphabet. When an operator typed a letter on the keyboard, the machine lit up the lamp of the corresponding encrypted letter.
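To give a feel for how a stepping rotor changes the substitution on every keystroke, here is a deliberately simplified toy model in Python. It is a sketch under strong assumptions: a single rotor and a reflector, with none of the plugboard or multi-rotor stepping of the real machine, and with wiring strings that are used here only as illustrative permutations. The function name `toy_enigma` is my own, and the input is assumed to contain only the letters A-Z.

```python
import string

ALPHABET = string.ascii_uppercase
# Illustrative wirings (assumptions): any permutation of A-Z works as a rotor,
# and any pairing of letters without fixed points works as a reflector.
ROTOR = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"
REFLECTOR = "YRUHQSLDPXNGOKMIEBFZCWVJAT"


def toy_enigma(text: str, start: int = 0) -> str:
    """Encrypt or decrypt with one stepping rotor and a reflector."""
    position = start
    output = []
    for letter in text.upper():
        position = (position + 1) % 26  # the rotor steps before each letter
        x = ALPHABET.index(letter)
        # Forward through the rotor, offset by its current position.
        x = (ALPHABET.index(ROTOR[(x + position) % 26]) - position) % 26
        # Bounce off the reflector.
        x = ALPHABET.index(REFLECTOR[x])
        # Back through the rotor in the reverse direction.
        x = (ROTOR.index(ALPHABET[(x + position) % 26]) - position) % 26
        output.append(ALPHABET[x])
    return "".join(output)


ciphertext = toy_enigma("STACKZERO", start=3)
print(ciphertext)
print(toy_enigma(ciphertext, start=3))  # same start position decrypts
```

Two properties of this design carry over from the reflector idea: running the same function with the same start position decrypts the message, and no letter is ever encrypted to itself.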
The Enigma machine had multiple rotor configurations, or “settings,” which could be changed on a daily basis. This made the machine so hard to crack that the Germans became too confident in its security.
They made the error of entrusting their military communications entirely to Enigma.
However, a team of codebreakers at Bletchley Park in England, which included the mathematician Alan Turing, cracked the Enigma machine, building on earlier work by Polish cryptanalysts. They developed a machine, the Bombe, which automated the search for the daily settings.
Breaking the code was a significant turning point in World War II, as it allowed the Allies to intercept and decode German military messages. This gave them a strategic advantage and helped them to win key battles.
Today, cryptography enthusiasts consider the Enigma machine a classic example of cryptography. It has also inspired numerous books, movies, and documentaries. Besides that, it is a cautionary tale about the importance of encryption and the risks of relying too heavily on a single encryption method.
Monoalphabetic vs polyalphabetic substitution ciphers
A monoalphabetic substitution cipher uses only one substitution alphabet to encrypt a message. So, the algorithm always replaces a given plaintext letter with the same ciphertext letter. We’ve seen the example of the Caesar Cipher, where every letter is shifted by the same fixed number of positions in the alphabet. Monoalphabetic substitution ciphers are easy to use and understand, but they are also easy to crack.
A polyalphabetic substitution cipher, on the other hand, uses multiple substitution alphabets to encrypt a message. Each letter in the plaintext may be replaced by a different letter in the ciphertext, depending on its position in the message or on some other factor. The example we’ve seen previously is the Vigenere Cipher. These kinds of ciphers are more difficult to crack than monoalphabetic ones.
In summary, monoalphabetic substitution ciphers use one substitution alphabet, while polyalphabetic substitution ciphers use multiple substitution alphabets. Polyalphabetic substitution ciphers are generally more secure than monoalphabetic substitution ciphers, but they can also be more complex to use and understand.
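The difference is easy to see in a small Python sketch that treats both kinds of cipher as a list of shifts: one shift gives a monoalphabetic cipher, several shifts used in rotation give a polyalphabetic one. The function name `shift_encrypt` and the sample shifts are assumptions for illustration, and the input is assumed to contain only the letters A-Z.

```python
from itertools import cycle


def shift_encrypt(text: str, shifts: list) -> str:
    """Apply the shifts in rotation, one per letter of the text."""
    return "".join(
        chr((ord(letter) - ord("A") + shift) % 26 + ord("A"))
        for letter, shift in zip(text.upper(), cycle(shifts))
    )


mono = shift_encrypt("AAAAAA", [4])          # one alphabet
poly = shift_encrypt("AAAAAA", [10, 4, 24])  # three alphabets in rotation

print(mono)  # EEEEEE
print(poly)  # KEYKEY
```

With a single shift the repeated plaintext letter stays visible as a repeated ciphertext letter, while with three shifts the same letter takes three different forms, which is what blunts simple frequency counting.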
How to break substitution ciphers
Breaking a substitution cipher involves trying to discover the original plaintext message from the encrypted ciphertext. There are several methods that we can use to break a substitution cipher, depending on the complexity and type of the cipher. For example:
- Frequency analysis: This involves analyzing the frequency of letters and letter combinations in the ciphertext. In English, certain letters and letter combinations, such as “e” and “th,” appear more frequently than others. By analyzing the frequency of letters in the ciphertext, a cryptanalyst can identify the most common letters and make educated guesses about their substitutions.
- Crib: If the cryptanalyst knows part of the original plaintext message, they can use this information to make educated guesses about the substitutions. For example, if the attacker knows that the word “the” appears in the plaintext message, they can look for repeated three-letter groups in the ciphertext and try to determine which letters correspond to “t,” “h,” and “e.”
- Kasiski examination: This is a more complex technique that is very effective against polyalphabetic substitution ciphers such as the Vigenere Cipher. It involves analyzing repeated sequences of letters in the ciphertext: the distances between repetitions tend to be multiples of the keyword length, which reveals how many substitution alphabets are in use.
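As an illustration of the Kasiski idea, the Python sketch below looks for repeated three-letter groups and counts which key lengths divide the distances between them. It is a minimal version with assumed names (`repeated_distances`, `candidate_key_lengths`); the sample ciphertext is a hypothetical one, obtained by encrypting “THESUNANDTHEMOON” with the three-letter keyword “KEY”.

```python
from collections import Counter, defaultdict


def repeated_distances(ciphertext: str, n: int = 3) -> list:
    """Distances between consecutive occurrences of each repeated n-gram."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    distances = []
    for indexes in positions.values():
        distances.extend(b - a for a, b in zip(indexes, indexes[1:]))
    return distances


def candidate_key_lengths(ciphertext: str, n: int = 3, max_len: int = 20) -> list:
    """Rank key lengths by how many repeat distances they divide."""
    votes = Counter()
    for distance in repeated_distances(ciphertext, n):
        for length in range(2, max_len + 1):
            if distance % length == 0:
                votes[length] += 1
    return votes.most_common()


sample = "DLCCYLKRBDLCWSMX"  # hypothetical ciphertext, keyword length 3
print(repeated_distances(sample))     # [9]
print(candidate_key_lengths(sample))  # [(3, 1), (9, 1)]
```

Here the group “DLC” repeats 9 positions apart, so 3 and 9 are the candidate keyword lengths. On real ciphertexts there are many repeats, and the true length usually stands out as the divisor shared by most distances; each resulting column can then be attacked with frequency analysis like a Caesar cipher.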
In conclusion, substitution ciphers have played a significant role in cryptography throughout history, from simple monoalphabetic ciphers to the more complex polyalphabetic ones. Despite their relative simplicity, they remain a fascinating and important part of the history of cryptography.
If you want to learn more about cryptography, cybersecurity, and related topics, be sure to follow StackZero’s social media channels and blog. By staying informed and educated about these important issues, you can help protect yourself and others from cyber threats and contribute to a safer and more secure online world.