A Comprehensive Deep Dive to Caesar's Cipher
Until now, I've only posted DeFi-related blog posts; however, from now on, I want to expand the scope of the content I create. Welcome to my first cryptography post. In this writing, we will discuss some aspects and attacks on a cryptographic scheme based on the Caesar's cipher (or shift cipher). Note that, to prevent confusion, we use small letters for the plaintext and capital ones for the ciphertext. Here is the content list:
Understanding the Caesar Cipher
Definition and Historical Significance
The Basic Mechanics of the Cipher
An Example of Caesar Cipher Encryption
The Vulnerability of Caesar's Cipher
The Concept of Brute Force Attacks
Python Code for Brute Force Decryption
Python Code for Frequency Analysis to Automate Attacks
Enhancing Caesar's Cipher: Mono-Alphabetic Substitution
The Idea of Using Permutations as Keys
Python Code for Mono-Alphabetic Substitution Encryption
The Feasibility of Brute Force Attacks
Statistical Methods in Cryptanalysis
Reference to Further Cryptanalysis Example
Cryptography was predominantly used in the military because it was crucial to share confidential information in a way that enemies could not decipher. To encrypt his messages, Julius Caesar shifted each letter of his message by 3 places. For example, he would write "D" instead of "A" and "E" instead of "B," and so on. This is one of the simplest examples of symmetric encryption, where the same key is used for both encryption and decryption; in this case, the key is 3.
"ZHE WKUHH LV D IXWXUH"
is the encryption of the sentence 'web three is a future' by Caesar's cipher with the key 3. Note that using spaces in the plaintext may leak some information, for instance when an attacker sees the letter "D" standing alone in the sentence he can easily guess that D represents "A".
Formally, Caesar’s cipher (private key encryption in general) consists of three algorithms. The first one generates keys, the second one encrypts plaintexts. We denote the encryption of the plaintext m using the key k by
The last one is decryption which takes ciphertext and generates the corresponding plaintext. It’s denoted by
In order to be consistent, we should have
for every message m and key k.
Thus, methodically we say that the Caesar’s schema is
and
for all m_i is a char of the message and c_i is a char of the ciphertext.
Let's write a Python code that encrypts messages via a randomly generated key of Caesar's cipher. Note: you can reach all python codes we mentioned in this post here https://github.com/arbnom/Caesar-s-Cipher
import random
def caesar_cipher_encrypt(plaintext, key):
encrypted_text = ""
for char in plaintext:
# Convert to lowercase if it's not
char = char.lower()
# Check if the character is a lowercase letter
if char.islower():
encrypted_char = chr((ord(char) - 97 + key) % 26 + 97)
encrypted_text += encrypted_char.upper() # Convert to uppercase
else:
# Keep the character unchanged if it's not a letter
encrypted_text += char.upper()
return encrypted_text
# Choose a random key between 1 and 25
key = random.randint(1, 25)
print(f"Key: {key}")
# Set the plaintext directly
plaintext = "web three is a future"
# Encrypt the plaintext
encrypted_text = caesar_cipher_encrypt(plaintext, key)
# Print the encrypted text
print(f"Encrypted Text: {encrypted_text}")
That code snippet generates a random key between 1-25 and encrypts the given plaintext according to that key. It does not change the non-alphabetic chars so typing them gives better results.
While Caesar's cipher is an engaging introduction to encryption, its simplicity makes it vulnerable to brute force attacks. In such attacks, every possible key is tested to decrypt a ciphertext. Given the 26 letters in the English alphabet, finding the correct key is straightforward. Below is a Python code snippet demonstrating a brute force attack on Caesar's cipher.
def caesar_cipher_decrypt(ciphertext, key):
decrypted_text = ""
for char in ciphertext:
# Check if the character is an uppercase letter
if 'A' <= char <= 'Z':
# Perform decryption only on uppercase letters
decrypted_char = chr((ord(char) - 65 - key) % 26 + 65)
decrypted_text += decrypted_char.lower() # Convert decrypted letters to lowercase for consistency
else:
# Leave non-letter characters unchanged
decrypted_text += char
return decrypted_text
# Ciphertext input is provided directly in the code
ciphertext = "ZHE WKUHH LV D IXWXUH"
# Try all possible keys from 1 to 25
for i in range(1, 26):
# Decrypt the ciphertext using the current key
decrypted_text = caesar_cipher_decrypt(ciphertext, i)
# Print the decrypted text
print(f"Key {i}: {decrypted_text}")
The Python code snippet provided efficiently demonstrates this by iterating through every potential key from 1 to 25 and decrypting the ciphertext "ZHE WKUHH LV D IXWXUH". The outputs for each key are as follows:
Key 1: ygd vjtgg ku c hwvwtg
Key 2: xfc uisff jt b gvuvsf
Key 3: web three is a future (revealing the original plaintext)
Key 4: vda sgqdd hr z etstqd
Key 5: ucz rfpcc gq y dsrspc
Key 6: tby qeobb fp x crqrob
Key 7: sax pdnaa eo w bqpqna
Key 8: rzw ocmzz dn v apopmz
Key 9: qyv nblyy cm u zonoly
Key 10: pxu makxx bl t ynmnkx
Key 11: owt lzjww ak s xmlmjw
Key 12: nvs kyivv zj r wlkliv
Key 13: mur jxhuu yi q vkjkhu
Key 14: ltq iwgtt xh p ujijgt
Key 15: ksp hvfss wg o tihifs
Key 16: jro guerr vf n shgher
Key 17: iqn ftdqq ue m rgfgdq
Key 18: hpm escpp td l qfefcp
Key 19: gol drboo sc k pedebo
Key 20: fnk cqann rb j odcdan
Key 21: emj bpzmm qa i ncbczm
Key 22: dli aoyll pz h mbabyl
Key 23: ckh znxkk oy g lazaxk
Key 24: bjg ymwjj nx f kzyzwj
Key 25: aif xlvii mw e jyxyvi
This exhaustive list clearly illustrates how each key transforms the ciphertext into a different plaintext, ultimately uncovering the original message with Key 3. This example underscores the critical weakness of Caesar's cipher against brute force attacks, highlighting the necessity for more sophisticated encryption methods to ensure secure communication.
While brute force attacks on Caesar's cipher test every possible key, such an approach isn't easily automated, especially if the message has been intentionally obfuscated (e.g., by substituting "q" for "p"). To address this, we can employ frequency analysis, leveraging the known frequencies of letters in the English language (denoted as p_i) to decipher the ciphertext.
For instance, the letter "e" (denoted as p_5) appears in 12.7% of cases in English texts. By analyzing the frequency of each letter in the ciphertext and comparing it to these known values, we can estimate the key used for encryption. The sum of the squares of the frequencies of letters in English is approximately 0.065, a figure we'll use to guide our decryption.
The following Python code performs frequency analysis on a given ciphertext, "ZHE WKUHH LV D IXWXUH", to estimate the encryption key:
def calculate_frequency(text):
frequency = {chr(i + 97): 0 for i in range(26)} # Initialize frequency dictionary for 'a' to 'z'
for char in text.lower(): # Convert text to lowercase to standardize
if char in frequency:
frequency[char] += 1
total = sum(frequency.values())
if total > 0:
frequency = {char: count / total for char, count in frequency.items()} # Normalize frequencies
return frequency
# Frequencies of letters in the English alphabet (p_i values)
english_freq = {
'a': 0.08167, 'b': 0.01492, 'c': 0.02782, 'd': 0.04253, 'e': 0.12702,
'f': 0.02228, 'g': 0.02015, 'h': 0.06094, 'i': 0.06966, 'j': 0.00153,
'k': 0.00772, 'l': 0.04025, 'm': 0.02406, 'n': 0.06749, 'o': 0.07507,
'p': 0.01929, 'q': 0.00095, 'r': 0.05987, 's': 0.06327, 't': 0.09056,
'u': 0.02758, 'v': 0.00978, 'w': 0.02360, 'x': 0.00150, 'y': 0.01974,
'z': 0.00074
}
# Ciphertext input
ciphertext = "ZHE WKUHH LV D IXWXUH".replace(" ", "").upper() # Remove spaces and convert to uppercase for analysis
# List frequencies for letters in ciphertext (q_i values)
ciphertext_freq = calculate_frequency(ciphertext)
# Create a dictionary to hold I_j values
I_j_values = {}
# Calculate I_j for each possible shift j
for j in range(26):
I_j = 0
for i in range(26):
char_from_english = chr((i + 97) % 126) # Ensure we're using lowercase letters
char_from_ciphertext = chr((i + j + 97) % 126)
if char_from_english in english_freq and char_from_ciphertext in ciphertext_freq:
I_j += english_freq[char_from_english] * ciphertext_freq[char_from_ciphertext]
I_j_values[j] = I_j
# Find the key by identifying the shift with I_j value nearest to 0.065
nearest_I_j_value = min(I_j_values.values(), key=lambda x: abs(x - 0.065))
key = [k for k, v in I_j_values.items() if v == nearest_I_j_value][0]
print(f"Estimated Key: {key}")
This code first calculates the frequency of each letter in the ciphertext, then compares these frequencies against those expected in English text. The key is estimated by finding the shift that brings the product of the English and ciphertext letter frequencies closest to 0.065.
Additionally, the provided frequency_analysis_debug.py file in the repository enables a deeper dive, showing the I_j - 00.65 values for each potential key j. This can illustrate how even with short ciphertexts like "ZHE WKUHH LV D IXWXUH", the correct key results in a value closest to 0.065, aiding in the decryption process without needing an extensive ciphertext. Here is the values:
I_0 - 0.065 = 0.02695
I_1 - 0.065 = 0.03463
I_2 - 0.065 = 0.03462
I_3 - 0.065 = 0.00561
I_4 - 0.065 = 0.01560
I_5 - 0.065 = 0.03382
I_6 - 0.065 = 0.02737
I_7 - 0.065 = 0.01500
I_8 - 0.065 = 0.03250
I_9 - 0.065 = 0.03262
I_10 - 0.065 = 0.03567
I_11 - 0.065 = 0.03285
I_12 - 0.065 = 0.03496
I_13 - 0.065 = 0.03934
I_14 - 0.065 = 0.02270
I_15 - 0.065 = 0.01765
I_16 - 0.065 = 0.01072
I_17 - 0.065 = 0.02736
I_18 - 0.065 = 0.02065
I_19 - 0.065 = 0.01161
I_20 - 0.065 = 0.02051
I_21 - 0.065 = 0.03260
I_22 - 0.065 = 0.02465
I_23 - 0.065 = 0.03499
I_24 - 0.065 = 0.05015
I_25 - 0.065 = 0.02611
The Caesar cipher, while simple and historically significant, is easily compromised due to the limited number of possible keys. To enhance the security, one could consider a mono-alphabetic substitution cipher, which uses permutations of the alphabet as keys. For the 26 letters of the English alphabet, this results in a vast key space of 26! (26 factorial) permutations, making brute force attacks impractical due to the sheer number of potential combinations.
Here's a Python code snippet that implements the mono-alphabetic substitution cipher by generating a random key and using it for encryption:
import random
# Generate a random permutation of the alphabet
key = random.sample(range(26), 26)
print(f"Key: {key}")
def encrypt_with_monoalphabetic_cipher(plaintext, key):
encrypted_text = ""
for char in plaintext.lower(): # Ensure plaintext is in lowercase
if char.isalpha(): # Check if the character is a letter
# Encrypt the letter using the generated key
encrypted_char = chr(key[ord(char) - ord('a')] + ord('A')) # Convert to uppercase
encrypted_text += encrypted_char
else:
# Non-letter characters are kept as is
encrypted_text += char
return encrypted_text
# Example plaintext
plaintext = "web three is a future"
# Encrypt the plaintext
encrypted_text = encrypt_with_monoalphabetic_cipher(plaintext, key)
# Print the encrypted text
print(f"Encrypted Text: {encrypted_text}")
This code will encrypt a given plaintext using a randomly generated permutation of the alphabet as the key. The resulting ciphertext will appear as a series of uppercase letters, with non-letter characters left unchanged.
Despite the significantly larger key space, the mono-alphabetic substitution cipher is not immune to attacks. Cryptanalysts can employ statistical methods to uncover patterns within the ciphertext. For instance, in English:
The letter "e" is the most frequent.
The letter "q" is typically followed by "u".
The letter "h" commonly appears between "t" and "e".
These patterns can be exploited to decipher the ciphertext without the need for brute force, particularly if a substantial amount of text is available for analysis.
For a comprehensive example of cryptanalysis applied to a mono-alphabetic encryption, check out the detailed explanation provided at the link: About Cryptanalysis of Monoalphabetic Encryption.
This introduction to Caesar's cipher is just the beginning. In future posts, I plan to delve deeper into the fascinating world of cryptography. Thank you for reading and don’t forget to subscribe my Substack.