Chapter 2: Substitution Ciphers

A substitution cipher is a cipher in which correspondents agree on a rearrangement (permutation) of the alphabet in which messages are written. This rearrangement of the alphabet letters is often called the cipher alphabet.

Examples

1. Ciphers given in newspapers.

2. Atbash cipher (one of the earliest known ciphers).

3. "The Gold-Bug," a short story by Edgar Allan Poe.

4. Beale Cipher (contained in the Beale Papers), Bedford, Virginia.

In Chapter 5, we will look at mathematical ways of rearranging the alphabet. In this chapter, we describe three non-mathematical methods.

Section 2.1: Keyword Substitution Ciphers

When creating the cipher alphabet for a substitution cipher, we always assign each plaintext letter to only one unique ciphertext letter, that is, create a one-to-one correspondence between the plaintext and ciphertext. One way we can do this is through a random assignment, as the first example illustrates.

Example 1: Consider a substitution cipher with the following cipher alphabet.

Plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Cipher: T V X Z U W Y A D G K N Q B E H R O S C F J M P I L

Use this cipher alphabet to

a. Encipher the message EDGAR ALLAN POE

b. Decipher the message CAU OTJUB
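
Applying a cipher alphabet is a letter-by-letter table lookup, so answers to exercises like this one are easy to check by machine. The following is a minimal Python sketch and not part of the original notes. The names PLAIN, CIPHER, encipher, and decipher are illustrative, and the sketch assumes messages are written with the letters A-Z (spaces and punctuation pass through unchanged).

```python
# Cipher alphabet from Example 1, written without the spaces.
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "TVXZUWYADGKNQBEHROSCFJMPIL"

# Lookup tables in both directions: plaintext -> ciphertext and back.
TO_CIPHER = str.maketrans(PLAIN, CIPHER)
TO_PLAIN = str.maketrans(CIPHER, PLAIN)


def encipher(message):
    """Replace every plaintext letter by the letter below it in the cipher alphabet."""
    return message.upper().translate(TO_CIPHER)


def decipher(message):
    """Replace every ciphertext letter by the plaintext letter above it."""
    return message.upper().translate(TO_PLAIN)


if __name__ == "__main__":
    print(encipher("EDGAR ALLAN POE"))  # part (a)
    print(decipher("CAU OTJUB"))        # part (b)
```

Because the cipher alphabet is a one-to-one correspondence, the deciphering table is simply the enciphering table read in the other direction.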

A problem with using a random cipher alphabet is that it may be cumbersome for users to keep a record of. One solution is for the sender and receiver of the message to construct a cipher alphabet using a keyword. We demonstrate two methods for doing this.

Techniques For Creating Simple Substitution Ciphers

1. Simple Keyword Substitution Ciphers.

2. Keyword Columnar Substitution Ciphers.

We describe these techniques next.

2.1.1 Simple Keyword Substitution Ciphers

In this method, we write the letters of a keyword, without repetitions and in order of appearance, below the plaintext alphabet. We then list the remaining letters of the alphabet below the plaintext in the usual order.
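
The construction just described can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the course's own software: the function names are made up for this example, and the keyword is assumed to contain only the letters A-Z.

```python
import string

ALPHABET = string.ascii_uppercase


def simple_keyword_alphabet(keyword):
    """Keyword letters (first occurrences only), then the unused letters in A-Z order."""
    ordered = []
    # Appending the full alphabet after the keyword supplies the remaining letters.
    for ch in keyword.upper() + ALPHABET:
        if ch in ALPHABET and ch not in ordered:
            ordered.append(ch)
    return "".join(ordered)


def encipher(message, keyword):
    table = str.maketrans(ALPHABET, simple_keyword_alphabet(keyword))
    return message.upper().translate(table)


def decipher(message, keyword):
    table = str.maketrans(simple_keyword_alphabet(keyword), ALPHABET)
    return message.upper().translate(table)


if __name__ == "__main__":
    # Keyword taken from Example 2 below.
    print(ALPHABET)
    print(simple_keyword_alphabet("NEILSIGMON"))
```

Printing the plaintext alphabet directly above the cipher alphabet reproduces the two-row layout used in the examples.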

Example 2: Suppose we want to use the keyword NEILSIGMON to create a simple keyword substitution cipher.

a. Use the keyword to create the cipher alphabet.

b. Encipher BURIED TREASURE

c. Decipher TQAXAS AR N HAIS LJM.

Solution:

Example 2 illustrates a flaw that can occur in a simple keyword substitution cipher. Unless the keyword contains a letter from the latter part of the alphabet, the last several letters of the plaintext and cipher alphabets are the same. These “collisions” can make this type of cipher more vulnerable to cryptanalysis. The next method for creating a substitution cipher attempts to alleviate this problem.
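
To see how many letters are left unchanged by a given keyword, one can compare the two alphabets position by position. The short Python sketch below does this; the function names are illustrative assumptions, and the keyword NEILSIGMON is the one from Example 2.

```python
import string

ALPHABET = string.ascii_uppercase


def simple_keyword_alphabet(keyword):
    """Keyword letters without repetitions, followed by the unused letters in order."""
    ordered = []
    for ch in keyword.upper() + ALPHABET:
        if ch in ALPHABET and ch not in ordered:
            ordered.append(ch)
    return "".join(ordered)


def unchanged_letters(keyword):
    """Plaintext letters that are enciphered as themselves."""
    cipher = simple_keyword_alphabet(keyword)
    return [p for p, c in zip(ALPHABET, cipher) if p == c]


if __name__ == "__main__":
    print(unchanged_letters("NEILSIGMON"))
```

For this keyword the unchanged letters are exactly the tail of the alphabet that follows the last keyword letter in alphabetical order.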

2.1.2 Keyword Columnar Substitution Ciphers

We write the letters of a keyword without repetitions in order of appearance. The remaining letters of the alphabet are written in successive rows below the keyword. The mixed cipher alphabet is obtained by writing the letters of the resulting array column by column (starting with column 1) below the plaintext alphabet.
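
The column-by-column reading can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names are assumptions made for this example, and the keyword is assumed to use only the letters A-Z.

```python
import string

ALPHABET = string.ascii_uppercase


def columnar_keyword_alphabet(keyword):
    """Write keyword + remaining letters in rows, then read the array by columns."""
    key = []
    for ch in keyword.upper():
        if ch in ALPHABET and ch not in key:
            key.append(ch)
    rest = [ch for ch in ALPHABET if ch not in key]
    letters = key + rest          # the array, read row by row
    width = len(key)              # one column per distinct keyword letter
    # letters[col::width] picks out column number col from top to bottom.
    return "".join("".join(letters[col::width]) for col in range(width))


def encipher(message, keyword):
    table = str.maketrans(ALPHABET, columnar_keyword_alphabet(keyword))
    return message.upper().translate(table)


def decipher(message, keyword):
    table = str.maketrans(columnar_keyword_alphabet(keyword), ALPHABET)
    return message.upper().translate(table)


if __name__ == "__main__":
    # Keyword taken from Example 3 below.
    print(ALPHABET)
    print(columnar_keyword_alphabet("RADFORDVA"))
```

The last row of the array is usually incomplete, so the later columns are one letter shorter than the earlier ones; the slicing handles that without padding.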

Example 3: Suppose we want to use the keyword RADFORDVA to create a keyword columnar substitution cipher.

a. Create the cipher alphabet.

b. Encipher MCCOYS

c. Decipher KRHASYEQO

Solution:

Some Suggested Textbook Exercises for Practice for Section 2.1

p. 9: # 1-6

2.2 A Maplet for Substitution Ciphers

Some Suggested Textbook Exercises for Practice for Section 2.2

p. 24: # 1-4

2.3 Cryptanalysis of Substitution Ciphers

To break a ciphertext that was encrypted using a substitution cipher, we use frequency analysis on single letters, digraphs (two-letter sequences), and trigraphs (three-letter sequences). The following tables list the relative frequencies of single letters and the most common digraphs and trigraphs in English.

Letter / Relative Frequency (%) / Letter / Relative Frequency (%)
A / 8.17 / N / 6.75
B / 1.49 / O / 7.51
C / 2.78 / P / 1.93
D / 4.25 / Q / 0.01
E / 12.70 / R / 5.99
F / 2.23 / S / 6.33
G / 2.02 / T / 9.06
H / 6.09 / U / 2.76
I / 6.97 / V / 0.98
J / 0.15 / W / 2.36
K / 0.77 / X / 0.15
L / 4.03 / Y / 1.98
M / 2.41 / Z / 0.07

Table 1: Relative Frequencies of Letters in the English Language

Most Common are E, T, A, O, I, N, and R

1. TH / 9. HA
2. ER / 10. AT, EN, ES, OF, OR
3. ON / 11. NT
4. AN / 12. EA, TI, TO
5. RE / 13. IT, ST
6. HE / 14. IO, LE
7. IN / 15. IS, OU
8. ED, ND / 16. AR, AS, DE, RT, VE

Table 2: Most Common Digraphs in the English Language

(Based on a 2000 letter sample)

1. THE / 6. TIO / 11. EDT
2. AND / 7. FOR / 12. TIS
3. THA / 8. NDE / 13. OFT
4. ENT / 9. HAS / 14. STH
5. ION / 10. NCE / 15. MEN

Table 3: Most Common Trigraphs in the English Language
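
Tallying letters, digraphs, and trigraphs by hand is tedious, so a short counting routine can help. The Python sketch below is one possible way to do it and is not taken from the textbook or the Maplets. The name ngraph_counts is illustrative, and the sketch assumes that digraphs and trigraphs are counted only inside words, never across a space or punctuation mark. The sample text is the opening of the ciphertext in the next example.

```python
import re
from collections import Counter


def ngraph_counts(text, n):
    """Count every run of n consecutive letters that lies inside a single word."""
    counts = Counter()
    for word in re.findall(r"[A-Z]+", text.upper()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


if __name__ == "__main__":
    sample = "DO YMT ABK EQHBG SCH EHBIH ADNCHQ"
    for n in (1, 2, 3):
        # most_common lists the n-graphs from most to least frequent.
        print(n, ngraph_counts(sample, n).most_common(5))
```

Comparing the most frequent ciphertext n-graphs with Tables 1 through 3 suggests candidate plaintext letters, which then have to be tested against the message.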

Example 4: Suppose the message

DO YMT ABK EQHBG SCH EHBIH ADNCHQ, YMT JBY EHAMJH B QDAC NHQRMK. JBKY NHMNIH SCDKG SCH EHBIH ADNCHQ DR B CMBW. VH JBY KHUHQ GKMV OMQ RTQH.

was enciphered using a substitution cipher and the following frequencies are given.

1 Graphs

A / 5 / F / 0 / K / 7 / P / 0 / U / 1 / Z / 0
B / 10 / G / 3 / L / 0 / Q / 8 / V / 2
C / 7 / H / 18 / M / 8 / R / 3 / W / 1
D / 6 / I / 3 / N / 5 / S / 3 / X / 0
E / 4 / J / 4 / O / 2 / T / 3 / Y / 5

2 Graphs

YM / 2 / MT / 2 / BK / 3 / QH / 2 / HB / 3 / SC / 3
CH / 4 / EH / 3 / BI / 2 / IH / 3 / AD / 2 / DN / 2
NC / 2 / HQ / 4 / JB / 3 / NH / 2 / KY / 2

3 Graphs

YMT / 2 / SCH / 2 / EHB / 2 / HBI / 2 / BIH / 2
ADN / 2 / DNC / 2 / NCH / 2 / CHQ / 2 / JBK / 2
BKY / 2

Decipher this message and find the keyword used to set up the substitution cipher.

Solution:

DO YMT ABK EQHBG SCH EHBIH ADNCHQ, YMT

JBY EHAMJH B QDAC NHQRMK. JBKY NHMNIH

SCDKG SCH EHBIH ADNCHQ DR B CMBW. VH

JBY KHUHQ GKMV OMQ RTQH.

Plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Cipher:
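
While filling in the table, it helps to see the message with the current guesses substituted in. The Python sketch below does that; the function name and the dot used for unknown letters are illustrative choices. The three guesses shown are only a plausible starting point, not a confirmed part of the key: H is by far the most frequent ciphertext letter, which suggests E, and SCH is a repeated trigraph ending in H, which suggests THE.

```python
def apply_partial_key(ciphertext, guesses):
    """Substitute guessed letters; show ciphertext letters with no guess as '.'."""
    out = []
    for ch in ciphertext.upper():
        if ch.isalpha():
            out.append(guesses.get(ch, "."))
        else:
            out.append(ch)  # keep spaces and punctuation
    return "".join(out)


if __name__ == "__main__":
    # Tentative guesses (ciphertext letter -> plaintext letter), to be revised as needed.
    guesses = {"H": "E", "S": "T", "C": "H"}
    print(apply_partial_key("SCH EHBIH ADNCHQ", guesses))
    # prints: THE .E..E ...HE.
```

Each new guess that produces readable word fragments can be added to the dictionary, and any guess that produces nonsense can be removed.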

Some Suggested Textbook Exercises for Practice for Section 2.3

p. 21: # 1-3

2.4 A Maplet for Cryptanalysis of Substitution Ciphers

Some Suggested Textbook Exercises for Practice for Section 2.4

p. 24: # 1-2

2.5 The Playfair Cipher

The Playfair cipher encrypts plaintext letters in pairs (digraphs) rather than one at a time, so that the ciphertext is less vulnerable to frequency analysis.

Playfair ciphers were first described in 1854 by English scientist and inventor Sir Charles Wheatstone.

The cipher is named for Scottish scientist and politician Baron Lyon Playfair, Wheatstone’s friend, who argued in favor of their use by the British government.

Although initially rejected because of their perceived complexity, Playfair ciphers were eventually used by the British military during the Second Boer War and World War I, as well as by British intelligence and the militaries of several countries, including the United States and Germany, during World War II.

Playfair Cipher Encryption Steps

1. Choose a keyword and remove any repeated letters. Write the keyword in a way similar to keyword columnar substitution ciphers, except that the array must always have exactly five letters per row. After the keyword without repetitions is written, the rest of the alphabet, excluding the keyword letters, is written to create a 5 × 5 array made up of 5 rows and 5 columns. Note that since the array consists of only 25 letters, I and J are considered to be the same letter in Playfair arrays, so J is not included.

2. Spaces are removed from the plaintext, and the plaintext is then split into digraphs. If any digraph contains a repeated letter, an X is inserted in the plaintext between the first pair of repeated letters that were grouped together in a digraph, and the plaintext is again split into digraphs. This process is repeated as many times as necessary until no digraph contains a repeated letter. Finally, if necessary, an X is inserted at the end of the plaintext so that the last letter is in a digraph.

3. In a Playfair cipher, the 5 × 5 array of letters is used to convert plaintext digraphs into ciphertext digraphs according to the following rules.

  • If the letters in a plaintext digraph are in the same row of the array, then the ciphertext digraph is formed by replacing each plaintext letter with the letter in the array in the same row but one position to the right, wrapping from the end of the row to the start if necessary.

  • If the letters in a plaintext digraph are in the same column of the array, then the ciphertext digraph is formed by replacing each plaintext letter with the letter in the array in the same column but one position down, wrapping from the bottom of the column to the top if necessary.

  • If the letters in a plaintext digraph are not in the same row or column of the array, then the ciphertext digraph is formed by replacing the first plaintext letter with the letter in the array in the same row as the first plaintext letter and the same column as the second plaintext letter, and replacing the second plaintext letter with the letter in the array in the same row as the second plaintext letter and the same column as the first plaintext letter.
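
The three encryption steps can be sketched in Python as shown below. This is a minimal sketch under stated assumptions, not the implementation behind the Maplets: the function names are illustrative, the array is stored as a 25-letter string read row by row, J is treated as I, and plaintext that already contains a doubled X is not given any special treatment.

```python
import string

ALPHABET_25 = string.ascii_uppercase.replace("J", "")  # I and J share one cell


def build_array(keyword):
    """Step 1: keyword without repetitions, then the unused letters, row by row."""
    letters = []
    for ch in keyword.upper().replace("J", "I") + ALPHABET_25:
        if ch in ALPHABET_25 and ch not in letters:
            letters.append(ch)
    return "".join(letters)


def split_into_digraphs(plaintext):
    """Step 2: drop non-letters, separate doubled letters with X, pad the end with X."""
    text = [c for c in plaintext.upper().replace("J", "I") if c in ALPHABET_25]
    digraphs, i = [], 0
    while i < len(text):
        first = text[i]
        if i + 1 < len(text) and text[i + 1] != first:
            digraphs.append(first + text[i + 1])
            i += 2
        else:  # doubled letter or lone final letter: pair it with X
            digraphs.append(first + "X")
            i += 1
    return digraphs


def encipher(plaintext, keyword):
    """Step 3: apply the row, column, and rectangle rules to each digraph."""
    array = build_array(keyword)
    position = {ch: divmod(i, 5) for i, ch in enumerate(array)}  # letter -> (row, col)
    out = []
    for first, second in split_into_digraphs(plaintext):
        r1, c1 = position[first]
        r2, c2 = position[second]
        if r1 == r2:      # same row: one position to the right, wrapping
            out.append(array[5 * r1 + (c1 + 1) % 5] + array[5 * r2 + (c2 + 1) % 5])
        elif c1 == c2:    # same column: one position down, wrapping
            out.append(array[5 * ((r1 + 1) % 5) + c1] + array[5 * ((r2 + 1) % 5) + c2])
        else:             # rectangle: keep own row, take the other letter's column
            out.append(array[5 * r1 + c2] + array[5 * r2 + c1])
    return "".join(out)


if __name__ == "__main__":
    array = build_array("TREASURE")  # keyword used in the example that follows
    for row in range(5):
        print(" ".join(array[5 * row:5 * row + 5]))
    print(" ".join(split_into_digraphs("MOVE TEN FEET EAST FROM THE BIG TREE")))
```

Processing the plaintext from left to right and pairing a letter with X whenever its neighbor is identical gives the same digraphs as repeatedly re-splitting the whole message, because inserting an X never changes the digraphs to its left.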

Example 5: Suppose we want to use the keyword TREASURE to create a Playfair cipher.

a. Use the keyword to create the 5 × 5 cipher array.

b. Encipher MOVE TEN FEET EAST FROM THE BIG TREE

Solution:

Playfair Cipher Decryption

For decryption, the rules for encryption are reversed. After forming the Playfair array with the given keyword, the ciphertext message is grouped into digraphs (note that no X’s are inserted for repeated letters).

1. The first decryption rule is identical to the first encryption rule for digraphs in the same row, except that letters one position to the left are chosen, wrapping from the start of the row to the end.

2. The second decryption rule is identical to the second encryption rule for digraphs in the same column, except that letters one position up are chosen, wrapping from the top of the column to the bottom.

3. The third decryption rule is identical to the third encryption rule. That is, if the letters in a ciphertext digraph are not in the same row or column of the array, then the plaintext digraph is formed by replacing the first ciphertext letter with the letter in the array in the same row as the first ciphertext letter and the same column as the second ciphertext letter, and replacing the second ciphertext letter with the letter in the array in the same row as the second ciphertext letter and the same column as the first ciphertext letter.

Example 6: Suppose the keyword TREASURE was used to create a Playfair cipher. Decipher the message GELECBEHACSDXEMECPRW.
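
The decryption rules can be sketched in Python in the same way as encryption, with the shifts reversed. As before, this is an illustrative sketch with made-up function names, assuming the array is stored row by row as a 25-letter string and that the ciphertext contains an even number of letters. Any X’s that were inserted during encryption remain in the output and have to be removed by the reader.

```python
import string

ALPHABET_25 = string.ascii_uppercase.replace("J", "")  # I and J share one cell


def build_array(keyword):
    """Keyword without repetitions, then the unused letters, read row by row."""
    letters = []
    for ch in keyword.upper().replace("J", "I") + ALPHABET_25:
        if ch in ALPHABET_25 and ch not in letters:
            letters.append(ch)
    return "".join(letters)


def decipher(ciphertext, keyword):
    """Reverse the Playfair rules: shift left in rows, up in columns, swap columns otherwise."""
    array = build_array(keyword)
    position = {ch: divmod(i, 5) for i, ch in enumerate(array)}  # letter -> (row, col)
    text = [c for c in ciphertext.upper() if c in ALPHABET_25]
    if len(text) % 2:
        raise ValueError("ciphertext must contain an even number of letters")
    out = []
    for i in range(0, len(text), 2):
        r1, c1 = position[text[i]]
        r2, c2 = position[text[i + 1]]
        if r1 == r2:      # same row: one position to the left, wrapping
            out.append(array[5 * r1 + (c1 - 1) % 5] + array[5 * r2 + (c2 - 1) % 5])
        elif c1 == c2:    # same column: one position up, wrapping
            out.append(array[5 * ((r1 - 1) % 5) + c1] + array[5 * ((r2 - 1) % 5) + c2])
        else:             # rectangle: identical to the encryption rule
            out.append(array[5 * r1 + c2] + array[5 * r2 + c1])
    return "".join(out)


if __name__ == "__main__":
    # Keyword and ciphertext from the example above.
    print(decipher("GELECBEHACSDXEMECPRW", "TREASURE"))
```

The rectangle rule is its own inverse, which is why only the row and column branches differ from the encryption sketch.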

Notes Concerning Playfair cipher cryptanalysis

Because Playfair ciphers encrypt digraphs, single-letter frequency analysis is in general not helpful. However, when used to encrypt long messages, it is sometimes possible to break Playfair ciphers using frequency analysis on digraphs, since identical plaintext digraphs will always encrypt to identical ciphertext digraphs. Other weaknesses are that a plaintext digraph and its reverse (e.g., AB and BA) will always encrypt to a ciphertext digraph and its reverse, and that for short keywords the bottom rows of the array may be predictable.

Some Suggested Textbook Exercises for Practice for Section 2.5

p. 28: # 1-4

2.6 A Maplet for Playfair Ciphers

Some Suggested Textbook Exercises for Practice for Section 2.6

p. 31: # 1-4