Cypher User Manual

Introduction

Description

Cypher is a software toolkit designed to aid in the decryption of standard (historical) ciphers by providing statistical data and algorithmic analysis of encrypted messages. Most modern forms of encryption rely on the relatively recent method of public-key cryptography which, for the purposes of this software, is treated as unbreakable and is therefore not addressed. Cypher also allows users to encrypt messages using these same historical algorithms and techniques. The toolkit is thus intended to serve as a cryptographic learning tool and as a pastime for amateur cryptographers.

Assumptions

This software is best put to use with a basic background in historical cryptography as much of the manipulation of encrypted messages is user-guided. As not all users may be familiar with historical cryptography, a brief background to the field is included in this manual. It is possible to run the automated decryption algorithms, but these are limited in scope and lack the ingenuity of the human brain, which historically has prevailed against the most trying odds in cipher decryption.

Intended Users

Anyone wanting to learn to use basic historical encryption and decryption techniques will find this software useful. Some advanced ciphers can be decrypted (or encrypted) with this software, but these are methods that have historically been overcome and so Cypher does not provide secure data encryption at a professional level.

Organization

This manual is divided into six sections:

1.  Introduction: A brief overview of the software.

2.  Installation: Instruction on installing and running Cypher.

3.  Getting Started: A step-by-step guide to solving a monoalphabetic, a polyalphabetic, and a transposition cipher with user assistance.

4.  Additional Functions and Features: A detailed guide to all the options, functions, and features of Cypher, including cipher type selection, type-specific options, display options, and automated cipher decryption.

5.  Background: A basic historical background necessary to maximize the software’s decryption potential.

6.  Glossary: A list of common cryptographic terms and their definitions.

Installation

Unix Machines

Download cypher.tar.gz from the Cypher homepage. Uncompress the file by typing the following commands:

gunzip cypher.tar.gz

tar -xvf cypher.tar

Move into the cypher directory by typing cd cypher and follow the instructions found in the README file.

Microsoft Windows

Download the binary executable, Cypher.exe, from the Cypher homepage. Start the program by double-clicking on the Cypher.exe icon.

Getting Started

User-assisted Cipher Decryption of a Simple Monoalphabetic Cipher

After starting up the program, you should see a window similar to the following on your screen:

The layout is fairly simple: there are four windows and a toolbar. The two text windows and the key palette (at the bottom of the screen) are action windows, where the user can perform text manipulations or substitutions. The top left window is a reference tool that lets the user compare the statistical results for a message to those of the English language. Finally, the bottom left window is a display window for the results of any analysis performed on the encrypted message.

In the course of decryption, the user may modify any of the action windows and affect the decryption or partial decryption of the message in the decryption window:

·  Modifying text in the cipher window will change the source for viewing the encrypted message, which is then replaced by plaintext characters according to the key palette.

·  Modifying the key palette will change what a character in the cipher window will be replaced by when displayed on the decryption window.

·  Changing a character in the decryption window has the same effect as modifying an entry in the key palette, with the same consequences of reconstituting the decryption shown according to the new key settings.
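The relationship between the cipher window, the key palette, and the decryption window can be sketched in a few lines of Python. This is only an illustration, not Cypher's actual code: the function name apply_key and the convention that cipher letters are upper case and plain text letters are lower case are assumptions made for the example.

```python
def apply_key(cipher_text, key):
    """Return the partial decryption of cipher_text under a (possibly partial) key.

    key maps cipher letters to plain text letters; letters without an
    entry are left unchanged, as in a partially filled key palette.
    """
    return "".join(key.get(ch, ch) for ch in cipher_text)


# Hypothetical palette with only two entries filled in.
palette = {"N": "h", "M": "e"}
print(apply_key("QNM", palette))  # Qhe

# Changing a palette entry and re-applying the key refreshes the decryption.
palette["Q"] = "t"
print(apply_key("QNM", palette))  # the
```

Because the decryption is always recomputed from the cipher text and the current key, editing any of the three windows leads to the same refresh step.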

Our first task is to input the encrypted message. This can be done in two ways: by entering the message manually, or by loading it from a text file by choosing “Open…” from the File menu in the cipher window. At any point when the message is displayed in the cipher window, it can be saved to disk by choosing “Save…” or “Save As…” from the File menu in the cipher window. (Similarly, a message in the decryption window can be saved as plain text at any time by using the File menu in that window.) Once the message has been entered, choose “Begin Decryption” from the Option menu in the cipher window, and the encoded message will be copied to the decryption window:

At this point we probably want to run a frequency analysis on the encrypted message. To do this, click on the “%” icon on the toolbar on the right of the screen. When the analysis is run, statistics are computed automatically for single character frequencies, digram frequencies, and trigram frequencies, but by default only the single character frequencies are displayed. In order to view the other results, or to view any results in histogram format, choose the appropriate display option from the application’s View menu. The results for the message will automatically be displayed in the message frequency window on the left of the screen:

With this information we can start guessing at the substitutions. All of M, J, and X have high frequencies in the encrypted message, so one of them is probably “e” in plain text. We might try some combinations of these by inputting the corresponding plain text letters in the key palette at the bottom of the screen. Assuming we guess correctly, here are the results of those substitutions and the corresponding display on the screen (we’ll assume correct guesses until a little farther in this manual):

At this point, we might assume that “TNE” represents “the” and so “N” is “h” in plain text. Using these intuitive observations along with the frequency tables for single characters, digrams, and trigrams, one might arrive at this partial decryption:

By now, several potential words have been completed. As strings of plain text characters lengthen, and if the dictionary auto-lookup is enabled, plain text character substrings are checked against a standard dictionary file and are underlined green if there is a match; if there is a partial match or seemingly misspelled word, the character substring will be underlined red. In the example above, there are no misspellings.
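A dictionary check of this kind might look like the following Python sketch. It is a simplified stand-in for the auto-lookup, under stated assumptions: words that still contain undeciphered letters (shown here in upper case) are skipped, any fully deciphered word not in the dictionary is marked red, and the name mark_words and the tiny word set are hypothetical. How Cypher itself decides on a "partial match" is not specified in this manual.

```python
def mark_words(text, dictionary):
    """Return (word, mark) pairs, where mark is 'green', 'red', or None."""
    marks = []
    for token in text.split():
        word = token.strip(".,;:!?\"'")
        if not word.islower():
            # Still contains undeciphered (upper-case) letters: no verdict yet.
            marks.append((word, None))
        elif word in dictionary:
            marks.append((word, "green"))
        else:
            marks.append((word, "red"))
    return marks


sample_dictionary = {"the", "wall", "ball", "was"}  # hypothetical word list
for word, mark in mark_words("the wall wbs Yall", sample_dictionary):
    print(word, mark)
```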

Considering the substring “Yall”, we might guess that “Y” represents “b” since “ball” is a valid word. However, making this substitution produces the following display (Note the underlined red words that are misspelled as a result of this substitution):

We can see that we’ve made a mistake; so we delete “b” from below “Y” in the key palette, and reassess, this time coming up with “w” as a possible match. The dictionary auto-lookup recalculates the matches and this time there are no misspelled words:


Entering keys for the remaining few cipher text letters produces a completely deciphered message, with no errors (Note that a proper name or other uncommon words may not be recognized by the dictionary auto-lookup, and so a message that is completely deciphered may have “error” markings. Here the dictionary auto-lookup has been disabled.):

Solving a Polyalphabetic Cipher

The display for a polyalphabetic cipher is very similar to that of the monoalphabetic cipher. However, the message frequencies do not lead to a ready decryption as in the case of the monoalphabetic cipher. The additional repetition search tool (denoted “XYZ,XYZ” on the toolbar) will display results that may give insight into the length of the keyword. Once some preliminary guesses are established, set the keyword length (whose default is one, for monoalphabetic ciphers) by clicking on the keyword length icon on the toolbar, and recalculate the message frequencies for the different character positions within the keyword. The frequencies for each position can be viewed by selecting the relative position in the keyword from the View menu in the main application window. Once a keyword length is selected, all statistical and substitution functions are performed only on multiples of the keyword length from that position.
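The two ideas in this section, searching for repeated sequences and then analyzing each keyword position separately, can be sketched as follows. This is an illustrative Python example rather than Cypher's implementation; the function names are hypothetical. It relies on the general observation that, in a polyalphabetic cipher, the distances between repeated sequences tend to be multiples of the keyword length.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeated_sequences(cipher_text, length=3):
    """Map each repeated substring of `length` letters to the distances
    between its successive occurrences."""
    positions = defaultdict(list)
    for i in range(len(cipher_text) - length + 1):
        positions[cipher_text[i:i + length]].append(i)
    return {seq: [b - a for a, b in zip(pos, pos[1:])]
            for seq, pos in positions.items() if len(pos) > 1}


def letters_at_position(cipher_text, keyword_length, position):
    """Letters encrypted with the keyword letter at `position` (0-based)."""
    return cipher_text[position::keyword_length]


text = "ABCXYZABCQRSABC"            # toy message, spaces already removed
repeats = repeated_sequences(text)
print(repeats)                       # {'ABC': [6, 6]}

# A common factor of the distances is a candidate keyword length.
candidate = reduce(gcd, repeats["ABC"])
print(candidate)                     # 6
print(letters_at_position(text, candidate, 0))  # AAA
```

Each string returned by letters_at_position can then be treated like a monoalphabetic message and given its own frequency analysis.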

Solving Transposition Ciphers

Select “Transposition Cipher” from the main application’s Options menu. The program will prompt for matrix dimensions and the display will be reset. The toolbar icons will change to perform inversions, rotations, transposes, and other matrix functions.
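As a rough illustration of what a matrix function does to a message, the Python sketch below writes a text into rows and reads it back by columns. The function names and the toy message are assumptions for the example; Cypher's own matrix tools may behave differently, for instance when the message does not fill the matrix exactly.

```python
def to_matrix(text, columns):
    """Write text row by row into rows of `columns` characters.
    The last row is shorter if the text does not fill the matrix."""
    return [text[i:i + columns] for i in range(0, len(text), columns)]


def transpose(rows):
    """Read the matrix column by column (assumes rows of equal length)."""
    return ["".join(column) for column in zip(*rows)]


rows = to_matrix("ADBECF", 2)
print(rows)                        # ['AD', 'BE', 'CF']
print("".join(transpose(rows)))    # ABCDEF
```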

Additional Functions and Features

Cipher Type Selection and Type-specific Options

Tools for decrypting different types of ciphers can be made available by choosing the appropriate cipher type from the Options menu. Once the cipher type is selected, a dialog box will prompt the user for the options to be used for that mode of cipher decryption. In this way it is also possible to change the options for the current decryption mode.

Display of Results

Using the View menu, results of statistical analysis can be displayed in either tabular or histogram format. The data displayed can also be changed from single character analysis to digrams or trigrams, again in tabular or histogram format. The single character histogram is plotted alphabetically to aid in determining any possible shift encryptions, but the digram and trigram histograms are plotted in decreasing order of frequency. If the results of the histogram are hard to see, click once on the histogram and a new window will pop up with a larger version of that image.

Changing the Dictionary

Under the Options menu, choose “Load new dictionary” and follow the prompts.

Automated Cipher Decryption

Select “Automated Cipher Decryption” from the Options menu. A pop-up box will appear prompting for the type of cipher to be used, a potential key if any, and any other necessary or relevant information for decrypting the message (possibly including the choice of algorithm, a preferred encryption type, or a selection of multiple cipher types to attempt). The user can also select whether to attempt to decipher the message using a random or an iterative key search. The dictionary auto-lookup function provides the means of determining the success of decryption; the key sequences that lead to the highest percentage of dictionary matches are the most likely to be the actual decryption key.
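The idea of scoring candidate keys by their percentage of dictionary matches can be sketched as below. This is a minimal Python illustration under stated assumptions, not the algorithm Cypher uses: it performs an iterative search over the 26 keys of a simple alphabet rotation (chosen only because the key space is small), and the function names and two-word dictionary are hypothetical.

```python
def shift_decrypt(cipher_text, shift):
    """Undo a rotation of the alphabet by `shift` places."""
    out = []
    for ch in cipher_text.lower():
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")))
        else:
            out.append(ch)  # spaces and punctuation pass through
    return "".join(out)


def match_percentage(text, dictionary):
    """Percentage of whitespace-separated words found in the dictionary."""
    words = text.split()
    if not words:
        return 0.0
    return 100.0 * sum(word in dictionary for word in words) / len(words)


def best_shift(cipher_text, dictionary):
    """Try every shift in turn and keep the one with the best score."""
    return max(range(26),
               key=lambda s: match_percentage(shift_decrypt(cipher_text, s), dictionary))


sample_dictionary = {"the", "ball"}  # hypothetical word list
shift = best_shift("wkh edoo", sample_dictionary)
print(shift, shift_decrypt("wkh edoo", shift))  # 3 the ball
```

A random search would replace range(26) with randomly drawn keys; the scoring step stays the same.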

Background

Types of Ciphers

There are two broad categories of ciphers: substitution ciphers and transposition ciphers. Substitution ciphers are characterized by a substitution of a character in the original plain text message by the corresponding character in the cipher alphabet, often according to some protocol derived from a predetermined key. In a substitution cipher, the cryptographer is not concerned with changing the position of the characters being encoded, only their values. Transposition ciphers, on the other hand, retain a character’s true form but instead change its position. The subject of substitution ciphers is addressed first.

Monoalphabetic Substitution Ciphers

In a monoalphabetic substitution cipher, a single character in the plain text alphabet is replaced by a single character in the cipher alphabet. If the language of the plain text message is known (and it is assumed to be English in this software), then the frequency with which each letter occurs in that language is also known from analyzing samples of the language. In English, the most frequently occurring letter is ‘e’ at 12.3%, followed by ‘t’ (9.6%) and then ‘a’ (8.0%); a complete reference to the order statistics of the English language is included in the software. After analyzing the order statistics of a message encrypted by a monoalphabetic substitution cipher, mapping the most frequently occurring letters in the cipher text to the most frequently occurring letters in the plain text language often yields most of the correct substitutions. The incorrect assignments can be corrected by visually (or algorithmically) finding patterns within the text, often by recognizing commonly occurring words such as “and” or “the”. Once a few letters are deciphered, the rest will follow as a natural consequence of the order statistics and new recognizable words within the partially decrypted message. If the message does not yield readily to a single-character frequency analysis, the frequencies of digrams (two-character combinations) and trigrams (three-character combinations) can also be analyzed and compared to the known statistics of the English language.
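The frequency counts described above are straightforward to compute. The following Python sketch is one way to do it, offered as an illustration rather than as Cypher's implementation; the function name is hypothetical, and the choice to drop spaces and punctuation before forming digrams and trigrams (so that they may span word boundaries) is an assumption.

```python
from collections import Counter


def ngram_frequencies(text, n=1):
    """Return {ngram: percentage}, ordered from most to least frequent.

    n=1 gives single characters, n=2 digrams, n=3 trigrams.
    Non-letters are removed first, so n-grams may span word boundaries.
    """
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    grams = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: 100.0 * count / total
            for gram, count in Counter(grams).most_common()}


message = "XJM XJM QX"  # toy cipher text
for letter, percent in ngram_frequencies(message).items():
    print(letter, round(percent, 1))
print(ngram_frequencies(message, n=2))
```

Comparing the resulting order with the reference table for English suggests which cipher letters to try first.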

One common form of the monoalphabetic substitution cipher is a keyword-based variant of the Caesar shift, in which the encrypter chooses a keyword or keyphrase (simply referred to as the “key”) that is easy to remember and readily yields the cipher alphabet. As an example, consider the plain text alphabet to be “abcdefghijklmnopqrstuvwxyz” and let the key be the word “cypher”. Then the cipher alphabet would be “cypherabdfgijklmnoqstuvwxz”: the key (with letter repetitions removed) followed by the rest of the normal alphabet in order, skipping any letter already used in the key. This type of cipher is easily solved by examining what the software calls a “shift histogram”, a histogram of the cipher alphabet plotted on the assumption that the cipher and plain text alphabets are the same. Since the ordered histogram of letters in the English language is known, it can be compared to the results of the cipher text analysis, and peaks and troughs in the graphs can be matched to yield a reasonable guess at the key and thus the cipher alphabet.
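The construction of a key-derived cipher alphabet can be sketched in Python as follows. The function names are hypothetical and the code is an illustration, not taken from Cypher. Each letter is used exactly once, so the resulting cipher alphabet always contains 26 letters.

```python
import string


def keyword_alphabet(key):
    """Key letters first (repetitions removed), then the unused letters in order."""
    ordered = []
    for ch in key.lower() + string.ascii_lowercase:
        if ch in string.ascii_lowercase and ch not in ordered:
            ordered.append(ch)
    return "".join(ordered)


def keyword_encrypt(plain_text, key):
    """Replace each plain text letter by the letter at the same position
    in the key-derived cipher alphabet; other characters are unchanged."""
    table = str.maketrans(string.ascii_lowercase, keyword_alphabet(key))
    return plain_text.lower().translate(table)


alphabet = keyword_alphabet("cypher")
print(alphabet, len(alphabet))
print(keyword_encrypt("abc", "cypher"))  # cyp
```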

Polyalphabetic Substitution Ciphers

A polyalphabetic substitution cipher is fairly similar to a monoalphabetic one, the difference being that instead of using a single cipher alphabet, multiple cipher alphabets are used. The simplest type of polyalphabetic cipher, the Vigenere cipher, can use as many as twenty-six distinct cipher alphabets, each a simple rotation of the normally ordered alphabet. For example, one alphabet might be “abcdefghijklmnopqrstuvwxyz”, another “bcdefghijklmnopqrstuvwxyza”, another “cdefghijklmnopqrstuvwxyzab” and so on. The Vigenere encryption works on the basis of a keyword, whose letters determine the order and character of the cipher alphabets used to encode a message.

As a simple example, to encrypt a message using the keyword “cypher” start by translating the first character of the message according to the cipher alphabet corresponding to the first letter of the keyword, or “cdef…”. If the plain text character is ‘b’ then it becomes ‘d’ according to this first cipher alphabet. To encrypt the second letter of plain text, use the cipher alphabet corresponding to the second letter of the keyword, or “yzabc…”. If the plain text character is ‘b’ again, this time it becomes ‘z’ according to the second cipher alphabet. This process continues to the last letter of the keyword and then the cycle is restarted, reusing the first letter of the keyword, then the second, and so on.
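The procedure just described can be written as a short Python sketch. The function name is hypothetical, and two details are assumptions made for the example: the keyword is given in lower case, and characters other than letters are copied through without using up a keyword letter.

```python
def vigenere_encrypt(plain_text, keyword):
    """Encrypt with one rotated alphabet per keyword letter, cycling through the keyword."""
    out = []
    used = 0  # number of keyword letters consumed so far
    for ch in plain_text.lower():
        if "a" <= ch <= "z":
            shift = ord(keyword[used % len(keyword)]) - ord("a")
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


# 'b' under the "cdef…" alphabet becomes 'd'; 'b' under "yzabc…" becomes 'z'.
print(vigenere_encrypt("bb", "cypher"))  # dz
```

Decryption follows the same loop with the shift subtracted instead of added.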