Cryptanalysis Collection
Discover the art and science of breaking codes. Explore frequency analysis, Kasiski examination, the index of coincidence, and the other statistical tools that cryptanalysts have used for centuries to read the unreadable.
The Science of Breaking Codes
Cryptanalysis is the art and science of deciphering encrypted messages without knowledge of the key. Where cryptography seeks to conceal meaning, cryptanalysis seeks to reveal it, and the tension between these two disciplines has driven innovation on both sides for millennia. Nearly every classical cipher has been met, sooner or later, by a cryptanalyst who found its weakness.
The earliest recorded cryptanalysis appears in ninth-century Arabic manuscripts. The scholar Al-Kindi wrote a treatise titled A Manuscript on Deciphering Cryptographic Messages, in which he described what we now call frequency analysis: the observation that certain letters appear more frequently than others in any given language, and that this statistical fingerprint survives substitution. Al-Kindi's insight was arguably the most important breakthrough in the history of classical cryptanalysis, and it remained the dominant method for breaking ciphers for roughly a thousand years.
The Renaissance brought increasingly sophisticated ciphers: polyalphabetic systems like the Vigenère cipher, which used multiple shifting alphabets to obscure letter frequencies. For three centuries these were considered unbreakable, until the Prussian infantry officer Friedrich Kasiski published a method in 1863 that could determine the keyword length by examining repeated sequences in the ciphertext. Charles Babbage had independently discovered the same technique years earlier but never published it. The Kasiski examination, combined with the Index of Coincidence developed by William F. Friedman in the 1920s, gave cryptanalysts a complete toolkit for attacking polyalphabetic ciphers.
The twentieth century saw cryptanalysis become a mathematical science. The chi-squared statistic provided an objective measure of fit between observed and expected distributions. N-gram analysis, which examines patterns of two, three, or more consecutive letters, captured far more linguistic structure than single-letter frequencies alone. These tools reached their apotheosis during World War II, when Allied codebreakers at Bletchley Park combined statistical methods, electromechanical machines such as the Bombe, the electronic Colossus computer, and sheer ingenuity to break the Enigma and Lorenz ciphers on an industrial scale.
Today, the principles of classical cryptanalysis remain deeply relevant. The statistical methods developed for pencil-and-paper ciphers — frequency analysis, chi-squared scoring, Index of Coincidence — are the same techniques that power modern automated cipher solvers and underpin the study of cryptography itself. Understanding how to break a simple substitution cipher builds the intuition needed to appreciate why modern algorithms are designed to resist these very attacks.
Featured Exhibits
Frequency Analysis
The most fundamental tool in the cryptanalyst's arsenal. Every language has a distinctive letter frequency profile. In English, E is the most common letter, followed by T, A, O, I, N, S, H, R. By comparing ciphertext letter counts against expected language distributions, an analyst can identify likely substitutions.
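As a rough illustration of the counting step, the following Python sketch tallies letter frequencies in a piece of text. The function name letter_frequencies is our own choice for this example, and it assumes a plain A to Z alphabet with everything else ignored.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, relative frequency) pairs, most common first.

    Assumes an A-Z alphabet; digits, spaces and punctuation are ignored.
    """
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, count / total) for letter, count in counts.most_common()]

# Example: in a Caesar-shifted message the most common ciphertext letter
# is a reasonable first guess for plaintext E (a guess, not a guarantee).
sample = "WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ"
for letter, freq in letter_frequencies(sample)[:3]:
    print(letter, round(freq, 3))
```

On short messages the ranking is noisy, so the top letters should be treated as candidates to test rather than as settled substitutions.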
Kasiski Examination
Friedrich Kasiski's 1863 breakthrough made the Vigenère cipher breakable. By identifying repeated sequences in ciphertext and measuring the distances between them, the analyst can deduce the keyword length — collapsing a polyalphabetic cipher into multiple simpler Caesar ciphers.
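The idea can be sketched in a few lines of Python. This is a minimal illustration, not a full solver: the function names, the default sequence length of 3, and the upper bound of 20 on the keyword length are all assumptions made for the example.

```python
from collections import defaultdict

def repeated_sequence_distances(ciphertext, length=3):
    """Distances between consecutive occurrences of each repeated sequence."""
    text = "".join(c for c in ciphertext.upper() if "A" <= c <= "Z")
    positions = defaultdict(list)
    for i in range(len(text) - length + 1):
        positions[text[i:i + length]].append(i)
    distances = []
    for pos in positions.values():
        # Sequences seen only once contribute nothing here.
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances

def candidate_key_lengths(distances, max_length=20):
    """Rank candidate keyword lengths by how many distances they divide."""
    tally = {}
    for k in range(2, max_length + 1):
        hits = sum(1 for d in distances if d % k == 0)
        if hits:
            tally[k] = hits
    # Most hits first; smaller lengths win ties.
    return sorted(tally.items(), key=lambda kv: (-kv[1], kv[0]))
```

Small factors such as 2 and 3 divide many distances by chance, so the analyst usually looks for the largest length that still divides most of them and then confirms it with another test.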
Index of Coincidence
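The Index of Coincidence (IC) measures how likely it is that two letters drawn at random from a text are identical. English text has a characteristic IC of about 0.067, while uniformly random letters give about 0.038 (1/26). Because a monoalphabetic substitution only relabels letters, it leaves the IC unchanged, whereas a polyalphabetic cipher flattens the distribution and pulls the IC toward the random value. This single number can therefore distinguish monoalphabetic from polyalphabetic ciphertext.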
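For readers who want to compute it, the usual formula is the sum of f(f - 1) over all letter counts f, divided by N(N - 1), where N is the number of letters. The Python sketch below follows that formula; the function name and the choice to return 0.0 for texts shorter than two letters are assumptions of this example.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are equal."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0  # not enough letters to draw a pair
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))
```

Short texts give unstable values, so a reading near 0.067 or 0.038 is most meaningful on a message of at least a few hundred letters.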
N-Gram Analysis
While single-letter frequencies are useful, patterns of two or three letters (digraphs and trigraphs) carry much more information. Common English digraphs like TH, HE, and IN are powerful discriminators. Quadgram scoring is the backbone of modern automated cipher solvers.
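A hedged sketch of how n-gram scoring might look in Python follows. Real solvers typically load quadgram counts from a large corpus; here the reference counts are simply whatever Counter the caller supplies, and the function names and the floor value for unseen n-grams are assumptions of this example.

```python
import math
from collections import Counter

def ngram_counts(text, n=2):
    """Count overlapping n-grams, ignoring everything except A-Z.

    Spaces are dropped, so n-grams may span word boundaries.
    """
    letters = "".join(c for c in text.upper() if "A" <= c <= "Z")
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))

def log_score(text, reference, n=4):
    """Sum of log10 probabilities of the text's n-grams under `reference`.

    Higher (less negative) means more like the reference language.
    """
    total = sum(reference.values())
    if total == 0:
        raise ValueError("reference counts must not be empty")
    floor = math.log10(0.01 / total)  # assumed penalty for unseen n-grams
    score = 0.0
    for gram, count in ngram_counts(text, n).items():
        seen = reference.get(gram)
        score += count * (math.log10(seen / total) if seen else floor)
    return score
```

A solver would call log_score on each candidate decryption and keep the highest-scoring one; the quality of the result depends heavily on how large and representative the reference corpus is.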
Chi-Squared Statistic
The chi-squared statistic quantifies how closely a ciphertext's letter distribution matches expected language frequencies. A low chi-squared value suggests a good candidate decryption. This mathematical tool replaces human intuition with a precise, objective score.
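To make the scoring concrete, here is a minimal Python sketch that uses chi-squared to pick the most English-looking Caesar shift. The frequency table holds commonly cited approximate percentages for English and is an assumption of this example, as are the function names.

```python
from collections import Counter

# Approximate English letter frequencies in percent (assumed values).
ENGLISH_FREQ = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "R": 5.99, "D": 4.25, "L": 4.03, "C": 2.78,
    "U": 2.76, "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97,
    "P": 1.93, "B": 1.49, "V": 0.98, "K": 0.77, "J": 0.15, "X": 0.15,
    "Q": 0.10, "Z": 0.07,
}

def chi_squared(text):
    """Chi-squared distance between the text's letter counts and English."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n == 0:
        return float("inf")  # nothing to score
    counts = Counter(letters)
    total_pct = sum(ENGLISH_FREQ.values())
    score = 0.0
    for letter, pct in ENGLISH_FREQ.items():
        expected = n * pct / total_pct
        score += (counts[letter] - expected) ** 2 / expected
    return score

def caesar_decrypt(text, shift):
    """Shift every letter back by `shift` positions; keep other characters."""
    out = []
    for c in text.upper():
        if "A" <= c <= "Z":
            out.append(chr((ord(c) - 65 - shift) % 26 + 65))
        else:
            out.append(c)
    return "".join(out)

def best_caesar_shift(ciphertext):
    """Return the shift whose decryption has the lowest chi-squared score."""
    return min(range(26),
               key=lambda s: chi_squared(caesar_decrypt(ciphertext, s)))
```

The same scoring function can rank candidate decryptions for each column of a Vigenère ciphertext once the keyword length is known, though very short columns give unreliable scores.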
Cipher Identification
Before breaking a cipher, the analyst must identify what kind of cipher they are facing. The Index of Coincidence separates monoalphabetic from polyalphabetic ciphertext, Kasiski examination suggests a keyword length, and chi-squared or n-gram scoring ranks candidate decryptions. Together these form a diagnostic toolkit that points toward the most likely encryption method.
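One common diagnostic, sketched below in Python, splits the ciphertext into columns for each candidate period and averages the Index of Coincidence across them. When the period matches the keyword length, each column is a simple Caesar cipher and the average rises toward the natural-language value. The function names and the default maximum period of 12 are assumptions of this example.

```python
from collections import Counter

def index_of_coincidence(letters):
    """IC of a sequence of letters; 0.0 if fewer than two letters."""
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

def average_column_ic(text, period):
    """Average IC over the `period` interleaved columns of the text."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    columns = [letters[i::period] for i in range(period)]
    return sum(index_of_coincidence(col) for col in columns) / period

def likely_periods(text, max_period=12):
    """Candidate periods, highest average column IC first."""
    return sorted(range(1, max_period + 1),
                  key=lambda p: -average_column_ic(text, p))
```

A period of 1 corresponds to the IC of the whole text, so a high score there points to a monoalphabetic cipher. Multiples of the true period also score well, which is why this test is usually cross-checked against Kasiski examination.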
Recommended Tours
Beginner Codebreaker
Frequency Analysis → Cipher Identification → Cipher Challenges
~30 min
Statistical Deep Dive
IC → Chi-Squared → N-Grams → Kasiski → Cryptanalysis Lab
~45 min
Hands-On Lab
Frequency Analysis Lab → Cryptanalysis Lab → Challenges
~35 min
Visitor Information
This collection is part of the museum's permanent exhibit and is always open to the public. All interactive tools run entirely in your browser — no downloads, no accounts, no data collection. We recommend starting with the Beginner Codebreaker tour if this is your first introduction to cryptanalysis.
After exploring the exhibits, put your skills to the test in the Cryptanalysis Lab or take on the Cipher Challenges for a deeper dive into each technique's strengths and weaknesses.
All exhibits are free. No account or installation required. Every interactive tool runs in your browser. Processing is stateless — your input is never stored or logged.