Arab scholars in the ninth century noticed that Arabic letters are unevenly distributed in normal writing. When European cryptanalysts applied the same counting discipline to Latin alphabets, substitution ciphers lost their main defense: obscurity. Frequency analysis does not magically reveal the key. It narrows the search. You compare ranks, test digraphs like TH and HE, and refine mappings until words appear. For a deeper Caesar-focused walkthrough, see the Frequency Analysis Lab.
Cryptanalysis Lab
Learn how real cryptanalysts break classical ciphers using frequency analysis, pattern recognition, and statistical attacks. Every tool on this page runs in your browser so you can experiment safely with ciphertext samples.
Classical ciphers hide meaning but rarely hide structure. When ciphertext still behaves statistically like language, an attacker gains leverage long before guessing the key by hand.
Frequency Analysis Tool
Paste ciphertext and watch letter frequencies emerge. The animated histogram compares your text against standard English, highlights likely substitutions, and lists the top ten symbols by count.
Tall recurring bars are your first attack surface. In monoalphabetic ciphers, ciphertext rank order should mirror English ETAOIN even when every letter is renamed.
Letter distribution: ciphertext vs English
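The counting step behind a histogram like this can be sketched in a few lines of Python. This is a minimal illustration, not the page's own implementation: `ENGLISH_RANK` is a commonly cited approximate ordering, and the function names are made up for the example. Rank matching only produces a first guess that still has to be refined with digraphs and partial words.

```python
from collections import Counter

# Approximate English letter order by frequency (assumed, commonly cited ranking).
ENGLISH_RANK = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

def letter_counts(text: str) -> Counter:
    """Count A-Z letters, ignoring case, digits, spaces, and punctuation."""
    return Counter(ch for ch in text.upper() if "A" <= ch <= "Z")

def rank_guess(text: str) -> dict[str, str]:
    """Pair the i-th most common ciphertext letter with the i-th English letter."""
    ranked = [letter for letter, _ in letter_counts(text).most_common()]
    return dict(zip(ranked, ENGLISH_RANK))

if __name__ == "__main__":
    sample = "XXXYYZ"
    print(letter_counts(sample).most_common(10))  # top symbols by count
    print(rank_guess(sample))                     # first-pass substitution guess
```

On short samples the ranking past the first few letters is mostly noise, so treat only the tallest bars as anchors.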
Index of Coincidence
The index of coincidence (IC) is the probability that two letters picked at random from a text are the same. It is one of the fastest ways to distinguish monoalphabetic ciphertext from polyalphabetic ciphertext.
For each letter, count how many times it appears (n), sum n(n−1) across the alphabet, and divide by N(N−1) where N is total letters.
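- ~0.066: likely monoalphabetic (Caesar, substitution)
- ~0.038: likely polyalphabetic (Vigenère); long keywords flatten the text toward the uniform-random baseline of 1/26 ≈ 0.0385
- ~0.033: below the random baseline, which usually means the sample is too short to judge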
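The formula described above translates almost directly into code. The sketch below is one possible reading of it; the function name and the choice to return 0.0 for texts with fewer than two letters are assumptions made for the example.

```python
from collections import Counter

def index_of_coincidence(text: str) -> float:
    """Sum n(n-1) over all letters and divide by N(N-1)."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total < 2:
        # The formula is undefined here; 0.0 is an arbitrary placeholder.
        return 0.0
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))
```

A text made of one repeated letter scores 1.0 and a text with no repeated letter scores 0.0; natural-language samples fall between those extremes.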
William Friedman formalized IC for American codebreaking workflows in the 1920s. Cryptanalysts still sweep candidate Vigenère keyword lengths and compute IC on each column: when the length is correct, every column behaves like Caesar and IC spikes toward English.
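The keyword-length sweep can be sketched as follows. This is an illustrative outline rather than a reference implementation: the function names and the `max_length` default are assumptions, and the per-column scores are simply averaged.

```python
from collections import Counter

def index_of_coincidence(letters: str) -> float:
    """IC of a string that already contains only A-Z letters."""
    total = len(letters)
    if total < 2:
        return 0.0
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))

def average_column_ic(ciphertext: str, key_length: int) -> float:
    """Split the text into key_length columns and average their IC values."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    columns = [letters[i::key_length] for i in range(key_length)]
    return sum(index_of_coincidence(col) for col in columns) / key_length

def sweep_key_lengths(ciphertext: str, max_length: int = 12) -> list[tuple[int, float]]:
    """Return (candidate length, average column IC) for lengths 1..max_length."""
    return [(k, average_column_ic(ciphertext, k)) for k in range(1, max_length + 1)]
```

Multiples of the true length also score high, because their columns are subsets of the correct columns, so the smallest length with a clear spike is usually the one to try first.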
Kasiski Examination
Repeated plaintext under the same keyword alignment produces repeated ciphertext fragments. Measuring distances between repeats exposes likely keyword lengths for Vigenère-style ciphers.
In 1863 Friedrich Kasiski published a method for attacking repeating-key ciphers by cataloguing repeated ciphertext sequences and factoring the spacings between them. Charles Babbage had discovered the same idea earlier but never published it. Once keyword length is known, split the ciphertext into columns; each column is a Caesar cipher solvable by frequency analysis. Combine this tool with the Vigenère cracker guide for a full attack workflow.
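A rough sketch of the cataloguing and factoring steps is shown below. The fragment size of three, the `max_length` cap, and the simple vote count are choices made for this example, not details taken from Kasiski's text or from the tool on this page.

```python
from collections import Counter, defaultdict

def repeated_spacings(ciphertext: str, size: int = 3) -> list[int]:
    """Distances between consecutive occurrences of each repeated fragment."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - size + 1):
        positions[letters[i:i + size]].append(i)
    spacings = []
    for hits in positions.values():
        spacings.extend(b - a for a, b in zip(hits, hits[1:]))
    return spacings

def factor_votes(spacings: list[int], max_length: int = 20) -> Counter:
    """Count how many spacings each candidate keyword length divides."""
    votes = Counter()
    for gap in spacings:
        for k in range(2, max_length + 1):
            if gap % k == 0:
                votes[k] += 1
    return votes
```

For example, spacings of 10, 15, and 20 all share the factor 5, so `factor_votes([10, 15, 20]).most_common(1)` puts length 5 on top. Accidental repeats add noise, so small factors such as 2 tend to collect votes they do not deserve.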
N-Gram Analysis
Bigrams and trigrams expose language skeleton that single-letter counts miss. Compare ciphertext token rankings with common English fragments.
| Type | Ciphertext | Count | English reference |
|---|---|---|---|
English favors TH, HE, IN at the digraph level and THE, ING, AND at the trigram level. When ciphertext bigrams look nothing like English rankings, suspect polyalphabetic encryption or transposition. When rankings are distorted but still clustered, monoalphabetic substitution is likely.
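Counting n-grams takes only a sliding window. The sketch below assumes that spaces and punctuation are dropped first, so fragments can span word boundaries; the function name is illustrative.

```python
from collections import Counter

def ngram_counts(text: str, n: int) -> Counter:
    """Count overlapping n-letter sequences after dropping non-letters."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))

if __name__ == "__main__":
    sample = "THE THEME"
    print(ngram_counts(sample, 2).most_common(3))  # bigram ranking
    print(ngram_counts(sample, 3).most_common(3))  # trigram ranking
```

Comparing the top of each ranking against TH, HE, IN and THE, ING, AND gives a quick sense of whether the text still has an English-like skeleton.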
Cipher Identification Assistant
Experimental educational classifier. It combines IC, Caesar shift scoring, and rail-fence trials to suggest which classical family fits your ciphertext.
Treat confidence scores as teaching aids. Short messages, jargon, or mixed languages can fool heuristics. Always verify with domain knowledge and additional tests from the cipher identification guide.
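How the classifier runs its rail-fence trials is not documented here, so the following is only a guess at the general shape: decode the ciphertext under each plausible rail count and let a human or a scoring function judge the candidates. All names are hypothetical.

```python
def rail_pattern(length: int, rails: int) -> list[int]:
    """Rail index visited at each plaintext position in the zigzag."""
    pattern, rail, step = [], 0, 1
    for _ in range(length):
        pattern.append(rail)
        if rails > 1:
            if rail == 0:
                step = 1
            elif rail == rails - 1:
                step = -1
            rail += step
    return pattern

def rail_fence_decrypt(ciphertext: str, rails: int) -> str:
    """Undo a rail-fence transposition written with the given number of rails."""
    pattern = rail_pattern(len(ciphertext), rails)
    # Ciphertext is read rail by rail, so sort positions by rail (stable sort).
    order = sorted(range(len(ciphertext)), key=lambda i: pattern[i])
    plain = [""] * len(ciphertext)
    for ch, pos in zip(ciphertext, order):
        plain[pos] = ch
    return "".join(plain)

def rail_fence_trials(ciphertext: str, max_rails: int = 5) -> dict[int, str]:
    """Candidate plaintexts for 2..max_rails rails."""
    return {r: rail_fence_decrypt(ciphertext, r) for r in range(2, max_rails + 1)}
```

With three rails, `rail_fence_decrypt("WECRUOERDSOEERNTNEAIVDAC", 3)` gives back `WEAREDISCOVEREDRUNATONCE`. Because transposition leaves letter counts untouched, a ciphertext with English-like IC but scrambled bigrams is the natural candidate for these trials.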
Why Classical Ciphers Fail
A practical cryptanalysis curriculum in plain language — the statistical story behind every tool above.
- Collect: longer samples stabilize frequencies and IC.
- Measure: IC, histograms, n-grams, repeats.
- Classify: monoalphabetic, polyalphabetic, or transposition.
- Attack: brute-force Caesar, map substitution, split Vigenère columns.
Frequency analysis in depth
Human languages are redundant. English uses E roughly twelve to thirteen percent of the time and Z far less than one percent. Any cipher that applies one fixed substitution alphabet across the entire message preserves those proportions. Cryptanalysts therefore start with histograms, not hunches. They ask: which ciphertext symbol appears about as often as E should? Which pairs repeat like LL or EE? Once a few anchors land, partial words constrain the rest of the mapping.
The technique scales from classroom Caesar exercises to historical battlefield traffic, provided enough ciphertext exists. It fails when the cipher changes substitution too quickly (Enigma), when the plaintext is not natural language, or when the sample is too short for stable counts.
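For the Caesar case, the "which shift looks most like English" question can be automated with a chi-squared score. The sketch below is one common approach, not necessarily what any tool on this site uses; the frequency table holds approximate, commonly cited percentages and the function names are illustrative.

```python
# Approximate English letter frequencies in percent (assumed reference values).
ENGLISH_PERCENT = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}

def caesar_decrypt(text: str, shift: int) -> str:
    """Shift letters back by `shift`, keeping case and non-letters."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - 65 - shift) % 26 + 65))
        elif "a" <= ch <= "z":
            out.append(chr((ord(ch) - 97 - shift) % 26 + 97))
        else:
            out.append(ch)
    return "".join(out)

def chi_squared(text: str) -> float:
    """Lower scores mean the letter counts look more like English."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    if not letters:
        return float("inf")
    score = 0.0
    for letter, percent in ENGLISH_PERCENT.items():
        expected = len(letters) * percent / 100
        score += (letters.count(letter) - expected) ** 2 / expected
    return score

def best_caesar_shift(ciphertext: str) -> tuple[int, str]:
    """Try all 26 shifts and keep the most English-looking decryption."""
    _, shift = min((chi_squared(caesar_decrypt(ciphertext, s)), s) for s in range(26))
    return shift, caesar_decrypt(ciphertext, shift)
```

The score becomes unreliable on very short messages, which matches the sample-size caveat above.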
Kasiski and Friedman tests
Vigenère was once called “le chiffre indéchiffrable” because a repeating keyword applies different Caesar shifts to different positions. Single-letter frequency looks flat, discouraging monoalphabetic attacks. Kasiski examination recovers structure by hunting repeats. If THE appears twice in plaintext and both encryptions align with the same key phase, ciphertext shows matching fragments separated by a multiple of keyword length.
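A tiny worked example makes the alignment condition concrete. The plaintext and the keywords KEY and LOCK are invented for illustration; the sketch drops spaces before encrypting.

```python
def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt A-Z text with a repeating keyword; non-letters are dropped."""
    letters = [ch for ch in plaintext.upper() if "A" <= ch <= "Z"]
    shifts = [ord(k) - ord("A") for k in keyword.upper()]
    return "".join(
        chr((ord(ch) - ord("A") + shifts[i % len(shifts)]) % 26 + ord("A"))
        for i, ch in enumerate(letters)
    )

if __name__ == "__main__":
    message = "THE SUN AND THE MOON"  # THE starts at letter positions 0 and 9
    aligned = vigenere_encrypt(message, "KEY")    # 9 is a multiple of 3
    shifted = vigenere_encrypt(message, "LOCK")   # 9 is not a multiple of 4
    print(aligned[0:3], aligned[9:12])  # same fragment twice
    print(shifted[0:3], shifted[9:12])  # two different fragments
```

With the three-letter keyword both copies of THE encrypt to DLC, nine positions apart. With the four-letter keyword the second THE meets a different key phase and the repeat disappears.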
Friedman’s index of coincidence automates the same intuition: split text into columns for each guessed length and measure IC per column. Wrong lengths look random; the correct length makes every column spike toward English. Together, Kasiski and IC reduce a daunting keyword search to a manageable column-solving problem.
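Once a keyword length is chosen, the column-solving step might look like the sketch below. It scores each of the 26 shifts per column by correlating letter counts with approximate English frequencies (assumed reference values). The function names are hypothetical, and columns need a few dozen letters each before the result can be trusted.

```python
# Approximate English letter frequencies in percent, A through Z (assumed values).
ENGLISH_PERCENT = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]

def best_shift(column: str) -> int:
    """Caesar shift whose decryption correlates best with English frequencies."""
    counts = [column.count(chr(ord("A") + i)) for i in range(26)]

    def score(shift: int) -> float:
        # Plaintext letter p appears in the column as (p + shift) mod 26.
        return sum(counts[(p + shift) % 26] * ENGLISH_PERCENT[p] for p in range(26))

    return max(range(26), key=score)

def recover_keyword(ciphertext: str, key_length: int) -> str:
    """Solve each column as a Caesar cipher and read the shifts as key letters."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    return "".join(
        chr(ord("A") + best_shift(letters[i::key_length])) for i in range(key_length)
    )
```

If one key letter looks wrong in the decrypted text, the usual fix is to try the runner-up shifts for that column only rather than restarting the whole attack.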
Statistical cryptanalysis mindset
Classical cryptanalysis is hypothesis testing under uncertainty. You rarely prove a cipher type with one metric. You accumulate evidence: IC suggests polyalphabetic, Kasiski suggests length five, column three has a peak matching T, a crib word like AND appears under that mapping. Each step eliminates inconsistent stories.
Modern cryptography deliberately destroys these trails through diffusion and confusion — every output bit depends on many input bits, and local statistics vanish. Studying classical breaks is therefore not nostalgia. It teaches what “no statistical leakage” actually means and why AES designers obsess over avalanche effects.
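The contrast can be seen in a few lines. Python's standard library has no AES, so this sketch substitutes SHA-256 from `hashlib` as a stand-in for a primitive with strong avalanche behavior; it is a hash, not a cipher, and the helper names and sample messages are invented for the example.

```python
import hashlib

def caesar(text: str, shift: int = 3) -> str:
    """Caesar-shift a lowercase a-z string."""
    return "".join(chr((ord(c) - 97 + shift) % 26 + 97) for c in text)

def differing_letters(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def differing_bits(a: bytes, b: bytes) -> int:
    """Hamming distance between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

if __name__ == "__main__":
    # Classical: one changed plaintext letter changes exactly one output letter.
    print(differing_letters(caesar("attackatdawn"), caesar("bttackatdawn")))

    # Modern stand-in: a small input change flips a large share of the
    # 256 output bits.
    d1 = hashlib.sha256(b"attackatdawn").digest()
    d2 = hashlib.sha256(b"bttackatdawn").digest()
    print(differing_bits(d1, d2), "of 256 bits differ")
```

A histogram or IC test has nothing to hold on to in the second case, because no output position tracks any single input position.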
From lab to practice
Use this page alongside hands-on puzzles on Cipher Challenges, the focused Frequency Analysis Lab, and the main Cipher Portal for encryption checks. When you can explain why a histogram breaks Caesar but not Enigma, you understand the security upgrade rotor machines attempted — and why operator mistakes, cribs, and mechanized search still defeated them in wartime conditions.
Responsible learning means practicing on educational samples or your own exercises. DecodeCipher teaches historical techniques to clarify cryptography engineering, not to attack real private communications.
Frequently Asked Questions
What IC value indicates monoalphabetic text?
English plaintext typically yields IC ≈ 0.066. Monoalphabetic ciphertext preserves that value. Samples under thirty letters fluctuate; prefer longer texts when possible.
Can Kasiski find the Vigenère keyword itself?
No — it estimates length. After splitting columns, use frequency analysis per column to recover each key letter.
How is this different from the Frequency Analysis Lab?
The Frequency Lab focuses on monoalphabetic leakage with Caesar brute force. This Cryptanalysis Lab adds IC, Kasiski, n-grams, and cipher identification for a broader attack toolkit.
Does analysis run on a server?
All tools on this page execute locally in your browser. Nothing you paste is sent for analysis.
Continue Learning
Cipher Challenges
Practice breaking Caesar, substitution, Vigenère, and mini-Enigma puzzles with hints and achievements.
Frequency Analysis Lab
Deep dive into monoalphabetic leakage with live Caesar brute force.
Vigenère Cracker Guide
Column splitting workflow after Kasiski length recovery.
Identify a Cipher
Decision checklist before choosing an attack path.
Educational cryptanalysis only. Tools run client-side; no ciphertext is stored.