Interactive Laboratory

Cryptanalysis Lab

Learn how real cryptanalysts break classical ciphers using frequency analysis, pattern recognition, and statistical attacks. Every tool on this page runs in your browser so you can experiment safely with ciphertext samples.

Tools in this lab
  • IC: coincidence test
  • Kasiski: key length
  • N-grams: pattern match
  • ID: cipher guess
Prerequisite intuition

Classical ciphers hide meaning but rarely hide structure. When ciphertext still behaves statistically like language, an attacker gains leverage long before guessing the key by hand.

Frequency Analysis Tool

Paste ciphertext and watch letter frequencies emerge. The animated histogram compares your text against standard English, highlights likely substitutions, and lists the top ten symbols by count.

Tall recurring bars are your first attack surface. In monoalphabetic ciphers, ciphertext rank order should mirror English ETAOIN even when every letter is renamed.

[Chart: letter distribution, ciphertext vs English. Legend: ciphertext, English reference, highlighted peaks.]
How to read this

Arab scholars in the ninth century noticed that Arabic letters are unevenly distributed in normal writing. When European cryptanalysts applied the same counting discipline to Latin alphabets, substitution ciphers lost their main defense: obscurity. Frequency analysis does not magically reveal the key. It narrows the search. You compare ranks, test digraphs like TH and HE, and refine mappings until words appear. For a deeper Caesar-focused walkthrough, see the Frequency Analysis Lab.
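
The rank comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the histogram on this page: the function names are hypothetical, and the `ENGLISH_ORDER` string is an approximate ranking that differs slightly between published frequency tables.

```python
from collections import Counter

# Approximate English letter order, most to least frequent (an assumption;
# sources disagree on the exact order past the first few letters).
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def letter_ranking(text: str) -> list[tuple[str, int]]:
    """Return (letter, count) pairs, highest count first, ties alphabetical."""
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def naive_rank_mapping(text: str) -> dict[str, str]:
    """Pair the i-th most common ciphertext letter with the i-th English letter.

    This is only a first guess: it narrows the search, it does not reveal the key.
    """
    ranked = [letter for letter, _ in letter_ranking(text)]
    return dict(zip(ranked, ENGLISH_ORDER))
```

On real ciphertext the naive mapping is usually right for only a handful of letters; digraph checks and partial words do the rest of the work.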

Index of Coincidence

The index of coincidence (IC) is the probability that two letters picked at random from a text are identical. It is one of the fastest ways to distinguish monoalphabetic ciphertext from polyalphabetic ciphertext.

Formula
IC = Σ n(n−1) / (N(N−1))

For each letter, count how many times it appears (n), sum n(n−1) across the alphabet, and divide by N(N−1) where N is total letters.

Result


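  • ~0.066: likely monoalphabetic (Caesar, substitution) or plain English
  • ~0.040 to 0.050: likely polyalphabetic (Vigenère); longer keywords push the value lower
  • ~0.038 (1/26): near random; very short samples can land anywhere, so treat them with caution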
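
The formula translates directly into code. The sketch below is an illustration under stated assumptions rather than this page's implementation: the function name is hypothetical, only the letters A to Z are counted, and texts with fewer than two letters return 0.0 to avoid dividing by zero.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Compute IC = sum of n(n-1) over all letters, divided by N(N-1)."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total < 2:
        return 0.0  # IC is undefined for fewer than two letters
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))
```

Because the formula depends only on how often each symbol occurs, renaming the letters (as a Caesar or substitution cipher does) leaves the result unchanged.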

William Friedman formalized IC for American codebreaking workflows in the 1920s. Cryptanalysts still sweep candidate Vigenère keyword lengths and compute IC on each column: when the length is correct, every column behaves like Caesar and IC spikes toward English.
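
The keyword-length sweep can be sketched as follows. The helper names and the `max_length` default are assumptions made for illustration. Multiples of the true length also score well, so ties are broken in favor of the shorter length.

```python
from collections import Counter


def index_of_coincidence(letters: str) -> float:
    """IC of a string that already contains only the letters A to Z."""
    total = len(letters)
    if total < 2:
        return 0.0
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))


def average_column_ic(text: str, key_length: int) -> float:
    """Split the letters into key_length columns and average their IC."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    columns = [letters[i::key_length] for i in range(key_length)]
    return sum(index_of_coincidence(col) for col in columns) / key_length


def rank_key_lengths(text: str, max_length: int = 12) -> list[tuple[int, float]]:
    """Return (length, average IC) pairs, best first; shorter lengths win ties."""
    scores = [(k, average_column_ic(text, k)) for k in range(1, max_length + 1)]
    return sorted(scores, key=lambda item: (-item[1], item[0]))
```

A candidate whose average sits near 0.066 while its neighbors stay close to 0.04 is a strong hint. On short ciphertext the columns hold few letters, so expect noisy scores.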

Kasiski Examination

Repeated plaintext under the same keyword alignment produces repeated ciphertext fragments. Measuring distances between repeats exposes likely keyword lengths for Vigenère-style ciphers.

Historical note

In 1863 Friedrich Kasiski published a method for attacking repeating-key ciphers by cataloguing repeated ciphertext sequences and factoring the spacings between them. Charles Babbage had discovered the same idea earlier but never published it. Once keyword length is known, split the ciphertext into columns; each column is a Caesar cipher solvable by frequency analysis. Combine this tool with the Vigenère cracker guide for a full attack workflow.
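
A minimal sketch of the cataloguing and factoring steps is shown below. The function names, the fragment size of three, and the `max_length` cap are assumptions chosen for illustration; the tool on this page may count repeats differently.

```python
from collections import Counter, defaultdict


def repeated_fragment_spacings(text: str, size: int = 3) -> list[int]:
    """Distances between consecutive occurrences of each repeated fragment."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - size + 1):
        positions[letters[i:i + size]].append(i)
    spacings = []
    for starts in positions.values():
        spacings.extend(b - a for a, b in zip(starts, starts[1:]))
    return spacings


def factor_votes(spacings: list[int], max_length: int = 20) -> Counter:
    """Count, for each candidate length, how many spacings it divides evenly."""
    votes = Counter()
    for gap in spacings:
        for k in range(2, max_length + 1):
            if gap % k == 0:
                votes[k] += 1
    return votes
```

Calling `factor_votes(repeated_fragment_spacings(ciphertext)).most_common(5)` lists the most frequently shared factors. Small factors such as 2 and 3 collect votes by chance, so weigh them against the IC evidence before settling on a length.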

N-Gram Analysis

Bigrams and trigrams expose language skeleton that single-letter counts miss. Compare ciphertext token rankings with common English fragments.

Type | Ciphertext | Count | English reference

English favors TH, HE, IN at the digraph level and THE, ING, AND at the trigram level. When ciphertext bigrams look nothing like English rankings, suspect polyalphabetic encryption or transposition. When rankings are distorted but still clustered, monoalphabetic substitution is likely.
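
Counting n-grams takes only a sliding window. In this sketch the function name is hypothetical, and spaces and punctuation are stripped before counting, so fragments may span word boundaries; that is an assumption, and a tool could just as reasonably count within words only.

```python
from collections import Counter


def top_ngrams(text: str, n: int, limit: int = 10) -> list[tuple[str, int]]:
    """Return the most common n-letter fragments with their counts."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    grams = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    return grams.most_common(limit)
```

For example, `top_ngrams(ciphertext, 2)` gives the bigram ranking to set beside TH, HE, IN, and `top_ngrams(ciphertext, 3)` gives the trigram ranking to set beside THE, ING, AND.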

Cipher Identification Assistant

Experimental educational classifier. It combines IC, Caesar shift scoring, and rail-fence trials to suggest which classical family fits your ciphertext.

Treat confidence scores as teaching aids. Short messages, jargon, or mixed languages can fool heuristics. Always verify with domain knowledge and additional tests from the cipher identification guide.
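
The classifier's internals are not shown on this page, so the following is only a guess at what a rail-fence trial could look like: decrypt the text under each plausible rail count and inspect (or score) the candidates. All names here are hypothetical.

```python
def rail_pattern(length: int, rails: int) -> list[int]:
    """Rail index visited at each position of the zigzag (rails >= 2)."""
    cycle = 2 * (rails - 1)
    return [min(i % cycle, cycle - i % cycle) for i in range(length)]


def rail_fence_decrypt(ciphertext: str, rails: int) -> str:
    """Undo a rail-fence transposition written with the given number of rails."""
    pattern = rail_pattern(len(ciphertext), rails)
    # Ciphertext is read off rail by rail, so sort plaintext positions that way.
    order = sorted(range(len(ciphertext)), key=lambda i: (pattern[i], i))
    plain = [""] * len(ciphertext)
    for ch, idx in zip(ciphertext, order):
        plain[idx] = ch
    return "".join(plain)


def rail_fence_trials(ciphertext: str, max_rails: int = 6) -> dict[int, str]:
    """Candidate plaintexts for every rail count from 2 up to max_rails."""
    return {r: rail_fence_decrypt(ciphertext, r) for r in range(2, max_rails + 1)}
```

Transposition leaves single-letter frequencies and IC untouched, so an English-like IC paired with scrambled n-grams is the cue to run trials like these.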

Why Classical Ciphers Fail

A practical cryptanalysis curriculum in plain language — the statistical story behind every tool above.

1. Collect ciphertext: longer samples stabilize frequencies and IC.
2. Measure statistics: IC, histograms, n-grams, repeats.
3. Hypothesize cipher: monoalphabetic, polyalphabetic, or transposition.
4. Recover key / plaintext: brute-force Caesar, map substitution, split Vigenère columns.

Frequency analysis in depth

Human languages are redundant. English uses E roughly twelve to thirteen percent of the time and Z far less than one percent. Any cipher that applies one fixed substitution alphabet across the entire message preserves those proportions. Cryptanalysts therefore start with histograms, not hunches. They ask: which ciphertext symbol appears about as often as E should? Which doubled letters recur the way LL or EE do? Once a few anchors land, partial words constrain the rest of the mapping.

The technique scales from classroom Caesar exercises to historical battlefield traffic, provided enough ciphertext exists. It fails when the cipher changes substitution too quickly (Enigma), when the plaintext is not natural language, or when the sample is too short for stable counts.
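
For the Caesar case the histogram comparison can be automated with a chi-squared score: try all 26 shifts and keep the one whose letter counts sit closest to English. This is a sketch under assumptions; the function names are hypothetical and the `ENGLISH_FREQ` percentages are approximate values that vary between published tables.

```python
from collections import Counter

# Approximate English letter frequencies in percent (an assumption).
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}


def undo_shift(text: str, shift: int) -> str:
    """Move every letter back by `shift` places; other characters pass through."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - 65 - shift) % 26 + 65))
        else:
            out.append(ch)
    return "".join(out)


def chi_squared(text: str) -> float:
    """Lower is closer to English; infinity when the text has no letters."""
    letters = [ch for ch in text if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return float("inf")
    counts = Counter(letters)
    return sum(
        (counts.get(letter, 0) - total * pct / 100) ** 2 / (total * pct / 100)
        for letter, pct in ENGLISH_FREQ.items()
    )


def best_caesar_shift(ciphertext: str) -> int:
    """Return the shift whose decryption looks most like English."""
    return min(range(26), key=lambda s: chi_squared(undo_shift(ciphertext, s)))
```

With only a dozen or so letters the lowest score can belong to the wrong shift, which is the "sample too short" failure mentioned above.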

Kasiski and Friedman tests

The Vigenère cipher was once called “le chiffre indéchiffrable” because a repeating keyword applies different Caesar shifts to different positions. Single-letter frequencies look much flatter, discouraging monoalphabetic attacks. Kasiski examination recovers structure by hunting repeats. If THE appears twice in plaintext and both encryptions align with the same key phase, ciphertext shows matching fragments separated by a multiple of keyword length.

Friedman’s index of coincidence automates the same intuition: split text into columns for each guessed length and measure IC per column. Wrong lengths leave each column a blend of several shifts, so its IC stays close to random; the correct length makes every column spike toward English. Together, Kasiski and IC reduce a daunting keyword search to a manageable column-solving problem.
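
Once a keyword length is trusted, the column-solving step might look like the sketch below. It assumes standard Vigenère arithmetic (ciphertext letter = plaintext letter + key letter, mod 26), uses hypothetical function names, and relies on approximate `ENGLISH_FREQ` percentages.

```python
from collections import Counter

# Approximate English letter frequencies in percent (an assumption).
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}


def chi_squared(letters: str) -> float:
    """Distance between the letter counts and English; lower is better."""
    total = len(letters)
    if total == 0:
        return float("inf")
    counts = Counter(letters)
    return sum(
        (counts.get(letter, 0) - total * pct / 100) ** 2 / (total * pct / 100)
        for letter, pct in ENGLISH_FREQ.items()
    )


def column_shift(column: str) -> int:
    """Treat one column as a Caesar cipher and return its most likely shift."""
    def undo(shift: int) -> str:
        return "".join(chr((ord(c) - 65 - shift) % 26 + 65) for c in column)
    return min(range(26), key=lambda s: chi_squared(undo(s)))


def recover_keyword(ciphertext: str, key_length: int) -> str:
    """Solve each column independently and spell out the keyword."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    return "".join(
        chr(65 + column_shift(letters[i::key_length])) for i in range(key_length)
    )
```

Each column holds only a fraction of the ciphertext, so one or two key letters may come out wrong on short messages. A near-readable keyword or partial plaintext usually makes the fix obvious.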

Statistical cryptanalysis mindset

Classical cryptanalysis is hypothesis testing under uncertainty. You rarely prove a cipher type with one metric. You accumulate evidence: IC suggests polyalphabetic, Kasiski suggests length five, column three has a peak matching T, a crib word like AND appears under that mapping. Each step eliminates inconsistent stories.

Modern cryptography deliberately destroys these trails through diffusion and confusion — every output bit depends on many input bits, and local statistics vanish. Studying classical breaks is therefore not nostalgia. It teaches what “no statistical leakage” actually means and why AES designers obsess over avalanche effects.
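
The avalanche effect is easy to observe. The sketch below flips a single input bit and counts how many output bits change; it uses SHA-256 from Python's standard library as a stand-in, because AES would require a third-party package. The function names are hypothetical.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_bits(message: bytes) -> int:
    """Flip the lowest bit of the first byte and compare the SHA-256 digests."""
    if not message:
        raise ValueError("message must contain at least one byte")
    flipped = bytes([message[0] ^ 0x01]) + message[1:]
    original_digest = hashlib.sha256(message).digest()
    flipped_digest = hashlib.sha256(flipped).digest()
    return bit_difference(original_digest, flipped_digest)
```

A well-behaved primitive changes about half of its 256 output bits for any one-bit input change. A Caesar cipher, by contrast, changes exactly one ciphertext letter when one plaintext letter changes, which is the local leakage every tool on this page exploits.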

From lab to practice

Use this page alongside hands-on puzzles on Cipher Challenges, the focused Frequency Analysis Lab, and the main Cipher Portal for encryption checks. When you can explain why a histogram breaks Caesar but not Enigma, you understand the security upgrade rotor machines attempted — and why operator mistakes, cribs, and mechanized search still defeated them in wartime conditions.

Responsible learning means practicing on educational samples or your own exercises. DecodeCipher teaches historical techniques to clarify cryptography engineering, not to attack real private communications.

Frequently Asked Questions

What IC value indicates monoalphabetic text?

English plaintext typically yields IC ≈ 0.066. Monoalphabetic ciphertext preserves that value. Samples under thirty letters fluctuate; prefer longer texts when possible.

Can Kasiski find the Vigenère keyword itself?

No — it estimates length. After splitting columns, use frequency analysis per column to recover each key letter.

How is this different from the Frequency Analysis Lab?

The Frequency Lab focuses on monoalphabetic leakage with Caesar brute force. This Cryptanalysis Lab adds IC, Kasiski, n-grams, and cipher identification for a broader attack toolkit.

Does analysis run on a server?

All tools on this page execute locally in your browser. Nothing you paste is sent for analysis.

Educational cryptanalysis only. Tools run client-side; no ciphertext is stored.