04 — The ML-DSA Protocol
Overview
ML-DSA has three operations. Each builds on Module-LWE and Module-SIS, using polynomials in the ring R_q = Z_q[X]/(X^256 + 1).
Step 1: KeyGen
Who runs it: The signer. What it produces: A public key (pk) and private key (sk).
What Happens
- Generate seed ρ: 32 random bytes.
- Expand to matrix A: Hash ρ to generate a k×l matrix of random polynomials in R_q. This is the "bad basis" — public, structured, but hard to find short vectors in.
- Generate secret vectors s₁, s₂: Two short vectors (polynomials with small coefficients, sampled from a narrow distribution).
- Compute t: t = A·s₁ + s₂ (matrix-vector multiplication in R_q).
- Output:
- pk = (ρ, t) — 1,952 bytes (for ML-DSA-65)
- sk = (ρ, K, tr, s₁, s₂, t₀) — includes the secret vectors and intermediate values
Why It's Secure
The public key reveals t = A·s₁ + s₂. Finding s₁ and s₂ from t is the Module-LWE problem — hard even for quantum computers.
Step 2: Sign
Who runs it: The signer, using their private key and a message. What it produces: A signature σ.
What Happens
Hash the message: μ = Hash(tr || message), where tr is a hash of the public key (domain separation).
Generate random mask y: A short vector of polynomials, sampled from a wide distribution.
Compute commitment: w = A·y (high part) and w₁ = HighBits(w) (the "high-order" bits of w).
Derive challenge: c = Hash(μ || w₁) — the Fiat-Shamir challenge.
Compute response: z = y + c·s₁
Rejection sampling: Check if z is "well-shaped":
- ||z||∞ < γ₁ - β (infinity norm bound)
- ||LowBits(w - c·s₂)||∞ < γ₂ - β
If both pass → continue. Otherwise → go back to step 2 with a new y.
Compute hint: h = MakeHint(-c·t₀, w - c·s₂ + c·t₀)
This hint helps the verifier reconstruct w₁ without knowing the secret.
Output signature: σ = (z, h, c)
Signature Size
| Component | Size (ML-DSA-65) | What it is |
|---|---|---|
| z (response vector) | 2,560 bytes | Compressed polynomial vector |
| h (hint) | 128 bytes | Bit-packed hint polynomials |
| c (challenge) | 32 bytes | Hash digest |
| Total | 3,293 bytes |
Step 3: Verify
Who runs it: Anyone with the public key, message, and signature. What it produces: Accept or Reject.
What Happens
Unpack signature: Extract z, h, c from σ.
Check z is short: ||z||∞ < γ₁ - β. If not → REJECT.
Reconstruct w₁':
- Compute Az - c·t (using the public key)
- Use hint h to reconstruct w₁ ≈ HighBits(A·y)
Recompute challenge: c' = Hash(μ || w₁')
Check c' == c: If not → REJECT.
Accept: All checks pass.
Why Verification Works
Az - c·t = A(y + c·s₁) - c·(A·s₁ + s₂)
= A·y + c·A·s₁ - c·A·s₁ - c·s₂
= A·y - c·s₂
= w - c·s₂
The verifier knows A, z, c, t (all public). They can compute Az - c·t and use the hint h to recover the high bits of w = A·y. If the signer really knew s₁, this reconstructs the same w₁ that was used to compute c.
Complete Flow Diagram
Signer Verifier
| |
|-- KeyGen() |
| ρ <- random(32 bytes) |
| A <- Expand(ρ) |
| s₁, s₂ <- small random polynomials |
| t <- A·s₁ + s₂ |
| |
|==== public key pk=(ρ,t) =================>|
| |
|-- Sign(sk, msg) |
| μ <- Hash(tr || msg) |
| retry: |
| y <- random mask |
| w <- A·y |
| w₁ <- HighBits(w) |
| c <- Hash(μ || w₁) |
| z <- y + c·s₁ |
| if ||z|| > bound: continue retry |
| h <- MakeHint(...) |
| σ <- (z, h, c) |
| |
|==== signature σ =========================>|
| |
| Verify(pk, msg, σ) |
| unpack z, h, c |
| check ||z|| < bound |
| w₁' <- UseHint(h, Az - c·t) |
| c' <- Hash(μ || w₁') |
| check c' == c |
| |
|<==== Accept / Reject ====================|
Key Sizes and Speed
| Operation | ML-DSA-44 | ML-DSA-65 | ML-DSA-87 |
|---|---|---|---|
| KeyGen | ~200 µs | ~350 µs | ~500 µs |
| Sign | ~300 µs | ~500 µs | ~800 µs |
| Verify | ~80 µs | ~120 µs | ~180 µs |
| Public key | 1,312 B | 1,952 B | 2,592 B |
| Private key | 2,528 B | 4,032 B | 4,896 B |
| Signature | 2,420 B | 3,293 B | 4,595 B |
Benchmarks on Intel Core i7 at 3.6 GHz.
Resources
- NIST FIPS 204, Sections 5 (KeyGen), 6 (Sign), 7 (Verify)
- CRYSTALS-Dilithium specification: pq-crystals.org/dilithium/data/dilithium-specification.pdf