365 Architect

07 — Hash Then Sign

The Principle

ML-DSA never signs a message directly. It first hashes the message to a fixed-size digest and then signs that digest:

Message (any size)
      |
      v
   Hash Function
      |
      v
Fixed-size digest (64 bytes)
      |
      v
   Sign with private key
      |
      v
   Signature

Why?

  • Messages can be any size (megabytes, gigabytes)
  • Signing algorithms work on fixed-size inputs
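  • Hashing the message together with the signer's public-key hash (and an optional context string) provides domain separation: the same message under a different key or context yields a different digest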
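
To make the fixed-size property concrete, here is a small sketch using Python's standard hashlib. The helper name digest64 and the sample messages are illustrative assumptions, not part of ML-DSA, and the sketch covers only the hashing step (the digest ML-DSA actually signs also covers the public-key hash, as described in the following sections).

```python
import hashlib

def digest64(message: bytes) -> bytes:
    """Hash a message of any length to a 64-byte SHAKE-256 digest."""
    return hashlib.shake_256(message).digest(64)

# Messages of very different sizes all map to 64 bytes.
samples = [b"", b"short message", b"x" * 1_000_000]
digests = [digest64(m) for m in samples]
sizes = [len(d) for d in digests]
print(sizes)  # [64, 64, 64]
```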

The ML-DSA Hash Function: SHAKE-256

ML-DSA uses SHAKE-256 (a SHA-3 extendable-output function):

Property | Value | Why
Algorithm | SHAKE-256 | NIST-standardised, quantum-resistant assumption
Output size | 64 bytes (512 bits) | Twice the security level (birthday bound protection)
Domain separation | tr (public key hash) + message | Same message with different keys produces different signatures

The Message Representation

μ = H(tr || M)

Where:

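  • tr = H(pk): a 64-byte hash of the encoded public key pk = (ρ, t1)
  • M: the message to sign (FIPS 204 first prepends a short encoding of the context string, covered under "Context String" below)
  • H: SHAKE-256 with 512-bit output
  • ||: concatenation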

Why include tr? It binds the digest to one specific public key. Without it, every signer hashing the same message would sign the same digest, which gives an attacker a single hash target across many keys and leaves room for key-substitution attacks. Including tr ensures each signer's signatures are in a completely separate "domain."
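
The two hashes above can be sketched with Python's hashlib. This is a minimal illustration under stated assumptions: the function names are hypothetical, the "public keys" are placeholder byte strings rather than real ML-DSA keys, and the context framing that FIPS 204 adds in front of the message is left out so the code mirrors the formula as written.

```python
import hashlib

def H(data: bytes, out_len: int = 64) -> bytes:
    """SHAKE-256 with a chosen output length (64 bytes = 512 bits)."""
    return hashlib.shake_256(data).digest(out_len)

def public_key_hash(pk_bytes: bytes) -> bytes:
    """tr: a 64-byte hash of the encoded public key."""
    return H(pk_bytes)

def message_representative(tr: bytes, m: bytes) -> bytes:
    """mu = H(tr || m), where m is the byte string being signed."""
    if len(tr) != 64:
        raise ValueError("tr must be 64 bytes")
    return H(tr + m)

# Placeholder keys: NOT real ML-DSA public keys, just distinct byte strings.
pk_alice = b"\x01" * 1312
pk_bob = b"\x02" * 1312
msg = b"pay 10 units to Carol"

mu_alice = message_representative(public_key_hash(pk_alice), msg)
mu_bob = message_representative(public_key_hash(pk_bob), msg)
print(mu_alice != mu_bob)  # True: same message, different signer domain
```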

Message Encoding for Different Applications

Pure ML-DSA

Sign the raw message bytes:

signature = ML-DSA.Sign(sk, message_bytes)

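Externally Hashed ML-DSA (ExternalMu-ML-DSA)

If μ is computed outside the signing module (e.g., by the application in front of a hardware security module):

# Caller computes μ; this needs only the public key hash tr, not the private key.
# M' = 0x00 || len(ctx) || ctx || message
mu = SHAKE-256(tr || M')   # 64-byte output

# The signing module signs the externally computed μ:
signature = ExternalMu-ML-DSA.Sign(sk, mu)

This keeps large messages away from the signing module: the caller hashes the message and sends only the 64-byte μ to the module that holds the private key. Note that this is not the same as HashML-DSA, the pre-hash variant defined in FIPS 204, which signs a digest of the message alone and produces signatures that are not interchangeable with pure ML-DSA.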
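
The external computation of μ can be done incrementally, so a large message never has to sit in memory at once. The sketch below uses only hashlib; the function name external_mu, the placeholder tr, and the default prefix are assumptions for illustration. The default prefix is the two bytes FIPS 204 places before the message when the context string is empty.

```python
import hashlib
import io
from typing import BinaryIO

def external_mu(tr: bytes, stream: BinaryIO, prefix: bytes = b"\x00\x00",
                chunk_size: int = 64 * 1024) -> bytes:
    """Compute mu = SHAKE-256(tr || prefix || message) chunk by chunk.

    The default prefix is assumed to be the framing for an empty
    context string (0x00, then a length byte of 0).
    """
    h = hashlib.shake_256()
    h.update(tr)
    h.update(prefix)
    while chunk := stream.read(chunk_size):
        h.update(chunk)
    return h.digest(64)

# Illustrative inputs: a placeholder tr and an in-memory "large" message.
tr = hashlib.shake_256(b"placeholder public key").digest(64)
message = b"A" * 5_000_000
mu = external_mu(tr, io.BytesIO(message))
print(len(mu))  # 64
```

Only the 64-byte value mu would then be handed to the signing module; how that hand-off looks depends on the module's own interface.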

Context String (ctx)

ML-DSA supports an optional context string for application-specific domain separation:

signature = ML-DSA.Sign(sk, message, ctx="TLS-1.3-certificate")

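The context is encoded in front of the message before hashing, preceded by a zero byte (marking pure ML-DSA) and a one-byte length:

μ = H(tr || 0x00 || len(ctx) || ctx || message)

The one-byte length limits ctx to at most 255 bytes.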

Use cases:

  • Separate signatures for different protocol versions
  • Distinguish testnet vs. mainnet in blockchain
  • Separate code signing from document signing under the same key
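
A minimal sketch of how a context string changes the hashed input, following the framing FIPS 204 uses for pure ML-DSA (a zero byte, a one-byte context length, the context, then the message). The function names, the placeholder tr, and the example context strings are assumptions made for illustration.

```python
import hashlib

def frame_message(message: bytes, ctx: bytes = b"") -> bytes:
    """Build 0x00 || len(ctx) || ctx || message for pure ML-DSA."""
    if len(ctx) > 255:
        raise ValueError("context string must be at most 255 bytes")
    return bytes([0x00, len(ctx)]) + ctx + message

def mu_with_context(tr: bytes, message: bytes, ctx: bytes = b"") -> bytes:
    """mu = SHAKE-256(tr || framed message), 64 bytes."""
    return hashlib.shake_256(tr + frame_message(message, ctx)).digest(64)

# Placeholder tr: not derived from a real ML-DSA public key.
tr = hashlib.shake_256(b"placeholder public key").digest(64)
doc = b"release-1.2.3.tar.gz"

mu_code = mu_with_context(tr, doc, b"code-signing")
mu_docs = mu_with_context(tr, doc, b"document-signing")
print(mu_code != mu_docs)  # True: same key and message, separate domains
```

The length byte is what keeps the encoding unambiguous: without it, ctx = "ab" with message "c" and ctx = "a" with message "bc" would hash the same bytes.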

Why SHAKE-256 (Not SHA-256)?

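Feature | SHA-256 | SHAKE-256
Output size | Fixed (256 bits) | Variable (any length)
Collision resistance (classical) | 128-bit | 256-bit with 512-bit output
Preimage resistance (classical) | 256-bit | 256-bit with 512-bit output
Quantum preimage search (Grover) | About 2^128 work | At least about 2^128 work
Standard | SHA-2 (2001) | SHA-3 (2015)
Role in ML-DSA | Optional pre-hash in HashML-DSA only | Used internally throughout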
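
The "variable output" row is easy to see with Python's hashlib. This is only a demonstration of the extendable-output behaviour; the sample input is arbitrary.

```python
import hashlib

data = b"hash then sign"

sha256_digest = hashlib.sha256(data).digest()    # always 32 bytes
shake_32 = hashlib.shake_256(data).digest(32)    # caller asks for 32 bytes
shake_64 = hashlib.shake_256(data).digest(64)    # or for 64 bytes

print(len(sha256_digest), len(shake_32), len(shake_64))  # 32 32 64

# An XOF produces one output stream; a shorter request is a prefix of a longer one.
print(shake_64[:32] == shake_32)  # True
```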

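ML-DSA is built on SHAKE rather than SHA-2. NIST does not require SHA-3 for every PQC algorithm (SLH-DSA in FIPS 205 also defines SHA-2 parameter sets), but SHAKE suits ML-DSA for several reasons:

  • SHAKE is an extendable-output function, so the same primitive serves for hashing, seed expansion, and sampling at whatever output length each step needs
  • SHA-2 uses the Merkle-Damgård construction, the same family as MD5 and SHA-1, while SHA-3 (Keccak) uses a completely different design (sponge construction)
  • A breakthrough against one construction is therefore unlikely to carry over to the other (no such attack on SHA-2 is known)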

Collision Resistance Requirements

For a signature scheme to be secure, the hash function must be:

Property | Required strength | What it prevents
Preimage resistance | 2^256 | Attacker creates a message with a specific hash
Second preimage | 2^256 | Attacker modifies a signed message while keeping the same signature valid
Collision resistance | 2^256 | Attacker finds two messages with the same hash (forges signatures)

SHAKE-256 with 512-bit output provides all three at the 256-bit level.
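
The 256-bit figures follow from the generic bounds that FIPS 202 states for SHAKE256 with a d-bit output, against classical attacks. A short worked calculation for d = 512, with d = 256 shown for contrast:

```latex
\begin{align*}
\text{collision strength}(d) &= \min(d/2,\ 256) \\
\text{(second) preimage strength}(d) &= \min(d,\ 256) \\[4pt]
d = 512:\quad \min(512/2,\ 256) &= 256, \qquad \min(512,\ 256) = 256 \\
d = 256:\quad \min(256/2,\ 256) &= 128, \qquad \min(256,\ 256) = 256
\end{align*}
```

With a 256-bit output the collision strength would drop to 128 bits, which is why the digest is 512 bits long.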

Resources

  • NIST FIPS 202: SHA-3 Standard (Permutation-Based Hash and Extendable-Output Functions)
  • NIST FIPS 204, Section 3: Hash Functions
  • RFC 9180: Hybrid Public Key Encryption (HPKE) — context string usage