08 — Deployment Checklist
Phase 1: Inventory (Weeks 1–4)
Identify All Key Exchange Usage
| System | Protocol | Current algorithm | Priority |
|---|---|---|---|
| Public websites | TLS 1.3 | ECDH / RSA | High |
| APIs | TLS 1.2/1.3 | ECDH | High |
| VPN concentrators | IPsec / WireGuard | ECDH | High |
| Internal services | mTLS | ECDH | Medium |
| File encryption | PGP / S/MIME | RSA / ECDH | Medium |
| IoT devices | Custom TLS | ECDH / RSA | Low |
| Legacy mainframes | TLS 1.0–1.2 | RSA | Low |
Tools:
- Network scanner: nmap with SSL scripts, sslyze, testssl.sh
- Code scanner: grep for crypto libraries (OpenSSL, BoringSSL, libgcrypt)
- Certificate inventory: Parse CT logs, internal CA databases
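The code-scanner step can be approximated with a few lines of Python when plain grep output is hard to aggregate. This is a minimal sketch: the regular expressions, the file suffix list, and the function name `scan_tree` are illustrative assumptions, not a complete detector, and should be extended for the libraries and languages actually in use.

```python
import re
from pathlib import Path

# Hypothetical patterns: tune these for your own code base.
CRYPTO_PATTERNS = {
    "OpenSSL/BoringSSL": re.compile(r"#include\s*<openssl/|\bEVP_PKEY_|\bSSL_CTX_"),
    "libgcrypt": re.compile(r"#include\s*<gcrypt\.h>|\bgcry_"),
    "classical key exchange": re.compile(r"\b(ECDH|X25519|secp256r1|prime256v1|RSA)\b"),
}
# Hypothetical list of file types worth scanning.
SOURCE_SUFFIXES = {".c", ".h", ".cc", ".cpp", ".go", ".py", ".java", ".conf", ".cnf"}


def scan_tree(root):
    """Return a sorted list of (relative path, line number, label) hits."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in CRYPTO_PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path.relative_to(root)), lineno, label))
    return sorted(hits)
```

Calling `scan_tree("path/to/repo")` gives a list that can be merged into the inventory table above, one row per system that owns the matching files.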
Classify by Sensitivity
| Data lifetime | Example | Action |
|---|---|---|
| < 1 year | Session cookies, temporary tokens | Standard priority |
| 1–10 years | Healthcare records, financial audits | High priority |
| > 10 years | Military secrets, infrastructure designs | Critical priority |
| Indefinite | Legal contracts, government archives | Critical priority + hybrid mode |
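The classification above is easy to encode so that every inventory row gets a consistent label. The sketch below assumes the lifetime is expressed in years and that `math.inf` stands for "indefinite"; the function name `classify_priority` is hypothetical.

```python
import math


def classify_priority(lifetime_years):
    """Map how long data must stay confidential to a migration priority.

    Pass math.inf for data with an indefinite lifetime.
    """
    if lifetime_years < 0:
        raise ValueError("lifetime must be non-negative")
    if math.isinf(lifetime_years):
        return "Critical priority + hybrid mode"
    if lifetime_years > 10:
        return "Critical priority"
    if lifetime_years >= 1:
        return "High priority"
    return "Standard priority"
```

For example, `classify_priority(7)` falls in the 1–10 year band and `classify_priority(math.inf)` selects the hybrid-mode row.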
Phase 2: Pilot (Weeks 5–12)
Select Pilot Systems
Choose 2–3 systems for initial deployment:
- One internal service (low risk, controlled environment)
- One public-facing API (moderate risk, monitorable)
- One non-critical website (public but replaceable)
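TLS 1.3 configuration (OpenSSL 3.5+, which supports ML-KEM natively; earlier 3.x releases need an external provider such as oqs-provider, whose group names may differ):
groups = X25519:MLKEM768
Or explicit hybrid:
groups = X25519MLKEM768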
Monitoring:
- Handshake success rate (should be 100%)
- Handshake latency (should increase by less than 1 ms)
- Client compatibility (watch for old browsers/devices)
- Error logs (any "unknown group" or "handshake failure")
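The error-log check in the list above can be automated with a small counter. This is a sketch only: it assumes the server writes the phrases "unknown group" and "handshake failure" as plain text in its logs, which depends on the TLS stack and log format, and the function name is hypothetical.

```python
import re
from collections import Counter

# Phrases taken from the monitoring list; actual log wording may differ.
ERROR_PATTERNS = {
    "unknown group": re.compile(r"unknown group", re.IGNORECASE),
    "handshake failure": re.compile(r"handshake failure", re.IGNORECASE),
}


def count_handshake_errors(log_lines):
    """Count log lines that mention each watched error phrase."""
    counts = Counter({label: 0 for label in ERROR_PATTERNS})
    for line in log_lines:
        for label, pattern in ERROR_PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return dict(counts)
```

Feeding it an open log file (`count_handshake_errors(open(path))`) yields per-phrase totals that can be plotted next to the handshake success rate.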
Test Vectors
Verify your implementation against the NIST known-answer test vectors for ML-KEM (FIPS 203) before widening the pilot.
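A known-answer test harness for this step might look like the sketch below. It assumes the vectors have already been loaded into dictionaries with hex-encoded fields named `ek`, `m`, `c`, `k` and an identifier `tcId`; those field names are an assumption and should be matched to the vector files you actually download. The `toy_encaps` function is NOT ML-KEM. It is a hash-based stand-in so the harness can be exercised; replace it with the deterministic encapsulation entry point of the library under test.

```python
import hashlib
import hmac


def run_encaps_kats(vectors, encaps):
    """Check a deterministic encapsulation function against known answers.

    vectors: iterable of dicts with hex fields "ek", "m", "c", "k" and a "tcId"
             (hypothetical field names; adapt to your vector format).
    encaps:  callable (ek: bytes, m: bytes) -> (ciphertext, shared_secret).
    Returns the list of tcId values whose output did not match.
    """
    failures = []
    for vec in vectors:
        ek = bytes.fromhex(vec["ek"])
        m = bytes.fromhex(vec["m"])
        c, k = encaps(ek, m)
        matches = hmac.compare_digest(c, bytes.fromhex(vec["c"])) and \
            hmac.compare_digest(k, bytes.fromhex(vec["k"]))
        if not matches:
            failures.append(vec["tcId"])
    return failures


def toy_encaps(ek, m):
    """Stand-in for a real implementation. This is NOT ML-KEM."""
    c = hashlib.sha3_256(b"c" + ek + m).digest()
    k = hashlib.sha3_256(b"k" + ek + m).digest()
    return c, k
```

An empty result from `run_encaps_kats` means every vector matched; any returned identifier points at a test case to investigate before the pilot goes further.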
Phase 3: Production Rollout (Weeks 13–26)
Staged Deployment
| Stage | Scope | Timeline |
|---|---|---|
| 1 | Internal APIs and tools | Week 13–14 |
| 2 | Public APIs (non-critical) | Week 15–17 |
| 3 | Public websites (hybrid) | Week 18–20 |
| 4 | VPN and remote access | Week 21–23 |
| 5 | Critical infrastructure (hybrid mandatory) | Week 24–26 |
Rollback Plan
For each deployment stage:
- Feature flag: Enable/disable ML-KEM via configuration
- Canary: 5% → 25% → 100% traffic over 3 days
- Metrics dashboard: Handshake failures, latency p99, error rates
- Rollback trigger: > 0.1% handshake failure rate
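The canary steps and the rollback trigger above combine into a simple decision rule. The sketch below is one possible shape for it, assuming handshake and failure counts are collected per canary step; the function name and the convention that `0` means "disable the feature flag" are hypothetical.

```python
CANARY_STEPS = [5, 25, 100]  # percent of traffic, as in the rollback plan


def next_traffic_share(current_percent, handshakes, failures):
    """Return the next traffic percentage, or 0 to roll back.

    Rolls back when the handshake failure rate exceeds 0.1%.
    Integer arithmetic avoids floating-point edge cases at the threshold.
    """
    if current_percent not in CANARY_STEPS:
        raise ValueError(f"unexpected canary step: {current_percent}")
    if failures * 1000 > handshakes:  # failures / handshakes > 0.1%
        return 0
    idx = CANARY_STEPS.index(current_percent)
    return CANARY_STEPS[min(idx + 1, len(CANARY_STEPS) - 1)]
```

With 10,000 handshakes at the 5% step, up to 10 failures still advances to 25%, while 11 failures returns `0` and signals a rollback.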
Phase 4: Pure PQC (Future)
When to Remove Classical Algorithms
| Trigger | Action |
|---|---|
| NIST/NSA guidance update | Follow agency timeline |
| Quantum computer factoring RSA-2048 | Emergency: disable RSA and classical-only ECDH immediately |
| ML-KEM withstands 10+ years of cryptanalysis | Confident in pure PQC |
| Client ecosystem ready | > 95% of clients support ML-KEM |
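The client-readiness trigger can be measured from handshake telemetry. This sketch assumes each observed ClientHello is recorded as a list of offered group names using the IANA-style identifiers shown in `ML_KEM_GROUPS`; how your logs spell these names is an assumption to verify, and both function names are hypothetical.

```python
# Group names as registered for TLS; log spellings may differ (assumption).
ML_KEM_GROUPS = {
    "X25519MLKEM768", "SecP256r1MLKEM768", "SecP384r1MLKEM1024",
    "MLKEM512", "MLKEM768", "MLKEM1024",
}
READINESS_THRESHOLD = 0.95  # "> 95% of clients support ML-KEM"


def ml_kem_support_share(client_hellos):
    """Fraction of ClientHellos that offer at least one ML-KEM group."""
    total = supported = 0
    for groups in client_hellos:
        total += 1
        if ML_KEM_GROUPS.intersection(groups):
            supported += 1
    return supported / total if total else 0.0


def ecosystem_ready(client_hellos):
    """True when the support share exceeds the readiness threshold."""
    return ml_kem_support_share(client_hellos) > READINESS_THRESHOLD
```

Counting per ClientHello weights busy clients more heavily; counting per distinct client fingerprint is an equally reasonable choice if that data is available.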
Monitoring Metrics
| Metric | Target | Alert threshold |
|---|---|---|
| Handshake success rate | > 99.9% | < 99.5% |
| P99 latency increase | < 2 ms | > 5 ms |
| Key exchange failures | 0 | Any increase |
| Client fallback rate (to classical) | < 1% | > 5% |
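The alert thresholds in the table translate directly into a check that can run against each metrics snapshot. This is a sketch under the assumption that rates are reported in percent and that the previous key exchange failure count is available for comparison; the `Snapshot` fields and the `evaluate_alerts` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    handshake_success_rate: float        # percent
    p99_latency_increase_ms: float       # versus the classical baseline
    key_exchange_failures: int           # count in the current window
    previous_key_exchange_failures: int  # count in the previous window
    client_fallback_rate: float          # percent falling back to classical


def evaluate_alerts(s):
    """Return the names of metrics that crossed their alert threshold."""
    alerts = []
    if s.handshake_success_rate < 99.5:
        alerts.append("handshake success rate")
    if s.p99_latency_increase_ms > 5:
        alerts.append("p99 latency increase")
    if s.key_exchange_failures > s.previous_key_exchange_failures:
        alerts.append("key exchange failures")
    if s.client_fallback_rate > 5:
        alerts.append("client fallback rate")
    return alerts
```

Values between the target and the alert threshold (for example a 3 ms latency increase) raise no alert here; a separate warning tier could cover that band if needed.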
Common Pitfalls
| Pitfall | Why It Happens | Fix |
|---|---|---|
| Hardcoded key sizes | Old code assumes 256-byte keys | Use dynamic sizing based on algorithm |
| Certificate chain issues | Intermediate CAs don't support PQC | Update CA infrastructure before leaf certs |
| Old client rejection | Browsers without ML-KEM | Keep hybrid mode; fall back gracefully |
| Performance regression | Unoptimised implementation | Use AVX2/NEON optimisations; use liboqs as a reference |
| Side-channel leaks | Non-constant-time sampling | Audit with dudect; use constant-time code |
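For the hardcoded key size pitfall, one approach is to look sizes up per algorithm instead of embedding a constant. The sketch below uses the byte sizes from FIPS 203 for ML-KEM and treats X25519 as a 32-byte key share in each direction; the `KemSizes` type and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KemSizes:
    public_key: int     # bytes sent by the client (encapsulation key)
    ciphertext: int     # bytes sent back by the server
    shared_secret: int  # bytes of derived secret


KEM_SIZES = {
    "X25519": KemSizes(32, 32, 32),
    "ML-KEM-512": KemSizes(800, 768, 32),
    "ML-KEM-768": KemSizes(1184, 1088, 32),
    "ML-KEM-1024": KemSizes(1568, 1568, 32),
}


def hybrid_sizes(classical, post_quantum):
    """Sizes for a hybrid exchange that concatenates both key shares."""
    a, b = KEM_SIZES[classical], KEM_SIZES[post_quantum]
    return KemSizes(
        a.public_key + b.public_key,
        a.ciphertext + b.ciphertext,
        a.shared_secret + b.shared_secret,
    )


def allocate_key_share_buffer(algorithm):
    """Allocate a buffer sized for the chosen algorithm's public key."""
    return bytearray(KEM_SIZES[algorithm].public_key)
```

`hybrid_sizes("X25519", "ML-KEM-768")` gives a 1,216-byte client share and a 1,120-byte server share, far larger than a fixed 256-byte buffer would hold.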
Resources