365 Architect

08 — Deployment Checklist

Phase 1: Inventory (Weeks 1–4)

Identify All Key Exchange Usage

System              Protocol            Current algorithm                     Priority
Public websites     TLS 1.3             ECDH / RSA (TLS 1.2 fallback only)    High
APIs                TLS 1.2/1.3         ECDH                                  High
VPN concentrators   IPsec / WireGuard   ECDH                                  High
Internal services   mTLS                ECDH                                  Medium
File encryption     PGP / S/MIME        RSA / ECDH                            Medium
IoT devices         Custom TLS          ECDH / RSA                            Low
Legacy mainframes   TLS 1.0–1.2         RSA                                   Low

Tools:

  • Network scanner: nmap with SSL scripts, sslyze, testssl.sh
  • Code scanner: grep for crypto libraries (OpenSSL, BoringSSL, libgcrypt)
  • Certificate inventory: Parse CT logs, internal CA databases
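
The code-scanner step can be sketched in a few lines of Python. This is a minimal illustration, not a complete scanner: the `CRYPTO_PATTERNS` table and the `scan_tree` helper are hypothetical names, and the patterns only cover C/C++ sources that include OpenSSL-style or libgcrypt headers (BoringSSL shares the `<openssl/...>` include path).

```python
import re
from pathlib import Path

# Hypothetical patterns; extend them for the languages and libraries you use.
CRYPTO_PATTERNS = {
    "OpenSSL/BoringSSL": re.compile(r"#\s*include\s*<openssl/"),
    "libgcrypt": re.compile(r"#\s*include\s*<gcrypt\.h>"),
    "RSA/ECDH API call": re.compile(r"\b(RSA_|ECDH_|EVP_PKEY_derive)"),
}


def scan_tree(root, suffixes=(".c", ".h", ".cc", ".cpp")):
    """Return (path, line number, label) for every source line matching a pattern."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files that are not C/C++ sources.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in CRYPTO_PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), lineno, label))
    return hits
```

Calling `scan_tree("src/")` returns a list that can be merged into the inventory table; each hit still needs manual review to confirm whether it is a key exchange.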

Classify by Sensitivity

Data lifetime   Example                                    Action
< 1 year        Session cookies, temporary tokens          Standard priority
1–10 years      Healthcare records, financial audits       High priority
> 10 years      Military secrets, infrastructure designs   Critical priority
Indefinite      Legal contracts, government archives       Critical priority + hybrid mode
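
The table above translates directly into a lookup that can be applied to every row of the inventory. The sketch below assumes the lifetime is expressed in years and that `None` stands for "Indefinite"; the function name `classify` is illustrative.

```python
from typing import Optional


def classify(lifetime_years: Optional[float]) -> str:
    """Map how long data must stay confidential to a migration priority.

    None means the data has no expiry (the "Indefinite" row).
    """
    if lifetime_years is None:
        return "Critical priority + hybrid mode"
    if lifetime_years < 1:
        return "Standard priority"
    if lifetime_years <= 10:
        return "High priority"
    return "Critical priority"
```

For example, `classify(7)` returns "High priority" and `classify(None)` returns "Critical priority + hybrid mode".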

Phase 2: Pilot (Weeks 5–12)

Select Pilot Systems

Choose 2–3 systems for initial deployment:

  1. One internal service (low risk, controlled environment)
  2. One public-facing API (moderate risk, monitorable)
  3. One non-critical website (public but replaceable)

Configure Hybrid Mode

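TLS 1.3 configuration (OpenSSL 3.5+ supports ML-KEM natively; earlier 3.x releases need the oqs-provider, whose group names may differ):
  Groups = X25519MLKEM768:X25519

The colon separates entries in a preference list, so this offers the hybrid group first and falls back to classical X25519 for clients that lack it.

Or hybrid only, with no classical fallback:
  Groups = X25519MLKEM768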

Monitoring:

  • Handshake success rate (should match the pre-pilot baseline, ideally 100%)
  • Handshake latency (should increase by less than 1 ms)
  • Client compatibility (watch for old browsers and devices)
  • Error logs (any "unknown group" or "handshake failure" entries)
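
As a rough sketch of the error-log check, the function below counts log lines that contain either of the two messages named above. The log format is an assumption: real TLS stacks word these errors differently, so the markers and the `count_handshake_errors` name are illustrative.

```python
# The two substrings come from the checklist; real log formats vary by TLS stack.
ERROR_MARKERS = ("unknown group", "handshake failure")


def count_handshake_errors(log_lines):
    """Count log lines containing each marker, case-insensitively."""
    counts = {marker: 0 for marker in ERROR_MARKERS}
    for line in log_lines:
        lowered = line.lower()
        for marker in ERROR_MARKERS:
            if marker in lowered:
                counts[marker] += 1
    return counts
```

Feeding it an open log file (`count_handshake_errors(open(path))`) gives per-marker totals that can be compared before and after enabling the hybrid group.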

Test Vectors

Verify your implementation against NIST test vectors:

Phase 3: Production Rollout (Weeks 13–26)

Staged Deployment

Stage   Scope                                        Timeline
1       Internal APIs and tools                      Weeks 13–14
2       Public APIs (non-critical)                   Weeks 15–17
3       Public websites (hybrid)                     Weeks 18–20
4       VPN and remote access                        Weeks 21–23
5       Critical infrastructure (hybrid mandatory)   Weeks 24–26

Rollback Plan

For each deployment stage:

  1. Feature flag: Enable/disable ML-KEM via configuration
  2. Canary: 5% → 25% → 100% traffic over 3 days
  3. Metrics dashboard: Handshake failures, latency p99, error rates
  4. Rollback trigger: > 0.1% handshake failure rate
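
The canary progression and rollback trigger above can be combined into one decision function. This is a sketch under stated assumptions: the percentages and the 0.1% threshold come from the list, while the function name, the "return 0 to roll back" convention, and holding the stage when there is no traffic are choices made for illustration.

```python
CANARY_STAGES = [5, 25, 100]   # percent of traffic, from the checklist
ROLLBACK_THRESHOLD = 0.001     # 0.1% handshake failure rate


def next_traffic_share(current_percent, failures, handshakes):
    """Return the next canary percentage, or 0 to signal a rollback."""
    if handshakes == 0:
        # No data yet: hold the current stage rather than guess.
        return current_percent
    if failures / handshakes > ROLLBACK_THRESHOLD:
        # Trigger exceeded: disable ML-KEM via the feature flag.
        return 0
    later = [p for p in CANARY_STAGES if p > current_percent]
    return later[0] if later else current_percent
```

A deployment job would call this once per evaluation window, for example once a day to match the three-day ramp, and feed the result to the feature flag.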

Phase 4: Pure PQC (Future)

When to Remove Classical Algorithms

Trigger                                       Action
NIST/NSA guidance update                      Follow the agency timeline
Quantum computer factors RSA-2048             Emergency: disable RSA and classical-only key exchange immediately
ML-KEM survives 10+ years of cryptanalysis    Pure PQC can be adopted with confidence
> 95% of clients support ML-KEM               Client ecosystem is ready for pure PQC

Monitoring Metrics

Metric                                Target    Alert threshold
Handshake success rate                > 99.9%   < 99.5%
P99 latency increase                  < 2 ms    > 5 ms
Key exchange failures                 0         Any increase
Client fallback rate (to classical)   < 1%      > 5%
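
The alert thresholds in this table can be checked mechanically. The sketch below assumes rates are passed as fractions and that key exchange failures are reported as a change since the previous interval; the function and metric names are hypothetical, not taken from any monitoring product.

```python
def evaluate_metrics(success_rate, p99_increase_ms, kex_failure_delta, fallback_rate):
    """Return the names of metrics that breach their alert thresholds.

    Rates are fractions (0.999 means 99.9%); kex_failure_delta is the change
    in key exchange failures since the previous interval.
    """
    alerts = []
    if success_rate < 0.995:
        alerts.append("handshake_success_rate")
    if p99_increase_ms > 5:
        alerts.append("p99_latency_increase")
    if kex_failure_delta > 0:
        alerts.append("key_exchange_failures")
    if fallback_rate > 0.05:
        alerts.append("client_fallback_rate")
    return alerts
```

Values between the target and the alert threshold, such as a 99.7% success rate, return no alert here; they are worth watching but do not page anyone.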

Common Pitfalls

Pitfall                    Why It Happens                        Fix
Hardcoded key sizes        Old code assumes 256-byte keys        Use dynamic sizing based on algorithm
Certificate chain issues   Intermediate CAs don't support PQC    Update CA infrastructure before leaf certs
Old client rejection       Browsers without ML-KEM               Keep hybrid mode; fall back gracefully
Performance regression     Unoptimised implementation            Use AVX2/NEON-optimised code; see liboqs for reference
Side-channel leaks         Non-constant-time sampling            Audit with dudect; use constant-time code
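
To illustrate the "dynamic sizing" fix for hardcoded key sizes, the sketch below looks buffer sizes up by algorithm name instead of assuming a constant. The ML-KEM sizes are those specified in FIPS 203 and the X25519 size is from RFC 7748; the table layout and helper names are illustrative. X25519 has no ciphertext, so the peer's 32-byte public key is treated as playing that role here.

```python
# Sizes in bytes. ML-KEM values are from FIPS 203; X25519 from RFC 7748.
# For X25519 the "ciphertext" entry is the peer's public key (an analogy).
KEM_SIZES = {
    "X25519":      {"public_key": 32,   "ciphertext": 32,   "shared_secret": 32},
    "ML-KEM-512":  {"public_key": 800,  "ciphertext": 768,  "shared_secret": 32},
    "ML-KEM-768":  {"public_key": 1184, "ciphertext": 1088, "shared_secret": 32},
    "ML-KEM-1024": {"public_key": 1568, "ciphertext": 1568, "shared_secret": 32},
}


def key_share_size(algorithms, field="public_key"):
    """Total bytes needed when the listed algorithms are concatenated (hybrid)."""
    return sum(KEM_SIZES[name][field] for name in algorithms)


def allocate_buffer(algorithms, field="public_key"):
    """Allocate a zeroed buffer sized for the chosen algorithm combination."""
    return bytearray(key_share_size(algorithms, field))
```

For a hybrid of ML-KEM-768 and X25519, `key_share_size(["ML-KEM-768", "X25519"])` gives 1216 bytes, far beyond the 256 bytes that RSA-2048-era code tends to assume.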

Resources
