Fix Plan

Phase 1: Baseline Measurement

[1.1] Create cold launch benchmark script
- Files: bench/cold.sh, bench/common.sh
- Acceptance: Runs full launch, reports timing breakdown
[1.2] Run cold baseline and document results
- Run 3 iterations, record typical timing
- Document: Where does the time actually go?
- Dependencies: AWS setup complete
- Result: 25.8s avg. Breakdown: API 19% (~4.8s), Pending→Running 69% (17.8s), Boot→SSH 12% (3.2s)
- Key Finding: AWS infra time (Pending→Running) dominates; boot optimization has limited impact
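
The three phases in the breakdown can be timed with a small polling loop. What follows is a minimal sketch, not the contents of bench/cold.sh. It assumes GNU date (for millisecond timestamps), a configured AWS CLI, and that the phase boundaries are "API call returned", "state became running", and "SSH login succeeded". All function names are hypothetical. It polls describe-instances directly because the built-in "aws ec2 wait instance-running" waiter only checks every 15 seconds, which is too coarse for this measurement.

```bash
#!/usr/bin/env bash
# Sketch of phase timing for a cold launch. Function names are illustrative.

# Current time in milliseconds (assumes GNU date; %N is unavailable on macOS/BSD).
now_ms() { date +%s%3N; }

# Poll until the instance reports the "running" state.
wait_for_running() {
  local id="$1" state
  while :; do
    state=$(aws ec2 describe-instances --instance-ids "$id" \
      --query 'Reservations[0].Instances[0].State.Name' --output text 2>/dev/null)
    [ "$state" = "running" ] && return 0
    sleep 0.2
  done
}

# Poll until an SSH login succeeds (assumption: "SSH ready" means a working login).
wait_for_ssh() {
  local target="$1"   # e.g. user@host
  until ssh -o BatchMode=yes -o ConnectTimeout=1 \
            -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
            "$target" true 2>/dev/null; do
    sleep 0.2
  done
}

# Print the breakdown from four millisecond timestamps:
# t0 = before run-instances, t1 = API call returned,
# t2 = state became running, t3 = SSH login succeeded.
report_breakdown() {
  awk -v t0="$1" -v t1="$2" -v t2="$3" -v t3="$4" 'BEGIN {
    total = t3 - t0
    api   = 100 * (t1 - t0) / total
    infra = 100 * (t2 - t1) / total
    boot  = 100 * (t3 - t2) / total
    printf "Total %.1fs. API %.0f%%, Pending→Running %.0f%%, Boot→SSH %.0f%%\n", total / 1000, api, infra, boot
  }'
}

# Usage (needs AWS credentials; AMI_ID, KEY_NAME and the login user are placeholders):
#   t0=$(now_ms)
#   id=$(aws ec2 run-instances --image-id "$AMI_ID" --instance-type m7i.large \
#        --key-name "$KEY_NAME" --query 'Instances[0].InstanceId' --output text)
#   t1=$(now_ms); wait_for_running "$id"; t2=$(now_ms)
#   ip=$(aws ec2 describe-instances --instance-ids "$id" \
#        --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)
#   wait_for_ssh "ec2-user@$ip"; t3=$(now_ms)
#   report_breakdown "$t0" "$t1" "$t2" "$t3"
```

Keeping the arithmetic in report_breakdown separate from the polling makes it possible to re-run the breakdown on recorded timestamps without launching an instance.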

Phase 2: Minimal AMI

[2.1] Create minimal AMI baking script
- Files: scripts/bake-minimal-ami.sh
- Strip: cloud-init, ssm-agent, unnecessary packages
- Keep: sshd, basic networking
[2.2] Create minimal-ami benchmark script
- Files: bench/minimal-ami.sh
- Compare timing to cold baseline
- Result: 28.8s - SLOWER than the 25.8s baseline (+12%)!
- Key Finding: Cloud-init actually helps boot faster (it appears to coordinate service startup)
- Boot→SSH went from 3.2s to 9.0s without cloud-init
- Recommendation: Don't strip cloud-init
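
One way to check where the extra Boot→SSH time goes is to compare per-unit startup times on the stock and minimal AMIs. This is a suggested diagnostic, not a step from the plan. It assumes both images boot with systemd, so systemd-analyze is available. The helper name slowest_units is hypothetical.

```bash
# Print the names of the N slowest units from `systemd-analyze blame` output on stdin.
# blame already sorts slowest-first and puts the unit name in the last column.
slowest_units() {
  local n="${1:-5}"
  awk 'NF { print $NF }' | head -n "$n"
}

# Run on each instance after boot and compare the two images:
#   systemd-analyze                                # overall kernel/userspace boot time
#   systemd-analyze blame | slowest_units 5
#   systemd-analyze critical-chain sshd.service    # unit may be ssh.service on Debian/Ubuntu
```

If sshd sits behind a long wait in the critical chain on the minimal image only, that would support the reading that cloud-init was ordering startup usefully.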

Phase 3: AWS Infrastructure Time

Based on baseline results, Pending→Running accounts for 69% of total time (~18s). This phase tests hypotheses to understand and reduce that AWS infrastructure time.

Phase 4: Custom Init

STATUS: DEPRIORITIZED - Boot optimization has minimal impact given the baseline and minimal-ami findings.

[4.1] Create init-agent (simple Go or bash)
- Fetches pubkey from userdata or IMDS
- Writes to authorized_keys
- Starts sshd
- Files: init-agent/ or inline in bake script
- Note: Minimal-ami showed cloud-init already helps; custom init unlikely to beat it
[4.2] Create custom-init AMI baking script
[4.3] Create custom-init benchmark script
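
The steps listed for [4.1] could be sketched in bash roughly as follows. This is a minimal sketch under assumptions, not the planned init-agent. It uses the IMDSv2 token flow and the standard public-keys metadata path. The function names, the target home directory, and the sshd path are illustrative. It also assumes SSH host keys already exist, for example generated at bake time.

```bash
#!/usr/bin/env bash
# Sketch of a minimal init-agent. Function names and paths are illustrative.

IMDS="http://169.254.169.254/latest"

# Fetch the launch key pair's public key via IMDSv2 (token first, then the key).
fetch_pubkey() {
  local token
  token=$(curl -sf --max-time 2 -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -sf --max-time 2 -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS/meta-data/public-keys/0/openssh-key"
}

# Append a key to <home>/.ssh/authorized_keys with the permissions sshd expects.
# For a non-root user the files would also need a chown to that user.
install_key() {
  local key="$1" home="$2"
  mkdir -p "$home/.ssh" || return 1
  chmod 700 "$home/.ssh" || return 1
  # Skip the append if this exact key is already present.
  grep -qxF "$key" "$home/.ssh/authorized_keys" 2>/dev/null \
    || printf '%s\n' "$key" >> "$home/.ssh/authorized_keys"
  chmod 600 "$home/.ssh/authorized_keys"
}

# Entry point: fetch the key, install it, then run sshd in the foreground.
# Not invoked here; an init script would call `run_init_agent` explicitly.
run_init_agent() {
  local key
  key=$(fetch_pubkey) || { echo "init-agent: no public key from IMDS" >&2; return 1; }
  install_key "$key" "${TARGET_HOME:-/root}" || return 1
  exec /usr/sbin/sshd -D
}
```

Reading the key from userdata instead would only change fetch_pubkey (the path becomes $IMDS/user-data); the rest stays the same.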

Technique	Total	Pending→Running	Boot→SSH	vs Cold
Cold (baseline)	25.8s	17.8s	3.2s	-
Start-stopped	21.2s	16.4s	3.0s	-18%
Hibernate	20.7s	16.3s	2.4s	-20%
Minimal-ami	28.8s	16.4s	9.0s	+12%
Warm pool (SSH)	1.9s	N/A	N/A	-92%
Warm pool (EIC)	3.3s	N/A	N/A	-87%
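
The "vs Cold" column is the relative change in total time against the cold baseline. A small sketch of that arithmetic, assuming awk is available (the helper name vs_cold is hypothetical):

```bash
# Relative change of a technique's total time against the cold baseline,
# printed as a signed, rounded percentage.
vs_cold() {
  awk -v base="$1" -v total="$2" 'BEGIN { printf "%+.0f%%\n", 100 * (total - base) / base }'
}

vs_cold 25.8 21.2   # Start-stopped → -18%
vs_cold 25.8 28.8   # Minimal-ami   → +12%
```

Because the table values are rounded to 0.1s, recomputing a row from them can land one point off the listed percentage.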

Key Findings

Pending→Running (69% of a cold launch) is unavoidable without warm pools
- Pending→Running takes ~16-18s regardless of technique
- This points to VM/hypervisor cold-start rather than resource allocation
- Pre-attached EBS/ENI doesn't help significantly
Boot optimization has diminishing returns
- Boot→SSH is only 3.2s (12% of total)
- Cloud-init actually helps: stripping it raised Boot→SSH from 3.2s to 9.0s
- Hibernate saves only 0.8s of Boot→SSH vs cold boot (2.4s vs 3.2s)
Warm pools are the only tested technique that gets under 10s
- Pre-running instances: 1.9s (92% reduction)
- Cost: ~$0.10/hr per instance (m7i.large)
- Trade-off: Instance cost vs startup speed
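
The trade-off can be made concrete with a rough cost estimate. The sketch below uses the ~$0.10/hr figure above and assumes 730 hours per month (365 × 24 / 12) with instances kept running around the clock. The helper name is hypothetical, and EBS or data-transfer charges are not included.

```bash
# Approximate monthly cost (USD) of keeping N pre-running instances warm.
# Arguments: instance count, hourly rate (default 0.10), hours per month (default 730).
warm_pool_monthly_cost() {
  local instances="$1" hourly_rate="${2:-0.10}" hours="${3:-730}"
  awk -v n="$instances" -v r="$hourly_rate" -v h="$hours" 'BEGIN { printf "%.2f\n", n * r * h }'
}

warm_pool_monthly_cost 1   # → 73.00
warm_pool_monthly_cost 5   # → 365.00
```

Under these assumptions each warm instance costs roughly $73 per month in exchange for cutting launch time from about 26s to about 2s.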