24 changes: 12 additions & 12 deletions aws-lambda-managed-instances/POWER.md
@@ -24,7 +24,7 @@ If this fails, configure credentials via `aws configure` or set `AWS_PROFILE`.

### Step 2: Check regional availability

-Lambda Managed Instances is available in select regions. Verify availability:
+Currently available: us-east-1, us-east-2, us-west-2, ap-northeast-1, eu-west-1. Expanding to all commercial regions soon. Verify the latest availability:

- [Lambda Managed Instances documentation](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances.html)

@@ -46,7 +46,7 @@ Lambda Managed Instances is available in select regions. Verify availability:
| Cold starts | Unacceptable (LMI eliminates for provisioned capacity) | Tolerable or mitigated by SnapStart |
| Compute | Latest CPUs, specific families, high network bandwidth | Standard Lambda memory/CPU sufficient |
| Isolation | Dedicated EC2 instances in your account, full VPC control | Shared Firecracker micro-VMs acceptable |
-| Scale-to-zero | Not needed (min 3 instances always run) | Required (pay nothing when idle) |
+| Scale-to-zero | Not needed (execution environments always running) | Required (pay nothing when idle) |
| Code readiness | Thread-safe (Node.js/Java/.NET) or any Python code | Non-thread-safe code, expensive to change |

## Workflow
@@ -72,16 +72,16 @@ REQUIRED: Present a cost comparison before recommending LMI. Compare at minimum:
| Lambda on-demand | Low volume, bursty traffic |
| LMI on-demand | High volume, steady traffic |

-Rule of thumb: LMI becomes cost-competitive at 50-100M+ req/month with steady traffic.
+Rule of thumb: LMI becomes cost-competitive when your Lambda spend exceeds ~$1,000/month with steady traffic.

Use the [LMI Pricing Calculator](https://aws-samples.github.io/sample-aws-lambda-managed-instances/) for accurate comparisons.

### Step 3: Configure the Deployment

-- **Instance families** (400+ types, .large and up): C-series (compute), M-series (general), R-series (memory). ARM (Graviton) for best price-performance.
+- **Instance families** (~450 types): C-series (compute, .xlarge+), M-series (general, .large+), R-series (memory, .large+). ARM (Graviton) for best price-performance.
- **Memory-to-vCPU ratios**: 2:1 (compute), 4:1 (general, default), 8:1 (memory). Min 2 GB, max 32 GB.
- **Multi-concurrency defaults/vCPU**: Node.js 64, Java 32, .NET 32, Python 16.
-- **Scaling**: MinExecutionEnvironments (default 3), MaxVCpuCount (required), TargetResourceUtilization.
+- **Scaling**: MinExecutionEnvironments (default 3), MaxVCpuCount (default 400), TargetResourceUtilization.

See `configuration-guide.md` for decision trees and detailed tuning.
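
As a rough illustration of how the memory-to-vCPU ratios and per-vCPU concurrency defaults in the Step 3 bullets combine, the Python sketch below derives a vCPU count and a default per-environment concurrency from a memory setting. The helper name `estimate_environment_shape` is hypothetical, and the assumption that the default concurrency scales linearly with vCPU count is ours, not something the changed files state.

```python
# Per-vCPU multi-concurrency defaults quoted in the Step 3 bullets.
DEFAULT_CONCURRENCY_PER_VCPU = {"nodejs": 64, "java": 32, "dotnet": 32, "python": 16}

# Memory-to-vCPU ratios (GB of memory per vCPU) quoted in the Step 3 bullets.
MEMORY_PER_VCPU_GB = {"compute": 2, "general": 4, "memory": 8}


def estimate_environment_shape(memory_gb, ratio="general", runtime="python"):
    """Hypothetical helper: return (vcpus, default_concurrency) for one environment.

    Assumes the default concurrency is simply vCPUs x the per-vCPU default.
    """
    if not 2 <= memory_gb <= 32:
        raise ValueError("memory must be between 2 GB and 32 GB")
    vcpus = memory_gb / MEMORY_PER_VCPU_GB[ratio]
    concurrency = int(vcpus * DEFAULT_CONCURRENCY_PER_VCPU[runtime])
    return vcpus, concurrency
```

For example, `estimate_environment_shape(8, "general", "nodejs")` yields 2 vCPUs and a default concurrency of 128 under these assumptions. The sketch does not check whether a given memory and ratio combination is actually accepted by the service.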

@@ -98,7 +98,7 @@ See `thread-safety.md` for the review checklist and `migration-patterns.md` for
### Step 5: Set Up Infrastructure

1. Create two IAM roles: execution role (for the function) and operator role (for capacity provider EC2 management)
-2. Configure VPC with subnets across 3+ AZs
+2. Configure VPC with subnets across multiple AZs (recommended 3+ for resiliency)
3. Create capacity provider with VPC config and scaling limits
4. Create or update function with capacity provider attachment
5. Publish a version (triggers instance provisioning)
@@ -121,7 +121,7 @@ See `infrastructure-setup.md` for CLI commands and SAM templates.
- Use ARM (Graviton) unless x86 dependencies exist
- Let Lambda choose instance types unless specific hardware needed
- Set MaxVCpuCount to control cost ceiling
-- Never set MinExecutionEnvironments below 3 (breaks AZ resiliency)
+- Never set MinExecutionEnvironments below 3 in production (reduces multi-AZ coverage); non-prod can use 1

### Migration

@@ -130,11 +130,11 @@ See `infrastructure-setup.md` for CLI commands and SAM templates.
- Use weighted aliases for gradual traffic shift
- Include request IDs in all log statements
- Initialize DB pools and SDK clients outside the handler
+- Estimate total `/tmp` usage under max concurrency

### Operations

- Set CloudWatch alarms on throttle rate > 1% and CPU > 80%
-- Plan for 14-day instance rotation (automatic)
- Never manually terminate LMI EC2 instances (delete the capacity provider instead)
- Always publish a version — unpublished functions cannot run on LMI

@@ -143,12 +143,12 @@ See `infrastructure-setup.md` for CLI commands and SAM templates.
| Resource | Limit |
|----------|-------|
| Memory | 2 GB min, 32 GB max |
-| Instances | 3 minimum (AZ resiliency) |
-| Instance lifespan | 14 days (auto-replaced) |
| Concurrency/vCPU | 64 (Node.js), 32 (Java/.NET), 16 (Python) |
+| Instance lifespan | ~12 hours (auto-replaced by Lambda) |
+| EE lifespan | ~4 hours (auto-replaced by Lambda) |
| Runtimes | Node.js, Java, .NET, Python |
-| Instance families | C, M, R (.large and up) |
-| Scaling | Absorbs 50% spike; doubles within 5 min |
+| Instance families | C (.xlarge+), M (.large+), R (.large+) |
+| Scaling | Doubles within 5 min without throttles |

## Resources

4 changes: 2 additions & 2 deletions aws-lambda-managed-instances/steering/configuration-guide.md
@@ -44,9 +44,9 @@ Total capacity = MinExecutionEnvironments × PerExecutionEnvironmentMaxConcurren

| Control | Default | Guidance |
|---------|---------|----------|
-| MinExecutionEnvironments | 3 | Increase for baseline capacity; never below 3 |
+| MinExecutionEnvironments | 3 | Min 1 (non-prod); 3+ recommended for prod AZ coverage |
| MaxExecutionEnvironments | — | Set based on cost budget |
-| MaxVCpuCount | Required | Start at 30, adjust by load |
+| MaxVCpuCount | 400 | Set to control cost ceiling; adjust by load |
| TargetResourceUtilization | ~50% headroom | Raise for cost savings (less burst tolerance) |
| AllowedInstanceTypes | All | Restrict only for specific hardware needs |
| ExcludedInstanceTypes | None | Exclude expensive types in dev/test |
2 changes: 1 addition & 1 deletion aws-lambda-managed-instances/steering/cost-comparison.md
@@ -6,7 +6,7 @@ When building a cost comparison for a user, gather: region, runtime, requests/mo

## When LMI is NOT Cheaper

-- < 50M req/month (fixed 3-instance cost exceeds Lambda)
+- Lambda spend below ~$1,000/month (fixed minimum execution environment cost exceeds Lambda)
- Very short functions (< 100ms duration)
- Highly bursty, unpredictable traffic
- Workloads needing scale-to-zero
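
The "When LMI is NOT Cheaper" list can be turned into a quick screening check before a full cost comparison. The sketch below is only a minimal illustration: the function name and parameters are hypothetical, the thresholds are the rule-of-thumb figures from the updated list, and it is not a substitute for the pricing calculator.

```python
def reasons_lmi_may_not_be_cheaper(monthly_lambda_spend_usd, avg_duration_ms,
                                   bursty_traffic, needs_scale_to_zero):
    """Hypothetical screen: return the list items that apply to a workload."""
    reasons = []
    if monthly_lambda_spend_usd < 1000:   # ~$1,000/month rule of thumb
        reasons.append("Lambda spend below ~$1,000/month")
    if avg_duration_ms < 100:             # very short functions
        reasons.append("function duration under 100 ms")
    if bursty_traffic:
        reasons.append("highly bursty, unpredictable traffic")
    if needs_scale_to_zero:
        reasons.append("workload needs scale-to-zero")
    return reasons
```

An empty result only means none of the listed warning signs apply; it does not by itself show that LMI is cheaper.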
4 changes: 2 additions & 2 deletions aws-lambda-managed-instances/steering/infrastructure-setup.md
@@ -130,9 +130,9 @@ The `ec2:ManagedResourceOperator` condition ensures RunInstances/CreateTags only

LMI runs functions on EC2 instances inside the VPC. These instances need VPC endpoints or NAT to reach AWS services.

-- 3+ subnets across different AZs (for default 3-instance fleet)
+- Subnets across multiple AZs recommended (minimum 1 required; 3+ for multi-AZ resiliency)
- Security groups: HTTPS egress (port 443) for AWS API calls; no ingress needed
-- Required VPC endpoints:
+- VPC endpoints (if no NAT gateway) for AWS service access:

| Endpoint | Type | Purpose |
|----------|------|---------|
3 changes: 2 additions & 1 deletion aws-lambda-managed-instances/steering/thread-safety.md
@@ -7,12 +7,13 @@ LMI runs multiple invocations concurrently in the same execution environment. Th
When reviewing a function for LMI readiness, check each item:

- [ ] No shared `/tmp` paths (use request ID in filenames, clean up after — shared across ALL runtimes)
+- [ ] Estimate total `/tmp` usage under max concurrency (concurrent requests × per-request file size)
- [ ] Database connections use pools (initialized outside handler, not per-invocation)
- [ ] SDK clients outside handler (module-level singletons are fine — they are thread-safe)
- [ ] Logging includes request ID (for tracing concurrent requests)
- [ ] **Node.js/Java/.NET only:** No global/static mutable variables (use immutable or request-local state)
- [ ] **Node.js/Java/.NET only:** Thread-safe libraries only (check DB drivers, HTTP clients, caching libs)
-- [ ] **Node.js/Java/.NET only:** No request state in global scope (use AsyncLocalStorage, contextvars, ThreadLocal)
+- [ ] **Node.js/Java/.NET only:** No request state in global scope (use AsyncLocalStorage for Node.js, ThreadLocal for Java, AsyncLocal for .NET)
- [ ] **Node.js/Java/.NET only:** No environment variable mutation during requests
- [ ] **Python only:** Memory budget accounts for per-process multiplication (memory × concurrency)
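
Two of the checklist items are plain multiplications, which the following Python sketch spells out: worst-case `/tmp` usage (concurrent requests × per-request file size) and the Python memory budget (memory × concurrency). Both function names are hypothetical, and the figures in the usage note are made-up inputs, not measurements.

```python
def tmp_budget_mb(max_concurrency, per_request_file_mb):
    """Worst-case /tmp usage: concurrent requests x per-request file size."""
    return max_concurrency * per_request_file_mb


def python_memory_budget_mb(per_process_mb, concurrency):
    """Python-only item: per-process memory multiplied by concurrency."""
    return per_process_mb * concurrency
```

With 32 concurrent requests each writing 50 MB, `tmp_budget_mb(32, 50)` gives 1600 MB; with 16 concurrent Python processes at 150 MB each, `python_memory_budget_mb(150, 16)` gives 2400 MB.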

4 changes: 2 additions & 2 deletions aws-lambda-managed-instances/steering/troubleshooting.md
@@ -13,7 +13,7 @@
| Function version not ACTIVE | Fewer than 3 execution environments ready | Wait for provisioning; check capacity provider status |
| Unexpected 500 errors | Unhandled concurrent access to shared state | Add thread-safe patterns from migration-patterns.md |
| CloudWatch logs missing | VPC egress not configured | Add NAT Gateway or CloudWatch Logs VPC endpoint |
-| High costs despite low traffic | Minimum 3 instances always running | Evaluate if standard Lambda is more cost-effective |
+| High costs despite low traffic | Minimum execution environments always running | Evaluate if standard Lambda is more cost-effective |

## Debugging Steps

@@ -22,7 +22,7 @@
1. Check capacity provider status: `aws lambda get-capacity-provider --capacity-provider-name <name>`
2. Verify subnets span 3+ AZs with available IPs
3. Confirm security group allows necessary egress
-4. Check operator role has required permissions
+4. Check operator role has required permissions (see infrastructure-setup.md for least-privilege policy)
5. Look for `Operator` field in EC2 DescribeInstances or `aws:lambda:capacity-provider` tag
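
Step 5 can be scripted. The sketch below lists EC2 instances that carry the `aws:lambda:capacity-provider` tag key, using the standard `tag-key` filter of EC2 `DescribeInstances`. The function name is hypothetical, and it assumes the caller passes in an EC2 client (for example one created with `boto3.client("ec2")`) and that LMI instances carry this tag key as the step describes.

```python
def find_lmi_instance_ids(ec2_client, tag_key="aws:lambda:capacity-provider"):
    """Hypothetical helper: return IDs of instances that have the given tag key."""
    instance_ids = []
    kwargs = {"Filters": [{"Name": "tag-key", "Values": [tag_key]}]}
    while True:
        page = ec2_client.describe_instances(**kwargs)
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                instance_ids.append(instance["InstanceId"])
        token = page.get("NextToken")
        if not token:
            return instance_ids
        kwargs["NextToken"] = token  # follow pagination until exhausted
```

An empty list suggests that no instances have been provisioned yet, or that the credentials or region in use differ from those of the capacity provider.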

### Performance Issues