138 changes: 138 additions & 0 deletions lambda-ecs-durable-python-sam/README.md
@@ -0,0 +1,138 @@
# Lambda Durable Functions to Amazon ECS with Python
Suggested change
# Lambda Durable Functions to Amazon ECS with Python
# AWS Lambda durable functions to Amazon ECS with Python


This pattern demonstrates how to invoke Amazon ECS tasks from AWS Lambda durable functions using Python. The workflow starts an ECS task, waits for a callback, and resumes based on the task result while maintaining state across the pause/resume cycle.

Learn more about this pattern at Serverless Land Patterns: https://serverlessland.com/patterns/lambda-ecs-python-sam

Important: This application uses various AWS services, and there are costs associated with these services beyond Free Tier usage. See the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

## Requirements

* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
* [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) installed
* [AWS Serverless Application Model](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) (AWS SAM) installed
* [Docker](https://docs.docker.com/get-docker/) installed (for building Lambda container images)
* [Python 3.13](https://www.python.org/downloads/) or later

## Deployment Instructions

1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository:
```
git clone https://github.com/aws-samples/serverless-patterns
```
1. Change directory to the pattern directory:
```
cd lambda-ecs-python-sam
# Reviewer comment: I think this should be lambda-ecs-durable-python-sam
```
1. From the command line, use AWS SAM to build the application:
```
sam build
```
1. From the command line, use AWS SAM to deploy the AWS resources for the pattern as specified in the template.yaml file:
```
sam deploy --guided
```
1. During the prompts:
* Enter a stack name
* Enter the desired AWS Region
* Enter the VpcCIDR parameter (default: 10.0.0.0/16)
* Allow SAM CLI to create IAM roles with the required permissions.
* Create managed ECR repositories for all functions (required for container images)

After you have run `sam deploy --guided` once and saved the arguments to a configuration file (samconfig.toml), you can run `sam deploy` in future to reuse these defaults.

1. Note the outputs from the SAM deployment process. These contain the resource names and/or ARNs which are used for testing.

## How it works

This pattern implements ECS task orchestration with Lambda durable functions in two variants, a polling (sync) pattern and a callback pattern:

1. **Sync Lambda** starts an ECS task and polls for completion using durable waits (no compute charges during waits)
2. **Callback Lambda** starts an ECS task, pauses execution using `callback.result()`, and waits for a callback
3. The ECS task processes work and calls the Lambda durable execution callback API when complete
4. The Lambda function resumes automatically when the callback is invoked and returns the result

The pattern uses the AWS Durable Execution SDK for Python with the `@durable_execution` decorator to maintain state across the pause/resume cycle. The callback pattern ensures no compute charges while waiting for ECS task completion.
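
As a rough sketch of step 3, the container could report its result as shown below. This assumes the boto3 Lambda client exposes the `SendDurableExecutionCallbackSuccess` API as `send_durable_execution_callback_success` with `CallbackId` and `Result` parameters; verify both against your installed boto3 version. `report_success` is a hypothetical helper, and the client is passed in so the logic can be exercised without AWS access.

```python
import json


def report_success(lambda_client, callback_token, result):
    """Send a task result back to the paused durable execution.

    Assumption: the client method and parameter names mirror the
    SendDurableExecutionCallbackSuccess API; check your boto3 version.
    """
    # Serialize the result to bytes before handing it to the client.
    payload = json.dumps(result).encode("utf-8")
    lambda_client.send_durable_execution_callback_success(
        CallbackId=callback_token,
        Result=payload,
    )
    return payload
```

Inside the container, this would be called with `boto3.client("lambda")` and the value of the `CALLBACK_TOKEN` environment variable that the Lambda function passes through the container overrides.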

### Architecture Components

- **Sync Lambda**: Orchestrates ECS tasks using the Lambda durable functions SDK with a polling pattern and durable waits
- **Callback Lambda**: Orchestrates ECS tasks using the Lambda durable functions SDK with a callback pattern
- **ECS Tasks**: Process work and send callbacks to Lambda using the durable execution callback APIs
- **VPC and Networking**: Provides network connectivity for ECS tasks to pull Docker images and call AWS APIs
- **CloudWatch Logs**: Stores execution logs for Lambda functions and ECS tasks
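
The polling variant used by the Sync Lambda can be sketched roughly as follows. `describe_tasks` is the real boto3 ECS call, but `poll_until_stopped`, its parameters, and the injected `wait` callable are hypothetical. In the durable function, `wait` would be the SDK's durable wait rather than `time.sleep`; the SDK's exact wait signature is not shown here.

```python
def poll_until_stopped(ecs_client, cluster, task_arn, wait,
                       interval_seconds=5, max_attempts=60):
    """Poll an ECS task until it reaches STOPPED, pausing between checks.

    `wait` is any callable that pauses for the given number of seconds
    (an assumption standing in for the SDK's durable wait).
    """
    for attempt in range(1, max_attempts + 1):
        response = ecs_client.describe_tasks(cluster=cluster, tasks=[task_arn])
        tasks = response.get("tasks", [])
        if not tasks:
            raise RuntimeError(f"Task not found: {task_arn}")
        task = tasks[0]
        if task["lastStatus"] == "STOPPED":
            # Exit code of the first container; None if it never started.
            exit_code = task["containers"][0].get("exitCode")
            return {"attempts": attempt, "exitCode": exit_code}
        wait(interval_seconds)
    raise TimeoutError(f"Task still running after {max_attempts} checks")
```

Passing the wait in as a parameter keeps the polling logic independent of how the pause is implemented, so it can be tested locally with a stub.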

## Testing

### Set Environment Variables

```bash
export AWS_DEFAULT_REGION=us-east-1
export STACK_NAME=<your-stack-name>

# Get function names from CloudFormation outputs
export SYNC_FUNCTION=$(aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--query 'Stacks[0].Outputs[?OutputKey==`SyncLambdaFunctionArn`].OutputValue' \
--output text | awk -F: '{print $NF}')

export CALLBACK_FUNCTION=$(aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--query 'Stacks[0].Outputs[?OutputKey==`CallbackLambdaFunctionArn`].OutputValue' \
--output text | awk -F: '{print $NF}')
```
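
The `awk -F: '{print $NF}'` step keeps only the last colon-separated field of the function ARN, which is the function name. A Python equivalent is sketched below; `function_name_from_arn` is a hypothetical helper and the ARN is a made-up example.

```python
def function_name_from_arn(arn):
    """Return the last colon-separated field of an unqualified Lambda function ARN."""
    return arn.strip().rsplit(":", 1)[-1]


# Illustrative ARN only; substitute the value from your stack outputs.
name = function_name_from_arn(
    "arn:aws:lambda:us-east-1:123456789012:function:my-sync-fn"
)
print(name)
```

This only works for unqualified ARNs. If the output value ended in a version or alias, the last field would be that qualifier instead of the function name.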

### Test Synchronous Pattern

```bash
# Invoke the sync function (must use a qualified function name, here with the :$LATEST qualifier)
aws lambda invoke \
--function-name $SYNC_FUNCTION:\$LATEST \
--invocation-type Event \
--cli-binary-format raw-in-base64-out \
--payload '{"message": "Hello from sync pattern", "processingTime": 10}' \
response.json

# Monitor Lambda logs
aws logs tail /aws/lambda/$SYNC_FUNCTION --follow

# Reviewer comment: What information determines a successful test?

# Monitor ECS task logs
aws logs tail /ecs/$STACK_NAME --follow
```
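
The same asynchronous invocation can be made from Python with boto3's `invoke` call. The sketch below builds the request arguments separately so they can be inspected before sending; `build_invoke_args` is a hypothetical helper, and `my-sync-fn` is a placeholder function name.

```python
import json


def build_invoke_args(function_name, message, processing_time):
    """Build keyword arguments for lambda_client.invoke(), mirroring the CLI call."""
    payload = {"message": message, "processingTime": processing_time}
    return {
        "FunctionName": f"{function_name}:$LATEST",  # qualified function name
        "InvocationType": "Event",  # asynchronous invocation
        "Payload": json.dumps(payload).encode("utf-8"),
    }


args = build_invoke_args("my-sync-fn", "Hello from sync pattern", 10)
print(args["FunctionName"])

# With AWS credentials configured, the request could then be sent with:
#   import boto3
#   response = boto3.client("lambda").invoke(**args)
#   print(response["StatusCode"])  # 202 means the event was accepted
```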

### Test Callback Pattern

```bash
# Invoke the callback function (must use a qualified function name, here with the :$LATEST qualifier)
aws lambda invoke \
--function-name $CALLBACK_FUNCTION:\$LATEST \
--invocation-type Event \
--cli-binary-format raw-in-base64-out \
--payload '{"message": "Hello from callback pattern", "processingTime": 30}' \
response.json

# Monitor Lambda logs
aws logs tail /aws/lambda/$CALLBACK_FUNCTION --follow

# Monitor ECS task logs
aws logs tail /ecs/$STACK_NAME --follow
```

Expected output: The Lambda function should complete and return the ECS task result. The logs should show the callback being received and the function resuming execution.
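
One way to make that check concrete is to inspect the value the callback handler returns: a `statusCode` of 200 and a JSON `body` whose `status` is `success`. `is_successful_result` below is a hypothetical helper, and the sample value is made up to match the handler's return shape rather than captured from a real run.

```python
import json


def is_successful_result(result):
    """Return True if a handler result reports a completed ECS task."""
    if result.get("statusCode") != 200:
        return False
    body = json.loads(result.get("body", "{}"))
    return body.get("status") == "success"


# Made-up sample shaped like the callback handler's success response.
sample = {
    "statusCode": 200,
    "body": json.dumps(
        {"status": "success", "taskArn": "arn:aws:ecs:example", "result": "done"}
    ),
}
print(is_successful_result(sample))
```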

## Cleanup

1. Delete the stack
```bash
sam delete
```
1. Confirm the stack has been deleted
```bash
aws cloudformation list-stacks --query "StackSummaries[?contains(StackName,'$STACK_NAME')].StackStatus"
```

----
Copyright 2025 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
68 changes: 68 additions & 0 deletions lambda-ecs-durable-python-sam/example-pattern.json
@@ -0,0 +1,68 @@
{
"title": "AWS Lambda Durable Functions to Amazon ECS with Python",
"description": "Invoke ECS tasks from Lambda Durable Functions with automatic checkpointing, state management, and resilient execution patterns",
Suggested change
"description": "Invoke ECS tasks from Lambda Durable Functions with automatic checkpointing, state management, and resilient execution patterns",
"description": "Invoke ECS tasks from Lambda durable functions with automatic checkpointing, state management, and resilient execution patterns",

"language": "Python",
"level": "300",
"framework": "SAM",
"introBox": {
"headline": "How it works",
"text": [
"This pattern demonstrates AWS Lambda Durable Functions invoking Amazon ECS tasks with resilient, long-running execution capabilities:",
Suggested change
"This pattern demonstrates AWS Lambda Durable Functions invoking Amazon ECS tasks with resilient, long-running execution capabilities:",
"This pattern demonstrates AWS Lambda durable functions invoking Amazon ECS tasks with resilient, long-running execution capabilities:",

"1. Durable Synchronous Pattern: Lambda uses checkpointed steps and durable waits to poll ECS task status. Can run for up to 1 year with automatic recovery from failures. No compute charges during wait periods.",
"2. Durable Callback Pattern: Lambda uses checkpointed steps to reliably initiate ECS tasks. Each step (create record, start task, update status) is automatically checkpointed for guaranteed execution.",
"The pattern uses the AWS Durable Execution SDK for Python, providing automatic state management, checkpoint-based recovery, and cost-effective long-running workflows. Includes inline Python code in ECS containers, VPC networking, and DynamoDB for callback tracking."
]
},
"gitHub": {
"template": {
"repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/lambda-ecs-python-sam",
"templateURL": "serverless-patterns/lambda-ecs-python-sam",
"projectFolder": "lambda-ecs-python-sam",
"templateFile": "template.yaml"
}
},
"resources": {
"bullets": [
{
"text": "Lambda Durable Functions",
Suggested change
"text": "Lambda Durable Functions",
"text": "Lambda durable functions",

"link": "https://docs.aws.amazon.com/lambda/latest/dg/durable-functions.html"
},
{
"text": "Durable Execution SDK",
"link": "https://docs.aws.amazon.com/lambda/latest/dg/durable-execution-sdk.html"
},
{
"text": "Run Amazon ECS or Fargate tasks",
"link": "https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_run_task.html"
},
{
"text": "Amazon ECS Task Definitions",
"link": "https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definitions.html"
}
]
},
"deploy": {
"text": [
"sam build",
"sam deploy --guided"
]
},
"testing": {
"text": [
"See the GitHub repo for detailed testing instructions."
]
},
"cleanup": {
"text": [
"Delete the stack: <code>sam delete</code>"
]
},
"authors": [
{
"name": "Mian Tariq",
"image": "",
Do you have a publicly available image?

"bio": "Senior Delivery Consultant",
"linkedin": ""
Do you have a linkedin ID?

}
]
}
14 changes: 14 additions & 0 deletions lambda-ecs-durable-python-sam/src/Dockerfile
@@ -0,0 +1,14 @@
FROM public.ecr.aws/lambda/python:3.13

# Copy requirements file
COPY requirements.txt ${LAMBDA_TASK_ROOT}/

# Install dependencies including durable SDK
RUN pip install -r requirements.txt

# Copy function code
COPY sync_handler.py ${LAMBDA_TASK_ROOT}/
COPY callback_handler.py ${LAMBDA_TASK_ROOT}/

# Default handler (will be overridden by template)
CMD [ "sync_handler.lambda_handler" ]
114 changes: 114 additions & 0 deletions lambda-ecs-durable-python-sam/src/callback_handler.py
@@ -0,0 +1,114 @@
import json
import boto3
import os
from aws_durable_execution_sdk_python import (
    DurableContext,
    durable_execution,
)

ecs_client = boto3.client('ecs')

def start_ecs_task_with_callback(cluster, task_definition, subnet1, subnet2, security_group,
                                 callback_token, message, processing_time):
    """
    Starts an ECS task and passes the callback token via environment variable.
    The ECS task will call Lambda durable execution callback APIs when complete.
    """
    print("[CALLBACK] Starting ECS task with callback token")

    response = ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        launchType='FARGATE',
        networkConfiguration={
            'awsvpcConfiguration': {
                'subnets': [subnet1, subnet2],
                'securityGroups': [security_group],
                'assignPublicIp': 'ENABLED'
            }
        },
        overrides={
            'containerOverrides': [
                {
                    'name': 'python-callback-container',
                    'environment': [
                        {'name': 'CALLBACK_TOKEN', 'value': callback_token},
                        {'name': 'MESSAGE', 'value': message},
                        {'name': 'PROCESSING_TIME', 'value': str(processing_time)}
                    ]
                }
            ]
        }
    )

    if not response['tasks']:
        raise Exception("Failed to start ECS task")

    task_arn = response['tasks'][0]['taskArn']
    print(f"[CALLBACK] Task started: {task_arn}")

    return task_arn

@durable_execution
def lambda_handler(event, context: DurableContext):
    """
    Lambda durable function that invokes an ECS task and waits for callback.

    The ECS task receives a callback token and calls Lambda durable execution
    callback APIs (SendDurableExecutionCallbackSuccess/Failure) when complete.

    This function pauses execution while waiting for the callback, with no
    compute charges during the wait period.
    """

    # Get configuration from environment variables
    cluster = os.environ['ECS_CLUSTER']
    task_definition = os.environ['TASK_DEFINITION']
    subnet1 = os.environ['SUBNET_1']
    subnet2 = os.environ['SUBNET_2']
    security_group = os.environ['SECURITY_GROUP']

    # Get input parameters
    message = event.get('message', 'No message provided')
    processing_time = event.get('processingTime', 5)

    try:
        # Create callback to get callback token
        callback = context.create_callback()

        print(f"[CALLBACK] Created callback with token: {callback.callback_id[:20]}...")

        # Start ECS task with callback token (call directly, no context.step!)
        task_arn = start_ecs_task_with_callback(
            cluster, task_definition, subnet1, subnet2, security_group,
            callback.callback_id, message, processing_time
        )

        print("[CALLBACK] Waiting for callback from ECS task...")

        # Wait for callback (pauses execution here, no compute charges)
        result = callback.result()

        print("[CALLBACK] Received callback with result")

        # Return the result from the callback
        return {
            'statusCode': 200,
            'body': json.dumps({
                'status': 'success',
                'message': 'ECS task completed and sent callback',
                'taskArn': task_arn,
                'result': result
            })
        }

    except Exception as e:
        context.logger.error(f"[CALLBACK] Error: {str(e)}")

        return {
            'statusCode': 500,
            'body': json.dumps({
                'status': 'error',
                'error': str(e)
            })
        }
2 changes: 2 additions & 0 deletions lambda-ecs-durable-python-sam/src/requirements.txt
@@ -0,0 +1,2 @@
boto3
aws-durable-execution-sdk-python