Commit 11c7981

wangzl and yiyuan-he authored
Evals for ECS django enablement support (#511)
*Issue description:*
awslabs/mcp#1808

*Description of changes:*

ECS Django CDK:
```
(mcp-testing) [/Volumes/workplace/wangzl/aws-application-signals-test-framework/mcp-testing]$ python -m evals tasks --task-id ecs_python_django_cdk
Using MCP repository: /Volumes/workplace/wangzl/mcpdev/mcp
Starting MCP tool evaluation for tasks
Loaded 1 task(s) - ecs_python_django_cdk
============================================================
EVALUATION RESULT: ecs_python_django_cdk
============================================================
Duration: 101.46s
Turns: 10
Tool Calls: 9 (4 unique)
Hit Rate: 100.0%
Success Rate: 100.0%
File Operations: 8
Tool Breakdown:
  - get_enablement_guide: 1 calls (1 success, 0 failed)
  - list_files: 4 calls (4 success, 0 failed)
  - read_file: 3 calls (3 success, 0 failed)
  - write_file: 1 calls (1 success, 0 failed)
Validation Results:
Build: ✅ PASS (1/1 criteria met)
  [PASS] Build succeeds
    Reasoning: Build completed with exit code 0
LLM Judge: ✅ PASS (23/23 criteria met)
  [PASS] IAM: CloudWatchAgentServerPolicy attached to ECS application role
    Reasoning: CloudWatchAgentServerPolicy is attached to the ECS task role in the managedPolicies array.
  [PASS] CloudWatch Agent: Sidecar container added to ECS task definition
    Reasoning: CloudWatch Agent sidecar container is added to the ECS task definition with proper configuration.
  [PASS] CloudWatch Agent: CloudWatch Agent configuration file added to ECS task definition
    Reasoning: CloudWatch Agent configuration is provided via CW_CONFIG_CONTENT environment variable.
  [PASS] CloudWatch Agent: Configuration file created with traces.traces_collected.application_signals
    Reasoning: Configuration includes traces.traces_collected.application_signals in the CW_CONFIG_CONTENT.
  [PASS] CloudWatch Agent: Configuration file created with logs.metrics_collected.application_signals
    Reasoning: Configuration includes logs.metrics_collected.application_signals in the CW_CONFIG_CONTENT.
  [PASS] CloudWatch Agent: Log group created for CloudWatch Agent logs
    Reasoning: CloudWatch Agent log group is created with name '/ecs/ecs-cwagent'.
  [PASS] ADOT SDK: ADOT SDK container added as sidecar to ECS task definition
    Reasoning: ADOT SDK container (init container) is added as sidecar to copy auto-instrumentation files.
  [PASS] ADOT SDK: ADOT SDK container added volumeMount for opentelemetry-auto-instrumentation
    Reasoning: ADOT SDK container includes volumeMount for opentelemetry-auto-instrumentation-python.
  [PASS] Integrity: Application container modified to include volumeMount from ADOT SDK container
    Reasoning: Application container is modified to include volumeMount from ADOT SDK container.
  [PASS] Integrity: Aplication containder added container dependency on ADOT SDK container
    Reasoning: Application container has container dependency on ADOT SDK container (init container) with SUCCESS condition.
  [PASS] Integrity: Application container added container dependency on CloudWatch Agent container
    Reasoning: Application container has container dependency on CloudWatch Agent container with START condition.
  [PASS] Integrity: Application container in ECS task definition modified to include OTEL environment variables
    Reasoning: Application container includes multiple OTEL environment variables for configuration.
  [PASS] Integrity: -e OTEL_RESOURCE_ATTRIBUTES with service.name
    Reasoning: OTEL_RESOURCE_ATTRIBUTES is set with service.name using the config.serviceName value.
  [PASS] Integrity: -e OTEL_LOGS_EXPORTER=none
    Reasoning: OTEL_LOGS_EXPORTER is set to 'none'.
  [PASS] Integrity: -e OTEL_METRICS_EXPORTER=none
    Reasoning: OTEL_METRICS_EXPORTER is set to 'none'.
  [PASS] Integrity: -e OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    Reasoning: OTEL_EXPORTER_OTLP_PROTOCOL is set to 'http/protobuf'.
  [PASS] Integrity: -e OTEL_AWS_APPLICATION_SIGNALS_ENABLED=true
    Reasoning: OTEL_AWS_APPLICATION_SIGNALS_ENABLED is set to 'true'.
  [PASS] Integrity: -e OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT=http://localhost:4316/v1/metrics
    Reasoning: OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT is set to 'http://localhost:4316/v1/metrics'.
  [PASS] Integrity: -e OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4316/v1/traces
    Reasoning: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is set to 'http://localhost:4316/v1/traces'.
  [PASS] Integrity: -e OTEL_PYTHON_CONFIGURATOR=aws_configurator
    Reasoning: OTEL_PYTHON_CONFIGURATOR is set to 'aws_configurator'.
  [PASS] Integrity: -e OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
    Reasoning: OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED is set to 'true'.
  [PASS] Integrity: -e PYTHONPATH includes volumeMount from ADOT Python container and auto_instrumentation of the volumeMount
    Reasoning: PYTHONPATH includes the volumeMount path and auto_instrumentation subdirectory.
  [PASS] Integrity: -e DJANGO_SETTINGS_MODULE set to django application settings module
    Reasoning: DJANGO_SETTINGS_MODULE is set to 'djangoapp.settings' matching the Django application structure.
Overall Task Status: ✅ PASS
```

ECS Django terraform:
```
(mcp-testing) [/Volumes/workplace/wangzl/aws-application-signals-test-framework/mcp-testing]$ python -m evals tasks --task-id ecs_python_django_terraform --no-cleanup
Using MCP repository: /Volumes/workplace/wangzl/mcpdev/mcp
Starting MCP tool evaluation for tasks
Loaded 1 task(s) - ecs_python_django_terraform
============================================================
EVALUATION RESULT: ecs_python_django_terraform
============================================================
Duration: 184.03s
Turns: 11
Tool Calls: 10 (4 unique)
Hit Rate: 100.0%
Success Rate: 100.0%
File Operations: 9
Tool Breakdown:
  - get_enablement_guide: 1 calls (1 success, 0 failed)
  - list_files: 3 calls (3 success, 0 failed)
  - read_file: 5 calls (5 success, 0 failed)
  - write_file: 1 calls (1 success, 0 failed)
Validation Results:
Build: ✅ PASS (1/1 criteria met)
  [PASS] Build succeeds
    Reasoning: Build completed with exit code 0
LLM Judge: ✅ PASS (23/23 criteria met)
  [PASS] IAM: CloudWatchAgentServerPolicy attached to ECS application role
    Reasoning: CloudWatchAgentServerPolicy is attached to the ECS task role via aws_iam_role_policy_attachment.task_role_cloudwatch_agent_policy.
  [PASS] CloudWatch Agent: Sidecar container added to ECS task definition
    Reasoning: CloudWatch Agent sidecar container "ecs-cwagent-${var.app_name}" is added to the ECS task definition with proper configuration.
  [PASS] CloudWatch Agent: CloudWatch Agent configuration file added to ECS task definition
    Reasoning: CloudWatch Agent configuration file is added via CW_CONFIG_CONTENT environment variable in the CloudWatch Agent container.
  [PASS] CloudWatch Agent: Configuration file created with traces.traces_collected.application_signals
    Reasoning: Configuration file contains traces.traces_collected.application_signals section as specified in the CW_CONFIG_CONTENT.
  [PASS] CloudWatch Agent: Configuration file created with logs.metrics_collected.application_signals
    Reasoning: Configuration file contains logs.metrics_collected.application_signals section as specified in the CW_CONFIG_CONTENT.
  [PASS] CloudWatch Agent: Log group created for CloudWatch Agent logs
    Reasoning: CloudWatch log group "cw_agent_log_group" is created specifically for CloudWatch Agent logs with name "/ecs/ecs-cwagent".
  [PASS] ADOT SDK: ADOT SDK container added as sidecar to ECS task definition
    Reasoning: ADOT SDK container "init" is added as sidecar using the public.ecr.aws/aws-observability/adot-autoinstrumentation-python:v0.12.0 image.
  [PASS] ADOT SDK: ADOT SDK container added volumeMount for opentelemetry-auto-instrumentation
    Reasoning: ADOT SDK container includes volumeMount for "opentelemetry-auto-instrumentation-python" volume.
  [PASS] Integrity: Application container modified to include volumeMount from ADOT SDK container
    Reasoning: Application container includes volumeMount from ADOT SDK container for "opentelemetry-auto-instrumentation-python".
  [PASS] Integrity: Aplication containder added container dependency on ADOT SDK container
    Reasoning: Application container has dependency on ADOT SDK container "init" with condition "SUCCESS".
  [PASS] Integrity: Application container added container dependency on CloudWatch Agent container
    Reasoning: Application container has dependency on CloudWatch Agent container "ecs-cwagent-${var.app_name}" with condition "START".
  [PASS] Integrity: Application container in ECS task definition modified to include OTEL environment variables
    Reasoning: Application container includes multiple OTEL environment variables for OpenTelemetry configuration.
  [PASS] Integrity: -e OTEL_RESOURCE_ATTRIBUTES with service.name
    Reasoning: OTEL_RESOURCE_ATTRIBUTES is set with service.name=${var.app_name}.
  [PASS] Integrity: -e OTEL_LOGS_EXPORTER=none
    Reasoning: OTEL_LOGS_EXPORTER is set to "none".
  [PASS] Integrity: -e OTEL_METRICS_EXPORTER=none
    Reasoning: OTEL_METRICS_EXPORTER is set to "none".
  [PASS] Integrity: -e OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    Reasoning: OTEL_EXPORTER_OTLP_PROTOCOL is set to "http/protobuf".
  [PASS] Integrity: -e OTEL_AWS_APPLICATION_SIGNALS_ENABLED=true
    Reasoning: OTEL_AWS_APPLICATION_SIGNALS_ENABLED is set to "true".
  [PASS] Integrity: -e OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT=http://localhost:4316/v1/metrics
    Reasoning: OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT is set to "http://localhost:4316/v1/metrics".
  [PASS] Integrity: -e OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4316/v1/traces
    Reasoning: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is set to "http://localhost:4316/v1/traces".
  [PASS] Integrity: -e OTEL_PYTHON_CONFIGURATOR=aws_configurator
    Reasoning: OTEL_PYTHON_CONFIGURATOR is set to "aws_configurator".
  [PASS] Integrity: -e OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
    Reasoning: OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED is set to "true".
  [PASS] Integrity: -e PYTHONPATH includes volumeMount from ADOT Python container and auto_instrumentation of the volumeMount
    Reasoning: PYTHONPATH includes the volumeMount path "/otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation:/otel-auto-instrumentation-python".
  [PASS] Integrity: -e DJANGO_SETTINGS_MODULE set to django application settings module
    Reasoning: DJANGO_SETTINGS_MODULE is set to "djangoapp.settings" matching the Django application structure.
Overall Task Status: ✅ PASS
```

*Rollback procedure:*

<Can we safely revert this commit if needed? If not, detail what must be done to safely revert and why it is needed.>

*Ensure you've run the following tests on your changes and include the link below:*

To do so, create a `test.yml` file with `name: Test` and workflow description to test your changes, then remove the file for your PR. Link your test run in your PR description. This process is a short term solution while we work on creating a staging environment for testing.

NOTE: TESTS RUNNING ON A SINGLE EKS CLUSTER CANNOT BE RUN IN PARALLEL. See the [needs](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idneeds) keyword to run tests in succession.

- Run Java EKS on `e2e-playground` in us-east-1 and eu-central-2
- Run Python EKS on `e2e-playground` in us-east-1 and eu-central-2
- Run metric limiter on EKS cluster `e2e-playground` in us-east-1 and eu-central-2
- Run EC2 tests in all regions
- Run K8s on a separate K8s cluster (check IAD test account for master node endpoints; these will change as we create and destroy clusters for OS patching)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: Michael He <53622546+yiyuan-he@users.noreply.github.com>
1 parent c3e3d7b commit 11c7981
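
When comparing runs like the two in the commit message, it can help to tally the per-criterion verdicts instead of reading the report by eye. The following is a minimal sketch, assuming the report keeps the `[PASS] <criterion> Reasoning: <text>` shape shown above. `tally_criteria` is a hypothetical helper, not part of the `evals` package, and the `[FAIL]` label is an assumption, since only `[PASS]` appears in this commit.

```python
import re
from collections import Counter

# Hypothetical helper (not part of the evals package): matches one criterion
# entry of the form "[PASS] <criterion> Reasoning:". The "[FAIL]" alternative
# is an assumption; only "[PASS]" appears in the report quoted above.
CRITERION = re.compile(r'\[(PASS|FAIL)\]\s+(.*?)\s+Reasoning:', re.DOTALL)


def tally_criteria(report: str) -> Counter:
    """Count verdict labels across all criterion entries in a report."""
    return Counter(verdict for verdict, _criterion in CRITERION.findall(report))


# Two entries copied from the CDK run in the commit message.
sample = (
    "[PASS] Build succeeds Reasoning: Build completed with exit code 0 "
    "[PASS] Integrity: -e OTEL_LOGS_EXPORTER=none "
    "Reasoning: OTEL_LOGS_EXPORTER is set to 'none'. "
)

counts = tally_criteria(sample)
print(counts["PASS"], counts["FAIL"])
```

The regular expression works on both the flattened and the line-broken form of the report because `\s+` matches newlines as well as spaces.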

3 files changed: +65 -7 lines changed

mcp-testing/evals/tasks/applicationsignals/get_enablement_guide/configs/ecs.py

Lines changed: 63 additions & 5 deletions
@@ -61,6 +61,10 @@
     'Integrity: -e PYTHONPATH includes volumeMount from ADOT Python container and auto_instrumentation of the volumeMount',
 ]

+PYTHON_DJANGO_ENV_VARS_RUBRIC = [
+    'Integrity: -e DJANGO_SETTINGS_MODULE set to django application settings module',
+]
+
 # OpenTelemetry Node.js environment variables for CJS applications
 NODEJS_OTEL_ENV_VARS_RUBRIC = [
     'Integrity: -e NODE_OPTIONS=--require /otel-auto-instrumentation-node/autoinstrumentation.js',
@@ -87,9 +91,9 @@
 # Task definitions - compose rubrics from components
 ECS_TASKS = [

-    # CDK - Python
+    # CDK - Python Flask
     EnablementTask(
-        id='ecs_python_cdk',
+        id='ecs_python_flask_cdk',
         prompt_template=ENABLEMENT_PROMPT,
         git_paths=[
             'infrastructure/ecs/cdk',
@@ -113,9 +117,9 @@
         ),
     ),

-    # Terraform - Python
+    # Terraform - Python Flask
     EnablementTask(
-        id='ecs_python_terraform',
+        id='ecs_python_flask_terraform',
         prompt_template=ENABLEMENT_PROMPT,
         git_paths=[
             'infrastructure/ecs/terraform',
@@ -139,6 +143,60 @@
         ),
     ),

+    # CDK - Python Django
+    EnablementTask(
+        id='ecs_python_django_cdk',
+        prompt_template=ENABLEMENT_PROMPT,
+        git_paths=[
+            'infrastructure/ecs/cdk',
+            'docker-apps/python/django',
+        ],
+        iac_dir='infrastructure/ecs/cdk',
+        app_dir='docker-apps/python/django',
+        language='python',
+        framework='django',
+        platform='ecs',
+        build_command='npm install && npm run build',
+        build_working_dir='infrastructure/ecs/cdk',
+        expected_tools=['get_enablement_guide'],
+        modifies_code=True,
+        validation_rubric=(
+            CLOUDWATCH_AGENT_RUBRIC +
+            ADOT_SDK_RUBRIC +
+            APPLICATION_RUBIC +
+            COMMON_OTEL_ENV_VARS_RUBRIC +
+            PYTHON_OTEL_ENV_VARS_RUBRIC +
+            PYTHON_DJANGO_ENV_VARS_RUBRIC
+        ),
+    ),
+
+    # Terraform - Python Django
+    EnablementTask(
+        id='ecs_python_django_terraform',
+        prompt_template=ENABLEMENT_PROMPT,
+        git_paths=[
+            'infrastructure/ecs/terraform',
+            'docker-apps/python/django',
+        ],
+        iac_dir='infrastructure/ecs/terraform',
+        app_dir='docker-apps/python/django',
+        language='python',
+        framework='django',
+        platform='ecs',
+        build_command='terraform init && terraform validate',
+        build_working_dir='infrastructure/ecs/terraform',
+        expected_tools=['get_enablement_guide'],
+        modifies_code=True,
+        validation_rubric=(
+            CLOUDWATCH_AGENT_RUBRIC +
+            ADOT_SDK_RUBRIC +
+            APPLICATION_RUBIC +
+            COMMON_OTEL_ENV_VARS_RUBRIC +
+            PYTHON_OTEL_ENV_VARS_RUBRIC +
+            PYTHON_DJANGO_ENV_VARS_RUBRIC
+        ),
+    ),
+
     # CDK - Node.js CJS
     EnablementTask(
         id='ecs_nodejs_cdk',
@@ -161,7 +219,7 @@
             ADOT_SDK_RUBRIC +
             APPLICATION_RUBIC +
             COMMON_OTEL_ENV_VARS_RUBRIC +
-            PYTHON_OTEL_ENV_VARS_RUBRIC
+            NODEJS_OTEL_ENV_VARS_RUBRIC
         ),
     ),
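
The rubric composition in `ecs.py` is plain list concatenation, which is also how the Node.js CJS task ended up with `PYTHON_OTEL_ENV_VARS_RUBRIC` before this commit. A minimal sketch of the pattern and of a guard against that kind of mix-up follows. The one-entry lists are abbreviated stand-ins rather than the real rubric contents, the assignment of each criterion to a list is assumed from the names, and `language_specific` is a hypothetical helper.

```python
# Abbreviated stand-ins for the rubric fragments in ecs.py; the real lists are
# longer, and which list each criterion belongs to is assumed from its name.
COMMON_OTEL_ENV_VARS_RUBRIC = ['Integrity: -e OTEL_LOGS_EXPORTER=none']
PYTHON_OTEL_ENV_VARS_RUBRIC = ['Integrity: -e OTEL_PYTHON_CONFIGURATOR=aws_configurator']
PYTHON_DJANGO_ENV_VARS_RUBRIC = [
    'Integrity: -e DJANGO_SETTINGS_MODULE set to django application settings module',
]
NODEJS_OTEL_ENV_VARS_RUBRIC = [
    'Integrity: -e NODE_OPTIONS=--require /otel-auto-instrumentation-node/autoinstrumentation.js',
]

# Each task's rubric is the concatenation of the fragments that apply to it.
django_rubric = (
    COMMON_OTEL_ENV_VARS_RUBRIC +
    PYTHON_OTEL_ENV_VARS_RUBRIC +
    PYTHON_DJANGO_ENV_VARS_RUBRIC
)
nodejs_rubric = COMMON_OTEL_ENV_VARS_RUBRIC + NODEJS_OTEL_ENV_VARS_RUBRIC


def language_specific(rubric, markers):
    """Hypothetical check: return criteria that mention any of the given markers."""
    return [criterion for criterion in rubric if any(m in criterion for m in markers)]


# A Node.js rubric should carry no Python- or Django-specific criteria.
leaked = language_specific(nodejs_rubric, ('PYTHON', 'DJANGO'))
print(len(django_rubric), leaked)
```

Because `+` on lists builds a new list, the shared fragments are never mutated, so the same `COMMON_OTEL_ENV_VARS_RUBRIC` can be reused safely across every task.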

mcp-testing/evals/tasks/applicationsignals/samples/get-enablement-guide-samples/infrastructure/ecs/cdk/lib/cdk-stack.ts

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ export class ECSAppStack extends cdk.Stack {
       image: ecs.ContainerImage.fromRegistry(ecrImageUri),
       essential: true,
       memoryReservationMiB: 512,
-      readonlyRootFilesystem: true,
+      readonlyRootFilesystem: false,
       environment: {
         PORT: config.port.toString(),
       },

mcp-testing/evals/tasks/applicationsignals/samples/get-enablement-guide-samples/infrastructure/ecs/terraform/main.tf

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ resource "aws_ecs_task_definition" "app" {
       image = local.ecr_image_uri
       essential = true
       memory = 512
-      readonlyRootFilesystem = true
+      readonlyRootFilesystem = false
       user = "0:0"

       environment = [
