Contributor

Copilot AI commented Nov 20, 2025

Resolves #3194

What is being addressed

Workspace deletions fail intermittently with dependency ordering errors when Terraform attempts to delete Azure Monitor resources. The issue manifests as AnotherOperationInProgress errors when the Azure Monitor Private Link Scope (AMPLS) private endpoint and its private DNS zone group are deleted at the same time.

How is this addressed

  • Enhanced dependency management: Added both AMPLS scoped services (ampls_app_insights and ampls_log_anaytics) to the private endpoint's depends_on list to ensure proper creation and deletion ordering
  • Updated CHANGELOG.md with a bug fix entry
  • Incremented the workspace base template version from 2.7.1 to 2.7.2

Technical Details

The fix addresses the root cause through explicit dependency ordering:

Enhanced Private Endpoint Configuration:

resource "azurerm_private_endpoint" "azure_monitor_private_endpoint" {
  # ... resource configuration ...
  
  lifecycle { ignore_changes = [tags] }
  
  private_dns_zone_group {
    name = "azure-monitor-private-dns-zone-group"
    private_dns_zone_ids = [...]
  }
  
  depends_on = [
    azurerm_monitor_private_link_scoped_service.ampls_app_insights,
    azurerm_monitor_private_link_scoped_service.ampls_log_anaytics,
  ]
}

The explicit depends_on relationships ensure:

  • Creation order: Scoped services are fully configured → Private endpoint with DNS zone group is created
  • Deletion order: Private endpoint with DNS zone group is deleted → Scoped services are removed
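
For context, the two resources named in depends_on might be declared along the following lines. This is only a sketch under stated assumptions: the resource type and the two resource labels come from the snippet above, while the name values, var.resource_group_name, the scope resource, and the linked resources are hypothetical and may not match the template.

```hcl
# Hypothetical declarations of the scoped services referenced in depends_on.
# Everything except the resource type and the two labels is an assumption.
resource "azurerm_monitor_private_link_scoped_service" "ampls_app_insights" {
  name                = "ampls-app-insights-service"                       # assumed name
  resource_group_name = var.resource_group_name                            # assumed variable
  scope_name          = azurerm_monitor_private_link_scope.workspace.name  # assumed scope resource
  linked_resource_id  = azurerm_application_insights.workspace.id          # assumed linked resource
}

resource "azurerm_monitor_private_link_scoped_service" "ampls_log_anaytics" {
  name                = "ampls-log-analytics-service"                      # assumed name
  resource_group_name = var.resource_group_name                            # assumed variable
  scope_name          = azurerm_monitor_private_link_scope.workspace.name  # assumed scope resource
  linked_resource_id  = azurerm_log_analytics_workspace.workspace.id       # assumed linked resource
}
```

Terraform destroys resources in the reverse of their dependency order, so listing both scoped services in depends_on means the private endpoint, together with its DNS zone group, is destroyed before either scoped service.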

References

Original prompt

This section details on the original issue you should resolve

<issue_title>Workspace deletion fails due to AnotherOperationInProgress pe-ampls-*</issue_title>
<issue_description>Describe the bug

deleting Private DNS Zone Group "azure-monitor-private-dns-zone-group" (Private Endpoint "pe-ampls--ws-ab34" / Resource Group "rg--ws-ab34"): network.PrivateDNSZoneGroupsClient#Delete: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status= Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/*******/providers/Microsoft.Network/locations/***/operations/7fb24a45-f648-41ff-ba74-5cc4a320e2b3?api-version=2019-02-01." Details=[]

Steps to reproduce

</issue_description>

<agent_instructions>Please read this issue: hashicorp/terraform-provider-azurerm#28715

We see this issue frequently when running e2e tests that delete workspaces. We need to find the root cause. If it is not obvious, this may mean creating a number of "versions" of the base template, with test deployment scripts, to see if we can narrow down the issue.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@marrobi Looks like should have been closed by PR
@marrobi Another occurrence: https://github.com/microsoft/AzureTRE/actions/runs/6110043155/job/16879370630

6d965480-3883-4853-895d-361e2da5f8fe: Error message: uninstalling bundle ╷ │ Error: deleting Private DNS Zone Group "azure-monitor-private-dns-zone-group" (Private Endpoint "pe-ampls--ws-f8fe" / Resource Group "rg--ws-f8fe"): network.PrivateDNSZoneGroupsClient#Delete: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status= Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/*******/providers/Microsoft.Network/locations/westus3/operations/7d4faa5e-935e-46a4-ad1e-d2464c7c754d?api-version=2019-02-01." Details=[] │ │ ╵ error running command /cnab/app/terraform /usr/bin/terraform destroy -auto-approve -input=false -var aad_redirect_uris_b64=W10= -var address_spaces=WyIxMC4xLjIuMC8yNCJd -var app_role_id_workspace_airlock_manager=bfb9e0d6-8429-f6fd-1e0c-15e9c601151d -var app_role_id_workspace_owner=c480f2ce-90b7-40b0-7261-78b56b4b2602 -var app_role_id_workspace_researcher=c299dbb6-62e7-7450-ec0d-1cb3d21326e3 -var app_service_plan_sku=S1 -var arm_environment=public -var auth_client_id=******* -var auth_client_secret=******* -var auth_tenant_id=******* -var client_id=814a34f2-3b75-44dc-aa71-7b045340df18 -var create_aad_groups=false -var enable_airlock=false -var enable_local_debugging=false -var location=westus3 -var register_aad_application=true -var scope_id=api://-ws-f8fe -var sp_id=6f2d2ff5-41fd-4eaf-96b6-372d53a47b11 -var tre_id= -var tre_resource_id=6d965480-3883-4853-895d-361e2da5f8fe -var workspace_owner_object_id=ade81c51-edee-4f17-ac8d-1becb3dd7352: exit status 1 Error: error running command /cnab/app/terraform /usr/bin/terraform destroy -auto-approve -input=false -var aad_redirect_uris_b64=W10= -var address_spaces=WyIxMC4xLjIuMC8yNCJd -var app_role_id_workspace_airlock_manager=bfb9e0d6-8429-f6fd-1e0c-15e9c601151d -var 
app_role_id_workspace_owner=c480f2ce-90b7-40b0-7261-78b56b4b2602 -var app_role_id_workspace_researcher=c299dbb6-62e7-7450-ec0d-1cb3d21326e3 -var app_service_plan_sku=S1 -var arm_environment=public -var auth_client_id=******* -var auth_client_secret=******* -var auth_tenant_id=******* -var client_id=814a34f2-3b75-44dc-aa71-7b045340df18 -var create_aad_groups=false -var enable_airlock=false -var enable_local_debugging=false -var location=westus3 -var register_aad_application=true -var scope_id=api://-ws-f8fe -var sp_id=6f2d2ff5-41fd-4eaf-96b6-372d53a47b11 -var tre_id= -var tre_resource_id=6d965480-3883-4853-895d-361e2da5f8fe -var workspace_owner_object_id=ade81c51-edee-4f17-ac8d-1becb3dd7352: exit status 1 1 error occurred: * mixin execution failed: package command failed /cnab/app/cnab/app/mixins/terraform/runtimes/terraform-runtime uninstall ╷ </comment_new>
<comment_new>@marrobi
Still see this sometimes:

Private Dns Zone Group Name: "azure-monitor-private-dns-zone-group"): performing Delete: unexpected status 409 (409 Conflict) with error: AnotherOperationInProgress: Another operation on this or dependent resource is in progress.</comment_new>
<comment_new>@marrobi
See this discussion: hashicorp/terraform-provider-azurerm#28715</comment_new>



…PLS private endpoints

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix workspace deletion failure due to AnotherOperationInProgress error" to "Fix workspace deletion race condition in AMPLS private endpoint" on Nov 20, 2025
Copilot AI requested a review from marrobi November 20, 2025 22:15
Member

marrobi commented Nov 20, 2025

/test-extended c32a87a

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19553208613 (with refid d81faf5c)

(in response to this comment from @marrobi)

…o fix deletion ordering

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 21, 2025

@copilot this failed, same issue, see https://[productionresultssa1.blob.core.windows.net/actions-results/792780c6-7f37-45d1-8de6-7062b4b8358e/workflow-job-run-9bd45ba8-ed52-5b06-ac86-bd5f0db51320/log...

The test revealed a different root cause. The error was that Private DNS Zones couldn't be deleted because their Virtual Network Links still existed. I've added create_before_destroy = true to all Azure Monitor DNS zones and their VNet links to ensure proper deletion ordering. Changes are in commit 5d9bb7d.
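
As a rough illustration of the change described in this comment, the lifecycle setting would look something like the following. The resource labels, the link name, and the variables are hypothetical and are not taken from commit 5d9bb7d; the zone name shown is one of the standard Azure Monitor private link zones.

```hcl
# Hypothetical sketch: one Azure Monitor DNS zone and its VNet link,
# each with create_before_destroy set as the comment describes.
resource "azurerm_private_dns_zone" "azure_monitor" {
  name                = "privatelink.monitor.azure.com"
  resource_group_name = var.resource_group_name # assumed variable

  lifecycle {
    create_before_destroy = true
  }
}

resource "azurerm_private_dns_zone_virtual_network_link" "azure_monitor" {
  name                  = "azure-monitor-link"      # assumed name
  resource_group_name   = var.resource_group_name   # assumed variable
  private_dns_zone_name = azurerm_private_dns_zone.azure_monitor.name
  virtual_network_id    = var.virtual_network_id    # assumed variable

  lifecycle {
    create_before_destroy = true
  }
}
```

Terraform documents create_before_destroy as governing the order of operations when a resource is replaced, so it is worth testing whether it changes anything during a plain terraform destroy.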

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Copilot AI changed the title from "Fix workspace deletion race condition in AMPLS private endpoint" to "Fix workspace deletion race condition in AMPLS private endpoint and DNS zones" on Nov 21, 2025
Member

marrobi commented Nov 21, 2025

/test-extended d66e33f

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19571335811 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 21, 2025

/test-extended d66e33f

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19576408604 (with refid d81faf5c)

(in response to this comment from @marrobi)

@github-actions

github-actions bot commented Nov 21, 2025

Unit Test Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit fc12afe.

♻️ This comment has been updated with latest results.

Member

marrobi commented Nov 21, 2025

@copilot please resolve: Code update without version change

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 21, 2025

@copilot please resolve: Code update without version change

Version updates applied in commit 38788f5:

  • Workspace base template: 2.7.1 → 2.7.2
  • Core: 0.16.10 → 0.16.11
  • CHANGELOG.md updated with component version entries

Member

marrobi commented Nov 21, 2025

Member

marrobi commented Nov 26, 2025

/test-extended aa82947

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19710129614 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 26, 2025

@copilot do we need the create_before_destroy = true settings? The dependency added in 21caa6e might have resolved the issue. Can we remove these, and I will run some more tests? We can always add them in a later PR.

Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>
Contributor Author

Copilot AI commented Nov 26, 2025

@copilot do we need the create_before_destroy = true settings? The dependency added in 21caa6e might have resolved...

Removed create_before_destroy = true from all DNS zones, VNet links, and the private endpoint in commit 50d36ae. Only the enhanced depends_on with both AMPLS scoped services remains. Ready for testing.

Copilot AI changed the title from "Fix workspace deletion race condition in AMPLS private endpoint and DNS zones" to "Fix workspace deletion race condition in AMPLS private endpoint" on Nov 26, 2025
Member

marrobi commented Nov 26, 2025

/test-extended 50d36ae

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19716425026 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 26, 2025

/test-extended 50d36ae

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19720597199 (with refid d81faf5c)

(in response to this comment from @marrobi)

Contributor Author

Copilot AI commented Nov 27, 2025

@copilot this failed, see https://[productionresultssa13.blob.core.windows.net/actions-results/7b2fb5e9-a214-4b89-b173-bbdb210d6109/workflow-job-run-557ae3aa-2ec4-538e-9c04-19410108e10e/logs/job/job-l...

This is a separate issue, not related to removing create_before_destroy. The failure is from Virtual Network Links in core/shared DNS zones (privatelink.mysql.database.azure.com, privatelink.azuredatabricks.net, privatelink.database.windows.net, privatelink.api.azureml.ms, privatelink.cert.api.azureml.ms) timing out with context deadline exceeded. These are not the Azure Monitor DNS zones we modified in this PR. This appears to be an Azure API timeout during the delete polling operation.

Member

marrobi commented Nov 27, 2025

/test-extended 50d36ae

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19730219587 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 27, 2025

/test-extended 50d36ae

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19733426740 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Nov 28, 2025

/test-extended 5941585

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/19763804503 (with refid d81faf5c)

(in response to this comment from @marrobi)

Member

marrobi commented Dec 17, 2025

/test-extended 5941585

@github-actions

🤖 pr-bot 🤖

⚠️ Cannot run tests as PR is not mergeable. Ensure that the PR is open and doesn't have any conflicts.

(in response to this comment from @marrobi)

Member

marrobi commented Dec 18, 2025

/test-extended fc12afe

@github-actions

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/20336363140 (with refid d81faf5c)

(in response to this comment from @marrobi)

Development

Successfully merging this pull request may close these issues.

Workspace deletion fails due to AnotherOperationInProgress pe-ampls-*
