Skip to content

feat: Script to migrate data from existing deployment to new deployment#567

Open
NirajC-Microsoft wants to merge 3 commits intodevfrom
psl-data-migration-script
Open

feat: Script to migrate data from existing deployment to new deployment#567
NirajC-Microsoft wants to merge 3 commits intodevfrom
psl-data-migration-script

Conversation

@NirajC-Microsoft
Copy link
Copy Markdown
Contributor

@NirajC-Microsoft NirajC-Microsoft commented Mar 17, 2026

Purpose

Data Migration Tool for DKM Solution Accelerator

Summary

Adds a Python-based data migration script that migrates Azure AI Search, Cosmos DB (MongoDB API), and Azure Blob Storage data between Azure resource groups. This enables teams to move an existing DKM deployment to a new environment without data loss.

What's Included

New Files:

  • migrate.py — Main migration script (~1100 lines)
  • README.md — User-facing documentation with setup, usage, examples, and Azure credential lookup guide
  • requirement.txt

Updated Files:

  • DeploymentGuide.md — Added data migration note after deployment completion step

Features

  • Three-service migration — Azure AI Search (schemas + documents), Cosmos DB (ChatHistory & Documents collections), Azure Blob Storage (all containers with directory structure)
  • Interactive prompts — No config files needed; all endpoints, connection strings, and resource groups collected at runtime
  • RBAC auto-management — Temporarily assigns required Azure roles before migration and revokes them after
  • Content-type preservation — Blob Content-Type metadata exported via sidecar file and restored on import, ensuring documents display correctly in browser
  • Idempotent operations — Safe to re-run without duplicating data (upserts for Cosmos, create-or-replace for Search, overwrite for Blobs)
  • Retry with backoff — All network operations retry up to 3 times
  • Data integrity — SHA-256 checksums for Cosmos DB exports, verified before import
  • Windows long-path support — Handles paths >260 characters via \?\ prefix
  • Service filters — --search-only, --cosmos-only, --blob-only flags for selective migration
  • Execution timing — Total runtime displayed at completion

Usage

  • python migrate.py export # Export from source
  • python migrate.py import # Import into target
  • python migrate.py export-import # Full migration

Testing

  • Tested with a real migration between two Azure resource groups
  • Successfully migrated 2,535 blob files (221.9 MB), Search indexes with documents, and Cosmos DB collections
  • All services exported and imported with exit code 0
  • Content-type preservation verified — documents render correctly in browser post-migration

Does this introduce a breaking change?

  • Yes
  • No

Golden Path Validation

  • I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

  • I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid

  • ...

Other Information

This comment was marked as resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants