A flexible, customizable Python tool for exporting and importing OpenMetadata entities. Supports full backups, selective exports, and cross-instance migrations with clear NDJSON output format.
- OpenMetadata SDK Integration: Uses official OpenMetadata Python SDK for robust API interaction
- Full Export/Import: Backup and restore complete OpenMetadata instances
- Selective Export: Export specific entity types with the `--entities` flag
- Round-Trip Tested: Verified export → import → validation workflow with real data
- Relationship-Aware: Maintains links between domains, data products, and assets
- Flexible Configuration: YAML config with environment variable overrides
- Rich Console Output: Beautiful progress indicators and informative logging
- NDJSON Format: Human-readable, editable export format
- Version Flexible: Configurable OpenMetadata SDK version support (defaults to 1.8.0+)
Option A: Automated Setup (Recommended)

```bash
git clone <repository>
cd omd_migrate
./setup.sh
```

Option B: Manual Installation

```bash
git clone <repository>
cd omd_migrate
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Note: The setup.sh script creates a virtual environment (omd_venv) and installs all dependencies automatically.
The tool uses both config.yaml and .env files for configuration:
Option A: Use .env file (recommended for credentials)

```bash
cp .env.example .env
# Edit .env with your OpenMetadata server details
```

Option B: Edit config.yaml directly

```bash
# Edit config.yaml with your server URL and JWT token
```

Export usage:

```bash
# Export all entities (based on config.yaml settings)
python export.py

# Selective export of specific entity types
python export.py --entities data_products --entities domains

# Clear previous exports before starting
python export.py --clear

# Export to custom directory
python export.py --output-dir /path/to/backup

# Combine options for targeted exports
python export.py --clear --entities data_products --entities domains --output-dir /backup/domains-only
```

Import usage:

```bash
# Import all entities
python import.py

# Import from custom directory
python import.py --input-dir /path/to/backup

# Import specific entity type only
python import.py --entity-type domains

# Dry run (see what would be imported)
python import.py --dry-run
```

Example .env file:

```bash
# Server Configuration
OPENMETADATA_SERVER_URL=http://your-openmetadata-server:8585/api
OPENMETADATA_JWT_TOKEN=your_jwt_token_here

# Export Configuration
EXPORT_OUTPUT_DIR=./exports
EXPORT_BATCH_SIZE=100
EXPORT_INCLUDE_DELETED=false

# Import Configuration
IMPORT_INPUT_DIR=./exports
IMPORT_UPDATE_EXISTING=true
IMPORT_SKIP_ON_ERROR=true

# Logging
LOG_LEVEL=INFO
```

Configure selective exports in config.yaml:
```yaml
export:
  selective:
    # Export specific domains by name
    domains: ["Finance", "Marketing"]
    # Only export data products linked to specified domains
    linked_data_products_only: true
    # Only export assets (tables, topics, etc.) linked to domains/data products
    linked_assets_only: true
```

Supported entities for export (use with the `--entities` flag):
Core Entities:
- `domains` - Business domains and subdomains
- `data_products` - Data products with domain relationships
- `teams` - Teams and users
- `users` - Individual users
- `policies` - Access policies
Knowledge Management:
- `glossaries` - Business glossaries
- `glossary_terms` - Glossary terms
Data Assets:
- `databases` - Database services and databases
- `database_schemas` - Database schemas
- `tables` - Data tables with lineage
Additional Entity Types (available via config.yaml):
- `topics` - Kafka topics and streams
- `dashboards` - BI dashboards
- `charts` - Dashboard charts
- `pipelines` - Data pipelines
- `ml_models` - Machine learning models
- `containers` - Data containers
- `stored_procedures` - Database stored procedures
- `dashboard_data_models` - Dashboard data models
- `search_indexes` - Search indexes
Example usage:
```bash
# Export core entities only
python export.py --entities domains --entities data_products --entities teams

# Export data assets
python export.py --entities databases --entities tables
```

Full cross-instance migration:

```bash
# 1. Export everything from source instance
python export.py --config source-config.yaml --output-dir backup-2024-01-15

# 2. Import to target instance
python import.py --config target-config.yaml --input-dir backup-2024-01-15
```

Use command-line flags for targeted exports:
```bash
# Export only domains and data products
python export.py --clear --entities domains --entities data_products

# Export specific entities to custom location
python export.py --entities users --entities teams --output-dir /backup/identity

# Clear and export tables only
python export.py --clear --entities tables
```

Configure selective export in config.yaml:

```yaml
export:
  selective:
    domains: ["Data Science", "Analytics"]
    linked_data_products_only: true
    linked_assets_only: true
```

Then export and import:

```bash
python export.py  # Exports only Data Science and Analytics domains + linked entities
python import.py --config target-config.yaml
```

Override the server URL per command with environment variables:

```bash
# Export from production
OPENMETADATA_SERVER_URL=https://prod.your-company.com python export.py

# Import to staging
OPENMETADATA_SERVER_URL=https://staging.your-company.com python import.py
```

Exports are saved as NDJSON files (one JSON object per line):
```
exports/
├── domains.ndjson          # Business domains
├── data_products.ndjson    # Data products
├── teams.ndjson            # Teams and users
├── tables.ndjson           # Data tables
├── topics.ndjson           # Kafka topics
└── export_summary.json     # Export metadata
```
Each NDJSON file can be:
- Viewed and edited with any text editor
- Processed with command-line tools (jq, grep, etc.)
- Imported partially or completely
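Because each line is an independent JSON object, an export file can be filtered or hand-trimmed before a partial import. A minimal Python sketch (the file name and `name` field here are illustrative, not the tool's fixed schema):

```python
import json
from pathlib import Path

# Write a tiny sample NDJSON file (one JSON object per line),
# mimicking a domains export
sample = [
    {"name": "Finance", "domainType": "Aggregate"},
    {"name": "Marketing", "domainType": "Consumer-aligned"},
]
path = Path("domains_sample.ndjson")
path.write_text("\n".join(json.dumps(e) for e in sample) + "\n")

# Parse it back one entity per line and keep a subset,
# e.g. to trim a file down before importing it
entities = [json.loads(line) for line in path.read_text().splitlines() if line.strip()]
subset = [e for e in entities if e["name"] == "Finance"]
path.with_suffix(".filtered.ndjson").write_text(
    "\n".join(json.dumps(e) for e in subset) + "\n"
)
print(len(entities), len(subset))  # 2 1
```

The same per-line structure is what makes `jq`, `grep`, and similar tools work directly on the export files.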
Run the test suite:

```bash
pytest test_migration.py -v
```

Test the complete export/import workflow:

```bash
# 1. Export current data products
python export.py --clear --entities data_products

# 2. Verify export succeeded
cat exports/export_summary.json

# 3. Test import functionality
# Note: Import creates new entities, so use carefully in production
python import.py --input-dir exports --entity-type data_products --dry-run

# 4. Validate in OpenMetadata UI
# Check that exported entities maintain all relationships and metadata
```

Authentication issues:
- Verify your JWT token is valid and not expired
- Check that the server URL is correct and accessible
- Ensure you have proper permissions for the entities you're trying to export/import

Export issues:
- Check OpenMetadata server connectivity
- Verify entity types are supported in your OpenMetadata version
- Review export logs for specific entity errors

Import issues:
- Ensure NDJSON files are properly formatted
- Check import order for dependency issues
- Use `--dry-run` to preview imports before execution

Performance:
- Adjust `batch_size` in the configuration for large datasets
- Use selective export for large instances
- Monitor memory usage with the `memory_limit_mb` setting
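When an import fails on malformed input, it helps to check that every line of a suspect file parses as JSON before retrying. A small sketch (the file name and helper are illustrative, not part of the tool):

```python
import json
from pathlib import Path

def find_bad_lines(ndjson_path):
    """Return (line_number, error) pairs for lines that are not valid JSON."""
    bad = []
    for lineno, line in enumerate(Path(ndjson_path).read_text().splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are harmless
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            bad.append((lineno, str(err)))
    return bad

# Example: the second line is truncated JSON
Path("check_sample.ndjson").write_text('{"name": "ok"}\n{"name": "broken"\n')
print(find_bad_lines("check_sample.ndjson"))  # reports line 2 as invalid
```

Running this over each file in the exports directory narrows a formatting failure down to a specific line before re-running the import.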
The project includes a Makefile with useful development commands:
```bash
# Setup and cleanup
make setup          # Run setup.sh to create virtual environment
make clean          # Clean up virtual environment and exports
make clean-exports  # Clean only export files

# Testing
make test           # Run all tests with pytest
make test-verbose   # Run tests with verbose output

# Export shortcuts
make export         # Export all entities
make export-clean   # Clean exports then export all
make export-core    # Export core entities (domains, data_products, teams)

# Import shortcuts
make import         # Import all entities
make import-dry     # Dry run import (preview only)

# Development
make lint           # Run code linting (if configured)
make format         # Format code (if configured)
make help           # Show all available commands
```

Usage examples:

```bash
# Quick setup and test
make setup
make export-core

# Clean slate export
make clean-exports
make export

# Safe import testing
make import-dry
```

Full config.yaml reference:

```yaml
openmetadata:
  server_url: "http://your-openmetadata-server:8585/api"
  auth:
    jwt_token: "your_jwt_token_here"

export:
  output_dir: "./exports"
  selective:
    domains: []
    linked_data_products_only: false
    linked_assets_only: false
  entities:
    domains: true
    data_products: true
    teams: true
    # ... all other entity types
  include_deleted: false
  batch_size: 100

import:
  input_dir: "./exports"
  update_existing: true
  skip_on_error: true
  create_missing_dependencies: true
  import_order:
    - teams
    - users
    - domains
    - data_products
    # ... ordered list for dependency handling

logging:
  level: "INFO"
  console_output: true

advanced:
  request_timeout: 30
  max_retries: 3
  max_workers: 5
```

This project is licensed under the MIT License - see the LICENSE file for details.
This project uses the following open-source packages:
- OpenMetadata SDK: Apache 2.0 License
- Rich: MIT License
- PyYAML: MIT License
- Click: BSD License
- python-dotenv: BSD License
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
For issues and questions:
- Check the troubleshooting section above
- Review OpenMetadata documentation
- Open an issue in this repository