- Clone the repository
- Create a virtual environment: `python -m venv venv`
- Activate: `source venv/bin/activate` (macOS/Linux) or `venv\Scripts\activate` (Windows)
- Install with dev dependencies: `pip install -e ".[dev]"`
  - Note: `pact-python` is installed with base dependencies for API Pact integration.
- Formatting: Black (88 char line length)
- Linting: Ruff
- Type Hints: Full type annotations
- Tests: Pytest with >80% coverage
- Create a feature branch: `git checkout -b feature/your-feature`
- Make changes and add tests
- Run checks: `black src/ tests/`, `ruff check src/ tests/`, `mypy src/`, and `pytest`
- Commit with clear messages
- Push and open a pull request
- Create `src/datapact/validators/your_validator.py`
- Implement a validator class with a `validate()` method returning `(bool, List[str])`
- Add tests in `tests/test_your_validator.py`
- Export from `validators/__init__.py`
- Integrate into `cli.py` (run after `CustomRuleValidator`, convert output to `ErrorRecord` with a new `code`)
Example: See `src/datapact/validators/pii_validator.py` for a reference implementation of a non-blocking validator with contract-declared metadata (`PIIConfig`) and auto-detection logic.
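As a rough illustration of the validator shape described in the steps above, the sketch below shows a class whose `validate()` returns `(bool, List[str])`. The class name `NonEmptyValidator`, its constructor arguments, and the use of plain dictionaries for rows are assumptions made for this example; the real validators under `src/datapact/validators/` may take different inputs.

```python
from typing import Dict, List, Sequence, Tuple


class NonEmptyValidator:
    """Hypothetical validator: flags rows where a required column is empty."""

    def __init__(self, rows: Sequence[Dict[str, object]], column: str) -> None:
        # Assumption: rows are plain dicts; DataPact itself may pass a DataFrame.
        self.rows = rows
        self.column = column

    def validate(self) -> Tuple[bool, List[str]]:
        """Return (passed, messages), following the (bool, List[str]) convention."""
        messages: List[str] = []
        for index, row in enumerate(self.rows):
            value = row.get(self.column)
            if value is None or value == "":
                messages.append(f"row {index}: '{self.column}' is empty")
        return (not messages, messages)


ok, errors = NonEmptyValidator(
    [{"email": "a@example.com"}, {"email": ""}], "email"
).validate()
print(ok, errors)
```

Returning the messages alongside the boolean lets the CLI decide how to convert each one into an error record, rather than having the validator raise on the first failure.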
To add new PII categories or detection patterns:
- New category: add to `VALID_PII_CATEGORIES` in `contracts.py` and add a regex to `_VALUE_PATTERNS` in `pii_validator.py`
- New column-name keywords: add entries to `_NAME_KEYWORDS` in `pii_validator.py`
- Detection threshold: adjust `_MATCH_THRESHOLD` (default 0.20 = 20% of sampled values must match)
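To make the threshold concrete, here is a minimal sketch of how a match ratio could be compared against a 0.20 cutoff. The names `EMAIL_PATTERN`, `MATCH_THRESHOLD`, and `looks_like_pii` are hypothetical stand-ins, and the `>=` comparison is an assumption; the actual logic in `pii_validator.py` may sample and compare differently.

```python
import re
from typing import Iterable

# Hypothetical stand-ins; the real constants live in pii_validator.py.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MATCH_THRESHOLD = 0.20  # mirrors the documented default


def looks_like_pii(
    values: Iterable[object],
    pattern: re.Pattern,
    threshold: float = MATCH_THRESHOLD,
) -> bool:
    """Return True when the share of non-null values matching `pattern` reaches `threshold`."""
    sample = [str(v) for v in values if v is not None]
    if not sample:
        return False  # nothing to inspect, so nothing to flag
    matches = sum(1 for v in sample if pattern.match(v))
    return matches / len(sample) >= threshold


print(looks_like_pii(["a@b.com", "x", "y", "z", "w"], EMAIL_PATTERN))
```

Lowering the threshold catches columns with sparse PII at the cost of more false positives; raising it does the opposite.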
DataPact supports multiple contract formats through a provider abstraction. To add a new provider:
- Create `src/datapact/providers/your_format_provider.py` implementing the `ContractProvider` interface:

  ```python
  from datapact.providers import ContractProvider
  from datapact.contracts import Contract


  class YourFormatProvider(ContractProvider):
      def can_load(self, file_path: str) -> bool:
          """Return True if this provider can load the file."""
          return file_path.endswith('.your_format')

      def load(self, file_path: str, *args, **kwargs) -> Contract:
          """Load contract and return Contract object."""
          # Parse your format → Contract dataclass
          pass
  ```

- Export from `src/datapact/providers/__init__.py`: `from datapact.providers.your_format_provider import YourFormatProvider`
- Register in provider dispatch (auto-discoverable via `can_load()`)
- Add fixtures in `tests/fixtures/your_format_sample.*`
- Add tests in `tests/test_contract_providers.py`:
  - Test `can_load()` detection
  - Test `load()` with valid and invalid inputs
  - Test field inference (type mapping)
  - Test error handling
- Update `README.md` and `docs/EXAMPLES.md` with usage examples
Example: See `src/datapact/providers/odcs_provider.py` and `pact_provider.py` for reference implementations.
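The `can_load()`-based discovery mentioned above can be pictured with the following self-contained sketch. `Contract` and `ContractProvider` here are simplified stand-ins defined only for this example, and `pick_provider` is a hypothetical helper; DataPact's own dispatch code may be organized differently.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Contract:
    """Stand-in for datapact.contracts.Contract (the real fields differ)."""
    name: str


class ContractProvider(ABC):
    """Stand-in for the ContractProvider interface."""

    @abstractmethod
    def can_load(self, file_path: str) -> bool: ...

    @abstractmethod
    def load(self, file_path: str) -> Contract: ...


class YourFormatProvider(ContractProvider):
    def can_load(self, file_path: str) -> bool:
        return file_path.endswith(".your_format")

    def load(self, file_path: str) -> Contract:
        # A real provider would parse the file; this one only records the path.
        return Contract(name=file_path)


def pick_provider(file_path: str, providers: List[ContractProvider]) -> ContractProvider:
    """Return the first provider whose can_load() accepts the file (hypothetical helper)."""
    for provider in providers:
        if provider.can_load(file_path):
            return provider
    raise ValueError(f"no provider can load {file_path!r}")


provider = pick_provider("orders.your_format", [YourFormatProvider()])
print(provider.load("orders.your_format"))
```

Because the first matching provider wins in this sketch, the order of the provider list matters whenever two providers accept the same extension.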
- Update `DataSource._detect_format()` with the new file extension
- Implement loading in `DataSource.load()`
- Add a test fixture and test case
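Extension-based detection of the kind these steps describe might look roughly like the sketch below. The mapping `EXTENSION_TO_FORMAT`, its entries, and the function `detect_format` are assumptions for illustration only; they are not the contents of `DataSource._detect_format()`.

```python
from pathlib import Path

# Hypothetical mapping; the formats DataSource really supports are not listed here.
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".parquet": "parquet",
    ".your_ext": "your_format",  # the kind of entry a contributor would add
}


def detect_format(file_path: str) -> str:
    """Map a file extension to a format name, ignoring case."""
    suffix = Path(file_path).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"unsupported file extension: {suffix!r}") from None


print(detect_format("data/Orders.CSV"))
```

Keeping detection in a single lookup table means a new format needs one new entry plus a matching loader branch, which is also what the test case should exercise.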
- Update `DatabaseSource._connect()` in `src/datapact/datasource.py`
- Wire CLI flags in `src/datapact/cli.py`
- Add tests in `tests/test_db_source.py`
- Rule severities can be specified per rule (WARN/ERROR) in YAML.
- CLI overrides are supported via `--severity-override field.rule=warn`.
- Profiling uses `datapact profile` and `profile_dataframe()` for rule baselines.
- Schema drift policy is configured in `schema.extra_columns.severity`.
- SLA checks are configured in `sla.min_rows` and `sla.max_rows`.
- Chunked validation is available via `--chunksize` with optional sampling.
- Custom rule plugins are configured via `rules.custom` and `custom_rules`.
- Report sinks are configured via `--report-sink` and webhook options.
- Policy packs are configured via `policies` entries in contracts.
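For contributors touching the override handling, the sketch below shows one way a `field.rule=warn` argument could be parsed into a lookup table. The function `parse_severity_overrides` and the validation rules it applies are hypothetical; only the `field.rule=severity` shape and the WARN/ERROR levels come from the notes above.

```python
from typing import Dict, Iterable

ALLOWED_SEVERITIES = {"WARN", "ERROR"}


def parse_severity_overrides(items: Iterable[str]) -> Dict[str, str]:
    """Parse 'field.rule=severity' strings into {'field.rule': 'SEVERITY'}."""
    overrides: Dict[str, str] = {}
    for item in items:
        key, sep, value = item.partition("=")
        severity = value.strip().upper()
        # Reject entries without '=', without a 'field.rule' key, or with an unknown level.
        if not sep or "." not in key or severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"invalid severity override: {item!r}")
        overrides[key.strip()] = severity
    return overrides


print(parse_severity_overrides(["email.regex=warn"]))
```

Normalizing the severity to upper case keeps CLI input such as `warn` consistent with the `WARN`/`ERROR` spelling used in YAML.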
- Add version info to `VERSION_REGISTRY` in `src/datapact/versioning.py`
- Implement the migration path in the `VersionMigration._migrate_step()` method
- Update the `TOOL_COMPATIBILITY` matrix if needed
- Add test fixtures in `tests/fixtures/`
- Add test cases in `tests/test_versioning.py`
- Update `docs/VERSIONING.md` with:
  - Version release notes
  - Breaking changes list
  - Migration guide
  - Examples
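The idea of a step-wise migration path can be sketched as follows. The registry `MIGRATIONS`, the step `_v1_to_v2`, the version numbers, and the `columns` to `fields` rename are all invented for illustration; they do not describe the real `VERSION_REGISTRY` or `VersionMigration._migrate_step()`.

```python
from typing import Callable, Dict, Tuple

ContractDict = Dict[str, object]
MigrationStep = Callable[[ContractDict], ContractDict]


def _v1_to_v2(contract: ContractDict) -> ContractDict:
    # Hypothetical change: version 2 renames 'columns' to 'fields'.
    migrated = dict(contract)
    migrated["fields"] = migrated.pop("columns", [])
    migrated["version"] = "2.0.0"
    return migrated


# Hypothetical registry: current version -> (next version, step function).
MIGRATIONS: Dict[str, Tuple[str, MigrationStep]] = {
    "1.0.0": ("2.0.0", _v1_to_v2),
}


def migrate(contract: ContractDict, target: str) -> ContractDict:
    """Apply registered steps one at a time until the contract reaches `target`."""
    current = dict(contract)
    while current["version"] != target:
        version = str(current["version"])
        if version not in MIGRATIONS:
            raise ValueError(f"no migration path from {version} to {target}")
        _, step = MIGRATIONS[version]
        current = step(current)
    return current


print(migrate({"version": "1.0.0", "columns": ["id"]}, "2.0.0"))
```

Chaining single-version steps means each new release only needs one new step function, and older contracts reach the latest version by passing through every step in between.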
- Update `src/datapact/odcs_contracts.py` with new ODCS fields or mappings
- Add fixtures under `tests/fixtures/` (use the `.odcs.yaml` extension when possible)
- Add tests in `tests/test_odcs_contract.py`
- Update README/QUICKSTART to document new ODCS mappings and CLI flags
```bash
# All tests
pytest

# Enable MySQL-backed DB source tests
export DATAPACT_MYSQL_TESTS=1
export DATAPACT_MYSQL_PASSWORD=<your-mysql-password>
export DATAPACT_MYSQL_HOST=127.0.0.1
export DATAPACT_MYSQL_PORT=3306
export DATAPACT_MYSQL_USER=root
export DATAPACT_MYSQL_DB=datapact_test
export DATAPACT_MYSQL_TABLE=customers
pytest tests/test_db_source.py -v

# Specific test file
pytest tests/test_validator.py -v
pytest tests/test_versioning.py -v

# With coverage
pytest --cov=src/datapact
```

- Contracts use semantic versioning (major.minor.patch)
- Tool version tracks compatibility (currently 0.2.0)
- Always maintain backward compatibility with auto-migration
- Document breaking changes clearly in VERSIONING.md
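As a small illustration of the major.minor.patch convention, the sketch below parses version strings into integer tuples and flags a major-version bump. Both `parse_version` and `is_breaking_change` are hypothetical helpers written for this example, not functions from `src/datapact/versioning.py`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'major.minor.patch' into ints so versions compare numerically."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected major.minor.patch, got {version!r}")
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def is_breaking_change(old: str, new: str) -> bool:
    """Under semantic versioning, a higher major number signals a breaking change."""
    return parse_version(new)[0] > parse_version(old)[0]


print(parse_version("1.10.0") > parse_version("1.9.3"))
print(is_breaking_change("1.4.2", "2.0.0"))
```

Comparing integer tuples avoids the string-comparison trap where `"1.10.0"` would sort before `"1.9.3"`.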