3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

### Added
- Added `ci` command for CI/CD-optimized test runs: multi-file support, GitHub Actions annotations and step summary, Azure DevOps annotations, `--fail-on` flag, `--json` output

### Fixed
- Fix SQL export generating multiple PRIMARY KEY constraints for composite keys (#1026)
- Preserve parametrized physicalTypes for SQL export (#1086)
152 changes: 149 additions & 3 deletions README.md
@@ -261,6 +261,7 @@ Commands
- [init](#init)
- [lint](#lint)
- [test](#test)
- [ci](#ci)
- [export](#export)
- [import](#import)
- [catalog](#catalog)
@@ -374,6 +375,8 @@ Data Contract CLI connects to a data source and runs schema and quality tests to
$ datacontract test --server production datacontract.yaml
```

For CI/CD pipelines, see [`ci`](#ci).

To connect to the databases the `server` block in the datacontract.yaml is used to set up the connection.
In addition, credentials, such as username and passwords, may be defined with environment variables.

@@ -1066,6 +1069,148 @@ models:
```


### ci
```

 Usage: datacontract ci [OPTIONS] [LOCATIONS]...

 Run tests for CI/CD pipelines. Emits GitHub Actions annotations and step
 summary.

╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   locations      [LOCATIONS]...  The location(s) (url or path) of the data   │
│                                  contract yaml file(s).                      │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --schema                                 TEXT          The location (url     │
│                                                        or path) of the       │
│                                                        ODCS JSON Schema      │
│ --server                                 TEXT          The server            │
│                                                        configuration to      │
│                                                        run the schema and    │
│                                                        quality tests. Use    │
│                                                        the key of the        │
│                                                        server object in      │
│                                                        the data contract     │
│                                                        yaml file to refer    │
│                                                        to a server, e.g.,    │
│                                                        `production`, or      │
│                                                        `all` for all         │
│                                                        servers (default).    │
│                                                        [default: all]        │
│ --publish                                TEXT          The url to publish    │
│                                                        the results after     │
│                                                        the test.             │
│ --output                                 PATH          Specify the file      │
│                                                        path where the test   │
│                                                        results should be     │
│                                                        written to (e.g.,     │
│                                                        './test-results/TE…   │
│ --output-format                          [json|junit]  The target format     │
│                                                        for the test          │
│                                                        results.              │
│ --logs               --no-logs                         Print logs            │
│                                                        [default: no-logs]    │
│ --json               --no-json                         Print test results    │
│                                                        as JSON to stdout.    │
│                                                        [default: no-json]    │
│ --fail-on                                TEXT          Minimum severity      │
│                                                        that causes a         │
│                                                        non-zero exit code:   │
│                                                        'warning', 'error',   │
│                                                        or 'never'.           │
│                                                        [default: error]      │
│ --ssl-verification   --no-ssl-verific…                 SSL verification      │
│                                                        when publishing the   │
│                                                        data contract.        │
│                                                        [default:             │
│                                                        ssl-verification]     │
│ --debug              --no-debug                        Enable debug          │
│                                                        logging               │
│ --help                                                 Show this message     │
│                                                        and exit.             │
╰──────────────────────────────────────────────────────────────────────────────╯

```

The `ci` command wraps [`test`](#test) with CI/CD-specific features:

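- **Multiple contracts**: Test several contracts in one run, e.g. `datacontract ci contracts/*.yaml`. Without a location, `datacontract.yaml` is used.
- **CI annotations**: Inline annotations for failed checks (GitHub Actions and Azure DevOps)
- **Markdown summary**: Step summary of the test results (GitHub Actions)
- **`--json`**: Print test results as JSON to stdout for machine-readable output. Human-readable output is written to stderr in this mode.
- **`--fail-on`**: Control the minimum severity that causes a non-zero exit code. Default is `error`; set to `warning` to also fail on warnings, or `never` to always exit 0.
- **`--output`**: Only supported with a single contract. With multiple contracts the command exits with an error, because the results would overwrite each other.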

See the [test command](#test) for supported server types and their configuration.
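
The `--fail-on` rule can be read as a lookup from threshold to the set of run results that trigger a non-zero exit. The following is a minimal sketch of that rule, not the CLI's own code path: `exit_code_for` is a hypothetical helper, and `"passed"` is an assumed label for a successful run.

```python
# Sketch of the --fail-on rule: each threshold maps to the run results
# that should make the command exit with a non-zero code.
FAIL_RESULTS = {
    "warning": {"warning", "failed", "error"},
    "error": {"failed", "error"},
    "never": set(),
}


def exit_code_for(run_results, fail_on="error"):
    """Return 1 if any run result reaches the threshold, else 0 (hypothetical helper)."""
    failing = FAIL_RESULTS[fail_on.lower()]
    return 1 if any(result in failing for result in run_results) else 0


if __name__ == "__main__":
    # "passed" is an assumed label for a successful run.
    print(exit_code_for(["passed", "warning"], fail_on="error"))    # 0
    print(exit_code_for(["passed", "warning"], fail_on="warning"))  # 1
    print(exit_code_for(["failed"], fail_on="never"))               # 0
```

One failing contract is enough to fail the whole run, so the exit code reflects the worst result across all tested contracts.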

```bash
# Single contract
$ datacontract ci datacontract.yaml

# Multiple contracts
$ datacontract ci contracts/*.yaml

# Fail on warnings too
$ datacontract ci --fail-on warning datacontract.yaml

# JSON output for scripting
$ datacontract ci --json datacontract.yaml
```
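
Because `--json` prints the results to stdout and sends the human-readable output to stderr, a script can parse stdout directly. The sketch below shows one way to do that from Python. `run_and_parse_json` is a hypothetical helper, and nothing is assumed about the structure of the JSON document beyond it being valid JSON.

```python
import json
import subprocess


def run_and_parse_json(argv):
    """Run a command and return (exit_code, parsed stdout JSON).

    stderr is not captured, so human-readable output still reaches the CI log.
    check=False because a non-zero exit code is an expected outcome here.
    """
    completed = subprocess.run(argv, stdout=subprocess.PIPE, text=True, check=False)
    return completed.returncode, json.loads(completed.stdout)
```

With the CLI installed, a call such as `run_and_parse_json(["datacontract", "ci", "--json", "datacontract.yaml"])` would return the exit code together with the parsed results. If the command prints nothing to stdout, `json.loads` raises `json.JSONDecodeError`, which a caller may want to handle.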

<details>
<summary>GitHub Actions workflow example</summary>

```yaml
# .github/workflows/datacontract.yml
name: Data Contract CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  datacontract-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install datacontract-cli
      # Test one or more data contracts (supports globs, e.g. contracts/*.yaml)
      - run: datacontract ci datacontract.yaml
```

</details>
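
For background on the step summary: GitHub Actions exposes the path of a per-step summary file in the `GITHUB_STEP_SUMMARY` environment variable, and Markdown appended to that file is rendered on the workflow run page. The sketch below illustrates the mechanism only. `append_step_summary` and its table layout are hypothetical and do not reproduce the summary that `ci` writes.

```python
import os


def append_step_summary(rows, env=None):
    """Append a Markdown table of (contract, result) rows to the step summary.

    Returns False without writing anything when GITHUB_STEP_SUMMARY is not set,
    for example when running outside GitHub Actions.
    """
    env = os.environ if env is None else env
    summary_path = env.get("GITHUB_STEP_SUMMARY")
    if not summary_path:
        return False
    lines = ["| Contract | Result |", "| --- | --- |"]
    lines += [f"| {contract} | {result} |" for contract, result in rows]
    # Append rather than overwrite: other steps may have written to the file.
    with open(summary_path, "a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return True
```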

<details>
<summary>Azure DevOps pipeline example</summary>

```yaml
# azure-pipelines.yml
trigger:
  branches:
    include:
      - main

pool:
  vmImage: "ubuntu-latest"

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: "3.11"
  - script: pip install datacontract-cli
    displayName: "Install datacontract-cli"
  # Test one or more data contracts (supports globs, e.g. contracts/*.yaml)
  - script: datacontract ci datacontract.yaml
    displayName: "Run data contract tests"
```

</details>
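
Both CI systems pick up annotations from specially formatted log lines: GitHub Actions uses workflow commands such as `::error file=...::message`, and Azure Pipelines uses logging commands such as `##vso[task.logissue type=error]message`. The sketch below shows what such lines look like and how a tool might choose between them. `format_annotation` is a hypothetical helper, not the function used by `ci`, and it leaves out the escaping of special characters that real messages may need.

```python
import os


def format_annotation(level, message, file=None, env=None):
    """Format one annotation line for the detected CI system (hypothetical helper).

    level is "error" or "warning". Outside a known CI system a plain line is returned.
    """
    env = os.environ if env is None else env
    if env.get("GITHUB_ACTIONS") == "true":
        # GitHub Actions workflow command
        location = f" file={file}" if file else ""
        return f"::{level}{location}::{message}"
    if env.get("TF_BUILD", "").lower() == "true":
        # Azure Pipelines logging command
        location = f";sourcepath={file}" if file else ""
        return f"##vso[task.logissue type={level}{location}]{message}"
    return f"{level.upper()}: {message}"
```

Printing the returned line to the job log is enough for the CI system to pick it up. No API call is involved.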


### export
```

@@ -1881,10 +2026,11 @@ Create a data contract based on the actual data. This is the fastest way to get
$ datacontract lint
```

-4. Set up a CI pipeline that executes daily for continuous quality checks. You can also report the
-test results to tools like [Data Mesh Manager](https://datamesh-manager.com)
+4. Set up a CI pipeline that executes daily for continuous quality checks. Use the [`ci`](#ci) command for
+CI-optimized output (GitHub Actions annotations and step summary, Azure DevOps annotations).
+You can also report the test results to tools like [Data Mesh Manager](https://datamesh-manager.com).
```bash
-$ datacontract test --publish https://api.datamesh-manager.com/api/test-results
+$ datacontract ci --publish https://api.datamesh-manager.com/api/test-results
```

### Contract-First
106 changes: 103 additions & 3 deletions datacontract/cli.py
@@ -5,6 +5,7 @@
from pathlib import Path
from typing import Iterable, List, Optional

import click
import typer
from click import Context
from rich.console import Console
@@ -20,6 +21,7 @@
)
from datacontract.lint.resolve import resolve_data_contract, resolve_data_contract_dict
from datacontract.model.exceptions import DataContractException
from datacontract.output.ci_output import write_ci_output, write_ci_summary, write_json_results
from datacontract.output.output_format import OutputFormat
from datacontract.output.test_results_writer import write_test_result

@@ -187,6 +189,102 @@ def test(
write_test_result(run, console, output_format, output, data_contract)


@app.command(name="ci")
def ci(
    locations: Annotated[
        Optional[list[str]],
        typer.Argument(help="The location(s) (url or path) of the data contract yaml file(s)."),
    ] = None,
    schema: Annotated[
        str,
        typer.Option(help="The location (url or path) of the ODCS JSON Schema"),
    ] = None,
    server: Annotated[
        str,
        typer.Option(
            help="The server configuration to run the schema and quality tests. "
            "Use the key of the server object in the data contract yaml file "
            "to refer to a server, e.g., `production`, or `all` for all "
            "servers (default)."
        ),
    ] = "all",
    publish: Annotated[str, typer.Option(help="The url to publish the results after the test.")] = None,
    output: Annotated[
        Path,
        typer.Option(
            help="Specify the file path where the test results should be written to (e.g., './test-results/TEST-datacontract.xml')."
        ),
    ] = None,
    output_format: Annotated[OutputFormat, typer.Option(help="The target format for the test results.")] = None,
    logs: Annotated[bool, typer.Option(help="Print logs")] = False,
    json_output: Annotated[bool, typer.Option("--json", help="Print test results as JSON to stdout.")] = False,
    fail_on: Annotated[
        str,
        typer.Option(
            click_type=click.Choice(["warning", "error", "never"], case_sensitive=False),
            help="Minimum severity that causes a non-zero exit code.",
        ),
    ] = "error",
    ssl_verification: Annotated[
        bool,
        typer.Option(help="SSL verification when publishing the data contract."),
    ] = True,
    debug: debug_option = None,
):
    """
    Run tests for CI/CD pipelines. Emits GitHub Actions annotations and step summary.
    """
    enable_debug_logging(debug)

    if not locations:
        locations = ["datacontract.yaml"]

    if output and len(locations) > 1:
        console.print("Error: --output cannot be used with multiple contracts (results would overwrite each other).")
        raise typer.Exit(code=1)

    if server == "all":
        server = None

    # Plain text output for CI logs; --json sends human output to stderr.
    out = Console(stderr=True, no_color=True) if json_output else Console(no_color=True)

    results = []
    fail_results = {
        "warning": {"warning", "failed", "error"},
        "error": {"failed", "error"},
        "never": set(),
    }
    should_fail = False

    for location in locations:
        out.print(f"Testing {location}")
        run = DataContract(
            data_contract_file=location,
            schema_location=schema,
            publish_url=publish,
            server=server,
            ssl_verification=ssl_verification,
        ).test()
        if logs:
            _print_logs(run, out)
        results.append((location, run))
        write_ci_output(run, location, json_mode=json_output)
        try:
            write_test_result(run, out, output_format, output)
        except typer.Exit:
            pass
        if run.result in fail_results[fail_on]:
            should_fail = True

    write_ci_summary(results)
    if json_output:
        write_json_results(results)

    if should_fail:
        raise typer.Exit(code=1)


@app.command(name="export")
def export(
format: Annotated[ExportFormat, typer.Option(help="The export format.")],
@@ -508,10 +606,12 @@ def api(
uvicorn.run(**uvicorn_args)


-def _print_logs(run):
-    console.print("\nLogs:")
+def _print_logs(run, out=None):
+    if out is None:
+        out = console
+    out.print("\nLogs:")
     for log in run.logs:
-        console.print(log.timestamp.strftime("%y-%m-%d %H:%M:%S"), log.level.ljust(5), log.message)
+        out.print(log.timestamp.strftime("%y-%m-%d %H:%M:%S"), log.level.ljust(5), log.message)


if __name__ == "__main__":