Lading in Python #1907
Open: StephenWakely wants to merge 5 commits into main from stephen/lading-py
Commits (all by StephenWakely):
b50aeeb feat(lading-py): Python port of lading with DogStatsD emission via do…
8a995ae Ensure rust isnt being used
5fc947b fix(lading-py): match Rust lading CLI interface
be339ac fix(lading-py): handle directory config path like Rust lading
d38228d fix(lading-py): add backports.zstd for aiohttp zstd content-encoding
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,90 +1,15 @@ | ||
| # Update the rust version in-sync with the version in rust-toolchain.toml | ||
|
|
||
| # Stage 0: Planner - Extract dependency metadata | ||
| FROM docker.io/rust:1.90.0-slim-bookworm AS planner | ||
| WORKDIR /app | ||
| RUN cargo install cargo-chef --version 0.1.73 | ||
| COPY . . | ||
| RUN cargo chef prepare --recipe-path recipe.json | ||
|
|
||
| # Stage 1: Cacher - Build dependencies only | ||
| FROM docker.io/rust:1.90.0-slim-bookworm AS cacher | ||
| ARG SCCACHE_BUCKET | ||
| ARG SCCACHE_REGION | ||
| ARG AWS_ACCESS_KEY_ID | ||
| ARG AWS_SECRET_ACCESS_KEY | ||
| ARG AWS_SESSION_TOKEN | ||
| ENV CARGO_INCREMENTAL=0 | ||
| # Stage 0: Build — install dependencies into a venv | ||
| FROM docker.io/python:3.12-slim-bookworm AS builder | ||
| WORKDIR /app | ||
| RUN apt-get update && apt-get install -y \ | ||
| pkg-config=1.8.1-1 \ | ||
| libssl-dev=3.0.18-1~deb12u2 \ | ||
| protobuf-compiler=3.21.12-3 \ | ||
| fuse3=3.14.0-4 \ | ||
| libfuse3-dev=3.14.0-4 \ | ||
| curl \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
| # Download pre-built sccache binary | ||
| RUN case "$(uname -m)" in \ | ||
| x86_64) ARCH=x86_64-unknown-linux-musl ;; \ | ||
| aarch64) ARCH=aarch64-unknown-linux-musl ;; \ | ||
| *) echo "Unsupported architecture" && exit 1 ;; \ | ||
| esac && \ | ||
| curl -L https://github.com/mozilla/sccache/releases/download/v0.8.2/sccache-v0.8.2-${ARCH}.tar.gz | tar xz && \ | ||
| mv sccache-v0.8.2-${ARCH}/sccache /usr/local/cargo/bin/ && \ | ||
| rm -rf sccache-v0.8.2-${ARCH} | ||
| RUN cargo install cargo-chef --version 0.1.73 | ||
| COPY --from=planner /app/recipe.json recipe.json | ||
| # This layer is cached until Cargo.toml/Cargo.lock change | ||
| # Use BuildKit secrets to pass AWS credentials securely (not exposed in image metadata) | ||
| RUN --mount=type=secret,id=aws_access_key_id \ | ||
| --mount=type=secret,id=aws_secret_access_key \ | ||
| --mount=type=secret,id=aws_session_token \ | ||
| export AWS_ACCESS_KEY_ID=$(cat /run/secrets/aws_access_key_id) && \ | ||
| export AWS_SECRET_ACCESS_KEY=$(cat /run/secrets/aws_secret_access_key) && \ | ||
| export AWS_SESSION_TOKEN=$(cat /run/secrets/aws_session_token) && \ | ||
| if [ -n "${SCCACHE_BUCKET:-}" ]; then export RUSTC_WRAPPER=sccache; fi && \ | ||
| cargo chef cook --release --locked --features logrotate_fs --recipe-path recipe.json | ||
| RUN pip install --upgrade pip | ||
| COPY lading_py/ lading_py/ | ||
| RUN pip install --prefix=/install lading_py/ | ||
|
|
||
| # Stage 2: Builder - Build source code | ||
| FROM docker.io/rust:1.90.0-slim-bookworm AS builder | ||
| ARG SCCACHE_BUCKET | ||
| ARG SCCACHE_REGION | ||
| ENV CARGO_INCREMENTAL=0 | ||
| ENV SCCACHE_BUCKET=${SCCACHE_BUCKET} | ||
| ENV SCCACHE_REGION=${SCCACHE_REGION} | ||
| WORKDIR /app | ||
| RUN apt-get update && apt-get install -y \ | ||
| pkg-config=1.8.1-1 \ | ||
| libssl-dev=3.0.18-1~deb12u2 \ | ||
| protobuf-compiler=3.21.12-3 \ | ||
| fuse3=3.14.0-4 \ | ||
| libfuse3-dev=3.14.0-4 \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
| # Copy cached dependencies and sccache from cacher | ||
| COPY --from=cacher /app/target target | ||
| COPY --from=cacher /usr/local/cargo /usr/local/cargo | ||
| # Copy source code (frequently changes) | ||
| COPY . . | ||
| # Build binary - reuses cached dependencies + sccache | ||
| # Use BuildKit secrets to pass AWS credentials securely (not exposed in image metadata) | ||
| RUN --mount=type=secret,id=aws_access_key_id \ | ||
| --mount=type=secret,id=aws_secret_access_key \ | ||
| --mount=type=secret,id=aws_session_token \ | ||
| export AWS_ACCESS_KEY_ID=$(cat /run/secrets/aws_access_key_id) && \ | ||
| export AWS_SECRET_ACCESS_KEY=$(cat /run/secrets/aws_secret_access_key) && \ | ||
| export AWS_SESSION_TOKEN=$(cat /run/secrets/aws_session_token) && \ | ||
| if [ -n "${SCCACHE_BUCKET:-}" ]; then export RUSTC_WRAPPER=sccache; fi && \ | ||
| cargo build --release --locked --bin lading --features logrotate_fs | ||
|
|
||
| # Stage 3: Runtime | ||
| FROM docker.io/debian:bookworm-20241202-slim | ||
| RUN apt-get update && apt-get install -y \ | ||
| libfuse3-dev=3.14.0-4 \ | ||
| fuse3=3.14.0-4 \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
| COPY --from=builder /app/target/release/lading /usr/bin/lading | ||
| # Stage 1: Runtime | ||
| FROM docker.io/python:3.12-slim-bookworm | ||
| COPY --from=builder /install /usr/local | ||
|
|
||
| # Smoke test | ||
| RUN ["/usr/bin/lading", "--help"] | ||
| ENTRYPOINT ["/usr/bin/lading"] | ||
| RUN lading-py --help | ||
|
|
||
| ENTRYPOINT ["lading-py"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,254 @@ | ||
| # lading-py | ||
|
|
||
| A Python port of [lading](https://github.com/datadog/lading) focused on DogStatsD | ||
| load generation. Uses the [dogstatsd-py](https://github.com/DataDog/datadogpy) | ||
| library for all metric emission, making it suitable for testing the client library | ||
| itself under realistic load. | ||
|
|
||
| All other lading capabilities are preserved: Prometheus and expvar telemetry | ||
| collection from a running Datadog Agent, JSONL/Parquet capture output, and a | ||
| passive Prometheus exporter for real-time scraping. | ||
|
|
||
| ## Requirements | ||
|
|
||
| - Python 3.10+ | ||
| - A Unix domain socket to send DogStatsD traffic to (typically the Datadog Agent's | ||
| `/tmp/dsd.socket`, or the path set in `DD_DOGSTATSD_SOCKET`) | ||
|
|
||
| ## Installation | ||
|
|
||
| ```bash | ||
| pip install -e /path/to/lading_py | ||
| ``` | ||
|
|
||
| Or, from inside the `lading_py` directory: | ||
|
|
||
| ```bash | ||
| cd lading_py | ||
| pip install -e . | ||
| ``` | ||
|
|
||
| This installs the `lading-py` command. | ||
|
|
||
| ## Configuration | ||
|
|
||
| lading-py uses the same YAML config format as the Rust lading binary. A minimal | ||
| config that sends DogStatsD metrics and writes a JSONL capture file: | ||
|
|
||
| ```yaml | ||
| generator: | ||
| - unix_datagram: | ||
| seed: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, | ||
| 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131] | ||
| path: "/tmp/dsd.socket" | ||
| bytes_per_second: "1 MiB" | ||
| parallel_connections: 1 | ||
| variant: | ||
| dogstatsd: | ||
| contexts: | ||
| inclusive: | ||
| min: 50 | ||
| max: 50 | ||
| tags_per_msg: | ||
| inclusive: | ||
| min: 3 | ||
| max: 3 | ||
| kind_weights: | ||
| metric: 90 | ||
| event: 5 | ||
| service_check: 5 | ||
| metric_weights: | ||
| count: 1 | ||
| gauge: 1 | ||
| distribution: 3 | ||
| timer: 1 | ||
| set: 0 | ||
| histogram: 0 | ||
| metric_names: | ||
| - myapp.requests{{0-9}} | ||
| tag_names: | ||
| - env | ||
| - service | ||
| - version | ||
| tag_values: | ||
| - prod{{0-2}} | ||
|
|
||
| telemetry: | ||
| path: "/tmp/lading-output.jsonl" | ||
|
|
||
| warmup_duration_secs: 5 | ||
| experiment_duration_secs: 60 | ||
| ``` | ||
|
|
||
| ### Config reference | ||
|
|
||
| #### `generator[].unix_datagram` | ||
|
|
||
| | Field | Type | Default | Description | | ||
| |-------|------|---------|-------------| | ||
| | `seed` | list[int] (32 bytes) | required | RNG seed for deterministic payload generation | | ||
| | `path` | string | required | Unix domain socket path | | ||
| | `bytes_per_second` | string | `"1 MiB"` | Rate limit. Accepts human-readable sizes: `"500 KiB"`, `"4 MiB"`, `"1 GiB"` | | ||
| | `parallel_connections` | int | `1` | Number of concurrent sender threads | | ||
| | `variant.dogstatsd` | object | | DogStatsD payload config (see below) | | ||
|
|
||
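The `bytes_per_second` field takes human-readable sizes such as `"500 KiB"` or `"1 GiB"`. As a rough illustration of what those strings mean in bytes, here is a minimal sketch of a parser. The function name `parse_size` and the unit table are hypothetical and are not taken from lading-py; the sketch assumes binary (IEC) units only.

```python
import re

# Binary (IEC) multipliers. This unit list is an assumption for illustration.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a string such as "500 KiB" into a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(float(number) * _UNITS[unit])


print(parse_size("1 MiB"))
```

Under this reading, the default `"1 MiB"` corresponds to 1,048,576 bytes per second.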
| #### `variant.dogstatsd` | ||
|
|
||
| | Field | Type | Default | Description | | ||
| |-------|------|---------|-------------| | ||
| | `contexts` | ConfRange | `{inclusive: {min: 50, max: 50}}` | Number of unique metric contexts (name + tag set) to pre-generate | | ||
| | `tags_per_msg` | ConfRange | `{inclusive: {min: 3, max: 3}}` | Tags per metric | | ||
| | `multivalue_count` | ConfRange | `{inclusive: {min: 2, max: 32}}` | Messages per batch when multi-value packing fires | | ||
| | `multivalue_pack_probability` | float | `0.08` | Probability of packing multiple metrics into one datagram | | ||
| | `kind_weights` | object | `{metric: 90, event: 0, service_check: 0}` | Relative weight of each DogStatsD message kind | | ||
| | `metric_weights` | object | `{distribution: 5, ...rest 0}` | Relative weight of each metric type | | ||
| | `metric_names` | list[string] | `["metric{{0-9}}"]` | Metric name templates. `{{0-9}}` expands to 10 variants | | ||
| | `tag_names` | list[string] | `["tag1","tag2","tag3"]` | Tag name templates | | ||
| | `tag_values` | list[string] | `["value{{0-9}}"]` | Tag value templates | | ||
| | `sampling_range` | ConfRange | `{inclusive: {min: 0.1, max: 1.0}}` | Range for sample rate values | | ||
| | `sampling_probability` | float | `0.5` | Probability that a metric includes a sample rate | | ||
| | `length_prefix_framed` | bool | `false` | **Unsupported** — lading-py will reject configs with this set to `true` | | ||
|
|
||
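The table above says that `{{0-9}}` in a name template expands to 10 variants. One plausible reading is an inclusive numeric range substituted in place. The sketch below illustrates that reading only; `expand_template` is a hypothetical helper, and lading-py's real expansion rules may differ.

```python
import re

# Matches a placeholder of the form {{low-high}}, e.g. {{0-9}}.
_RANGE = re.compile(r"\{\{(\d+)-(\d+)\}\}")


def expand_template(template: str) -> list[str]:
    """Expand every {{low-high}} placeholder into its inclusive range."""
    match = _RANGE.search(template)
    if match is None:
        return [template]  # nothing to expand
    low, high = int(match.group(1)), int(match.group(2))
    head, tail = template[: match.start()], template[match.end():]
    # Recurse on the tail so that several placeholders multiply out.
    return [
        f"{head}{n}{rest}"
        for n in range(low, high + 1)
        for rest in expand_template(tail)
    ]


print(expand_template("prod{{0-2}}"))
```

With this interpretation, `myapp.requests{{0-9}}` from the example config yields `myapp.requests0` through `myapp.requests9`.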
| #### `telemetry` | ||
|
|
||
| Short form (JSONL output): | ||
| ```yaml | ||
| telemetry: | ||
| path: "/tmp/output.jsonl" | ||
| ``` | ||
|
|
||
| Long form with format control: | ||
| ```yaml | ||
| telemetry: | ||
| log: | ||
| path: "/tmp/output" | ||
| format: | ||
| jsonl: | ||
| flush_seconds: 60 | ||
| # or: parquet: {flush_seconds: 60} | ||
| # or: multi: {flush_seconds: 60} # writes both .jsonl and .parquet | ||
| ``` | ||
|
|
||
| Prometheus exporter (passive scrape endpoint): | ||
| ```yaml | ||
| telemetry: | ||
| prometheus: | ||
| addr: "0.0.0.0:9000" | ||
| ``` | ||
|
|
||
| #### `target_metrics` | ||
|
|
||
| Collect telemetry from a running Datadog Agent: | ||
|
|
||
| ```yaml | ||
| target_metrics: | ||
| - prometheus: | ||
| uri: "http://127.0.0.1:5000/telemetry" | ||
| tags: | ||
| sub_agent: "core" | ||
| - expvar: | ||
| uri: "http://127.0.0.1:5012/debug/vars" | ||
| vars: | ||
| - "/forwarder/Transactions/Success" | ||
| - "/uptime" | ||
| tags: | ||
| sub_agent: "trace" | ||
|
|
||
| sample_period_milliseconds: 1000 | ||
| ``` | ||
|
|
||
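The `vars` entries under `expvar` look like slash-separated paths into the JSON document served at `/debug/vars`. Assuming that interpretation (it is not stated in the README), a lookup could be sketched as follows. `resolve_var` and the sample document are hypothetical.

```python
def resolve_var(document: dict, path: str):
    """Walk a nested JSON object using a slash-separated path."""
    node = document
    for key in path.strip("/").split("/"):
        node = node[key]  # raises KeyError if the path is absent
    return node


# Hypothetical expvar payload, shaped to match the paths in the config above.
payload = {"forwarder": {"Transactions": {"Success": 42}}, "uptime": 17}

print(resolve_var(payload, "/forwarder/Transactions/Success"))
print(resolve_var(payload, "/uptime"))
```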
| #### `blackhole` | ||
|
|
||
| Absorb HTTP traffic from the target (e.g. agent intake forwarder in test): | ||
|
|
||
| ```yaml | ||
| blackhole: | ||
| - http: | ||
| binding_addr: "127.0.0.1:9091" | ||
| ``` | ||
|
|
||
| #### Lifecycle | ||
|
|
||
| ```yaml | ||
| warmup_duration_secs: 10 # wait before starting emission | ||
| experiment_duration_secs: 60 # how long to run after warmup | ||
| ``` | ||
|
|
||
| ## Running | ||
|
|
||
| ```bash | ||
| lading-py --config lading.yaml | ||
| ``` | ||
|
|
||
| The process runs for `warmup_duration_secs + experiment_duration_secs` seconds, | ||
| then exits. The capture file (if configured) is finalized on exit. | ||
|
|
||
| ## Output format | ||
|
|
||
| ### JSONL | ||
|
|
||
| One JSON object per line, one line per metric per flush interval: | ||
|
|
||
| ```json | ||
| {"run_id": "550e8400-...", "time": 1717959420000, "fetch_index": 0, "metric_name": "bytes_written", "metric_kind": "counter", "value": 1048576.0, "labels": {"generator": "dogstatsd"}} | ||
| {"run_id": "550e8400-...", "time": 1717959420000, "fetch_index": 0, "metric_name": "cpu_usage", "metric_kind": "gauge", "value": 0.73, "labels": {"sub_agent": "core"}} | ||
| ``` | ||
|
|
||
| Fields: | ||
|
|
||
| | Field | Type | Description | | ||
| |-------|------|-------------| | ||
| | `run_id` | UUID string | Unique identifier for this lading-py run | | ||
| | `time` | int | Milliseconds since Unix epoch | | ||
| | `fetch_index` | int | Flush counter (increments each flush interval) | | ||
| | `metric_name` | string | Metric name | | ||
| | `metric_kind` | string | `"counter"`, `"gauge"`, or `"histogram"` | | ||
| | `value` | float | Counter delta, gauge value, or histogram mean | | ||
| | `labels` | object | Key-value label pairs | | ||
| | `value_histogram` | string (base64) | Protobuf DDSketch bytes (omitted if empty) | | ||
|
|
||
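Because `value` holds a delta for counters, a total over a run is the sum of the deltas across flush intervals. The sketch below reads capture lines using only the field names from the table above. `counter_totals` and the sample records are hypothetical, and the `run_id` is a placeholder.

```python
import json
from collections import defaultdict


def counter_totals(lines):
    """Sum counter deltas per metric name across all flush intervals."""
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if record["metric_kind"] == "counter":
            totals[record["metric_name"]] += record["value"]
    return dict(totals)


# Hypothetical capture lines shaped like the records shown above.
sample = [
    json.dumps({"run_id": "example-run", "time": 1717959420000, "fetch_index": 0,
                "metric_name": "bytes_written", "metric_kind": "counter",
                "value": 1048576.0, "labels": {"generator": "dogstatsd"}}),
    json.dumps({"run_id": "example-run", "time": 1717959480000, "fetch_index": 1,
                "metric_name": "bytes_written", "metric_kind": "counter",
                "value": 1048576.0, "labels": {"generator": "dogstatsd"}}),
    json.dumps({"run_id": "example-run", "time": 1717959480000, "fetch_index": 1,
                "metric_name": "cpu_usage", "metric_kind": "gauge",
                "value": 0.73, "labels": {"sub_agent": "core"}}),
]

print(counter_totals(sample))
```

An open file object can be passed in place of `sample`, since the function only iterates over lines.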
| ### Parquet | ||
|
|
||
| Same schema as JSONL, written as columnar Parquet. Suitable for analysis with | ||
| pandas, DuckDB, or similar: | ||
|
|
||
| ```python | ||
| import pyarrow.parquet as pq | ||
| table = pq.read_table("/tmp/output.parquet") | ||
| df = table.to_pandas() | ||
| ``` | ||
|
|
||
| ## Docker | ||
|
|
||
| ```bash | ||
| docker build -t lading-py /path/to/lading | ||
| docker run --rm \ | ||
| -v /tmp/dsd.socket:/tmp/dsd.socket \ | ||
| -v /path/to/lading.yaml:/etc/lading/lading.yaml \ | ||
| -v /tmp/output:/tmp/output \ | ||
| lading-py --config /etc/lading/lading.yaml | ||
| ``` | ||
|
|
||
| ## Differences from Rust lading | ||
|
|
||
| | Feature | Rust lading | lading-py | | ||
| |---------|------------|-----------| | ||
| | Emission library | Raw Unix datagram socket | `dogstatsd-py` (`datadog` package) | | ||
| | Generators | TCP, UDP, HTTP, Unix stream, Fluent, OTLP, DogStatsD | DogStatsD only | | ||
| | `length_prefix_framed` | Supported | **Not supported** (rejected at config load) | | ||
| | RNG | ChaCha (SeededStdRng) | Mersenne Twister (`random.Random`) | | ||
| | Reproducibility | Bit-exact across runs with same seed | Statistically equivalent; not bit-exact | | ||
| | Histogram output | Full DDSketch protobuf | Mean value only; `value_histogram` always empty | | ||
|
|
||
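The table notes that lading-py draws from Python's `random.Random`. As an illustration of how a 32-byte `seed` and a `kind_weights` map could drive a weighted choice, here is a minimal sketch. `pick_kinds` is a hypothetical helper and the way lading-py actually seeds its generator is an assumption here.

```python
import random

# Seed and weights copied from the example config in this README.
SEED = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53,
        59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131]
KIND_WEIGHTS = {"metric": 90, "event": 5, "service_check": 5}


def pick_kinds(seed, weights, count):
    """Draw `count` message kinds in proportion to their relative weights."""
    # Assumption: the 32 seed values are packed into bytes and used directly.
    rng = random.Random(bytes(seed))
    kinds = list(weights)
    return rng.choices(kinds, weights=[weights[k] for k in kinds], k=count)


first = pick_kinds(SEED, KIND_WEIGHTS, 1000)
second = pick_kinds(SEED, KIND_WEIGHTS, 1000)
print(first == second)  # same seed, same sequence within one Python version
```

A kind with weight `0` is never selected by `random.Random.choices`, which matches the intent of setting `set: 0` or `histogram: 0` in `metric_weights`.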
| ## Development | ||
|
|
||
| ```bash | ||
| pip install -e ".[dev]" | ||
| pytest tests/ | ||
| ``` | ||
|
|
||
| Run just the unit tests (fast, no socket needed): | ||
|
|
||
| ```bash | ||
| pytest tests/ -k "not smoke" | ||
| ``` |
When anyone runs the existing Rust executable, including `cargo run --bin lading`, an installed `lading`, or smoke tests that still exercise the Rust binary, `main` now panics before parsing the CLI or running any workload. The crate still builds, so this becomes a runtime break of the primary lading binary rather than a compile-time failure.