[Track] Alternate writers for canonical formats (parquet-mr, parquet-go, ...) #6

Add optional alternate writers for canonical formats, starting with Parquet, alongside the default pyarrow output. File-format research benefits from corpora produced by multiple writer implementations: encoding choices, page sizing, dictionary thresholds, and statistics policies differ per library, and those differences affect downstream compression and pushdown evaluation.

Likely shares machinery with #5; both produce additional sibling artifacts under the slug's output directory.

## Per writer

- New convert stage variant (or generalised stage that dispatches on writer + format).
- Extend `sources.json`: per-writer flag and skip-reason, e.g. `convert.parquet_java`.
- Update `validate_manifest` invariants.
- Outputs at `outputs/v{n}/<slug>/<fmt>-<writer>/<slug>.<ext>` (e.g. `parquet-java/`).
- Regen `docs/datasets.md` + `docs/snapshot.json`.
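
A minimal sketch of the path layout and one possible manifest invariant, to make the bullets above concrete. The `enabled` / `skip_reason` field names, the `EXTENSIONS` table, and both helper names are assumptions for illustration; the real `sources.json` schema and `validate_manifest` checks may differ.

```python
from pathlib import Path

# Hypothetical format -> extension table; only Parquet is named in this issue.
EXTENSIONS = {"parquet": "parquet"}


def writer_output_path(root, version, slug, fmt, writer):
    """Build outputs/v{n}/<slug>/<fmt>-<writer>/<slug>.<ext>."""
    ext = EXTENSIONS[fmt]
    return Path(root) / f"v{version}" / slug / f"{fmt}-{writer}" / f"{slug}.{ext}"


def check_writer_entry(entry, key):
    """Return invariant violations for one per-writer entry.

    Assumed shape (not confirmed by the issue):
        {"convert": {"parquet_java": {"enabled": bool, "skip_reason": str | None}}}
    Assumed invariant: a writer is either enabled or carries a skip reason,
    never both and never neither.
    """
    cfg = entry.get("convert", {}).get(key)
    if cfg is None:
        return []  # writer not configured for this source
    enabled = bool(cfg.get("enabled"))
    reason = cfg.get("skip_reason")
    errors = []
    if enabled and reason:
        errors.append(f"{key}: enabled but skip_reason is set")
    if not enabled and not reason:
        errors.append(f"{key}: disabled without skip_reason")
    return errors
```

Keeping the path rule in one helper means the convert stage and the docs regeneration step derive the same sibling directory name from `(fmt, writer)`.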

## Writers in scope

- **parquet-mr (Java)** — reference writer; subprocess via `java -jar`.
- **parquet-go** — Go-native writer; subprocess.
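
Both writers run as subprocesses, so the dispatch could look roughly like the sketch below. It assumes a hypothetical wrapper jar or binary that takes a source path and a destination path as positional arguments; neither parquet-mr nor parquet-go ships such a CLI as far as this issue states, so `tool_path` and the argument convention are placeholders.

```python
import subprocess
from pathlib import Path


def build_writer_command(writer, src, dst, tool_path):
    """Return the argv for one writer (argument convention is an assumption)."""
    if writer == "java":
        # Hypothetical wrapper jar around parquet-mr.
        return ["java", "-jar", str(tool_path), str(src), str(dst)]
    if writer == "go":
        # Hypothetical standalone binary built on parquet-go.
        return [str(tool_path), str(src), str(dst)]
    raise ValueError(f"unknown writer: {writer}")


def run_writer(writer, src, dst, tool_path, timeout=600):
    """Run one writer; return (ok, skip_reason) so the manifest can record failures."""
    cmd = build_writer_command(writer, src, dst, tool_path)
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except OSError as exc:
        # Covers a missing `java` executable or a missing writer binary.
        return False, f"could not launch writer: {exc}"
    except subprocess.TimeoutExpired:
        return False, f"writer timed out after {timeout}s"
    if result.returncode != 0:
        return False, f"writer exited with code {result.returncode}"
    return True, None
```

Returning a skip reason instead of raising lets a missing toolchain (no JVM, no Go binary) degrade to a recorded skip rather than failing the whole convert stage.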

