Add bt datasets command for managing remote datasets#104

Open
Parker Henderson (parkerhendo) wants to merge 8 commits into main from parkerhendo/datasets-e2e

@parkerhendo Parker Henderson (parkerhendo) commented Apr 9, 2026

TL;DR

Added a new bt datasets command for managing remote Braintrust datasets with full CRUD operations and data upload capabilities.

What changed?

Added comprehensive dataset management functionality:

  • New bt datasets command with subcommands: list, create, update (with add and refresh as aliases), view, and delete
  • Dataset operations support multiple input methods: files (JSON/JSONL), inline JSON via --rows, or stdin
  • Smart record handling with configurable ID fields via --id-field and automatic ID generation when missing
  • Efficient data sync that compares local and remote records to minimize uploads (create/update/unchanged tracking)
  • Interactive features including dataset selection, confirmation prompts, and browser opening
  • Comprehensive API client with pagination support, cursor handling, and batch uploading
  • Updated CLI help and README documentation with usage examples
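
To illustrate the ID handling and sync bullets above, here is a minimal Python sketch of how local records could be classified as create/update/unchanged. It is not the code in this PR: the names `resolve_id`, `canonical`, and `plan_sync` are hypothetical, the dotted-path lookup is an assumption based on the `--id-field metadata.case_id` example, and using a random UUID for missing IDs is only one possible generation scheme.

```python
import json
import uuid


def resolve_id(record, id_field="id"):
    """Follow a dotted path such as 'metadata.case_id'; return None if absent."""
    value = record
    for key in id_field.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return None if value is None else str(value)


def canonical(record):
    """Serialize with sorted keys so key order does not affect comparison."""
    return json.dumps(record, sort_keys=True)


def plan_sync(local_records, remote_by_id, id_field="id"):
    """Classify each local record against the remote copy with the same ID."""
    plan = {"create": [], "update": [], "unchanged": []}
    for record in local_records:
        record_id = resolve_id(record, id_field)
        if record_id is None:
            # Assumption: a record without an ID gets a fresh random UUID,
            # so it can never match a remote row and is always created.
            plan["create"].append((str(uuid.uuid4()), record))
            continue
        remote = remote_by_id.get(record_id)
        if remote is None:
            plan["create"].append((record_id, record))
        elif canonical(remote) != canonical(record):
            plan["update"].append((record_id, record))
        else:
            plan["unchanged"].append((record_id, record))
    return plan


local = [
    {"id": "case-1", "input": {"text": "hi"}, "expected": "hello"},
    {"id": "case-2", "input": {"text": "bye"}},
    {"input": {"text": "no id"}},
]
remote = {
    "case-1": {"id": "case-1", "input": {"text": "hi"}, "expected": "hello"},
    "case-2": {"id": "case-2", "input": {"text": "old"}},
}
plan = plan_sync(local, remote)
print({kind: len(items) for kind, items in plan.items()})
```

Only the records in the create and update buckets would need to be uploaded; the unchanged bucket is what lets a sync skip rows that already match.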

The implementation adds a new datasets module containing the API client, record processing, upload handling, and the individual command implementations. It also adds project context utilities so that commands share authentication and project resolution.
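
For readers unfamiliar with the pattern, cursor pagination and batch uploading can be sketched as follows. This is a Python illustration under stated assumptions, not the client in this PR: `fetch_all`, `batches`, the `(rows, next_cursor)` page shape, and the in-memory `PAGES` stand-in for the remote API are all hypothetical.

```python
from itertools import islice


def fetch_all(fetch_page):
    """Collect rows from a cursor-paginated source.

    fetch_page(cursor) must return (rows, next_cursor); next_cursor is None
    on the last page. This page shape is an assumption for illustration.
    """
    rows, cursor = [], None
    while True:
        page, cursor = fetch_page(cursor)
        rows.extend(page)
        if cursor is None:
            return rows


def batches(records, size):
    """Yield consecutive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# In-memory stand-in for a remote API: cursor -> (rows, next_cursor).
PAGES = {
    None: ([1, 2], "a"),
    "a": ([3, 4], "b"),
    "b": ([5], None),
}

all_rows = fetch_all(lambda cursor: PAGES[cursor])
upload_batches = list(batches(all_rows, 2))
print(all_rows, upload_batches)
```

Bounding the batch size keeps each upload request small, and following the cursor until it is exhausted avoids silently truncating large datasets.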

How to test?

# List datasets
bt datasets list

# Create datasets with different input methods
bt datasets create my-dataset --file records.jsonl
cat records.jsonl | bt datasets create my-dataset
bt datasets create my-dataset --rows '[{"id":"case-1","input":{"text":"hi"},"expected":"hello"}]'

# Update existing datasets
bt datasets update my-dataset --file new-records.jsonl
bt datasets refresh my-dataset --file records.jsonl --id-field metadata.case_id

# View and delete datasets
bt datasets view my-dataset --verbose
bt datasets delete my-dataset --force
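
The file-based commands above need a records.jsonl to work with. A small Python helper like the one below can generate one; the record shape (`id`, `input`, `expected`) follows the --rows example, the `metadata.case_id` field follows the --id-field example, and the helper name `write_jsonl` is hypothetical. Which other fields the CLI accepts is not confirmed here.

```python
import json
from pathlib import Path

# Sample records shaped like the --rows example above.
RECORDS = [
    {
        "id": "case-1",
        "input": {"text": "hi"},
        "expected": "hello",
        "metadata": {"case_id": "case-1"},
    },
    {
        "id": "case-2",
        "input": {"text": "bye"},
        "expected": "goodbye",
        "metadata": {"case_id": "case-2"},
    },
]


def write_jsonl(path, records):
    """Write one JSON object per line (JSON Lines)."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    write_jsonl("records.jsonl", RECORDS)
```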

Why make this change?

This change enables users to manage Braintrust datasets directly from the CLI without requiring local sync artifacts. It provides a streamlined workflow for dataset creation, updates, and inspection that integrates with existing Braintrust project management, making it easier to maintain evaluation datasets for AI/ML workflows.

@parkerhendo Parker Henderson (parkerhendo) changed the title from "feat: add remote dataset management commands" to "Add bt datasets command for managing remote datasets" on Apr 9, 2026
github-actions bot commented Apr 9, 2026

Latest downloadable build artifacts for this PR commit f00eda32d300:

Available artifact names
  • artifacts-build-global
  • artifacts-build-local-x86_64-pc-windows-msvc
  • artifacts-build-local-x86_64-apple-darwin
  • artifacts-build-local-aarch64-unknown-linux-musl
  • artifacts-build-local-x86_64-unknown-linux-musl
  • artifacts-build-local-aarch64-apple-darwin
  • artifacts-build-local-aarch64-unknown-linux-gnu
  • artifacts-build-local-x86_64-unknown-linux-gnu
  • artifacts-plan-dist-manifest
  • cargo-dist-cache

@ankrgyl

  • What if I want to upload records without an id?
  • If I upload something which contains extraneous fields, they get silently ignored:
Ankurs-MacBook-Pro:~/projects/braintrust-cli ankur$ cat test.json
{"foo": "bar", "id": 1}

just uploads the id

I think the PR description is a bit out of date. There's no refresh command, I think. But I also like that (I was about to ask, why would we have one!?)

Parker Henderson (parkerhendo) commented Apr 13, 2026

There's no refresh command, I think. But I also like that

datasets refresh exists! It's an alias for datasets update, as is datasets add. I missed that update in the PR description.
