📦 TestPyPI package published
    pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.2.dev23491757659
or with uv:
    uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.2.dev23491757659
MCP server for Claude Code
    claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.13.2.dev23491757659" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table
📖 Docs preview | 🎨 Storybook preview
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38a51e33e6
The HTML file references ``static-embed.js`` and ``static-embed.css``.
These are included in the buckaroo package under ``buckaroo/static/`` —
copy them alongside your HTML or serve them from a web server.
Stop telling users to copy assets that the wheel doesn't ship
Anyone following this section on a clean install will generate HTML that never renders: to_html() hardcodes static-embed.js/css, but the package build still only ships widget.js, compiled.css, standalone.js, and standalone.css (see pyproject.toml's tool.hatch.build.artifacts). The same missing files also break the new DDD pages on ReadTheDocs, because scripts/generate_ddd_static_html.py copies from buckaroo/static/ and only warns when static-embed.* is absent.
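One way to catch this kind of gap before release is to inspect the built wheel for the files that to_html() expects. The following is a minimal sketch, assuming the assets live under buckaroo/static/ as the quoted docs say; the helper name `missing_static_assets` is hypothetical and not part of buckaroo.

```python
import zipfile

# File names taken from the review comment above; the directory prefix is
# the one mentioned in the quoted docs. Both are assumptions of this sketch.
REQUIRED_ASSETS = ("static-embed.js", "static-embed.css")


def missing_static_assets(wheel_path, required=REQUIRED_ASSETS,
                          prefix="buckaroo/static/"):
    """Return the required asset names that are absent from the wheel."""
    # A wheel is a zip archive, so its file list can be read directly.
    with zipfile.ZipFile(wheel_path) as wheel:
        shipped = set(wheel.namelist())
    return [name for name in required if prefix + name not in shipped]
```

Calling it on a freshly built wheel should return an empty list when both files are shipped; a non-empty list names what is missing.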
scripts/generate_ddd_static_html.py
Outdated
('weird-types-polars', 'Weird Types (Polars → Pandas)',
pl_df_with_weird_types_as_pandas(),
Generate the DDD Polars page from a real Polars DataFrame
This entry bypasses the Polars static-embed path by converting the sample to pandas before calling to_html(). buckaroo.artifact.prepare_buckaroo_artifact() only switches to PolarsBuckarooWidget for an actual pl.DataFrame, so the published weird-types-polars.html page won't exercise the Polars serializer/analysis at all and can miss regressions like Duration/Decimal formatting while the docs still claim Polars embedding works.
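The dispatch described here can be pictured as a type check on the input. The sketch below is illustrative only: `pick_widget_name` is a hypothetical helper, and it identifies Polars frames by module name so that it runs without Polars installed. The real logic lives in buckaroo.artifact.prepare_buckaroo_artifact() and may differ.

```python
def pick_widget_name(df):
    """Hypothetical stand-in for the engine dispatch the review describes."""
    # Identify the engine from the top-level module that defines the type.
    top_level_module = type(df).__module__.split(".")[0]
    if top_level_module == "polars":
        return "PolarsBuckarooWidget"
    # Anything else, including a Polars frame that was converted to pandas
    # before the call, takes the non-Polars path.
    return "pandas path"
```

Under this picture, converting the sample to pandas before calling to_html() means the Polars branch is never reached, which is the regression risk the comment points out.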
📖 Docs preview
https://buckaroo-data.readthedocs.io/en/638/
Key pages on this branch:
This file belongs on the content-plan branch (PR #638), not here.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: Dastardly DataFrame Dataset post with static embeds
- DDD article with inline iframe embeds for 10 tricky DataFrames
- Generator script to produce static embed HTML at RTD build time
- RTD config: build JS bundle, generate DDD pages, copy to output
- Fix: coerce Period/Interval/Timedelta to strings in parquet b64
(hyparquet can't decode pandas Arrow extension types)
- Tests for weird-type parquet roundtrip
- Ship static-embed.js/css in wheel (pyproject.toml artifacts)
- Docs preview link in TestPyPI PR comment
- BuckarooCompare article
- Content plan outline
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add voice/personality to DDD post, add type coverage research
Incorporate colorful wording from buckaroo_writing notes into the DDD blog
post intro ("wonderfully variant splendor", "weirdest DataFrames in the
wild", "hard fought experience"). Add Buckaroo's displayability philosophy.
New research doc tests every pandas (classic, extension, arrow-backed) and
polars dtype through both JSON and parquet serialization paths, documenting
which are covered by the DDD and test suite.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add full dtype coverage table to DDD post
Cross-engine table showing Yes/not tested/fail for every dtype across
pandas-classic, pandas-arrow, and polars. Footnote noting that serialization
complexity warrants its own follow-up post.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add parquet, JS type, and display columns to dtype table
Expand the dtype coverage table with three new columns showing the full
pipeline: what Parquet physical type each dtype maps to, what JS type
hyparquet produces after decode, and how buckaroo's display formatter
renders it. Footnotes explain BigInt handling, coercion for types without
native Parquet equivalents, and the two serialization failures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: credit Cecil Curry / beartype for DDD inspiration
The naming and early shape of the DDD was influenced by an exchange
with leycec on beartype#529. Added a shout-out in the intro.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update copyright years to include 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: link first DDD mention to source on GitHub main
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: show actual rendered values in dtype table Buckaroo display column
Replace generic type names with example values buckaroo actually displays:
1,234 for integers, 1d 2h 3m 4s for durations, 68656c6c6f for binary, etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: fix dtype table — actual display values, footnotes, no blank cells
- Replace generic type names with actual buckaroo display values
- Add footnotes for BigInt (stringified, no commas) and Period (time span)
- Fill all blank cells with — for dtypes that don't exist in that engine
- Clarify Period label as "Period (time span)"
- Fix Nullable int/float/bool row with proper values
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* WIP
* docs: remove content-plan.md from ddd-post branch
This file belongs on the content-plan branch (PR #638), not here.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: address inline review comments on DDD article
- Shorten Cecil Curry praise (remove "makes open source worth doing")
- Add LLM-generated DataFrames and inherited pipelines to "why this matters"
- Fix "three values" → "three non-numeric values" for infinity/NaN section
- Remove duplicate None mention from MultiIndex rows section
- Add note about planned MultiIndex-both-axes improvements
- Update footnote 1: "putting together this table exposed areas that need work"
- Fix comma→period typo in footnote 1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
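The sample display values quoted in the dtype table commits above (1,234 for integers, 1d 2h 3m 4s for durations, 68656c6c6f for binary) can be reproduced with a few lines of Python. This is a sketch of the output shapes only; Buckaroo's actual formatter runs on the JavaScript side, and the function names here are hypothetical.

```python
from datetime import timedelta


def format_int(value):
    """Add thousands separators to an integer."""
    return f"{value:,}"


def format_duration(delta):
    """Render a non-negative timedelta as days, hours, minutes, seconds."""
    total = int(delta.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {seconds}s"


def format_binary(data):
    """Render bytes as lowercase hex digits."""
    return data.hex()


print(format_int(1234))  # 1,234
print(format_duration(timedelta(days=1, hours=2, minutes=3, seconds=4)))  # 1d 2h 3m 4s
print(format_binary(b"hello"))  # 68656c6c6f
```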
* build: auto-create static stubs in hatch hook, remove manual touch hacks
The hatch build hook (initialize2 → initialize) now auto-creates empty
stub files in buckaroo/static/ when the real JS artifacts haven't been
built. This eliminates the need for manual mkdir/touch blocks in CI
and ReadTheDocs.
- scripts/hatch_build.py: rename initialize2 → initialize (was dead
code), create stubs for editable installs, build JS for wheel builds
- pyproject.toml: use glob for artifacts (buckaroo/static/*.js|*.css)
instead of incomplete file list
- checks.yml: remove 4 separate mkdir/touch stub blocks
- .readthedocs.yaml: remove pnpm install for js-core (stubs handle it)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* build: stop tracking standalone.js/css in git
These are build artifacts produced by full_build.sh. They're already
in .gitignore but were force-tracked. The glob artifact pattern in
pyproject.toml ensures they're still included in the wheel.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: build all static assets in hatch hook pnpm fallback
The pnpm fallback only built buckaroo-widget (widget.js/css) but not
compiled.css, standalone.js/css, or static-embed.js/css. A bare
`uv build` would produce an incomplete wheel.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
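The stub behaviour described in the build commits above can be sketched as follows. This is a minimal illustration under assumptions, not the code in scripts/hatch_build.py: the function name `ensure_static_stubs` is hypothetical, and the asset list is assembled from the file names mentioned in these commit messages.

```python
from pathlib import Path

# Asset names as mentioned in the commit messages above.
STATIC_ASSETS = (
    "widget.js", "widget.css", "compiled.css",
    "standalone.js", "standalone.css",
    "static-embed.js", "static-embed.css",
)


def ensure_static_stubs(static_dir, names=STATIC_ASSETS):
    """Create empty placeholders for missing assets; return the names created."""
    static_dir = Path(static_dir)
    static_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        target = static_dir / name
        if not target.exists():
            target.touch()  # empty stub; a real JS build would overwrite it
            created.append(name)
    return created
```

Files that already exist are left untouched, so running the helper after a real build is a no-op.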
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Post 1: Dastardly DataFrame Dataset with inline iframe embeds
- Post 3: Static Embedding & the Incredible Shrinking Widget
- Post 5: Buckaroo Embedding Guide
- Post 8: BuckarooCompare — Diff Your DataFrames
- Script to generate DDD static HTML pages at docs build time
- RTD config runs generate_ddd_static_html.py before copying extra-html
- Fleshed out content-plan.md with 9-post publishing sequence
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Touch empty static files (compiled.css, widget.js, etc.) before running generate_ddd_static_html.py so anywidget import succeeds without a full JS build
- Widen RST table columns to fit "Buckaroo static embed" row
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a step to the CheckDocs job that comments on PRs with the ReadTheDocs preview URL and links to key article pages. Uses the same create-or-update pattern as the TestPyPI comment.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
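A create-or-update comment step usually boils down to one decision: is there already a bot comment to edit? The sketch below shows that decision in isolation. It assumes the bot tags its comment body with a hidden marker string and that comments arrive as simple dicts; both are assumptions of this example, not details taken from the workflow itself.

```python
def plan_comment_action(existing_comments, marker):
    """Decide whether to update an existing bot comment or create a new one.

    existing_comments: list of dicts with "id" and "body" keys (assumed shape).
    marker: string the bot embeds in its own comment body (assumed convention).
    """
    for comment in existing_comments:
        if marker in comment["body"]:
            return ("update", comment["id"])
    return ("create", None)
```

Keying on a marker keeps the PR thread to a single preview comment no matter how many times the job runs.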
Converts so-you-want-to-write-a-dataframe-viewer to RST with a proper list-table for the comparison of open source DataFrame viewers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates based on research into each project:
- Buckaroo: Static Export → Yes
- Perspective: Both server/browser, Arrow serialization, Jupyter compatible
- DTale: JSON confirmed, No static export, uses react-virtualized
- Marimo: JSON confirmed, not Jupyter compatible, not anywidget
- ipydatagrid: No static export (confirmed broken), Lumino DataGrid
- Mito: Endo (custom) table viewer
- iTables: anywidget optional
- Add quak (manzt): DuckDB-backed, Arrow, anywidget
- Add hyperlinks to all project GitHub repos
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mparison as published
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DDD article, pyproject.toml, and readthedocs.yaml were modified by earlier commits on this branch, but the canonical versions were merged to main via PRs 641 and 642. Reset to main to avoid regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove Python code block for column renaming (not needed)
- Drop "smaller payloads" reason — not true for Parquet (column names stored once in metadata, not per-row)
- Clarify caching is buckaroo's LRU in resolveDFData.ts, not hyparquet
- Add TypeScript type flow diagram (string → ArrayBuffer → rows → cells)
- Add "why it ended up this way" section: evolution from default AG-Grid + pandas JSON (slow) to Parquet
- Explain BigInt flow: fast INT64 on Python side, hyparquet decodes as BigInt, buckaroo stringifies only when > MAX_SAFE_INTEGER
- Explain duration flow: whole column coerced to string in Python, parsed back to human-readable on JS side
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
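The BigInt rule in this commit (stringify only when the value exceeds MAX_SAFE_INTEGER) is easy to state in code. The sketch below mirrors it in Python for illustration; the real check runs in Buckaroo's JavaScript, the function name is hypothetical, and treating large negative values symmetrically via `abs` is an assumption of this example.

```python
# Same value as JavaScript's Number.MAX_SAFE_INTEGER (2**53 - 1).
MAX_SAFE_INTEGER = 2**53 - 1


def to_display_number(value):
    """Keep safe integers numeric; stringify the rest to avoid precision loss."""
    if abs(value) > MAX_SAFE_INTEGER:
        return str(value)
    return value
```

Values at or below the threshold stay numeric, so they can still be formatted and sorted as numbers; only the rare oversized value pays the cost of becoming a string.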
- Drop nonexistent "Hytable" from intro
- Add Panel Tabulator and Streamlit st.dataframe to comparison table
- Add hyperlinks throughout prose: AG-Grid, Perspective, glide-data-grid, tanstack-table, react-data-grid, anywidget, JS Jabber podcast
- Update glide-data-grid description (actively maintained, not abandoned)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ting
- Add before/after comparison: 4 jobs in 6 min → 23 jobs in 7 min
- Add section on dual dependency testing strategy (min pinned + max versions)
- Explain why pandas/pyarrow/polars compatibility testing needs fast CI
- Link to Depot open source sponsorship program
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Dec 24 (pre-Depot): 3 jobs. Now: 23 jobs (20 added since sponsorship). Itemized breakdown of what was added: 6 Playwright suites, 8 Python matrix jobs, MCP integration, smoke tests, screenshots, docs, TestPyPI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…speed
Same-pipeline benchmarks show Depot isn't measurably faster than GitHub runners (I/O-bound workload). The real value is consistent provisioning (no Monday afternoon queue delays) and no minute quotas, which gave confidence to invest in CI optimization and grow from 3 to 23 jobs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- "How I made Buckaroo fast" — do less, not optimize more
- "Testing Buckaroo" — unit, integration, Playwright, screenshots, smoke, MCP, dual deps, DDD as test suite
- Mark Depot article as draft with pending CTO input
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Article now includes a 4-scenario comparison table with real data:
- GitHub Sunday night (3m15s), Monday sequential (5m19s), Monday parallel (5m58s), Depot Monday parallel (4m18s)
- Key finding: per-job Depot is slightly slower, but consistent provisioning (19s stagger vs 114s) wins on critical path
- GitHub ranges 3m–6m+, Depot is 4m–4m30s regardless
New scripts:
- ci_critical_path.sh: critical path for a single run
- ci_list_runs.sh: list runs for a PR or branch
- ci_all_timings.sh: JSON output of all timing data per run
- ci_timing_table.py: formatted comparison table from JSON
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
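The two quantities this commit leans on, critical path and provisioning stagger, can be computed from per-job start and end times. The sketch below is not the logic of ci_critical_path.sh; it assumes all jobs in a run are independent and launched in parallel, so the critical path is simply the wall-clock span of the run, and it takes "stagger" to mean the gap between the first and last job start. The function name and input shape are hypothetical.

```python
def run_summary(jobs):
    """Summarize one CI run.

    jobs: list of (name, start_seconds, end_seconds) tuples, with times
    measured from any common reference point.
    Returns (critical_path, stagger), both in seconds.
    """
    starts = [start for _, start, _ in jobs]
    ends = [end for _, _, end in jobs]
    critical_path = max(ends) - min(starts)  # wall-clock span of the whole run
    stagger = max(starts) - min(starts)      # gap between first and last job start
    return critical_path, stagger
```

Under these assumptions, a run whose jobs are individually a little slower can still finish sooner overall if every job starts promptly, which is the effect the comparison table describes.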
Complete rewrite with controlled benchmark results:
- 21 runs across 6 scenarios (cold/warm, parallel/sequential, Sunday/Monday)
- GitHub Monday: 7m46s ±143s. Depot Monday: 4m03s ±20s.
- Key finding: per-job Depot is slightly slower, but consistent provisioning (all jobs start in 20s vs 1-7 min stagger) wins
- Includes reproduction steps with scripts
- Restructured: benchmark → results → analysis → before/after → advice
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix "GitHub Actions is" grammar
- Explain per-job slowness: Depot has per-runner provisioning overhead, but provisions all in parallel; GitHub provisions sequentially from pool
- Verify Monday cache write speed (GH 0.8s vs Depot 2.1s, confirmed)
- Fix "Monday morning" typo
- Remove "no minute quotas" point — reads as "Depot is great if free"
- Renumber value prop list (now 2 items: consistency + confidence)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The scariest part was transferring paddymul/buckaroo to the buckaroo-data org (required for Depot's open source program). Feared losing GitHub stars — turns out GitHub's transfer preserves everything (stars, issues, PRs, forks, redirects).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>