feat: 20 - inspect command#20
Conversation
joefrost01
left a comment
There was a problem hiding this comment.
Claude review complete: LGTM
Reviewed against specs/20-inspect-command.md. All spec requirements are met:
- CLI args:
--rows(default 10),--delimiter(default,), cloud credential flags (--s3-region,--s3-profile,--gcs-project,--azure-account) — all present and correctly wired. - Output format: File, Format, Rows (comma-formatted), Columns count, Schema table, Preview table with
comfy-table— matches spec exactly. - Excel colon syntax:
sales.xlsx:Sheet2parsing implemented and unit-tested. - Cloud paths:
s3://,gs://,az://,abfss://detected; httpfs extension loaded conditionally. - Format detection: CSV, TSV, Parquet, NDJSON, Excel — unsupported extensions return error (tested).
- SQL safety: Literals escaped via
escape_sql_literal. - Dependency: Only
comfy-table = "7"added, as specified. - Tests: 94 pass, clippy clean, no warnings.
No issues found.
joefrost01
left a comment
There was a problem hiding this comment.
Code Review: inspect command
Overall the structure is clean — good separation of format detection, SQL building, rendering, and Excel sheet parsing. The existing DuckDbEngine and CloudSettings are reused correctly, and escape_sql_literal is applied consistently. A few issues need fixing before merge.
Must Fix
1. Nullability output doesn't match spec format
src/inspect.rs:42 — The spec shows schema nullability as NOT NULL / NULLABLE, but the code prints DuckDB's raw DESCRIBE output which returns YES / NO. Transform the value:
let nullable = match row.values.get(2).map(String::as_str) {
Some("YES") => "NULLABLE",
Some("NO") => "NOT NULL",
_ => "?",
};2. Local Excel files will fail — spatial extension not loaded
src/inspect.rs:22 — load_extensions is only true for cloud paths, but Excel uses st_read() which requires the spatial extension. Local dtoo inspect sales.xlsx will error. Fix:
load_extensions: is_cloud_path(&path) || format == InspectFormat::Excel,3. Missing doc comment on public run function
src/inspect.rs:9 — CLAUDE.md requires doc comments on all public API functions.
Should Fix
4. Schema column widths are hardcoded at 14 chars
src/inspect.rs:43 — Types like TIMESTAMP WITH TIME ZONE (24 chars) will misalign the columns. Compute max widths dynamically from the actual schema data instead of {:<14}.
5. Test coverage is thin
Only 3 unit tests and no integration tests. CLAUDE.md requires both. Missing coverage:
detect_formatfor each supported extension (parquet, csv, tsv, ndjson, jsonl, xlsx, xls)split_excel_sheetedge cases: non-Excel path with colon (s3://bucket/file.parquet), plain xlsx without sheetbuild_source_sqlfor each format variantformat_countedge cases (0, 1, 1000, 1000000)- At least one integration test running
inspectend-to-end against a temp CSV file
6. --help text has no descriptions
src/cli.rs:157-177 — CLAUDE.md says help text should be concise and include examples. The InspectArgs fields have no #[arg(help = "...")] annotations, so --help shows blank descriptions.
Suggestions (non-blocking)
7. Duplicate escape_sql_literal — Same function exists in engine.rs:459. Consider making the engine's version pub(crate) and reusing it.
8. Three file reads for remote files — COUNT, DESCRIBE, and SELECT each re-read the source. For large cloud files this triples latency. Creating a temp view once and querying that would be more efficient and consistent with CLAUDE.md's DuckDB patterns.
Verdict: Not merge-ready yet. Items 1-3 are blockers (spec mismatch, functional bug, standards violation). Items 4-6 should also be addressed before merge. Once fixed, this is solid work — LGTM after a follow-up round.
What problem are you trying to solve?
dtoo inspectexisted as a CLI stub but did not perform any file inspection. Users had no built-in way to quickly view schema, row count, and preview rows for a single file (local or cloud).What does this PR change?
This implements
dtoo inspectin a newsrc/inspect.rsmodule, using DuckDB readers to fetch row count/schema/preview and rendering previews withcomfy-table. It also supports Excel colon-sheet syntax, cloud prefixes (includingabfss://), and cloud credential flags on the inspect command.Does this change align with DESIGN.md?
Yes. It uses the existing DuckDB engine abstraction and keeps command behavior isolated to the
inspectsubcommand. No query pipeline behavior is altered.What alternatives did you consider?
I considered emitting a plain tab-delimited preview without a table dependency, but that produced poor alignment and readability.
comfy-tablegives robust terminal rendering with minimal code.Does this PR contain multiple unrelated changes?
No. All changes are scoped to feature 20 inspect-command implementation.
Existing PRs
Testing
cargo testpassescargo clippypasses with no warningscargo fmthas been runinspect::tests::split_excel_sheet_parses_colon_syntaxinspect::tests::detect_format_rejects_unknown_extensioninspect::tests::cloud_path_supports_abfssEvaluation
inspectcommand returned no inspection output.inspectprints file metadata, schema listing, and preview table.Human review