Add AI skill to check current repository against upstream APIs #1460
timsaucer wants to merge 8 commits into apache:main
kevinjqliu left a comment:
LGTM!
love to see this, i might adopt it for other projects 😄
Thanks for the review @kevinjqliu! I think this was really helpful in finding some missing functions. I'm sure we'll want to make changes to it, but I also think this is a good starting point.
Document the full FFI type pipeline (Rust PyO3 wrapper → Protocol type → Python wrapper → ABC base class → exports → example) and catalog which upstream datafusion-ffi types are supported, which have been evaluated as not needing direct exposure, and how to check for new gaps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add "ffi types" to the argument-hint and description so users can invoke the skill with `/check-upstream ffi types`. Also add pipeline verification step to ensure each supported FFI type has the full end-to-end chain (PyO3 wrapper, Protocol, Python wrapper with type hints, ABC, exports). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Section 7 (FFI Types) was incorrectly placed after the Output Format and Implementation Pattern sections. Move it to sit after Section 6 (SessionContext Methods), consistent with the other checkable areas. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The supported FFI types list would go stale as new types are added. Replace it with a grep instruction to discover them at check time, keeping only the "evaluated and not requiring exposure" list which captures rationale not derivable from code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
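As a rough illustration of the "discover them at check time" idea, the grep step could be approximated in Python as below. This is only a sketch: the `FFI_<Name>` naming pattern and the sample `use` lines are assumptions made for the example, and `discover_ffi_types` is a hypothetical helper, not something defined in this PR.

```python
import re

# Assumption: FFI structs are named with an `FFI_` prefix followed by a
# CamelCase identifier. Adjust the pattern if the convention differs.
FFI_NAME = re.compile(r"\bFFI_[A-Za-z0-9]+\b")


def discover_ffi_types(source_text: str) -> set[str]:
    """Return the distinct FFI-style identifiers found in a blob of source text."""
    return set(FFI_NAME.findall(source_text))


# Hypothetical snippet standing in for Rust files read from disk.
sample = """
use datafusion_ffi::table_provider::FFI_TableProvider;
fn wrap(provider: FFI_TableProvider) {}
use datafusion_ffi::catalog_provider::FFI_CatalogProvider;
"""

found = discover_ffi_types(sample)
print(sorted(found))
```

Scanning at check time means the result always reflects the code as it is, which is the reason given for dropping the hand-maintained list.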
Functions exposed in Python (e.g., as aliases of other Rust bindings) were being falsely reported as missing because they lacked a dedicated #[pyfunction] in Rust. The user-facing API is the Python layer, so coverage should be measured there. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
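To make the "measure coverage at the Python layer" point concrete, here is a minimal sketch. It assumes the check is a set difference between upstream function names and the callables a Python module exposes; `missing_from_python` and the stand-in module are hypothetical and exist only for this example.

```python
from types import SimpleNamespace


def missing_from_python(upstream_names, python_module):
    """List upstream names that have no public callable in the Python module."""
    exposed = {
        name
        for name in dir(python_module)
        if not name.startswith("_") and callable(getattr(python_module, name))
    }
    return sorted(set(upstream_names) - exposed)


def _abs(value):
    return abs(value)


# Stand-in for the Python functions module: `absolute` is an alias of `abs`
# and has no dedicated Rust binding of its own.
fake_functions = SimpleNamespace(abs=_abs, absolute=_abs)

# Hypothetical upstream function names.
upstream = ["abs", "absolute", "cbrt"]

missing = missing_from_python(upstream, fake_functions)
print(missing)
```

Because the alias is visible in the Python namespace, it counts as covered even though nothing in the Rust layer carries that name.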
I'm going to hold off on merging this until I've run through all of those generated issues. I've already needed a couple of updates.
show_limit is covered by DataFrame.show() and with_param_values is covered by SessionContext.sql(param_values=...), so neither needs separate exposure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
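One way to picture this kind of exclusion is a small lookup that records why an upstream method needs no separate exposure. The sketch below is illustrative only: the two entries come from the commit message above, while `COVERED_ELSEWHERE`, `report_gaps`, and the `some_new_method` candidate are hypothetical names.

```python
# Upstream methods that need no separate Python exposure, mapped to the
# existing Python API that already covers them (per the commit message).
COVERED_ELSEWHERE = {
    "show_limit": "DataFrame.show()",
    "with_param_values": "SessionContext.sql(param_values=...)",
}


def report_gaps(candidates):
    """Drop candidates that are already covered by another Python API."""
    return [name for name in candidates if name not in COVERED_ELSEWHERE]


# `some_new_method` is a made-up name representing a genuine gap.
gaps = report_gaps(["show_limit", "with_param_values", "some_new_method"])
print(gaps)
```

Keeping the rationale next to each name preserves the reasoning that cannot be derived from the code itself.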
## Checking Upstream DataFusion Coverage

This project includes a [Claude Code](https://claude.com/claude-code) skill for auditing which
maybe worth using more neutral ("AI Agent") language?
Another potentially non-Claude-specific path would be something like .agents/skills/check-upstream/SKILL.md
That's a very good point. I will make this in the next round of changes.
Which issue does this PR close?
None
Rationale for this change
One area where this repository sometimes falls behind upstream `datafusion` is making all of the new upstream features available via Python. This PR adds an AI agent tool to help ensure we have coverage.
What changes are included in this PR?
Adds a Claude Code skill file. Updates the documentation.
Are there any user-facing changes?
No, no code impacted.
Context
The following PRs were all generated using this skill: