Add github-repository-inventory skill 🤖🤖🤖 #1995

Open
Kinosaur wants to merge 2 commits into github:staged from Kinosaur:skill/github-repository-inventory
Conversation

@Kinosaur

A read-only, privacy-first Agent Skill that answers one question the platform does not make easy and models do unreliably: what has this GitHub account actually contributed to — across repos it does not own — and what is the evidence? It models the user's relationship to each repo (owner / collaborator / org-member / PR author / PR reviewer / issue author / commit-contributor) with the evidence and a confidence level, and never conflates repository access with verified contribution.

Fills a gap in the collection

Existing skills operate on a single checked-out repo (e.g. acquire-codebase-knowledge, repo-story-time); none work at the account level across owned and external repositories with a relationship-evidence model. This is a distinct capability, not a duplicate.

Why this is meaningful uplift (not something frontier models already do well)

  • Relationship evidence + confidence. Every relationship is recorded with the evidence that established it and a confidence level. Access is never reported as contribution — a distinction most tools conflate.
  • Privacy-first by construction. Public-only and read-only by default; no token storage; private repos require an explicit flag, have README content redacted, and are written only to a gitignored location (with a guard that warns otherwise). Private repo names never reach shareable output.
  • Deterministic, testable core. Identical input produces byte-identical output. All GitHub access goes through gh; the transform/render core is pure and unit-tested against recorded fixtures (no live API in tests).
  • Honest about gaps. Deleted repos, expired permissions, the commit-contribution time window, and rate-limit truncation are surfaced in warnings.json rather than hidden.
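The relationship-evidence idea in the first bullet can be sketched as a small data model. This is a minimal illustration, not the skill's actual schema: the class names, the `confidence` values, the example evidence strings, and the split into access kinds and contribution kinds are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    OWNER = "owner"
    # Access-style relationships: the account can reach the repo.
    COLLABORATOR = "collaborator"
    ORG_MEMBER = "org-member"
    # Contribution-style relationships: backed by an authored artifact.
    PR_AUTHOR = "pr-author"
    PR_REVIEWER = "pr-reviewer"
    ISSUE_AUTHOR = "issue-author"
    COMMIT_CONTRIBUTOR = "commit-contributor"


# Hypothetical grouping: only these kinds count as verified contribution.
CONTRIBUTION_KINDS = {
    Kind.PR_AUTHOR,
    Kind.PR_REVIEWER,
    Kind.ISSUE_AUTHOR,
    Kind.COMMIT_CONTRIBUTOR,
}


@dataclass(frozen=True)
class Relationship:
    kind: Kind
    evidence: str    # what established the relationship (illustrative free text)
    confidence: str  # e.g. "high" / "medium" / "low" (assumed levels)


def has_verified_contribution(relationships: list[Relationship]) -> bool:
    """Return True only if some relationship is contribution-style.

    Access alone (collaborator, org-member) never satisfies this check.
    """
    return any(r.kind in CONTRIBUTION_KINDS for r in relationships)
```

Under this sketch, a repository reached only through collaborator access reports no verified contribution until a PR, review, issue, or commit relationship is recorded alongside it.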

What it produces

catalog.json (canonical), PROJECTS.md (human-readable, localizable — English/Burmese), warnings.json, and optional derived reports (technology summary, portfolio candidates, missing READMEs, contribution summary).
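The "byte-identical output" property claimed for the canonical catalog.json is typically achieved by fixing the record order and the key order before serializing. The sketch below shows one way to do that with the standard library; the `repositories` key, the `full_name` field, and the `render_catalog` helper are hypothetical names for illustration, not the skill's actual schema or code.

```python
import json


def render_catalog(repos: list[dict]) -> bytes:
    """Serialize repository records so equal input yields equal bytes."""
    # Sort records by a stable key so discovery order does not leak into output.
    ordered = sorted(repos, key=lambda r: (r["full_name"].lower(), r["full_name"]))
    text = json.dumps(
        {"schema_version": "0.1.0", "repositories": ordered},
        sort_keys=True,      # fixed key order inside every object
        indent=2,
        ensure_ascii=False,  # keep non-ASCII text (e.g. Burmese) readable
    )
    # Always end with exactly one newline and encode explicitly as UTF-8.
    return (text + "\n").encode("utf-8")
```

Calling `render_catalog` on the same records in any order returns the same bytes, which is what makes fixture-based tests against recorded API responses practical.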

Best for

Reconstructing a verifiable contribution history (job applications, reviews, proving cross-org/contract work) and giving an agent a truthful tool instead of letting it guess. A focused, occasional-use utility — and honest about that.

Requirements

Python 3.11+ and an authenticated GitHub CLI (gh) 2.90.0+. Bundles a small standard-library Python program under scripts/ (no third-party packages; gh owns auth). Read-only; never modifies repositories.
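A preflight check for these two requirements might look like the following. It is a sketch under assumptions: the function names are hypothetical, and it only parses the `gh version X.Y.Z` line that `gh --version` prints rather than invoking gh itself.

```python
import re
import sys

MIN_PYTHON = (3, 11)
MIN_GH = (2, 90, 0)


def parse_gh_version(output: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from the text printed by `gh --version`."""
    match = re.search(r"gh version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("could not find a gh version in the output")
    return tuple(int(part) for part in match.groups())


def meets_requirements(gh_output: str, python_version=sys.version_info) -> bool:
    """Check both minimum versions; tuples compare element by element."""
    python_ok = tuple(python_version[:2]) >= MIN_PYTHON
    gh_ok = parse_gh_version(gh_output) >= MIN_GH
    return python_ok and gh_ok
```

In practice the `gh_output` argument could come from `subprocess.run(["gh", "--version"], capture_output=True, text=True).stdout`.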

Upstream source & full test suite (80+ tests): https://github.com/Kinosaur/github-repository-inventory


🤖 Authored with Claude Code.

Read-only, privacy-first Agent Skill that builds an evidence-based inventory of a
GitHub account's footprint — owned, collaborator/organization (access), and
contribution-evidenced (authored/reviewed PRs, authored issues, commits) repos —
modelling each relationship with evidence and a confidence level, never conflating
access with contribution. Fills an account-level gap not covered by existing
single-repo skills. Standard-library Python under scripts/; gh owns auth; read-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 14, 2026 17:15
@Kinosaur requested a review from aaronpowell as a code owner June 14, 2026 17:15
@github-actions (bot) added the labels new-submission (PR adds at least one new contribution) and skills (PR touches skills) Jun 14, 2026
@github-actions

github-actions Bot commented Jun 14, 2026

Contributor

🔍 Skill Validator Results

✅ All checks passed

| Scope | Checked |
|---|---|
| Skills | 1 |
| Agents | 0 |
| Total | 1 |

| Severity | Count |
|---|---|
| ❌ Errors | 0 |
| ⚠️ Warnings | 0 |
| ℹ️ Advisories | 0 |

Summary

| Level | Finding |
|---|---|
| ℹ️ | Found 1 skill(s) |
| ℹ️ | [github-repository-inventory] 📊 github-repository-inventory: 1,204 BPE tokens [chars/4: 1,379] (detailed ✓), 15 sections, 1 code blocks |
| ℹ️ | ✅ All checks passed (1 skill(s)) |

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Introduces a new github-repository-inventory skill that scans an authenticated GitHub account (via gh) to produce a deterministic, privacy-first inventory of owned, contributed-to, and optionally accessible/private repositories, plus derived reports and schema validation.

Changes:

  • Add end-to-end inventory CLI (scan/discover/inspect/render/report/validate) that discovers repos via REST/GraphQL/Search, inspects README + top-level contents, and writes catalog.json, PROJECTS.md, and warnings.json.
  • Add pure technology detection and project-type classification based on root manifests + dependency parsing.
  • Add documentation/reference materials and register the skill in docs/README.skills.md.
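The manifest-based detection described in the second bullet can be illustrated with a pure function over a repository's top-level file names. This is a simplified sketch, not the contents of detect_technologies.py: the `MANIFESTS` mapping, the function name, and the output shape are assumptions, and dependency parsing is left out.

```python
# Hypothetical mapping from root manifest file names to a technology label.
MANIFESTS = {
    "package.json": "Node.js",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "Cargo.toml": "Rust",
    "go.mod": "Go",
}


def detect_technologies(root_entries: list[str]) -> list[dict]:
    """Map top-level file names to technologies, keeping the evidence files."""
    found: dict[str, list[str]] = {}
    for name in root_entries:
        tech = MANIFESTS.get(name)
        if tech is not None:
            found.setdefault(tech, []).append(name)
    # Sort technologies and evidence so the result is deterministic.
    return [
        {"technology": tech, "evidence": sorted(files)}
        for tech, files in sorted(found.items())
    ]
```

Because the function takes plain file names and performs no I/O, it can be unit-tested against recorded directory listings without touching the GitHub API.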

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| `skills/github-repository-inventory/scripts/inventory.py` | Implements the CLI workflow, discovery/inspection pipeline, deterministic rendering, caching, and schema validation. |
| `skills/github-repository-inventory/scripts/detect_technologies.py` | Adds pure manifest detection + dependency parsing to emit evidence-backed technologies and classify project types. |
| `skills/github-repository-inventory/references/repository-schema.md` | Documents the repository record/catalog schema for humans. |
| `skills/github-repository-inventory/references/relationship-model.md` | Documents relationship types/evidence/confidence model used in outputs. |
| `skills/github-repository-inventory/references/privacy-rules.md` | Documents privacy guarantees and safe handling rules for private data. |
| `skills/github-repository-inventory/references/limitations.md` | Documents known coverage/inspection limitations surfaced in warnings. |
| `skills/github-repository-inventory/assets/projects-template.md` | Provides a PROJECTS.md template reference for the skill. |
| `skills/github-repository-inventory/SKILL.md` | Defines the skill metadata, safety defaults, and usage workflow/commands. |
| `docs/README.skills.md` | Registers the new skill in the skills index. |

Comment on lines +555 to +566
    warnings.append(
        {
            "code": "coverage-scope",
            "repo": "",
            "message": (
                "Covers PUBLIC repositories the account owns plus those with PUBLIC "
                "contribution evidence (authored/reviewed pull requests, authored issues). "
                "Collaborator, organization, and private repositories are not included by "
                "default. This is not 'everything you ever contributed to'."
            ),
        }
    )
Comment on lines +7 to +8
- **Public repositories only.** Private repositories are excluded unless the user passes
`--include-private`. (Private support is not in Phase 1.)
Comment on lines +14 to +17
## When private repos are included (later phase)

- Write output only to a local cache directory, and ensure that directory is gitignored
(`github-inventory/`, `.inventory-cache/`).

| Field | Notes |
|---|---|
| `schema_version` | e.g. `"0.1.0"`. |
Comment on lines +95 to +97
DISCOVERY_ENDPOINT = (
    "/user/repos?affiliation=owner&visibility=public&per_page=100"
)
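For context, an endpoint like this is usually consumed page by page through `gh api`. The sketch below assumes the `--paginate` and `--slurp` flags of `gh api` (the latter wraps all pages in one outer JSON array); the `run_gh` and `discover_owned_public` helpers and the injectable `runner` parameter are hypothetical and are not taken from inventory.py.

```python
import json
import subprocess
from typing import Callable

DISCOVERY_ENDPOINT = "/user/repos?affiliation=owner&visibility=public&per_page=100"


def run_gh(args: list[str]) -> str:
    """Run the GitHub CLI and return stdout; gh handles authentication."""
    result = subprocess.run(["gh", *args], capture_output=True, text=True, check=True)
    return result.stdout


def discover_owned_public(runner: Callable[[list[str]], str] = run_gh) -> list[str]:
    """List owned public repositories as sorted, de-duplicated full names."""
    raw = runner(["api", "--paginate", "--slurp", DISCOVERY_ENDPOINT])
    pages = json.loads(raw)  # with --slurp: a list of pages, each a list of repos
    names = {repo["full_name"] for page in pages for repo in page}
    return sorted(names)
```

Passing a fake `runner` that returns recorded JSON keeps tests offline, in line with the "no live API in tests" claim in the PR description.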
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>