-
Notifications
You must be signed in to change notification settings - Fork 5
Fix broken vocabulary links by restoring generation from TTL sources #46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: rdhyee <153266+rdhyee@users.noreply.github.com>
Co-authored-by: rdhyee <153266+rdhyee@users.noreply.github.com>
Co-authored-by: rdhyee <153266+rdhyee@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request fixes broken vocabulary documentation links on the /models/ page by restoring the generation of vocabulary documentation from TTL (Turtle) source files. The previous documentation was hosted at isamplesorg.github.io/metadata/vocabularies/ but was missing because the generation script was non-functional.
Key Changes:
- Updated the generation script to use the working
vocab2md.pyPython script instead of the unavailablevocabCLI tool - Generated 10 vocabulary documentation files (3 core + 7 extensions) from canonical TTL sources
- Updated links in
models/index.qmdto point to the newly generated local documentation
Reviewed changes
Copilot reviewed 2 out of 12 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
scripts/generate_vocab_docs.sh |
Switched from vocab CLI to vocab2md.py script for all vocabulary generation loops |
models/index.qmd |
Updated three broken external links to point to locally generated vocabulary files |
models/generated/vocabularies/*.qmd |
Generated 3 core vocabulary documentation files from TTL sources |
models/generated/extensions/*.md |
Generated 7 extension vocabulary documentation files from TTL sources |
All generated files include appropriate warnings that they are auto-generated and should not be edited manually. The changes successfully restore the vocabulary documentation that was previously inaccessible.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
rdhyee
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review: Approve with Suggestions
✅ What's Good
- Fixes real broken links - the old external URLs no longer exist
- Uses existing tooling -
vocab2md.pywas already in scripts, just commented out - Proper separation - generated files in
models/generated/directory - Complete coverage - all 10 vocabularies (3 core + 7 extensions) regenerated
💡 Suggestions for Future Cleanup
-
Generated content committed: Consider generating docs at build time (CI/pre-render) rather than committing 6K+ lines of generated files. Could add
models/generated/to.gitignore. -
Comment clutter in script: The dead
vocab markdownlines could be removed entirely since that CLI is unavailable:# Just delete these commented lines: # vocab markdown "${SOURCE_BASE}${src}" > "${DEST_FOLDER}${fname}"
These are minor cleanups - the fix itself is correct and needed. Approving!
🤖 Review by Claude Code
rdhyee
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Update: Request Changes
After further analysis (with Codex confirmation), requesting changes before merge.
Required Changes
1. Generate at build time, not commit time
Don't commit the generated files. Instead:
- Add
models/generated/to.gitignore - Run
scripts/generate_vocab_docs.shin the Quarto pre-render or GitHub Actions workflow - Update
.github/workflows/to call the script beforequarto render
Example workflow addition:
- name: Generate vocabulary documentation
run: ./scripts/generate_vocab_docs.sh
- name: Render Quarto site
run: quarto render2. Clean up the script
Remove the dead vocab CLI lines entirely:
# DELETE these commented lines (the vocab CLI is unavailable):
# vocab markdown "${SOURCE_BASE}${src}" > "${DEST_FOLDER}${fname}"Just keep the working Python version.
3. Remove generated files from PR
The 10 generated .md and .qmd files should not be committed:
models/generated/vocabularies/*.qmdmodels/generated/extensions/*.md
Why This Matters
- Repo bloat: 6,400 lines of generated content shouldn't live in git
- Drift risk: Generated files get stale if TTL sources change
- Best practice: Generate artifacts at build time, not commit time
What to Keep
✅ The link updates in models/index.qmd are correct
✅ The script fix (Python instead of vocab CLI) is correct
🤖 Review updated by Claude Code (with Codex consultation)
Co-authored-by: rdhyee <153266+rdhyee@users.noreply.github.com>
All requested changes have been implemented in commit 27701f7:
The vocabularies will now be generated at build time by the existing workflow step (line 58 of |
rdhyee
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated Review: Approve ✅
After checking the workflow, I see quarto-pages.yml already calls generate_vocab_docs.sh before rendering. The generated files will be regenerated at build time regardless.
What's correct:
- ✅ Script fix (
vocab2md.pyinstead of brokenvocabCLI) - ✅ Link updates in
models/index.qmd
What's suboptimal but acceptable:
⚠️ Generated files committed (redundant - workflow regenerates them)⚠️ Dead commented code in script
These cleanups are tracked in issue #47 and can be done separately. Merging now fixes the broken links.
🤖 Claude Code
PR #46 switched from `vocab markdown` to `python vocab2md.py`, but vocab2md.py requires rdflib which isn't in requirements.txt. The workflow already installs vocab_tools via pipx, which provides the `vocab` CLI. This reverts to using that CLI. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
PR #46 switched from `vocab markdown` to `python vocab2md.py`, but vocab2md.py requires rdflib which isn't in requirements.txt. The workflow already installs vocab_tools via pipx, which provides the `vocab` CLI. This reverts to using that CLI. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The
/models/page had broken links to vocabulary documentation that previously lived atisamplesorg.github.io/metadata/vocabularies/. These pages are generated from TTL files but were missing because the generation script was non-functional.Changes
Updated
scripts/generate_vocab_docs.shvocabCLI to existingvocab2md.pyscriptUpdated
models/index.qmdlinksBuild-time generation (not committed)
models/generated/patterns to.gitignoreto exclude generated filesreadme.mdfiles in generated directories for documentationThe vocabularies are now co-located with the main site and generated fresh from canonical TTL sources at build time, preventing repository bloat and ensuring documentation stays in sync with source files. The GitHub Actions workflow already runs
scripts/generate_vocab_docs.shbeforequarto render(line 58 of.github/workflows/quarto-pages.yml).Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.