fix: use archive_org_indexer in update-databases workflow #2

Merged
matheusfillipe merged 1 commit into matheusfillipe:main from h4ksclaw:fix/update-databases-workflow
Apr 28, 2026

Conversation

@h4ksclaw
Contributor

The update-databases.yml workflow was still calling myrient_indexer.py (dead since March 2026) instead of archive_org_indexer.py. This caused:

  1. The indexer to fail (Myrient is down)
  2. No per-platform TSV files generated in release_databases/
  3. The combined romi_db.tsv contained only comment headers
  4. validate_db.py failed with "No data lines"
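
To illustrate failure modes 3 and 4, here is a minimal sketch of the kind of check that rejects a TSV containing only comment headers. It is not the actual `validate_db.py`: the function names `count_data_lines` and `validate` are hypothetical, and the sketch assumes comment headers are lines starting with `#`.

```python
def count_data_lines(tsv_path):
    """Count lines that are neither blank nor comments.

    Assumption: comment headers start with '#'. The real romi_db.tsv
    format may use a different convention.
    """
    count = 0
    with open(tsv_path, encoding="utf-8") as fh:
        for line in fh:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


def validate(tsv_path):
    """Raise if the file has no data rows; otherwise return the row count."""
    n = count_data_lines(tsv_path)
    if n == 0:
        # Mirrors the error text quoted in this PR description.
        raise ValueError("No data lines")
    return n
```

When the indexer step fails and no per-platform TSV files exist, a combined file built from headers alone would make a check like this raise, which matches the CI failure described above.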

Changes:

  • myrient_indexer.py → archive_org_indexer.py in the CI indexer step
  • --compact-urls → --full-urls (archive.org indexer stores complete URLs by default; sources.txt is still copied for offline mode)
  • Updated comment (no longer Myrient-specific)

Verified locally — full pipeline runs clean: indexer → combined DB → validation ✅

The CI workflow was still calling myrient_indexer.py (dead since March 2026)
instead of archive_org_indexer.py, causing the combined romi_db.tsv to be
empty and validate_db.py to fail.

Changes:
- Replace myrient_indexer.py → archive_org_indexer.py in CI step
- Use --full-urls instead of --compact-urls (archive.org indexer defaults to
  full URLs, and sources.txt is still copied for offline mode)
- Update comment about sources.txt (no longer Myrient-specific)
@matheusfillipe matheusfillipe merged commit 943d1cf into matheusfillipe:main Apr 28, 2026
1 check passed