Skip to content

add zephyr + catalogue strategy#35

Open
spoorcc wants to merge 4 commits intomainfrom
claude/add-zephyr-west-catalog-a0pSZ
Open

add zephyr + catalogue strategy#35
spoorcc wants to merge 4 commits intomainfrom
claude/add-zephyr-west-catalog-a0pSZ

Conversation

@spoorcc
Copy link
Contributor

@spoorcc spoorcc commented Mar 8, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Added Zephyr west catalog source for discovering packages from the Zephyr project
    • Introduced catalog-file strategy for catalog discovery, improving manifest handling
    • Added support for parsing west manifest files with project metadata including groups and descriptions

claude and others added 4 commits March 8, 2026 22:39
Introduces a new `west-manifest` source strategy that parses west
manifest files (west.yml) to discover project dependencies listed in
the Zephyr RTOS ecosystem.

- `dfetch_hub/catalog/sources/west.py`: new parser module
  - `WestProject(BaseManifest)` with a `groups` field for west group membership
  - `parse_west_yaml(path, limit)` resolves each project's upstream URL
    from the manifest's `remotes`/`defaults` section (or an explicit `url`)
  - Fetches README content for GitHub-hosted projects
  - Returns `None` / skips entries with unresolvable URLs gracefully
- `dfetch_hub/commands/update.py`: adds `west-manifest` strategy dispatch
  - `_download_west_yaml()` fetches the manifest file via raw GitHub HTTP
    (tries configured branch, then `main`/`master` fallbacks)
  - `_process_west_manifest_source()` orchestrates download → parse → write
- `dfetch-hub.toml`: adds active `[[source]]` example for
  `zephyrproject-rtos/zephyr` with `strategy = "west-manifest"`
- `test/test_catalog_west.py`: 45 new tests covering remote map building,
  URL resolution, group extraction, limit handling, and error paths

https://claude.ai/code/session_015y8fNYDtzCHGFXiB9TVApv
Pre-commit black formatter reformatted these test files during the
west-manifest feature run (textwrap.dedent call style normalisation).

https://claude.ai/code/session_015y8fNYDtzCHGFXiB9TVApv
Instead of a dedicated west-manifest strategy, git-wiki now dispatches
to the appropriate parser based on the source.manifest filename — the
same pattern used by the subfolders strategy.

- Add _FILE_MANIFEST_PARSERS dict mapping filenames to list-parsers:
    "Packages.md" → parse_packages_md (existing clib support)
    "west.yml"    → parse_west_yaml   (new)
- _process_git_wiki_source now looks up the parser from this dict
  instead of hardcoding parse_packages_md, and drops all technology-
  specific language from its docstring
- Remove _download_west_yaml, _process_west_manifest_source, and the
  west-manifest dispatch branch in _process_source
- Remove RAW_BRANCHES/fetch_raw/parse_vcs_slug/raw_url imports that
  were only needed by the deleted HTTP-fetch helper
- Update dfetch-hub.toml zephyr example: strategy = "git-wiki"

https://claude.ai/code/session_015y8fNYDtzCHGFXiB9TVApv
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

📝 Walkthrough

Walkthrough

The pull request introduces a new "catalog-file" strategy for catalog source discovery, replacing the existing "git-wiki" approach. It adds support for Zephyr west manifest files through a new parser module, updates the configuration to reflect this change, and refactors the update command to fetch and parse designated manifest files from repositories.

Changes

Cohort / File(s) Summary
Configuration Updates
dfetch-hub.toml, dfetch_hub/config.py
Updated clib source strategy from "git-wiki" to "catalog-file"; added new zephyr-west public source with "catalog-file" strategy; updated SourceConfig documentation to reflect new strategy name.
West YAML Parser
dfetch_hub/catalog/sources/west.py
New module implementing west manifest parsing with WestProject dataclass, remote URL mapping, and project extraction; exports parse_west_yaml() function to parse Zephyr west.yml files and build project entries with groups and metadata.
Update Command Refactor
dfetch_hub/commands/update.py
Implemented catalog-file strategy replacing git-wiki workflow; added _FILE_MANIFEST_PARSERS registry mapping manifest filenames to parsers; refactored _process_git_wiki_source to _process_catalog_file_source; integrated west.yml parsing support.
Test Suite
test/test_catalog_west.py
Comprehensive test coverage for west parser including remote map building, URL resolution, group extraction, project building, and end-to-end YAML parsing; verifies dataclass inheritance and parser integration.

Sequence Diagram(s)

sequenceDiagram
    participant Config as Source Config
    participant Fetcher as File Fetcher
    participant Parser as Manifest Parser
    participant Writer as Catalog Writer
    
    Config->>Fetcher: Fetch manifest file (catalog-file strategy)
    Note over Fetcher: Clone with single file<br/>(e.g., west.yml)
    Fetcher-->>Parser: Return manifest file
    
    alt Manifest Type Detection
        Parser->>Parser: Identify parser (west.yml → parse_west_yaml)
    end
    
    Parser->>Parser: Build remote map & extract projects
    Parser->>Parser: Resolve URLs & populate metadata
    Parser-->>Writer: Return list of project entries
    
    Writer->>Writer: Write entries to catalog
    Note over Writer: Update summary with<br/>entry count
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Better-info #26: Introduces WestProject dataclass and west.yml parser that populates manifest.urls mapping for Repository/Homepage, directly building on this PR's parser integration.
  • New-strategy #21: Modifies the same parser-dispatch mechanism in update.py, introducing manifest-parser registry that routes file-based parsers including parse_west_yaml.
  • Add dev project to main #11: Changes the same update command implementation in update.py by introducing the new "catalog-file" flow that builds on prior refactoring work.

Poem

🐰 A western breeze now fills the air,
Manifest files fetched with care,
From zephyr's wings to catalog's keep,
New sources parsed, no secrets deep,
The rabbit hops through git-wiki's past! 🌿

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add zephyr + catalogue strategy' accurately captures the main changes: adding Zephyr support and implementing a new 'catalog-file' strategy for manifest discovery across the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 98.28% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/add-zephyr-west-catalog-a0pSZ

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
test/test_catalog_west.py (1)

345-475: Add an import-manifest regression fixture here.

These tests cover flat manifest.projects well, but there is no case for self.import / imported YAML files. A tiny fixture covering one imported manifest would keep that behavior from regressing once it lands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/test_catalog_west.py` around lines 345 - 475, Add a new test to cover
manifest imports: create a tiny fixture (e.g. import_manifest_west_yaml) or use
tmp_path to write a west.yml that uses self.import to include another YAML
containing one project, then call parse_west_yaml(...) (patching
fetch_readme_for_homepage) and assert the imported project's presence and
expected fields (package_name/homepage/version/groups) to prevent regressions
for imported manifests; reference parse_west_yaml and the patched
fetch_readme_for_homepage in the test so it mirrors existing tests' style.
dfetch_hub/catalog/sources/west.py (1)

115-118: Skip non-string group values instead of coercing them.

groups is external YAML. Turning every truthy item into str(...) can leak malformed values downstream as fake tags.

💡 Tighten the filter
-    return [str(g) for g in raw if g]
+    return [g for g in raw if isinstance(g, str) and g]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch_hub/catalog/sources/west.py` around lines 115 - 118, The current
handling of entry.get("groups") coerces any truthy item to a string, which can
leak malformed values; update the logic that reads raw = entry.get("groups") to
validate and only include items that are already strings (e.g., check
isinstance(g, str)) and non-empty after trimming, skipping all other types, so
the returned list contains only proper string group names instead of str(...)
conversions; adjust the list comprehension that currently uses [str(g) for g in
raw if g] to perform this type-and-empty check before including values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dfetch_hub/catalog/sources/west.py`:
- Around line 236-265: parse_west_yaml currently only processes
manifest.get("projects") and will miss projects declared via "self: import: ..."
submanifests; update parse_west_yaml (and helpers like _load_west_manifest and
_collect_projects) to detect when the manifest contains a "self" remote with an
"import" or "import: submanifests" entry, then resolve those referenced YAML
files by loading them (recursively if they themselves contain imports) and
merging their "projects" into the projects list before returning; alternatively,
if you prefer to fail fast, make parse_west_yaml raise/log an explicit error
when any "self.import" declaration is present so incomplete catalogs are not
produced.

In `@dfetch-hub.toml`:
- Around line 122-128: The strategy overview section omits the new
"catalog-file" strategy; update the strategy list near the top of the document
to include an entry for catalog-file, briefly describing it (e.g., "catalog-file
— parse a single manifest file such as west.yml to produce per-project catalog
entries; uses manifest parsers like WestProject") and mention the manifest field
behavior so readers see this workflow without reading the examples; edit the
strategy list (the overview that enumerates available strategies) to add that
short description alongside existing strategies.

---

Nitpick comments:
In `@dfetch_hub/catalog/sources/west.py`:
- Around line 115-118: The current handling of entry.get("groups") coerces any
truthy item to a string, which can leak malformed values; update the logic that
reads raw = entry.get("groups") to validate and only include items that are
already strings (e.g., check isinstance(g, str)) and non-empty after trimming,
skipping all other types, so the returned list contains only proper string group
names instead of str(...) conversions; adjust the list comprehension that
currently uses [str(g) for g in raw if g] to perform this type-and-empty check
before including values.

In `@test/test_catalog_west.py`:
- Around line 345-475: Add a new test to cover manifest imports: create a tiny
fixture (e.g. import_manifest_west_yaml) or use tmp_path to write a west.yml
that uses self.import to include another YAML containing one project, then call
parse_west_yaml(...) (patching fetch_readme_for_homepage) and assert the
imported project's presence and expected fields
(package_name/homepage/version/groups) to prevent regressions for imported
manifests; reference parse_west_yaml and the patched fetch_readme_for_homepage
in the test so it mirrors existing tests' style.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0746e60-0ebd-4976-8fcb-cdc0641a6fce

📥 Commits

Reviewing files that changed from the base of the PR and between 2124bce and b9d9a9a.

📒 Files selected for processing (5)
  • dfetch-hub.toml
  • dfetch_hub/catalog/sources/west.py
  • dfetch_hub/commands/update.py
  • dfetch_hub/config.py
  • test/test_catalog_west.py

Comment on lines +236 to +265
def parse_west_yaml(west_yaml: "Path", limit: int | None = None) -> list[WestProject]:
"""Parse a ``west.yml`` manifest file into a list of :class:`WestProject`.

Reads the YAML file at *west_yaml*, extracts remote definitions and the
project list, and resolves each project's upstream URL. Projects whose
URL cannot be determined (missing remote, no ``url-base``) are silently
skipped with a ``debug`` log message.

Args:
west_yaml: Path to the ``west.yml`` (or equivalent) manifest file.
limit: Maximum number of projects to return. ``None`` = unlimited.

Returns:
A list of :class:`WestProject` instances, one per discovered project.
Returns an empty list on parse errors.
"""
manifest = _load_west_manifest(west_yaml)
if manifest is None:
return []

remote_bases = _build_remote_map(manifest.get("remotes", []))
default_remote = _extract_default_remote(manifest)

projects_raw = manifest.get("projects", [])
if not isinstance(projects_raw, list):
logger.warning("'manifest.projects' is not a list in %s", west_yaml)
return []

projects = _collect_projects(projects_raw, remote_bases, default_remote, limit)
logger.debug("Parsed %d project(s) from %s", len(projects), west_yaml)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the west.py file and understand its full structure
find . -name "west.py" -type f | head -20

Repository: dfetch-org/dfetch-hub

Length of output: 101


🏁 Script executed:

# Check the full content of the file to see if imports are handled anywhere
fd -e py -p "west" | head -20

Repository: dfetch-org/dfetch-hub

Length of output: 125


🏁 Script executed:

# Look for the full west.py file
git ls-files | grep -E "west\.py$"

Repository: dfetch-org/dfetch-hub

Length of output: 125


🏁 Script executed:

# Search for any mention of "import" in the west.py file and related tests
rg "import" --type py dfetch_hub/catalog/sources/west.py | head -30

Repository: dfetch-org/dfetch-hub

Length of output: 333


🏁 Script executed:

# Get the full content of west.py to see complete implementation
cat -n dfetch_hub/catalog/sources/west.py

Repository: dfetch-org/dfetch-hub

Length of output: 11070


🏁 Script executed:

# Check the test file for import-related tests
cat -n test/test_catalog_west.py

Repository: dfetch-org/dfetch-hub

Length of output: 26628


🌐 Web query:

Zephyr west.yml self import manifest submanifests

💡 Result:

In a Zephyr/west manifest (west.yml), manifest: self: import: is how you split your own manifest repository into multiple “submanifests” (extra YAML files) and have west load them automatically.

Typical pattern (import a directory of submanifests)

manifest:
  projects:
    - name: zephyr
      remote: zephyrproject-rtos
      revision: v3.7.0
      import: west.yml     # import Zephyr's manifest from the zephyr repo
  self:
    import: submanifests   # import YAML files from this manifest repo (local FS)

West will import all .yml files under submanifests/ (directory tree), sorted by filename. Prefixing with numbers is a common way to control ordering (e.g., 01-base.yml, 02-vendor.yml, etc.). [1]

Key behavior differences: projects: import: vs self: import:

  • projects: … import: pulls manifest data from Git at the project’s manifest-rev (what west update fetches/checks out).
  • self: import: pulls manifest data from your local filesystem (your manifest repo), because west does not manage version control for the manifest repository itself. [1]

Precedence / override behavior

self: import: is processed before the main manifest file and before project imports; projects defined in your submanifests therefore have highest precedence (they can override/replace projects of the same name coming from Zephyr’s west.yml, etc.). [1]

Source: Zephyr “West Manifests” documentation (Manifest Imports, self: import, directory import ordering, and precedence). [1]


🏁 Script executed:

# Also search for any mention of "import" handling in the codebase
rg "import" dfetch_hub/catalog/sources/west.py

Repository: dfetch-org/dfetch-hub

Length of output: 333


🏁 Script executed:

# Check if there are any helper functions or future plans for import support
rg -A 5 -B 5 "manifest.get" dfetch_hub/catalog/sources/west.py

Repository: dfetch-org/dfetch-hub

Length of output: 568


🏁 Script executed:

# Confirm the specific projects mentioned in the review are real
# by checking if there's any documentation or test coverage
rg -i "canopennode|chre|tflite-micro|thrift|zephyr-lang-rust" dfetch_hub/

Repository: dfetch-org/dfetch-hub

Length of output: 47


Resolve west self.import manifests before shipping the Zephyr source.

parse_west_yaml() only walks the current file's manifest.projects. Upstream Zephyr's current west.yml ends with self: import: submanifests, and submanifests/optional.yaml adds extra projects like canopennode, chre, tflite-micro, thrift, and zephyr-lang-rust. With the current parser — and the catalog-file flow fetching only west.ymlzephyr-west will silently publish an incomplete catalog unless imported manifests are loaded too. Either resolve self.import entries recursively or fail fast when a manifest declares them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch_hub/catalog/sources/west.py` around lines 236 - 265, parse_west_yaml
currently only processes manifest.get("projects") and will miss projects
declared via "self: import: ..." submanifests; update parse_west_yaml (and
helpers like _load_west_manifest and _collect_projects) to detect when the
manifest contains a "self" remote with an "import" or "import: submanifests"
entry, then resolve those referenced YAML files by loading them (recursively if
they themselves contain imports) and merging their "projects" into the projects
list before returning; alternatively, if you prefer to fail fast, make
parse_west_yaml raise/log an explicit error when any "self.import" declaration
is present so incomplete catalogs are not produced.

Comment on lines +122 to +128
# ── EXAMPLE: catalog-file — Zephyr project dependencies ─────────────────────
# Parses the west.yml in the Zephyr RTOS repository and catalogues every
# project listed in manifest.projects (HAL modules, MCUboot, mbedTLS, …).
# Each project becomes a separate catalog entry pointing to its upstream repo.
# Uses strategy='catalog-file' with manifest='west.yml' — only that single file
# is fetched; the manifest field selects the parser
# (west.yml → WestProject; Packages.md → CLibPackage, …).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Document catalog-file in the strategy overview, not just in the examples.

This sample config now teaches a new strategy, but the strategy list near Lines 7-17 still omits it. Readers who start at the header will miss the new workflow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch-hub.toml` around lines 122 - 128, The strategy overview section omits
the new "catalog-file" strategy; update the strategy list near the top of the
document to include an entry for catalog-file, briefly describing it (e.g.,
"catalog-file — parse a single manifest file such as west.yml to produce
per-project catalog entries; uses manifest parsers like WestProject") and
mention the manifest field behavior so readers see this workflow without reading
the examples; edit the strategy list (the overview that enumerates available
strategies) to add that short description alongside existing strategies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants