feat: replace mx-source/mx-target with unified mx loader #147
nicolasnoble wants to merge 7 commits into main from
Conversation
Consolidated the duplicated project context across CLAUDE.md, Copilot instructions, and Cursor rules into concise pointers to new standalone reference docs. Extracted architecture details into docs/ARCHITECTURE.md and deployment/configuration info into docs/DEPLOYMENT.md. Updated CONTRIBUTING.md, README.md, and example READMEs to match the new structure. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
- Fix copy-paste artifact in CONTRIBUTING.md ("The Dynamo project" -> "The ModelExpress project")
- Add ARCHITECTURE.md and DEPLOYMENT.md to the docs/ directory tree in ARCHITECTURE.md
- Add missing min_free_space_bytes field to the ServerConfig YAML example in ARCHITECTURE.md to match DEPLOYMENT.md and the actual config struct
- Add REDIS_URL to the P2P environment variables table in DEPLOYMENT.md, consistent with CONTRIBUTING.md and the K8s examples
Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
The "Enable P2P Transfers via NIXL" PR (#135) introduced significant new functionality without updating all documentation files per the project's documentation rules. This commit closes the remaining gaps:

ARCHITECTURE.md:
- Fix __init__.py description (no longer claims auto-registration on import)
- Add --worker-cls usage to vllm_worker.py description
- Add check_session_changed() and close() to MxClient methods table
- Fix NixlTransferManager: correct __init__ params, add get_registered_descriptors(), fix destroy() -> shutdown(), update receive_from_source signature
- Correct MxTargetModelLoader base class to DummyModelLoader
- Document transfer-time coalescing of contiguous regions
- Document _raw_tensor_registry and _nixl_managers globals
- Add MODEL_NAME and MX_SERVER_ADDRESS environment variables

DEPLOYMENT.md:
- Add MODEL_NAME and MX_SERVER_ADDRESS to P2P env var table
- Add --worker-cls usage note for vLLM instances

CONTRIBUTING.md:
- Add Python 3.10+ to prerequisites
- Add pip install -e for Python client dev setup
- Add pytest and generate_proto.sh to Available Commands table

Also includes pre-commit auto-fixes: trailing whitespace cleanup and missing final newlines in several files.

Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
- "not complete" -> "incomplete" in CLAUDE.md, copilot-instructions.md, and rust.mdc
- "Read files" -> "Always read files" in copilot-instructions.md and rust.mdc to match CLAUDE.md
- Vary repeated "For" sentence openers in README.md

Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Replace the two separate loaders (mx-source, mx-target) with a single MxModelLoader registered as --load-format mx. It auto-detects whether to load model weights from disk or receive them via RDMA from an existing source, eliminating the need to manually label nodes.

- Add MxModelLoader with one-shot _detect_source() check against MX server: if ready source exists, receive via RDMA; otherwise load from disk. Both paths register with NIXL and publish metadata so future nodes can discover this one
- Remove MxSourceModelLoader and MxTargetModelLoader classes
- Extract shared helpers (_collect_cuda_tensors, _init_nixl_manager, _log_tensor_summary, _publish_metadata_and_ready) from duplicated code
- Delete vllm-target.yaml, rename vllm-source.yaml to vllm.yaml with --load-format mx
- Add unit tests for shared helpers and detection logic
- Update documentation (ARCHITECTURE.md, DEPLOYMENT.md, CONTRIBUTING.md, K8s README)

Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
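The auto-detection flow described in this commit message can be sketched roughly as follows. This is a hedged illustration only: `MxClient`'s real query API, the response fields, and the helper names below are stand-ins, not the actual ModelExpress interfaces.

```python
# Sketch of the unified loader's decision flow. The client object and its
# get_ready() method are illustrative stand-ins for the real MX server API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadySource:
    session_id: str
    worker_id: int


def detect_source(client, model_name: str, device_id: int) -> Optional[ReadySource]:
    """One-shot check: is a ready source already registered for this model?"""
    resp = client.get_ready(model_name, device_id)  # assumed query method
    if resp and resp.found:
        return ReadySource(resp.session_id, resp.worker_id)
    return None  # no ready source -> fall back to loading from disk


def load(client, model_name: str, device_id: int) -> str:
    source = detect_source(client, model_name, device_id)
    if source is not None:
        path = "rdma"  # receive weights from the peer via NIXL
    else:
        path = "disk"  # cold start: read weights from local storage
    # Either way, the loader registers tensors and publishes metadata so
    # future nodes can discover this one as a source.
    return path
```

The key property is that the decision is made once, from server state, so no manual source/target node labeling is needed.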
Walkthrough

This PR reorganizes project documentation to emphasize development practices and deployment guides, consolidates the Python vLLM loader implementation from a dual-source/target approach into a unified auto-detecting loader, and simplifies Kubernetes deployment patterns and naming conventions.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
end
A -- "RDMA via NIXL" --> B
A --> S
B --> S
Should the arrow be opposite?
-B --> S
+B <-- S
target_device = torch.device(load_device)
logger.debug(f"Target device: {target_device}")
device_id = _get_worker_rank(target_device)
model_name = os.environ.get("MODEL_NAME", "")
We can (and probably should?) get model name from model_config
if not model_name:
    logger.debug(f"[Worker {device_id}] MODEL_NAME not set, defaulting to source")
    return None
Model name is always provided through model_config
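The reviewer's point above — that vLLM always supplies the model name through `ModelConfig` — can be sketched like this. The `ModelConfig` class here is a minimal stand-in for `vllm.config.ModelConfig` (which stores the model name/path in its `model` field), and `resolve_model_name` is a hypothetical helper, not code from the PR.

```python
# Illustrative sketch: prefer the name vLLM already passes in via
# ModelConfig over the MODEL_NAME env var. ModelConfig below is a
# stand-in dataclass, not the real vLLM class.
import os
from dataclasses import dataclass


@dataclass
class ModelConfig:
    model: str  # vLLM keeps the model name/path in the `model` field


def resolve_model_name(model_config: ModelConfig) -> str:
    # Env var kept only as an explicit override; model_config is the
    # default, so a missing MODEL_NAME no longer forces a fallback path.
    return os.environ.get("MODEL_NAME") or model_config.model
```

With this shape, the `if not model_name: return None` fallback shown above becomes unnecessary, since the config always carries a name.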
transfer_retries = 120
transfer_retry_delay = 30
cached_session_id = None
Can we pull these constants out instead of leaving them as local vars?
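One way to act on this suggestion is to hoist the retry knobs to module level so they are discoverable and tunable in one place. Constant names and the polling helper below are illustrative (only the values come from the diff above), not the PR's actual code.

```python
# Sketch of hoisting the retry knobs out of local scope into
# module-level constants (names are illustrative; values match the
# locals shown in the diff).
import time

TRANSFER_RETRIES = 120        # total attempts while waiting on the peer
TRANSFER_RETRY_DELAY_S = 30   # seconds between attempts (~1 h worst case)


def wait_for_transfer(check_ready) -> bool:
    """Poll check_ready() until it succeeds or the retry budget runs out."""
    for _ in range(TRANSFER_RETRIES):
        if check_ready():
            return True
        time.sleep(TRANSFER_RETRY_DELAY_S)
    return False
```

Module-level constants also make it easy to later source these from env vars or config without touching the loop body.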
def _receive_from_peer(
    self,
    model: nn.Module,
    device: torch.device,
model_config: ModelConfig,
target_device: torch.device,
device_id: int,
model_name: str,
do we need to pass this param if we can get it from model_config?
logger.info(f"[Worker {device_id}] Processing weights (FP8 transformation)...")
process_weights_after_loading(model, model_config, target_device)
logger.info(f"[Worker {device_id}] Weight processing complete")
nit suggestion:
-logger.info(f"[Worker {device_id}] Processing weights (FP8 transformation)...")
-process_weights_after_loading(model, model_config, target_device)
-logger.info(f"[Worker {device_id}] Weight processing complete")
+process_weights_after_loading(model, model_config, target_device)
"""Load weights from disk, then register + publish."""
logger.info(f"[Worker {device_id}] Loading weights from disk...")
self._disk_loader.load_weights(model, model_config)
logger.info(f"[Worker {device_id}] Weights loaded from disk")
nit suggestion:
-logger.info(f"[Worker {device_id}] Weights loaded from disk")
logger.info(f"[Worker {device_id}] Processing weights (FP8 transformation)...")
process_weights_after_loading(model, model_config, target_device)
logger.info(f"[Worker {device_id}] Weight processing complete")
nit suggestion:
-logger.info(f"[Worker {device_id}] Processing weights (FP8 transformation)...")
-process_weights_after_loading(model, model_config, target_device)
-logger.info(f"[Worker {device_id}] Weight processing complete")
+process_weights_after_loading(model, model_config, target_device)
Several questions: if I start a source node with TP8 and then start a second node with exactly the same model but TP4, how will this be resolved? Or what if the source node is TP4 and the second node is PP4? Does the metadata contain enough info for such cases? Will it detect that it cannot fetch from the source?
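The compatibility check these questions point at could look roughly like this. Nothing here exists in the PR — the metadata fields and the helper are hypothetical, illustrating only the shape of a check that would let a mismatched node fall back to disk.

```python
# Hypothetical sketch: published metadata would need to carry the
# source's parallelism layout so that, e.g., a TP4 or PP4 node refuses
# a TP8 source instead of receiving incompatible per-rank tensors.
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardLayout:
    tensor_parallel: int
    pipeline_parallel: int


def can_fetch_from_source(source: ShardLayout, local: ShardLayout) -> bool:
    # Per-rank tensor shapes only line up when both layouts match
    # exactly; any mismatch must be detected and resolved by loading
    # the weights from disk instead.
    return source == local
```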
self._receive_from_peer(model, target_device, device_id, model_name, source_worker)

# Register with NIXL + publish so future nodes can discover us
self._register_and_publish(model, target_device, device_id, model_name)
In this scenario we already have a source with the same info, which this target loaded from. Is it necessary to register and publish again?
# Conflicts:
#   examples/p2p_transfer_k8s/deploy/vllm-source.yaml
#   examples/p2p_transfer_k8s/deploy/vllm-target.yaml
#   examples/p2p_transfer_k8s/vllm-source.yaml
#   examples/p2p_transfer_k8s/vllm.yaml
#   modelexpress_client/python/modelexpress/vllm_loader.py
MxModelLoader was missing download_model() and load_weights() from the BaseModelLoader ABC, causing a TypeError crash at instantiation on k8s. Both methods delegate to the internal disk loader (DefaultModelLoader).

- Add conftest.py to mock vLLM modules for local test execution
- Add regression tests for abstract method completeness and delegation

Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
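The "abstract method completeness" regression test mentioned above could be sketched like this. The `BaseModelLoader` and `MxModelLoader` classes below are minimal stand-ins (the real abstract surface of vLLM's loader ABC differs), illustrating how `__abstractmethods__` catches a missing override in CI instead of with a TypeError on the cluster.

```python
# Sketch: assert the concrete loader implements every abstract method of
# its base class. The classes here are stand-ins, not the real vLLM or
# ModelExpress classes.
from abc import ABC, abstractmethod


class BaseModelLoader(ABC):
    @abstractmethod
    def download_model(self, model_config): ...

    @abstractmethod
    def load_weights(self, model, model_config): ...

    @abstractmethod
    def load_model(self, model_config): ...


class MxModelLoader(BaseModelLoader):
    # Both previously-missing methods delegate to the disk loader in the
    # real fix; here they just return markers for the test.
    def download_model(self, model_config): return "downloaded"
    def load_weights(self, model, model_config): return "loaded"
    def load_model(self, model_config): return "model"


def unimplemented_abstract_methods(cls) -> set:
    """Names of abstract methods the class has not overridden."""
    return set(getattr(cls, "__abstractmethods__", set()))
```

An instantiation check (`MxModelLoader()`) in a unit test achieves the same thing, since Python raises TypeError for any class with a non-empty `__abstractmethods__`.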
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/p2p_transfer_k8s/deploy/persistence/vllm-target-redis.yaml (2)
121-128: ⚠️ Potential issue | 🔴 Critical

Switch `mx-target` to unified `mx` to prevent boot failure.

Line 127 still uses `--load-format mx-target`. With unified loader registration, this can fail at startup due to an unsupported load format.

🔧 Proposed fix

-          # Start vLLM with mx-target loader
-          # mx-target: Creates dummy weights, receives RAW tensors via RDMA,
-          #            THEN runs FP8 processing (identical to source)
-          # This ensures weight_scale_inv is transferred BEFORE being processed
+          # Start vLLM with unified mx loader (auto-detect source vs disk)
           python3 -m vllm.entrypoints.openai.api_server \
             --model ${MODEL_NAME} \
-            --load-format mx-target \
+            --load-format mx \
             --tensor-parallel-size 8 \
             --enable-expert-parallel &

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/p2p_transfer_k8s/deploy/persistence/vllm-target-redis.yaml` around lines 121 - 128, The startup uses the vLLM OpenAI API server entrypoint (python3 -m vllm.entrypoints.openai.api_server) with the flag --load-format set to the deprecated/unsupported "mx-target"; change that flag value to the unified loader name "mx" so the server uses the registered unified loader (replace --load-format mx-target with --load-format mx in the command that starts the server).
132-143: ⚠️ Potential issue | 🟠 Major

Add timeout and process-liveness checks to readiness wait loop.
Lines 137-143 loop forever on health failures. If the vLLM process exits early (or never becomes healthy), the pod can hang indefinitely instead of failing fast.
🔧 Proposed fix
 echo "Waiting for vLLM to be ready..."
-python3 -c '
-import time
-import urllib.request
-while True:
-    try:
-        urllib.request.urlopen("http://localhost:8000/health", timeout=3)
-        break
-    except Exception:
-        time.sleep(3)
-'
+python3 - "$VLLM_PID" <<'PY'
+import os
+import sys
+import time
+import urllib.request
+
+pid = int(sys.argv[1])
+deadline = time.time() + 900  # 15 min max wait
+
+while True:
+    if time.time() > deadline:
+        raise SystemExit("Timed out waiting for vLLM health endpoint")
+    try:
+        urllib.request.urlopen("http://localhost:8000/health", timeout=3)
+        break
+    except Exception:
+        try:
+            os.kill(pid, 0)  # still alive?
+        except OSError:
+            raise SystemExit("vLLM process exited before becoming healthy")
+        time.sleep(3)
+PY
Verify each finding against the current code and only fix it if needed. In `@examples/p2p_transfer_k8s/deploy/persistence/vllm-target-redis.yaml` around lines 132 - 143, The current python3 -c health-wait loop that polls "http://localhost:8000/health" in a while True can hang forever; update that block to enforce a total timeout (e.g., configurable seconds) and to perform a process-liveness check so it fails fast if vLLM exits (instead of looping indefinitely). Concretely, modify the python3 -c script in the readiness block (the while True loop that calls urllib.request.urlopen) to track elapsed time and raise SystemExit/non-zero when the timeout is exceeded, and also verify the target process (e.g., PID 1 or the vLLM process) is still running between attempts—exit non-zero if the process is gone. Ensure the script prints a clear error and returns non-zero so the pod fails readiness if health never becomes OK or the process dies.
🧹 Nitpick comments (3)
modelexpress_client/python/modelexpress/vllm_loader.py (2)
225-229: Use `logging.exception` for cleaner stack trace logging.

The static analysis correctly identifies that when logging an exception in an except block, `logging.exception()` automatically includes the traceback without needing manual `traceback.format_exc()`.

♻️ Suggested fix

 except Exception as e:
-    import traceback
-    logger.error(f"[Worker {device_id}] EXCEPTION publishing metadata: {e}")
-    logger.error(f"[Worker {device_id}] Traceback: {traceback.format_exc()}")
+    logger.exception(f"[Worker {device_id}] EXCEPTION publishing metadata: {e}")
     raise
Verify each finding against the current code and only fix it if needed. In `@modelexpress_client/python/modelexpress/vllm_loader.py` around lines 225 - 229, Replace the two logger.error calls inside the exception handler that currently log e and traceback.format_exc() with a single logger.exception call so the stack trace is automatically included; specifically, in the except Exception as e block (the handler referencing device_id in modelexpress.vllm_loader.py) call logger.exception with a clear message like "[Worker {device_id}] EXCEPTION publishing metadata" to capture the exception and traceback cleanly.
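A minimal, runnable demonstration of why the suggested fix works: `logger.exception()` attaches the current traceback automatically, so the manual `traceback.format_exc()` call and its import can be dropped. The helper names below are for the demo only.

```python
# Demo: logger.exception() includes the traceback in the log output on
# its own, replacing the logger.error(...) + format_exc() pair.
import io
import logging


def publish_with_logging(logger: logging.Logger, device_id: int) -> None:
    try:
        raise RuntimeError("publish failed")  # stand-in for the real failure
    except Exception:
        # One call captures message + traceback together.
        logger.exception(f"[Worker {device_id}] EXCEPTION publishing metadata")


def captured_log_output(device_id: int) -> str:
    """Run the demo against an in-memory stream and return what was logged."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo-{device_id}")
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(logging.DEBUG)
    publish_with_logging(logger, device_id)
    return stream.getvalue()
```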
458-463: Silent exception swallowing loses diagnostic information.

The `try/except/pass` block silently ignores errors when capturing the session ID. Consider logging at debug level to aid troubleshooting.

♻️ Suggested fix

 try:
     ready_resp = self._mx_client.get_ready(model_name, device_id)
     if ready_resp.found:
         cached_session_id = ready_resp.session_id
-except Exception:
-    pass
+except Exception as e:
+    logger.debug(f"[Worker {device_id}] Could not capture session ID: {e}")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelexpress_client/python/modelexpress/vllm_loader.py` around lines 458 - 463, The try/except around self._mx_client.get_ready is swallowing exceptions and losing diagnostics; update the except to capture the exception (except Exception as e) and log it at debug level using the module/class logger (e.g., self._logger.debug or the existing logger) including context like model_name, device_id and the exception, while preserving the behavior of not rethrowing so cached_session_id is only set when ready_resp.found.

modelexpress_client/python/tests/test_vllm_loader.py (1)
279-280: Simplify the assertion for the `publish_ready` call.

The dual-check (`call_kwargs[1]["model_name"]` or `call_kwargs.kwargs.get("model_name")`) is fragile and depends on how the mock captures arguments. Consider using a cleaner approach.

♻️ Suggested fix

-call_kwargs = mx_client.publish_ready.call_args
-assert call_kwargs[1]["model_name"] == "test-model" or call_kwargs.kwargs.get("model_name") == "test-model"
+mx_client.publish_ready.assert_called_once()
+_, kwargs = mx_client.publish_ready.call_args
+assert kwargs["model_name"] == "test-model"
Verify each finding against the current code and only fix it if needed. In `@modelexpress_client/python/tests/test_vllm_loader.py` around lines 279 - 280, Replace the fragile dual-access assertion with a single, clear check using the mock's call_args tuple unpacking or assert_called_with: capture the kwargs via "_, kwargs = mx_client.publish_ready.call_args" (or directly use "mx_client.publish_ready.assert_called_with(model_name='test-model')" ) and assert "kwargs['model_name'] == 'test-model'"; this targets the publish_ready mock reliably and removes the brittle indexing approach.
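A runnable illustration of the suggested assertion pattern: unpacking `call_args` into `(args, kwargs)` works regardless of how the call was made, instead of indexing into the mock's internal tuple. `publish_ready` here is a stand-in mock, not the real ModelExpress client method.

```python
# Demo of unittest.mock call_args unpacking, as suggested in the review.
from unittest.mock import MagicMock

mx_client = MagicMock()

# Code under test would invoke the method with keyword arguments:
mx_client.publish_ready(model_name="test-model", device_id=0)

# Robust assertions: exactly one call, then unpack kwargs explicitly.
mx_client.publish_ready.assert_called_once()
_, kwargs = mx_client.publish_ready.call_args
```

`call_args` is a `Call` object that iterates as `(args, kwargs)`, so the unpacking above is stable across mock versions, unlike positional indexing with `call_args[1]`.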
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/p2p_transfer_k8s/model-download.yaml`:
- Line 10: Update the usage/deploy instructions to include a Redis flush safety
step before redeploying vLLM: add a note immediately after the "Deploy
deploy/vllm.yaml" guidance (the comment line "# 4. Deploy deploy/vllm.yaml")
instructing operators to run redis-cli FLUSHALL to clear Redis metadata before
deleting/redeploying vLLM instances; ensure the note clearly states this is
required to avoid stale metadata during redeploy flows.
---
Outside diff comments:
In `@examples/p2p_transfer_k8s/deploy/persistence/vllm-target-redis.yaml`:
- Around line 121-128: The startup uses the vLLM OpenAI API server entrypoint
(python3 -m vllm.entrypoints.openai.api_server) with the flag --load-format set
to the deprecated/unsupported "mx-target"; change that flag value to the unified
loader name "mx" so the server uses the registered unified loader (replace
--load-format mx-target with --load-format mx in the command that starts the
server).
- Around line 132-143: The current python3 -c health-wait loop that polls
"http://localhost:8000/health" in a while True can hang forever; update that
block to enforce a total timeout (e.g., configurable seconds) and to perform a
process-liveness check so it fails fast if vLLM exits (instead of looping
indefinitely). Concretely, modify the python3 -c script in the readiness block
(the while True loop that calls urllib.request.urlopen) to track elapsed time
and raise SystemExit/non-zero when the timeout is exceeded, and also verify the
target process (e.g., PID 1 or the vLLM process) is still running between
attempts—exit non-zero if the process is gone. Ensure the script prints a clear
error and returns non-zero so the pod fails readiness if health never becomes OK
or the process dies.
---
Nitpick comments:
In `@modelexpress_client/python/modelexpress/vllm_loader.py`:
- Around line 225-229: Replace the two logger.error calls inside the exception
handler that currently log e and traceback.format_exc() with a single
logger.exception call so the stack trace is automatically included;
specifically, in the except Exception as e block (the handler referencing
device_id in modelexpress.vllm_loader.py) call logger.exception with a clear
message like "[Worker {device_id}] EXCEPTION publishing metadata" to capture the
exception and traceback cleanly.
- Around line 458-463: The try/except around self._mx_client.get_ready is
swallowing exceptions and losing diagnostics; update the except to capture the
exception (except Exception as e) and log it at debug level using the
module/class logger (e.g., self._logger.debug or the existing logger) including
context like model_name, device_id and the exception, while preserving the
behavior of not rethrowing so cached_session_id is only set when
ready_resp.found.
In `@modelexpress_client/python/tests/test_vllm_loader.py`:
- Around line 279-280: Replace the fragile dual-access assertion with a single,
clear check using the mock's call_args tuple unpacking or assert_called_with:
capture the kwargs via "_, kwargs = mx_client.publish_ready.call_args" (or
directly use
"mx_client.publish_ready.assert_called_with(model_name='test-model')" ) and
assert "kwargs['model_name'] == 'test-model'"; this targets the publish_ready
mock reliably and removes the brittle indexing approach.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (27)
- .cursor/rules/rust.mdc
- .dockerignore
- .github/copilot-instructions.md
- .github/dco.yml
- .github/workflows/ci.yml
- CLAUDE.md
- CODE_OF_CONDUCT.md
- CONTRIBUTING.md
- README.md
- SECURITY.md
- docs/ARCHITECTURE.md
- docs/DEPLOYMENT.md
- examples/p2p_transfer_k8s/README.md
- examples/p2p_transfer_k8s/deploy/persistence/vllm-target-redis.yaml
- examples/p2p_transfer_k8s/deploy/vllm-target.yaml
- examples/p2p_transfer_k8s/deploy/vllm.yaml
- examples/p2p_transfer_k8s/model-download.yaml
- helm/README.md
- helm/deploy.sh
- modelexpress_client/python/modelexpress/__init__.py
- modelexpress_client/python/modelexpress/types.py
- modelexpress_client/python/modelexpress/vllm_loader.py
- modelexpress_client/python/modelexpress/vllm_worker.py
- modelexpress_client/python/tests/__init__.py
- modelexpress_client/python/tests/conftest.py
- modelexpress_client/python/tests/test_vllm_loader.py
- modelexpress_common/proto/p2p.proto
💤 Files with no reviewable changes (3)
- modelexpress_client/python/tests/__init__.py
- examples/p2p_transfer_k8s/deploy/vllm-target.yaml
- modelexpress_client/python/modelexpress/types.py
  # 2. kubectl apply -f model-download.yaml
  # 3. Wait for job to complete: kubectl wait --for=condition=complete job/model-download --timeout=1h
- # 4. Deploy vllm-source.yaml and vllm-target.yaml
+ # 4. Deploy deploy/vllm.yaml
Add the Redis flush redeploy step to usage instructions.
Line 10 updates deploy guidance, but the redeploy safety step is still missing. Please add a Redis flush note before redeploy flows to avoid stale metadata.
Suggested doc patch
-# 4. Deploy deploy/vllm.yaml
+# 4. If redeploying vLLM, flush Redis first:
+# kubectl exec -it <redis-pod> -- redis-cli FLUSHALL
+# 5. Deploy deploy/vllm.yaml

As per coding guidelines "When deploying to Kubernetes, flush Redis with redis-cli FLUSHALL before deleting and redeploying vLLM instances to clear stale metadata".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-# 4. Deploy deploy/vllm.yaml
+# 4. If redeploying vLLM, flush Redis first:
+#    kubectl exec -it <redis-pod> -- redis-cli FLUSHALL
+# 5. Deploy deploy/vllm.yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/p2p_transfer_k8s/model-download.yaml` at line 10, Update the
usage/deploy instructions to include a Redis flush safety step before
redeploying vLLM: add a note immediately after the "Deploy deploy/vllm.yaml"
guidance (the comment line "# 4. Deploy deploy/vllm.yaml") instructing
operators to run redis-cli FLUSHALL to clear Redis metadata before
deleting/redeploying vLLM instances; ensure the note clearly states this is
required to avoid stale metadata during redeploy flows.
Replace the two separate loaders (mx-source, mx-target) with a single
MxModelLoader registered as --load-format mx. It auto-detects whether to
load model weights from disk or receive them via RDMA from an existing
source, eliminating the need to manually label nodes.
- Add MxModelLoader with one-shot _detect_source() check against MX server: if ready source exists, receive via RDMA; otherwise load from disk. Both paths register with NIXL and publish metadata so future nodes can discover this one
- Remove MxSourceModelLoader and MxTargetModelLoader classes
- Extract shared helpers (_collect_cuda_tensors, _init_nixl_manager, _log_tensor_summary, _publish_metadata_and_ready) from duplicated code
- Delete vllm-target.yaml, rename vllm-source.yaml to vllm.yaml with --load-format mx
- Add unit tests for shared helpers and detection logic
- Update documentation (ARCHITECTURE.md, DEPLOYMENT.md, CONTRIBUTING.md, K8s README)
PR built on top of "docs: overhaul AI instructions and extract reference documentation" (#146) to avoid documentation drift.
Summary by CodeRabbit
Release Notes
Documentation
New Features
Chores