diff --git a/architecture/cli.md b/architecture/cli.md
index eba2a3ca7..26bc725c0 100644
--- a/architecture/cli.md
+++ b/architecture/cli.md
@@ -1,12 +1,12 @@
 # CLI
 
-The CLI (`data-designer`) provides an interactive command-line interface for configuring models, providers, tools, and personas, as well as running dataset generation. It uses a layered architecture for config management and delegates generation to the public `DataDesigner` API.
+The CLI (`data-designer`) provides an interactive command-line interface for configuring models, providers, MCP providers, and tools; downloading managed persona datasets; discovering, installing, and uninstalling plugin packages from catalogs; and running dataset generation. It uses a layered architecture for setup workflows and delegates generation to the public `DataDesigner` API.
 
 Source: `packages/data-designer/src/data_designer/cli/`
 
 ## Overview
 
-The CLI is built on Typer with lazy command loading to keep startup fast. Config management commands follow a **command → controller → service → repository** layering pattern. Generation commands bypass this stack and use the public `DataDesigner` class directly.
+The CLI is built on Typer with lazy command loading to keep startup fast. Config management and plugin catalog commands follow a **command → controller → service → repository** layering pattern. Generation commands bypass this stack and use the public `DataDesigner` class directly.
 
 ## Key Components
 
@@ -20,9 +20,9 @@ The CLI is built on Typer with lazy command loading to keep startup fast. Config
 
`create_lazy_typer_group` and `_LazyCommand` stubs defer importing command modules until a command is actually invoked. This keeps `data-designer --help` fast — only the command names and descriptions are loaded eagerly; the full module (and its dependencies) loads on first use.
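The deferred-import pattern described above can be sketched roughly as follows. This is a minimal, hypothetical stand-in; the real `create_lazy_typer_group` and `_LazyCommand` wire into Typer/Click and have different signatures.

```python
# Minimal sketch of lazy command loading: the command module is imported only
# when the command is first invoked, so --help stays fast. Names here are
# illustrative, not the real data_designer API.
from __future__ import annotations

import importlib
from typing import Any, Callable


class LazyCommand:
    def __init__(self, module_path: str, attr: str, help_text: str) -> None:
        self.module_path = module_path  # e.g. "data_designer.cli.commands.models"
        self.attr = attr                # callable to look up once the module loads
        self.help = help_text           # shown by --help without importing anything
        self._resolved: Callable[..., Any] | None = None

    def invoke(self, *args: Any, **kwargs: Any) -> Any:
        if self._resolved is None:  # the import cost is paid here, on first use
            module = importlib.import_module(self.module_path)
            self._resolved = getattr(module, self.attr)
        return self._resolved(*args, **kwargs)
```

A `--help` listing only reads `help`; the underlying module import happens inside `invoke`.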
-### Layering Pattern (Config Management) +### Layering Pattern (Setup Workflows) -Config management commands (models, providers, tools, personas) follow a consistent four-layer pattern: +Config management commands (models, providers, MCP providers, tools) follow a consistent four-layer pattern: | Layer | Role | Example | |-------|------|---------| @@ -31,10 +31,22 @@ Config management commands (models, providers, tools, personas) follow a consist | **Service** | Domain rules: uniqueness, merge, delete-all | `ModelService.add/update/delete` over `ModelRepository` | | **Repository** | File I/O for typed config registries | `ModelRepository` extends `ConfigRepository[ModelConfigRegistry]` | -Repositories: `ModelRepository`, `ProviderRepository`, `ToolRepository`, `MCPProviderRepository`, `PersonaRepository`. +Repositories: `ModelRepository`, `ProviderRepository`, `MCPProviderRepository`, and `ToolRepository`. +`PersonaRepository` provides read-only locale metadata for managed persona dataset downloads. Services mirror the repository domains with business logic (validation, conflict resolution). 
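The service-over-repository split can be illustrated with simplified in-memory stand-ins. These are hypothetical; the real `ModelService`/`ModelRepository` persist typed registries to YAML files and expose a richer API.

```python
# Illustrative sketch of the layering: the service owns domain rules such as
# name uniqueness, while the repository only persists. Classes here are
# simplified stand-ins, not the real data_designer implementations.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ModelConfig:
    name: str
    model_id: str


class InMemoryModelRepository:
    """Stands in for file-backed ConfigRepository[ModelConfigRegistry] I/O."""

    def __init__(self) -> None:
        self._items: dict[str, ModelConfig] = {}

    def load_all(self) -> list[ModelConfig]:
        return list(self._items.values())

    def save(self, config: ModelConfig) -> None:
        self._items[config.name] = config


class ModelService:
    def __init__(self, repository: InMemoryModelRepository) -> None:
        self.repository = repository

    def add(self, config: ModelConfig) -> None:
        # The domain rule lives in the service, not the repository: names are unique.
        if any(existing.name == config.name for existing in self.repository.load_all()):
            raise ValueError(f"model {config.name!r} already exists")
        self.repository.save(config)
```

Because the repository is injected, services can be tested against an in-memory fake and later swapped onto file-backed storage without changes.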
+Plugin catalog commands use the same layering shape: + +| Layer | Role | Example | +|-------|------|---------| +| **Command** | Thin Typer entry, wires `DATA_DESIGNER_HOME` and command options | `plugin` subcommands (`list`, `search`, `info`, `install`, `uninstall`, `installed`, `catalog`) → `PluginCatalogController(DATA_DESIGNER_HOME)` | +| **Controller** | UX flow: catalog tables, package metadata, compatibility display, install/uninstall confirmations | `PluginCatalogController` composes catalog + install services | +| **Service** | Domain rules: package listing, compatibility checks, uv/pip install and uninstall commands, plugin discovery verification | `PluginCatalogService`, `PluginInstallService` | +| **Repository** | File/cache I/O for catalog aliases and catalog documents | `PluginCatalogRepository` | + +The built-in `nvidia` catalog points at `https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json`. `NVIDIA-NeMo/DataDesignerPlugins` defines the catalog format. Each catalog entry is an installable package with docs, install metadata, compatibility constraints, and one or more runtime plugins. Users install and uninstall packages, not individual runtime plugins. Commands that take a package name also accept the package alias from the `data-designer-{alias}` package-name pattern; for example, `data-designer-calculator` can be addressed as `calculator`. 
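The `data-designer-{alias}` addressing rule can be sketched as follows (an illustrative helper, not the actual catalog service code):

```python
# Hypothetical sketch of package-alias resolution: commands accept either the
# full package name or, for packages following the data-designer-{alias}
# naming pattern, just the alias.
from __future__ import annotations

_PREFIX = "data-designer-"


def resolve_package_name(name_or_alias: str, catalog_packages: set[str]) -> str | None:
    """Return the full catalog package name for a package name or its alias."""
    if name_or_alias in catalog_packages:
        return name_or_alias
    candidate = _PREFIX + name_or_alias
    return candidate if candidate in catalog_packages else None
```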
+
+
### Generation Commands

`preview`, `create`, and `validate` commands use `GenerationController`, which:

@@ -62,6 +74,37 @@ User invokes command (e.g., `data-designer config models`)
   → Repository reads/writes config files
```

+### Plugin Catalog Discovery
+```
+User invokes command (e.g., `data-designer plugin list`)
+  → Command function wires DATA_DESIGNER_HOME and catalog options
+  → PluginCatalogController resolves the catalog alias
+  → PluginCatalogService loads packages and filters out incompatible packages by default
+  → PluginCatalogRepository reads local config and cached/remote catalog JSON
+```
+
+### Plugin Install/Uninstall
+```
+User invokes command (e.g., `data-designer plugin install calculator`)
+  → PluginCatalogController resolves the plugin package name or package alias
+  → PluginCatalogService evaluates Python and Data Designer compatibility
+  → PluginInstallService chooses uv or pip and builds the command.
+    In active uv projects it uses `uv add` so the package is recorded in
+    `pyproject.toml`; otherwise it installs into the current Python environment.
+    Data Designer itself is already installed, so its packages are not reinstalled
+    or replaced while installing plugin dependencies.
+  → PluginInstallService verifies Data Designer can discover the package's runtime plugins
+```
+
+```
+User invokes command (e.g., `data-designer plugin uninstall calculator`)
+  → PluginCatalogController resolves the plugin package name or package alias
+  → PluginInstallService chooses uv or pip and builds the uninstall command.
+    In active uv projects it removes the dependency from project metadata and
+    uninstalls the package from the current environment.
+ → PluginInstallService verifies Data Designer no longer discovers the package's runtime plugins +``` + ### Generation ``` User invokes command (e.g., `data-designer create config.yaml`) @@ -73,8 +116,9 @@ User invokes command (e.g., `data-designer create config.yaml`) ## Design Decisions - **Lazy command loading** keeps `data-designer --help` responsive: command modules (and their heavy dependencies, such as the engine and model stacks) load only when a command is invoked, not at process startup. -- **Controller/service/repo for config, direct API for generation** — config management benefits from the layered pattern (testable services, swappable repositories). Generation doesn't need this indirection; it delegates to the same `DataDesigner` class that Python users call directly. -- **`DATA_DESIGNER_HOME`** centralizes all CLI-managed state (model configs, provider configs, tool configs, personas) in a single directory, defaulting to `~/.data_designer/`. +- **Controller/service/repo for setup workflows, direct API for generation** — config and plugin catalog workflows benefit from the layered pattern (testable services, swappable repositories). Generation doesn't need this indirection; it delegates to the same `DataDesigner` class that Python users call directly. +- **`DATA_DESIGNER_HOME`** centralizes CLI-managed state (model configs, provider configs, MCP provider configs, tool configs, managed assets, plugin catalog aliases, and catalog caches) in a single directory, defaulting to `~/.data-designer/`. +- **Package-first plugin catalogs** match how users install plugins: one package can provide one or more runtime plugins, but install and uninstall commands always target the package. - **Rich-based UI** provides formatted tables, progress bars, and interactive prompts without requiring a web interface. 
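The `DATA_DESIGNER_HOME` design decision implies an override-or-default lookup along these lines. This is only a sketch; the real constant is defined in `data_designer.config.utils.constants` and its resolution logic may differ.

```python
# Hypothetical sketch of DATA_DESIGNER_HOME resolution: an environment
# override wins, otherwise all CLI-managed state lands under ~/.data-designer/.
from __future__ import annotations

import os
from pathlib import Path


def resolve_data_designer_home(env: dict[str, str] | None = None) -> Path:
    env = dict(os.environ) if env is None else env
    override = env.get("DATA_DESIGNER_HOME")
    return Path(override) if override else Path.home() / ".data-designer"
```

Centralizing state behind one resolved path keeps every repository and cache relocatable by setting a single environment variable.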
## Cross-References

diff --git a/packages/data-designer/src/data_designer/cli/README.md b/packages/data-designer/src/data_designer/cli/README.md
index f15b752e9..b5f6f087a 100644
--- a/packages/data-designer/src/data_designer/cli/README.md
+++ b/packages/data-designer/src/data_designer/cli/README.md
@@ -1,14 +1,19 @@
 # 🎨 NeMo Data Designer CLI
 
-This directory contains the Command-Line Interface (CLI) for configuring model providers and model configurations used in Data Designer.
+This directory contains the Command-Line Interface (CLI) for configuring model providers, model configurations, MCP providers, and tool configs used in Data Designer, and for managing its downloaded assets and plugin catalogs.
 
 ## Overview
 
 The CLI provides an interactive interface for managing:
 
 - **Model Providers**: LLM API endpoints (NVIDIA, OpenAI, Anthropic, custom providers)
 - **Model Configs**: Specific model configurations with inference parameters
+- **MCP Providers**: MCP server configurations for tool integration
+- **Tool Configs**: Tool definitions used by configured models and workflows
+- **Managed Assets**: Persona dataset downloads under the Data Designer home directory
+- **Plugin Catalogs**: Catalog aliases for finding Data Designer plugin packages
+- **Plugin Packages**: Installation and removal of packages from catalogs, with version-compatibility checks and verification that Data Designer can discover the plugins they provide
 
-Configuration files are stored in `~/.data-designer/` by default and can be referenced by Data Designer workflows.
+Configuration files and CLI-managed state are stored in `~/.data-designer/` by default.
## Architecture @@ -17,7 +22,7 @@ The CLI follows a **layered architecture** pattern, separating concerns into dis ``` ┌─────────────────────────────────────────────────────────────┐ │ Commands │ -│ Entry points for CLI commands (list, providers, models) │ +│ Entry points for CLI commands (config, download, plugin) │ └─────────────────────────────────────────────────────────────┘ │ ▼ @@ -50,9 +55,13 @@ The CLI follows a **layered architecture** pattern, separating concerns into dis - Handle top-level error reporting - **Files**: - `list.py`: List current configurations + - `mcp.py`: Configure MCP providers - `models.py`: Configure models - `providers.py`: Configure providers + - `download.py`: Download managed assets + - `plugin.py`: Discover, install, and uninstall plugin packages from catalogs - `reset.py`: Reset/delete configurations + - `tools.py`: Configure tool configs #### 2. **Controllers** (`controllers/`) - **Purpose**: Orchestrate user workflows and coordinate between services, forms, and UI @@ -62,8 +71,12 @@ The CLI follows a **layered architecture** pattern, separating concerns into dis - Handle user navigation and session state - Manage associated resource deletion (e.g., deleting models when provider is deleted) - **Files**: + - `download_controller.py`: Orchestrates managed asset download workflows + - `mcp_provider_controller.py`: Orchestrates MCP provider configuration workflows - `model_controller.py`: Orchestrates model configuration workflows - `provider_controller.py`: Orchestrates provider configuration workflows + - `plugin_catalog_controller.py`: Orchestrates plugin catalog browsing, alias management, and package workflows + - `tool_controller.py`: Orchestrates tool configuration workflows **Key Features**: - **Associated Resource Management**: When deleting a provider, the controller checks for associated models and prompts the user to delete them together @@ -77,8 +90,12 @@ The CLI follows a **layered architecture** pattern, separating 
concerns into dis - Coordinate between multiple repositories when needed - Handle default management (e.g., default provider selection) - **Files**: + - `mcp_provider_service.py`: MCP provider configuration business logic - `model_service.py`: Model configuration business logic - `provider_service.py`: Provider business logic + - `plugin_catalog_service.py`: Plugin catalog loading, search, compatibility checks, and installed plugin listing + - `plugin_install_service.py`: Chooses and runs uv or pip commands for installing/uninstalling plugin packages, keeps installed Data Designer packages in place, and verifies installed plugins + - `tool_service.py`: Tool configuration business logic **Key Methods**: - `list_all()`: Get all configured items @@ -91,16 +108,20 @@ The CLI follows a **layered architecture** pattern, separating concerns into dis - `set_default()`, `get_default()`: Manage default provider (providers only) #### 4. **Repositories** (`repositories/`) -- **Purpose**: Handle data persistence (YAML file I/O) +- **Purpose**: Handle data persistence and read-only reference metadata - **Responsibilities**: - Load configuration from YAML files - Save configuration to YAML files - - Check file existence - - Delete configuration files + - Check file existence and delete configuration files where applicable + - Provide read-only metadata for built-in managed assets - **Files**: - `base.py`: Abstract base repository with common operations + - `mcp_provider_repository.py`: MCP provider configuration persistence - `model_repository.py`: Model configuration persistence + - `persona_repository.py`: Read-only persona locale metadata - `provider_repository.py`: Provider persistence + - `plugin_catalog_repository.py`: Plugin catalog aliases, catalog fetching, and URL-keyed catalog cache + - `tool_repository.py`: Tool configuration persistence **Base Repository Pattern**: ```python @@ -122,8 +143,10 @@ class ConfigRepository(ABC, Generic[T]): - `builder.py`: Abstract form 
builder base - `field.py`: Form field types (TextField, SelectField, NumericField) - `form.py`: Form container and prompt orchestration + - `mcp_provider_builder.py`: Interactive MCP provider configuration builder - `model_builder.py`: Interactive model configuration builder - `provider_builder.py`: Interactive provider configuration builder + - `tool_builder.py`: Interactive tool configuration builder **Form Features**: - Field-level validation @@ -152,7 +175,7 @@ class ConfigRepository(ABC, Generic[T]): ## Configuration Files -The CLI manages two YAML configuration files: +The CLI manages YAML configuration files, managed assets, and plugin catalog caches under `~/.data-designer/`: ### `~/.data-designer/model_providers.yaml` @@ -206,6 +229,61 @@ model_configs: max_parallel_requests: 4 ``` +### `~/.data-designer/mcp_providers.yaml` + +Stores MCP provider configurations: + +```yaml +providers: + - name: local-tools + provider_type: stdio + command: python + args: + - "-m" + - my_mcp_server +``` + +### `~/.data-designer/tool_configs.yaml` + +Stores tool configurations that reference MCP providers: + +```yaml +tool_configs: + - tool_alias: research-tools + providers: + - local-tools + max_tool_call_turns: 5 +``` + +### `~/.data-designer/managed-assets/` + +Stores managed assets downloaded by CLI commands such as +`data-designer download personas`. Set `DATA_DESIGNER_MANAGED_ASSETS_PATH` to +store managed assets outside `DATA_DESIGNER_HOME`. + +### `~/.data-designer/plugin_catalogs.yaml` + +Stores user-added plugin catalog aliases. The built-in NVIDIA catalog points at +`https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json`, is +always available, and is not written to this file. Set +`DATA_DESIGNER_DEFAULT_PLUGIN_CATALOG_URL` to repoint the built-in catalog for QA or +staging. 
+ +```yaml +catalogs: + - alias: research + url: https://raw.githubusercontent.com/acme/dd-plugins/main/catalog/plugins.json +``` + +### `~/.data-designer/plugin-catalog-cache/` + +Stores fetched plugin catalog payloads as JSON cache files keyed by catalog alias and URL hash. This prevents a re-pointed alias from serving stale catalog data from a previous URL. + +Plugin package arguments accept either the full package name or the package +alias. For packages named `data-designer-{alias}`, the alias is `{alias}`. For +example, `data-designer-github` can be addressed as `github` in `info`, +`install`, and `uninstall`. + ## Usage Examples ### Configure Providers @@ -248,3 +326,49 @@ data-designer config list # Delete configuration files (with confirmation) data-designer config reset ``` + +### Discover, Install, and Uninstall Plugin Packages + +```bash +# List compatible plugin packages from the default NVIDIA catalog +data-designer plugin list + +# Search a specific catalog +data-designer plugin --catalog research search transform + +# Show package metadata, compatibility, docs, and the install command +data-designer plugin info github + +# Install a plugin package from a catalog and verify Data Designer can discover its plugins +data-designer plugin install github --yes + +# Preview without changing the current environment +data-designer plugin install github --dry-run + +# Uninstall a plugin package and verify Data Designer no longer discovers its plugins +data-designer plugin uninstall github --yes + +# Preview without changing the current environment +data-designer plugin uninstall github --dry-run + +# Add and manage catalog aliases +data-designer plugin catalog add research https://github.com/acme/dd-plugins +data-designer plugin catalog list +data-designer plugin catalog remove research + +# List installed runtime plugin entry points without importing plugin modules +data-designer plugin installed +``` + +When installing a plugin package, the CLI first checks 
the package's Python and +Data Designer version requirements. The plugin package and its other +dependencies are installed normally, but the currently installed Data Designer +packages (`data-designer`, `data-designer-config`, and `data-designer-engine`) +are kept in place. This prevents a plugin dependency from upgrading, +downgrading, or reinstalling Data Designer itself. + +In an active virtual environment with a user `pyproject.toml`, `uv` uses +`uv add` so the plugin package is recorded in the project. Otherwise the CLI +installs into the current Python environment with `uv pip install` or `pip`. +`uv` plugin installs require `uv >= 0.6.0`; auto mode falls back to `pip` when +`uv` is missing or too old. `pip` remains supported for pip-only environments. diff --git a/packages/data-designer/src/data_designer/cli/commands/plugin.py b/packages/data-designer/src/data_designer/cli/commands/plugin.py new file mode 100644 index 000000000..a9dbf974e --- /dev/null +++ b/packages/data-designer/src/data_designer/cli/commands/plugin.py @@ -0,0 +1,258 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +import click +import typer + +from data_designer.cli.controllers.plugin_catalog_controller import PluginCatalogController +from data_designer.cli.ui import print_info +from data_designer.config.utils.constants import DATA_DESIGNER_HOME + + +def list_command( + ctx: typer.Context, + catalog: str | None = typer.Option( + None, + "--catalog", + help="Plugin catalog alias to read. 
Can also be provided before the subcommand.", + ), + refresh: bool = typer.Option( + False, + "--refresh", + help="Fetch the catalog even when a fresh cache entry exists.", + ), + include_incompatible: bool = typer.Option( + False, + "--include-incompatible", + help="Show catalog packages that do not satisfy the local Python or Data Designer version.", + ), +) -> None: + """List installable Data Designer plugin packages from a catalog.""" + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_list( + catalog_alias=_resolve_catalog_alias(ctx, catalog), + refresh=refresh, + include_incompatible=include_incompatible, + ) + + +def search_command( + ctx: typer.Context, + query: str = typer.Argument( + help="Keyword, package name or alias, description, runtime plugin name, or runtime plugin type to search for." + ), + catalog: str | None = typer.Option( + None, + "--catalog", + help="Plugin catalog alias to search. Can also be provided before the subcommand.", + ), + refresh: bool = typer.Option( + False, + "--refresh", + help="Fetch the catalog even when a fresh cache entry exists.", + ), + include_incompatible: bool = typer.Option( + False, + "--include-incompatible", + help="Search catalog packages that do not satisfy the local Python or Data Designer version.", + ), +) -> None: + """Search installable Data Designer plugin packages from a catalog.""" + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_search( + query, + catalog_alias=_resolve_catalog_alias(ctx, catalog), + refresh=refresh, + include_incompatible=include_incompatible, + ) + + +def info_command( + ctx: typer.Context, + package: str = typer.Argument( + help="Plugin package name or package alias from the catalog.", + metavar="PACKAGE", + ), + catalog: str | None = typer.Option( + None, + "--catalog", + help="Plugin catalog alias to read. 
Can also be provided before the subcommand.", + ), + refresh: bool = typer.Option( + False, + "--refresh", + help="Fetch the catalog even when a fresh cache entry exists.", + ), +) -> None: + """Show metadata, compatibility, docs, and install plan for one plugin package.""" + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_info( + package, + catalog_alias=_resolve_catalog_alias(ctx, catalog), + refresh=refresh, + ) + + +def install_command( + ctx: typer.Context, + package: str = typer.Argument( + help="Plugin package name or package alias from the catalog.", + metavar="PACKAGE", + ), + catalog: str | None = typer.Option( + None, + "--catalog", + help="Plugin catalog alias to install from. Can also be provided before the subcommand.", + ), + refresh: bool = typer.Option( + False, + "--refresh", + help="Fetch the catalog even when a fresh cache entry exists.", + ), + manager: str = typer.Option( + "auto", + "--manager", + click_type=click.Choice(["auto", "uv", "pip"]), + help=( + "Package manager to use. auto prefers uv; uv adds to the active project when one is detected; " + "pip mutates the environment." 
+ ), + ), + yes: bool = typer.Option( + False, + "--yes", + "-y", + help="Install without an interactive confirmation prompt.", + ), + dry_run: bool = typer.Option( + False, + "--dry-run", + help="Print the install plan without mutating the current environment.", + ), +) -> None: + """Install one Data Designer plugin package, then verify declared runtime entry points.""" + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_install( + package, + catalog_alias=_resolve_catalog_alias(ctx, catalog), + refresh=refresh, + manager=manager, + yes=yes, + dry_run=dry_run, + ) + + +def uninstall_command( + ctx: typer.Context, + package: str = typer.Argument( + help="Plugin package name or package alias from the catalog.", + metavar="PACKAGE", + ), + catalog: str | None = typer.Option( + None, + "--catalog", + help="Plugin catalog alias to uninstall from. Can also be provided before the subcommand.", + ), + refresh: bool = typer.Option( + False, + "--refresh", + help="Fetch the catalog even when a fresh cache entry exists.", + ), + manager: str = typer.Option( + "auto", + "--manager", + click_type=click.Choice(["auto", "uv", "pip"]), + help=( + "Package manager to use. auto prefers uv; uv removes from the active project and environment when a " + "project is detected; pip mutates the environment." 
+ ), + ), + yes: bool = typer.Option( + False, + "--yes", + "-y", + help="Uninstall without an interactive confirmation prompt.", + ), + dry_run: bool = typer.Option( + False, + "--dry-run", + help="Print the uninstall plan without mutating the current environment.", + ), +) -> None: + """Uninstall one Data Designer plugin package, then verify declared runtime entry points are removed.""" + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_uninstall( + package, + catalog_alias=_resolve_catalog_alias(ctx, catalog), + refresh=refresh, + manager=manager, + yes=yes, + dry_run=dry_run, + ) + + +def installed_command(ctx: typer.Context) -> None: + """List installed Data Designer runtime plugin entry points.""" + _warn_if_parent_catalog_unused(ctx, "installed runtime plugins are discovered from the current Python environment") + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_installed() + + +def catalog_list_command(ctx: typer.Context) -> None: + """List configured plugin catalogs.""" + _warn_if_parent_catalog_unused(ctx, "catalog management commands operate on aliases directly") + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_catalog_list() + + +def catalog_add_command( + ctx: typer.Context, + alias: str = typer.Argument(help="Local alias for the plugin catalog."), + url: str = typer.Argument( + help="Catalog repository URL, catalog URL, local catalog file, or local catalog directory." 
+ ), +) -> None: + """Add a plugin catalog alias.""" + _warn_if_parent_catalog_unused(ctx, "catalog management commands operate on aliases directly") + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_catalog_add( + alias=alias, + url=url, + ) + + +def catalog_remove_command( + ctx: typer.Context, + alias: str = typer.Argument(help="Plugin catalog alias to remove."), +) -> None: + """Remove a plugin catalog alias.""" + _warn_if_parent_catalog_unused(ctx, "catalog management commands operate on aliases directly") + controller = PluginCatalogController(DATA_DESIGNER_HOME) + controller.run_catalog_remove(alias=alias) + + +def _resolve_catalog_alias(ctx: typer.Context, catalog_alias: str | None) -> str | None: + if catalog_alias is not None: + return catalog_alias + + return _parent_catalog_alias(ctx) + + +def _parent_catalog_alias(ctx: typer.Context) -> str | None: + """Return --catalog from the plugin parent command when present.""" + + parent = ctx.parent + while parent is not None: + candidate = parent.params.get("catalog") if parent.params else None + if isinstance(candidate, str) and candidate: + return candidate + parent = parent.parent + return None + + +def _warn_if_parent_catalog_unused(ctx: typer.Context, reason: str) -> None: + catalog_alias = _parent_catalog_alias(ctx) + if catalog_alias is not None: + print_info(f"Ignoring --catalog {catalog_alias!r}; {reason}.") diff --git a/packages/data-designer/src/data_designer/cli/controllers/plugin_catalog_controller.py b/packages/data-designer/src/data_designer/cli/controllers/plugin_catalog_controller.py new file mode 100644 index 000000000..4b5777813 --- /dev/null +++ b/packages/data-designer/src/data_designer/cli/controllers/plugin_catalog_controller.py @@ -0,0 +1,540 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +import shlex +from pathlib import Path + +import typer +from pydantic import ValidationError +from rich.style import Style +from rich.table import Table +from rich.text import Text + +from data_designer.cli.plugin_catalog import ( + DEFAULT_PLUGIN_CATALOG_ALIAS, + PLUGIN_CATALOG_ALIAS_PATTERN, + CompatibilityResult, + InstalledPluginInfo, + PluginCatalogConfig, + PluginCatalogEntry, + PluginCatalogError, +) +from data_designer.cli.repositories.plugin_catalog_repository import PluginCatalogRepository +from data_designer.cli.services.plugin_catalog_service import PluginCatalogService +from data_designer.cli.services.plugin_install_service import PluginInstallService +from data_designer.cli.ui import ( + confirm_action, + console, + display_config_preview, + print_error, + print_header, + print_info, + print_success, + print_warning, +) +from data_designer.config.utils.constants import NordColor + + +class PluginCatalogController: + """Controller for plugin catalog browsing, alias management, and package workflows. + + Catalog browsing and environment mutation intentionally use separate services so + read-only catalog operations stay decoupled from package-manager execution. 
+ """ + + def __init__(self, config_dir: Path) -> None: + self.config_dir = config_dir + self.catalog_repository = PluginCatalogRepository(config_dir) + self.catalog_service = PluginCatalogService(self.catalog_repository) + self.install_service = PluginInstallService() + + def run_list( + self, + *, + catalog_alias: str | None = None, + refresh: bool = False, + include_incompatible: bool = False, + ) -> None: + """List plugin packages from a catalog.""" + catalog = self._get_catalog_or_exit(catalog_alias) + entries = self._list_entries_or_exit(catalog.alias, refresh=refresh, include_incompatible=include_incompatible) + + print_header("Data Designer Plugin Packages") + print_info(f"Catalog: {catalog.alias} ({catalog.url})") + console.print() + + if not entries: + self._display_empty_list_state(catalog.alias, include_incompatible=include_incompatible) + return + + self._display_catalog_entries(entries) + + def run_search( + self, + query: str, + *, + catalog_alias: str | None = None, + refresh: bool = False, + include_incompatible: bool = False, + ) -> None: + """Search plugin packages from a catalog.""" + catalog = self._get_catalog_or_exit(catalog_alias) + entries = self._search_entries_or_exit( + query, + catalog.alias, + refresh=refresh, + include_incompatible=include_incompatible, + ) + + print_header("Data Designer Plugin Package Search") + print_info(f"Catalog: {catalog.alias} ({catalog.url})") + print_info(f"Query: {query}") + console.print() + + if not entries: + self._display_empty_search_state( + query, + catalog.alias, + include_incompatible=include_incompatible, + ) + return + + self._display_catalog_entries(entries) + + def run_info( + self, + package_name: str, + *, + catalog_alias: str | None = None, + refresh: bool = False, + ) -> None: + """Show full metadata for one plugin package.""" + catalog = self._get_catalog_or_exit(catalog_alias) + package_entries = self._get_package_entries_or_exit( + package_name, + catalog.alias, + refresh=refresh, + 
include_incompatible=True, + ) + entry = package_entries[0] + compatibility = self.catalog_service.evaluate_compatibility(entry) + + print_header(f"Plugin Package: {entry.package.name}") + print_info(f"Catalog: {catalog.alias} ({catalog.url})") + console.print(f" Runtime plugins: [bold]{_format_runtime_plugins(package_entries)}[/bold]") + self._display_compatibility(compatibility) + + try: + plan = self.install_service.build_install_plan(entry, catalog) + console.print(f" Requirement: [bold]{entry.install.requirement}[/bold]") + if entry.install.index_url is not None: + console.print(f" Index URL: [bold]{entry.install.index_url}[/bold]") + console.print(f" Install target: [bold]{_target_description(plan.install_mode, plan.project_root)}[/bold]") + if plan.data_designer_protection is not None: + console.print(f" Data Designer: [bold]{plan.data_designer_protection}[/bold]") + console.print(f" Install command: [bold]{shlex.join(plan.command)}[/bold]") + if plan.source_warning is not None: + print_warning(plan.source_warning) + except ValueError as e: + print_warning(str(e)) + + console.print() + display_config_preview( + { + "package": { + "name": entry.package.name, + "description": entry.description, + }, + "install": entry.install.model_dump(mode="json", exclude_none=True), + "compatibility": ( + entry.compatibility.model_dump(mode="json", exclude_none=True) + if entry.compatibility is not None + else None + ), + "docs": entry.docs.model_dump(mode="json", exclude_none=True) if entry.docs is not None else None, + "plugins": [_runtime_plugin_metadata(plugin) for plugin in package_entries], + }, + "Plugin Metadata", + ) + + def run_install( + self, + package_name: str, + *, + catalog_alias: str | None = None, + refresh: bool = False, + manager: str = "auto", + yes: bool = False, + dry_run: bool = False, + ) -> None: + """Install one plugin package from the catalog.""" + catalog = self._get_catalog_or_exit(catalog_alias) + package_entries = 
self._get_package_entries_or_exit( + package_name, + catalog.alias, + refresh=refresh, + include_incompatible=True, + ) + entry = package_entries[0] + compatibility = self.catalog_service.evaluate_compatibility(entry) + + if not compatibility.is_compatible and not dry_run: + print_error(f"Plugin package {entry.package.name!r} is not compatible with this environment") + for reason in compatibility.reasons: + console.print(f" - {reason}") + raise typer.Exit(code=1) + + try: + plan = self.install_service.build_install_plan(entry, catalog, manager=manager) + except ValueError as e: + print_error(f"Failed to build plugin install plan: {e}") + raise typer.Exit(code=1) + + print_header("Install Data Designer Plugin Package") + console.print(f" Package: [bold]{entry.package.name}[/bold]") + console.print(f" Catalog: [bold]{catalog.alias}[/bold] ({catalog.url})") + console.print(f" Requirement: [bold]{entry.install.requirement}[/bold]") + if entry.install.index_url is not None: + console.print(f" Index URL: [bold]{entry.install.index_url}[/bold]") + console.print(f" Install target: [bold]{_target_description(plan.install_mode, plan.project_root)}[/bold]") + if plan.data_designer_protection is not None: + console.print(f" Data Designer: [bold]{plan.data_designer_protection}[/bold]") + console.print(f" Command: [bold]{shlex.join(plan.command)}[/bold]") + self._display_compatibility(compatibility) + + if plan.source_warning is not None: + print_warning(plan.source_warning) + + if dry_run: + if not compatibility.is_compatible: + print_warning( + "Dry run complete; no changes made. A real install would be blocked because compatibility " + "checks failed." 
+                )
+            else:
+                print_info("Dry run complete; no changes made")
+            return
+
+        if not yes and not confirm_action(
+            f"Install this package into the {_target_description(plan.install_mode, plan.project_root)}?",
+            default=False,
+        ):
+            print_info("No changes made")
+            return
+
+        try:
+            self.install_service.install(plan)
+        except RuntimeError as e:
+            print_error(str(e))
+            raise typer.Exit(code=1)
+
+        if self.install_service.verify_entry_points(package_entries):
+            print_success(f"Plugin package {entry.package.name!r} installed and runtime entry points verified")
+        else:
+            print_warning(
+                f"Plugin package {entry.package.name!r} was installed, but Data Designer did not discover every "
+                "declared runtime entry point. Restart the shell or check the package entry point metadata."
+            )
+
+    def run_uninstall(
+        self,
+        package_name: str,
+        *,
+        catalog_alias: str | None = None,
+        refresh: bool = False,
+        manager: str = "auto",
+        yes: bool = False,
+        dry_run: bool = False,
+    ) -> None:
+        """Uninstall one plugin package resolved from the catalog."""
+        catalog = self._get_catalog_or_exit(catalog_alias)
+        package_entries = self._get_package_entries_or_exit(
+            package_name,
+            catalog.alias,
+            refresh=refresh,
+            include_incompatible=True,
+        )
+        entry = package_entries[0]
+
+        try:
+            plan = self.install_service.build_uninstall_plan(entry, catalog, manager=manager)
+        except ValueError as e:
+            print_error(f"Failed to build plugin uninstall plan: {e}")
+            raise typer.Exit(code=1)
+
+        print_header("Uninstall Data Designer Plugin Package")
+        console.print(f" Package: [bold]{entry.package.name}[/bold]")
+        console.print(f" Catalog: [bold]{catalog.alias}[/bold] ({catalog.url})")
+        console.print(f" Uninstall target: [bold]{_target_description(plan.uninstall_mode, plan.project_root)}[/bold]")
+        _display_commands(plan.commands or [plan.command])
+
+        if dry_run:
+            print_info("Dry run complete; no changes made")
+            return
+
+        if not yes and not confirm_action(
+            f"Uninstall this package from the {_target_description(plan.uninstall_mode, plan.project_root)}?",
+            default=False,
+        ):
+            print_info("No changes made")
+            return
+
+        try:
+            self.install_service.uninstall(plan)
+        except RuntimeError as e:
+            print_error(str(e))
+            raise typer.Exit(code=1)
+
+        if self.install_service.verify_entry_points_removed(package_entries):
+            print_success(f"Plugin package {entry.package.name!r} uninstalled and runtime entry points removed")
+        else:
+            print_warning(
+                f"Plugin package {entry.package.name!r} was uninstalled, but Data Designer still discovers one or "
+                "more declared runtime entry points. Restart the shell or check the package environment."
+            )
+
+    def run_installed(self) -> None:
+        """List installed runtime plugin entry points without importing plugin modules."""
+        print_header("Installed Data Designer Runtime Plugins")
+        installed_plugins = self.catalog_service.list_installed_plugins()
+        if not installed_plugins:
+            print_warning("No installed Data Designer runtime plugins were discovered")
+            return
+        self._display_installed_plugins(installed_plugins)
+
+    def run_catalog_list(self) -> None:
+        """List configured plugin catalogs."""
+        print_header("Data Designer Plugin Catalogs")
+        try:
+            catalogs = self.catalog_service.list_catalogs()
+        except (PluginCatalogError, OSError) as e:
+            print_error(f"Failed to list plugin catalogs: {e}")
+            raise typer.Exit(code=1)
+
+        table = Table(title="Plugin Catalogs", border_style=NordColor.NORD8.value)
+        table.add_column("Alias", style=NordColor.NORD14.value, no_wrap=True)
+        table.add_column("URL", style=NordColor.NORD4.value)
+
+        for catalog in catalogs:
+            table.add_row(
+                catalog.alias,
+                catalog.url,
+            )
+        console.print(table)
+
+    def run_catalog_add(
+        self,
+        *,
+        alias: str,
+        url: str,
+    ) -> None:
+        """Add a plugin catalog alias."""
+        try:
+            catalog = self.catalog_service.add_catalog(
+                alias,
+                url,
+            )
+        except ValidationError as e:
+            if any(tuple(error["loc"]) == ("alias",) for error in e.errors()):
+                print_error(f"Invalid catalog alias {alias!r}: must match `{PLUGIN_CATALOG_ALIAS_PATTERN}`")
+            else:
+                print_error(f"Invalid plugin catalog configuration: {e}")
+            raise typer.Exit(code=1)
+        except (PluginCatalogError, OSError, ValueError) as e:
+            print_error(f"Failed to add plugin catalog: {e}")
+            raise typer.Exit(code=1)
+
+        print_success(f"Plugin catalog {catalog.alias!r} added")
+        print_info(f"Catalog: {catalog.url}")
+
+    def run_catalog_remove(self, *, alias: str) -> None:
+        """Remove a plugin catalog alias."""
+        try:
+            self.catalog_service.remove_catalog(alias)
+        except (PluginCatalogError, OSError, ValueError) as e:
+            print_error(f"Failed to remove plugin catalog: {e}")
+            raise typer.Exit(code=1)
+        print_success(f"Plugin catalog {alias!r} removed")
+
+    def _get_catalog_or_exit(self, catalog_alias: str | None) -> PluginCatalogConfig:
+        try:
+            return self.catalog_service.get_catalog(catalog_alias or DEFAULT_PLUGIN_CATALOG_ALIAS)
+        except (PluginCatalogError, OSError, ValueError) as e:
+            print_error(str(e))
+            raise typer.Exit(code=1)
+
+    def _list_entries_or_exit(
+        self,
+        catalog_alias: str,
+        *,
+        refresh: bool,
+        include_incompatible: bool,
+    ) -> list[PluginCatalogEntry]:
+        try:
+            return self.catalog_service.list_entries(
+                catalog_alias,
+                refresh=refresh,
+                include_incompatible=include_incompatible,
+            )
+        except (PluginCatalogError, OSError, ValueError) as e:
+            print_error(f"Failed to load plugin catalog: {e}")
+            raise typer.Exit(code=1)
+
+    def _search_entries_or_exit(
+        self,
+        query: str,
+        catalog_alias: str,
+        *,
+        refresh: bool,
+        include_incompatible: bool,
+    ) -> list[PluginCatalogEntry]:
+        try:
+            return self.catalog_service.search_entries(
+                query,
+                catalog_alias,
+                refresh=refresh,
+                include_incompatible=include_incompatible,
+            )
+        except (PluginCatalogError, OSError, ValueError) as e:
+            print_error(f"Failed to search plugin catalog: {e}")
+            raise typer.Exit(code=1)
+
+    def _get_package_entries_or_exit(
+        self,
+        package_name: str,
+        catalog_alias: str,
+        *,
+        refresh: bool,
+        include_incompatible: bool,
+    ) -> list[PluginCatalogEntry]:
+        try:
+            package_entries = self.catalog_service.get_package_entries(
+                package_name,
+                catalog_alias,
+                refresh=refresh,
+                include_incompatible=include_incompatible,
+            )
+        except (PluginCatalogError, OSError, ValueError) as e:
+            print_error(f"Failed to load plugin package metadata: {e}")
+            raise typer.Exit(code=1)
+        if not package_entries:
+            print_error(f"Plugin package or alias {package_name!r} was not found in catalog {catalog_alias!r}")
+            raise typer.Exit(code=1)
+        return package_entries
+
+    def _display_empty_list_state(self, catalog_alias: str, *, include_incompatible: bool) -> None:
+        if include_incompatible:
+            print_warning("No plugin packages found")
+            return
+
+        all_entries = self._list_entries_or_exit(catalog_alias, refresh=False, include_incompatible=True)
+        if all_entries:
+            print_warning("No compatible plugin packages found")
+            print_info("Incompatible catalog packages are hidden. Use --include-incompatible to show them.")
+            return
+
+        print_warning("No plugin packages found")
+
+    def _display_empty_search_state(
+        self,
+        query: str,
+        catalog_alias: str,
+        *,
+        include_incompatible: bool,
+    ) -> None:
+        if include_incompatible:
+            print_warning("No matching plugin packages found")
+            return
+
+        all_matches = self._search_entries_or_exit(
+            query,
+            catalog_alias,
+            refresh=False,
+            include_incompatible=True,
+        )
+        if all_matches:
+            print_warning("No compatible plugin packages matched")
+            print_info("Matching incompatible catalog packages are hidden. Use --include-incompatible to show them.")
+            return
+
+        print_warning("No matching plugin packages found")
+
+    def _display_catalog_entries(self, entries: list[PluginCatalogEntry]) -> None:
+        table = Table(title="Catalog Plugin Packages", border_style=NordColor.NORD8.value)
+        table.add_column("Package", style=NordColor.NORD14.value, no_wrap=True)
+        table.add_column("Description", style=NordColor.NORD4.value)
+        table.add_column("Runtime Plugins", style=NordColor.NORD9.value)
+        table.add_column("Compatible", style=NordColor.NORD13.value, no_wrap=True)
+        table.add_column("Docs", style=NordColor.NORD7.value)
+
+        for package_entries in self.catalog_service.group_entries_by_package(entries).values():
+            entry = package_entries[0]
+            compatibility = self.catalog_service.evaluate_compatibility(entry)
+            docs_url = entry.docs.url if entry.docs is not None and entry.docs.url is not None else ""
+            table.add_row(
+                entry.package.name,
+                entry.description,
+                _format_runtime_plugins(package_entries),
+                "yes" if compatibility.is_compatible else "no",
+                _format_docs_link(docs_url),
+            )
+        console.print(table)
+
+    @staticmethod
+    def _display_installed_plugins(installed_plugins: list[InstalledPluginInfo]) -> None:
+        table = Table(title="Installed Runtime Plugins", border_style=NordColor.NORD8.value)
+        table.add_column("Runtime Plugin", style=NordColor.NORD14.value, no_wrap=True)
+        table.add_column("Entry Point", style=NordColor.NORD4.value)
+
+        for plugin in installed_plugins:
+            table.add_row(
+                plugin.name,
+                plugin.entry_point_value,
+            )
+        console.print(table)
+
+    @staticmethod
+    def _display_compatibility(compatibility: CompatibilityResult) -> None:
+        if compatibility.is_compatible:
+            console.print(" Compatibility: [bold green]compatible[/bold green]")
+            return
+
+        console.print(" Compatibility: [bold yellow]not compatible[/bold yellow]")
+        for reason in compatibility.reasons:
+            console.print(f" - {reason}")
+
+
+def _display_commands(commands: list[list[str]]) -> None:
+    if len(commands) == 1:
+        console.print(f" Command: [bold]{shlex.join(commands[0])}[/bold]")
+        return
+
+    console.print(" Commands:")
+    for command in commands:
+        console.print(f" [bold]{shlex.join(command)}[/bold]")
+
+
+def _target_description(mode: str, project_root: str | None) -> str:
+    if mode == "uv-project" and project_root is not None:
+        return f"current uv project ({project_root})"
+    return "current Python environment"
+
+
+def _format_runtime_plugins(entries: list[PluginCatalogEntry]) -> str:
+    return ", ".join(f"{entry.name} ({entry.plugin_type.value})" for entry in entries)
+
+
+def _format_docs_link(docs_url: str | None) -> Text:
+    if not docs_url:
+        return Text("")
+    return Text("docs", style=Style(color=NordColor.NORD7.value, link=docs_url))
+
+
+def _runtime_plugin_metadata(entry: PluginCatalogEntry) -> dict[str, object]:
+    return {
+        "name": entry.name,
+        "plugin_type": entry.plugin_type.value,
+        "entry_point": entry.entry_point.model_dump(mode="json", exclude_none=True),
+    }
diff --git a/packages/data-designer/src/data_designer/cli/lazy_group.py b/packages/data-designer/src/data_designer/cli/lazy_group.py
index f498b3230..6b6f055aa 100644
--- a/packages/data-designer/src/data_designer/cli/lazy_group.py
+++ b/packages/data-designer/src/data_designer/cli/lazy_group.py
@@ -83,6 +83,12 @@ def create_lazy_typer_group(
     """
 
     class LazyTyperGroup(TyperGroup):
+        def parse_args(self, ctx: click.Context, args: list[str]) -> list[str]:
+            if not args and self.no_args_is_help and not ctx.resilient_parsing:
+                click.echo(ctx.get_help(), color=ctx.color)
+                ctx.exit(0)
+            return super().parse_args(ctx, args)
+
         def list_commands(self, ctx: click.Context) -> list[str]:
             eager = super().list_commands(ctx)
             lazy_names = [name for name in lazy_subcommands if name not in eager]
diff --git a/packages/data-designer/src/data_designer/cli/main.py b/packages/data-designer/src/data_designer/cli/main.py
index 757ff8073..07b8a183b 100644
--- a/packages/data-designer/src/data_designer/cli/main.py
+++ b/packages/data-designer/src/data_designer/cli/main.py
@@ -141,6 +141,84 @@ def _is_version_request(args: list[str]) -> bool:
     no_args_is_help=True,
 )
 
+# Create plugin command group
+plugin_app = typer.Typer(
+    name="plugin",
+    help="Discover, install, and uninstall Data Designer plugin packages from catalogs",
+    cls=create_lazy_typer_group(
+        {
+            "list": {
+                "module": f"{_CMD}.plugin",
+                "attr": "list_command",
+                "help": "List plugin packages from a catalog",
+            },
+            "search": {
+                "module": f"{_CMD}.plugin",
+                "attr": "search_command",
+                "help": "Search plugin packages from a catalog",
+            },
+            "info": {
+                "module": f"{_CMD}.plugin",
+                "attr": "info_command",
+                "help": "Show plugin package metadata and install plan",
+            },
+            "install": {
+                "module": f"{_CMD}.plugin",
+                "attr": "install_command",
+                "help": "Install a plugin package and verify declared runtime entry points",
+            },
+            "uninstall": {
+                "module": f"{_CMD}.plugin",
+                "attr": "uninstall_command",
+                "help": "Uninstall a plugin package and verify declared runtime entry points are removed",
+            },
+            "installed": {
+                "module": f"{_CMD}.plugin",
+                "attr": "installed_command",
+                "help": "List installed runtime plugin entry points",
+            },
+        }
+    ),
+    no_args_is_help=True,
+)
+
+
+@plugin_app.callback()
+def plugin_callback(
+    catalog: str | None = typer.Option(
+        None,
+        "--catalog",
+        help="Plugin catalog alias to use for commands that read package metadata.",
+    ),
+) -> None:
+    _ = catalog
+
+
+plugin_catalog_app = typer.Typer(
+    name="catalog",
+    help="Manage plugin catalog aliases",
+    cls=create_lazy_typer_group(
+        {
+            "list": {
+                "module": f"{_CMD}.plugin",
+                "attr": "catalog_list_command",
+                "help": "List configured plugin catalogs",
+            },
+            "add": {
+                "module": f"{_CMD}.plugin",
+                "attr": "catalog_add_command",
+                "help": "Add a plugin catalog alias",
+            },
+            "remove": {
+                "module": f"{_CMD}.plugin",
+                "attr": "catalog_remove_command",
+                "help": "Remove a plugin catalog alias",
+            },
+        }
+    ),
+    no_args_is_help=True,
+)
+
 
 _AGENT_CMD = f"{_CMD}.agent"
 
@@ -167,10 +245,12 @@ def _build_agent_lazy_group(prefix: str) -> dict[str, dict[str, str]]:
 )
 
 agent_app.add_typer(agent_state_app, name="state")
+plugin_app.add_typer(plugin_catalog_app, name="catalog")
 
 # Add setup command groups
 app.add_typer(config_app, name="config", rich_help_panel="Setup")
 app.add_typer(download_app, name="download", rich_help_panel="Setup")
+app.add_typer(plugin_app, name="plugin", rich_help_panel="Setup")
 
 app.add_typer(agent_app, name="agent", rich_help_panel="Agent")
diff --git a/packages/data-designer/src/data_designer/cli/plugin_catalog.py b/packages/data-designer/src/data_designer/cli/plugin_catalog.py
new file mode 100644
index 000000000..4cfe32e03
--- /dev/null
+++ b/packages/data-designer/src/data_designer/cli/plugin_catalog.py
@@ -0,0 +1,528 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+from __future__ import annotations
+
+import os
+from dataclasses import dataclass
+from urllib.parse import urlparse
+
+from packaging.markers import InvalidMarker, Marker
+from packaging.requirements import InvalidRequirement, Requirement
+from packaging.specifiers import InvalidSpecifier, SpecifierSet
+from packaging.utils import InvalidName, canonicalize_name
+from pydantic import BaseModel, ConfigDict, Field
+
+from data_designer.plugins.plugin import PluginType
+
+DEFAULT_PLUGIN_CATALOG_ALIAS = "nvidia"
+DEFAULT_PLUGIN_CATALOG_URL = "https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json"
+DEFAULT_PLUGIN_CATALOG_URL_ENV_VAR = "DATA_DESIGNER_DEFAULT_PLUGIN_CATALOG_URL"
+PLUGIN_CATALOGS_FILE_NAME = "plugin_catalogs.yaml"
+PLUGIN_CATALOG_CACHE_DIR_NAME = "plugin-catalog-cache"
+PLUGIN_CATALOG_DEFAULT_CACHE_TTL_SECONDS = 24 * 60 * 60
+MAX_PLUGIN_CATALOG_SIZE_BYTES = 1 * 1024 * 1024
+PLUGIN_CATALOG_SCHEMA_VERSION = 2
+PLUGIN_CATALOG_ALIAS_PATTERN = r"^[A-Za-z0-9_.-]+$"
+DATA_DESIGNER_DISTRIBUTION_NAME = "data-designer"
+DATA_DESIGNER_PLUGIN_PACKAGE_PREFIX = "data-designer-"
+PLUGIN_ENTRY_POINT_GROUP = "data_designer.plugins"
+PYPI_SIMPLE_INDEX_URL = "https://pypi.org/simple/"
+CATALOG_DOCUMENT_KEYS = {"packages", "schema_version"}
+CATALOG_PACKAGE_KEYS = {
+    "compatibility",
+    "description",
+    "docs",
+    "install",
+    "name",
+    "plugins",
+}
+CATALOG_PLUGIN_KEYS = {"entry_point", "name", "plugin_type"}
+CATALOG_ENTRY_POINT_KEYS = {"group", "name", "value"}
+CATALOG_COMPATIBILITY_KEYS = {"data_designer", "python"}
+CATALOG_PYTHON_COMPATIBILITY_KEYS = {"specifier"}
+CATALOG_DATA_DESIGNER_COMPATIBILITY_KEYS = {"marker", "requirement", "specifier"}
+CATALOG_DOCS_KEYS = {"url"}
+CATALOG_INSTALL_REQUIRED_KEYS = {"requirement"}
+CATALOG_INSTALL_OPTIONAL_KEYS = {"index_url"}
+SUPPORTED_PLUGIN_TYPE_VALUES = {plugin_type.value for plugin_type in PluginType}
+
+
+class PluginCatalogError(ValueError):
+    """Raised when a plugin catalog cannot be loaded or validated."""
+
+
+class PluginCompatibilityTarget(BaseModel):
+    """Version requirement for one environment target."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    requirement: str | None = None
+    specifier: str = Field(min_length=1)
+    marker: str | None = None
+
+
+class PluginCompatibility(BaseModel):
+    """Compatibility requirements declared by a catalog package."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    python: PluginCompatibilityTarget
+    data_designer: PluginCompatibilityTarget
+
+
+class PluginPackageInfo(BaseModel):
+    """Python distribution metadata for a catalog entry."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    name: str
+
+
+class PluginEntryPointInfo(BaseModel):
+    """Runtime entry point exposed by an installable plugin package."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    group: str = PLUGIN_ENTRY_POINT_GROUP
+    name: str
+    value: str
+
+
+class PluginInstallInfo(BaseModel):
+    """Resolver-native install metadata for a catalog package."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    requirement: str
+    index_url: str | None = None
+
+
+class PluginDocsInfo(BaseModel):
+    """Documentation metadata for a catalog package."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    url: str = Field(min_length=1)
+
+
+class PluginCatalogEntry(BaseModel):
+    """One discoverable runtime plugin entry from a catalog package."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    name: str
+    plugin_type: PluginType
+    description: str = Field(min_length=1)
+    package: PluginPackageInfo
+    install: PluginInstallInfo
+    entry_point: PluginEntryPointInfo
+    compatibility: PluginCompatibility
+    docs: PluginDocsInfo
+
+
+class PluginCatalogRuntimePlugin(BaseModel):
+    """Runtime plugin metadata nested under one catalog package."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    name: str
+    plugin_type: PluginType
+    entry_point: PluginEntryPointInfo
+
+
+class PluginCatalogPackage(BaseModel):
+    """One installable package from a package-first plugin catalog."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    name: str
+    description: str = Field(min_length=1)
+    install: PluginInstallInfo
+    compatibility: PluginCompatibility
+    docs: PluginDocsInfo
+    plugins: list[PluginCatalogRuntimePlugin] = Field(min_length=1)
+
+    def entries(self) -> list[PluginCatalogEntry]:
+        """Flatten nested runtime plugins while preserving package-level metadata."""
+        package = PluginPackageInfo(name=self.name)
+        return [
+            PluginCatalogEntry(
+                name=plugin.name,
+                plugin_type=plugin.plugin_type,
+                description=self.description,
+                package=package,
+                install=self.install,
+                entry_point=plugin.entry_point,
+                compatibility=self.compatibility,
+                docs=self.docs,
+            )
+            for plugin in self.plugins
+        ]
+
+
+class PluginCatalog(BaseModel):
+    """Versioned plugin catalog."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    schema_version: int
+    packages: list[PluginCatalogPackage] = Field(default_factory=list)
+
+    @property
+    def entries(self) -> list[PluginCatalogEntry]:
+        """Return the runtime plugin entries described by every package."""
+        return [entry for package in self.packages for entry in package.entries()]
+
+    @property
+    def plugins(self) -> list[PluginCatalogEntry]:
+        """Convenience alias for flattened runtime plugin entries."""
+        return self.entries
+
+
+class PluginCatalogConfig(BaseModel):
+    """Persisted catalog configuration."""
+
+    alias: str = Field(pattern=PLUGIN_CATALOG_ALIAS_PATTERN)
+    url: str
+    cache_ttl_seconds: int = Field(default=PLUGIN_CATALOG_DEFAULT_CACHE_TTL_SECONDS, ge=0)
+
+
+class PluginCatalogRegistry(BaseModel):
+    """Persisted collection of user-configured plugin catalogs."""
+
+    catalogs: list[PluginCatalogConfig] = Field(default_factory=list)
+
+
+@dataclass(frozen=True)
+class CompatibilityResult:
+    """Compatibility result for one catalog entry in the local environment."""
+
+    is_compatible: bool
+    reasons: list[str]
+
+
+@dataclass(frozen=True)
+class InstallCommandTemporaryFile:
+    """Temporary file needed only while executing one install command."""
+
+    placeholder: str
+    filename: str
+    content: str
+
+
+@dataclass(frozen=True)
+class InstallPlan:
+    """Resolved package-manager command for installing one plugin package."""
+
+    package_name: str
+    source_description: str
+    command: list[str]
+    manager: str
+    catalog_alias: str
+    source_warning: str | None = None
+    data_designer_protection: str | None = None
+    command_stdin: str | None = None
+    temporary_file: InstallCommandTemporaryFile | None = None
+    install_mode: str = "environment"
+    project_root: str | None = None
+
+
+@dataclass(frozen=True)
+class UninstallPlan:
+    """Resolved package-manager command for uninstalling one plugin package."""
+
+    package_name: str
+    command: list[str]
+    manager: str
+    catalog_alias: str
+    commands: list[list[str]] | None = None
+    uninstall_mode: str = "environment"
+    project_root: str | None = None
+
+
+@dataclass(frozen=True)
+class InstalledPluginInfo:
+    """Installed runtime plugin entry point discovered without importing plugin code."""
+
+    name: str
+    entry_point_value: str
+
+
+def get_default_plugin_catalog_url() -> str:
+    """Return the built-in plugin catalog URL, honoring a local override for QA/staging."""
+    return os.getenv(DEFAULT_PLUGIN_CATALOG_URL_ENV_VAR, DEFAULT_PLUGIN_CATALOG_URL)
+
+
+def validate_plugin_catalog_payload(payload: object, *, source: str) -> None:
+    """Validate a decoded plugin catalog against the schema v2 contract."""
+    try:
+        _validate_plugin_catalog_payload(payload)
+    except PluginCatalogError as e:
+        raise PluginCatalogError(f"Invalid plugin catalog at {source!r}: {e}") from e
+
+
+def _validate_plugin_catalog_payload(payload: object) -> None:
+    catalog = _required_catalog_object("catalog document", payload, CATALOG_DOCUMENT_KEYS)
+    schema_version = catalog["schema_version"]
+    if (
+        not isinstance(schema_version, int)
+        or isinstance(schema_version, bool)
+        or schema_version != PLUGIN_CATALOG_SCHEMA_VERSION
+    ):
+        raise PluginCatalogError(
+            f"unsupported catalog schema_version {schema_version!r}; expected {PLUGIN_CATALOG_SCHEMA_VERSION}"
+        )
+
+    packages = catalog["packages"]
+    if not isinstance(packages, list):
+        raise PluginCatalogError("catalog document has invalid packages; expected a list")
+
+    package_names: dict[str, str] = {}
+    runtime_names: dict[str, tuple[str, str]] = {}
+    for index, raw_package in enumerate(packages):
+        validated_plugins = _validate_catalog_package(raw_package, index)
+        package_name = validated_plugins[0][0]
+        canonical_package_name = canonicalize_name(package_name)
+        previous_package_name = package_names.get(canonical_package_name)
+        if previous_package_name is not None:
+            raise PluginCatalogError(
+                f"duplicate package name {package_name!r}; canonical name {canonical_package_name!r} "
+                f"already used by {previous_package_name!r}"
+            )
+        package_names[canonical_package_name] = package_name
+
+        for package_name, plugin_name, entry_point_name in validated_plugins:
+            previous = runtime_names.get(plugin_name)
+            if previous is not None:
+                previous_package, previous_entry_point_name = previous
+                raise PluginCatalogError(
+                    f"duplicate runtime plugin name {plugin_name!r} from "
+                    f"{previous_package!r} entry point {previous_entry_point_name!r} and "
+                    f"{package_name!r} entry point {entry_point_name!r}"
+                )
+            runtime_names[plugin_name] = (package_name, entry_point_name)
+
+
+def _validate_catalog_package(raw_package: object, index: int) -> list[tuple[str, str, str]]:
+    context = f"catalog packages[{index}]"
+    package = _required_catalog_object(context, raw_package, CATALOG_PACKAGE_KEYS)
+    compatibility = _required_catalog_object(
+        f"{context}.compatibility",
+        package["compatibility"],
+        CATALOG_COMPATIBILITY_KEYS,
+    )
+    python_compatibility = _required_catalog_object(
+        f"{context}.compatibility.python",
+        compatibility["python"],
+        CATALOG_PYTHON_COMPATIBILITY_KEYS,
+    )
+    data_designer_compatibility = _required_catalog_object(
+        f"{context}.compatibility.data_designer",
+        compatibility["data_designer"],
+        CATALOG_DATA_DESIGNER_COMPATIBILITY_KEYS,
+    )
+    install = _required_catalog_object(f"{context}.install", package["install"])
+    docs = _required_catalog_object(f"{context}.docs", package["docs"], CATALOG_DOCS_KEYS)
+
+    package_name = _catalog_package_name(f"{context}.name", package["name"])
+    _required_catalog_string(f"{context}.description", package["description"])
+    _catalog_version_specifier(
+        package_name,
+        f"{context}.compatibility.python.specifier",
+        python_compatibility["specifier"],
+    )
+    _catalog_data_designer_compatibility(
+        package_name,
+        f"{context}.compatibility.data_designer",
+        data_designer_compatibility,
+    )
+    _validate_install_metadata(package_name, f"{context}.install", install)
+    _catalog_http_url(f"{context}.docs.url", docs["url"])
+
+    plugins = package["plugins"]
+    if not isinstance(plugins, list) or not plugins:
+        raise PluginCatalogError(f"{context}.plugins is invalid; expected a non-empty list")
+
+    return [
+        _validate_catalog_plugin(
+            raw_plugin,
+            package_name=package_name,
+            context=f"{context}.plugins[{plugin_index}]",
+        )
+        for plugin_index, raw_plugin in enumerate(plugins)
+    ]
+
+
+def _validate_catalog_plugin(raw_plugin: object, *, package_name: str, context: str) -> tuple[str, str, str]:
+    plugin = _required_catalog_object(context, raw_plugin, CATALOG_PLUGIN_KEYS)
+    entry_point = _required_catalog_object(
+        f"{context}.entry_point",
+        plugin["entry_point"],
+        CATALOG_ENTRY_POINT_KEYS,
+    )
+
+    plugin_type = _required_catalog_string(f"{context}.plugin_type", plugin["plugin_type"])
+    if plugin_type not in SUPPORTED_PLUGIN_TYPE_VALUES:
+        raise PluginCatalogError(
+            f"{context}.plugin_type {plugin_type!r} is invalid; expected one of "
+            f"{_format_catalog_choices(SUPPORTED_PLUGIN_TYPE_VALUES)}"
+        )
+
+    plugin_name = _required_catalog_string(f"{context}.name", plugin["name"])
+    entry_point_group = _required_catalog_string(f"{context}.entry_point.group", entry_point["group"])
+    if entry_point_group != PLUGIN_ENTRY_POINT_GROUP:
+        raise PluginCatalogError(
+            f"{context}.entry_point.group {entry_point_group!r} is invalid; expected {PLUGIN_ENTRY_POINT_GROUP!r}"
+        )
+    entry_point_name = _required_catalog_string(f"{context}.entry_point.name", entry_point["name"])
+    _required_catalog_string(f"{context}.entry_point.value", entry_point["value"])
+    return package_name, plugin_name, entry_point_name
+
+
+def _validate_install_metadata(package_name: str, context: str, install: dict[str, object]) -> None:
+    keys = set(install)
+    missing_keys = CATALOG_INSTALL_REQUIRED_KEYS - keys
+    extra_keys = keys - CATALOG_INSTALL_REQUIRED_KEYS - CATALOG_INSTALL_OPTIONAL_KEYS
+    if missing_keys or extra_keys:
+        expected_required = _format_catalog_keys(CATALOG_INSTALL_REQUIRED_KEYS)
+        expected_optional = _format_catalog_keys(CATALOG_INSTALL_OPTIONAL_KEYS)
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid install fields; "
+            f"expected {{{expected_required}; optional {{{expected_optional}}}}}, "
+            f"got {{{_format_catalog_keys(keys)}}}"
+        )
+
+    requirement_text = _required_catalog_string(f"{context}.requirement", install["requirement"])
+    try:
+        requirement = Requirement(requirement_text)
+    except InvalidRequirement as e:
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid {context}.requirement {requirement_text!r}: {e}"
+        ) from e
+    if canonicalize_name(requirement.name) != canonicalize_name(package_name):
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid {context}.requirement {requirement_text!r}; "
+            f"expected a requirement for {package_name!r}"
+        )
+
+    if "index_url" in install:
+        _catalog_http_url(f"package {package_name!r} install.index_url", install["index_url"])
+
+
+def _required_catalog_object(
+    context: str,
+    value: object,
+    expected_keys: set[str] | None = None,
+) -> dict[str, object]:
+    if not isinstance(value, dict):
+        raise PluginCatalogError(f"{context} is invalid; expected an object")
+    if expected_keys is not None:
+        _validate_catalog_object_keys(context, value, expected_keys)
+    return value
+
+
+def _validate_catalog_object_keys(context: str, value: dict[str, object], expected_keys: set[str]) -> None:
+    keys = set(value)
+    if keys != expected_keys:
+        # Catalog v2 is strict by design: additive wire-schema changes should bump
+        # schema_version so older CLIs do not silently ignore new fields.
+        raise PluginCatalogError(
+            f"{context} has invalid fields; expected {{{_format_catalog_keys(expected_keys)}}}, "
+            f"got {{{_format_catalog_keys(keys)}}}"
+        )
+
+
+def _required_catalog_string(context: str, value: object) -> str:
+    if not isinstance(value, str) or not value:
+        raise PluginCatalogError(f"{context} is invalid; expected a non-empty string")
+    return value
+
+
+def _required_catalog_nullable_string(context: str, value: object) -> str | None:
+    if value is None:
+        return None
+    if isinstance(value, str):
+        return value
+    raise PluginCatalogError(f"{context} is invalid; expected a string or null")
+
+
+def _catalog_package_name(context: str, value: object) -> str:
+    package_name = _required_catalog_string(context, value)
+    try:
+        canonicalize_name(package_name, validate=True)
+    except InvalidName as e:
+        raise PluginCatalogError(f"{context} {package_name!r} is invalid; expected a valid package name") from e
+    return package_name
+
+
+def _catalog_version_specifier(package_name: str, context: str, value: object) -> str:
+    raw_specifier = _required_catalog_string(context, value)
+    try:
+        specifier = SpecifierSet(raw_specifier)
+    except InvalidSpecifier as e:
+        raise PluginCatalogError(f"package {package_name!r} has invalid {context} {raw_specifier!r}: {e}") from e
+    if not str(specifier):
+        raise PluginCatalogError(f"package {package_name!r} has invalid {context}; expected at least one specifier")
+    return str(specifier)
+
+
+def _catalog_data_designer_compatibility(
+    package_name: str,
+    context: str,
+    compatibility: dict[str, object],
+) -> None:
+    requirement_text = _required_catalog_string(f"{context}.requirement", compatibility["requirement"])
+    try:
+        requirement = Requirement(requirement_text)
+    except InvalidRequirement as e:
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid {context}.requirement {requirement_text!r}: {e}"
+        ) from e
+    if canonicalize_name(requirement.name) != DATA_DESIGNER_DISTRIBUTION_NAME:
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid {context}.requirement {requirement_text!r}; "
+            f"expected a {DATA_DESIGNER_DISTRIBUTION_NAME!r} requirement"
+        )
+    if not requirement.specifier:
+        raise PluginCatalogError(f"package {package_name!r} has invalid {context}.requirement; expected a specifier")
+
+    specifier = _catalog_version_specifier(package_name, f"{context}.specifier", compatibility["specifier"])
+    if specifier != str(requirement.specifier):
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid {context}.specifier {specifier!r}; "
+            f"expected {str(requirement.specifier)!r} from requirement"
+        )
+
+    marker = _catalog_marker(package_name, f"{context}.marker", compatibility["marker"])
+    expected_marker = str(requirement.marker) if requirement.marker is not None else None
+    if marker != expected_marker:
+        raise PluginCatalogError(
+            f"package {package_name!r} has invalid {context}.marker {marker!r}; expected {expected_marker!r}"
+        )
+
+
+def _catalog_marker(package_name: str, context: str, value: object) -> str | None:
+    raw_marker = _required_catalog_nullable_string(context, value)
+    if raw_marker is None:
+        return None
+    try:
+        return str(Marker(raw_marker))
+    except InvalidMarker as e:
+        raise PluginCatalogError(f"package {package_name!r} has invalid {context} {raw_marker!r}: {e}") from e
+
+
+def _catalog_http_url(context: str, value: object) -> str:
+    url = _required_catalog_string(context, value)
+    parsed = urlparse(url)
+    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
+        raise PluginCatalogError(f"{context} {url!r} is invalid; expected an absolute HTTP(S) URL")
+    return url
+
+
+def _format_catalog_keys(keys: set[str]) -> str:
+    return ", ".join(sorted(keys))
+
+
+def _format_catalog_choices(choices: set[str]) -> str:
+    return ", ".join(repr(choice) for choice in sorted(choices))
diff --git a/packages/data-designer/src/data_designer/cli/repositories/plugin_catalog_repository.py b/packages/data-designer/src/data_designer/cli/repositories/plugin_catalog_repository.py
new file mode 100644
index 000000000..bc5b14ae3
--- /dev/null
+++ b/packages/data-designer/src/data_designer/cli/repositories/plugin_catalog_repository.py
@@ -0,0 +1,342 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+from __future__ import annotations
+
+import hashlib
+import json
+import os
+from datetime import datetime, timezone
+from pathlib import Path
+from urllib.error import HTTPError, URLError
+from urllib.parse import urlparse
+from urllib.request import Request, urlopen
+
+from pydantic import ValidationError
+
+from data_designer.cli.plugin_catalog import (
+    DEFAULT_PLUGIN_CATALOG_ALIAS,
+    MAX_PLUGIN_CATALOG_SIZE_BYTES,
+    PLUGIN_CATALOG_CACHE_DIR_NAME,
+    PLUGIN_CATALOG_DEFAULT_CACHE_TTL_SECONDS,
+    PLUGIN_CATALOGS_FILE_NAME,
+    PluginCatalog,
+    PluginCatalogConfig,
+    PluginCatalogError,
+    PluginCatalogRegistry,
+    get_default_plugin_catalog_url,
+    validate_plugin_catalog_payload,
+)
+from data_designer.cli.repositories.base import ConfigRepository
+from data_designer.config.errors import InvalidConfigError, InvalidFileFormatError, InvalidFilePathError
+from data_designer.config.utils.io_helpers import load_config_file, save_config_file
+
+
+class PluginCatalogRepository(ConfigRepository[PluginCatalogRegistry]):
+    """Repository for plugin catalog aliases and cached catalog payloads."""
+
+    @property
+    def config_file(self) -> Path:
+        """Get the plugin catalog configuration file path."""
+        return self.config_dir / PLUGIN_CATALOGS_FILE_NAME
+
+    @property
+    def cache_dir(self) -> Path:
+        """Get the plugin catalog cache directory path."""
+        return self.config_dir / PLUGIN_CATALOG_CACHE_DIR_NAME
+
+    def load(self) -> PluginCatalogRegistry | None:
+        """Load user-configured plugin catalogs."""
+        if not self.exists():
+            return None
+
+        try:
+            config_dict = load_config_file(self.config_file)
+            return PluginCatalogRegistry.model_validate(config_dict)
+        except (InvalidConfigError, InvalidFileFormatError, InvalidFilePathError, OSError, ValidationError) as e:
+            raise PluginCatalogError(f"Failed to load plugin catalog registry at {self.config_file}: {e}") from e
+
+    def save(self, config: PluginCatalogRegistry) -> None:
+        """Save user-configured plugin catalogs."""
+        config_dict = config.model_dump(mode="json", exclude_none=True, exclude_defaults=True)
+        save_config_file(self.config_file, config_dict)
+
+    def list_catalogs(self) -> list[PluginCatalogConfig]:
+        """Return the built-in NVIDIA catalog followed by user-configured catalogs."""
+        catalogs = [self.default_catalog()]
+        registry = self.load()
+        if registry is not None:
+            catalogs.extend(sorted(registry.catalogs, key=lambda catalog: catalog.alias.casefold()))
+        return catalogs
+
+    def get_catalog(self, alias: str | None = None) -> PluginCatalogConfig | None:
+        """Return a catalog by alias, defaulting to the built-in NVIDIA catalog."""
+        resolved_alias = alias or DEFAULT_PLUGIN_CATALOG_ALIAS
+        return next((catalog for catalog in self.list_catalogs() if _same_alias(catalog.alias, resolved_alias)), None)
+
+    def add_catalog(
+        self,
+        alias: str,
+        url: str,
+        *,
+        cache_ttl_seconds: int = PLUGIN_CATALOG_DEFAULT_CACHE_TTL_SECONDS,
+    ) -> PluginCatalogConfig:
+        """Persist a new catalog alias.
+
+        Raises:
+            ValueError: If the alias already exists or is reserved for the built-in catalog.
+ """ + if self.get_catalog(alias) is not None: + raise ValueError(f"Plugin catalog alias {alias!r} already exists") + + catalog = PluginCatalogConfig( + alias=alias, + url=normalize_catalog_location(url), + cache_ttl_seconds=cache_ttl_seconds, + ) + registry = self.load() or PluginCatalogRegistry() + registry.catalogs.append(catalog) + registry.catalogs = sorted(registry.catalogs, key=lambda item: item.alias.casefold()) + self.save(registry) + return catalog + + def remove_catalog(self, alias: str) -> None: + """Remove a user-configured catalog alias. + + Raises: + ValueError: If the alias is reserved or does not exist. + """ + if _same_alias(alias, DEFAULT_PLUGIN_CATALOG_ALIAS): + raise ValueError(f"Cannot remove the built-in {DEFAULT_PLUGIN_CATALOG_ALIAS!r} plugin catalog") + + registry = self.load() + matching_catalog = ( + next((catalog for catalog in registry.catalogs if _same_alias(catalog.alias, alias)), None) + if registry + else None + ) + if registry is None or matching_catalog is None: + raise ValueError(f"Plugin catalog alias {alias!r} not found") + + registry.catalogs = [catalog for catalog in registry.catalogs if not _same_alias(catalog.alias, alias)] + if registry.catalogs: + self.save(registry) + else: + self.delete() + + self._remove_cache_files(matching_catalog) + + def load_catalog(self, alias: str | None = None, *, refresh: bool = False) -> PluginCatalog: + """Load a catalog from cache or source.""" + catalog_config = self.get_catalog(alias) + if catalog_config is None: + raise ValueError(f"Plugin catalog alias {alias!r} not found") + + if not refresh: + cached_catalog = self._load_cached_catalog(catalog_config, require_fresh=True) + if cached_catalog is not None: + return cached_catalog + + try: + payload = self._fetch_catalog_payload(catalog_config.url) + except (PluginCatalogError, OSError, ValueError): + if not refresh: + cached_catalog = self._load_cached_catalog(catalog_config, require_fresh=False) + if cached_catalog is not None: + 
return cached_catalog + raise + + catalog = self._validate_catalog(payload, source=catalog_config.url) + self._save_catalog_cache(catalog_config, payload) + return catalog + + def _load_cached_catalog(self, catalog: PluginCatalogConfig, *, require_fresh: bool) -> PluginCatalog | None: + cache_file = self._cache_file(catalog) + if not cache_file.exists(): + return None + + try: + with open(cache_file) as f: + cache_payload = json.load(f) + fetched_at = datetime.fromisoformat(cache_payload["fetched_at"]) + if fetched_at.tzinfo is None: + fetched_at = fetched_at.replace(tzinfo=timezone.utc) + if require_fresh and catalog.cache_ttl_seconds == 0: + return None + if require_fresh: + age_seconds = (datetime.now(timezone.utc) - fetched_at).total_seconds() + if age_seconds > catalog.cache_ttl_seconds: + return None + catalog_payload = cache_payload["catalog"] + return self._validate_catalog(catalog_payload, source=str(cache_file)) + except (OSError, json.JSONDecodeError, KeyError, TypeError, ValueError): + return None + + def _save_catalog_cache(self, catalog: PluginCatalogConfig, catalog_payload: dict[str, object]) -> None: + self.cache_dir.mkdir(parents=True, exist_ok=True) + cache_payload = { + "catalog_alias": catalog.alias, + "catalog_url": catalog.url, + "fetched_at": datetime.now(timezone.utc).isoformat(), + "catalog": catalog_payload, + } + cache_file = self._cache_file(catalog) + temp_path = cache_file.with_name(f"{cache_file.name}.{os.getpid()}.tmp") + try: + temp_path.write_text(json.dumps(cache_payload, indent=2, sort_keys=True), encoding="utf-8") + temp_path.replace(cache_file) + finally: + temp_path.unlink(missing_ok=True) + + def _cache_file(self, catalog: PluginCatalogConfig) -> Path: + url_hash = hashlib.sha256(catalog.url.encode("utf-8")).hexdigest()[:12] + return self.cache_dir / f"{catalog.alias}-{url_hash}.json" + + def _remove_cache_files(self, catalog: PluginCatalogConfig) -> None: + if not self.cache_dir.exists(): + return + + 
self._cache_file(catalog).unlink(missing_ok=True) + legacy_cache_file = self.cache_dir / f"{catalog.alias}.json" + legacy_cache_file.unlink(missing_ok=True) + + for cache_file in self.cache_dir.glob("*.json"): + try: + with open(cache_file) as f: + cache_payload = json.load(f) + except (OSError, json.JSONDecodeError): + continue + cached_alias = cache_payload.get("catalog_alias") + if isinstance(cached_alias, str) and _same_alias(cached_alias, catalog.alias): + cache_file.unlink(missing_ok=True) + + @staticmethod + def _fetch_catalog_payload(location: str) -> dict[str, object]: + if _is_http_url(location): + return _fetch_remote_catalog(location) + return _fetch_local_catalog(location) + + @staticmethod + def _validate_catalog(payload: dict, *, source: str) -> PluginCatalog: + validate_plugin_catalog_payload(payload, source=source) + try: + catalog = PluginCatalog.model_validate(payload) + except ValidationError as e: + raise PluginCatalogError(f"Invalid plugin catalog at {source!r}: {e}") from e + return catalog + + @staticmethod + def default_catalog() -> PluginCatalogConfig: + """Return the built-in NVIDIA plugin catalog configuration.""" + catalog_url = get_default_plugin_catalog_url() + return PluginCatalogConfig( + alias=DEFAULT_PLUGIN_CATALOG_ALIAS, + url=catalog_url, + cache_ttl_seconds=PLUGIN_CATALOG_DEFAULT_CACHE_TTL_SECONDS, + ) + + +def normalize_catalog_location(location: str) -> str: + """Normalize a catalog repository, catalog URL, or local path to a catalog location.""" + if _is_http_url(location): + return _normalize_catalog_url(location) + + path = Path(location).expanduser() + if path.suffix.lower() == ".json": + return str(path.resolve(strict=False)) + return str(_catalog_plugins_path(path).resolve(strict=False)) + + +def _same_alias(left: str, right: str) -> bool: + return left.casefold() == right.casefold() + + +def _normalize_catalog_url(url: str) -> str: + parsed = urlparse(url) + hostname = parsed.hostname or "" + segments = [segment for 
segment in parsed.path.split("/") if segment] + + if hostname in {"github.com", "www.github.com"} and len(segments) >= 2: + owner, repo = segments[0], segments[1] + if len(segments) == 2: + return f"https://raw.githubusercontent.com/{owner}/{repo}/main/catalog/plugins.json" + if len(segments) >= 5 and segments[2] == "blob": + ref = segments[3] + path = "/".join(segments[4:]) + return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}" + if len(segments) >= 4 and segments[2] == "tree": + ref = segments[3] + catalog_root = "/".join(segments[4:]) + catalog_path = _catalog_plugins_url_path(catalog_root) + return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{catalog_path}" + + return url + + +def _catalog_plugins_path(path: Path) -> Path: + if path.name == "catalog": + return path / "plugins.json" + return path / "catalog" / "plugins.json" + + +def _catalog_plugins_url_path(catalog_root: str) -> str: + if not catalog_root: + return "catalog/plugins.json" + if catalog_root.rstrip("/").endswith("/catalog") or catalog_root == "catalog": + return f"{catalog_root}/plugins.json" + return f"{catalog_root}/catalog/plugins.json" + + +def _fetch_local_catalog(location: str) -> dict[str, object]: + path = Path(location).expanduser() + if not path.exists(): + raise PluginCatalogError(f"Plugin catalog file not found: {path}") + if path.stat().st_size > MAX_PLUGIN_CATALOG_SIZE_BYTES: + raise PluginCatalogError( + f"Plugin catalog at {path} exceeds maximum size of {MAX_PLUGIN_CATALOG_SIZE_BYTES} bytes" + ) + + try: + with open(path) as f: + payload = json.load(f) + except json.JSONDecodeError as e: + raise PluginCatalogError(f"Failed to parse plugin catalog JSON at {path}: {e}") from e + + if not isinstance(payload, dict): + raise PluginCatalogError(f"Plugin catalog at {path} must be a JSON object") + return payload + + +def _fetch_remote_catalog(url: str) -> dict[str, object]: + request = Request(url, headers={"User-Agent": "data-designer"}) + try: + with 
urlopen(request, timeout=10) as response: + status = getattr(response, "status", 200) + if isinstance(status, int) and status >= 400: + raise PluginCatalogError(f"Failed to fetch plugin catalog {url!r}: HTTP {status}") + # Read one byte past the limit so oversized chunked responses are + # rejected without keeping the full response body in memory. + content = response.read(MAX_PLUGIN_CATALOG_SIZE_BYTES + 1) + except HTTPError as e: + raise PluginCatalogError(f"Failed to fetch plugin catalog {url!r}: HTTP {e.code}") from e + except URLError as e: + raise PluginCatalogError(f"Failed to fetch plugin catalog {url!r}: {e.reason}") from e + + if len(content) > MAX_PLUGIN_CATALOG_SIZE_BYTES: + raise PluginCatalogError( + f"Plugin catalog at {url!r} exceeds maximum size of {MAX_PLUGIN_CATALOG_SIZE_BYTES} bytes" + ) + + try: + payload = json.loads(content.decode("utf-8")) + except (UnicodeDecodeError, json.JSONDecodeError) as e: + raise PluginCatalogError(f"Failed to parse plugin catalog JSON at {url!r}: {e}") from e + + if not isinstance(payload, dict): + raise PluginCatalogError(f"Plugin catalog at {url!r} must be a JSON object") + return payload + + +def _is_http_url(value: str) -> bool: + parsed = urlparse(value) + return parsed.scheme in {"http", "https"} and bool(parsed.netloc) diff --git a/packages/data-designer/src/data_designer/cli/services/plugin_catalog_service.py b/packages/data-designer/src/data_designer/cli/services/plugin_catalog_service.py new file mode 100644 index 000000000..f4f992896 --- /dev/null +++ b/packages/data-designer/src/data_designer/cli/services/plugin_catalog_service.py @@ -0,0 +1,244 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0
+
+from __future__ import annotations
+
+import importlib.metadata
+import platform
+from collections import defaultdict
+from collections.abc import Iterable
+
+from packaging.markers import InvalidMarker, Marker
+from packaging.specifiers import InvalidSpecifier, SpecifierSet
+from packaging.utils import canonicalize_name
+from packaging.version import InvalidVersion, Version
+
+from data_designer.cli.plugin_catalog import (
+    DATA_DESIGNER_PLUGIN_PACKAGE_PREFIX,
+    PLUGIN_ENTRY_POINT_GROUP,
+    CompatibilityResult,
+    InstalledPluginInfo,
+    PluginCatalogConfig,
+    PluginCatalogEntry,
+    PluginCompatibilityTarget,
+)
+from data_designer.cli.repositories.plugin_catalog_repository import PluginCatalogRepository
+
+
+class PluginCatalogService:
+    """Business logic for plugin catalog discovery and compatibility checks."""
+
+    def __init__(
+        self,
+        repository: PluginCatalogRepository,
+        *,
+        python_version: str | None = None,
+        data_designer_version: str | None = None,
+    ) -> None:
+        self.repository = repository
+        self.python_version = python_version or platform.python_version()
+        self.data_designer_version = data_designer_version or _get_installed_data_designer_version()
+
+    def list_entries(
+        self,
+        catalog_alias: str | None = None,
+        *,
+        refresh: bool = False,
+        include_incompatible: bool = False,
+    ) -> list[PluginCatalogEntry]:
+        """List catalog entries for a catalog, filtering incompatible entries by default."""
+        catalog = self.repository.load_catalog(catalog_alias, refresh=refresh)
+        entries = sorted(catalog.entries, key=lambda entry: (canonicalize_name(entry.package.name), entry.name))
+        if include_incompatible:
+            return entries
+        return [entry for entry in entries if self.evaluate_compatibility(entry).is_compatible]
+
+    def search_entries(
+        self,
+        query: str,
+        catalog_alias: str | None = None,
+        *,
+        refresh: bool = False,
+        include_incompatible: bool = False,
+    ) -> list[PluginCatalogEntry]:
+        """Search catalog entries by package metadata and runtime plugin metadata."""
+        query_tokens = _tokenize(query)
+        if not query_tokens:
+            return []
+
+        return [
+            entry
+            for entry in self.list_entries(
+                catalog_alias,
+                refresh=refresh,
+                include_incompatible=include_incompatible,
+            )
+            if all(token in _entry_search_text(entry) for token in query_tokens)
+        ]
+
+    def get_package_entries(
+        self,
+        package: str,
+        catalog_alias: str | None = None,
+        *,
+        refresh: bool = False,
+        include_incompatible: bool = True,
+    ) -> list[PluginCatalogEntry]:
+        """Return all runtime plugin entries declared by one catalog package name or package alias."""
+        entries = self.list_entries(
+            catalog_alias,
+            refresh=refresh,
+            include_incompatible=include_incompatible,
+        )
+        canonical_package = canonicalize_name(package)
+        exact_matches = [entry for entry in entries if canonicalize_name(entry.package.name) == canonical_package]
+        if exact_matches:
+            return exact_matches
+
+        return [
+            entry for entry in entries if _package_alias(canonicalize_name(entry.package.name)) == canonical_package
+        ]
+
+    @staticmethod
+    def group_entries_by_package(entries: Iterable[PluginCatalogEntry]) -> dict[str, list[PluginCatalogEntry]]:
+        """Group catalog entries by installable package name."""
+        grouped_entries: dict[str, list[PluginCatalogEntry]] = defaultdict(list)
+        for entry in entries:
+            grouped_entries[canonicalize_name(entry.package.name)].append(entry)
+        return {
+            package_name: sorted(items, key=lambda item: item.name) for package_name, items in grouped_entries.items()
+        }
+
+    def evaluate_compatibility(self, entry: PluginCatalogEntry) -> CompatibilityResult:
+        """Evaluate whether a catalog entry is compatible with the local environment."""
+        compatibility = entry.compatibility
+        reasons = []
+        reasons.extend(
+            self._evaluate_target(
+                target=compatibility.python,
+                label="Python",
+                version=self.python_version,
+                marker_environment={"python_version": _major_minor(self.python_version)},
+            )
+        )
+        reasons.extend(
+            self._evaluate_target(
+                target=compatibility.data_designer,
+                label="Data Designer",
+                version=self.data_designer_version,
+                marker_environment={"python_version": _major_minor(self.python_version)},
+            )
+        )
+        return CompatibilityResult(is_compatible=not reasons, reasons=reasons)
+
+    def list_catalogs(self) -> list[PluginCatalogConfig]:
+        """List available plugin catalogs."""
+        return self.repository.list_catalogs()
+
+    def get_catalog(self, alias: str | None = None) -> PluginCatalogConfig:
+        """Return a plugin catalog or raise a user-facing error."""
+        catalog = self.repository.get_catalog(alias)
+        if catalog is None:
+            raise ValueError(f"Plugin catalog alias {alias!r} not found")
+        return catalog
+
+    def add_catalog(
+        self,
+        alias: str,
+        url: str,
+    ) -> PluginCatalogConfig:
+        """Add a plugin catalog alias."""
+        return self.repository.add_catalog(
+            alias,
+            url,
+        )
+
+    def remove_catalog(self, alias: str) -> None:
+        """Remove a plugin catalog alias."""
+        self.repository.remove_catalog(alias)
+
+    def list_installed_plugins(self) -> list[InstalledPluginInfo]:
+        """List installed Data Designer runtime plugin entry points without importing plugin modules."""
+        entry_points = importlib.metadata.entry_points(group=PLUGIN_ENTRY_POINT_GROUP)
+        installed_plugins = [
+            InstalledPluginInfo(name=entry_point.name, entry_point_value=entry_point.value)
+            for entry_point in entry_points
+        ]
+        return sorted(installed_plugins, key=lambda plugin: plugin.name)
+
+    def _evaluate_target(
+        self,
+        *,
+        target: PluginCompatibilityTarget,
+        label: str,
+        version: str | None,
+        marker_environment: dict[str, str],
+    ) -> list[str]:
+        marker_error = _marker_error(target.marker, marker_environment)
+        if marker_error is not None:
+            return [f"{label} marker {target.marker!r} is invalid: {marker_error}"]
+        if target.marker and not Marker(target.marker).evaluate(marker_environment):
+            return []
+
+        if version is None:
+            return [f"Unable to resolve installed {label} version for constraint {target.specifier!r}"]
+
+        try:
+            specifier = SpecifierSet(target.specifier)
+        except InvalidSpecifier as e:
+            return [f"{label} specifier {target.specifier!r} is invalid: {e}"]
+
+        try:
+            parsed_version = Version(version)
+        except InvalidVersion as e:
+            return [f"Installed {label} version {version!r} is invalid: {e}"]
+
+        if not specifier.contains(parsed_version, prereleases=True):
+            return [f"{label} {version} does not satisfy {target.specifier}"]
+        return []
+
+
+def _get_installed_data_designer_version() -> str | None:
+    try:
+        return importlib.metadata.version("data-designer")
+    except importlib.metadata.PackageNotFoundError:
+        return None
+
+
+def _tokenize(value: str) -> list[str]:
+    return [token.strip().lower() for token in value.split() if token.strip()]
+
+
+def _entry_search_text(entry: PluginCatalogEntry) -> str:
+    package_name = canonicalize_name(entry.package.name)
+    values = [
+        entry.package.name,
+        _package_alias(package_name) or "",
+        entry.description,
+        entry.name,
+        entry.plugin_type.value,
+    ]
+    return " ".join(values).lower()
+
+
+def _package_alias(canonical_package_name: str) -> str | None:
+    if not canonical_package_name.startswith(DATA_DESIGNER_PLUGIN_PACKAGE_PREFIX):
+        return None
+    return canonical_package_name.removeprefix(DATA_DESIGNER_PLUGIN_PACKAGE_PREFIX)
+
+
+def _major_minor(version: str) -> str:
+    parts = version.split(".")
+    if len(parts) < 2:
+        return version
+    return ".".join(parts[:2])
+
+
+def _marker_error(marker: str | None, environment: dict[str, str]) -> str | None:
+    if marker is None:
+        return None
+    try:
+        Marker(marker).evaluate(environment)
+    except InvalidMarker as e:
+        return str(e)
+    return None
diff --git a/packages/data-designer/src/data_designer/cli/services/plugin_install_service.py b/packages/data-designer/src/data_designer/cli/services/plugin_install_service.py
new file mode 100644
index 000000000..09545e6bb
--- /dev/null
+++ b/packages/data-designer/src/data_designer/cli/services/plugin_install_service.py
@@ -0,0 +1,553 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+from __future__ import annotations
+
+import importlib
+import importlib.metadata
+import os
+import shutil
+import subprocess
+import sys
+import tempfile
+from collections.abc import Callable, Iterator
+from contextlib import contextmanager
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+from packaging.requirements import InvalidRequirement, Requirement
+from packaging.utils import canonicalize_name
+from packaging.version import InvalidVersion, Version
+
+from data_designer.cli.plugin_catalog import (
+    DATA_DESIGNER_DISTRIBUTION_NAME,
+    PLUGIN_ENTRY_POINT_GROUP,
+    PYPI_SIMPLE_INDEX_URL,
+    InstallCommandTemporaryFile,
+    InstallPlan,
+    PluginCatalogConfig,
+    PluginCatalogEntry,
+    UninstallPlan,
+)
+
+InstallRunner = Callable[[list[str], str | None], int]
+PIP_EXTRA_INDEX_SOURCE_WARNING = (
+    "pip --extra-index-url is not source-pinned; pip may choose a same-named package from another configured index. "
+    "Use uv or a direct reference when strict source selection is required."
+)
+DATA_DESIGNER_DISTRIBUTION_NAMES = (
+    DATA_DESIGNER_DISTRIBUTION_NAME,
+    "data-designer-config",
+    "data-designer-engine",
+)
+DATA_DESIGNER_PROJECT_NAMES = (*DATA_DESIGNER_DISTRIBUTION_NAMES, "data-designer-workspace")
+PIP_DATA_DESIGNER_CONSTRAINT_FILE_NAME = "data-designer-constraint.txt"
+DATA_DESIGNER_CONSTRAINT_PLACEHOLDER = ""
+UV_PLUGIN_INSTALL_MIN_VERSION = Version("0.6.0")
+
+try:
+    import tomllib
+except ModuleNotFoundError:  # pragma: no cover - exercised only on Python 3.10.
+    tomllib = None  # type: ignore[assignment]
+
+
+@dataclass(frozen=True)
+class _InstallTarget:
+    manager: str
+    mode: str
+    project_root: Path | None = None
+    warning: str | None = None
+
+
+class PluginInstallService:
+    """Resolve, execute, and verify plugin package install and uninstall plans.
+
+    When no working directory is provided, plan resolution uses the current
+    process directory at build time so CLI calls follow the user's active shell.
+    """
+
+    def __init__(
+        self,
+        runner: InstallRunner | None = None,
+        *,
+        working_dir: Path | None = None,
+        active_virtualenv: bool | None = None,
+    ) -> None:
+        self._runner = runner or _run_subprocess
+        self._working_dir = working_dir
+        self._active_virtualenv = active_virtualenv
+
+    def build_install_plan(
+        self,
+        entry: PluginCatalogEntry,
+        catalog: PluginCatalogConfig,
+        *,
+        manager: str = "auto",
+    ) -> InstallPlan:
+        """Build the exact package-manager command for one catalog entry."""
+        target = _resolve_install_target(
+            manager,
+            working_dir=self._working_dir or Path.cwd(),
+            active_virtualenv=self._active_virtualenv,
+        )
+        data_designer_versions = _installed_data_designer_distribution_versions()
+        protection_args, data_designer_protection, command_stdin, temporary_file = _data_designer_protection_args(
+            target.mode,
+            data_designer_versions,
+        )
+        install_args, source_description, source_warning = _install_args_for_entry(entry, target)
+        command = _base_command(target) + protection_args + install_args
+        return InstallPlan(
+            package_name=entry.package.name,
+            source_description=source_description,
+            command=command,
+            manager=target.manager,
+            catalog_alias=catalog.alias,
+            source_warning=_combine_warnings(target.warning, source_warning),
+            data_designer_protection=data_designer_protection,
+            command_stdin=command_stdin,
+            temporary_file=temporary_file,
+            install_mode=target.mode,
+            project_root=str(target.project_root) if target.project_root is not None else None,
+        )
+
+    def build_uninstall_plan(
+        self,
+        entry: PluginCatalogEntry,
+        catalog: PluginCatalogConfig,
+        *,
+        manager: str = "auto",
+    ) -> UninstallPlan:
+        """Build the exact package-manager command to uninstall one catalog package."""
+        target = _resolve_install_target(
+            manager,
+            working_dir=self._working_dir or Path.cwd(),
+            active_virtualenv=self._active_virtualenv,
+        )
+        commands = _uninstall_commands(target, entry.package.name)
+        return UninstallPlan(
+            package_name=entry.package.name,
+            command=commands[0],
+            manager=target.manager,
+            catalog_alias=catalog.alias,
+            commands=commands,
+            uninstall_mode=target.mode,
+            project_root=str(target.project_root) if target.project_root is not None else None,
+        )
+
+    def install(self, plan: InstallPlan) -> None:
+        """Run an installation plan.
+
+        Raises:
+            RuntimeError: If the package manager exits unsuccessfully.
+        """
+        with _materialized_install_command(plan) as (command, command_stdin):
+            return_code = self._runner(command, command_stdin)
+            if return_code != 0:
+                raise RuntimeError(f"Plugin package installer exited with status {return_code}")
+
+    def uninstall(self, plan: UninstallPlan) -> None:
+        """Run an uninstall plan.
+
+        Raises:
+            RuntimeError: If the package manager exits unsuccessfully.
+ """ + for command in plan.commands or [plan.command]: + return_code = self._runner(command, None) + if return_code != 0: + raise RuntimeError(f"Plugin package uninstaller exited with status {return_code}") + + def verify_entry_point(self, entry: PluginCatalogEntry) -> bool: + """Verify the runtime plugin's declared entry point is installed.""" + return self.verify_entry_points([entry]) + + def verify_entry_points(self, entries: list[PluginCatalogEntry]) -> bool: + """Verify every declared runtime entry point for an installed catalog package.""" + if not entries: + return False + + importlib.invalidate_caches() + installed_entry_points = list(importlib.metadata.entry_points(group=PLUGIN_ENTRY_POINT_GROUP)) + return all( + any( + _installed_entry_point_matches(installed_entry_point, entry) + for installed_entry_point in installed_entry_points + ) + for entry in entries + ) + + def verify_entry_points_removed(self, entries: list[PluginCatalogEntry]) -> bool: + """Verify every declared runtime entry point for a catalog package is no longer installed.""" + if not entries: + return False + + importlib.invalidate_caches() + installed_entry_points = list(importlib.metadata.entry_points(group=PLUGIN_ENTRY_POINT_GROUP)) + return all( + not any( + _installed_entry_point_matches(installed_entry_point, entry) + for installed_entry_point in installed_entry_points + ) + for entry in entries + ) + + +def _run_subprocess(command: list[str], stdin_text: str | None) -> int: + if stdin_text is None: + result = subprocess.run(command, check=False, stdin=subprocess.DEVNULL) + else: + result = subprocess.run(command, check=False, input=stdin_text, text=True) + return result.returncode + + +def _installed_entry_point_matches( + installed_entry_point: importlib.metadata.EntryPoint, + entry: PluginCatalogEntry, +) -> bool: + if installed_entry_point.name != entry.entry_point.name: + return False + if installed_entry_point.value != entry.entry_point.value: + return False + + 
distribution_name = _entry_point_distribution_name(installed_entry_point) + if distribution_name is None: + return True + return canonicalize_name(distribution_name) == canonicalize_name(entry.package.name) + + +def _entry_point_distribution_name(installed_entry_point: importlib.metadata.EntryPoint) -> str | None: + distribution = getattr(installed_entry_point, "dist", None) + if distribution is None: + return None + + metadata = getattr(distribution, "metadata", None) + if metadata is None: + return None + + name = metadata.get("Name") + if not isinstance(name, str) or not name: + return None + return name + + +def _resolve_install_target( + manager: str, + *, + working_dir: Path, + active_virtualenv: bool | None, +) -> _InstallTarget: + if manager not in {"auto", "uv", "pip"}: + raise ValueError(f"Unsupported plugin installer {manager!r}. Expected 'auto', 'uv', or 'pip'.") + + uv_path = shutil.which("uv") if manager in {"auto", "uv"} else None + uv_error = _uv_plugin_install_error(uv_path) if uv_path is not None else None + if manager == "auto": + if uv_path is None: + return _InstallTarget(manager="pip", mode="pip-environment") + if uv_error is not None: + return _InstallTarget( + manager="pip", + mode="pip-environment", + warning=f"{uv_error}; falling back to pip.", + ) + project_root = _project_root_for_uv_add(working_dir, active_virtualenv) + if project_root is not None: + return _InstallTarget(manager="uv", mode="uv-project", project_root=project_root) + return _InstallTarget(manager="uv", mode="uv-environment") + + if manager == "uv": + if uv_path is None: + raise ValueError("uv was requested for plugin package installation, but it is not available on PATH") + if uv_error is not None: + raise ValueError(f"{uv_error}. 
Use --manager pip or update uv.") + project_root = _project_root_for_uv_add(working_dir, active_virtualenv) + if project_root is not None: + return _InstallTarget(manager="uv", mode="uv-project", project_root=project_root) + return _InstallTarget(manager="uv", mode="uv-environment") + + return _InstallTarget(manager="pip", mode="pip-environment") + + +def _base_command(target: _InstallTarget) -> list[str]: + if target.mode == "uv-project": + if target.project_root is None: + raise ValueError("uv project install target requires a project root") + return ["uv", "add", "--project", str(target.project_root), "--active", "--no-install-project"] + if target.mode == "uv-environment": + return ["uv", "pip", "install", "--python", sys.executable] + return [sys.executable, "-m", "pip", "install"] + + +def _uninstall_commands(target: _InstallTarget, package_name: str) -> list[list[str]]: + if target.mode == "uv-project": + if target.project_root is None: + raise ValueError("uv project uninstall target requires a project root") + return [ + ["uv", "remove", "--project", str(target.project_root), "--no-sync", package_name], + ["uv", "pip", "uninstall", "--python", sys.executable, package_name], + ] + return [_base_uninstall_command(target) + [package_name]] + + +def _base_uninstall_command(target: _InstallTarget) -> list[str]: + if target.manager == "uv": + return ["uv", "pip", "uninstall", "--python", sys.executable] + return [sys.executable, "-m", "pip", "uninstall", "--yes"] + + +def _project_root_for_uv_add(working_dir: Path, active_virtualenv: bool | None) -> Path | None: + if not _has_active_virtualenv(active_virtualenv): + return None + + project_root = _find_nearest_pyproject_root(working_dir) + if project_root is None or _is_data_designer_source_project(project_root): + return None + return project_root + + +def _has_active_virtualenv(active_virtualenv: bool | None) -> bool: + if active_virtualenv is not None: + return active_virtualenv + return sys.prefix != 
getattr(sys, "base_prefix", sys.prefix) or bool(os.getenv("VIRTUAL_ENV")) + + +def _uv_plugin_install_error(uv_path: str) -> str | None: + try: + result = subprocess.run( + [uv_path, "--version"], + check=False, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + timeout=5, + ) + except (OSError, subprocess.TimeoutExpired) as e: + return f"Unable to verify uv at {uv_path!r}: {e}" + + output = (result.stdout or result.stderr).strip() + if result.returncode != 0: + details = f": {output}" if output else "" + return f"Unable to verify uv at {uv_path!r}; `uv --version` exited with status {result.returncode}{details}" + + uv_version = _parse_uv_version(output) + if uv_version is None: + return ( + f"Unable to parse uv version from {output!r}; plugin package installs require " + f"uv >= {UV_PLUGIN_INSTALL_MIN_VERSION}" + ) + if uv_version < UV_PLUGIN_INSTALL_MIN_VERSION: + return f"Found uv {uv_version}, but plugin package installs require uv >= {UV_PLUGIN_INSTALL_MIN_VERSION}" + return None + + +def _parse_uv_version(output: str) -> Version | None: + for token in output.split(): + try: + return Version(token) + except InvalidVersion: + continue + return None + + +def _find_nearest_pyproject_root(working_dir: Path) -> Path | None: + resolved_working_dir = working_dir.resolve() + for candidate in (resolved_working_dir, *resolved_working_dir.parents): + if (candidate / "pyproject.toml").is_file(): + return candidate + return None + + +def _is_data_designer_source_project(project_root: Path) -> bool: + pyproject_data = _load_pyproject_data(project_root / "pyproject.toml") + project = pyproject_data.get("project", {}) + if isinstance(project, dict): + project_name = project.get("name") + if isinstance(project_name, str) and canonicalize_name(project_name) in DATA_DESIGNER_PROJECT_NAMES: + return True + + try: + relative_source_file = Path(__file__).resolve().relative_to(project_root.resolve()) + except (OSError, ValueError): + return False + source_parts = 
relative_source_file.parts + return source_parts[:3] == ("packages", "data-designer", "src") or source_parts[:2] == ("src", "data_designer") + + +def _load_pyproject_data(pyproject_path: Path) -> dict[str, Any]: + try: + text = pyproject_path.read_text(encoding="utf-8") + except OSError: + return {} + + if tomllib is not None: + try: + data = tomllib.loads(text) + except tomllib.TOMLDecodeError: + return {} + return data if isinstance(data, dict) else {} + + return _load_pyproject_markers_without_tomllib(text) + + +def _load_pyproject_markers_without_tomllib(text: str) -> dict[str, Any]: + # Python 3.10 only needs a deliberately lossy fallback: detect simple + # [project] name markers from this repo's pyprojects, not parse TOML. + project: dict[str, Any] = {} + section = "" + + for raw_line in text.splitlines(): + line = raw_line.split("#", maxsplit=1)[0].strip() + if not line: + continue + if line.startswith("[") and line.endswith("]"): + section = line.strip("[]").strip() + continue + if "=" not in line: + continue + + key, raw_value = (part.strip() for part in line.split("=", maxsplit=1)) + if section == "project" and key == "name": + project["name"] = _parse_simple_toml_value(raw_value) + + data: dict[str, Any] = {} + if project: + data["project"] = project + return data + + +def _parse_simple_toml_value(raw_value: str) -> str | None: + value = raw_value.strip() + if (value.startswith('"') and value.endswith('"')) or (value.startswith("'") and value.endswith("'")): + return value[1:-1] + return None + + +def _installed_data_designer_distribution_versions() -> dict[str, str]: + versions: dict[str, str] = {} + for distribution_name in DATA_DESIGNER_DISTRIBUTION_NAMES: + try: + version = importlib.metadata.version(distribution_name) + except importlib.metadata.PackageNotFoundError as e: + raise ValueError( + f"Unable to resolve installed {distribution_name!r} version; " + "plugin package installs require the Data Designer package family to be installed first." 
+ ) from e + + try: + Version(version) + except InvalidVersion as e: + raise ValueError( + f"Installed {distribution_name!r} version {version!r} is not a valid package version; " + "cannot protect the current Data Designer installation during plugin package install." + ) from e + versions[distribution_name] = version + return versions + + +def _data_designer_protection_args( + mode: str, + versions: dict[str, str], +) -> tuple[list[str], str, str | None, InstallCommandTemporaryFile | None]: + data_designer_version = versions[DATA_DESIGNER_DISTRIBUTION_NAME] + if mode == "uv-environment": + return ( + ["--excludes", "-"], + f"using installed {DATA_DESIGNER_DISTRIBUTION_NAME} {data_designer_version}; " + "uv will not resolve Data Designer packages", + "".join(f"{distribution_name}\n" for distribution_name in DATA_DESIGNER_DISTRIBUTION_NAMES), + None, + ) + + if mode == "uv-project": + return ( + [ + *[ + item + for distribution_name in DATA_DESIGNER_DISTRIBUTION_NAMES + for item in ("--no-install-package", distribution_name) + ], + ], + f"using installed {DATA_DESIGNER_DISTRIBUTION_NAME} {data_designer_version}; " + "uv will not install Data Designer packages", + None, + None, + ) + + return ( + ["--constraint", DATA_DESIGNER_CONSTRAINT_PLACEHOLDER], + f"pinned installed Data Designer packages; {DATA_DESIGNER_DISTRIBUTION_NAME} {data_designer_version}", + None, + _data_designer_constraint_file(versions), + ) + + +def _data_designer_constraint_file(versions: dict[str, str]) -> InstallCommandTemporaryFile: + constraints = "\n".join( + f"{distribution_name}=={versions[distribution_name]}" for distribution_name in DATA_DESIGNER_DISTRIBUTION_NAMES + ) + return InstallCommandTemporaryFile( + placeholder=DATA_DESIGNER_CONSTRAINT_PLACEHOLDER, + filename=PIP_DATA_DESIGNER_CONSTRAINT_FILE_NAME, + content=f"# Data Designer is provided by the active CLI environment.\n{constraints}\n", + ) + + +@contextmanager +def _materialized_install_command(plan: InstallPlan) -> 
Iterator[tuple[list[str], str | None]]: + temporary_file = plan.temporary_file + if temporary_file is None: + yield plan.command, plan.command_stdin + return + + with tempfile.TemporaryDirectory(prefix="data-designer-plugin-install-") as temp_dir: + temporary_path = Path(temp_dir) / temporary_file.filename + temporary_path.write_text(temporary_file.content, encoding="utf-8") + command = [str(temporary_path) if part == temporary_file.placeholder else part for part in plan.command] + yield command, plan.command_stdin + + +def _install_args_for_entry(entry: PluginCatalogEntry, target: _InstallTarget) -> tuple[list[str], str, str | None]: + requirement = entry.install.requirement + index_url = entry.install.index_url + if target.mode == "uv-project": + args = ["--raw"] if index_url is None and _requirement_is_direct_reference(requirement) else [] + if index_url is not None: + args.extend(["--index", index_url]) + args.append(requirement) + return args, _source_description(requirement, index_url), None + + if index_url is None: + return [requirement], requirement, None + + if target.manager == "uv": + return ( + ["--default-index", PYPI_SIMPLE_INDEX_URL, "--index", index_url, requirement], + f"{requirement} via {index_url}", + None, + ) + return ( + ["--extra-index-url", index_url, requirement], + f"{requirement} via {index_url}", + PIP_EXTRA_INDEX_SOURCE_WARNING, + ) + + +def _source_description(requirement: str, index_url: str | None) -> str: + if index_url is None: + return requirement + return f"{requirement} via {index_url}" + + +def _combine_warnings(*warnings: str | None) -> str | None: + active_warnings = [warning for warning in warnings if warning] + if not active_warnings: + return None + return "\n".join(active_warnings) + + +def _requirement_is_direct_reference(requirement: str) -> bool: + try: + return Requirement(requirement).url is not None + except InvalidRequirement: + return " @ " in requirement diff --git 
a/packages/data-designer/tests/cli/commands/test_plugin_command.py b/packages/data-designer/tests/cli/commands/test_plugin_command.py new file mode 100644 index 000000000..19ac5102b --- /dev/null +++ b/packages/data-designer/tests/cli/commands/test_plugin_command.py @@ -0,0 +1,173 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +from unittest.mock import MagicMock, patch + +from typer.testing import CliRunner + +from data_designer.cli.main import app + +runner = CliRunner() + + +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_list_command_delegates_to_controller(mock_ctrl_cls: MagicMock) -> None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke(app, ["plugin", "--catalog", "research", "list", "--refresh", "--include-incompatible"]) + + assert result.exit_code == 0 + mock_ctrl.run_list.assert_called_once_with( + catalog_alias="research", + refresh=True, + include_incompatible=True, + ) + + +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_search_command_delegates_to_controller(mock_ctrl_cls: MagicMock) -> None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke(app, ["plugin", "search", "github", "--catalog", "research"]) + + assert result.exit_code == 0 + mock_ctrl.run_search.assert_called_once_with( + "github", + catalog_alias="research", + refresh=False, + include_incompatible=False, + ) + + +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_install_command_delegates_to_controller(mock_ctrl_cls: MagicMock) -> None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke( + app, + ["plugin", "install", "data-designer-text-transform", "--manager", "pip", "--yes", "--dry-run"], + ) + + assert 
result.exit_code == 0 + mock_ctrl.run_install.assert_called_once_with( + "data-designer-text-transform", + catalog_alias=None, + refresh=False, + manager="pip", + yes=True, + dry_run=True, + ) + + +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_uninstall_command_delegates_to_controller(mock_ctrl_cls: MagicMock) -> None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke( + app, + ["plugin", "uninstall", "data-designer-text-transform", "--manager", "pip", "--yes", "--dry-run"], + ) + + assert result.exit_code == 0 + mock_ctrl.run_uninstall.assert_called_once_with( + "data-designer-text-transform", + catalog_alias=None, + refresh=False, + manager="pip", + yes=True, + dry_run=True, + ) + + +def test_plugin_info_help_uses_package_argument() -> None: + result = runner.invoke(app, ["plugin", "info", "--help"]) + + assert result.exit_code == 0 + assert "PACKAGE" in result.output + assert "Plugin package name or package alias" in result.output + assert "runtime plugin name" not in result.output + + +def test_plugin_install_help_uses_package_first_wording() -> None: + result = runner.invoke(app, ["plugin", "install", "--help"]) + + assert result.exit_code == 0 + assert "PACKAGE" in result.output + assert "Plugin package name or package alias" in result.output + assert "runtime plugin name" not in result.output + assert "Print the install plan" in result.output + + +def test_plugin_uninstall_help_uses_package_first_wording() -> None: + result = runner.invoke(app, ["plugin", "uninstall", "--help"]) + + assert result.exit_code == 0 + assert "PACKAGE" in result.output + assert "Plugin package name or package alias" in result.output + assert "runtime plugin name" not in result.output + assert "Print the uninstall plan" in result.output + + +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_catalog_add_command_delegates_to_controller(mock_ctrl_cls: MagicMock) -> 
None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke( + app, + [ + "plugin", + "catalog", + "add", + "research", + "https://github.com/acme/dd-plugins", + ], + ) + + assert result.exit_code == 0 + mock_ctrl.run_catalog_add.assert_called_once_with( + alias="research", + url="https://github.com/acme/dd-plugins", + ) + + +@patch("data_designer.cli.commands.plugin.print_info") +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_installed_warns_when_parent_catalog_is_unused( + mock_ctrl_cls: MagicMock, + mock_print_info: MagicMock, +) -> None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke(app, ["plugin", "--catalog", "research", "installed"]) + + assert result.exit_code == 0 + mock_print_info.assert_called_once_with( + "Ignoring --catalog 'research'; installed runtime plugins are discovered from the current Python environment." + ) + mock_ctrl.run_installed.assert_called_once_with() + + +@patch("data_designer.cli.commands.plugin.print_info") +@patch("data_designer.cli.commands.plugin.PluginCatalogController") +def test_plugin_catalog_list_warns_when_parent_catalog_is_unused( + mock_ctrl_cls: MagicMock, + mock_print_info: MagicMock, +) -> None: + mock_ctrl = MagicMock() + mock_ctrl_cls.return_value = mock_ctrl + + result = runner.invoke(app, ["plugin", "--catalog", "research", "catalog", "list"]) + + assert result.exit_code == 0 + mock_print_info.assert_called_once_with( + "Ignoring --catalog 'research'; catalog management commands operate on aliases directly." 
+ ) + mock_ctrl.run_catalog_list.assert_called_once_with() diff --git a/packages/data-designer/tests/cli/controllers/test_plugin_catalog_controller.py b/packages/data-designer/tests/cli/controllers/test_plugin_catalog_controller.py new file mode 100644 index 000000000..a21e0185f --- /dev/null +++ b/packages/data-designer/tests/cli/controllers/test_plugin_catalog_controller.py @@ -0,0 +1,682 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +from io import StringIO +from pathlib import Path +from unittest.mock import MagicMock, call, patch + +import pytest +import typer +from rich.console import Console +from rich.table import Table +from rich.text import Text + +from data_designer.cli.controllers.plugin_catalog_controller import PluginCatalogController +from data_designer.cli.plugin_catalog import ( + CompatibilityResult, + InstallPlan, + PluginCatalogConfig, + PluginCatalogEntry, + PluginCatalogError, + UninstallPlan, +) + + +@pytest.fixture +def controller(tmp_path: Path) -> PluginCatalogController: + plugin_controller = PluginCatalogController(tmp_path) + plugin_controller.catalog_service = MagicMock() + plugin_controller.install_service = MagicMock() + return plugin_controller + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_info") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_list_mentions_hidden_incompatible_packages_when_visible_list_is_empty( + mock_print_warning: MagicMock, + mock_print_info: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.list_entries.side_effect = [[], [entry]] + + 
controller.run_list(catalog_alias="local") + + assert controller.catalog_service.list_entries.call_args_list == [ + call("local", refresh=False, include_incompatible=False), + call("local", refresh=False, include_incompatible=True), + ] + mock_print_warning.assert_called_once_with("No compatible plugin packages found") + mock_print_info.assert_any_call( + "Incompatible catalog packages are hidden. Use --include-incompatible to show them." + ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_info") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_search_mentions_hidden_incompatible_packages_when_visible_matches_are_empty( + mock_print_warning: MagicMock, + mock_print_info: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.search_entries.side_effect = [[], [entry]] + + controller.run_search("text", catalog_alias="local") + + assert controller.catalog_service.search_entries.call_args_list == [ + call("text", "local", refresh=False, include_incompatible=False), + call("text", "local", refresh=False, include_incompatible=True), + ] + mock_print_warning.assert_called_once_with("No compatible plugin packages matched") + mock_print_info.assert_any_call( + "Matching incompatible catalog packages are hidden. Use --include-incompatible to show them." 
+ ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +def test_run_list_renders_package_first_catalog_table( + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + package_entries = [ + _entry(name="text-column", plugin_type="column-generator"), + _entry(name="text-processor", plugin_type="processor"), + ] + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.list_entries.return_value = package_entries + controller.catalog_service.group_entries_by_package.return_value = { + "data-designer-text-transform": package_entries, + } + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + + controller.run_list(catalog_alias="local", include_incompatible=True) + + printed_tables = [ + call.args[0] for call in mock_console.print.call_args_list if call.args and isinstance(call.args[0], Table) + ] + assert printed_tables + assert printed_tables[0].title == "Catalog Plugin Packages" + assert [column.header for column in printed_tables[0].columns] == [ + "Package", + "Description", + "Runtime Plugins", + "Compatible", + "Docs", + ] + assert list(printed_tables[0].columns[1].cells) == ["Transform text records"] + docs_cell = list(printed_tables[0].columns[4].cells)[0] + assert isinstance(docs_cell, Text) + assert docs_cell.plain == "docs" + assert docs_cell.style is not None + assert docs_cell.style.link == "https://docs.example.test/plugins/data-designer-text-transform/" + + rendered_output = StringIO() + narrow_console = Console( + file=rendered_output, + force_terminal=True, + color_system="standard", + width=60, + legacy_windows=False, + ) + narrow_console.print(printed_tables[0]) + assert "https://docs.example.test/plugins/data-designer-text-transform/" in rendered_output.getvalue() + 
controller.catalog_service.group_entries_by_package.assert_called_once_with(package_entries) + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.display_config_preview") +def test_run_info_renders_package_metadata_with_nested_runtime_plugins( + mock_display_config_preview: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + package_entries = [ + _entry(name="text-column", plugin_type="column-generator"), + _entry(name="text-processor", plugin_type="processor"), + ] + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = package_entries + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + controller.install_service.build_install_plan.return_value = _plan(catalog) + + controller.run_info("text-transform", catalog_alias="local") + + metadata = mock_display_config_preview.call_args.args[0] + assert metadata["package"] == { + "name": "data-designer-text-transform", + "description": "Transform text records", + } + assert metadata["install"] == { + "requirement": "data-designer-text-transform", + "index_url": "https://docs.example.test/simple/", + } + assert metadata["plugins"] == [ + { + "name": "text-column", + "plugin_type": "column-generator", + "entry_point": { + "group": "data_designer.plugins", + "name": "text-column", + "value": "data_designer_text_transform.plugin:plugin", + }, + }, + { + "name": "text-processor", + "plugin_type": "processor", + "entry_point": { + "group": "data_designer.plugins", + "name": "text-processor", + "value": "data_designer_text_transform.plugin:plugin", + }, + }, + ] + assert all("package" not in plugin for plugin in metadata["plugins"]) + assert all("install" not in plugin for plugin in metadata["plugins"]) + assert all("compatibility" not in plugin for plugin in 
metadata["plugins"]) + assert all("docs" not in plugin for plugin in metadata["plugins"]) + controller.catalog_service.get_package_entries.assert_called_once_with( + "text-transform", + "local", + refresh=False, + include_incompatible=True, + ) + mock_display_config_preview.assert_called_once() + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.display_config_preview") +def test_run_info_warns_when_install_plan_has_source_warning( + mock_display_config_preview: MagicMock, + mock_console: MagicMock, + mock_print_warning: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + controller.install_service.build_install_plan.return_value = _plan( + catalog, + source_warning="pip source warning", + ) + + controller.run_info("text-transform", catalog_alias="local") + + mock_print_warning.assert_called_once_with("pip source warning") + mock_display_config_preview.assert_called_once() + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +def test_run_info_rejects_runtime_plugin_name_that_is_not_package_alias( + mock_print_error: MagicMock, + controller: PluginCatalogController, +) -> None: + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [] + + with pytest.raises(typer.Exit) as exc_info: + controller.run_info("text-column", catalog_alias="local") + + assert exc_info.value.exit_code == 1 + 
controller.catalog_service.get_package_entries.assert_called_once_with( + "text-column", + "local", + refresh=False, + include_incompatible=True, + ) + mock_print_error.assert_called_once_with("Plugin package or alias 'text-column' was not found in catalog 'local'") + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_info") +def test_run_install_dry_run_renders_plan_without_installing( + mock_print_info: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + plan = _plan(catalog, data_designer_protection="pinned installed Data Designer packages; data-designer 0.5.10") + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + controller.install_service.build_install_plan.return_value = plan + + controller.run_install("data-designer-text-transform", catalog_alias="local", dry_run=True) + + controller.catalog_service.get_package_entries.assert_called_once_with( + "data-designer-text-transform", + "local", + refresh=False, + include_incompatible=True, + ) + controller.install_service.install.assert_not_called() + controller.install_service.verify_entry_points.assert_not_called() + mock_print_info.assert_any_call("Dry run complete; no changes made") + mock_console.print.assert_any_call( + " Data Designer: [bold]pinned installed Data Designer packages; data-designer 0.5.10[/bold]" + ) + assert all("Runtime plugins" not in str(call_args.args[0]) for call_args in mock_console.print.call_args_list) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +def 
test_run_install_blocks_incompatible_package( + mock_print_error: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult( + False, + ["Data Designer 0.5.7 does not satisfy >=99.0"], + ) + + with pytest.raises(typer.Exit) as exc_info: + controller.run_install("data-designer-text-transform", catalog_alias="local") + + assert exc_info.value.exit_code == 1 + controller.catalog_service.get_package_entries.assert_called_once_with( + "data-designer-text-transform", + "local", + refresh=False, + include_incompatible=True, + ) + controller.install_service.build_install_plan.assert_not_called() + mock_print_error.assert_called_once_with( + "Plugin package 'data-designer-text-transform' is not compatible with this environment" + ) + mock_console.print.assert_any_call(" - Data Designer 0.5.7 does not satisfy >=99.0") + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +def test_run_install_rejects_runtime_plugin_name_as_target( + mock_print_error: MagicMock, + controller: PluginCatalogController, +) -> None: + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [] + + with pytest.raises(typer.Exit) as exc_info: + controller.run_install("text-column", catalog_alias="local") + + assert exc_info.value.exit_code == 1 + controller.catalog_service.get_package_entries.assert_called_once_with( + "text-column", + "local", + refresh=False, + include_incompatible=True, + ) + controller.install_service.build_install_plan.assert_not_called() + mock_print_error.assert_called_once_with("Plugin package or alias 'text-column' was not found in catalog 'local'") + + 
+@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_install_dry_run_renders_incompatible_plan_and_block_message( + mock_print_warning: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult( + False, + ["Data Designer 0.5.7 does not satisfy >=99.0"], + ) + controller.install_service.build_install_plan.return_value = _plan(catalog) + + controller.run_install("data-designer-text-transform", catalog_alias="local", dry_run=True) + + controller.install_service.build_install_plan.assert_called_once_with(entry, catalog, manager="auto") + controller.install_service.install.assert_not_called() + controller.install_service.verify_entry_points.assert_not_called() + mock_console.print.assert_any_call(" Command: [bold]python -m pip install data-designer-text-transform[/bold]") + mock_console.print.assert_any_call(" Compatibility: [bold yellow]not compatible[/bold yellow]") + mock_console.print.assert_any_call(" - Data Designer 0.5.7 does not satisfy >=99.0") + mock_print_warning.assert_called_once_with( + "Dry run complete; no changes made. A real install would be blocked because compatibility checks failed." 
+ ) + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_install_dry_run_allows_incompatible_entry_for_inspection( + mock_print_warning: MagicMock, + mock_print_error: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult( + False, + ["Data Designer 0.5.7 does not satisfy >=99.0"], + ) + controller.install_service.build_install_plan.return_value = _plan(catalog) + + controller.run_install("data-designer-text-transform", catalog_alias="local", dry_run=True) + + controller.catalog_service.get_package_entries.assert_called_once_with( + "data-designer-text-transform", + "local", + refresh=False, + include_incompatible=True, + ) + controller.install_service.build_install_plan.assert_called_once_with(entry, catalog, manager="auto") + controller.install_service.install.assert_not_called() + mock_print_error.assert_not_called() + mock_print_warning.assert_called_once_with( + "Dry run complete; no changes made. A real install would be blocked because compatibility checks failed." 
+ ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_install_warns_when_install_plan_has_source_warning( + mock_print_warning: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + controller.install_service.build_install_plan.return_value = _plan( + catalog, + source_warning="pip source warning", + ) + + controller.run_install("data-designer-text-transform", catalog_alias="local", dry_run=True) + + mock_print_warning.assert_called_once_with("pip source warning") + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_success") +def test_run_install_reports_success_when_verification_finds_entry_point( + mock_print_success: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + plan = _plan(catalog) + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + controller.install_service.build_install_plan.return_value = plan + controller.install_service.verify_entry_points.return_value = True + + controller.run_install("data-designer-text-transform", catalog_alias="local", yes=True) + + controller.install_service.install.assert_called_once_with(plan) + 
controller.install_service.verify_entry_points.assert_called_once_with([entry]) + mock_print_success.assert_called_once_with( + "Plugin package 'data-designer-text-transform' installed and runtime entry points verified" + ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_install_warns_when_verification_misses_entry_point( + mock_print_warning: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + plan = _plan(catalog) + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.catalog_service.evaluate_compatibility.return_value = CompatibilityResult(True, []) + controller.install_service.build_install_plan.return_value = plan + controller.install_service.verify_entry_points.return_value = False + + controller.run_install("data-designer-text-transform", catalog_alias="local", yes=True) + + controller.install_service.install.assert_called_once_with(plan) + controller.install_service.verify_entry_points.assert_called_once_with([entry]) + mock_print_warning.assert_called_once_with( + "Plugin package 'data-designer-text-transform' was installed, but Data Designer did not discover every " + "declared runtime entry point. Restart the shell or check the package entry point metadata." 
+ ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_info") +def test_run_uninstall_dry_run_renders_plan_without_uninstalling( + mock_print_info: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + plan = _uninstall_plan(catalog) + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.install_service.build_uninstall_plan.return_value = plan + + controller.run_uninstall("data-designer-text-transform", catalog_alias="local", dry_run=True) + + controller.catalog_service.get_package_entries.assert_called_once_with( + "data-designer-text-transform", + "local", + refresh=False, + include_incompatible=True, + ) + controller.install_service.build_uninstall_plan.assert_called_once_with(entry, catalog, manager="auto") + controller.install_service.uninstall.assert_not_called() + controller.install_service.verify_entry_points_removed.assert_not_called() + mock_console.print.assert_any_call( + " Command: [bold]python -m pip uninstall --yes data-designer-text-transform[/bold]" + ) + assert all("Runtime plugins" not in str(call_args.args[0]) for call_args in mock_console.print.call_args_list) + mock_print_info.assert_any_call("Dry run complete; no changes made") + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +def test_run_uninstall_wraps_plan_error( + mock_print_error: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.install_service.build_uninstall_plan.side_effect = ValueError("uv was requested") + + with pytest.raises(typer.Exit) 
as exc_info: + controller.run_uninstall("data-designer-text-transform", catalog_alias="local") + + assert exc_info.value.exit_code == 1 + controller.install_service.uninstall.assert_not_called() + mock_print_error.assert_called_once_with("Failed to build plugin uninstall plan: uv was requested") + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_success") +def test_run_uninstall_reports_success_when_entry_points_are_removed( + mock_print_success: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + plan = _uninstall_plan(catalog) + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + controller.install_service.build_uninstall_plan.return_value = plan + controller.install_service.verify_entry_points_removed.return_value = True + + controller.run_uninstall("data-designer-text-transform", catalog_alias="local", yes=True) + + controller.install_service.uninstall.assert_called_once_with(plan) + controller.install_service.verify_entry_points_removed.assert_called_once_with([entry]) + mock_print_success.assert_called_once_with( + "Plugin package 'data-designer-text-transform' uninstalled and runtime entry points removed" + ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.console") +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_warning") +def test_run_uninstall_warns_when_entry_points_remain( + mock_print_warning: MagicMock, + mock_console: MagicMock, + controller: PluginCatalogController, +) -> None: + entry = _entry() + catalog = _catalog() + plan = _uninstall_plan(catalog) + controller.catalog_service.get_catalog.return_value = catalog + controller.catalog_service.get_package_entries.return_value = [entry] + 
controller.install_service.build_uninstall_plan.return_value = plan + controller.install_service.verify_entry_points_removed.return_value = False + + controller.run_uninstall("data-designer-text-transform", catalog_alias="local", yes=True) + + controller.install_service.uninstall.assert_called_once_with(plan) + controller.install_service.verify_entry_points_removed.assert_called_once_with([entry]) + mock_print_warning.assert_called_once_with( + "Plugin package 'data-designer-text-transform' was uninstalled, but Data Designer still discovers one or " + "more declared runtime entry points. Restart the shell or check the package environment." + ) + assert mock_console.print.call_count >= 1 + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +def test_run_catalog_add_wraps_invalid_alias_validation_error( + mock_print_error: MagicMock, + tmp_path: Path, +) -> None: + plugin_controller = PluginCatalogController(tmp_path) + + with pytest.raises(typer.Exit) as exc_info: + plugin_controller.run_catalog_add( + alias="foo/bar", + url="https://github.com/acme/dd-plugins", + ) + + assert exc_info.value.exit_code == 1 + mock_print_error.assert_called_once_with("Invalid catalog alias 'foo/bar': must match `^[A-Za-z0-9_.-]+$`") + + +@patch("data_designer.cli.controllers.plugin_catalog_controller.print_error") +def test_run_catalog_list_wraps_registry_load_error( + mock_print_error: MagicMock, + controller: PluginCatalogController, +) -> None: + controller.catalog_service.list_catalogs.side_effect = PluginCatalogError("bad registry") + + with pytest.raises(typer.Exit) as exc_info: + controller.run_catalog_list() + + assert exc_info.value.exit_code == 1 + mock_print_error.assert_called_once_with("Failed to list plugin catalogs: bad registry") + + +def _catalog() -> PluginCatalogConfig: + return PluginCatalogConfig( + alias="local", + url="https://raw.githubusercontent.com/acme/dd-plugins/main/catalog/plugins.json", + ) + + +def _plan( + catalog: 
PluginCatalogConfig, + *, + source_warning: str | None = None, + data_designer_protection: str | None = None, +) -> InstallPlan: + return InstallPlan( + package_name="data-designer-text-transform", + source_description="data-designer-text-transform", + command=["python", "-m", "pip", "install", "data-designer-text-transform"], + manager="pip", + catalog_alias=catalog.alias, + source_warning=source_warning, + data_designer_protection=data_designer_protection, + ) + + +def _uninstall_plan(catalog: PluginCatalogConfig) -> UninstallPlan: + return UninstallPlan( + package_name="data-designer-text-transform", + command=["python", "-m", "pip", "uninstall", "--yes", "data-designer-text-transform"], + manager="pip", + catalog_alias=catalog.alias, + ) + + +def _entry( + *, + name: str = "text-transform", + plugin_type: str = "processor", + package_name: str = "data-designer-text-transform", +) -> PluginCatalogEntry: + return PluginCatalogEntry.model_validate( + { + "name": name, + "plugin_type": plugin_type, + "description": "Transform text records", + "package": { + "name": package_name, + }, + "install": { + "requirement": package_name, + "index_url": "https://docs.example.test/simple/", + }, + "entry_point": { + "group": "data_designer.plugins", + "name": name, + "value": "data_designer_text_transform.plugin:plugin", + }, + "compatibility": { + "python": {"specifier": ">=3.10"}, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": None, + }, + }, + "docs": { + "url": f"https://docs.example.test/plugins/{package_name}/", + }, + } + ) diff --git a/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-invalid-install.json b/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-invalid-install.json new file mode 100644 index 000000000..4c6065a3f --- /dev/null +++ b/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-invalid-install.json @@ -0,0 +1,36 @@ +{ + "schema_version": 2, + 
"packages": [ + { + "name": "data-designer-invalid-install", + "description": "Invalid install fixture", + "install": { + "requirement": "other-package==0.1.0" + }, + "compatibility": { + "python": { + "specifier": ">=3.10" + }, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-invalid-install/" + }, + "plugins": [ + { + "name": "invalid-install", + "plugin_type": "column-generator", + "entry_point": { + "group": "data_designer.plugins", + "name": "invalid-install", + "value": "data_designer_invalid_install.plugin:plugin" + } + } + ] + } + ] +} diff --git a/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-unsupported-version.json b/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-unsupported-version.json new file mode 100644 index 000000000..8603f09a8 --- /dev/null +++ b/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-unsupported-version.json @@ -0,0 +1,4 @@ +{ + "schema_version": 999, + "packages": [] +} diff --git a/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-valid.json b/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-valid.json new file mode 100644 index 000000000..e295ccffb --- /dev/null +++ b/packages/data-designer/tests/cli/fixtures/upstream-catalogs/catalog-valid.json @@ -0,0 +1,204 @@ +{ + "schema_version": 2, + "packages": [ + { + "name": "data-designer-compatible-column", + "description": "Compatible index-backed column generator fixture", + "install": { + "requirement": "data-designer-compatible-column", + "index_url": "https://docs.example.test/simple/" + }, + "compatibility": { + "python": { + "specifier": ">=3.10" + }, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-compatible-column/" 
+ }, + "plugins": [ + { + "name": "compatible-column", + "plugin_type": "column-generator", + "entry_point": { + "group": "data_designer.plugins", + "name": "compatible-column", + "value": "data_designer_compatible_column.plugin:plugin" + } + } + ] + }, + { + "name": "data-designer-git-seed-reader", + "description": "Compatible Git direct reference seed reader fixture", + "install": { + "requirement": "data-designer-git-seed-reader @ git+https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git@data-designer-git-seed-reader/v0.2.0#subdirectory=plugins/data-designer-git-seed-reader" + }, + "compatibility": { + "python": { + "specifier": ">=3.10" + }, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-git-seed-reader/" + }, + "plugins": [ + { + "name": "compatible-git-seed-reader", + "plugin_type": "seed-reader", + "entry_point": { + "group": "data_designer.plugins", + "name": "compatible-git-seed-reader", + "value": "data_designer_git_seed_reader.plugin:plugin" + } + } + ] + }, + { + "name": "data-designer-url-processor", + "description": "Compatible direct URL processor fixture", + "install": { + "requirement": "data-designer-url-processor @ https://packages.example.test/data_designer_url_processor-0.2.1-py3-none-any.whl" + }, + "compatibility": { + "python": { + "specifier": ">=3.10" + }, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-url-processor/" + }, + "plugins": [ + { + "name": "compatible-url-processor", + "plugin_type": "processor", + "entry_point": { + "group": "data_designer.plugins", + "name": "compatible-url-processor", + "value": "data_designer_url_processor.plugin:plugin" + } + } + ] + }, + { + "name": "data-designer-python312-column", + "description": "Python compatibility rejection 
fixture", + "install": { + "requirement": "data-designer-python312-column", + "index_url": "https://docs.example.test/simple/" + }, + "compatibility": { + "python": { + "specifier": ">=3.12" + }, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-python312-column/" + }, + "plugins": [ + { + "name": "python312-column", + "plugin_type": "column-generator", + "entry_point": { + "group": "data_designer.plugins", + "name": "python312-column", + "value": "data_designer_python312_column.plugin:plugin" + } + } + ] + }, + { + "name": "data-designer-future-dd-processor", + "description": "Data Designer compatibility rejection fixture", + "install": { + "requirement": "data-designer-future-dd-processor", + "index_url": "https://docs.example.test/simple/" + }, + "compatibility": { + "python": { + "specifier": ">=3.10" + }, + "data_designer": { + "requirement": "data-designer>=999.0", + "specifier": ">=999.0", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-future-dd-processor/" + }, + "plugins": [ + { + "name": "future-dd-processor", + "plugin_type": "processor", + "entry_point": { + "group": "data_designer.plugins", + "name": "future-dd-processor", + "value": "data_designer_future_dd_processor.plugin:plugin" + } + } + ] + }, + { + "name": "data-designer-multi-plugin-package", + "description": "Multi-plugin package fixture", + "install": { + "requirement": "data-designer-multi-plugin-package", + "index_url": "https://docs.example.test/simple/" + }, + "compatibility": { + "python": { + "specifier": ">=3.10" + }, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": null + } + }, + "docs": { + "url": "https://docs.example.test/plugins/data-designer-multi-plugin-package/" + }, + "plugins": [ + { + "name": "multi-seed-reader", + "plugin_type": "seed-reader", 
+ "entry_point": { + "group": "data_designer.plugins", + "name": "multi-seed-reader", + "value": "data_designer_multi_plugin_package.seed:plugin" + } + }, + { + "name": "multi-processor", + "plugin_type": "processor", + "entry_point": { + "group": "data_designer.plugins", + "name": "multi-processor", + "value": "data_designer_multi_plugin_package.processor:plugin" + } + } + ] + } + ] +} diff --git a/packages/data-designer/tests/cli/repositories/test_plugin_catalog_repository.py b/packages/data-designer/tests/cli/repositories/test_plugin_catalog_repository.py new file mode 100644 index 000000000..86dc7898f --- /dev/null +++ b/packages/data-designer/tests/cli/repositories/test_plugin_catalog_repository.py @@ -0,0 +1,571 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +import json +import os +from pathlib import Path +from unittest.mock import Mock, patch +from urllib.error import HTTPError + +import pytest +from pydantic import ValidationError + +from data_designer.cli.plugin_catalog import ( + DEFAULT_PLUGIN_CATALOG_ALIAS, + DEFAULT_PLUGIN_CATALOG_URL_ENV_VAR, + MAX_PLUGIN_CATALOG_SIZE_BYTES, + PluginCatalog, + PluginCatalogError, +) +from data_designer.cli.repositories.plugin_catalog_repository import PluginCatalogRepository, normalize_catalog_location + +UPSTREAM_CATALOG_FIXTURES_DIR = Path(__file__).parents[1] / "fixtures" / "upstream-catalogs" + + +def test_repository_includes_default_nvidia_catalog(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + + catalogs = repository.list_catalogs() + + assert [catalog.alias for catalog in catalogs] == [DEFAULT_PLUGIN_CATALOG_ALIAS] + + +def test_default_catalog_honors_url_environment_override(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv(DEFAULT_PLUGIN_CATALOG_URL_ENV_VAR, "https://example.test/catalog/plugins.json") + repository = 
PluginCatalogRepository(tmp_path) + + catalog = repository.default_catalog() + + assert catalog.url == "https://example.test/catalog/plugins.json" + + +def test_add_catalog_normalizes_github_repository_url(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + + catalog = repository.add_catalog("research", "https://github.com/acme/dd-plugins") + + assert catalog.url == "https://raw.githubusercontent.com/acme/dd-plugins/main/catalog/plugins.json" + assert repository.get_catalog("research") == catalog + + +def test_add_catalog_persists_only_public_catalog_fields(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + + repository.add_catalog("research", "https://github.com/acme/dd-plugins") + + saved_registry = repository.config_file.read_text() + assert "alias: research" in saved_registry + assert "url: https://raw.githubusercontent.com/acme/dd-plugins/main/catalog/plugins.json" in saved_registry + assert "cache_ttl_seconds" not in saved_registry + + +def test_add_catalog_normalizes_github_tree_url_with_subdirectory(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + + catalog = repository.add_catalog("research", "https://github.com/acme/dd-plugins/tree/main/custom-catalog") + + assert catalog.url == "https://raw.githubusercontent.com/acme/dd-plugins/main/custom-catalog/catalog/plugins.json" + + +def test_add_catalog_normalizes_github_tree_url_ending_with_catalog(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + + catalog = repository.add_catalog("research", "https://github.com/acme/dd-plugins/tree/main/catalog") + + assert catalog.url == "https://raw.githubusercontent.com/acme/dd-plugins/main/catalog/plugins.json" + + +def test_catalog_aliases_are_case_insensitive(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + + catalog = repository.add_catalog("Research", "https://github.com/acme/dd-plugins") + + assert repository.get_catalog("research") == 
catalog + with pytest.raises(ValueError, match="already exists"): + repository.add_catalog("research", "https://github.com/acme/other-plugins") + with pytest.raises(ValueError, match="already exists"): + repository.add_catalog("NVIDIA", "https://github.com/acme/nvidia-plugins") + + repository.remove_catalog("research") + + assert repository.get_catalog("Research") is None + + +def test_normalize_local_catalog_directory() -> None: + normalized = normalize_catalog_location("~/plugins") + + assert normalized.endswith("/plugins/catalog/plugins.json") + + +def test_normalize_local_catalog_directory_ending_with_catalog(tmp_path: Path) -> None: + normalized = normalize_catalog_location(str(tmp_path / "plugins" / "catalog")) + + assert normalized == str((tmp_path / "plugins" / "catalog" / "plugins.json").resolve(strict=False)) + + +def test_load_invalid_catalog_registry_raises_user_facing_error(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + repository.config_file.write_text("catalogs:\n- alias: research\n") + + with pytest.raises(PluginCatalogError, match="Failed to load plugin catalog registry"): + repository.load() + + +def test_add_catalog_does_not_replace_invalid_catalog_registry(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + saved_registry = "catalogs:\n- alias: research\n" + repository.config_file.write_text(saved_registry) + + with pytest.raises(PluginCatalogError, match="Failed to load plugin catalog registry"): + repository.add_catalog("local", "https://github.com/acme/dd-plugins") + + assert repository.config_file.read_text() == saved_registry + + +def test_load_catalog_uses_cache_when_source_is_unavailable(tmp_path: Path) -> None: + catalog_path = _write_catalog(tmp_path) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + first_catalog = repository.load_catalog("local") + catalog_path.unlink() + cached_catalog = repository.load_catalog("local") + + 
assert first_catalog.plugins[0].name == "text-transform" + assert cached_catalog.plugins[0].name == "text-transform" + + +def test_load_catalog_falls_back_to_stale_cache_when_refresh_fetch_fails(tmp_path: Path) -> None: + catalog_path = _write_catalog(tmp_path, plugin_name="cached-transform") + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path), cache_ttl_seconds=0) + + repository.load_catalog("local") + catalog_path.unlink() + cached_catalog = repository.load_catalog("local") + + assert cached_catalog.plugins[0].name == "cached-transform" + + +def test_load_catalog_does_not_fall_back_to_stale_cache_when_fresh_catalog_is_invalid(tmp_path: Path) -> None: + catalog_path = _write_catalog(tmp_path, plugin_name="cached-transform") + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path), cache_ttl_seconds=0) + + repository.load_catalog("local") + catalog_path.write_text(json.dumps(_catalog_payload(schema_version=999, plugin_name="invalid-transform"))) + + with pytest.raises(PluginCatalogError, match="unsupported catalog schema_version"): + repository.load_catalog("local") + + +def test_load_catalog_with_zero_cache_ttl_refreshes_source(tmp_path: Path) -> None: + catalog_path = _write_catalog(tmp_path, plugin_name="text-transform") + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path), cache_ttl_seconds=0) + + first_catalog = repository.load_catalog("local") + catalog_path.write_text(json.dumps(_catalog_payload(plugin_name="fresh-transform"))) + refreshed_catalog = repository.load_catalog("local") + + assert first_catalog.plugins[0].name == "text-transform" + assert refreshed_catalog.plugins[0].name == "fresh-transform" + + +def test_load_catalog_cache_file_is_keyed_by_alias_and_url(tmp_path: Path) -> None: + catalog_path = _write_catalog(tmp_path) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", 
str(catalog_path)) + + repository.load_catalog("local") + + cache_files = list(repository.cache_dir.glob("*.json")) + assert len(cache_files) == 1 + assert cache_files[0].name.startswith("local-") + assert cache_files[0].name != "local.json" + + +@patch("data_designer.cli.repositories.plugin_catalog_repository.urlopen") +def test_load_catalog_reports_remote_http_error(mock_urlopen: Mock, tmp_path: Path) -> None: + mock_urlopen.side_effect = HTTPError( + "https://example.test/catalog/plugins.json", + 404, + "Not Found", + {}, + None, + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("remote", "https://example.test/catalog/plugins.json") + + with pytest.raises(PluginCatalogError, match="HTTP 404"): + repository.load_catalog("remote", refresh=True) + + +@patch("data_designer.cli.repositories.plugin_catalog_repository.urlopen") +def test_load_catalog_rejects_oversized_remote_catalog(mock_urlopen: Mock, tmp_path: Path) -> None: + mock_urlopen.return_value = _RemoteResponse(b"{" + (b" " * MAX_PLUGIN_CATALOG_SIZE_BYTES) + b"}") + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("remote", "https://example.test/catalog/plugins.json") + + with pytest.raises(PluginCatalogError, match="exceeds maximum size"): + repository.load_catalog("remote", refresh=True) + + +@patch("data_designer.cli.repositories.plugin_catalog_repository.urlopen") +def test_load_catalog_reports_remote_json_decode_error(mock_urlopen: Mock, tmp_path: Path) -> None: + mock_urlopen.return_value = _RemoteResponse(b"{") + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("remote", "https://example.test/catalog/plugins.json") + + with pytest.raises(PluginCatalogError, match="Failed to parse plugin catalog JSON"): + repository.load_catalog("remote", refresh=True) + + +def test_load_catalog_rejects_unsupported_schema_version(tmp_path: Path) -> None: + catalog_path = _write_catalog(tmp_path, schema_version=999) + repository = 
PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="unsupported catalog schema_version"): + repository.load_catalog("local", refresh=True) + + +def test_load_catalog_accepts_schema_v2_package_catalog(tmp_path: Path) -> None: + catalog_path = _write_catalog( + tmp_path, + packages=[ + _package_entry( + package_name="data-designer-index-package", + plugins=[ + _runtime_plugin("index-column", plugin_type="column-generator"), + _runtime_plugin("index-processor", plugin_type="processor"), + ], + install={ + "requirement": "data-designer-index-package", + "index_url": "https://docs.example.test/simple/", + }, + ), + _package_entry( + package_name="data-designer-git-plugin", + plugins=[_runtime_plugin("git-plugin", plugin_type="seed-reader")], + install={ + "requirement": ( + "data-designer-git-plugin @ " + "git+https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git@" + "data-designer-git-plugin/v0.1.0" + ), + }, + ), + _package_entry( + package_name="data-designer-url-plugin", + plugins=[_runtime_plugin("url-plugin", plugin_type="processor")], + install={ + "requirement": ( + "data-designer-url-plugin @ " + "https://packages.example.test/data_designer_url_plugin-0.1.0-py3-none-any.whl" + ), + }, + ), + ], + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + catalog = repository.load_catalog("local", refresh=True) + + assert [package.name for package in catalog.packages] == [ + "data-designer-index-package", + "data-designer-git-plugin", + "data-designer-url-plugin", + ] + assert [entry.name for entry in catalog.plugins] == [ + "index-column", + "index-processor", + "git-plugin", + "url-plugin", + ] + assert catalog.plugins[0].install.index_url == "https://docs.example.test/simple/" + + +def test_consumer_accepts_upstream_valid_catalog_fixture(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + 
repository.add_catalog("upstream", str(UPSTREAM_CATALOG_FIXTURES_DIR / "catalog-valid.json")) + + catalog = repository.load_catalog("upstream", refresh=True) + + assert [package.name for package in catalog.packages] == [ + "data-designer-compatible-column", + "data-designer-git-seed-reader", + "data-designer-url-processor", + "data-designer-python312-column", + "data-designer-future-dd-processor", + "data-designer-multi-plugin-package", + ] + assert [entry.name for entry in catalog.plugins][-2:] == ["multi-seed-reader", "multi-processor"] + + +def test_consumer_rejects_upstream_invalid_install_fixture(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("upstream", str(UPSTREAM_CATALOG_FIXTURES_DIR / "catalog-invalid-install.json")) + + with pytest.raises(PluginCatalogError, match="expected a requirement for 'data-designer-invalid-install'"): + repository.load_catalog("upstream", refresh=True) + + +def test_consumer_rejects_upstream_unsupported_version_fixture(tmp_path: Path) -> None: + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("upstream", str(UPSTREAM_CATALOG_FIXTURES_DIR / "catalog-unsupported-version.json")) + + with pytest.raises(PluginCatalogError, match="unsupported catalog schema_version 999"): + repository.load_catalog("upstream", refresh=True) + + +def test_catalog_model_requires_contract_required_package_metadata() -> None: + package = _package_entry() + package.pop("compatibility") + + with pytest.raises(ValidationError, match="compatibility"): + PluginCatalog.model_validate(_catalog_payload(packages=[package])) + + +def test_catalog_model_requires_non_empty_runtime_plugins() -> None: + package = _package_entry(plugins=[]) + + with pytest.raises(ValidationError, match="plugins"): + PluginCatalog.model_validate(_catalog_payload(packages=[package])) + + +def test_fetches_production_catalog_when_enabled(tmp_path: Path) -> None: + if 
os.getenv("DATA_DESIGNER_TEST_REMOTE_PLUGIN_CATALOG") != "1": + pytest.skip("Set DATA_DESIGNER_TEST_REMOTE_PLUGIN_CATALOG=1 to run the live catalog smoke test") + + catalog = PluginCatalogRepository(tmp_path).load_catalog(refresh=True) + + assert catalog.packages + + +def test_load_catalog_accepts_equivalent_data_designer_marker_quoting(tmp_path: Path) -> None: + package = _package_entry() + package["compatibility"]["data_designer"] = { + "requirement": "data-designer>=0.5.7; python_version < '3.12'", + "specifier": ">=0.5.7", + "marker": "python_version < '3.12'", + } + catalog_path = _write_catalog(tmp_path, packages=[package]) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + catalog = repository.load_catalog("local", refresh=True) + + assert catalog.plugins[0].compatibility is not None + assert catalog.plugins[0].compatibility.data_designer is not None + assert catalog.plugins[0].compatibility.data_designer.marker == "python_version < '3.12'" + + +def test_load_catalog_rejects_invalid_schema_v2_install_metadata(tmp_path: Path) -> None: + catalog_path = _write_catalog( + tmp_path, + packages=[ + _package_entry( + package_name="data-designer-invalid-install", + plugins=[_runtime_plugin("invalid-install")], + install={ + "requirement": "data-designer-other", + "index_url": "https://docs.example.test/simple/", + }, + ) + ], + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="expected a requirement for 'data-designer-invalid-install'"): + repository.load_catalog("local", refresh=True) + + +def test_load_catalog_rejects_null_schema_v2_install_index_url(tmp_path: Path) -> None: + catalog_path = _write_catalog( + tmp_path, + packages=[ + _package_entry( + package_name="data-designer-invalid-index", + plugins=[_runtime_plugin("invalid-index")], + install={ + "requirement": "data-designer-invalid-index", + 
"index_url": None, + }, + ) + ], + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="install.index_url.*expected a non-empty string"): + repository.load_catalog("local", refresh=True) + + +def test_load_catalog_rejects_empty_schema_v2_install_index_url(tmp_path: Path) -> None: + catalog_path = _write_catalog( + tmp_path, + packages=[ + _package_entry( + package_name="data-designer-empty-index", + plugins=[_runtime_plugin("empty-index")], + install={ + "requirement": "data-designer-empty-index", + "index_url": "", + }, + ) + ], + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="install.index_url.*expected a non-empty string"): + repository.load_catalog("local", refresh=True) + + +def test_load_catalog_rejects_unexpected_schema_v2_fields(tmp_path: Path) -> None: + package = _package_entry() + package["tags"] = ["extra"] + catalog_path = _write_catalog(tmp_path, packages=[package]) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="catalog packages\\[0\\] has invalid fields"): + repository.load_catalog("local", refresh=True) + + +def test_load_catalog_rejects_duplicate_runtime_plugin_names(tmp_path: Path) -> None: + catalog_path = _write_catalog( + tmp_path, + packages=[ + _package_entry( + package_name="data-designer-one", + plugins=[_runtime_plugin("duplicate", entry_point_name="first-entry")], + ), + _package_entry( + package_name="data-designer-two", + plugins=[_runtime_plugin("duplicate", entry_point_name="second-entry")], + ), + ], + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="duplicate runtime plugin name"): + repository.load_catalog("local", 
refresh=True) + + +def test_load_catalog_rejects_duplicate_canonical_package_names(tmp_path: Path) -> None: + catalog_path = _write_catalog( + tmp_path, + packages=[ + _package_entry( + package_name="data-designer-foo", + plugins=[_runtime_plugin("first-plugin")], + ), + _package_entry( + package_name="data_designer_foo", + plugins=[_runtime_plugin("second-plugin")], + ), + ], + ) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + + with pytest.raises(PluginCatalogError, match="duplicate package name"): + repository.load_catalog("local", refresh=True) + + +class _RemoteResponse: + def __init__(self, content: bytes, *, status: int = 200) -> None: + self._content = content + self.status = status + + def __enter__(self) -> "_RemoteResponse": + return self + + def __exit__(self, exc_type: object, exc_value: object, traceback: object) -> None: + return None + + def read(self, size: int = -1) -> bytes: + _ = size + return self._content + + +def _write_catalog( + tmp_path: Path, + *, + schema_version: int = 2, + plugin_name: str = "text-transform", + packages: list[dict] | None = None, +) -> Path: + catalog_dir = tmp_path / "catalog" + catalog_dir.mkdir() + catalog_path = catalog_dir / "plugins.json" + catalog_path.write_text( + json.dumps(_catalog_payload(schema_version=schema_version, plugin_name=plugin_name, packages=packages)) + ) + return catalog_path + + +def _catalog_payload( + *, + schema_version: int = 2, + plugin_name: str = "text-transform", + packages: list[dict] | None = None, +) -> dict: + return { + "schema_version": schema_version, + "packages": packages if packages is not None else [_package_entry(plugins=[_runtime_plugin(plugin_name)])], + } + + +def _package_entry( + *, + package_name: str = "data-designer-text-transform", + plugins: list[dict] | None = None, + install: dict | None = None, +) -> dict: + return { + "name": package_name, + "description": f"{package_name} package", + "install": install + 
or { + "requirement": package_name, + "index_url": "https://docs.example.test/simple/", + }, + "compatibility": { + "python": {"specifier": ">=3.10"}, + "data_designer": { + "requirement": "data-designer>=0.5.7", + "specifier": ">=0.5.7", + "marker": None, + }, + }, + "docs": { + "url": f"https://docs.example.test/plugins/{package_name}/", + }, + "plugins": plugins if plugins is not None else [_runtime_plugin("text-transform")], + } + + +def _runtime_plugin( + plugin_name: str, + *, + plugin_type: str = "processor", + entry_point_name: str | None = None, +) -> dict: + runtime_entry_point_name = plugin_name if entry_point_name is None else entry_point_name + return { + "name": plugin_name, + "plugin_type": plugin_type, + "entry_point": { + "group": "data_designer.plugins", + "name": runtime_entry_point_name, + "value": f"data_designer_{plugin_name.replace('-', '_')}.plugin:plugin", + }, + } diff --git a/packages/data-designer/tests/cli/services/test_plugin_catalog_service.py b/packages/data-designer/tests/cli/services/test_plugin_catalog_service.py new file mode 100644 index 000000000..d318f571f --- /dev/null +++ b/packages/data-designer/tests/cli/services/test_plugin_catalog_service.py @@ -0,0 +1,424 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +import json +from importlib.metadata import EntryPoint +from pathlib import Path +from unittest.mock import Mock, patch + +import pytest + +from data_designer.cli.plugin_catalog import PluginCatalog, PluginCatalogEntry +from data_designer.cli.repositories.plugin_catalog_repository import PluginCatalogRepository +from data_designer.cli.services.plugin_catalog_service import PluginCatalogService + + +def test_list_entries_filters_incompatible_plugins_by_default(tmp_path: Path) -> None: + repository = _repository_with_catalog(tmp_path) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + + entries = service.list_entries("local") + all_entries = service.list_entries("local", include_incompatible=True) + + assert [entry.name for entry in entries] == [ + "compatible-plugin", + "shared-column", + "shared-processor", + ] + assert [entry.name for entry in all_entries] == [ + "compatible-plugin", + "future-plugin", + "shared-column", + "shared-processor", + ] + + +def test_search_entries_matches_package_description_name_and_type(tmp_path: Path) -> None: + repository = _repository_with_catalog(tmp_path) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + + name_matches = service.search_entries("compatible", "local") + package_matches = service.search_entries("shared-package", "local") + type_matches = service.search_entries("seed-reader", "local") + + assert [entry.name for entry in name_matches] == ["compatible-plugin"] + assert [entry.name for entry in package_matches] == ["shared-column", "shared-processor"] + assert [entry.name for entry in type_matches] == ["compatible-plugin"] + + +def test_search_entries_ignores_install_docs_and_entry_point_metadata(tmp_path: Path) -> None: + package = _package( + package_name="data-designer-retrieval-sdg", + data_designer_specifier=">=0.5.7", + 
plugins=[_runtime_plugin(name="document-chunker", plugin_type="seed-reader")], + ) + package["install"]["index_url"] = "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/" + package["docs"]["url"] = "https://nvidia-nemo.github.io/DataDesignerPlugins/plugins/data-designer-retrieval-sdg/" + package["plugins"][0]["entry_point"]["value"] = "data_designer_github_noise.plugin:plugin" + catalog_path = tmp_path / "plugins.json" + catalog_path.write_text(json.dumps({"schema_version": 2, "packages": [package]})) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + + matches = service.search_entries("github", "local") + + assert matches == [] + + +def test_evaluate_compatibility_reports_data_designer_constraint(tmp_path: Path) -> None: + repository = _repository_with_catalog(tmp_path) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + entry = _entry_by_name(service.list_entries("local", include_incompatible=True), "future-plugin") + + result = service.evaluate_compatibility(entry) + + assert result.is_compatible is False + assert result.reasons == ["Data Designer 0.5.7 does not satisfy >=99.0"] + + +def test_evaluate_compatibility_reports_python_constraint() -> None: + service = PluginCatalogService( + Mock(spec=PluginCatalogRepository), + python_version="3.11.0", + data_designer_version="0.5.7", + ) + entry = PluginCatalogEntry.model_validate( + _entry( + name="future-python-plugin", + plugin_type="processor", + package_name="data-designer-future-python-plugin", + python_specifier=">=3.12", + data_designer_specifier=">=0.5.7", + ) + ) + + result = service.evaluate_compatibility(entry) + + assert result.is_compatible is False + assert result.reasons == ["Python 3.11.0 does not satisfy >=3.12"] + + +@pytest.mark.parametrize( + ("marker", "expected_is_compatible", 
"expected_reasons"), + [ + ("python_version >= '3.12'", True, []), + ("python_version < '3.12'", False, ["Data Designer 0.5.7 does not satisfy >=99.0"]), + ], +) +def test_evaluate_compatibility_respects_data_designer_marker( + marker: str, + expected_is_compatible: bool, + expected_reasons: list[str], +) -> None: + service = PluginCatalogService( + Mock(spec=PluginCatalogRepository), + python_version="3.11.0", + data_designer_version="0.5.7", + ) + entry = PluginCatalogEntry.model_validate( + _entry( + name="marker-gated-plugin", + plugin_type="processor", + package_name="data-designer-marker-gated-plugin", + data_designer_specifier=">=99.0", + data_designer_marker=marker, + ) + ) + + result = service.evaluate_compatibility(entry) + + assert result.is_compatible is expected_is_compatible + assert result.reasons == expected_reasons + + +@patch("data_designer.cli.services.plugin_catalog_service._get_installed_data_designer_version", return_value=None) +def test_evaluate_compatibility_reports_missing_data_designer_version(mock_version: Mock) -> None: + service = PluginCatalogService(Mock(spec=PluginCatalogRepository), python_version="3.11.0") + entry = PluginCatalogEntry.model_validate( + _entry( + name="compatible-plugin", + plugin_type="processor", + package_name="data-designer-compatible-plugin", + data_designer_specifier=">=0.5.7", + ) + ) + + result = service.evaluate_compatibility(entry) + + assert result.is_compatible is False + assert result.reasons == ["Unable to resolve installed Data Designer version for constraint '>=0.5.7'"] + mock_version.assert_called_once_with() + + +def test_evaluate_compatibility_accepts_local_dev_version_above_lower_bound(tmp_path: Path) -> None: + repository = _repository_with_catalog(tmp_path) + service = PluginCatalogService( + repository, + python_version="3.11.0", + data_designer_version="0.5.10.dev18+604fdd96", + ) + entry = _entry_by_name(service.list_entries("local", include_incompatible=True), "compatible-plugin") + + 
result = service.evaluate_compatibility(entry) + + assert result.is_compatible is True + assert result.reasons == [] + + +def test_get_package_entries_resolves_package_alias() -> None: + repository = Mock(spec=PluginCatalogRepository) + repository.load_catalog.return_value = PluginCatalog.model_validate( + { + "schema_version": 2, + "packages": [ + _package( + package_name="data-designer-calculator", + data_designer_specifier=">=0.5.7", + plugins=[ + _runtime_plugin(name="arithmetic-column", plugin_type="column-generator"), + _runtime_plugin(name="arithmetic-processor", plugin_type="processor"), + ], + ), + ], + } + ) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + + entries = service.get_package_entries("calculator", "local", include_incompatible=True) + + assert [entry.name for entry in entries] == ["arithmetic-column", "arithmetic-processor"] + assert {entry.package.name for entry in entries} == {"data-designer-calculator"} + + +def test_get_package_entries_prefers_exact_package_name_over_package_alias() -> None: + repository = Mock(spec=PluginCatalogRepository) + repository.load_catalog.return_value = PluginCatalog.model_validate( + { + "schema_version": 2, + "packages": [ + _package( + package_name="calculator", + data_designer_specifier=">=0.5.7", + plugins=[_runtime_plugin(name="plain-calculator", plugin_type="processor")], + ), + _package( + package_name="data-designer-calculator", + data_designer_specifier=">=0.5.7", + plugins=[_runtime_plugin(name="namespaced-calculator", plugin_type="processor")], + ), + ], + } + ) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + + entries = service.get_package_entries("calculator", "local", include_incompatible=True) + + assert [entry.name for entry in entries] == ["plain-calculator"] + assert entries[0].package.name == "calculator" + + +def 
test_get_package_entries_does_not_resolve_runtime_plugin_name_that_is_not_package_alias() -> None: + repository = Mock(spec=PluginCatalogRepository) + repository.load_catalog.return_value = PluginCatalog.model_validate( + { + "schema_version": 2, + "packages": [ + _package( + package_name="data-designer-calculator", + data_designer_specifier=">=0.5.7", + plugins=[_runtime_plugin(name="arithmetic", plugin_type="processor")], + ), + ], + } + ) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + + assert service.get_package_entries("arithmetic", "local", include_incompatible=True) == [] + + +def test_group_entries_by_package_groups_multi_plugin_packages(tmp_path: Path) -> None: + repository = _repository_with_catalog(tmp_path) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + entries = service.list_entries("local", include_incompatible=True) + + grouped_entries = service.group_entries_by_package(entries) + + assert [entry.name for entry in grouped_entries["data-designer-shared-package"]] == [ + "shared-column", + "shared-processor", + ] + + +def test_group_entries_by_package_canonicalizes_package_names(tmp_path: Path) -> None: + repository = _repository_with_catalog(tmp_path) + service = PluginCatalogService(repository, python_version="3.11.0", data_designer_version="0.5.7") + entries = [ + PluginCatalogEntry.model_validate( + _entry( + name="hyphen-package", + plugin_type="processor", + package_name="data-designer-shared-package", + data_designer_specifier=">=0.5.7", + ) + ), + PluginCatalogEntry.model_validate( + _entry( + name="underscore-package", + plugin_type="processor", + package_name="data_designer_shared_package", + data_designer_specifier=">=0.5.7", + ) + ), + ] + + grouped_entries = service.group_entries_by_package(entries) + + assert list(grouped_entries) == ["data-designer-shared-package"] + assert [entry.name for entry in 
grouped_entries["data-designer-shared-package"]] == [ + "hyphen-package", + "underscore-package", + ] + + +@patch("data_designer.cli.services.plugin_catalog_service.importlib.metadata.entry_points") +def test_list_installed_plugins_uses_entry_point_metadata_without_loading_plugins( + mock_entry_points: Mock, + tmp_path: Path, +) -> None: + mock_entry_points.return_value = [ + EntryPoint( + name="installed-plugin", + value="pkg.plugin:plugin", + group="data_designer.plugins", + ) + ] + service = PluginCatalogService(PluginCatalogRepository(tmp_path)) + + installed = service.list_installed_plugins() + + assert len(installed) == 1 + assert installed[0].name == "installed-plugin" + assert installed[0].entry_point_value == "pkg.plugin:plugin" + mock_entry_points.assert_called_once_with(group="data_designer.plugins") + + +def _repository_with_catalog(tmp_path: Path) -> PluginCatalogRepository: + catalog_path = tmp_path / "plugins.json" + catalog_path.write_text(json.dumps(_catalog_payload())) + repository = PluginCatalogRepository(tmp_path) + repository.add_catalog("local", str(catalog_path)) + return repository + + +def _entry_by_name(entries: list[PluginCatalogEntry], name: str) -> PluginCatalogEntry: + return next(entry for entry in entries if entry.name == name) + + +def _catalog_payload() -> dict: + return { + "schema_version": 2, + "packages": [ + _package( + package_name="data-designer-compatible-plugin", + data_designer_specifier=">=0.5.7", + plugins=[_runtime_plugin(name="compatible-plugin", plugin_type="seed-reader")], + ), + _package( + package_name="data-designer-future-plugin", + data_designer_specifier=">=99.0", + plugins=[_runtime_plugin(name="future-plugin", plugin_type="processor")], + ), + _package( + package_name="data-designer-shared-package", + data_designer_specifier=">=0.5.7", + plugins=[ + _runtime_plugin(name="shared-column", plugin_type="column-generator"), + _runtime_plugin(name="shared-processor", plugin_type="processor"), + ], + ), + ], + } + 
+ +def _package( + *, + package_name: str, + data_designer_specifier: str, + plugins: list[dict], + data_designer_marker: str | None = None, + python_specifier: str = ">=3.10", +) -> dict: + return { + "name": package_name, + "description": f"{package_name} description", + "install": { + "requirement": package_name, + "index_url": "https://docs.example.test/simple/", + }, + "compatibility": { + "python": {"specifier": python_specifier}, + "data_designer": { + "requirement": f"data-designer{data_designer_specifier}", + "specifier": data_designer_specifier, + "marker": data_designer_marker, + }, + }, + "docs": { + "url": f"https://docs.example.test/plugins/{package_name}/", + }, + "plugins": plugins, + } + + +def _runtime_plugin(*, name: str, plugin_type: str) -> dict: + return { + "name": name, + "plugin_type": plugin_type, + "entry_point": { + "group": "data_designer.plugins", + "name": name, + "value": f"data_designer_{name.replace('-', '_')}.plugin:plugin", + }, + } + + +def _entry( + *, + name: str, + plugin_type: str, + package_name: str, + data_designer_specifier: str, + data_designer_marker: str | None = None, + python_specifier: str = ">=3.10", +) -> dict: + return { + "name": name, + "plugin_type": plugin_type, + "description": f"{name} description", + "package": { + "name": package_name, + }, + "install": { + "requirement": package_name, + "index_url": "https://docs.example.test/simple/", + }, + "entry_point": { + "group": "data_designer.plugins", + "name": name, + "value": f"{package_name.replace('-', '_')}.plugin:plugin", + }, + "compatibility": { + "python": {"specifier": python_specifier}, + "data_designer": { + "requirement": f"data-designer{data_designer_specifier}", + "specifier": data_designer_specifier, + "marker": data_designer_marker, + }, + }, + "docs": { + "url": f"https://docs.example.test/plugins/{package_name}/", + }, + } diff --git a/packages/data-designer/tests/cli/services/test_plugin_install_service.py 
b/packages/data-designer/tests/cli/services/test_plugin_install_service.py new file mode 100644 index 000000000..5fe5d163a --- /dev/null +++ b/packages/data-designer/tests/cli/services/test_plugin_install_service.py @@ -0,0 +1,813 @@ +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + +from __future__ import annotations + +import importlib.metadata +import sys +from collections.abc import Iterator +from pathlib import Path +from types import SimpleNamespace +from unittest.mock import Mock, patch + +import pytest + +from data_designer.cli.plugin_catalog import PluginCatalogConfig, PluginCatalogEntry +from data_designer.cli.services.plugin_install_service import PIP_EXTRA_INDEX_SOURCE_WARNING, PluginInstallService + +DATA_DESIGNER_VERSION = "0.5.10" + + +@pytest.fixture(autouse=True) +def mock_data_designer_version() -> Iterator[None]: + with ( + patch( + "data_designer.cli.services.plugin_install_service.importlib.metadata.version", + return_value=DATA_DESIGNER_VERSION, + ), + patch( + "data_designer.cli.services.plugin_install_service.subprocess.run", + return_value=SimpleNamespace(returncode=0, stdout="uv 0.6.0\n", stderr=""), + ), + ): + yield + + +def test_build_pip_install_plan_uses_requirement_and_extra_index() -> None: + entry = _entry( + package_name="data-designer-template", + install={ + "requirement": "data-designer-template", + "index_url": "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + }, + ) + catalog = PluginCatalogConfig( + alias="nvidia", url="https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json" + ) + service = PluginInstallService() + + plan = service.build_install_plan(entry, catalog, manager="pip") + + assert plan.command == [ + sys.executable, + "-m", + "pip", + "install", + "--constraint", + "", + "--extra-index-url", + "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + "data-designer-template", + ] + assert 
plan.temporary_file is not None + assert plan.temporary_file.filename == "data-designer-constraint.txt" + assert plan.temporary_file.content == ( + "# Data Designer is provided by the active CLI environment.\n" + f"data-designer=={DATA_DESIGNER_VERSION}\n" + f"data-designer-config=={DATA_DESIGNER_VERSION}\n" + f"data-designer-engine=={DATA_DESIGNER_VERSION}\n" + ) + assert plan.command_stdin is None + assert ( + plan.data_designer_protection + == f"pinned installed Data Designer packages; data-designer {DATA_DESIGNER_VERSION}" + ) + assert plan.source_description == ( + "data-designer-template via https://nvidia-nemo.github.io/DataDesignerPlugins/simple/" + ) + assert plan.source_warning == PIP_EXTRA_INDEX_SOURCE_WARNING + + +def test_build_direct_reference_install_plan_uses_requirement_verbatim() -> None: + requirement = ( + "data-designer-template @ " + "git+https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git@data-designer-template/v0.1.0" + ) + entry = _entry(package_name="data-designer-template", install={"requirement": requirement}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService() + + plan = service.build_install_plan(entry, catalog, manager="pip") + + assert plan.command[-1] == requirement + assert "--extra-index-url" not in plan.command + assert plan.source_warning is None + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_install_plan_chooses_uv_when_available(mock_which: Mock) -> None: + entry = _entry( + package_name="data-designer-template", + install={ + "requirement": "data-designer-template", + "index_url": "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + }, + ) + catalog = PluginCatalogConfig( + alias="nvidia", url="https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json" + ) + service = PluginInstallService() + + plan = service.build_install_plan(entry, catalog, manager="auto") + + 
assert plan.manager == "uv" + assert plan.install_mode == "uv-environment" + assert plan.command == [ + "uv", + "pip", + "install", + "--python", + sys.executable, + "--excludes", + "-", + "--default-index", + "https://pypi.org/simple/", + "--index", + "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + "data-designer-template", + ] + assert plan.command_stdin == "data-designer\ndata-designer-config\ndata-designer-engine\n" + assert plan.temporary_file is None + assert ( + plan.data_designer_protection + == f"using installed data-designer {DATA_DESIGNER_VERSION}; uv will not resolve Data Designer packages" + ) + assert plan.source_warning is None + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_install_plan_uses_uv_add_for_active_project(mock_which: Mock, tmp_path: Path) -> None: + working_dir = _write_project(tmp_path) / "src" + working_dir.mkdir() + entry = _entry( + package_name="data-designer-template", + install={ + "requirement": "data-designer-template", + "index_url": "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + }, + ) + catalog = PluginCatalogConfig( + alias="nvidia", url="https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json" + ) + service = PluginInstallService(working_dir=working_dir, active_virtualenv=True) + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert plan.manager == "uv" + assert plan.install_mode == "uv-project" + assert plan.project_root == str(tmp_path) + assert plan.command == [ + "uv", + "add", + "--project", + str(tmp_path), + "--active", + "--no-install-project", + "--no-install-package", + "data-designer", + "--no-install-package", + "data-designer-config", + "--no-install-package", + "data-designer-engine", + "--index", + "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + "data-designer-template", + ] + assert plan.command_stdin is 
None + assert plan.temporary_file is None + assert ( + plan.data_designer_protection + == f"using installed data-designer {DATA_DESIGNER_VERSION}; uv will not install Data Designer packages" + ) + assert plan.source_warning is None + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_install_plan_does_not_use_uv_add_without_active_virtualenv( + mock_which: Mock, + tmp_path: Path, +) -> None: + _write_project(tmp_path) + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService(working_dir=tmp_path, active_virtualenv=False) + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert plan.install_mode == "uv-environment" + assert plan.command[:6] == ["uv", "pip", "install", "--python", sys.executable, "--excludes"] + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_install_plan_does_not_use_uv_add_for_data_designer_workspace( + mock_which: Mock, + tmp_path: Path, +) -> None: + (tmp_path / "pyproject.toml").write_text( + '[project]\nname = "data-designer-workspace"\n[tool.uv]\npackage = false\n', + encoding="utf-8", + ) + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService(working_dir=tmp_path, active_virtualenv=True) + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert plan.install_mode == "uv-environment" + assert plan.project_root is None + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", 
return_value="/usr/bin/uv") +def test_build_auto_install_plan_does_not_use_uv_add_for_active_virtualenv_without_pyproject( + mock_which: Mock, + tmp_path: Path, +) -> None: + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService(working_dir=tmp_path, active_virtualenv=True) + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert plan.manager == "uv" + assert plan.install_mode == "uv-environment" + assert plan.project_root is None + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_install_plan_uses_uv_add_for_non_package_user_project( + mock_which: Mock, + tmp_path: Path, +) -> None: + (tmp_path / "pyproject.toml").write_text( + '[project]\nname = "experiment-workspace"\n[tool.uv]\npackage = false\n', + encoding="utf-8", + ) + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService(working_dir=tmp_path, active_virtualenv=True) + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert plan.install_mode == "uv-project" + assert plan.project_root == str(tmp_path) + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value=None) +def test_build_auto_install_plan_chooses_pip_when_uv_is_unavailable(mock_which: Mock) -> None: + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService() + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert 
plan.manager == "pip" + assert plan.command == [ + sys.executable, + "-m", + "pip", + "install", + "--constraint", + "", + "data-designer-template", + ] + assert plan.temporary_file is not None + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service._uv_plugin_install_error") +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_install_plan_falls_back_to_pip_when_uv_is_too_old( + mock_which: Mock, + mock_uv_error: Mock, +) -> None: + mock_uv_error.return_value = "Found uv 0.5.0, but plugin package installs require uv >= 0.6.0" + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService() + + plan = service.build_install_plan(entry, catalog, manager="auto") + + assert plan.manager == "pip" + assert plan.install_mode == "pip-environment" + assert plan.source_warning == ( + "Found uv 0.5.0, but plugin package installs require uv >= 0.6.0; falling back to pip." 
+ ) + mock_which.assert_called_once_with("uv") + mock_uv_error.assert_called_once_with("/usr/bin/uv") + + +def test_build_pip_uninstall_plan_uses_package_name_not_install_requirement() -> None: + requirement = ( + "data-designer-template @ " + "git+https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git@data-designer-template/v0.1.0" + ) + entry = _entry(package_name="data-designer-template", install={"requirement": requirement}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService() + + plan = service.build_uninstall_plan(entry, catalog, manager="pip") + + assert plan.command == [ + sys.executable, + "-m", + "pip", + "uninstall", + "--yes", + "data-designer-template", + ] + assert plan.package_name == "data-designer-template" + assert plan.manager == "pip" + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_uninstall_plan_chooses_uv_when_available(mock_which: Mock) -> None: + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = PluginInstallService() + + plan = service.build_uninstall_plan(entry, catalog, manager="auto") + + assert plan.command == [ + "uv", + "pip", + "uninstall", + "--python", + sys.executable, + "data-designer-template", + ] + assert plan.manager == "uv" + assert plan.uninstall_mode == "uv-environment" + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_auto_uninstall_plan_uses_uv_remove_for_active_project(mock_which: Mock, tmp_path: Path) -> None: + _write_project(tmp_path) + entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"}) + catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json") + service = 
PluginInstallService(working_dir=tmp_path, active_virtualenv=True) + + plan = service.build_uninstall_plan(entry, catalog, manager="auto") + + assert plan.command == [ + "uv", + "remove", + "--project", + str(tmp_path), + "--no-sync", + "data-designer-template", + ] + assert plan.commands == [ + [ + "uv", + "remove", + "--project", + str(tmp_path), + "--no-sync", + "data-designer-template", + ], + [ + "uv", + "pip", + "uninstall", + "--python", + sys.executable, + "data-designer-template", + ], + ] + assert plan.uninstall_mode == "uv-project" + assert plan.project_root == str(tmp_path) + mock_which.assert_called_once_with("uv") + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_uv_install_plan_targets_current_python_and_adds_catalog_index(mock_which: Mock) -> None: + entry = _entry( + package_name="data-designer-template", + install={ + "requirement": "data-designer-template", + "index_url": "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + }, + ) + catalog = PluginCatalogConfig( + alias="nvidia", url="https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json" + ) + service = PluginInstallService() + + plan = service.build_install_plan(entry, catalog, manager="uv") + + assert plan.command == [ + "uv", + "pip", + "install", + "--python", + sys.executable, + "--excludes", + "-", + "--default-index", + "https://pypi.org/simple/", + "--index", + "https://nvidia-nemo.github.io/DataDesignerPlugins/simple/", + "data-designer-template", + ] + assert plan.command_stdin == "data-designer\ndata-designer-config\ndata-designer-engine\n" + assert plan.temporary_file is None + + +@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv") +def test_build_uv_add_plan_preserves_direct_reference_with_raw(mock_which: Mock, tmp_path: Path) -> None: + _write_project(tmp_path) + requirement = ( + "data-designer-template @ " + 
"git+https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git@data-designer-template/v0.1.0"
+    )
+    entry = _entry(package_name="data-designer-template", install={"requirement": requirement})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    service = PluginInstallService(working_dir=tmp_path, active_virtualenv=True)
+
+    plan = service.build_install_plan(entry, catalog, manager="uv")
+
+    assert plan.command[-2:] == ["--raw", requirement]
+    assert "--index" not in plan.command
+    mock_which.assert_called_once_with("uv")
+
+
+@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value=None)
+def test_build_uv_install_plan_raises_when_uv_is_unavailable(mock_which: Mock) -> None:
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    service = PluginInstallService()
+
+    with pytest.raises(ValueError, match="uv was requested"):
+        service.build_install_plan(entry, catalog, manager="uv")
+
+    mock_which.assert_called_once_with("uv")
+
+
+@patch("data_designer.cli.services.plugin_install_service._uv_plugin_install_error")
+@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv")
+def test_build_uv_install_plan_raises_when_uv_is_too_old(
+    mock_which: Mock,
+    mock_uv_error: Mock,
+) -> None:
+    mock_uv_error.return_value = "Found uv 0.5.0, but plugin package installs require uv >= 0.6.0"
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    service = PluginInstallService()
+
+    with pytest.raises(ValueError, match="plugin package installs require uv >= 0.6.0"):
+        service.build_install_plan(entry, catalog, manager="uv")
+
+    mock_which.assert_called_once_with("uv")
+    mock_uv_error.assert_called_once_with("/usr/bin/uv")
+
+
+def test_build_install_plan_requires_installed_data_designer_version() -> None:
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    service = PluginInstallService()
+
+    with (
+        patch(
+            "data_designer.cli.services.plugin_install_service.importlib.metadata.version",
+            side_effect=importlib.metadata.PackageNotFoundError,
+        ),
+        pytest.raises(ValueError, match="Unable to resolve installed 'data-designer' version"),
+    ):
+        service.build_install_plan(entry, catalog, manager="pip")
+
+
+def test_build_install_plan_rejects_invalid_installed_data_designer_version() -> None:
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    service = PluginInstallService()
+
+    with (
+        patch(
+            "data_designer.cli.services.plugin_install_service.importlib.metadata.version",
+            return_value="not a version",
+        ),
+        pytest.raises(ValueError, match="version 'not a version' is not a valid package version"),
+    ):
+        service.build_install_plan(entry, catalog, manager="pip")
+
+
+def test_install_raises_when_runner_fails() -> None:
+    service = PluginInstallService(runner=lambda command, stdin_text: 2)
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    plan = service.build_install_plan(entry, catalog, manager="pip")
+
+    with pytest.raises(RuntimeError, match="status 2"):
+        service.install(plan)
+
+
+def test_install_materializes_pip_constraint_as_temporary_file() -> None:
+    seen: dict[str, Path | str | None] = {}
+
+    def runner(command: list[str], stdin_text: str | None) -> int:
+        constraint_file = Path(command[command.index("--constraint") + 1])
+        seen["constraint_file"] = constraint_file
+        seen["constraint_parent"] = constraint_file.parent
+        seen["constraint_text"] = constraint_file.read_text(encoding="utf-8")
+        seen["stdin_text"] = stdin_text
+        return 0
+
+    service = PluginInstallService(runner=runner)
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    plan = service.build_install_plan(entry, catalog, manager="pip")
+
+    service.install(plan)
+
+    constraint_file = seen["constraint_file"]
+    constraint_parent = seen["constraint_parent"]
+    assert isinstance(constraint_file, Path)
+    assert isinstance(constraint_parent, Path)
+    assert not constraint_file.exists()
+    assert not constraint_parent.exists()
+    assert seen["constraint_text"] == (
+        "# Data Designer is provided by the active CLI environment.\n"
+        f"data-designer=={DATA_DESIGNER_VERSION}\n"
+        f"data-designer-config=={DATA_DESIGNER_VERSION}\n"
+        f"data-designer-engine=={DATA_DESIGNER_VERSION}\n"
+    )
+    assert seen["stdin_text"] is None
+
+
+@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv")
+def test_install_passes_uv_exclude_over_stdin(mock_which: Mock) -> None:
+    seen: dict[str, list[str] | str | None] = {}
+
+    def runner(command: list[str], stdin_text: str | None) -> int:
+        seen["command"] = command
+        seen["stdin_text"] = stdin_text
+        return 0
+
+    service = PluginInstallService(runner=runner)
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    plan = service.build_install_plan(entry, catalog, manager="uv")
+
+    service.install(plan)
+
+    assert seen["command"] == [
+        "uv",
+        "pip",
+        "install",
+        "--python",
+        sys.executable,
+        "--excludes",
+        "-",
+        "data-designer-template",
+    ]
+    assert seen["stdin_text"] == "data-designer\ndata-designer-config\ndata-designer-engine\n"
+    mock_which.assert_called_once_with("uv")
+
+
+def test_uninstall_raises_when_runner_fails() -> None:
+    service = PluginInstallService(runner=lambda command, stdin_text: 2)
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    plan = service.build_uninstall_plan(entry, catalog, manager="pip")
+
+    with pytest.raises(RuntimeError, match="status 2"):
+        service.uninstall(plan)
+
+
+@patch("data_designer.cli.services.plugin_install_service.shutil.which", return_value="/usr/bin/uv")
+def test_uninstall_runs_every_project_uninstall_command(mock_which: Mock, tmp_path: Path) -> None:
+    seen: list[list[str]] = []
+
+    def runner(command: list[str], stdin_text: str | None) -> int:
+        assert stdin_text is None
+        seen.append(command)
+        return 0
+
+    _write_project(tmp_path)
+    service = PluginInstallService(runner=runner, working_dir=tmp_path, active_virtualenv=True)
+    entry = _entry(package_name="data-designer-template", install={"requirement": "data-designer-template"})
+    catalog = PluginCatalogConfig(alias="local", url="/catalog/plugins.json")
+    plan = service.build_uninstall_plan(entry, catalog, manager="auto")
+
+    service.uninstall(plan)
+
+    assert seen == plan.commands
+    mock_which.assert_called_once_with("uv")
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+@patch("data_designer.cli.services.plugin_install_service.importlib.invalidate_caches")
+def test_verify_entry_point_invalidates_caches_and_checks_declared_entry_point(
+    mock_invalidate_caches: Mock,
+    mock_entry_points: Mock,
+) -> None:
+    entry = _entry(
+        package_name="data-designer-template",
+        plugin_name="text-transform-v2",
+        entry_point_name="text-transform",
+        install={"requirement": "data-designer-template"},
+    )
+    mock_entry_points.return_value = [
+        SimpleNamespace(name="other-plugin", value="other_package.plugin:plugin"),
+        SimpleNamespace(name="text-transform", value="data_designer_template.plugin:plugin"),
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_point(entry) is True
+    mock_invalidate_caches.assert_called_once_with()
+    mock_entry_points.assert_called_once_with(group="data_designer.plugins")
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+def test_verify_entry_points_fails_when_name_matches_but_value_differs(mock_entry_points: Mock) -> None:
+    entry = _entry(
+        package_name="data-designer-template",
+        plugin_name="text-transform",
+        entry_point_name="text-transform",
+        entry_point_value="data_designer_template.plugin:plugin",
+        install={"requirement": "data-designer-template"},
+    )
+    mock_entry_points.return_value = [
+        SimpleNamespace(name="text-transform", value="other_package.plugin:plugin"),
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_points([entry]) is False
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+def test_verify_entry_points_succeeds_when_all_declared_entries_match(mock_entry_points: Mock) -> None:
+    entries = [
+        _entry(
+            package_name="data-designer-template",
+            plugin_name="text-transform",
+            entry_point_name="text-transform",
+            entry_point_value="data_designer_template.plugin:plugin",
+            install={"requirement": "data-designer-template"},
+        ),
+        _entry(
+            package_name="data-designer-profiler",
+            plugin_name="text-profiler",
+            entry_point_name="text-profiler",
+            entry_point_value="data_designer_profiler.plugin:plugin",
+            install={"requirement": "data-designer-profiler"},
+        ),
+    ]
+    mock_entry_points.return_value = [
+        SimpleNamespace(name="text-profiler", value="data_designer_profiler.plugin:plugin"),
+        SimpleNamespace(name="text-transform", value="data_designer_template.plugin:plugin"),
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_points(entries) is True
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+def test_verify_entry_points_requires_every_declared_entry_point(mock_entry_points: Mock) -> None:
+    entries = [
+        _entry(
+            package_name="data-designer-retrieval-sdg",
+            plugin_name="document-chunker",
+            entry_point_name="document-chunker",
+            entry_point_value="data_designer_retrieval_sdg.chunker:plugin",
+            install={"requirement": "data-designer-retrieval-sdg"},
+        ),
+        _entry(
+            package_name="data-designer-retrieval-sdg",
+            plugin_name="embedding-dedup",
+            entry_point_name="embedding-dedup",
+            entry_point_value="data_designer_retrieval_sdg.dedup:plugin",
+            install={"requirement": "data-designer-retrieval-sdg"},
+        ),
+    ]
+    mock_entry_points.return_value = [
+        SimpleNamespace(name="document-chunker", value="data_designer_retrieval_sdg.chunker:plugin")
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_points(entries) is False
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+def test_verify_entry_points_verifies_multi_runtime_package_entries(mock_entry_points: Mock) -> None:
+    entries = [
+        _entry(
+            package_name="data-designer-retrieval-sdg",
+            plugin_name="document-chunker",
+            entry_point_name="document-chunker",
+            entry_point_value="data_designer_retrieval_sdg.chunker:plugin",
+            install={"requirement": "data-designer-retrieval-sdg"},
+        ),
+        _entry(
+            package_name="data-designer-retrieval-sdg",
+            plugin_name="embedding-dedup",
+            entry_point_name="embedding-dedup",
+            entry_point_value="data_designer_retrieval_sdg.dedup:plugin",
+            install={"requirement": "data-designer-retrieval-sdg"},
+        ),
+    ]
+    distribution = SimpleNamespace(metadata={"Name": "data-designer-retrieval-sdg"})
+    mock_entry_points.return_value = [
+        SimpleNamespace(
+            name="embedding-dedup",
+            value="data_designer_retrieval_sdg.dedup:plugin",
+            dist=distribution,
+        ),
+        SimpleNamespace(
+            name="document-chunker",
+            value="data_designer_retrieval_sdg.chunker:plugin",
+            dist=distribution,
+        ),
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_points(entries) is True
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+@patch("data_designer.cli.services.plugin_install_service.importlib.invalidate_caches")
+def test_verify_entry_points_removed_succeeds_when_declared_entries_are_absent(
+    mock_invalidate_caches: Mock,
+    mock_entry_points: Mock,
+) -> None:
+    entry = _entry(
+        package_name="data-designer-template",
+        plugin_name="text-transform",
+        entry_point_name="text-transform",
+        entry_point_value="data_designer_template.plugin:plugin",
+        install={"requirement": "data-designer-template"},
+    )
+    mock_entry_points.return_value = [
+        SimpleNamespace(name="other-plugin", value="other_package.plugin:plugin"),
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_points_removed([entry]) is True
+    mock_invalidate_caches.assert_called_once_with()
+    mock_entry_points.assert_called_once_with(group="data_designer.plugins")
+
+
+@patch("data_designer.cli.services.plugin_install_service.importlib.metadata.entry_points")
+def test_verify_entry_points_removed_fails_when_declared_entry_still_exists(mock_entry_points: Mock) -> None:
+    entry = _entry(
+        package_name="data-designer-template",
+        plugin_name="text-transform",
+        entry_point_name="text-transform",
+        entry_point_value="data_designer_template.plugin:plugin",
+        install={"requirement": "data-designer-template"},
+    )
+    mock_entry_points.return_value = [
+        SimpleNamespace(name="text-transform", value="data_designer_template.plugin:plugin"),
+    ]
+    service = PluginInstallService()
+
+    assert service.verify_entry_points_removed([entry]) is False
+
+
+def _entry(
+    *,
+    package_name: str,
+    install: dict,
+    plugin_name: str = "text-transform",
+    entry_point_name: str = "text-transform",
+    entry_point_value: str = "data_designer_template.plugin:plugin",
+) -> PluginCatalogEntry:
+    payload = {
+        "name": plugin_name,
+        "plugin_type": "processor",
+        "description": "Transform text records",
+        "package": {
+            "name": package_name,
+        },
+        "install": install,
+        "entry_point": {
+            "group": "data_designer.plugins",
+            "name": entry_point_name,
+            "value": entry_point_value,
+        },
+        "compatibility": {
+            "python": {"specifier": ">=3.10"},
+            "data_designer": {
+                "requirement": "data-designer>=0.5.7",
+                "specifier": ">=0.5.7",
+                "marker": None,
+            },
+        },
+        "docs": {
+            "url": f"https://docs.example.test/plugins/{package_name}/",
+        },
+    }
+    return PluginCatalogEntry.model_validate(payload)
+
+
+def _write_project(path: Path, *, name: str = "synthetic-data-project") -> Path:
+    path.mkdir(exist_ok=True)
+    (path / "pyproject.toml").write_text(f'[project]\nname = "{name}"\n', encoding="utf-8")
+    return path
diff --git a/packages/data-designer/tests/cli/test_main.py b/packages/data-designer/tests/cli/test_main.py
index 33b620a35..2e155147f 100644
--- a/packages/data-designer/tests/cli/test_main.py
+++ b/packages/data-designer/tests/cli/test_main.py
@@ -30,6 +30,17 @@ def test_main_bootstraps_before_running_app(mock_bootstrap: Mock, mock_app: Mock
     assert call_order.mock_calls == [call.bootstrap(), call.app()]
 
 
+@patch("data_designer.cli.main.app")
+@patch("data_designer.cli.main.ensure_cli_default_model_settings")
+def test_main_bootstraps_for_plugin_commands(mock_bootstrap: Mock, mock_app: Mock) -> None:
+    """The plugin command still runs through CLI default setup before Typer dispatch."""
+    with patch("sys.argv", ["data-designer", "plugin", "list"]):
+        main()
+
+    mock_bootstrap.assert_called_once_with()
+    mock_app.assert_called_once_with()
+
+
 @patch("data_designer.cli.main.app")
 @patch("data_designer.cli.main.ensure_cli_default_model_settings")
 def test_main_skips_bootstrap_for_version(mock_bootstrap: Mock, mock_app: Mock) -> None:
@@ -167,3 +178,78 @@ def test_app_dispatches_lazy_create_command(mock_controller_cls: Mock) -> None:
         resume=ResumeMode.NEVER,
         output_format=None,
     )
+
+
+@patch("data_designer.cli.commands.plugin.PluginCatalogController")
+def test_app_dispatches_lazy_plugin_list_command(mock_controller_cls: Mock) -> None:
+    """The plugin group lazily resolves command callbacks without loading a catalog."""
+    mock_controller = Mock()
+    mock_controller_cls.return_value = mock_controller
+
+    result = runner.invoke(
+        app,
+        ["plugin", "--catalog", "local", "list", "--refresh", "--include-incompatible"],
+    )
+
+    assert result.exit_code == 0
+    mock_controller.run_list.assert_called_once_with(
+        catalog_alias="local",
+        refresh=True,
+        include_incompatible=True,
+    )
+
+
+@patch("data_designer.cli.commands.plugin.PluginCatalogController")
+def test_app_dispatches_lazy_plugin_catalog_list_command(mock_controller_cls: Mock) -> None:
+    """Nested plugin catalog commands resolve through the lazy command group."""
+    mock_controller = Mock()
+    mock_controller_cls.return_value = mock_controller
+
+    result = runner.invoke(app, ["plugin", "catalog", "list"])
+
+    assert result.exit_code == 0
+    mock_controller.run_catalog_list.assert_called_once_with()
+
+
+def test_app_help_keeps_config_and_plugin_commands_reachable() -> None:
+    config_result = runner.invoke(app, ["config", "--help"])
+    plugin_result = runner.invoke(app, ["plugin", "--help"])
+
+    assert config_result.exit_code == 0
+    assert "providers" in config_result.output
+    assert "models" in config_result.output
+    assert plugin_result.exit_code == 0
+    assert "list" in plugin_result.output
+    assert "install" in plugin_result.output
+    assert "uninstall" in plugin_result.output
+    assert "catalog" in plugin_result.output
+
+
+def test_no_args_help_exits_successfully_for_lazy_groups() -> None:
+    root_result = runner.invoke(app, [])
+    plugin_result = runner.invoke(app, ["plugin"])
+    plugin_catalog_result = runner.invoke(app, ["plugin", "catalog"])
+
+    assert root_result.exit_code == 0
+    assert "Data Designer CLI" in root_result.output
+    assert "plugin" in root_result.output
+    assert plugin_result.exit_code == 0
+    assert "Discover, install, and uninstall" in plugin_result.output
+    assert "catalog" in plugin_result.output
+    assert plugin_catalog_result.exit_code == 0
+    assert "Manage plugin catalog aliases" in plugin_catalog_result.output
+    assert "add" in plugin_catalog_result.output
+
+
+def test_app_does_not_expose_legacy_plugins_command() -> None:
+    result = runner.invoke(app, ["plugins", "--help"])
+
+    assert result.exit_code != 0
+    assert "No such command" in result.output
+
+
+def test_plugin_does_not_expose_legacy_catalogs_command() -> None:
+    result = runner.invoke(app, ["plugin", "catalogs", "--help"])
+
+    assert result.exit_code != 0
+    assert "No such command" in result.output
diff --git a/uv.lock b/uv.lock
index 1ba121034..e8048684b 100644
--- a/uv.lock
+++ b/uv.lock
@@ -3014,11 +3014,11 @@ wheels = [
 
 [[package]]
 name = "packaging"
-version = "26.0"
+version = "26.2"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/65/ee/299d360cdc32edc7d2cf530f3accf79c4fca01e96ffc950d8a52213bd8e4/packaging-26.0.tar.gz", hash = "sha256:00243ae351a257117b6a241061796684b084ed1c516a08c48a3f7e147a9d80b4", size = 143416, upload-time = "2026-01-21T20:50:39.064Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/d7/f1/e7a6dd94a8d4a5626c03e4e99c87f241ba9e350cd9e6d75123f992427270/packaging-26.2.tar.gz", hash = "sha256:ff452ff5a3e828ce110190feff1178bb1f2ea2281fa2075aadb987c2fb221661", size = 228134, upload-time = "2026-04-24T20:15:23.917Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/b7/b9/c538f279a4e237a006a2c98387d081e9eb060d203d8ed34467cc0f0b9b53/packaging-26.0-py3-none-any.whl", hash = "sha256:b36f1fef9334a5588b4166f8bcd26a14e521f2b55e6b9de3aaa80d3ff7a37529", size = 74366, upload-time = "2026-01-21T20:50:37.788Z" },
+    { url = "https://files.pythonhosted.org/packages/df/b2/87e62e8c3e2f4b32e5fe99e0b86d576da1312593b39f47d8ceef365e95ed/packaging-26.2-py3-none-any.whl", hash = "sha256:5fc45236b9446107ff2415ce77c807cee2862cb6fac22b8a73826d0693b0980e", size = 100195, upload-time = "2026-04-24T20:15:22.081Z" },
 ]
 
 [[package]]