Dynamically deploy models to match request targets #1

phoevos · 2025-11-18T17:34:20Z

No description provided.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

Revamp configuration logic using Pydantic models for better validation and maintainability and extend settings with options related to model deployment through the gateway. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

Introduce model records to keep track of deployed models, their deployment type, and idle TTLs. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

Purge containers based on deployment type, implementing the following removal strategy for each one:
- Static: Never remove
- Manual: Remove if the TTL specified during manual deployment has been exceeded
- Auto: Remove if the model has been idle for longer than the TTL specified during auto-deployment, according to the database model record

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
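The removal strategy above can be sketched as a single decision function. This is a minimal illustration, not the ripper's actual code: `ModelRecord`, its field names, and `should_remove` are hypothetical, and measuring the manual TTL from the deployment timestamp is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class DeploymentType(str, Enum):
    STATIC = "static"
    MANUAL = "manual"
    AUTO = "auto"


@dataclass
class ModelRecord:
    # Hypothetical record; field names are illustrative, not the gateway's schema.
    name: str
    deployment_type: DeploymentType
    ttl_seconds: int
    deployed_at: datetime
    last_used_at: datetime


def should_remove(record: ModelRecord, now: datetime) -> bool:
    """Decide whether the container backing `record` should be purged."""
    if record.deployment_type is DeploymentType.STATIC:
        return False  # static deployments are never removed
    ttl = timedelta(seconds=record.ttl_seconds)
    if record.deployment_type is DeploymentType.MANUAL:
        # Assumption: the fixed TTL is counted from the deployment time.
        return now - record.deployed_at > ttl
    # Auto deployments: idle TTL counted from the last recorded use.
    return now - record.last_used_at > ttl
```

Keeping the decision in a pure function that takes `now` as an argument makes each branch easy to unit test without touching Docker or the database.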

Downgrade mlflow to avoid compatibility issues with CMS MLflow server. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

Rename internal services to avoid clashes with services in the CMS network. If we move to Docker Swarm or Kubernetes in the future, we should be able to use FQDNs to avoid conflicts, but for now this appears to be the simplest solution:
* minio -> object-store
* postgres -> db
* rabbitmq -> queue

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

Update integration tests to work with the new config. Tweak config handling and service discovery to fix integration tests:
* Explicitly pass the config.json path when loading config
* Ensure the API can work with IPs as model identifiers, since we're forced to use them in the integration test environment (i.e. when accessing containers from localhost)

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
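One way to accept both container names and IP addresses as model identifiers is to check the identifier with the standard `ipaddress` module before building a URL. This is only a sketch: `is_ip_identifier`, `model_base_url`, and the default port `8000` are hypothetical and do not come from the PR.

```python
import ipaddress


def is_ip_identifier(identifier: str) -> bool:
    """Return True if `identifier` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(identifier)
    except ValueError:
        return False
    return True


def model_base_url(identifier: str, port: int = 8000) -> str:
    """Build a base URL from a container name or an IP address.

    The port is an illustrative default, not the gateway's setting.
    """
    # IPv6 literals must be wrapped in brackets inside a URL.
    if is_ip_identifier(identifier) and ":" in identifier:
        host = f"[{identifier}]"
    else:
        host = identifier
    return f"http://{host}:{port}"
```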

The integration tests update the config.json file when running, adding the MLflow and MinIO connection details. These are specific to the local testing environment and likely the given run, and therefore don't need to be committed to the repository. This commit adds the config.json file to the .gitignore to prevent accidental commits in the future. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

Provide an admin API to expose on-demand model configuration management, including creation, updating, retrieval, listing, and soft-deletion. Configurations are now stored in the database, with versioning support. The Python client is also updated to support these operations. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
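To illustrate what versioning combined with soft-deletion can look like, here is a minimal in-memory sketch. It is not the gateway's database model: `ConfigVersion`, `ConfigStore`, and all method names are hypothetical, and the choice that every update appends a new version while deletion only flags the latest one is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConfigVersion:
    model_name: str
    version: int
    settings: dict
    deleted: bool = False


@dataclass
class ConfigStore:
    # Maps model name -> ordered list of versions (oldest first).
    _versions: dict = field(default_factory=dict)

    def upsert(self, model_name: str, settings: dict) -> ConfigVersion:
        """Create or update a config by appending a new version."""
        history = self._versions.setdefault(model_name, [])
        entry = ConfigVersion(model_name, len(history) + 1, dict(settings))
        history.append(entry)
        return entry

    def get(self, model_name: str) -> Optional[ConfigVersion]:
        """Return the latest version unless it has been soft-deleted."""
        history = self._versions.get(model_name, [])
        if not history or history[-1].deleted:
            return None
        return history[-1]

    def soft_delete(self, model_name: str) -> bool:
        """Flag the latest version as deleted; history is kept."""
        current = self.get(model_name)
        if current is None:
            return False
        current.deleted = True
        return True

    def list_active(self) -> list:
        """List the latest version of every config that is not deleted."""
        return [h[-1] for h in self._versions.values() if h and not h[-1].deleted]
```

Because deletion only sets a flag, earlier versions stay available for auditing, and a later `upsert` simply adds a fresh active version.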

Copilot

Pull request overview

This pull request introduces dynamic model deployment capabilities to the CogStack Model Gateway, enabling automatic deployment of models when they're requested. The changes include:

- A comprehensive refactoring of the configuration system from a simple dictionary-based approach to a Pydantic-validated schema
- Introduction of three deployment types (AUTO, MANUAL, STATIC) with different lifecycle management strategies
- New database models for tracking deployed models and on-demand model configurations
- Auto-deployment functionality that can deploy models on-demand when requests target them
- Enhanced ripper service to support multiple TTL strategies (fixed TTL for manual deployments, idle TTL for auto deployments)

Key Changes

- Refactored configuration system with Pydantic validation and hierarchical structure
- Added model lifecycle management with database tracking for usage and idle time
- Implemented auto-deployment with health checking and concurrent deployment protection
- Updated all services (gateway, scheduler, ripper) to use the new config and model management systems
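The combination of health checking and concurrent deployment protection mentioned above could be sketched as follows. This is an illustration under assumptions, not the code in `auto_deploy.py`: `AutoDeployer`, its parameters, and the injected `deploy` and `is_healthy` coroutines are hypothetical, and the per-model `asyncio.Lock` only protects a single gateway process.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable


class AutoDeployer:
    def __init__(
        self,
        deploy: Callable[[str], Awaitable[None]],
        is_healthy: Callable[[str], Awaitable[bool]],
        poll_interval: float = 0.01,
        max_polls: int = 50,
    ) -> None:
        self._deploy = deploy
        self._is_healthy = is_healthy
        self._poll_interval = poll_interval
        self._max_polls = max_polls
        # One lock per model name, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)

    async def ensure_deployed(self, model_name: str) -> bool:
        """Deploy `model_name` at most once, then wait until it is healthy."""
        async with self._locks[model_name]:
            # A concurrent request may have finished the deployment already.
            if await self._is_healthy(model_name):
                return True
            await self._deploy(model_name)
            for _ in range(self._max_polls):
                if await self._is_healthy(model_name):
                    return True
                await asyncio.sleep(self._poll_interval)
            return False
```

Requests that arrive while a deployment is in progress wait on the lock and then pass the health check without triggering a second deployment. A setup with several gateway replicas would need a shared lock instead, for example a database row.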

Reviewed changes

Copilot reviewed 48 out of 51 changed files in this pull request and generated 5 comments.

File	Description
tests/unit/ripper/test_main.py	Comprehensive test coverage for new multi-TTL ripper logic with different deployment types
tests/unit/common/test_models.py	New file with extensive tests for ModelManager and OnDemandModelConfig
tests/unit/common/test_config.py	Updated tests for Pydantic-based config system
tests/unit/common/test_tasks.py	Fixed logging import path
tests/unit/client/test_client.py	Updated client tests for renamed timeout parameters and new API endpoints
tests/integration/test_api.py	Added comprehensive integration tests for deployment and on-demand config APIs
tests/integration/utils.py	Added utilities for managing deployed containers in tests
tests/conftest.py	Added shared db_manager fixture
cogstack_model_gateway/common/config/	Complete config system refactor with Pydantic models
cogstack_model_gateway/common/models.py	New model management system with database tracking
cogstack_model_gateway/common/containers.py	Enhanced container discovery and management utilities
cogstack_model_gateway/common/tracking.py	Extended tracking client with model type resolution
cogstack_model_gateway/gateway/core/auto_deploy.py	New auto-deployment implementation with health checks
cogstack_model_gateway/gateway/routers/	Updated model and admin API routers
cogstack_model_gateway/ripper/main.py	Enhanced ripper with multi-TTL support
cogstack_model_gateway/scheduler/main.py	Updated to use new config system
cogstack_model_gateway/migrations/versions/	New database migrations for model tracking
docker-compose.yaml	Service name updates and config file volume mounts
config.json	New JSON-based configuration file

Copilot · 2025-12-16T09:33:15Z

tests/unit/client/test_client.py

+    mock_client_instance.request.return_value = mock_response
+
+    async with GatewayClient(base_url="http://test-gateway.com") as client:
+        config = await client.update_on_demand_config(


Variable config is not used.

Copilot · 2025-12-16T09:33:15Z

tests/integration/test_api.py

+        },
+    )
+
+    initial_count = count_deployed_model_containers()


Variable initial_count is not used.

Copilot · 2025-12-16T09:33:16Z

tests/unit/common/test_models.py

+    assert idle_seconds < 1001.0
+
+    # Test is_model_idle returns False for recently used model
+    model = model_manager.create_model(


This assignment to 'model' is unnecessary as it is redefined before this value is used.

Copilot · 2025-12-16T09:33:16Z

tests/integration/test_api.py

+    task_uuid = response.json()["uuid"]
+
+    # Completion should take at least a few seconds for container startup
+    task = wait_for_task_completion(task_uuid, tm, expected_status=Status.SUCCEEDED)


This assignment to 'task' is unnecessary as it is redefined before this value is used.

Copilot · 2025-12-16T09:33:16Z

cogstack_model_gateway/migrations/env.py

 # for 'autogenerate' support
 # from myapp import mymodel
 # target_metadata = mymodel.Base.metadata
+from cogstack_model_gateway.common.models import Model  # noqa: E402, F401


Import of 'Model' is not used.

Suggested change

from cogstack_model_gateway.common.models import Model # noqa: E402, F401

phoevos added 26 commits November 5, 2025 14:51

chore: Add .python-version to .gitignore

06d0d16

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

feat: Revamp configuration logic

a789fc5

Revamp configuration logic using Pydantic models for better validation and maintainability and extend settings with options related to model deployment through the gateway. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

db: Introduce model records

4cc5401

Introduce model records to keep track of deployed models, their deployment type, and idle TTLs. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

gw: Add usage tracking for manual model deployments

c6715fd

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

gw: Update routes to record model usage

a55f3e0

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

feat: Implement core auto-deploy functionality

95880a7

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

migrations: Create base revision with initial schema

6955672

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Mount config.json in docker-compose services

19132c3

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Add ripper to 'gateway' network

fe1e832

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Simplify model deployment function params

4052c14

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

chore: Pin mlflow to >=2.0.0,<3.0.0

4e1ed84

Downgrade mlflow to avoid compatibility issues with CMS MLflow server. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Towards a functional auto-deploy feature

d143fc9

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Allow ripper to remove stopped containers

c7185a2

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Add boto3 dependency for MLflow S3 support

9590e26

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

tmp: Bump CMS image to latest

6d61512

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Revamp tracking client config and init

cf7aca6

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Update CMS entrypoint command

a69048e

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

tracking: Add method to retrieve model type

9fd1fd5

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

gw: Update model API and corresponding client methods

ecc426e

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

fix: Get correct model type for model deployments

1f822a7

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

client: Make request timeout configurable

17aad67

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

client: Fix failing tests due to missing dependency

1995c08

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>

phoevos force-pushed the feature-phoevos-auto-deploy branch from 8736059 to 0eb9b94 Compare December 15, 2025 20:15

phoevos force-pushed the feature-phoevos-auto-deploy branch from 0eb9b94 to fec32bf Compare December 15, 2025 20:31

phoevos marked this pull request as ready for review December 16, 2025 09:25

Copilot AI review requested due to automatic review settings December 16, 2025 09:25

Copilot started reviewing on behalf of phoevos December 16, 2025 09:26

Copilot AI reviewed Dec 16, 2025

phoevos merged commit 35de853 into main Dec 16, 2025
12 checks passed

phoevos deleted the feature-phoevos-auto-deploy branch December 16, 2025 09:40
