@phoevos phoevos commented Nov 18, 2025

No description provided.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Revamp configuration logic using Pydantic models for better validation
and maintainability and extend settings with options related to model
deployment through the gateway.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
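For readers unfamiliar with the approach, the following is a minimal sketch of what Pydantic-validated configuration can look like. It is not the gateway's actual schema: the class names, field names, and default values are all hypothetical and chosen only for illustration.

```python
from enum import Enum

from pydantic import BaseModel, Field, ValidationError


class DeploymentType(str, Enum):
    # The string values are assumptions made for this sketch.
    AUTO = "auto"
    MANUAL = "manual"
    STATIC = "static"


class ModelDeploymentSettings(BaseModel):
    # Hypothetical fields; the real settings may be named differently.
    deployment_type: DeploymentType = DeploymentType.AUTO
    idle_ttl_seconds: int = Field(default=3600, gt=0)


class GatewayConfig(BaseModel):
    # Nested models give the configuration a hierarchical structure.
    models: ModelDeploymentSettings = Field(default_factory=ModelDeploymentSettings)


def load_config(raw: dict) -> GatewayConfig:
    """Validate a plain dict (e.g. parsed from a JSON file) into a typed config."""
    return GatewayConfig(**raw)


if __name__ == "__main__":
    try:
        load_config({"models": {"idle_ttl_seconds": -5}})
    except ValidationError as exc:
        # Invalid values are rejected at load time rather than at first use.
        print(f"rejected invalid config: {len(exc.errors())} error(s)")
```

The benefit over a plain dictionary is that bad values fail fast with a descriptive error, and the rest of the code can rely on typed attributes.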
Introduce model records to keep track of deployed models, their
deployment type, and idle TTLs.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Purge containers based on deployment type, implementing the following
removal strategy for each one:
- Static: Never remove
- Manual: Remove if the TTL specified during manual deployment has been
  exceeded
- Auto: Remove if the model has been idle for longer than the TTL
  specified during auto-deployment, according to the database model
  record

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
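The removal strategy described in this commit can be sketched roughly as follows. This is an illustration only, not the ripper's code: `ModelRecord`, its fields, and `should_purge` are hypothetical names, and the sketch assumes that the manual TTL is measured from deployment time and that a missing TTL means "keep".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class DeploymentType(Enum):
    STATIC = "static"
    MANUAL = "manual"
    AUTO = "auto"


@dataclass
class ModelRecord:
    # Hypothetical stand-in for the database model record.
    name: str
    deployment_type: DeploymentType
    ttl_seconds: Optional[int]
    deployed_at: datetime
    last_used_at: datetime


def should_purge(record: ModelRecord, now: datetime) -> bool:
    """Decide whether a deployed model's container should be removed."""
    if record.deployment_type is DeploymentType.STATIC:
        return False  # Static: never remove
    if record.ttl_seconds is None:
        return False  # Assumption: no TTL configured means keep the container
    ttl = timedelta(seconds=record.ttl_seconds)
    if record.deployment_type is DeploymentType.MANUAL:
        # Manual: fixed TTL, assumed here to count from deployment time
        return now - record.deployed_at > ttl
    # Auto: idle TTL, counted from the last recorded use
    return now - record.last_used_at > ttl


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    idle = ModelRecord("demo", DeploymentType.AUTO, 60, now, now - timedelta(minutes=5))
    print(should_purge(idle, now))
```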
Downgrade mlflow to avoid compatibility issues with CMS MLflow server.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Rename internal services to avoid clashes with services in the CMS
network. If we move to Docker Swarm or Kubernetes in the future, we
should be able to use FQDNs to avoid conflicts, but for now this appears
to be the simplest solution:
* minio -> object-store
* postgres -> db
* rabbitmq -> queue

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Update integration tests to work with the new config.
Tweak config handling and service discovery to fix integration tests:
* Explicitly pass config.json path when loading config
* Ensure the API can work with IPs as model identifiers since we're
  forced to use them in the integration tests environment (i.e.
  accessing containers from the localhost)

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
The integration tests update the config.json file when running, adding
the MLflow and MinIO connection details. These are specific to the local
testing environment and likely the given run, and therefore don't need
to be committed to the repository. This commit adds the config.json
file to the .gitignore to prevent accidental commits in the future.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
@phoevos phoevos force-pushed the feature-phoevos-auto-deploy branch from 8736059 to 0eb9b94 Compare December 15, 2025 20:15
Provide an admin API to expose on-demand model configuration management,
including creation, updating, retrieval, listing, and soft-deletion.
Configurations are now stored in the database, with versioning support.
The Python client is also updated to support these operations.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
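To make "versioning" and "soft-deletion" concrete, here is a small in-memory sketch of the idea. It does not reflect the actual database schema or admin API: `OnDemandConfigStore`, `ConfigVersion`, and every method name below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConfigVersion:
    # Hypothetical record shape; the real table columns may differ.
    model_name: str
    version: int
    settings: dict
    deleted: bool = False


class OnDemandConfigStore:
    """In-memory stand-in for a versioned, soft-deletable config table."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ConfigVersion]] = {}

    def save(self, model_name: str, settings: dict) -> ConfigVersion:
        # Creating and updating both append a new version; nothing is overwritten.
        history = self._versions.setdefault(model_name, [])
        entry = ConfigVersion(model_name, len(history) + 1, dict(settings))
        history.append(entry)
        return entry

    def get(self, model_name: str) -> Optional[ConfigVersion]:
        # The latest version is the active one unless it was soft-deleted.
        history = self._versions.get(model_name, [])
        if not history or history[-1].deleted:
            return None
        return history[-1]

    def list_active(self) -> List[ConfigVersion]:
        active = (self.get(name) for name in self._versions)
        return [entry for entry in active if entry is not None]

    def soft_delete(self, model_name: str) -> bool:
        # Mark the active version as deleted but keep the row for auditing.
        current = self.get(model_name)
        if current is None:
            return False
        current.deleted = True
        return True

    def history(self, model_name: str) -> List[ConfigVersion]:
        return list(self._versions.get(model_name, []))
```

Keeping old versions and flagging deletions instead of dropping rows means a configuration change can be audited or rolled back later.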
@phoevos phoevos force-pushed the feature-phoevos-auto-deploy branch from 0eb9b94 to fec32bf Compare December 15, 2025 20:31
@phoevos phoevos marked this pull request as ready for review December 16, 2025 09:25
Copilot AI review requested due to automatic review settings December 16, 2025 09:25

Copilot AI left a comment

Pull request overview

This pull request introduces dynamic model deployment capabilities to the CogStack Model Gateway, enabling automatic deployment of models when they're requested. The changes include:

  • A comprehensive refactoring of the configuration system from a simple dictionary-based approach to a Pydantic-validated schema
  • Introduction of three deployment types (AUTO, MANUAL, STATIC) with different lifecycle management strategies
  • New database models for tracking deployed models and on-demand model configurations
  • Auto-deployment functionality that deploys a model on demand when a request targets it
  • Enhanced ripper service to support multiple TTL strategies (fixed TTL for manual deployments, idle TTL for auto deployments)

Key Changes

  • Refactored configuration system with Pydantic validation and hierarchical structure
  • Added model lifecycle management with database tracking for usage and idle time
  • Implemented auto-deployment with health checking and concurrent deployment protection
  • Updated all services (gateway, scheduler, ripper) to use the new config and model management systems
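As an illustration of "health checking and concurrent deployment protection", the sketch below serialises deployments per model with an `asyncio.Lock` and polls a health check until a timeout. It is an assumption-laden sketch, not the code in `auto_deploy.py`: `AutoDeployer`, the callback parameters, and the timeout values are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict

AsyncCheck = Callable[[str], Awaitable[bool]]


class AutoDeployer:
    """Deploy a model at most once even when many requests arrive together."""

    def __init__(
        self,
        is_running: AsyncCheck,
        deploy: Callable[[str], Awaitable[None]],
        is_healthy: AsyncCheck,
        health_timeout: float = 5.0,
        poll_interval: float = 0.01,
    ) -> None:
        self._is_running = is_running
        self._deploy = deploy
        self._is_healthy = is_healthy
        self._health_timeout = health_timeout
        self._poll_interval = poll_interval
        self._locks: Dict[str, asyncio.Lock] = {}

    async def _wait_until_healthy(self, model_name: str) -> bool:
        loop = asyncio.get_running_loop()
        deadline = loop.time() + self._health_timeout
        while loop.time() < deadline:
            if await self._is_healthy(model_name):
                return True
            await asyncio.sleep(self._poll_interval)
        return False

    async def ensure_deployed(self, model_name: str) -> bool:
        # One lock per model: concurrent requests for the same model queue up here.
        lock = self._locks.setdefault(model_name, asyncio.Lock())
        async with lock:
            # Re-check under the lock; an earlier waiter may have deployed already.
            if await self._is_running(model_name):
                return True
            await self._deploy(model_name)
            return await self._wait_until_healthy(model_name)
```

The re-check inside the lock is what prevents duplicate containers: every request after the first finds the model already running and returns immediately.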

Reviewed changes

Copilot reviewed 48 out of 51 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/unit/ripper/test_main.py | Comprehensive test coverage for new multi-TTL ripper logic with different deployment types |
| tests/unit/common/test_models.py | New file with extensive tests for ModelManager and OnDemandModelConfig |
| tests/unit/common/test_config.py | Updated tests for Pydantic-based config system |
| tests/unit/common/test_tasks.py | Fixed logging import path |
| tests/unit/client/test_client.py | Updated client tests for renamed timeout parameters and new API endpoints |
| tests/integration/test_api.py | Added comprehensive integration tests for deployment and on-demand config APIs |
| tests/integration/utils.py | Added utilities for managing deployed containers in tests |
| tests/conftest.py | Added shared db_manager fixture |
| cogstack_model_gateway/common/config/ | Complete config system refactor with Pydantic models |
| cogstack_model_gateway/common/models.py | New model management system with database tracking |
| cogstack_model_gateway/common/containers.py | Enhanced container discovery and management utilities |
| cogstack_model_gateway/common/tracking.py | Extended tracking client with model type resolution |
| cogstack_model_gateway/gateway/core/auto_deploy.py | New auto-deployment implementation with health checks |
| cogstack_model_gateway/gateway/routers/ | Updated model and admin API routers |
| cogstack_model_gateway/ripper/main.py | Enhanced ripper with multi-TTL support |
| cogstack_model_gateway/scheduler/main.py | Updated to use new config system |
| cogstack_model_gateway/migrations/versions/ | New database migrations for model tracking |
| docker-compose.yaml | Service name updates and config file volume mounts |
| config.json | New JSON-based configuration file |

mock_client_instance.request.return_value = mock_response

async with GatewayClient(base_url="http://test-gateway.com") as client:
    config = await client.update_on_demand_config(
Copilot AI Dec 16, 2025

Variable config is not used.

},
)

initial_count = count_deployed_model_containers()
Copilot AI Dec 16, 2025

Variable initial_count is not used.

assert idle_seconds < 1001.0

# Test is_model_idle returns False for recently used model
model = model_manager.create_model(
Copilot AI Dec 16, 2025

This assignment to 'model' is unnecessary as it is redefined before this value is used.

task_uuid = response.json()["uuid"]

# Completion should take at least a few seconds for container startup
task = wait_for_task_completion(task_uuid, tm, expected_status=Status.SUCCEEDED)
Copilot AI Dec 16, 2025

This assignment to 'task' is unnecessary as it is redefined before this value is used.

# for 'autogenerate' support
# from myapp import mymodel
# target_metadata = mymodel.Base.metadata
from cogstack_model_gateway.common.models import Model # noqa: E402, F401
Copilot AI Dec 16, 2025

Import of 'Model' is not used.

Suggested change (remove this line):
- from cogstack_model_gateway.common.models import Model # noqa: E402, F401

@phoevos phoevos merged commit 35de853 into main Dec 16, 2025
12 checks passed
@phoevos phoevos deleted the feature-phoevos-auto-deploy branch December 16, 2025 09:40