Merged
27 commits
06d0d16
chore: Add .python-version to .gitignore
phoevos Nov 5, 2025
a789fc5
feat: Revamp configuration logic
phoevos Nov 12, 2025
4cc5401
db: Introduce model records
phoevos Nov 12, 2025
c6715fd
gw: Add usage tracking for manual model deployments
phoevos Nov 13, 2025
a55f3e0
gw: Update routes to record model usage
phoevos Nov 13, 2025
b58b0da
ripper: Purge containers based on deployment type
phoevos Nov 13, 2025
95880a7
feat: Implement core auto-deploy functionality
phoevos Nov 18, 2025
6955672
migrations: Create base revision with initial schema
phoevos Nov 19, 2025
19132c3
fix: Mount config.json in docker-compose services
phoevos Nov 19, 2025
fe1e832
fix: Add ripper to 'gateway' network
phoevos Nov 19, 2025
4052c14
fix: Simplify model deployment function params
phoevos Nov 19, 2025
4e1ed84
chore: Pin mlflow to >=2.0.0,<3.0.0
phoevos Nov 20, 2025
d143fc9
fix: Towards a functional auto-deploy feature
phoevos Nov 20, 2025
c7185a2
fix: Allow ripper to remove stopped containers
phoevos Nov 20, 2025
9590e26
fix: Add boto3 dependency for MLflow S3 support
phoevos Nov 20, 2025
6d61512
tmp: Bump CMS image to latest
phoevos Nov 20, 2025
cf7aca6
fix: Revamp tracking client config and init
phoevos Nov 27, 2025
236e6f2
fix: Rename internal store Docker services
phoevos Nov 28, 2025
a69048e
fix: Update CMS entrypoint command
phoevos Dec 1, 2025
9fd1fd5
tracking: Add method to retrieve model type
phoevos Dec 2, 2025
ecc426e
gw: Update model API and corresponding client methods
phoevos Dec 2, 2025
1f822a7
fix: Get correct model type for model deployments
phoevos Dec 2, 2025
17aad67
client: Make request timeout configurable
phoevos Dec 3, 2025
1995c08
client: Fix failing tests due to missing dependency
phoevos Dec 3, 2025
5157bd3
tests: Tweak config/API to work with integration test env
phoevos Dec 5, 2025
107b34f
chore: Ensure test config.json is ignored in the future
phoevos Dec 5, 2025
fec32bf
api: Manage on-demand model config lifecycle
phoevos Dec 10, 2025
5 changes: 4 additions & 1 deletion .gitignore
@@ -85,7 +85,7 @@ ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
-# .python-version
+.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
@@ -170,3 +170,6 @@ cython_debug/

# Mac
.DS_Store

+# Tests
+tests/integration/assets/config.json
48 changes: 24 additions & 24 deletions README.md
@@ -41,10 +41,6 @@ through environment variables. Before deploying the Gateway, make sure to set th
either by exporting them in the shell or by creating a `.env` file in the root directory of the
project. The following variables are required:

-* `MLFLOW_TRACKING_URI`: The URI for the MLflow tracking server.
-* `CMS_PROJECT_NAME`: The name of the Docker project where the CogStack ModelServe stack is running.
-* `CMS_HOST_URL` (optional): Useful when running CogStack ModelServe instances behind a proxy. If
-omitted, the Gateway will attempt to reach the services directly over the internal Docker network.
* `CMG_SCHEDULER_MAX_CONCURRENT_TASKS`: The max number of concurrent tasks the scheduler can handle.
* `CMG_DB_USER`: The username for the PostgreSQL database.
* `CMG_DB_PASSWORD`: The password for the PostgreSQL database.
@@ -65,37 +61,29 @@ not allowed in MinIO bucket names). The configuration should be saved in a `.env
directory of the project before running Docker Compose (or sourced directly in the shell):

```shell
-CMS_PROJECT_NAME=<cms-docker-compose-project-name> # e.g. cms
-
-# (optional) Useful when running CMS behind a proxy
-CMS_HOST_URL=https://<proxy-docker-service-name>/cms # e.g. https://proxy/cms

CMG_SCHEDULER_MAX_CONCURRENT_TASKS=1

# Postgres
CMG_DB_USER=admin
CMG_DB_PASSWORD=admin
-CMG_DB_HOST=postgres
+CMG_DB_HOST=db
CMG_DB_PORT=5432
CMG_DB_NAME=cmg_tasks

# RabbitMQ
CMG_QUEUE_USER=admin
CMG_QUEUE_PASSWORD=admin
-CMG_QUEUE_HOST=rabbitmq
+CMG_QUEUE_HOST=queue
CMG_QUEUE_PORT=5672
CMG_QUEUE_NAME=cmg_tasks

# MinIO
CMG_OBJECT_STORE_ACCESS_KEY=admin
CMG_OBJECT_STORE_SECRET_KEY=admin123
-CMG_OBJECT_STORE_HOST=minio
+CMG_OBJECT_STORE_HOST=object-store
CMG_OBJECT_STORE_PORT=9000
CMG_OBJECT_STORE_BUCKET_TASKS=cmg-tasks
CMG_OBJECT_STORE_BUCKET_RESULTS=cmg-results

-# MLflow (use container IP when running locally)
-MLFLOW_TRACKING_URI=http://<mlflow-docker-service-name>:<mlflow-port> # e.g. http://mlflow-ui:5000
```
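
As a rough illustration of how the `CMG_*` variables in the block above fit together, the sketch below composes connection URLs from them. The function name `build_service_urls`, the returned keys, and the URL schemes (`postgresql://`, `amqp://`, `http://`) are assumptions made for this example; the Gateway's own configuration code may assemble these differently.

```python
import os
from urllib.parse import quote


def build_service_urls(env=None):
    """Compose connection URLs from CMG_* variables (hypothetical helper)."""
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if env is None else env

    def auth(user_key, password_key):
        # Percent-encode credentials so characters such as '@' or ':' stay valid in a URL.
        return f"{quote(env[user_key], safe='')}:{quote(env[password_key], safe='')}"

    db_auth = auth("CMG_DB_USER", "CMG_DB_PASSWORD")
    queue_auth = auth("CMG_QUEUE_USER", "CMG_QUEUE_PASSWORD")
    return {
        # Assumed schemes; only the variable names come from the README.
        "db": f"postgresql://{db_auth}@{env['CMG_DB_HOST']}:{env['CMG_DB_PORT']}/{env['CMG_DB_NAME']}",
        "queue": f"amqp://{queue_auth}@{env['CMG_QUEUE_HOST']}:{env['CMG_QUEUE_PORT']}/",
        "object_store": f"http://{env['CMG_OBJECT_STORE_HOST']}:{env['CMG_OBJECT_STORE_PORT']}",
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the helper easy to check against a `.env` file before starting Docker Compose.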

To install the CogStack Model Gateway, clone the repository and run `docker compose` inside the root
@@ -127,15 +115,27 @@ monitoring the state of submitted tasks. The following endpoints are available:

* **Model Servers**: Interact with CMS model servers.

-* `GET /models`: List all available model servers (i.e. Docker containers with the
-"org.cogstack.model-serve" label and "com.docker.compose.project" set to `$CMS_PROJECT_NAME`).
+* `GET /models`: List all available model servers, returning both running containers and on-demand
+models that can be auto-deployed.

+* **Response**: Dictionary with `running` and `on_demand` keys each containing a list of models.
* **Query Parameters**:
-* `verbose (bool)`: Include model metadata from the tracking server (if available).
+* `verbose (bool, default=false)`: When false, returns minimal info (name, uri, is_running).
+When true, includes description, model_type, deployment_type, idle_ttl, resources, tracking
+metadata, and runtime info (for running models).

+* `GET /models/{model_name}`: Get information about a specific model (running or on-demand)
+without triggering auto-deployment.

+* **Query Parameters**:
+* `verbose (bool, default=false)`: When false, returns minimal info (name, uri, is_running).
+When true, includes description, model_type, deployment_type, idle_ttl, resources, tracking
+metadata, and runtime info (for running models).

+* `GET /models/{model_name}/info`: Get detailed information about a running model server
+(equivalent to the CMS `/info` endpoint). May trigger auto-deployment for on-demand models.

-* `GET /models/{model_server_name}/info`: Get information about a specific model (equivalent to
-the `/info` CMS endpoint).
-* `POST /models/{model_server_name}`: Deploy a new model server from a previously trained model.
+* `POST /models/{model_name}`: Deploy a new model server from a previously trained model.

* **Body**:
* `tracking_id (str)`: The tracking ID of the run that generated the model to serve (e.g.
@@ -144,9 +144,9 @@ monitoring the state of submitted tasks. The following endpoints are available:
* `ttl (int, default=86400)`: The deployed model will be deleted after TTL seconds (defaults
to 1 day). Set -1 as the TTL value to protect the model from being deleted.

-* `POST /models/{model_server_name}/tasks/{task_name}`: Execute a task on the specified model
-server, providing any query parameters or request body required (follows the CMS API, striving
-to support the same endpoints).
+* `POST /models/{model_name}/tasks/{task_name}`: Execute a task on the specified model server,
+providing any query parameters or request body required (follows the CMS API, striving to
+support the same endpoints).

* **Tasks**: Monitor the state of submitted tasks.

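The model endpoints described above could be exercised from a client roughly as follows. This is a minimal sketch, not the project's client: `GATEWAY_URL` and the helper names are hypothetical, and the deployment body only covers the `tracking_id` and `ttl` fields visible in this diff (other fields may exist in the lines the diff does not show). The response shape (`running` and `on_demand` lists whose items carry a `name`) follows the README text.

```python
from urllib.parse import quote, urlencode

GATEWAY_URL = "http://localhost:8888"  # assumption: hypothetical Gateway address


def models_url(model_name=None, verbose=False, base_url=GATEWAY_URL):
    """Build the URL for GET /models or GET /models/{model_name}."""
    path = "/models" if model_name is None else f"/models/{quote(model_name, safe='')}"
    # The README documents `verbose` as a boolean query parameter defaulting to false.
    query = urlencode({"verbose": str(verbose).lower()})
    return f"{base_url}{path}?{query}"


def deployment_body(tracking_id, ttl=86400):
    """Build a body for POST /models/{model_name}; ttl=-1 protects the model from deletion."""
    return {"tracking_id": tracking_id, "ttl": ttl}


def model_names(payload):
    """Split a GET /models response into running and on-demand model names."""
    return {
        "running": [model["name"] for model in payload.get("running", [])],
        "on_demand": [model["name"] for model in payload.get("on_demand", [])],
    }
```

Keeping URL and body construction separate from the HTTP call means the same helpers work with any HTTP library and can be checked without a running Gateway.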