diff --git a/deploy/configuration-as-code/addons-porter-yaml.mdx b/deploy/configuration-as-code/addons-porter-yaml.mdx index 1704f96..3a20f48 100644 --- a/deploy/configuration-as-code/addons-porter-yaml.mdx +++ b/deploy/configuration-as-code/addons-porter-yaml.mdx @@ -50,7 +50,7 @@ apps: port: 80 sleep: false private: true - envGroups: # Environment groups can be used to inject environment variables from the addons into the service. + envGroups: - cache - db image: @@ -92,7 +92,7 @@ addons: config: name: db masterUserPassword: password - allocatedStorage: 2 # Persistent storage size in GB. Cannot be changed after creation. + allocatedStorage: 2 cpuCores: 0.1 ramMegabytes: 110 ``` diff --git a/deploy/configuration-as-code/overview.mdx b/deploy/configuration-as-code/overview.mdx index 29d4009..74b0141 100644 --- a/deploy/configuration-as-code/overview.mdx +++ b/deploy/configuration-as-code/overview.mdx @@ -1,65 +1,176 @@ --- title: 'Overview' +sidebarTitle: 'Overview' --- -## Getting Started with `porter.yaml` +## What is porter.yaml? -Writing a `porter.yaml` for your app is a great way to maintain a single source of truth on how your app should be built and deployed. Though this file is not required, it can reduce the time it takes to get started with a Porter app. -Once you start creating your app in Porter and select a Github repository, Porter will automatically read your `porter.yaml` to prepopulate settings for your app's services. +`porter.yaml` is a configuration file that defines how your application should be built and deployed on Porter. It serves as a single source of truth for your application's infrastructure, enabling version-controlled, repeatable deployments. -### Example `porter.yaml` + +Think of `porter.yaml` like a `Dockerfile` for your entire application stack - it declares your services, their resources, environment variables, and deployment settings in one place. 
+ -The following is an example of a v2 `porter.yaml` file, which is the latest version of the spec. This example covers many of the available fields, but not all of them. For a full list of configurable options, see the [full reference](/deploy/configuration-as-code/reference). +## When to Use Configuration-as-Code + +### CI/CD Pipelines + +Deploy your application automatically on every push using `porter apply`: + +```bash +porter apply -f porter.yaml +``` + +### Version-Controlled Infrastructure + +Track infrastructure changes alongside your code. Every deployment configuration change goes through code review. + +### Preview Environments + +Spin up isolated environments for pull requests with consistent configuration: + +```bash +porter apply -f porter.yaml --preview +``` + +## How It Works + + + + Create a `porter.yaml` file in your repository that describes your application's services, resources, and settings. + + + The Porter CLI reads your configuration and sends it to the Porter API. + + + If a `build` section is defined, Porter builds and pushes your container image. + + + Porter deploys or updates your services according to the configuration. + + + +## Getting Started + +### 1. Export Existing Configuration + +If you already have an app deployed on Porter, export its current configuration: + +```bash +porter app yaml my-app > porter.yaml +``` + +### 2. Create from Scratch + +Start with a minimal configuration: ```yaml version: v2 name: my-app services: + - name: web + type: web + run: npm start + port: 3000 + cpuCores: 0.5 + ramMegabytes: 512 + +build: + method: docker + context: . + dockerfile: ./Dockerfile +``` + +### 3. Deploy + +Apply the configuration to deploy your app: + +```bash +porter apply -f porter.yaml +``` + +## Example Configuration + +This example demonstrates common configuration patterns: + +```yaml +version: v2 +name: my-app + +# Build configuration (or use 'image' for pre-built images) +build: + method: docker + context: . 
+ dockerfile: ./Dockerfile + +# Environment variables +env: + NODE_ENV: production + LOG_LEVEL: info + +# Attach shared environment groups +envGroups: + - production-secrets + - shared-config + +# Pre-deploy job (runs before service deployment) +predeploy: + run: npm run migrate + +# Auto-rollback on failed deployments +autoRollback: + enabled: true + +# Service definitions +services: + # Web service (publicly accessible) - name: api type: web - run: node index.js + run: npm start port: 8080 - cpuCores: 0.1 - ramMegabytes: 256 + cpuCores: 0.5 + ramMegabytes: 512 autoscaling: enabled: true - minInstances: 1 - maxInstances: 3 - memoryThresholdPercent: 60 - cpuThresholdPercent: 60 - private: false - domains: - - name: test1.example.com + minInstances: 2 + maxInstances: 10 + cpuThresholdPercent: 70 healthCheck: enabled: true - httpPath: /healthz - - name: example-wkr + httpPath: /health + + # Worker service (background processing) + - name: worker type: worker - run: echo 'work' - port: 8081 - cpuCores: 0.1 + run: npm run worker + cpuCores: 0.25 ramMegabytes: 256 - instances: 1 - - name: example-job + instances: 2 + + # Scheduled job + - name: cleanup type: job - run: echo 'hello world' - allowConcurrent: true + run: npm run cleanup cpuCores: 0.1 - ramMegabytes: 256 - cron: '*/10 * * * *' + ramMegabytes: 128 + cron: "0 0 * * *" # Daily at midnight +``` -predeploy: - run: ls +## Configuration vs Dashboard -build: - method: docker - context: ./ - dockerfile: ./app/Dockerfile + +When you deploy using `porter apply`, the configuration in `porter.yaml` takes precedence. Changes made in the Porter dashboard may be overwritten on the next deployment. 
+ -env: - NODE_ENV: production +For consistent deployments, we recommend: +- Use `porter.yaml` as the source of truth for production +- Use the dashboard for experimentation and one-off changes +- Export dashboard changes with `porter app yaml` to update your configuration file -envGroups: -- production-env-group -``` +## Next Steps + +- [Full Reference](/deploy/configuration-as-code/reference) - Complete documentation of all configuration options +- [Service Configuration](/deploy/configuration-as-code/services/web-service) - Detailed service type documentation +- [porter apply Command](/standard/cli/command-reference/porter-apply) - CLI reference for deployments +- [Using Other CI Tools](/deploy/using-other-ci-tools) - Integrate with GitHub Actions, GitLab CI, etc. diff --git a/deploy/configuration-as-code/reference.mdx b/deploy/configuration-as-code/reference.mdx index ae814e4..b82846c 100644 --- a/deploy/configuration-as-code/reference.mdx +++ b/deploy/configuration-as-code/reference.mdx @@ -1,171 +1,544 @@ --- title: 'Reference for porter.yaml' +sidebarTitle: 'Reference' --- -The following is a full reference for all the fields that can be set in a `porter.yaml` file. - -- [version](#version) \- the version of the `porter.yaml` file. The below documentation is for `v2`. -- [name](#name) \- the name of the app. Must be 31 characters or less. Must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character. -- [build](#build) \- the build settings for the app. Only one of `build` or `image` can be set. - - **method** - the build method for the app. Can be one of `docker` or `pack`. - - **context** - the build context for the app. - - **dockerfile** - the path to the Dockerfile for the app, if the method is `docker`. - - **builder** - the builder image to use for the app, if the method is `pack`. - - **buildpacks** - the buildpacks to use for the app, if the method is `pack`. 
-- [image](#image) \- the image settings for the app. Only one of `build` or `image` can be set. - - **repository** - the image repository. - - **tag** - the image tag. -- [env](#env) \- the environment variables for the app. -- [envGroups] \- a list of environment groups that will be attached to the app. -- [predeploy](#predeploy) \- the pre-deploy job for the app. - - **run** - the run command for the pre-deploy job. -- [autoRollback](#autoRollback) \- the auto-rollback settings for the app. - - **enabled** - whether auto-rollback is enabled. -- [services](#services) \- a list of services for this app. - - **name** - the unique ID of the resource. Must be 31 characters or less. Must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character. - - **type** - the type of service - being one of `web`, `worker`, or `job`. - - **run** - the run command for the service. - - **instances** - the number of instances of the service to run. - - **cpuCores** - the number of CPU cores to allocate to the service. - - **ramMegabytes** - the amount of RAM to allocate to the service. - - **gpuCoresNvidia** - the number of Nvidia GPU cores to allocate to the service. - - **port** - the port that the service will listen on. - - **nodeGroup** - the ID of the user node group to run the service on. If not provided, the service will be run on the default node group. - - [connections](#connections) - a list of external connections for the service. - - additional type-specific fields. See full reference for [web](/deploy/configuration-as-code/services/web-service), [worker](/deploy/configuration-as-code/services/worker-service), and [job](/deploy/configuration-as-code/services/job-service) services. - -### `version` - -`string` - required +This is the complete reference for all fields that can be set in a `porter.yaml` file. 
+ +## Top-Level Fields + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `version` | string | Yes | Must be `v2` | +| `name` | string | Conditional | App name. Required unless `PORTER_APP_NAME` env var is set | +| `build` | object | Conditional | Build configuration. Cannot be used with `image` | +| `image` | object | Conditional | Pre-built image configuration. Cannot be used with `build` | +| `services` | array | Yes | List of service definitions | +| `env` | object | No | Environment variables | +| `envGroups` | string[] | No | Names of environment groups to attach | +| `predeploy` | object | No | Pre-deploy job configuration | +| `initialDeploy` | object | No | Job to run only on first deployment | +| `autoRollback` | object | No | Automatic rollback settings | +| `requiredApps` | array | No | App dependencies that must deploy first | +| `deploymentStrategy` | object | No | Rolling or blue-green deployment configuration | +| `efsStorage` | object | No | AWS EFS storage configuration | + + +You must specify either `build` or `image`, but not both. Use `build` when Porter should build your container image, or `image` when using a pre-built image from a registry. + + +--- + +## `version` + +**Type:** `string` - **Required** + +The schema version. Must be `v2`. ```yaml version: v2 ``` -### `name` +--- -`string` - optional +## `name` -Either `name` must or the `PORTER_APP_NAME` environment variable must be set when running `porter apply`. +**Type:** `string` - **Required** (unless `PORTER_APP_NAME` is set) + +The application name. Must be 31 characters or less, consist of lowercase alphanumeric characters or `-`, and start and end with an alphanumeric character. ```yaml name: my-app ``` -### `build` +--- -`object` - optional +## `build` -```yaml +**Type:** `object` - **Optional** + +Configuration for building container images. Cannot be used together with `image`. 
+ +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `method` | string | Yes | Build method: `docker` or `pack` | +| `context` | string | Yes | Build context directory | +| `dockerfile` | string | Conditional | Dockerfile path (required if method is `docker`) | +| `builder` | string | Conditional | Builder image (required if method is `pack`) | +| `buildpacks` | string[] | No | List of buildpacks (for `pack` method) | + + +```yaml Docker Build +build: + method: docker + context: . + dockerfile: ./Dockerfile +``` + +```yaml Buildpack Build build: method: pack context: . builder: heroku/buildpacks:20 buildpacks: - heroku/python + - heroku/nodejs ``` + -### `image` +--- -`object` - optional +## `image` + +**Type:** `object` - **Optional** + +Configuration for using a pre-built container image. Cannot be used together with `build`. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `repository` | string | Yes | Image repository URL | +| `tag` | string | No | Image tag (can be overridden with `--tag` flag) | ```yaml image: - repository: my-registry/my-app - tag: latest + repository: my-registry.io/my-app + tag: v1.2.3 ``` -### `env` +--- + +## `env` -`object` - optional +**Type:** `object` - **Optional** + +Environment variables to set for all services. Values must be strings. ```yaml env: - PORT: 8080 + NODE_ENV: production + LOG_LEVEL: info + DATABASE_URL: "${DATABASE_URL}" # Reference from env group ``` -### `predeploy` + +For sensitive values, use environment groups instead of hardcoding them in `porter.yaml`. + + +--- + +## `envGroups` + +**Type:** `string[]` - **Optional** -`object` - optional +Names of environment groups to attach to the application. Environment groups must already exist in your Porter project. 
+ +```yaml +envGroups: + - production-secrets + - shared-config + - database-credentials +``` + +--- + +## `predeploy` + +**Type:** `object` - **Optional** + +A job that runs before deploying services. Commonly used for database migrations. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `run` | string | Yes | Command to execute | +| `cpuCores` | number | No | CPU allocation | +| `ramMegabytes` | number | No | Memory allocation | ```yaml predeploy: - run: echo "predeploy" + run: npm run migrate + cpuCores: 0.5 + ramMegabytes: 512 +``` + +See [Predeploy Configuration](/deploy/configuration-as-code/services/predeploy) for more details. + +--- + +## `initialDeploy` + +**Type:** `object` - **Optional** + +A job that runs only on the first deployment of the application. Useful for one-time setup tasks like database seeding. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `run` | string | Yes | Command to execute | +| `cpuCores` | number | No | CPU allocation | +| `ramMegabytes` | number | No | Memory allocation | + +```yaml +initialDeploy: + run: npm run seed + cpuCores: 0.25 + ramMegabytes: 256 ``` -### `autoRollback` +--- + +## `autoRollback` + +**Type:** `object` - **Optional** -`object` - optional +Configure automatic rollback when deployments fail. -When this attribute is enabled, Porter will automatically rollback all services in the app to the latest previously successfully-deployed version if the any service of the new version fails to deploy. +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `enabled` | boolean | Yes | Enable or disable auto-rollback | ```yaml autoRollback: enabled: true ``` +When enabled, Porter automatically rolls back all services to the last successfully deployed version if any service fails to deploy. 
+ +--- + +## `requiredApps` + +**Type:** `array` - **Optional** + +Define dependencies on other applications that must be deployed before this app. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | Yes | Name of the required app | +| `fromTarget` | string | No | Deployment target of the required app | + +```yaml +requiredApps: + - name: database-app + - name: auth-service + fromTarget: production +``` + +--- + +## `deploymentStrategy` + +**Type:** `object` - **Optional** + +Configure how deployments are rolled out. + +### Rolling Deployment (Default) + +```yaml +deploymentStrategy: + rolling: + maxSurge: 25% + maxUnavailable: 25% +``` + +### Blue-Green Deployment + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `blueGreen.group` | string | Yes | Name of the blue-green deployment group | + +```yaml +deploymentStrategy: + blueGreen: + group: my-blue-green-group +``` + +--- + +## `efsStorage` + +**Type:** `object` - **Optional** + +Configure AWS EFS (Elastic File System) storage for persistent data. Only available on AWS clusters. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `enabled` | boolean | Yes | Enable EFS storage | +| `fileSystemId` | string | Yes | AWS EFS file system ID | + +```yaml +efsStorage: + enabled: true + fileSystemId: fs-0123456789abcdef0 +``` + +--- + ## `services` -`array` - required +**Type:** `array` - **Required** + +List of service definitions. Each service represents a deployable unit of your application. 
+ +### Common Service Fields + +These fields apply to all service types: + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | Yes | Unique service identifier (max 31 chars, lowercase alphanumeric and `-`) | +| `type` | string | Yes | Service type: `web`, `worker`, or `job` | +| `run` | string | Yes | Command to execute | +| `instances` | integer | No | Number of replicas (not for jobs) | +| `cpuCores` | number | Yes | CPU allocation (e.g., `0.5`, `1`, `2`) | +| `ramMegabytes` | integer | Yes | Memory allocation in MB | +| `gpuCoresNvidia` | integer | No | NVIDIA GPU cores to allocate | +| `port` | integer | Conditional | Port the service listens on (required for `web`) | +| `nodeGroup` | string | No | UUID of a user node group to run on | +| `connections` | array | No | External cloud service connections | +| `terminationGracePeriodSeconds` | integer | No | Seconds to wait before force-killing pods | +| `serviceMeshEnabled` | boolean | No | Enable service mesh for inter-service communication | +| `metricsScraping` | object | No | Prometheus metrics scraping configuration | + +### Service Types + +| Type | Description | Documentation | +|------|-------------|---------------| +| `web` | HTTP services with public or private endpoints | [Web Services](/deploy/configuration-as-code/services/web-service) | +| `worker` | Background processing services | [Worker Services](/deploy/configuration-as-code/services/worker-service) | +| `job` | Scheduled or on-demand tasks | [Job Services](/deploy/configuration-as-code/services/job-service) | + +### Basic Example ```yaml services: - - name: web + - name: api type: web - run: python app.py - instances: 1 - cpuCores: 1 - ramMegabytes: 1024 + run: node server.js port: 8080 - - name: web-on-user-node-group - type: web - run: python app.py + cpuCores: 0.5 + ramMegabytes: 512 + instances: 2 + + - name: worker + type: worker + run: node worker.js + cpuCores: 0.25 + ramMegabytes: 
256 instances: 1 - cpuCores: 1 - ramMegabytes: 1024 + + - name: cleanup + type: job + run: node cleanup.js + cpuCores: 0.1 + ramMegabytes: 128 + cron: "0 0 * * *" +``` + +--- + +## `connections` + +**Type:** `array` - **Optional** + +Configure connections to external cloud services. + +### AWS Role Connection + +Attach an IAM role to your service for AWS API access. + +```yaml +services: + - name: api + type: web + run: node server.js port: 8080 - nodeGroup: 123e4567-e89b-12d3-a456-426614174000 + cpuCores: 0.5 + ramMegabytes: 512 + connections: + - type: awsRole + role: my-iam-role-name ``` -### `connections` + +AWS roles must be configured in the Connections tab of your cluster settings. + -`array` - optional +### Cloud SQL Connection (GCP) -Cloud SQL connection (GCP) +Connect to Google Cloud SQL instances. ```yaml services: - - name: web + - name: api type: web - run: python app.py - instances: 1 - cpuCores: 1 - ramMegabytes: 1024 + run: node server.js port: 8080 + cpuCores: 0.5 + ramMegabytes: 512 connections: - - type: cloudSql - config: - cloudSqlConnectionName: project-123456:us-east1:instance-name - cloudSqlDatabasePort: 5432 - cloudSqlServiceAccount: service-account-name - # service accounts must be connected through - # the Connections tab under the application's cluster + - type: cloudSql + config: + cloudSqlConnectionName: project-123:us-east1:instance-name + cloudSqlDatabasePort: 5432 + cloudSqlServiceAccount: my-service-account ``` -AWS role connection +### Persistent Disk Connection -*Note: The AWS role connection feature is currently in development so may be subject to change.* +Attach persistent storage to your service. 
```yaml services: - - name: web + - name: api type: web - run: python app.py - instances: 1 - cpuCores: 1 - ramMegabytes: 1024 + run: node server.js port: 8080 + cpuCores: 0.5 + ramMegabytes: 512 connections: - - type: awsRole - role: iam-role-name + - type: disk + config: + mountPath: /data + sizeGb: 10 ``` + +--- + +## `gpuCoresNvidia` + +**Type:** `integer` - **Optional** + +Number of NVIDIA GPU cores to allocate to a service, typically for machine learning workloads. This is a service-level field, set alongside `cpuCores` and `ramMegabytes`. + +| Field | Type | Description | +|-------|------|-------------| +| `gpuCoresNvidia` | integer | Number of NVIDIA GPU cores | + +```yaml +services: + - name: ml-inference + type: web + run: python serve.py + port: 8080 + cpuCores: 2 + ramMegabytes: 4096 + gpuCoresNvidia: 1 + nodeGroup: gpu-node-group-uuid +``` + + +GPU workloads require a node group with GPU-enabled instances. + + +--- + +## `metricsScraping` + +**Type:** `object` - **Optional** + +Configure Prometheus metrics scraping for custom application metrics. + +| Field | Type | Description | +|-------|------|-------------| +| `enabled` | boolean | Enable metrics scraping | +| `path` | string | HTTP path to scrape (default: `/metrics`) | +| `port` | integer | Port to scrape metrics from | + +```yaml +services: + - name: api + type: web + run: node server.js + port: 8080 + cpuCores: 0.5 + ramMegabytes: 512 + metricsScraping: + enabled: true + path: /metrics + port: 9090 +``` + +--- + +## Complete Example + +```yaml +version: v2 +name: my-production-app + +build: + method: docker + context: . 
+ dockerfile: ./Dockerfile + +env: + NODE_ENV: production + LOG_LEVEL: info + +envGroups: + - production-secrets + - database-credentials + +predeploy: + run: npm run migrate + cpuCores: 0.5 + ramMegabytes: 512 + +autoRollback: + enabled: true + +services: + # Public API + - name: api + type: web + run: npm start + port: 8080 + cpuCores: 1 + ramMegabytes: 1024 + autoscaling: + enabled: true + minInstances: 2 + maxInstances: 20 + cpuThresholdPercent: 70 + memoryThresholdPercent: 80 + healthCheck: + enabled: true + httpPath: /health + timeoutSeconds: 5 + domains: + - name: api.example.com + ingressAnnotations: + nginx.ingress.kubernetes.io/proxy-body-size: "50m" + serviceMeshEnabled: true + + # Background worker + - name: worker + type: worker + run: npm run worker + cpuCores: 0.5 + ramMegabytes: 512 + instances: 3 + terminationGracePeriodSeconds: 60 + healthCheck: + enabled: true + command: ./healthcheck.sh + timeoutSeconds: 5 + + # Scheduled cleanup job + - name: daily-cleanup + type: job + run: npm run cleanup + cpuCores: 0.25 + ramMegabytes: 256 + cron: "0 3 * * *" + timeoutSeconds: 3600 +``` + +--- + +## Related Documentation + +- [Web Services](/deploy/configuration-as-code/services/web-service) - Web service configuration +- [Worker Services](/deploy/configuration-as-code/services/worker-service) - Worker service configuration +- [Job Services](/deploy/configuration-as-code/services/job-service) - Job service configuration +- [Predeploy Jobs](/deploy/configuration-as-code/services/predeploy) - Pre-deploy job configuration +- [porter apply](/standard/cli/command-reference/porter-apply) - CLI reference for deployments diff --git a/deploy/configuration-as-code/services/job-service.mdx b/deploy/configuration-as-code/services/job-service.mdx index 7e6fdbf..f997d66 100644 --- a/deploy/configuration-as-code/services/job-service.mdx +++ b/deploy/configuration-as-code/services/job-service.mdx @@ -2,41 +2,305 @@ title: 'Job Services' --- -The following is a full 
reference for all the fields that can be set for a job service in `porter.yaml`. +Job services are for scheduled or on-demand tasks that run to completion. They're ideal for cron jobs, data processing, and one-time tasks. This is a complete reference for all fields that can be set for a job service in `porter.yaml`. -- [allowConcurrent](#allowConcurrent) \- indicates whether or not runs of the job can be processed concurrently. -- [cron](#cron) \- a cron expression for the job. -- [suspendCron](#suspendCron) \- indicates whether or not the cron schedule should be suspended. -- [timeoutSeconds](#timeoutSeconds) \- the number of seconds to wait before timing out the job. +## Field Reference -### `allowConcurrent` +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | Yes | Service identifier (max 31 chars) | +| `type` | string | Yes | Must be `job` | +| `run` | string | Yes | Command to execute | +| `cpuCores` | number | Yes | CPU allocation | +| `ramMegabytes` | integer | Yes | Memory allocation in MB | +| `cron` | string | No | Cron schedule expression | +| `suspendCron` | boolean | No | Temporarily disable cron schedule | +| `allowConcurrent` | boolean | No | Allow concurrent job runs | +| `timeoutSeconds` | integer | No | Maximum job duration | +| `connections` | array | No | External cloud connections | +| `terminationGracePeriodSeconds` | integer | No | Graceful shutdown timeout | +| `gpuCoresNvidia` | integer | No | NVIDIA GPU cores | +| `nodeGroup` | string | No | Node group UUID | -`boolean` - optional +--- + +## Basic Example ```yaml -allowConcurrent: true +services: + - name: cleanup + type: job + run: npm run cleanup + cpuCores: 0.25 + ramMegabytes: 256 + cron: "0 0 * * *" ``` -### `cron` +--- -`string` - optional +## `cron` + +**Type:** `string` - **Optional** + +A cron expression that defines when the job should run. Uses [standard 5-field cron syntax](https://en.wikipedia.org/wiki/Cron). 
```yaml cron: "0 0 * * *" ``` -### `suspendCron` +**Common Schedules:** + +| Expression | Description | +|------------|-------------| +| `0 0 * * *` | Daily at midnight | +| `0 */6 * * *` | Every 6 hours | +| `*/15 * * * *` | Every 15 minutes | +| `0 9 * * 1-5` | Weekdays at 9 AM | +| `0 0 1 * *` | First day of each month | -`boolean` - optional + +If no cron expression is provided, the job must be triggered manually using `porter app run --job`. + + +--- + +## `suspendCron` + +**Type:** `boolean` - **Optional** + +Temporarily disable the cron schedule without removing it. The job won't run on schedule but can still be triggered manually. ```yaml suspendCron: true ``` -### `timeoutSeconds` + +Use this to pause scheduled jobs during maintenance windows or while debugging. + + +--- + +## `allowConcurrent` + +**Type:** `boolean` - **Optional** + +Allow multiple instances of the job to run simultaneously. By default, a new job run won't start if a previous run is still in progress. + +```yaml +allowConcurrent: true +``` + + +Be careful enabling this for jobs that modify shared resources. Concurrent runs may cause race conditions or data inconsistency. + + +--- + +## `timeoutSeconds` -`integer` - optional +**Type:** `integer` - **Optional** + +Maximum number of seconds the job is allowed to run before being terminated. ```yaml timeoutSeconds: 3600 -``` \ No newline at end of file +``` + + +If not specified, jobs may run indefinitely. It's recommended to set a reasonable timeout for all jobs. + + +--- + +## `connections` + +**Type:** `array` - **Optional** + +Connect to external cloud services. See [Reference](/deploy/configuration-as-code/reference#connections) for full documentation. 
+ + +```yaml AWS Role +connections: + - type: awsRole + role: my-iam-role +``` + +```yaml Cloud SQL (GCP) +connections: + - type: cloudSql + config: + cloudSqlConnectionName: project:region:instance + cloudSqlDatabasePort: 5432 + cloudSqlServiceAccount: my-service-account +``` + +```yaml Persistent Disk +connections: + - type: disk + config: + mountPath: /data + sizeGb: 10 +``` + + +--- + +## `terminationGracePeriodSeconds` + +**Type:** `integer` - **Optional** + +Seconds to wait for graceful shutdown before forcefully terminating the job. + +```yaml +terminationGracePeriodSeconds: 30 +``` + + +Use this to ensure jobs can clean up resources or checkpoint progress before termination. + + +--- + +## `gpuCoresNvidia` + +**Type:** `integer` - **Optional** + +Allocate NVIDIA GPU cores for ML training, inference, or GPU-accelerated processing. + +```yaml +gpuCoresNvidia: 1 +nodeGroup: gpu-node-group-uuid +``` + + +Requires a node group with GPU-enabled instances. + + +--- + +## Triggering Jobs Manually + +Jobs can be triggered manually using the Porter CLI: + +```bash +# Trigger a job run +porter app run my-app --job my-job + +# Trigger and wait for completion +porter app run my-app --job my-job --wait + +# Override concurrent restriction +porter app run my-app --job my-job --allow-concurrent +``` + +--- + +## Complete Example + +```yaml +services: + - name: daily-report + type: job + run: python generate_report.py + cpuCores: 1 + ramMegabytes: 2048 + + # Schedule: Daily at 6 AM UTC + cron: "0 6 * * *" + + # Allow up to 2 hours + timeoutSeconds: 7200 + + # Don't allow concurrent runs + allowConcurrent: false + + # Cloud connections + connections: + - type: awsRole + role: report-s3-access + + # Graceful shutdown + terminationGracePeriodSeconds: 60 +``` + +--- + +## Common Use Cases + +### Database Cleanup + +```yaml +services: + - name: db-cleanup + type: job + run: npm run cleanup-old-records + cpuCores: 0.25 + ramMegabytes: 256 + cron: "0 3 * * *" # 3 AM daily + 
timeoutSeconds: 1800 + allowConcurrent: false +``` + +### Data Export + +```yaml +services: + - name: weekly-export + type: job + run: python export_data.py + cpuCores: 0.5 + ramMegabytes: 1024 + cron: "0 0 * * 0" # Sunday at midnight + timeoutSeconds: 14400 # 4 hours + connections: + - type: awsRole + role: s3-export-role +``` + +### ML Training Job + +```yaml +services: + - name: model-training + type: job + run: python train.py + cpuCores: 8 + ramMegabytes: 32768 + gpuCoresNvidia: 4 + nodeGroup: gpu-node-group-uuid + timeoutSeconds: 86400 # 24 hours + terminationGracePeriodSeconds: 300 +``` + +### Health Check / Monitoring + +```yaml +services: + - name: healthcheck + type: job + run: ./check_dependencies.sh + cpuCores: 0.1 + ramMegabytes: 128 + cron: "*/5 * * * *" # Every 5 minutes + timeoutSeconds: 60 + allowConcurrent: false +``` + +### Manual Migration Job + +```yaml +services: + - name: data-migration + type: job + run: python migrate.py + cpuCores: 2 + ramMegabytes: 4096 + # No cron - triggered manually only + timeoutSeconds: 28800 # 8 hours + allowConcurrent: false + terminationGracePeriodSeconds: 120 +``` + + +Jobs without a `cron` expression must be triggered manually using `porter app run --job`. + diff --git a/deploy/configuration-as-code/services/web-service.mdx b/deploy/configuration-as-code/services/web-service.mdx index 4758975..6f775e6 100644 --- a/deploy/configuration-as-code/services/web-service.mdx +++ b/deploy/configuration-as-code/services/web-service.mdx @@ -2,96 +2,216 @@ title: 'Web Services' --- -The following is a full reference for all the fields that can be set for a web service in `porter.yaml`. - -- [autoscaling](#autoscaling) \- the autoscaling configuration for the service. - - **enabled** \- whether autoscaling is enabled or not. - - **minInstances** \- the minimum number of instances to run. - - **maxInstances** \- the maximum number of instances to run. 
- - **cpuThresholdPercent** \- the CPU threshold percentage to trigger autoscaling at. - - **memoryThresholdPercent** \- the memory threshold percentage to trigger autoscaling at. -- [domains](#domains) \- the list of custom domains for the service, if the service is exposed publicly. - - **name** \- the name of the domain. -- [private](#private) \- whether the service is private or not. -- **serviceMeshEnabled** \- If 'true', enables enhanced communication between your services with improved performance, reliability, and monitoring. Recommended for applications with multiple services that communicate with each other, especially those using gRPC or WebSockets. -- [ingressAnnotations](#ingressannotations) \- the ingress annotations to apply for the service. -- [pathRouting](#pathrouting) \- the list of URL paths to service port mappings, if path-based routing is enabled. -Note that a path must be specified for the default port (set in [services.port](/deploy/configuration-as-code/reference)). -If routing to a port on a different service, the port must be exposed on that service (either as the default port or through a path routing rule). - - **path** \- the URL path. - - **port** \- the port to route to. - - **appName** \- (optional) the name of the application to route to (as it appears in the dashboard). Defaults to the application of the current service. - - **serviceName** \- (optional) the name of the service to route to (as it appears in the dashboard). Defaults to the current service. Must be specified if appName is set. -- [pathRoutingConfig](#pathroutingconfig) \- optional configuration options for path-based routing. - - **rewriteMode** \- mode for rewriting URL paths. If the path set in **pathRouting** is `/api/v1`, then `api/v1/subpath` will be rewritten as follows: - - `rewrite-all` \- (default) rewrite the entire path to the root path --> `/`. - - `rewrite-prefix` \- rewrite the path for the URL path prefix only --> `/subpath`. 
- - `rewrite-off` \- disable path rewriting --> `/api/v1/subpath`. -- [healthCheck](#healthcheck) \- the health check configuration for the service. This will configure both the liveness and readiness checks. - - **enabled** \- whether the health check is enabled or not. - - **httpPath** \- the path to check for the health check. - - **timeoutSeconds** \- the timeout for the health check requests. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for sending the health check requests. Minimum value is 0. -- [livenessCheck](#livenesscheck) \- the liveness check configuration for the service. Cannot be used with healthCheck. - - **enabled** \- whether the liveness check is enabled or not. - - **httpPath** \- the path to check for the liveness check. - - **timeoutSeconds** \- the timeout for the liveness check requests. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for sending the liveness check requests. Minimum value is 0. -- [readinessCheck](#readinesscheck) \- the readiness check configuration for the service. Cannot be used with healthCheck. - - **enabled** \- whether the readiness check is enabled or not. - - **httpPath** \- the path to check for the readiness check. - - **timeoutSeconds** \- the timeout for the readiness check requests. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for sending the readiness check requests. Minimum value is 0. -- [startupCheck](#startupcheck) \- the startup check configuration for the service. Cannot be used with healthCheck. - - **enabled** \- whether the startup check is enabled or not. - - **httpPath** \- the path to check for the startup check. - - **timeoutSeconds** \- the timeout for the startup check requests. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for sending the startup check requests. Minimum value is 0. - -### `autoscaling` - -`object` - optional - -All fields are optional. 
+Web services are HTTP-based services that can be exposed publicly or kept private within your cluster. This is a complete reference for all fields that can be set for a web service in `porter.yaml`. + +## Field Reference + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | Yes | Service identifier (max 31 chars) | +| `type` | string | Yes | Must be `web` | +| `run` | string | Yes | Command to execute | +| `port` | integer | Yes | Port the service listens on | +| `cpuCores` | number | Yes | CPU allocation | +| `ramMegabytes` | integer | Yes | Memory allocation in MB | +| `instances` | integer | No | Number of replicas (default: 1) | +| `private` | boolean | No | Make service private (default: false) | +| `disableTLS` | boolean | No | Disable TLS termination | +| `autoscaling` | object | No | Autoscaling configuration | +| `domains` | array | No | Custom domain configuration | +| `healthCheck` | object | No | Combined health check config | +| `livenessCheck` | object | No | Liveness probe config | +| `readinessCheck` | object | No | Readiness probe config | +| `startupCheck` | object | No | Startup probe config | +| `pathRouting` | array | No | Path-based routing rules | +| `pathRoutingConfig` | object | No | Path routing options | +| `ingressAnnotations` | object | No | Custom ingress annotations | +| `connections` | array | No | External cloud connections | +| `serviceMeshEnabled` | boolean | No | Enable service mesh | +| `metricsScraping` | object | No | Prometheus metrics config | +| `terminationGracePeriodSeconds` | integer | No | Graceful shutdown timeout | +| `gpuCoresNvidia` | integer | No | NVIDIA GPU cores | +| `nodeGroup` | string | No | Node group UUID | + +--- + +## Basic Example + +```yaml +services: + - name: api + type: web + run: node server.js + port: 8080 + cpuCores: 0.5 + ramMegabytes: 512 + instances: 2 +``` + +--- + +## `private` + +**Type:** `boolean` - **Optional** + +When `true`, the 
service is only accessible within the cluster (not publicly exposed). + +```yaml +private: true +``` + +--- + +## `disableTLS` + +**Type:** `boolean` - **Optional** + +Disable TLS termination at the load balancer. Only use this for services that handle their own TLS or for internal testing. + +```yaml +disableTLS: true +``` + + +Disabling TLS exposes your service over HTTP. Only use this when you have a specific requirement. + + +--- + +## `autoscaling` + +**Type:** `object` - **Optional** + +Configure horizontal pod autoscaling based on CPU and memory utilization. + +| Field | Type | Description | +|-------|------|-------------| +| `enabled` | boolean | Enable autoscaling | +| `minInstances` | integer | Minimum number of replicas | +| `maxInstances` | integer | Maximum number of replicas | +| `cpuThresholdPercent` | integer | CPU usage threshold (0-100) | +| `memoryThresholdPercent` | integer | Memory usage threshold (0-100) | ```yaml autoscaling: enabled: true - minInstances: 1 + minInstances: 2 maxInstances: 10 - cpuThresholdPercent: 80 + cpuThresholdPercent: 70 memoryThresholdPercent: 80 ``` -### `domains` + +When autoscaling is enabled, the `instances` field is ignored. + + +--- -`array` - optional +## `domains` + +**Type:** `array` - **Optional** + +Configure custom domains for your web service. + +| Field | Type | Description | +|-------|------|-------------| +| `name` | string | Domain name | ```yaml domains: - - name: example.com + - name: api.example.com + - name: api.staging.example.com ``` -### `private` +--- -`boolean` - optional +## `healthCheck` + +**Type:** `object` - **Optional** + +Configure a combined health check that applies to liveness, readiness, and startup probes. 
+ +| Field | Type | Description | +|-------|------|-------------| +| `enabled` | boolean | Enable health checks | +| `httpPath` | string | HTTP endpoint to check | +| `timeoutSeconds` | integer | Request timeout (min: 1) | +| `initialDelaySeconds` | integer | Initial delay before checking (min: 0) | ```yaml -private: true +healthCheck: + enabled: true + httpPath: /health + timeoutSeconds: 5 + initialDelaySeconds: 10 ``` -### `ingressAnnotations` + +Cannot be used together with `livenessCheck`, `readinessCheck`, or `startupCheck`. Use either the combined `healthCheck` or the individual checks. + + +--- + +## Advanced Health Checks + +For fine-grained control, configure liveness, readiness, and startup probes separately. + +### `livenessCheck` + +**Type:** `object` - **Optional** -`object` - optional +Determines if the container should be restarted. ```yaml -ingressAnnotations: - nginx.ingress.kubernetes.io/proxy-connect-timeout: '"18000"' +livenessCheck: + enabled: true + httpPath: /livez + timeoutSeconds: 5 + initialDelaySeconds: 15 ``` -### `pathRouting` +### `readinessCheck` + +**Type:** `object` - **Optional** + +Determines if the container is ready to receive traffic. + +```yaml +readinessCheck: + enabled: true + httpPath: /readyz + timeoutSeconds: 3 + initialDelaySeconds: 5 +``` -`array` - optional +### `startupCheck` + +**Type:** `object` - **Optional** + +Used for slow-starting containers. Other probes are disabled until this passes. + +```yaml +startupCheck: + enabled: true + httpPath: /startupz + timeoutSeconds: 10 + initialDelaySeconds: 0 +``` + +--- + +## `pathRouting` + +**Type:** `array` - **Optional** + +Configure path-based routing to direct requests to different ports or services. 
+ +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `path` | string | Yes | URL path prefix | +| `port` | integer | Yes | Port to route to | +| `serviceName` | string | No | Service to route to (defaults to current) | +| `appName` | string | No | Application to route to (requires `serviceName`) | ```yaml pathRouting: @@ -99,82 +219,230 @@ pathRouting: port: 8080 - path: /api/v2/ port: 8081 - - path: /api/v3/ - port: 8082 - serviceName: other-service-in-same-app - - path: /api/v4/ - port: 8083 - appName: other-app - serviceName: other-service-in-other-app + - path: /admin/ + port: 9000 + serviceName: admin-service + - path: /auth/ + port: 8080 + appName: auth-app + serviceName: auth-service ``` -### `pathRoutingConfig` + +A path must be specified for the default port set in `services.port`. + + +--- + +## `pathRoutingConfig` + +**Type:** `object` - **Optional** + +Configure path rewriting behavior for path-based routing. + +| Field | Type | Description | +|-------|------|-------------| +| `rewriteMode` | string | Path rewrite mode | + +**Rewrite Modes:** -`object` - optional +| Mode | Description | Example: `/api/v1/users` | +|------|-------------|--------------------------| +| `rewrite-all` | Rewrite entire path to root (default) | `/` | +| `rewrite-prefix` | Remove the matched prefix only | `/users` | +| `rewrite-off` | No rewriting, keep original path | `/api/v1/users` | ```yaml pathRouting: - path: /api/v1/ port: 8080 - - path: /api/v2/ - port: 8081 pathRoutingConfig: rewriteMode: rewrite-prefix +``` + +--- + +## `ingressAnnotations` + +**Type:** `object` - **Optional** + +Add custom NGINX ingress annotations for advanced configuration. 
+ +```yaml +ingressAnnotations: + nginx.ingress.kubernetes.io/proxy-body-size: "100m" + nginx.ingress.kubernetes.io/proxy-connect-timeout: "60" + nginx.ingress.kubernetes.io/proxy-read-timeout: "60" + nginx.ingress.kubernetes.io/proxy-send-timeout: "60" +``` + + +Common use cases include increasing upload limits, configuring timeouts, and enabling WebSocket support. + + +--- + +## `connections` + +**Type:** `array` - **Optional** + +Connect to external cloud services. See [Reference](/deploy/configuration-as-code/reference#connections) for full documentation. + + +```yaml AWS Role +connections: + - type: awsRole + role: my-iam-role +``` + +```yaml Cloud SQL (GCP) +connections: + - type: cloudSql + config: + cloudSqlConnectionName: project:region:instance + cloudSqlDatabasePort: 5432 + cloudSqlServiceAccount: my-service-account +``` -# for the path "/api/v1/subpath" -# rewrite-prefix will rewrite to "/subpath" -# rewrite-all will rewrite to "/" -# rewrite-off will leave the path as "/api/v1/subpath" +```yaml Persistent Disk +connections: + - type: disk + config: + mountPath: /data + sizeGb: 10 ``` + -### `healthCheck` +--- + +## `serviceMeshEnabled` + +**Type:** `boolean` - **Optional** -`object` - optional +Enable service mesh for enhanced inter-service communication with improved performance, reliability, and monitoring. ```yaml -healthCheck: - enabled: true - httpPath: /healthz - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional +serviceMeshEnabled: true ``` -## Advanced Health Checks + +Recommended for applications with multiple services that communicate with each other, especially those using gRPC or WebSockets. + + +--- -Advanced health checks allow you to separately configure liveness, readiness, and startup health checks for your service. These cannot be enabled at the same time as the healthCheck field. 
+## `metricsScraping` -### `livenessCheck` +**Type:** `object` - **Optional** -`object` - optional +Configure Prometheus metrics scraping for custom application metrics. + +| Field | Type | Description | +|-------|------|-------------| +| `enabled` | boolean | Enable metrics scraping | +| `path` | string | HTTP path to scrape (default: `/metrics`) | +| `port` | integer | Port to scrape metrics from | ```yaml -livenessCheck: +metricsScraping: enabled: true - httpPath: /livez - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional + path: /metrics + port: 9090 ``` -### `readinessCheck` +--- -`object` - optional +## `terminationGracePeriodSeconds` + +**Type:** `integer` - **Optional** + +Seconds to wait for graceful shutdown before forcefully terminating the container. ```yaml -readinessCheck: - enabled: true - httpPath: /readyz - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional +terminationGracePeriodSeconds: 60 ``` -### `startupCheck` + +Increase this value for services that need time to complete in-flight requests or cleanup tasks. + + +--- + +## `gpuCoresNvidia` -`object` - optional +**Type:** `integer` - **Optional** + +Allocate NVIDIA GPU cores for ML inference or GPU-accelerated workloads. ```yaml -startupCheck: - enabled: true - httpPath: /startupz - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional -``` \ No newline at end of file +gpuCoresNvidia: 1 +nodeGroup: gpu-node-group-uuid +``` + + +Requires a node group with GPU-enabled instances. 
+ + +--- + +## Complete Example + +```yaml +services: + - name: api + type: web + run: npm start + port: 8080 + cpuCores: 1 + ramMegabytes: 1024 + + # Autoscaling + autoscaling: + enabled: true + minInstances: 2 + maxInstances: 20 + cpuThresholdPercent: 70 + memoryThresholdPercent: 80 + + # Custom domains + domains: + - name: api.example.com + + # Health checks + livenessCheck: + enabled: true + httpPath: /livez + timeoutSeconds: 5 + readinessCheck: + enabled: true + httpPath: /readyz + timeoutSeconds: 3 + + # Path routing + pathRouting: + - path: /api/v1/ + port: 8080 + - path: /api/v2/ + port: 8081 + pathRoutingConfig: + rewriteMode: rewrite-prefix + + # Ingress configuration + ingressAnnotations: + nginx.ingress.kubernetes.io/proxy-body-size: "50m" + + # Service mesh and metrics + serviceMeshEnabled: true + metricsScraping: + enabled: true + path: /metrics + port: 9090 + + # Cloud connections + connections: + - type: awsRole + role: api-s3-access + + # Graceful shutdown + terminationGracePeriodSeconds: 30 +``` diff --git a/deploy/configuration-as-code/services/worker-service.mdx b/deploy/configuration-as-code/services/worker-service.mdx index 848171c..483daa7 100644 --- a/deploy/configuration-as-code/services/worker-service.mdx +++ b/deploy/configuration-as-code/services/worker-service.mdx @@ -2,40 +2,58 @@ title: 'Worker Services' --- -The following is a full reference for all the fields that can be set for a worker service in `porter.yaml`. - -- [autoscaling](#autoscaling) \- the autoscaling configuration for the service. - - **enabled** \- whether autoscaling is enabled or not. - - **minInstances** \- the minimum number of instances to run. - - **maxInstances** \- the maximum number of instances to run. - - **cpuThresholdPercent** \- the CPU threshold percentage to trigger autoscaling at. - - **memoryThresholdPercent** \- the memory threshold percentage to trigger autoscaling at. 
-- [healthCheck](#healthcheck) \- the health check configuration for the service. This will configure both the liveness and readiness checks. - - **enabled** \- whether the health check is enabled or not. - - **command** \- the command to run for the health check. - - **timeoutSeconds** \- the timeout for the health check command. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for executing the health check command. Minimum value is 0. -- [livenessCheck](#livenesscheck) \- the liveness check configuration for the service. Cannot be used with healthCheck. - - **enabled** \- whether the liveness check is enabled or not. - - **command** \- the command to run for the liveness check. - - **timeoutSeconds** \- the timeout for the liveness check command. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for executing the liveness check command. Minimum value is 0. -- [readinessCheck](#readinesscheck) \- the readiness check configuration for the service. Cannot be used with healthCheck. - - **enabled** \- whether the readiness check is enabled or not. - - **command** \- the command to run for the readiness check. - - **timeoutSeconds** \- the timeout for the readiness check command. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for executing the readiness check command. Minimum value is 0. -- [startupCheck](#startupcheck) \- the startup check configuration for the service. Cannot be used with healthCheck. - - **enabled** \- whether the startup check is enabled or not. - - **command** \- the command to run for the startup check. - - **timeoutSeconds** \- the timeout for the startup check command. Minimum value is 1. - - **initialDelaySeconds** \- (optional) the initial delay for executing the startup check command. Minimum value is 0. - -### `autoscaling` - -`object` - optional - -All fields are optional. 
+Worker services are background processing services that don't expose HTTP endpoints. They're ideal for queue consumers, background jobs, and long-running processes. This is a complete reference for all fields that can be set for a worker service in `porter.yaml`. + +## Field Reference + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | Yes | Service identifier (max 31 chars) | +| `type` | string | Yes | Must be `worker` | +| `run` | string | Yes | Command to execute | +| `cpuCores` | number | Yes | CPU allocation | +| `ramMegabytes` | integer | Yes | Memory allocation in MB | +| `instances` | integer | No | Number of replicas (default: 1) | +| `autoscaling` | object | No | Autoscaling configuration | +| `healthCheck` | object | No | Combined health check config | +| `livenessCheck` | object | No | Liveness probe config | +| `readinessCheck` | object | No | Readiness probe config | +| `startupCheck` | object | No | Startup probe config | +| `connections` | array | No | External cloud connections | +| `serviceMeshEnabled` | boolean | No | Enable service mesh | +| `terminationGracePeriodSeconds` | integer | No | Graceful shutdown timeout | +| `gpuCoresNvidia` | integer | No | NVIDIA GPU cores | +| `nodeGroup` | string | No | Node group UUID | + +--- + +## Basic Example + +```yaml +services: + - name: queue-worker + type: worker + run: npm run worker + cpuCores: 0.5 + ramMegabytes: 512 + instances: 3 +``` + +--- + +## `autoscaling` + +**Type:** `object` - **Optional** + +Configure horizontal pod autoscaling based on CPU and memory utilization. 
+ +| Field | Type | Description | +|-------|------|-------------| +| `enabled` | boolean | Enable autoscaling | +| `minInstances` | integer | Minimum number of replicas | +| `maxInstances` | integer | Maximum number of replicas | +| `cpuThresholdPercent` | integer | CPU usage threshold (0-100) | +| `memoryThresholdPercent` | integer | Memory usage threshold (0-100) | ```yaml autoscaling: @@ -46,54 +64,256 @@ autoscaling: memoryThresholdPercent: 80 ``` -### `healthCheck` + +When autoscaling is enabled, the `instances` field is ignored. + + +--- + +## `healthCheck` + +**Type:** `object` - **Optional** -`object` - optional +Configure a combined health check that applies to liveness, readiness, and startup probes. Worker services use command-based health checks since they don't expose HTTP endpoints. + +| Field | Type | Description | +|-------|------|-------------| +| `enabled` | boolean | Enable health checks | +| `command` | string | Command to run for health check | +| `timeoutSeconds` | integer | Command timeout (min: 1) | +| `initialDelaySeconds` | integer | Initial delay before checking (min: 0) | ```yaml healthCheck: enabled: true - command: ./healthz.sh - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional + command: ./healthcheck.sh + timeoutSeconds: 5 + initialDelaySeconds: 15 ``` + +Cannot be used together with `livenessCheck`, `readinessCheck`, or `startupCheck`. Use either the combined `healthCheck` or the individual checks. + + +--- + ## Advanced Health Checks -Advanced health checks allow you to separately configure liveness, readiness, and startup health checks for your service. These cannot be enabled at the same time as the healthCheck field. +For fine-grained control, configure liveness, readiness, and startup probes separately. ### `livenessCheck` -`object` - optional +**Type:** `object` - **Optional** + +Determines if the container should be restarted. 
```yaml livenessCheck: enabled: true command: ./livez.sh - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional + timeoutSeconds: 5 + initialDelaySeconds: 15 ``` ### `readinessCheck` -`object` - optional +**Type:** `object` - **Optional** + +Determines if the container is ready to receive work. ```yaml readinessCheck: enabled: true command: ./readyz.sh - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional + timeoutSeconds: 5 + initialDelaySeconds: 5 ``` ### `startupCheck` -`object` - optional +**Type:** `object` - **Optional** + +Used for slow-starting containers. Other probes are disabled until this passes. ```yaml startupCheck: enabled: true command: ./startupz.sh - timeoutSeconds: 1 - initialDelaySeconds: 15 #optional + timeoutSeconds: 10 + initialDelaySeconds: 0 +``` + +--- + +## `connections` + +**Type:** `array` - **Optional** + +Connect to external cloud services. See [Reference](/deploy/configuration-as-code/reference#connections) for full documentation. + + +```yaml AWS Role +connections: + - type: awsRole + role: my-iam-role +``` + +```yaml Cloud SQL (GCP) +connections: + - type: cloudSql + config: + cloudSqlConnectionName: project:region:instance + cloudSqlDatabasePort: 5432 + cloudSqlServiceAccount: my-service-account +``` + +```yaml Persistent Disk +connections: + - type: disk + config: + mountPath: /data + sizeGb: 10 +``` + + +--- + +## `serviceMeshEnabled` + +**Type:** `boolean` - **Optional** + +Enable service mesh for enhanced inter-service communication with improved performance, reliability, and monitoring. + +```yaml +serviceMeshEnabled: true +``` + + +Useful for workers that need to communicate with other services in your cluster. + + +--- + +## `terminationGracePeriodSeconds` + +**Type:** `integer` - **Optional** + +Seconds to wait for graceful shutdown before forcefully terminating the container. + +```yaml +terminationGracePeriodSeconds: 120 +``` + + +Set this to a value higher than your longest expected job. 
This gives workers time to complete in-progress work before shutdown. + + +--- + +## `gpuCoresNvidia` + +**Type:** `integer` - **Optional** + +Allocate NVIDIA GPU cores for ML workloads or GPU-accelerated processing. + +```yaml +gpuCoresNvidia: 1 +nodeGroup: gpu-node-group-uuid +``` + + +Requires a node group with GPU-enabled instances. + + +--- + +## Complete Example + +```yaml +services: + - name: queue-processor + type: worker + run: npm run worker + cpuCores: 1 + ramMegabytes: 2048 + + # Autoscaling + autoscaling: + enabled: true + minInstances: 2 + maxInstances: 20 + cpuThresholdPercent: 70 + memoryThresholdPercent: 80 + + # Health checks + livenessCheck: + enabled: true + command: ./healthcheck.sh + timeoutSeconds: 5 + readinessCheck: + enabled: true + command: ./ready.sh + timeoutSeconds: 3 + + # Cloud connections + connections: + - type: awsRole + role: worker-sqs-access + + # Service mesh + serviceMeshEnabled: true + + # Graceful shutdown (allow 2 minutes for jobs to complete) + terminationGracePeriodSeconds: 120 +``` + +--- + +## Common Use Cases + +### Queue Consumer + +```yaml +services: + - name: sqs-consumer + type: worker + run: node src/workers/sqs-consumer.js + cpuCores: 0.5 + ramMegabytes: 512 + instances: 5 + terminationGracePeriodSeconds: 60 + connections: + - type: awsRole + role: sqs-consumer-role +``` + +### Background Job Processor + +```yaml +services: + - name: job-processor + type: worker + run: bundle exec sidekiq + cpuCores: 1 + ramMegabytes: 1024 + autoscaling: + enabled: true + minInstances: 2 + maxInstances: 10 + cpuThresholdPercent: 70 + terminationGracePeriodSeconds: 300 +``` + +### ML Inference Worker + +```yaml +services: + - name: ml-worker + type: worker + run: python worker.py + cpuCores: 4 + ramMegabytes: 8192 + gpuCoresNvidia: 1 + nodeGroup: gpu-node-group-uuid + instances: 2 ```
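
The worker and web service references above state three rules that are easy to break when editing `porter.yaml` by hand: service names are limited to 31 characters, `healthCheck` cannot be combined with `livenessCheck`, `readinessCheck`, or `startupCheck`, and `instances` is ignored once autoscaling is enabled. The sketch below shows one way to check those rules before running `porter apply`. It is an illustration only: `lint_services` is a hypothetical helper, not part of the Porter CLI, and it assumes the `services` list has already been parsed into Python dictionaries by a YAML parser of your choice.

```python
# Hypothetical pre-deploy lint for the rules documented above.
# Not part of the Porter CLI; names here are illustrative assumptions.
INDIVIDUAL_CHECKS = ("livenessCheck", "readinessCheck", "startupCheck")
MAX_NAME_LENGTH = 31  # "max 31 chars" per the field reference tables


def lint_services(services):
    """Return a list of human-readable problems found in parsed service dicts."""
    problems = []
    for service in services:
        name = service.get("name", "")

        # Rule 1: service identifiers are limited to 31 characters.
        if len(name) > MAX_NAME_LENGTH:
            problems.append(f"{name}: name exceeds {MAX_NAME_LENGTH} characters")

        # Rule 2: healthCheck is mutually exclusive with the individual checks.
        if "healthCheck" in service:
            clashes = [key for key in INDIVIDUAL_CHECKS if key in service]
            if clashes:
                problems.append(
                    f"{name}: healthCheck cannot be combined with {', '.join(clashes)}"
                )

        # Rule 3: instances is ignored while autoscaling is enabled.
        autoscaling = service.get("autoscaling") or {}
        if autoscaling.get("enabled") and "instances" in service:
            problems.append(f"{name}: instances is ignored while autoscaling is enabled")

    return problems


if __name__ == "__main__":
    # Mirrors the "ML Inference Worker" example, plus a deliberate conflict.
    example = [
        {"name": "ml-worker", "type": "worker", "instances": 2},
        {
            "name": "queue-processor",
            "type": "worker",
            "healthCheck": {"enabled": True, "command": "./healthcheck.sh"},
            "livenessCheck": {"enabled": True, "command": "./livez.sh"},
        },
    ]
    for problem in lint_services(example):
        print(problem)
```

A check like this could run as an early CI step so that conflicting settings are caught in code review rather than at deploy time. Whether Porter itself rejects or silently resolves these conflicts is not covered by this reference, so treat the output as advisory.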