Merged
2 changes: 1 addition & 1 deletion charts/agent-docker/README.md
@@ -79,4 +79,4 @@ the same Kubernetes cluster to increase overall capacity.
| tolerations | list | `[]` | Tolerations for the Scalr Agent pods, allowing them to run on tainted nodes |

----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)
Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
10 changes: 10 additions & 0 deletions charts/agent-job/CHANGELOG.md
@@ -9,6 +9,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [UNRELEASED]


### Added

- Pod labels and annotations are now mounted at `/etc/podinfo` in the agent worker and agent controller containers via the Downward API.
- Added autodiscovery of Kubernetes resource attributes. Pod labels (`infra.scalr.io/app`, `infra.scalr.io/env`, `infra.scalr.io/service`) and the Datadog annotation (`ad.datadoghq.com/tags`) are now automatically mapped to OTLP resource attributes.

### Changed

- Added `list` permission for `events` to allow the controller to include debug information for failed task pods.

## [v0.5.71]

### Updated
364 changes: 225 additions & 139 deletions charts/agent-job/README.md

Large diffs are not rendered by default.

47 changes: 43 additions & 4 deletions charts/agent-job/README.md.gotmpl
@@ -6,15 +6,15 @@
See the [official documentation](https://docs.scalr.io/docs/agent-pools) for more information about Scalr Agents.

> [!WARNING]
> This chart is in Beta, and implementation details are subject to change. See [Planned Changes for Stable](#planned-changes-for-stable).
> This chart is in Beta, and implementation details are subject to change.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Overview](#overview)
- [Architecture Diagram](#architecture-diagram)
- [Planned Changes for Stable](#planned-changes-for-stable)
- [Planned Changes](#planned-changes)
- [Agent Task Naming](#agent-task-naming)
- [Custom Runner Images](#custom-runner-images)
- [Performance Optimization](#performance-optimization)
@@ -97,9 +97,15 @@ See [template](https://github.com/Scalr/agent-helm/blob/master/charts/agent-job/
<img src="assets/deploy-diagram.drawio.svg" />
</p>

## Planned Changes for Stable
## Planned Changes

- Require Kubernetes with containerd 2.2+ as the minimum version once it reaches GA.
This section outlines planned architecture changes that may be relevant for long-term chart maintenance.

### Update Minimum Requirements to Kubernetes 1.36 Once GA

Raise the minimum required Kubernetes version to 1.36, where the [ImageVolume](https://kubernetes.io/docs/tasks/configure-pod-container/image-volumes/) feature is stable, and require containerd 2.2+, which adds [subPath](https://github.com/containerd/containerd/pull/11578) support for ImageVolume.
In Kubernetes 1.35 (the current minimum required version), ImageVolume is in Beta but enabled by default, and we consider it ready for limited use.
This chart relies on ImageVolume to provision application components from an OCI registry and is expected to use the feature more heavily in the future.

## Agent Task Naming

@@ -383,6 +389,39 @@ See [all configuration options](#opentelemetry).

Learn more about [available metrics](https://docs.scalr.io/docs/metrics).

### Resource Attributes Autodiscovery

When running in Kubernetes, the agent automatically discovers and enriches OTLP resource attributes
from pod labels and annotations mounted via the Downward API.

#### Scalr Tag Autodiscovery

The following pod labels are mapped to OTLP resource attributes:

| Label | Default | OTLP Attribute |
|---|---|---|
| `infra.scalr.io/app` | — | `app` |
| `infra.scalr.io/env` | — | `deployment.environment.name` |
| `infra.scalr.io/service` | `scalr-agent` | `service.name` |
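
As a rough illustration of the mapping above, the sketch below parses a Downward API labels file and builds the resource attributes. It assumes the `key="value"` line format that Kubernetes writes for `metadata.labels`. The function names `parse_podinfo` and `resource_attributes` are hypothetical and do not reflect the agent's actual implementation.

```python
import json

# Label-to-attribute mapping taken from the table above.
LABEL_TO_ATTRIBUTE = {
    "infra.scalr.io/app": "app",
    "infra.scalr.io/env": "deployment.environment.name",
    "infra.scalr.io/service": "service.name",
}
# Only service.name has a documented default.
DEFAULTS = {"service.name": "scalr-agent"}


def parse_podinfo(text):
    """Parse Downward API content: one key="value" pair per line."""
    pairs = {}
    for line in text.splitlines():
        key, sep, raw = line.partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        try:
            # Values are quoted, escaped strings. JSON decoding covers the
            # common escapes (an assumption, not the agent's real parser).
            value = json.loads(raw)
        except ValueError:
            value = raw.strip('"')
        if not isinstance(value, str):
            value = raw  # keep unquoted values verbatim
        pairs[key] = value
    return pairs


def resource_attributes(labels):
    """Map known pod labels to OTLP resource attributes, applying defaults."""
    attrs = dict(DEFAULTS)
    for label, attribute in LABEL_TO_ATTRIBUTE.items():
        if label in labels:
            attrs[attribute] = labels[label]
    return attrs


sample = 'infra.scalr.io/app="billing"\ninfra.scalr.io/env="staging"\n'
print(resource_attributes(parse_podinfo(sample)))
```

In a pod, the input would come from `/etc/podinfo/labels` rather than the inline `sample` string.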

#### Datadog Tag Autodiscovery

The agent supports [Datadog Tag Autodiscovery](https://docs.datadoghq.com/containers/kubernetes/tag/?tab=datadogoperator#tag-autodiscovery) via the `ad.datadoghq.com/tags` pod annotation. Tags defined in this annotation are parsed as a JSON object and merged into the OTLP resource attributes.

Example:

```yaml
annotations:
  ad.datadoghq.com/tags: '{"env":"production","team":"backend"}'
```
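
The parse-and-merge step can be sketched roughly as follows. This is an illustration only: `merge_datadog_tags` is a hypothetical helper, and both the conflict rule (annotation tags override existing attributes) and the handling of malformed JSON (ignored) are assumptions, not documented agent behavior.

```python
import json

TAGS_ANNOTATION = "ad.datadoghq.com/tags"


def merge_datadog_tags(attributes, annotations):
    """Return a copy of attributes extended with tags from the annotation."""
    merged = dict(attributes)
    raw = annotations.get(TAGS_ANNOTATION)
    if not raw:
        return merged
    try:
        tags = json.loads(raw)
    except ValueError:
        return merged  # assumed: malformed JSON is ignored
    if isinstance(tags, dict):
        # Assumed: annotation tags win when a key already exists.
        merged.update(tags)
    return merged


annotations = {TAGS_ANNOTATION: '{"env":"production","team":"backend"}'}
print(merge_datadog_tags({"service.name": "scalr-agent"}, annotations))
```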

When the annotation is present on task pods, it is automatically extended with account and workspace context:

```yaml
annotations:
  ad.datadoghq.com/tags: '{"env":"production","team":"backend","account_name":"mainiacp","account_id":"acc-svrcncgh453bi8g","workspace_name":"main","workspace_id":"ws-v0p5qsps90tv7tvuc"}'
```
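
To make the extension concrete, here is a minimal sketch that reproduces the annotation value shown above from the original tags plus the task context. The helper name `extend_tags_annotation` is hypothetical. The context keys are copied from the example, and the agent may build the value differently.

```python
import json


def extend_tags_annotation(raw_tags, context):
    """Append task context keys to an existing ad.datadoghq.com/tags value."""
    tags = json.loads(raw_tags)
    tags.update(context)
    # Compact separators match the formatting used in the example above.
    return json.dumps(tags, separators=(",", ":"))


original = '{"env":"production","team":"backend"}'
context = {
    "account_name": "mainiacp",
    "account_id": "acc-svrcncgh453bi8g",
    "workspace_name": "main",
    "workspace_id": "ws-v0p5qsps90tv7tvuc",
}
print(extend_tags_annotation(original, context))
```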

## Custom Resource Definitions

This chart bundles the **AgentTaskTemplate CRD** (`agenttasktemplates.scalr.io`) and installs or upgrades it automatically via Helm. The CRD defines the job template that the controller uses to create task pods, so no separate manual step is required in most environments.
11 changes: 11 additions & 0 deletions charts/agent-job/templates/agent.yaml
@@ -143,6 +143,8 @@ spec:
resources:
{{- toYaml .Values.agent.resources | nindent 12 }}
volumeMounts:
- name: podinfo
mountPath: /etc/podinfo
- name: data-dir
mountPath: {{ .Values.agent.dataDir | quote }}
- name: cache-dir
@@ -188,5 +190,14 @@ spec:
# It is more robust to mount a dedicated volume.
- name: tmp-dir
emptyDir: {}
- name: podinfo
downwardAPI:
items:
- path: "labels"
fieldRef:
fieldPath: metadata.labels
- path: "annotations"
fieldRef:
fieldPath: metadata.annotations
{{- include "agent-job.caCertVolume" . | nindent 8 }}
terminationGracePeriodSeconds: {{ .Values.agent.terminationGracePeriodSeconds }}
14 changes: 13 additions & 1 deletion charts/agent-job/templates/task.yaml
@@ -67,7 +67,7 @@ spec:
- name: runner
image: {{ include "agent-job.image" (dict "context" . "repository" .Values.task.runner.image.repository "tag" .Values.task.runner.image.tag ) }}
imagePullPolicy: {{ .Values.task.runner.image.pullPolicy }}
command: ["sh", "/usr/bin/runner/exec-loop"]
command: ["/usr/bin/runner/exec-loop"]
workingDir: /opt/workdir
{{- with .Values.task.runner.securityContext }}
securityContext:
@@ -89,6 +89,7 @@ spec:
# TODO: Use subPath to mount a single file.
# Currently the entire worker image filesystem is mounted. Not critical since the component is public, but not ideal either.
# See: https://github.com/bottlerocket-os/bottlerocket/issues/4755
# Ticket: SCALRCORE-37528
- name: data-dir
mountPath: /opt/workdir
subPath: run/config
@@ -218,6 +219,8 @@ spec:
- name: sa-token
mountPath: /var/run/secrets/kubernetes.io/serviceaccount
readOnly: true
- name: podinfo
mountPath: /etc/podinfo
- name: data-dir
mountPath: {{ .Values.agent.dataDir }}
- name: cache-dir
@@ -296,6 +299,15 @@ spec:
{{- with .Values.task.extraVolumes }}
{{- toYaml . | nindent 12 }}
{{- end }}
- name: podinfo
downwardAPI:
items:
- path: "labels"
fieldRef:
fieldPath: metadata.labels
- path: "annotations"
fieldRef:
fieldPath: metadata.annotations
{{- with .Values.task.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 12 }}
3 changes: 3 additions & 0 deletions charts/agent-job/values.yaml
@@ -76,6 +76,9 @@ rbac:
- apiGroups: ["batch"]
resources: ["jobs/status"]
verbs: ["get", "patch", "update"]
- apiGroups: [""]
resources: ["events"]
verbs: ["list"]
# -- Cluster-wide RBAC rules (applied via ClusterRole bound in the release namespace).
# @section -- RBAC
clusterRules:
2 changes: 1 addition & 1 deletion charts/agent-k8s/README.md
@@ -372,4 +372,4 @@ If your cluster doesn't currently support egress NetworkPolicies, you may need t
| workerTolerations | list | `[]` | Kubernetes Node Tolerations for the agent worker and the agent task pods. Expects input structure as per specification <https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#toleration-v1-core>. Example: `--set workerTolerations[0].operator=Equal,workerTolerations[0].effect=NoSchedule,workerTolerations[0].key=dedicated,workerTolerations[0].value=scalr-agent-worker-pool` |

----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.11.0](https://github.com/norwoodj/helm-docs/releases/v1.11.0)
Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)