Add Host Profiler Feature #2403

Merged
levan-m merged 20 commits into main from mackjmr/host-profiler-feature
Jan 13, 2026

Conversation

@mackjmr mackjmr commented Dec 12, 2025

What does this PR do?

Add support for full host profiler.

Motivation

OTAGENT-711

Additional Notes

Anything else we should know when reviewing?

Minimum Agent Versions

Are there minimum versions of the Datadog Agent and/or Cluster Agent required?

  • Agent: Feature has not been released yet. For now, the only image available is in the Docker Hub datadog/ddot-ebpf-dev repo.
  • Cluster Agent: vX.Y.Z

Describe your test plan

Prereqs:

  • Staging API Key
  • Staging APP key with continuous_profiler_read permission
  • Cluster in a real Linux environment (the host profiler relies on eBPF, which requires direct access to a real Linux kernel)

Test 1. Configmap

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
  annotations:
    agent.datadoghq.com/host-profiler-enabled: "true"
    agent.datadoghq.com/host-profiler-configmap-name: "custom-config-map"
    experimental.agent.datadoghq.com/image-override-config: |
      {
        "host-profiler": {
          "name": "datadog/ddot-ebpf-dev:nightly-latest"
        }
      }
spec:
  override:
    nodeAgent:
      env:
        - name: DD_HOSTNAME
          value: "test1"
        - name: DD_API_KEY
          value: xxx
        - name: DD_APP_KEY
          value: xxx
        - name: DD_SITE
          value: datad0g.com

  global:
    site: datad0g.com
    kubelet:
      tlsVerify: false
    credentials:
      apiKey: xxx
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-config-map
  namespace: system
data:
  host-profiler-config.yaml: |-
    extensions:
      ddprofiling/default:
      hpflare/default:
    receivers:
      otlp:
        protocols:
          grpc:
          http:
      hostprofiler:
        symbol_uploader:
          enabled: true
          symbol_endpoints:
            - site: ${env:DD_SITE}
              api_key: ${env:DD_API_KEY}
              app_key: ${env:DD_APP_KEY}
    processors:
      infraattributes/default:
        cardinality: 2 # HighCardinality
        allow_hostname_override: true
      cumulativetodelta: {}

    exporters:
      debug: {}
      otlphttp:
        metrics_endpoint: https://otlp.datad0g.com/v1/metrics
        profiles_endpoint: https://intake.profile.datad0g.com/v1development/profiles
        headers:
          dd-api-key: ${env:DD_API_KEY}
          dd-otel-metric-config: '{"resource_attributes_as_tags": true}'
    service:
      extensions: [hpflare/default]
      pipelines:
        profiles:
          receivers: [hostprofiler]
          processors: [infraattributes/default]
          exporters: [otlphttp, debug]   
        metrics:
          receivers: [otlp]
          processors: [cumulativetodelta, infraattributes/default]
          exporters: [otlphttp]

Ensure:

  • host-profiler container is running in agent pod with no error
  • you see profiles under host:test1 in staging.

Test 2. Inline config

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
  annotations:
    agent.datadoghq.com/host-profiler-enabled: "true"
    experimental.agent.datadoghq.com/image-override-config: |
      {
        "host-profiler": {
          "name": "datadog/ddot-ebpf-dev:nightly-latest"
        }
      }
    agent.datadoghq.com/host-profiler-configdata: |-
          extensions:
            ddprofiling/default:
            hpflare/default:
          receivers:
            otlp:
              protocols:
                grpc:
                http:
            hostprofiler:
              symbol_uploader:
                enabled: true
                symbol_endpoints:
                  - site: ${env:DD_SITE}
                    api_key: ${env:DD_API_KEY}
                    app_key: ${env:DD_APP_KEY}
          processors:
            infraattributes/default:
              cardinality: 2 # HighCardinality
              allow_hostname_override: true
            cumulativetodelta: {}
          exporters:
            debug: {}
            otlphttp:
              metrics_endpoint: https://otlp.datad0g.com/v1/metrics
              profiles_endpoint: https://intake.profile.datad0g.com/v1development/profiles
              headers:
                dd-api-key: ${env:DD_API_KEY}
                dd-otel-metric-config: '{"resource_attributes_as_tags": true}'
          service:
            extensions: [hpflare/default]
            pipelines:
              profiles:
                receivers: [hostprofiler]
                processors: [infraattributes/default]
                exporters: [otlphttp, debug]   
              metrics:
                receivers: [otlp]
                processors: [cumulativetodelta, infraattributes/default]
                exporters: [otlphttp]
spec:
  override:
    nodeAgent:
      env:
        - name: DD_HOSTNAME
          value: "test2"
        - name: DD_API_KEY
          value: xxx
        - name: DD_APP_KEY
          value: xxx
        - name: DD_SITE
          value: datad0g.com
  global:
    site: datad0g.com
    kubelet:
      tlsVerify: false
    credentials:
      apiKey: xxx
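
Tests 1 and 2 differ only in where the collector configuration comes from: a ConfigMap name or inline data, both set through annotations. A minimal Go sketch of how a controller might choose between them follows. The annotation keys are taken from the manifests above; the `configSource` type, the `resolveConfigSource` function, and the decision to reject a resource that sets both annotations are assumptions made for illustration, not the operator's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// Annotation keys as they appear in the test manifests.
const (
	enabledAnnotation       = "agent.datadoghq.com/host-profiler-enabled"
	configMapNameAnnotation = "agent.datadoghq.com/host-profiler-configmap-name"
	configDataAnnotation    = "agent.datadoghq.com/host-profiler-configdata"
)

// configSource is a hypothetical type describing where the config comes from.
// Both fields empty means "fall back to a default config" (as in Test 3).
type configSource struct {
	ConfigMapName string
	InlineData    string
}

// resolveConfigSource returns nil when the feature is not enabled.
// Rejecting both annotations at once is an assumption, not documented behavior.
func resolveConfigSource(annotations map[string]string) (*configSource, error) {
	if annotations[enabledAnnotation] != "true" {
		return nil, nil
	}
	name := annotations[configMapNameAnnotation]
	data := annotations[configDataAnnotation]
	if name != "" && data != "" {
		return nil, errors.New("set either the configmap-name or the configdata annotation, not both")
	}
	return &configSource{ConfigMapName: name, InlineData: data}, nil
}

func main() {
	src, err := resolveConfigSource(map[string]string{
		enabledAnnotation:       "true",
		configMapNameAnnotation: "custom-config-map",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("configmap=%q inline=%t\n", src.ConfigMapName, src.InlineData != "")
}
```

How the real feature behaves when both annotations are present is not stated in this PR description, so treat the conflict handling here as a placeholder.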

Test 3. HostPID manually disabled

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
  annotations:
    experimental.agent.datadoghq.com/image-override-config: |
      {
        "host-profiler": {
          "name": "datadog/ddot-ebpf-dev:nightly-latest"
        }
      }
    agent.datadoghq.com/host-profiler-enabled: "true"
spec:
  override:
    nodeAgent:
      env:
        - name: DD_HOSTNAME
          value: "test3"
        - name: DD_API_KEY
          value: xxx
        - name: DD_APP_KEY
          value: xxx
        - name: DD_SITE
          value: datad0g.com
      hostPID: false

  global:
    site: datad0g.com
    kubelet:
      tlsVerify: false
    credentials:
      apiKey: xxx

Ensure:

  • host-profiler container is not running in agent pod
  • operator logs error: Host PID is required to run the host profiler. Please enable host PID or disable the host profiler

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label

@mackjmr mackjmr added the enhancement New feature or request label Dec 15, 2025
@mackjmr mackjmr added this to the v1.23.0 milestone Dec 16, 2025
@mackjmr mackjmr changed the title Mackjmr/host profiler feature Add Host Profiler Feature Dec 18, 2025
@mackjmr mackjmr marked this pull request as ready for review December 18, 2025 13:56
@mackjmr mackjmr requested review from a team as code owners December 18, 2025 13:56
Contributor

@maycmlee maycmlee left a comment

Just a tiny nit, but approving

Comment thread docs/configuration.v2alpha1.md Outdated

codecov-commenter commented Jan 9, 2026

Codecov Report

❌ Patch coverage is 56.20915% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 38.09%. Comparing base (f2d7517) to head (5154dd9).

Files with missing lines | Patch % | Lines
...oller/datadogagent/feature/hostprofiler/feature.go | 68.54% | 29 Missing and 10 partials ⚠️
...controller/datadogagent/component/agent/default.go | 0.00% | 20 Missing ⚠️
...nal/controller/datadogagent/feature/utils/utils.go | 0.00% | 5 Missing ⚠️
pkg/images/images.go | 0.00% | 3 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2403      +/-   ##
==========================================
+ Coverage   37.98%   38.09%   +0.11%     
==========================================
  Files         298      299       +1     
  Lines       25029    25182     +153     
==========================================
+ Hits         9508     9594      +86     
- Misses      14796    14853      +57     
- Partials      725      735      +10     
Flag Coverage Δ
unittests 38.09% <56.20%> (+0.11%) ⬆️

Files with missing lines | Coverage Δ
internal/controller/datadogagent/controller.go | 53.57% <ø> (ø)
...ontroller/datadogagent/override/podtemplatespec.go | 77.39% <100.00%> (+0.15%) ⬆️
pkg/images/images.go | 95.27% <0.00%> (-2.31%) ⬇️
...nal/controller/datadogagent/feature/utils/utils.go | 0.00% <0.00%> (ø)
...controller/datadogagent/component/agent/default.go | 36.36% <0.00%> (-1.14%) ⬇️
...oller/datadogagent/feature/hostprofiler/feature.go | 68.54% <68.54%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Comment thread internal/controller/datadogagent/feature/hostprofiler/feature.go Outdated
// If a user disabled HostPID manually, error out rather than enabling it for them.
if nodeAgent, ok := ddaSpec.Override[v2alpha1.NodeAgentComponentName]; ok {
	if nodeAgent.HostPID != nil && apiutils.BoolValue(nodeAgent.HostPID) == false {
		o.logger.Error(errHostPIDDisabledManually, "Host PID is required to run the host profiler. Please enable host PID or disable the host profiler")
Collaborator

nit - this won't be logged unless someone has node agent override configured. Probably log error only if hostProfilerEnabled == true and there is no override or no HostPID is set.

Member Author

nit - this won't be logged unless someone has node agent override configured

This is the goal. My assumption is that if hostPID was disabled manually by the user, then this was voluntary and we shouldn't enable it for the purpose of the host profiler feature. Instead I want to error out. That said, based on #2403 (comment), I guess I can just always force-enable it?

Collaborator

sorry, I misunderstood intent here. What you say makes sense and could go either way so we can leave it as is.

Member Author

I'll remove it to stay consistent with other paths that do this.

if nodeAgent, ok := ddaSpec.Override[v2alpha1.NodeAgentComponentName]; ok {
	if nodeAgent.HostPID != nil && apiutils.BoolValue(nodeAgent.HostPID) == false {
		o.logger.Error(errHostPIDDisabledManually, "Host PID is required to run the host profiler. Please enable host PID or disable the host profiler")
		o.hostPIDDisabledManually = true
Collaborator

nit - recently there was a change (#2365) in the npm and usm features to enable host PID when system probe is enabled. I think the same can be done here.

Member Author

I do enable it:

managers.PodTemplateSpec().Spec.HostPID = *apiutils.NewBoolPointer(true)

But based on the other PR, I guess I can skip the check that hostPID was disabled manually altogether and just force-enable it?

Collaborator

see above comment.

Comment thread internal/controller/datadogagent/defaults/datadogagent_default.go Outdated
Comment thread pkg/images/images.go

// GetLatestHostProfilerImage returns the latest host profiler image
func GetLatestHostProfilerImage() string {
	image := newImage(DockerHubContainerRegistry, DefaultHostProfilerDevImageName, DefaultHostProfilerDevImageLatestTag, false, false, false)
Collaborator

This doesn't really work and isn't really needed at all. This is called when the DaemonSet is initialized and is overwritten by global defaults, so it's reset to gcr here.

To override the container-level image, you can include this annotation alongside the one that enables the profiler (this mechanism was added by #1730):

    agent.datadoghq.com/host-profiler-enabled: "true"
    experimental.agent.datadoghq.com/image-override-config: |
      {
        "host-profiler": {
          "name": "docker.io/datadog/ddot-ebpf-dev:nightly-latest"
        }
      }  
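
The annotation value above is a JSON object keyed by container name. As a rough illustration of how such a value could be decoded, here is a minimal Go sketch. The JSON shape and the `host-profiler` key come from the annotation shown in this comment; the `imageOverride` type and `parseImageOverrides` function are hypothetical, and the operator's real schema (added in #1730) may carry more fields than `name`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageOverride is a hypothetical type modelling only the "name" field
// visible in the annotation; the operator's actual schema may differ.
type imageOverride struct {
	Name string `json:"name"`
}

// parseImageOverrides decodes the annotation value into a map keyed by
// container name.
func parseImageOverrides(raw string) (map[string]imageOverride, error) {
	overrides := map[string]imageOverride{}
	if err := json.Unmarshal([]byte(raw), &overrides); err != nil {
		return nil, fmt.Errorf("invalid image override annotation: %w", err)
	}
	return overrides, nil
}

func main() {
	raw := `{
	  "host-profiler": {
	    "name": "docker.io/datadog/ddot-ebpf-dev:nightly-latest"
	  }
	}`
	overrides, err := parseImageOverrides(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(overrides["host-profiler"].Name)
}
```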

Member Author

@mackjmr mackjmr Jan 12, 2026

This doesn't really work and isn't really needed at all.

Yeah, I noticed that and had to set registry: docker.io/datadog

I'm fine with using the annotation until we get an official image. Should I just unset the image field:

in the container struct in that case?

Collaborator

You can remove docker.io/GetLatestHostProfilerImage and use the default image.

@mackjmr mackjmr mentioned this pull request Jan 13, 2026
3 tasks
@levan-m levan-m merged commit b4b6e01 into main Jan 13, 2026
33 checks passed
@levan-m levan-m deleted the mackjmr/host-profiler-feature branch January 13, 2026 19:56