Add Transform Processor to Collector Layer #2218

@wpessers

Is your feature request related to a problem? Please describe.

The default collector layer build includes several processors for modifying telemetry data (attributes, resource, span), but there are transformation use cases that none of these included processors can currently address.

Describe the solution you'd like

Add the transform processor to the collector layer. This processor uses the OpenTelemetry Transformation Language (OTTL) to express arbitrary transformations on traces, metrics, and logs.
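As a rough illustration (not a tested layer config), enabling the processor would presumably look like any other collector processor: declare it under processors and reference it in a pipeline. The otlp receiver endpoints and the debug exporter below are assumptions chosen for the sketch; error_mode is an existing transform processor setting.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:4317"   # assumed endpoint
      http:
        endpoint: "localhost:4318"   # assumed endpoint

processors:
  transform:
    # Log statement errors and continue with the next statement,
    # rather than propagating the error up the pipeline.
    error_mode: ignore
    trace_statements:
      - set(span.attributes["env"], resource.attributes["deployment.environment"])

exporters:
  debug: {}   # assumed exporter, used here only to make the sketch complete

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform]
      exporters: [debug]
```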

Capabilities not covered by current processors

The following are things the transform processor can do that no combination of the currently included processors supports:

1. Metric type conversions and manipulation

Convert between metric types, extract sub-metrics, and scale values:

transform:
  metric_statements:
    - convert_sum_to_gauge() where metric.name == "system.processes.count"
    - extract_count_metric(true) where metric.name == "http.server.duration"
    - scale_metric(0.001) where metric.name == "faas.duration"
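    # This function takes a distribution and explicit bucket bounds; the bounds here are illustrative
    - convert_exponential_histogram_to_histogram("midpoint", [0.0, 10.0, 100.0, 1000.0, 10000.0])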

2. Log body and field manipulation

Current processors can't modify log record bodies or do pattern-based redaction across arbitrary fields:

transform:
  log_statements:
    - replace_pattern(body, "password=\\S+", "password=***REDACTED***")
    - set(attributes["log.level"], severity_text)

3. Cross-field and cross-level operations

Move or derive values across the resource/scope/signal hierarchy (the attributes processor can only copy within the same level):

transform:
  trace_statements:
    - set(attributes["env"], resource.attributes["deployment.environment"])
    - set(name, Concat([attributes["http.method"], " ", attributes["http.route"]], ""))
      where attributes["http.route"] != nil

4. Span event manipulation

None of the current processors can target span events (not even the span processor):

transform:
  trace_statements:
    - context: spanevent
      statements:
        - set(attributes["handled"], true)
          where name == "exception" and attributes["exception.type"] == "TimeoutError"

5. Complex conditional logic

The where clause provides full OTTL expressions, significantly more powerful than the include/exclude matching in the attributes/span processors:

transform:
  trace_statements:
    - set(attributes["error.category"], "server_error")
      where span.kind == SPAN_KIND_SERVER
        and Int(attributes["http.response.status_code"]) >= 500
        and resource.attributes["service.name"] == "my-lambda-function"

Overlap with existing processors

The transform processor could also fully replace the attributes, resource, and span processors:

| Existing processor | Transform processor equivalent |
| --- | --- |
| attributes: insert/update/upsert/delete/hash/extract/convert actions | set, delete_key, delete_matching_keys, SHA256, replace_pattern + where clauses |
| resource: upsert/insert/delete on resource attributes | Same OTTL statements in the resource context |
| span: rename from attributes, extract attributes from name, set status | set(name, ...), replace_pattern(name, ...), set(status.code, ...) |

This doesn't mean I'm advocating for removing those processors. They remain simpler and arguably more intuitive for common operations, and removing them would break existing user configurations. It does mean that users who choose to adopt the transform processor can consolidate multiple processor steps into one. Part of the outcome of this issue should be a decision on whether to eventually remove the other processors in favor of the transform processor.
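
To make the consolidation concrete, here is a hedged sketch of two chained processors and a single transform processor that should behave roughly the same for spans. The attribute keys, the "prod" value, and the /before and /after instance names are made up for illustration, and I haven't verified exact parity in edge cases.

```yaml
processors:
  # Before: two processors chained in the traces pipeline.
  attributes/before:
    actions:
      - key: env
        value: prod
        action: insert          # only sets the key if it is absent
      - key: http.request.header.authorization
        action: delete
  span/before:
    name:
      from_attributes: ["http.method", "http.route"]
      separator: " "

  # After: one transform processor covering the same steps for spans.
  transform/after:
    trace_statements:
      # insert semantics: only set when the attribute is missing
      - set(span.attributes["env"], "prod") where span.attributes["env"] == nil
      - delete_key(span.attributes, "http.request.header.authorization")
      # rename only when both source attributes are present
      - set(span.name, Concat([span.attributes["http.method"], span.attributes["http.route"]], " "))
        where span.attributes["http.method"] != nil and span.attributes["http.route"] != nil
```

Note that the attributes processor applies to whichever signal's pipeline it is placed in, while trace_statements only touches spans, so logs and metrics would need their own log_statements / metric_statements.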

Describe alternatives you've considered

Default build vs. custom-build-only

The transform processor could either be provided in the default build or made available only via custom builds.

A reasonable middle ground could be to start with custom-build-only and promote to the default build once there's evidence of demand or if binary size impact turns out to be negligible.

Measuring binary size impact

The incremental cost should be measured before deciding. Since pkg/ottl is already pulled in transitively, the difference will mainly be in the transform processor itself and its OTTL function registrations. A comparison build can be done to check this.

Tasks

  • Add the transform processor to the collector layer so that users can opt in to it in their custom layer builds.
  • Investigate impact on binary size / cold start
