Skip to content

Add OpenTelemetry and basic tracing + dev infra#7163

Open
johnewart wants to merge 9 commits intomainfrom
johnewart/otel
Open

Add OpenTelemetry and basic tracing + dev infra#7163
johnewart wants to merge 9 commits intomainfrom
johnewart/otel

Conversation

@johnewart
Copy link
Copy Markdown
Collaborator

Adds Open Telemetry along with basic tracing of FastAPI, httpx and sqlalchemy, a convenience tracing module to start new spans, and also adds the following to nox -s dev: Prometheus (for scraping metrics later), tempo (for collecting OTel traces), memcached (for tempo), and grafana (to view metrics/traces). When your dev setup is running, you can hit http://localhost:3000 and see Grafana. On the left is 'Drilldown > Traces' which will take you to the traces view; down at the bottom is a traces tab that will show you all the traces for the service.

Screenshot 2025-12-18 at 4 38 13 PM Screenshot 2025-12-18 at 4 38 19 PM

@johnewart johnewart requested a review from galvana December 19, 2025 00:39
@johnewart johnewart requested a review from a team as a code owner December 19, 2025 00:39
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented Dec 19, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Review Updated (UTC)
fides-plus-nightly Ignored Ignored Preview Dec 23, 2025 7:28pm
fides-privacy-center Ignored Ignored Dec 23, 2025 7:28pm

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Dec 19, 2025

Greptile Summary

This PR adds OpenTelemetry distributed tracing support to Fides, including:

  • New TracingSettings configuration for enabling/disabling tracing with options for OTLP endpoint, protocol, sample rate, and instrumentation toggles
  • Instrumentation for FastAPI, SQLAlchemy, httpx, and Redis
  • A convenience trace_span context manager for custom spans
  • Docker infrastructure for local development: Tempo (trace collection), Prometheus (metrics), Grafana (visualization), and Memcached

Key Issues Found:

  • Port 3000 conflict between Grafana and fides-ui services will prevent both from running simultaneously
  • The sample_rate configuration option is defined but ignored - traces use hardcoded ALWAYS_ON sampling
  • Several unused imports and variables in tracing.py

Confidence Score: 3/5

  • PR has a port conflict that will cause issues in dev environment, and dead code that should be cleaned up
  • The port 3000 conflict between Grafana and fides-ui is a blocking issue for dev environment. The sample_rate config being ignored is misleading to users. Tracing is disabled by default so production is safe.
  • docker/telemetry/docker-compose.yml (port conflict), src/fides/telemetry/tracing.py (unused code and ignored config)

Important Files Changed

Filename Overview
src/fides/telemetry/tracing.py New OpenTelemetry tracing module with unused imports, dead code (sample_rate config ignored), and an unused variable.
docker/telemetry/docker-compose.yml New telemetry stack docker-compose with port 3000 conflict with fides-ui service.
src/fides/config/tracing_settings.py New TracingSettings configuration class with sample_rate field that is not honored in implementation.
docker-compose.yml Added telemetry services as dependencies with port 3000 conflict between fides-ui and grafana.
src/fides/api/app_setup.py Added tracing initialization with import not at the top of file (import mixed with other imports).
requirements.txt Added OpenTelemetry dependencies with pinned versions.

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

17 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Comment thread docker/telemetry/docker-compose.yml Outdated
# OpenTelemetry imports are optional - app should work without them
try:
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Unused import - OTLPSpanExporter is imported here but local imports are used inside _create_otlp_exporter instead.

ConsoleSpanExporter,
SpanExporter,
)
from opentelemetry.sdk.trace.sampling import ParentBasedTraceIdRatio
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Unused import - ParentBasedTraceIdRatio is imported but never used (commented-out code on line 110).

# Determine if we're using gRPC or HTTP based on the endpoint
endpoint = config.tracing.otlp_endpoint
protocol = config.tracing.otlp_protocol
exporter = config.tracing.otlp_exporter
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: exporter variable is assigned but never used.

Comment thread src/fides/telemetry/tracing.py Outdated
johnewart and others added 3 commits December 18, 2025 16:46
Remove temporary increased sample rate

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Moved grafana to port 3010 to not conflict with fides-ui

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 49.50980% with 103 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.49%. Comparing base (f8da0a0) to head (a1be108).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
src/fides/telemetry/tracing.py 38.75% 83 Missing and 15 partials ⚠️
src/fides/api/db/ctl_session.py 50.00% 2 Missing and 1 partial ⚠️
src/fides/api/db/session.py 77.77% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7163      +/-   ##
==========================================
- Coverage   87.17%   85.49%   -1.69%     
==========================================
  Files         534      536       +2     
  Lines       35312    35515     +203     
  Branches     4113     4142      +29     
==========================================
- Hits        30783    30362     -421     
- Misses       3638     4221     +583     
- Partials      891      932      +41     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread requirements.txt
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing the Prometheus dependencies

prometheus-client==0.19.0
prometheus-fastapi-instrumentator==6.1.0

)

fastapi_app = FastAPI(title="fides", version=app_version, lifespan=lifespan, separate_input_output_schemas=False) # type: ignore
Instrumentator().instrument(fastapi_app).expose(fastapi_app, endpoint="/metrics")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this from the Swagger docs since this is just for Prometheus and not meant to be human-readable.

Suggested change
Instrumentator().instrument(fastapi_app).expose(fastapi_app, endpoint="/metrics")
Instrumentator().instrument(fastapi_app).expose(fastapi_app, endpoint="/metrics", include_in_schema=False)

workers_enabled=False, workers=[], queue_counts={}
).model_dump(mode="json")

with add_span_event("workers_health_check"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_span_event isn't a context manager, should this be trace_span?

Suggested change
with add_span_event("workers_health_check"):
with trace_span("workers_health_check"):

@@ -0,0 +1,2 @@
#!/bin/env sh
influx bucket create -n k6_tests -o ethyca -r 120d No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Comment on lines +35 to +38
otlp_exporter: Optional[str] = Field(
default="otlp",
description="OTLP exporter to use for exporting traces (otlp or console).",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?

Suggested change
otlp_exporter: Optional[str] = Field(
default="otlp",
description="OTLP exporter to use for exporting traces (otlp or console).",
)

Comment on lines +30 to +33
otlp_protocol: Optional[str] = Field(
default="http/protobuf",
description="OTLP protocol to use for exporting traces (http/protobuf or grpc).",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
otlp_protocol: Optional[str] = Field(
default="http/protobuf",
description="OTLP protocol to use for exporting traces (http/protobuf or grpc).",
)
otlp_protocol: Literal["http/protobuf", "grpc"] = Field(
default="http/protobuf",
description="OTLP protocol to use for exporting traces.",
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants