Skip to content

observability: rewrite /metrics admin command on Prometheus generate_latest (#2058 slice 1/2)#2082

Open
yastman wants to merge 1 commit into
devfrom
observability/2058-drop-pipelinemetrics
Open

observability: rewrite /metrics admin command on Prometheus generate_latest (#2058 slice 1/2)#2082
yastman wants to merge 1 commit into
devfrom
observability/2058-drop-pipelinemetrics

Conversation

@yastman
Copy link
Copy Markdown
Owner

@yastman yastman commented May 22, 2026

This pull request was created by @kiro-agent on behalf of @yastman 👻

Comment with /kiro fix to address specific feedback or /kiro all to address everything.
Learn about Kiro autonomous agent


Parent: #1648 (cleanup) → #2058.

#2058 asks to remove the rolling-window PipelineMetrics surface after Prometheus counters/histograms become authoritative. The full removal requires migrating ~7 PipelineMetrics.get() call-sites (cache.py, rerank.py, retrieve.py, generate_response.py) and ~5 unit-test modules that mock PipelineMetrics. That's a large multi-file change.

This PR delivers slice 1/2: rewrite the user-facing admin /metrics Telegram command on the SDK-native generate_latest() exposition format. It isolates the user-visible change from the internal class removal.

Before

async def cmd_metrics(bot: PropertyBot, message: Message) -> None:
    """Handle /metrics command - show pipeline p50/p95 timing stats."""
    metrics = PipelineMetrics.get()
    text = f"```\n{metrics.format_text()}\n```"
    await message.answer(text, parse_mode="Markdown")

After

async def cmd_metrics(bot: PropertyBot, message: Message) -> None:
    """Handle /metrics command — emit current Prometheus metrics text format."""
    from prometheus_client import REGISTRY, generate_latest

    payload = generate_latest(REGISTRY).decode("utf-8")
    if len(payload) > 3500:
        payload = payload[:3500] + "\n# ...truncated"
    text = f"```\n{payload}\n```"
    await message.answer(text, parse_mode="Markdown")

User-facing admin behaviour preserved (Markdown code-block, single message). The 3500-char truncation guards Telegram's 4096-char message limit when the registry is large in production.

Tests updated

  • tests/unit/handlers/test_command_handlers.py::TestCmdMetrics — asserts on generate_latest() output (Prometheus exposition format markers + a sentinel Counter).
  • tests/unit/test_bot_handlers.py::test_cmd_metrics — same pattern.

New contract test

tests/contract/test_admin_metrics_uses_prometheus_contract.py with two invariants:

  1. cmd_metrics() body calls generate_latest(...).
  2. cmd_metrics() body does not call format_text() (modulo the docstring, which legitimately names the deprecated symbol it replaces).

Slice 2/2 — class removal — deferred

The follow-up that removes the PipelineMetrics class itself (and migrates the remaining 7 call-sites) should land after #2056 merges the record_counter_metricrecord_pipeline_event migration. I'll leave a comment on #2058 reflecting that.

Verification

uv run pytest tests/contract/test_admin_metrics_uses_prometheus_contract.py
              tests/unit/handlers/test_command_handlers.py
              tests/unit/test_bot_handlers.py -q
  207 passed

uvx ruff check telegram_bot/handlers/command_handlers.py \
               tests/contract/test_admin_metrics_uses_prometheus_contract.py
  All checks passed!

Refs #2058 (slice 1/2 — class removal in follow-up).

…latest (#2058 slice 1/2)

Parent: #1648 (cleanup), #1648 slice 4/4 closeout.

Issue #2058 asks to remove the rolling-window PipelineMetrics surface
after Prometheus counters/histograms are authoritative. The full removal
requires migrating ~7 PipelineMetrics.get() call-sites in cache.py,
rerank.py, retrieve.py, generate_response.py and updating ~5 unit-test
modules that mock PipelineMetrics. That is a large multi-file change.

This PR delivers slice 1/2: rewrite the user-facing admin /metrics
Telegram command on the SDK-native generate_latest() exposition format.
This isolates the user-visible change from the internal class removal:

  Before:
    metrics = PipelineMetrics.get()
    text = f''

  After:
    payload = generate_latest(REGISTRY).decode('utf-8')
    if len(payload) > 3500:
        payload = payload[:3500] + '\n# ...truncated'
    text = f''

User-facing admin behaviour preserved (Markdown code-block, single
message). The 3500-char truncation guards Telegram's 4096-char message
limit when the registry is large in production.

Tests updated:
- tests/unit/handlers/test_command_handlers.py::TestCmdMetrics —
  asserts on generate_latest() output (Prometheus exposition format
  markers + a sentinel Counter).
- tests/unit/test_bot_handlers.py::test_cmd_metrics — same pattern.

New contract test:
- tests/contract/test_admin_metrics_uses_prometheus_contract.py with
  two invariants:
    1. cmd_metrics() body calls generate_latest(...).
    2. cmd_metrics() body does not call format_text() (modulo the
       docstring, which is legitimately allowed to *name* the
       deprecated symbol it replaces).

Slice 2/2 (PipelineMetrics class removal) remains open and should land
after #2056 merges the record_counter_metric → record_pipeline_event
migration.

Verification:
  uv run pytest tests/contract/test_admin_metrics_uses_prometheus_contract.py
                tests/unit/handlers/test_command_handlers.py
                tests/unit/test_bot_handlers.py -q
    207 passed
  uvx ruff check tests/contract/test_admin_metrics_uses_prometheus_contract.py
                 telegram_bot/handlers/command_handlers.py
    All checks passed!

Refs #2058. (Slice 1/2 — class removal deferred to follow-up.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants