Add enterprise runtime hardening #6

Closed
tasuke-pochira wants to merge 2 commits into Proxy-Pointer:main from tasuke-pochira:enterprise/runtime-retrieval-hardening

Contributor

@tasuke-pochira tasuke-pochira commented May 27, 2026

Summary

This PR adds enterprise runtime hardening across the Proxy-Pointer suite. It focuses on making the project easier to operate, validate, debug, and scale in controlled enterprise-style deployments.

The main changes are:

  • Shared model runtime helpers for retry/backoff, timeout-aware generation calls, rate-limit detection, and JSON response parsing.
  • Shared document manifest and Markdown line-cache utilities for faster indexed-document resolution.
  • DocComparator section processing moved into a bounded parallel pipeline.
  • Metadata-only audit event support for DocComparator comparison stages.
  • New local runtime commands:
    • pprag doctor
    • pprag doctor --json
    • pprag eval runtime
  • Docker, Docker Compose, and production environment template for hosted pilot deployments.
  • GitHub Actions test workflow for automatic validation.
  • Expanded tests covering runtime helpers, manifest behavior, caching, CLI dispatch, and section-pipeline concurrency.

Why

The repo already supports strong local RAG workflows, but enterprise pilots need more than retrieval quality. They need predictable runtime behavior, measurable performance, operational checks, safer configuration, and repeatable validation.

This PR moves the codebase in that direction without changing the core Proxy-Pointer retrieval architecture.

Runtime Hardening

A new shared runtime module centralizes common model-call behavior:

  • retry/backoff for rate-limit and quota errors
  • timeout-aware generation calls
  • common response text extraction
  • fenced JSON cleanup
  • robust JSON object parsing

This reduces duplicated retry/parsing logic across Text-Only, MultiModal, and DocComparator paths.
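
As an illustration of the helpers listed above, here is a minimal sketch of retry/backoff with a per-attempt timeout, plus tolerant JSON object parsing. Every name in it (`is_rate_limit_error`, `call_with_retry`, `parse_json_object`, the marker strings, the default values) is hypothetical and chosen for this example; the PR's actual module may be structured differently.

```python
import json
import random
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical marker strings; real provider errors may need other checks.
RATE_LIMIT_MARKERS = ("429", "rate limit", "quota", "resource exhausted")


def is_rate_limit_error(exc):
    """Heuristic: the error message mentions a rate-limit or quota problem."""
    message = str(exc).lower()
    return any(marker in message for marker in RATE_LIMIT_MARKERS)


def call_with_retry(fn, *, attempts=4, base_delay=1.0, timeout=30.0, sleep=time.sleep):
    """Run fn() with a per-attempt timeout; retry rate-limit errors with backoff."""
    for attempt in range(attempts):
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            return pool.submit(fn).result(timeout=timeout)
        except FutureTimeout:
            # Caveat: the timed-out call keeps running in its worker thread.
            raise TimeoutError(f"model call exceeded {timeout}s") from None
        except Exception as exc:
            if not is_rate_limit_error(exc) or attempt == attempts - 1:
                raise
            # Exponential backoff with a little jitter: 1s, 2s, 4s, ...
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
        finally:
            pool.shutdown(wait=False)


def parse_json_object(text):
    """Parse a JSON object, tolerating code fences or prose around it."""
    cleaned = text.strip()
    try:
        value = json.loads(cleaned)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, which skips any fence lines.
        start, end = cleaned.find("{"), cleaned.rfind("}")
        if start == -1 or end <= start:
            raise
        value = json.loads(cleaned[start:end + 1])
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value
```

Passing `sleep` as a parameter keeps the backoff path testable without real delays.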

Document Manifest and Cache

A new document manifest layer records indexed Markdown documents by document id. This allows DocComparator to resolve doc_id -> md_path through a persisted manifest instead of repeatedly scanning directories and hashing files.

A shared Markdown line cache also avoids repeated file reads during context assembly.

Benefits:

  • faster repeated lookups
  • fewer full-file hash operations
  • more inspectable index state
  • better foundation for stale-index detection later

DocComparator Pipeline Upgrade

DocComparator now has a dedicated comparison pipeline helper that handles:

  • Doc 2 cross-retrieval
  • section comparison
  • deterministic result ordering
  • bounded parallel selected-section processing
  • metadata-only audit events

The previous implementation processed selected Doc 1 sections serially inside the Streamlit app. This PR moves that behavior into testable core code and adds DC_SECTION_CONCURRENCY to run multiple selected sections concurrently while preserving report order.
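
One common way to get bounded parallelism with stable output order is a thread pool whose `map` yields results in input order. The sketch below assumes that approach; `section_concurrency`, `compare_sections`, and `compare_one` are hypothetical names, and only the `DC_SECTION_CONCURRENCY` variable and its default of 2 come from this PR.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def section_concurrency(default=2):
    """Read DC_SECTION_CONCURRENCY, falling back to a conservative default."""
    try:
        return max(1, int(os.environ.get("DC_SECTION_CONCURRENCY", default)))
    except ValueError:
        return default


def compare_sections(sections, compare_one, max_workers=None):
    """Apply compare_one to each section with at most max_workers in flight."""
    workers = max_workers or section_concurrency()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order,
        # so the report keeps the order of the selected sections.
        return list(pool.map(compare_one, sections))
```

Threads suit this workload if each comparison spends most of its time waiting on a model API call, which is the assumption here.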

Audit Logging

Optional metadata-only audit logging was added.

It is disabled by default and only activates when either of these is configured:

  • PPRAG_AUDIT_LOG
  • DC_AUDIT_LOG

The audit events intentionally avoid prompts and document text. They record operational metadata such as section ids, match counts, result counts, elapsed time, and event type.
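
A metadata-only audit writer might be sketched as follows. It assumes that each variable holds a file path, that `PPRAG_AUDIT_LOG` takes precedence over `DC_AUDIT_LOG`, and that events are appended as JSON lines; the function names and the field allowlist are also hypothetical.

```python
import json
import os
import time

AUDIT_ENV_VARS = ("PPRAG_AUDIT_LOG", "DC_AUDIT_LOG")
# Hypothetical allowlist: anything else (prompts, document text) is dropped.
ALLOWED_FIELDS = {"section_id", "match_count", "result_count", "elapsed_seconds"}


def audit_log_path():
    """Return the configured log path, or None when auditing is disabled."""
    for name in AUDIT_ENV_VARS:
        value = os.environ.get(name, "").strip()
        if value:
            return value
    return None


def write_audit_event(event_type, **metadata):
    """Append one metadata-only event; return False if auditing is off."""
    path = audit_log_path()
    if path is None:
        return False
    record = {"event_type": event_type, "timestamp": time.time()}
    record.update({k: v for k, v in metadata.items() if k in ALLOWED_FIELDS})
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return True
```

Filtering through an allowlist, rather than a blocklist, means a new field cannot leak content into the log unless someone adds it deliberately.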

Runtime Commands

This PR adds two operational commands:

pprag doctor
pprag doctor --json

These check runtime readiness without loading the full application stack.

pprag eval runtime

This runs local deterministic runtime checks for cache and concurrency behavior without external API calls.
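
To make the idea of a lightweight readiness check concrete, here is a hypothetical sketch of a `doctor`-style entry point. The specific checks, the minimum Python version, and the output shape are invented for illustration and are not taken from the PR; only the `--json` flag appears above.

```python
import argparse
import importlib.util
import json
import sys


def collect_checks(required_modules=("json",)):
    """Gather readiness checks without importing the application itself."""
    checks = [{
        "name": "python_version",
        "ok": sys.version_info >= (3, 10),  # assumed minimum, not from the PR
        "detail": sys.version.split()[0],
    }]
    for module in required_modules:
        # find_spec locates a top-level module without importing it.
        found = importlib.util.find_spec(module) is not None
        checks.append({"name": f"module:{module}", "ok": found, "detail": module})
    return checks


def main(argv=None):
    parser = argparse.ArgumentParser(prog="pprag doctor")
    parser.add_argument("--json", action="store_true", dest="as_json")
    args = parser.parse_args(argv)
    checks = collect_checks()
    healthy = all(check["ok"] for check in checks)
    if args.as_json:
        print(json.dumps({"ok": healthy, "checks": checks}, indent=2))
    else:
        for check in checks:
            status = "ok" if check["ok"] else "FAIL"
            print(f"[{status}] {check['name']}: {check['detail']}")
    return 0 if healthy else 1
```

Returning a nonzero exit code on failure lets the same command serve as a container health check or a CI gate.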

Deployment Scaffold

This PR adds:

  • Dockerfile
  • docker-compose.yml
  • .dockerignore
  • deploy/production.env.example

These are intended for hosted pilot deployments and local deployment testing. The production env template documents runtime settings, trust flags, audit log location, and throughput knobs.

CI

A GitHub Actions workflow was added:

.github/workflows/tests.yml

It runs:

  • dependency installation through uv
  • whitespace validation with git diff --check
  • full test suite with pytest

Performance Notes

Local runtime evaluation showed:

section_serial_seconds: 0.121168
section_parallel_seconds: 0.031486
section_speedup: 3.85x

Focused section-pipeline proof showed:

serial: 0.2809s
parallel: 0.0713s
speedup: 3.94x
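
The reported speedups are the ratio of serial to parallel wall-clock time, which can be checked directly from the figures above:

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup is serial wall-clock time divided by parallel wall-clock time."""
    return serial_seconds / parallel_seconds


print(f"{speedup(0.121168, 0.031486):.2f}x")  # 3.85x
print(f"{speedup(0.2809, 0.0713):.2f}x")      # 3.94x
```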

Other expected performance improvements:

  • cached Markdown reads avoid repeated file reads
  • manifest lookup avoids repeated directory scans and full-file hashing
  • shared retry helpers make rate-limit behavior consistent across workflows

Validation

Validated locally with:

UV_CACHE_DIR=/tmp/uv-cache uv run --all-extras --group test pytest -q

Result:

22 passed, 4 warnings in 1.58s

Also validated with:

UV_CACHE_DIR=/tmp/uv-cache uv run --all-extras --group test python -m compileall -q src tests

  • The current provider SDK deprecation warning remains unchanged.
  • That migration should be handled in a separate focused compatibility PR because it touches all provider call paths.
  • Audit logging is disabled unless explicitly configured.
  • DC_SECTION_CONCURRENCY defaults conservatively to 2.

@Proxy-Pointer
Owner

The repo is not meant to be production-ready code, but it should be functional and simple enough that people can understand and adapt it without being overwhelmed by non-functional code. The last pull request, on DocComparator performance improvement, was useful. This one, however, goes too far: it adds complexity without adding any new or enhanced functionality.
