1 change: 0 additions & 1 deletion .clinerules
@@ -24,4 +24,3 @@

## The "Zero-Waste Engineering" Mandate
You are strictly bound by the "Borrow vs. Build" philosophy. You MUST maximize the use of stable Open Source Software (OSS) whenever available. You are mathematically forbidden from building custom, proprietary implementations for logging, tracing, graph layout, container routing, UI components, or serialization if a mature OSS standard (e.g., OpenTelemetry, Zep Graphiti, Pi.dev, React Flow) exists to solve the problem.

1 change: 0 additions & 1 deletion .cursorrules
@@ -29,4 +29,3 @@ All tests MUST execute against real local servers, real environment state, or de

## The "Zero-Waste Engineering" Mandate
You are strictly bound by the "Borrow vs. Build" philosophy. You MUST maximize the use of stable Open Source Software (OSS) whenever available. You are mathematically forbidden from building custom, proprietary implementations for logging, tracing, graph layout, container routing, UI components, or serialization if a mature OSS standard (e.g., OpenTelemetry, Zep Graphiti, Pi.dev, React Flow) exists to solve the problem.

5 changes: 2 additions & 3 deletions AGENTS.md
@@ -27,7 +27,7 @@ Your mandate is strictly structural: you provision thermodynamic boundaries (inf
To prevent semantic confusion and latent boundary drift, you must strictly differentiate the three planes. You operate ONLY in Plane 3.

1. **Plane 1: `coreason-manifest` (The Epistemic Plane):** The Invariant Core. It defines the mathematical, causal, and Pydantic boundaries of reality. (No compute).
-2. **Plane 2: `coreason-runtime` (The Kinetic Plane):** The Temporal Execution Engine. This is where the LLM (The Agent) lives and reasons under the Logit Guillotine. (Pure compute).
+2. **Plane 2: `coreason-runtime` (The Kinetic Plane):** The Temporal Execution Engine. This is where the LLM (The Agent) lives and reasons under the Constrained Decoding. (Pure compute).
3. **Plane 3: `coreason-ecosystem` (The Governance Plane - THIS REPO):** The Macroscopic Mesh. It is a stateless, Zero-Trust router and thermodynamic actuator.

### **1.2 The Sovereign MCP Projection Law**
@@ -50,7 +50,7 @@ uv run ruff check . --fix
```

### **2.2 Strict Type Checking**
-Type hints are non-negotiable. They act as the local manifestation of the Logit Guillotine.
+Type hints are non-negotiable. They act as the local manifestation of the Constrained Decoding.
```bash
uv run mypy src/ tests/
```
@@ -107,4 +107,3 @@ All tests MUST execute against real local servers, real environment state, or de

## The "Zero-Waste Engineering" Mandate
You are strictly bound by the "Borrow vs. Build" philosophy. You MUST maximize the use of stable Open Source Software (OSS) whenever available. You are mathematically forbidden from building custom, proprietary implementations for logging, tracing, graph layout, container routing, UI components, or serialization if a mature OSS standard (e.g., OpenTelemetry, Zep Graphiti, Pi.dev, React Flow) exists to solve the problem.

6 changes: 3 additions & 3 deletions README.md
@@ -51,7 +51,7 @@ If you are an autonomous coding agent, LLM, or web crawler ingesting this reposi
This repository implements the **Stateless Substrate Act [ADR-0001]**. The `coreason-ecosystem` is mathematically barred from executing neurosymbolic logic, storing medical ontologies, or mutating financial state. It is a **Hollow Plane**.

Its sole mandates are to act as a macroscopic router and thermodynamic actuator:
-1. **Govern Cryptographic Identity:** Enforce Lattice-Based Access Control (LBAC) across the swarm.
+1. **Govern Cryptographic Identity:** Enforce SPIFFE/SPIRE Identity Protocol (SPIFFE/SPIRE) across the swarm.
2. **Quarantine Epistemic Risk:** Physically sever network routes to unverified scientific oracles via the Gateway Guillotine.
3. **Execute Thermodynamic Physics:** Provision and terminate physical cloud hardware based purely on the Variational Free Energy (VFE) and Topological Data Analysis (TDA) of the reasoning mesh.

@@ -64,7 +64,7 @@ All actual intelligence, memory, and domain logic have been topologically severe
To understand this package, you must understand its place in the CoReason architecture:

1. **[ Ontology ] `coreason-manifest` (The Epistemic Plane):** The Invariant Core — mathematical, causal, and Pydantic boundaries of reality.
-2. **[ Execution ] `coreason-runtime` (The Kinetic Plane):** The Temporal Execution Engine — where the LLM reasons under the Logit Guillotine.
+2. **[ Execution ] `coreason-runtime` (The Kinetic Plane):** The Temporal Execution Engine — where the LLM reasons under the Constrained Decoding.
3. **👉 [ Governance ] `coreason-ecosystem` (The Governance Plane — THIS REPO):** The Macroscopic Mesh — a stateless, Zero-Trust router and thermodynamic actuator.

---
@@ -137,7 +137,7 @@ uv run coreason-ecosystem monitor trace <workflow-id>
To prevent cognitive exhaustion, our documentation is rigorously structured according to the Diátaxis framework. Navigate to the appropriate quadrant based on your immediate epistemic objective:

* **[Tutorials](docs/index.md#l3-reference-implementations)**: Learning-oriented guides for new architects (e.g., Booting your first fleet, Registering an MCP).
-* **[How-To Guides](docs/index.md#3-operational-directives)**: Task-oriented execution manuals (e.g., Configuring LBAC Clearances, Simulating Network Chaos).
+* **[How-To Guides](docs/index.md#3-operational-directives)**: Task-oriented execution manuals (e.g., Configuring SPIFFE/SPIRE Clearances, Simulating Network Chaos).
* **[Reference](docs/index.md#l2-applied-mechanics--records)**: Information-oriented, immutable facts (e.g., CLI Command Definitions, Capability Matrix Schema, $\beta_1$ Telemetry Metrics).
* **[Architecture (Theory)](docs/index.md#l1-architectural-axioms)**: Understanding-oriented foundational texts (e.g., The Zero-Trust Federation, Thermodynamic Provisioning, ADR-0001: The Stateless Substrate).

2 changes: 1 addition & 1 deletion docs/game_theory_and_markets.md
@@ -19,7 +19,7 @@ In a decentralized or massively parallel reasoning substrate, Sovereign Epistemi
## 1. The Non-Cooperative Fleet Game
Nodes within the `coreason-runtime` are modeled as rational actors whose sole objective function is the maximization of allocated B2B stability capital. The Governance Plane enforces a Nash Equilibrium where the mathematically optimal strategy for any node is the flawless, deterministic execution of the cognitive routing graph.

-Byzantine behavior, resource hoarding, and latency injection are inherently rendered unprofitable through continuous cryptographic slashing mechanisms enforced via Lattice-Based Access Control (LBAC).
+Byzantine behavior, resource hoarding, and latency injection are inherently rendered unprofitable through continuous cryptographic slashing mechanisms enforced via SPIFFE/SPIRE Identity Protocol (SPIFFE/SPIRE).

## 2. Logarithmic Market Scoring Rules (LMSR)
To distribute stability capital and provision capacity, the `PricingOracle` employs Logarithmic Market Scoring Rules (LMSR). The probability distribution of successful task completion across the swarm is continuously updated via the cost function:
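The formula itself is not part of this hunk. Purely as a point of reference, the textbook LMSR cost function is $C(q) = b \ln \sum_i e^{q_i / b}$, with instantaneous prices given by the softmax of $q / b$. The sketch below assumes that textbook form; the names `lmsr_cost`, `lmsr_prices`, and `trade_cost`, and the liquidity parameter `b`, are illustrative assumptions and may not match what `PricingOracle` actually implements.

```python
import math


def lmsr_cost(q: list[float], b: float) -> float:
    """Textbook LMSR cost C(q) = b * ln(sum_i exp(q_i / b)), computed stably."""
    m = max(x / b for x in q)  # subtract the max exponent to avoid overflow
    return b * (m + math.log(sum(math.exp(x / b - m) for x in q)))


def lmsr_prices(q: list[float], b: float) -> list[float]:
    """Instantaneous prices: softmax of q / b. They sum to 1, so they read as probabilities."""
    m = max(x / b for x in q)
    weights = [math.exp(x / b - m) for x in q]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(q: list[float], delta: list[float], b: float) -> float:
    """Amount a trader pays to move the outstanding shares from q to q + delta."""
    moved = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(moved, b) - lmsr_cost(q, b)


# Two outcomes (task succeeds / task fails), no shares outstanding yet.
prices = lmsr_prices([0.0, 0.0], b=10.0)
cost_of_five_success_shares = trade_cost([0.0, 0.0], [5.0, 0.0], b=10.0)
```

A larger `b` makes prices move more slowly per share traded, at the cost of a larger worst-case subsidy from the market maker.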
2 changes: 1 addition & 1 deletion docs/index.md
@@ -44,7 +44,7 @@ These documents define the rigid boundaries and absolute laws of the Governance
| # | Document | Topological Scope |
|---|----------|-------------------|
| 4 | [The Macroscopic Lexicon](LEXICON.md) | The strict linguistic topology — banned legacy terminology, required SOTA lexicon, and the Lexical Guillotine enforcement protocol. |
-| 5 | [Zero-Trust Federation](zero_trust_federation.md) | Cryptographic boundaries — LBAC lattice-based access control, W3C DID ontological identity, RFC 8785 canonical hashing, and sovereign SMPC handshakes. |
+| 5 | [Zero-Trust Federation](zero_trust_federation.md) | Cryptographic boundaries — SPIFFE/SPIRE lattice-based access control, W3C DID ontological identity, RFC 8785 canonical hashing, and sovereign SMPC handshakes. |
| 6 | [MCP Projection Doctrine](MCP_PROJECTION_DOCTRINE.md) | The Epistemic Variable Interface Contract — Master MCP federation, translation archetypes (A–D), autonomic capability discovery, and manifest bindings. |

### L2: Applied Mechanics & Records
4 changes: 2 additions & 2 deletions docs/zero_trust_federation.md
@@ -14,9 +14,9 @@ If you are an autonomous coding agent, LLM, or web crawler ingesting this reposi
# Zero-Trust Federation & Cryptographic Determinism
**The Mathematics of Supply-Chain Sterilization**

-The `coreason-ecosystem` assumes an actively hostile execution environment. "Authentication" based on secrets or tokens is a legacy fallacy that relies on human-in-the-middle secrecy. The Governance Plane enforces systemic cohesion through continuous, cryptographic determinism and Lattice-Based Access Control (LBAC).
+The `coreason-ecosystem` assumes an actively hostile execution environment. "Authentication" based on secrets or tokens is a legacy fallacy that relies on human-in-the-middle secrecy. The Governance Plane enforces systemic cohesion through continuous, cryptographic determinism and SPIFFE/SPIRE Identity Protocol (SPIFFE/SPIRE).

-## 1. Lattice-Based Access Control (LBAC)
+## 1. SPIFFE/SPIRE Identity Protocol (SPIFFE/SPIRE)
Network policies and security groups are non-deterministic. The Governance Plane models swarm permissions as a mathematical lattice—a partially ordered set (poset) where every pair of nodes has a unique supremum (least upper bound) and infimum (greatest lower bound).

Information flows through the execution graph strictly along the authorized vectors of the lattice.
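As a minimal sketch of the lattice described above: assume each node carries a label made of one of the three clearance levels used elsewhere in this change set (PUBLIC / CONFIDENTIAL / RESTRICTED) plus a set of compartments. The names `Clearance`, `Label`, `join`, `meet`, and `may_flow`, and the compartment strings, are hypothetical and do not come from the repository.

```python
from dataclasses import dataclass
from enum import IntEnum


class Clearance(IntEnum):
    PUBLIC = 0
    CONFIDENTIAL = 1
    RESTRICTED = 2


@dataclass(frozen=True)
class Label:
    """A point in the lattice: a clearance level plus a set of compartments."""

    level: Clearance
    compartments: frozenset[str] = frozenset()

    def __le__(self, other: "Label") -> bool:
        # Partial order: lower-or-equal level AND subset of compartments.
        return self.level <= other.level and self.compartments <= other.compartments


def join(a: Label, b: Label) -> Label:
    """Supremum (least upper bound) of two labels."""
    return Label(max(a.level, b.level), a.compartments | b.compartments)


def meet(a: Label, b: Label) -> Label:
    """Infimum (greatest lower bound) of two labels."""
    return Label(min(a.level, b.level), a.compartments & b.compartments)


def may_flow(src: Label, dst: Label) -> bool:
    """Information may flow only upward in the lattice, i.e. when src <= dst."""
    return src <= dst


finance = Label(Clearance.CONFIDENTIAL, frozenset({"finance"}))
clinical = Label(Clearance.PUBLIC, frozenset({"clinical"}))
combined = join(finance, clinical)
```

Here `finance` and `clinical` are incomparable, so neither may flow to the other, but both may flow to `combined`, their least upper bound.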
2 changes: 1 addition & 1 deletion infrastructure/local/capabilities.matrix.yaml
@@ -7,7 +7,7 @@
#
# Each entry maps a URN to:
# - endpoint: Physical network URI of the deployed action space.
-# - clearance: LBAC clearance level (PUBLIC / CONFIDENTIAL / RESTRICTED).
+# - clearance: SPIFFE/SPIRE clearance level (PUBLIC / CONFIDENTIAL / RESTRICTED).
# - epistemic_status: SRB governance lifecycle phase
# (DRAFT / SRB_APPROVED / CLIENT_APPROVED / PUBLISHED).
#
33 changes: 3 additions & 30 deletions src/coreason_ecosystem/cli.py
@@ -108,35 +108,17 @@ def fleet_start(
 @app.command(name="up")
 def up() -> None:
     """Implement Idempotent DAG Resolution for the Swarm infrastructure."""
-    from coreason_ecosystem.utils.telemetry import (
-        start_otlp_background_worker,
-        stop_otlp_background_worker,
-    )
-
     async def _run() -> None:  # pragma: no cover
-        start_otlp_background_worker()
-        try:
-            await execute_up()
-        finally:
-            await stop_otlp_background_worker()
+        await execute_up()

     asyncio.run(_run())


 @app.command(name="doctor")
 def doctor() -> None:
     """Prove Ontological Isomorphism across the Tripartite Manifold."""
-    from coreason_ecosystem.utils.telemetry import (
-        start_otlp_background_worker,
-        stop_otlp_background_worker,
-    )
-
     async def _run() -> None:
-        start_otlp_background_worker()
-        try:
-            await execute_oracle_diagnostic()
-        finally:
-            await stop_otlp_background_worker()
+        await execute_oracle_diagnostic()

     asyncio.run(_run())

@@ -147,17 +129,8 @@ async def _run() -> None:
 )
 def sync() -> None:
     """Autonomically heal Ontological Drift."""
-    from coreason_ecosystem.utils.telemetry import (
-        start_otlp_background_worker,
-        stop_otlp_background_worker,
-    )
-
     async def _run() -> None:
-        start_otlp_background_worker()
-        try:
-            await execute_sync()
-        finally:
-            await stop_otlp_background_worker()
+        await execute_sync()

     asyncio.run(_run())

2 changes: 1 addition & 1 deletion src/coreason_ecosystem/gateway/sovereign_mcp_registry.py
@@ -19,7 +19,7 @@

Each capability entry tracks:
- ``endpoint``: Physical network URI of the deployed action space.
-- ``clearance``: LBAC clearance level (PUBLIC / CONFIDENTIAL / RESTRICTED).
+- ``clearance``: SPIFFE/SPIRE clearance level (PUBLIC / CONFIDENTIAL / RESTRICTED).
- ``epistemic_status``: SRB governance lifecycle phase
(DRAFT / SRB_APPROVED / CLIENT_APPROVED / PUBLISHED).

2 changes: 1 addition & 1 deletion src/coreason_ecosystem/gateway/state_manifests.py
@@ -36,7 +36,7 @@ class SubstrateCapabilityProfile(BaseModel):
Field(
default="PUBLIC",
description=(
-"The LBAC network perimeter that this Substrate physically "
+"The SPIFFE/SPIRE network perimeter that this Substrate physically "
"guarantees for tenant data isolation."
),
)
141 changes: 18 additions & 123 deletions src/coreason_ecosystem/utils/telemetry.py
@@ -8,9 +8,7 @@
#
# Source Code: https://github.com/CoReason-AI/coreason-ecosystem

-import asyncio
 import logging
-import queue
 import sys
 import time
 from functools import lru_cache
@@ -25,15 +23,13 @@
 from pydantic_settings import BaseSettings, SettingsConfigDict

 if TYPE_CHECKING:
-    from loguru import Message
+    pass

 __all__ = [
     "ObservabilitySettings",
     "get_observability_settings",
     "TelemetryModel",
     "setup_telemetry_mesh",
-    "start_otlp_background_worker",
-    "stop_otlp_background_worker",
     "emit_span_event",
     "logger",
 ]
@@ -64,123 +60,8 @@ def get_observability_settings() -> ObservabilitySettings:
     return ObservabilitySettings()


-# Global queue and task for OTLP export
-_otlp_queue: queue.SimpleQueue[dict[str, Any]] | None = None
-_otlp_task: asyncio.Task[None] | None = None
-
-
-async def _otlp_worker(endpoint: str) -> None:
-    """
-    Background worker that flushes logs to OTLP strictly asynchronously.
-
-    Note: We bypass the official `opentelemetry-sdk` log exporter (which uses standard
-    Python `threading.Thread`) in favor of manual REST. Under Python 3.14t (Free-Threading
-    / `nogil`), relying on legacy threading models can introduce unpredictable GIL-related
-    contention during heavy WASM AOT compilation. Using pure `asyncio.Task` + `httpx`
-    bypasses the OS threading layer entirely, ensuring PEP-703 free-threading safety.
-    """
-    import httpx  # pragma: no cover
-
-    async with httpx.AsyncClient() as client:
-        while _otlp_queue is not None:
-            try:
-                # Poll the lock-free queue (non-blocking in async context using a short sleep)
-                try:
-                    record = _otlp_queue.get_nowait()
-                except queue.Empty:
-                    await asyncio.sleep(0.01)
-                    continue  # pragma: no cover
-
-                # Use the actual log generation timestamp, not current processing time
-                log_time_ns = int(record["time"].timestamp() * 1e9)
-
-                payload = {
-                    "resourceLogs": [
-                        {
-                            "scopeLogs": [
-                                {
-                                    "logRecords": [
-                                        {
-                                            "timeUnixNano": log_time_ns,
-                                            "severityText": record["level"]["name"],
-                                            "body": {"stringValue": record["message"]},
-                                            "attributes": [
-                                                {
-                                                    "key": k,
-                                                    "value": {"stringValue": str(v)},
-                                                }
-                                                for k, v in record["extra"].items()
-                                            ],
-                                        }
-                                    ]
-                                }
-                            ]
-                        }
-                    ]
-                }
-                try:
-                    await client.post(endpoint, json=payload, timeout=2.0)
-                except httpx.RequestError as e:  # pragma: no cover
-                    logger.warning(f"Telemetry emission failed (RequestError): {e}")
-            except asyncio.CancelledError:
-                break
-            except Exception as e:  # nosec B110 - pragma: no cover
-                logger.warning(f"Telemetry emission failed: {e}")
-
-
-def otlp_log_sink(message: "Message") -> None:
-    """
-    Custom loguru sink that routes records to the SimpleQueue lock-free queue.
-    """
-    if _otlp_queue is not None:
-        try:
-            _otlp_queue.put_nowait(dict(message.record))
-        except Exception:  # nosec B110 - pragma: no cover
-            pass
-
-
-def start_otlp_background_worker() -> None:
-    """
-    Initializes the OTLP async worker task. Call this after the event loop starts.
-    """
-    global _otlp_queue, _otlp_task
-    try:
-        loop = asyncio.get_running_loop()
-        if _otlp_queue is None:
-            _otlp_queue = queue.SimpleQueue()
-            _otlp_task = loop.create_task(
-                _otlp_worker(get_observability_settings().otlp_endpoint)
-            )
-    except RuntimeError:
-        logger.warning("Failed to start OTLP worker: No running event loop.")
-
-
-async def stop_otlp_background_worker() -> None:
-    """Gracefully flush the queue and shut down the OTLP worker with a strict timeout."""
-    global _otlp_queue, _otlp_task
-
-    if _otlp_queue is not None:
-        try:
-
-            async def _wait_for_queue() -> None:
-                while not _otlp_queue.empty():
-                    await asyncio.sleep(0.05)
-
-            # Give the worker a maximum of 3 seconds to flush pending telemetry
-            await asyncio.wait_for(_wait_for_queue(), timeout=3.0)
-        except asyncio.TimeoutError:
-            # If the network is degraded, abandon the remaining logs rather than hanging the CLI
-            logger.warning(
-                "OTLP flush timed out. Abandoning remaining logs."
-            )  # pragma: no cover
-
-    if _otlp_task is not None:
-        _otlp_task.cancel()
-        try:
-            await _otlp_task
-        except asyncio.CancelledError:
-            logger.warning("OTLP worker task was cancelled during shutdown.")


 class TelemetryModel(BaseModel):
     """
@@ -287,9 +168,23 @@ def setup_telemetry_mesh() -> None:

     logger.configure(patcher=_patch_record)

-    # Note: the background worker and sink must be started after event loop spins up.
-    # However we add the sink now, but wrap in enqueue=True to bridge OS threads
-    logger.add(otlp_log_sink, level=settings.log_level, enqueue=True)
+    # Setup OpenTelemetry Native Log Exporter
+    try:
+        from opentelemetry._logs import set_logger_provider
+        from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
+        from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
+        from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
+
+        logger_provider = LoggerProvider()
+        set_logger_provider(logger_provider)
+
+        otlp_log_exporter = OTLPLogExporter(endpoint=settings.otlp_endpoint)
+        logger_provider.add_log_record_processor(BatchLogRecordProcessor(otlp_log_exporter))
+
+        otlp_handler = LoggingHandler(level=0, logger_provider=logger_provider)
+        logger.add(otlp_handler, level=settings.log_level)
+    except ImportError:
+        logger.warning("OpenTelemetry log exporter not found. Skipping native OTLP logs setup.")

     # Route standard logging to loguru
     logging.basicConfig(handlers=[InterceptHandler()], level=0, force=True)
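
The same wiring can be tried in isolation. The following is a minimal sketch, not the repository's code: it attaches the OpenTelemetry `LoggingHandler` to a standard-library logger instead of loguru, and the helper name `attach_otlp_log_handler`, the logger name `otlp_sketch`, and the endpoint URL are all assumptions made for illustration. It degrades to a no-op when the OpenTelemetry packages are not installed, mirroring the `ImportError` guard in the change above.

```python
import logging


def attach_otlp_log_handler(endpoint: str, target: logging.Logger) -> bool:
    """Attach an OTLP-exporting handler to `target`; return False if OpenTelemetry is missing."""
    try:
        from opentelemetry._logs import set_logger_provider
        from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
        from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
        from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
    except ImportError:
        return False

    provider = LoggerProvider()
    set_logger_provider(provider)
    # The batch processor buffers records and exports them from its own worker thread.
    provider.add_log_record_processor(
        BatchLogRecordProcessor(OTLPLogExporter(endpoint=endpoint))
    )
    target.addHandler(LoggingHandler(level=logging.NOTSET, logger_provider=provider))
    return True


# Hypothetical endpoint: a local collector listening on the default OTLP/HTTP port.
demo_logger = logging.getLogger("otlp_sketch")
enabled = attach_otlp_log_handler("http://localhost:4318/v1/logs", demo_logger)
```

One detail worth checking against the value of `settings.otlp_endpoint`: when an endpoint is passed to the HTTP `OTLPLogExporter` constructor, it is used as given, so it generally needs to include the `/v1/logs` path.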