
╔═════════════════════════════╗
║                             ║
║ ◈  P U R E   R E A S O N  ◈ ║
║                             ║
╚═════════════════════════════╝


Fast hallucination detection for AI systems


What is PureReason?

PureReason verifies AI model outputs for hallucinations, contradictions, and overconfidence. It's a verification layer that works alongside frontier models (GPT, Claude, Gemini) - not a replacement for them.

Use it when you need:

  • ✅ Fast verification (<5ms per check)
  • ✅ Hallucination detection
  • ✅ Explainable decisions
  • ✅ Offline operation (zero API costs)
  • ✅ Safety layer for AI agents

Don't use it for:

  • ❌ General reasoning (use GPT-5, Claude, o1)
  • ❌ Problem solving (it verifies, doesn't generate)
  • ❌ Content generation

Benchmarks

PureReason achieves strong performance on hallucination detection benchmarks:

Benchmark          F1 Score   Task
HaluEval QA        0.871      Question answering verification
LogicBench         0.846      Structural logic detection
TruthfulQA         0.798      Misconception detection
HalluLens          0.729      Grounding + contradiction checks
FELM               0.645      Segment-level factuality
RAGTruth           0.646      Grounded hallucination detection
HalluMix           0.664      Multi-domain hallucination
HaluEval Dialogue  0.634      Dialogue verification
FaithBench         0.622      Summarization faithfulness

Performance gains (v0.3.1):

  • +25-30pp F1 improvement over baseline
  • -40% latency reduction
  • ±5pp ECS accuracy (vs ±15pp drift before)

Full methodology: See docs/BENCHMARK.md and docs/REPRODUCIBILITY.md

How It Works

Input:  "The patient must have cancer."
Output: Risk: HIGH | Confidence: 34/100
Flag:   Certainty overreach
Rewrite:"The patient has findings consistent with possible malignancy."

PureReason combines:

  • Symbolic logic - Deterministic verification using Z3 (illustrated after this list)
  • Neural embeddings - Semantic similarity detection (all-MiniLM-L6-v2)
  • Domain calibration - Per-domain accuracy tuning
  • Knowledge grounding - Entity checking and contradiction detection
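To give a flavor of the symbolic layer, here is a minimal standalone sketch of a deterministic contradiction check with the z3-solver package. It illustrates the kind of check Z3 makes possible; it is not PureReason's actual internals:

# Illustration only - not PureReason internals. Requires: pip install z3-solver
from z3 import Bool, Not, Solver, unsat

claim = Bool("patient_has_cancer")

solver = Solver()
solver.add(claim)       # "The patient has cancer."
solver.add(Not(claim))  # "The patient does not have cancer."

# unsat means the two statements cannot both hold: a hard contradiction.
print(solver.check() == unsat)  # True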

The typical workflow (sketched in code below):

  1. Frontier model (GPT, Claude) generates output
  2. PureReason verifies and scores it (0-100 ECS)
  3. Agent receives verification + regulated text
  4. High-risk outputs flagged for human review
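A minimal sketch of that loop. ReasoningGuard, result.ecs, and the 70-point threshold come from the Quick Start below; generate_with_llm is a hypothetical stand-in for your frontier-model call:

from pureason.guard import ReasoningGuard

def generate_with_llm(prompt: str) -> str:
    # Hypothetical stand-in for a GPT/Claude/Gemini call.
    return "Water boils at 100°C at sea level."

guard = ReasoningGuard(threshold=70)

draft = generate_with_llm("State a fact about water.")  # 1. frontier model generates
result = guard.verify(draft)                            # 2. PureReason scores it (0-100 ECS)

if result.ecs >= 70:                                    # 3. agent consumes the verified output
    answer = draft
else:                                                   # 4. high-risk output goes to a human
    answer = f"[NEEDS REVIEW, ECS {result.ecs}] {draft}"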

Quick Start

Installation

# From a local checkout of the repository
pip install -e .

5-Minute Quickstart

Verify any text in 3 lines of code:

from pureason.guard import ReasoningGuard

guard = ReasoningGuard(threshold=70)  # 70 = moderate strictness
result = guard.verify("Water boils at 100°C at sea level.")

print(f"ECS: {result.ecs}/100, Provenance: {result.provenance}")
# Output: ECS: 83.0/100, Provenance: verified

Decision logic:

if result.ecs >= 70:
    action = "accept"   # high confidence - use the output as-is
elif result.ecs >= 40:
    action = "review"   # medium confidence - route to human review
else:
    action = "reject"   # low confidence - discard or regenerate

Real-World Examples

See examples/ for production-ready code:

Run the simple example:

python examples/simple_verification.py

API Server

Deploy as a microservice:

python examples/api_server.py

Test it:

curl -X POST http://localhost:8000/verify \
     -H "Content-Type: application/json" \
     -d '{"text": "The sky is blue.", "min_ecs": 70}'

1. Standalone CLI

cargo install --path crates/pure-reason-cli --locked
pure-reason review "The patient must have cancer."

2. MCP Integration (for AI agents)

# Build the MCP server
cargo build --release -p pure-reason-mcp

# Add to your agent's MCP config
# Full guide: docs/MCP-INTEGRATION.md

Your agent (Claude Desktop, Cursor, GitHub Copilot) can then call PureReason verification tools.

3. Python API (Advanced)

from pureason.reasoning import verify_chain

# Verify a chain of reasoning steps
problem = "What is 2 + 2?"
steps = ["Let me add the numbers.", "2 + 2 = 4", "Therefore, the answer is 4."]

result = verify_chain(problem, steps)
print(f"Confidence: {result.ecs}/100")

Core Features

  • Hallucination detection - Catches contradictions, fabrications, entity errors
  • Confidence scoring - 0-100 ECS with domain-aware calibration
  • Reasoning verification - Chain-of-thought and arithmetic step checking
  • Text regulation - Rewrites overconfident claims to hedged language
  • Multiple interfaces - CLI, MCP, Python, Rust library, REST API
  • Offline operation - No API keys required, runs completely local
  • Explainable results - Traceable verification logic with evidence

Example: Chain-of-Thought Verification

Step 1: A train travels 120 miles in 2 hours.
Step 2: Speed = 120 / 2 = 90 mph
Step 3: Time for 300 miles = 300 / 90 ≈ 3.3 hours
Result: INVALID
First failing step: 2
Reason: arithmetic_error (120 / 2 should be 60, not 90)

PureReason verifies each step deterministically and pinpoints exact failures.
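The same check can be run through the verify_chain API shown earlier. The report field naming for the failing step is not documented here, so this sketch only inspects the confidence score:

from pureason.reasoning import verify_chain

problem = "How long does the train take to travel 300 miles?"
steps = [
    "A train travels 120 miles in 2 hours.",
    "Speed = 120 / 2 = 90 mph",  # arithmetic error: 120 / 2 is 60
    "Time for 300 miles = 300 / 90 ≈ 3.3 hours",
]

result = verify_chain(problem, steps)
print(f"Confidence: {result.ecs}/100")  # expect a low score: step 2 is invalid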

Advanced Usage

Python Reasoning Layer

# Verify formal syllogisms
from pureason.reasoning import verify_syllogism

report = verify_syllogism(
    premises=["All mammals are warm-blooded.", "Whales are mammals."],
    conclusion="Whales are warm-blooded.",
)
print(report.is_valid)  # True

# Solve arithmetic word problems
from pureason.reasoning import solve_arithmetic

report = solve_arithmetic("Maria earned 50 dollars and spent 23 dollars. How much?")
print(report.answer)  # "27"
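For contrast, a sketch of an invalid inference. This assumes verify_syllogism reports affirming-the-consequent forms as invalid via is_valid:

# An invalid form (affirming the consequent) - expect is_valid to be False.
from pureason.reasoning import verify_syllogism

report = verify_syllogism(
    premises=["All mammals are warm-blooded.", "Birds are warm-blooded."],
    conclusion="Birds are mammals.",
)
print(report.is_valid)  # False (assumed behavior for invalid forms)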

Build from Source

git clone https://github.com/sorunokoe/PureReason
cd PureReason
cargo build --release
./target/release/pure-reason review "Your text here"

Documentation

Topic            Link
Benchmarks       docs/BENCHMARK.md - Full results and methodology
Reproducibility  docs/REPRODUCIBILITY.md - Seeds, hashes, holdout
MCP Integration  docs/MCP-INTEGRATION.md - Agent setup guide
Capabilities     docs/CAPABILITIES.md - Feature matrix
TRIZ Guide       docs/TRIZ-IMPLEMENTATION.md - Performance improvements
API Reference    crates/pure-reason-core/ - Core Rust engine
Contributing     .github/CONTRIBUTING.md - How to contribute

Use Cases

Best for:

  • Verifying AI agent outputs before execution
  • Detecting hallucinations in RAG systems
  • Scoring confidence in generated claims
  • Offline reasoning verification
  • Production AI safety layers
  • Code agents needing local verification

Not suitable for:

  • Novel problem solving (use GPT-5, Claude, o1)
  • Long-context reasoning (>10K tokens)
  • Real-time streaming (optimized for batch)
  • Content generation

License

Apache 2.0 — see LICENSE
