Skip to content
@lumensyntax-org

Lumen Syntax

Independent AI safety research. Structural failure modes in epistemological systems.

LumenSyntax

Independent AI safety research. Investigating structural failure modes in epistemological systems.


The Instrument Trap

Our research identifies a structural vulnerability in AI safety systems: when a model receives identity-as-authority ("you are a truth evaluator"), it inherits a paradox that produces measurable failures. Fine-tuned identity (in weights) is invariant to runtime instructions; prompted identity is not.

Paper: The Instrument Trap: Why Identity-as-Authority Breaks AI Safety Systems (Zenodo, 2026)

Cross-Family Replication (3 independent architectures)

Model Base Family Accuracy Collapse Hallucination
Logos 9B Gemma 2 9B Google 97.3% 0.67% 0%
Logos 14 Nemotron Mini 4B NVIDIA 95.7% 0% 0%
Logos 16v2 StableLM 2 1.6B Stability AI 93.0% 0% 0%
Logos 1B Gemma 3 1B Google 82.3% 0.34% 0%

McNemar's matched comparison (N=300): p<0.001 cross-family. Multi-seed (5 seeds): σ=1.4pp.

Key Findings

  • 14,950 adversarial tests: 0% hallucination, 97.7% epistemological safety
  • Knowledge-Action Gap: ~90% of 9B failures have correct reasoning but serve the request anyway
  • Token Nativity: Models express learned behavior in their native format, not the training format
  • Compression resilience: Safety categories survive 60% quantization; factual categories degrade

Open Resources

Resource Description
instrument-trap-benchmark Benchmark suite, evaluation scripts, and figures
HuggingFace Dataset 14,950 test cases + eval scripts + statistical results
TruthGit Governance layer for AI agents — consensus tracking with ontological classification

lumensyntax.com · HuggingFace · Rafael Rodriguez · Independent Researcher

Popular repositories Loading

  1. .github .github Public

    Organization profile and community health files

  2. instrument-trap-benchmark instrument-trap-benchmark Public

    Benchmark suite for 'The Instrument Trap: Why Identity-as-Authority Breaks AI Safety Systems' (DOI: 10.5281/zenodo.18714771)

    Python

  3. ecclesia ecclesia Public

    Cross-domain taxonomy of structural properties — the assembly, not the monument

    TypeScript

Repositories

Showing 3 of 3 repositories

Top languages

Loading…

Most used topics

Loading…