TRuCAL (Truth-Recursive Correction Attention Layer): a compact PyTorch research prototype for inference-time alignment interventions.
TRuCAL explores a simple question: can a model route high-risk hidden states through a bounded self-correction loop before returning an output, without retraining the base model?
The current implementation is intentionally small. It is designed to be read, tested, criticized, and extended.
Status: research prototype. This repository is code-first; claims should be treated as experimental until independently evaluated.
TRuCAL has three moving parts:
- VulnerabilitySpotter: aggregates uncertainty and instability signals into a scalar risk proxy v_t.
- TinyCorrectionLayer: runs a bounded THINK → ACT → COHERENCE loop when v_t crosses a trigger threshold.
- UnifiedCAL_TRM: exposes a minimal public API for calling the correction layer and optionally returning audit metadata.
The useful idea is the loop: detect an unstable state, route it through a small correction module, measure coherence against the previous state, and stop early once the state stabilizes.
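The loop described above can be sketched in a few lines of PyTorch. This is a toy illustration, not the repository's TinyCorrectionLayer: the class name `ToyCorrectionLoop`, the linear risk head, the residual MLP, the cosine-similarity coherence measure, and the `coherence_thresh` and `max_steps` parameters are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyCorrectionLoop(nn.Module):
    """Hypothetical sketch of a bounded detect/correct/check loop."""

    def __init__(self, d_model, trigger_thresh=0.08,
                 coherence_thresh=0.99, max_steps=4):
        super().__init__()
        # Assumed stand-in for the risk proxy v_t: a learned scalar head.
        self.risk_head = nn.Linear(d_model, 1)
        # Assumed stand-in for the correction module: a small residual MLP.
        self.correct = nn.Sequential(
            nn.Linear(d_model, d_model), nn.Tanh(), nn.Linear(d_model, d_model)
        )
        self.trigger_thresh = trigger_thresh
        self.coherence_thresh = coherence_thresh
        self.max_steps = max_steps

    @torch.no_grad()
    def forward(self, x):
        # x has shape (batch, seq, d_model).
        v_t = torch.sigmoid(self.risk_head(x)).mean().item()
        meta = {"risk": v_t, "correction_triggered": False,
                "steps": 0, "coherence_score": 1.0}
        if v_t <= self.trigger_thresh:
            return x, meta  # low risk: pass the state through unchanged

        meta["correction_triggered"] = True
        state = x
        for step in range(1, self.max_steps + 1):
            new_state = state + self.correct(state)  # residual correction
            # Coherence proxy: mean cosine similarity to the previous state.
            coherence = F.cosine_similarity(new_state, state, dim=-1).mean().item()
            state = new_state
            meta["steps"] = step
            meta["coherence_score"] = coherence
            if coherence >= self.coherence_thresh:
                break  # state has stabilized: stop early
        return state, meta


torch.manual_seed(0)
loop = ToyCorrectionLoop(d_model=16, trigger_thresh=0.0)
out, meta = loop(torch.randn(1, 8, 16))
print(out.shape, meta["correction_triggered"], meta["steps"])
```

The `max_steps` cap is what keeps the loop bounded: even if the coherence check never passes, the module returns after a fixed number of corrections.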
TRuCAL/
├── cal.py # single-file reference implementation
├── components/ # modular implementation
│ ├── vulnerability_spotter.py
│ ├── correction_template.py
│ ├── tiny_correction_layer.py
│ ├── unified_cal_trm.py
│ ├── scratchpad_layer.py
│ └── cal_trm_hybrid.py
├── examples/truthfulqa_eval.py # evaluation scaffold
├── tests/ # local smoke/regression tests
├── requirements.txt
└── LICENSE
git clone https://github.com/augstentatious/TRuCAL.git
cd TRuCAL
python -m pip install -r requirements.txt

import torch
from cal import UnifiedCAL_TRM
model = UnifiedCAL_TRM(d_model=256)
x = torch.randn(1, 32, 256)
out, meta = model(x, return_metadata=True, audit_mode=False)
print(out.shape) # torch.Size([1, 32, 256])
print(meta["correction_triggered"]) # True/False
print(meta["coherence_score"]) # scalar coherence proxy

Advanced configuration:

from cal import TinyCorrectionLayer
layer = TinyCorrectionLayer(
d_model=256,
trigger_thresh=0.08,
per_dim_kl=True,
)

Local checks:

python tests/test_bug_fixes.py
python tests/test_prosody.py
python tests/test_cal.py

These scripts are smoke/regression checks for the prototype. They are not a benchmark suite and should not be presented as production safety evidence.
Near-term work that would make TRuCAL meaningfully stronger:
- Replace embedding-coordinate prosody proxies with tokenizer-aware features.
- Add pinned dependency versions and CI smoke tests.
- Publish a clean evaluation harness for TruthfulQA / AdvBench-style runs.
- Add a Hugging Face integration path that hooks post-embedding decoder states.
- Compare against simpler baselines: threshold-only gating, entropy-only gating, and no-loop ablations.
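
As a rough illustration of the entropy-only gating baseline named in the last item, the sketch below flags positions whose normalized predictive entropy exceeds a threshold. The function name `entropy_gate`, the use of next-token logits as input, and the threshold value are assumptions for the example; none of this is part of the repository.

```python
import math

import torch
import torch.nn.functional as F


def entropy_gate(logits, threshold=0.5):
    """Hypothetical baseline: gate on normalized predictive entropy only.

    logits: tensor of shape (batch, seq, vocab).
    Returns a boolean mask (batch, seq) and the normalized entropy in [0, 1].
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Shannon entropy in nats, per position.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)
    # Divide by the maximum possible entropy, log(vocab size).
    normalized = entropy / math.log(logits.size(-1))
    return normalized > threshold, normalized


# Position 0 is uniform (high entropy); position 1 is sharply peaked.
logits = torch.zeros(1, 2, 10)
logits[0, 1, 0] = 50.0
mask, score = entropy_gate(logits, threshold=0.5)
print(mask)  # tensor([[ True, False]])
```

A baseline like this has no correction loop at all, so comparing it against the full pipeline would help separate the contribution of the gate from the contribution of the loop.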
MIT. See LICENSE for details.