Record first real-LLM lifecycle baseline #264

Merged
CalebisGross merged 1 commit into main from feat/lifecycle-baseline
Mar 20, 2026

@CalebisGross
Collaborator

Summary

  • Record baseline results from first lifecycle test run with real Gemini LLM
  • Add BASELINE-2 entry to experiment registry with full methodology and analysis
  • Document FTS5 scan column mismatch bug in CLAUDE.md Known Issues
  • Renumber existing baselines to accommodate new entry

Results

8/8 phases PASS, 23/23 assertions, ~70 min total with gemini-3-flash-preview + gemini-embedding-2-preview (3072-dim).

| Metric | Value |
| --- | --- |
| Unique memories | 115 (from 862 raw, 87% dedup) |
| Associations | 704 (avg 5.56/memory) |
| Patterns | 4 |
| Abstractions | 4 |
| Retrieval latency | avg 758ms |
| DB size | 5.33 MB |
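
For reference, the 87% dedup figure is consistent with the two counts in the table above if the rate is taken as the share of raw memories removed. That formula is an assumption here, since the PR does not say how the report derives it, and `dedupRate` is a hypothetical helper rather than code from the repository:

```go
package main

import "fmt"

// dedupRate returns the fraction of raw memories removed by deduplication.
// Assumption: rate = 1 - unique/raw. It returns 0 when raw is not positive,
// to avoid dividing by zero.
func dedupRate(raw, unique int) float64 {
	if raw <= 0 {
		return 0
	}
	return 1 - float64(unique)/float64(raw)
}

func main() {
	// Counts taken from the results table: 862 raw, 115 unique.
	fmt.Printf("dedup rate: %.0f%%\n", dedupRate(862, 115)*100)
}
```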

Bug found

FTS5 scan column mismatch: SearchByFullText fails with "sql: expected 19 destination arguments in Scan, not 21". The bug is pre-existing: the memories table has 2 new columns that are not in the FTS scan query, so the query returns 19 columns while Scan is given 21 destinations. Retrieval falls back to embedding search.
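
A minimal sketch of how this class of error arises and one way to guard against it. Go's database/sql rejects a Scan call when the number of destinations differs from the number of result columns; `checkScanArity` below mimics that check so the example runs without a database driver. The column names, `memoryColumns`, and `selectMemories` are hypothetical and are not taken from the project's schema or code:

```go
package main

import (
	"fmt"
	"strings"
)

// memoryColumns is a hypothetical single source of truth for the columns
// read from the memories table; the real schema is not shown in this PR.
var memoryColumns = []string{"id", "content", "created_at"}

// selectMemories builds the SELECT list from memoryColumns, so every query
// (full-text or otherwise) returns the same columns in the same order.
func selectMemories(from string) string {
	return "SELECT " + strings.Join(memoryColumns, ", ") + " FROM " + from
}

// checkScanArity mimics the arity check database/sql applies in Rows.Scan:
// the number of destinations must equal the number of result columns.
func checkScanArity(cols []string, dest []any) error {
	if len(cols) != len(dest) {
		return fmt.Errorf("sql: expected %d destination arguments in Scan, not %d",
			len(cols), len(dest))
	}
	return nil
}

func main() {
	var id, content, createdAt, extra string

	// One destination more than the query returns reproduces the shape
	// of the reported error.
	err := checkScanArity(memoryColumns, []any{&id, &content, &createdAt, &extra})
	fmt.Println(err)

	// Deriving the query from the shared column list keeps it in step
	// with the scan destinations.
	fmt.Println(selectMemories("memories"))
}
```

Building every SELECT from one shared column list is only one possible fix; the change the project actually makes may differ.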

Test plan

  • All 8 lifecycle phases pass with real Gemini LLM
  • Baseline report committed as cmd/lifecycle-test/baseline-gemini.md
  • Experiment registry updated with BASELINE-2

Closes #262

First end-to-end lifecycle test run with gemini-3-flash-preview
and gemini-embedding-2-preview (3072-dim embeddings).

Results: 8/8 phases PASS, 23/23 assertions, ~70 min total.

Key findings:
- 115 unique memories from 862 raw (87% dedup rate)
- 704 associations, 4 patterns, 4 abstractions
- Retrieval avg 758ms latency, 4.8 results/query
- Consolidation correctly separates signal from noise
- FTS5 scan column mismatch discovered (pre-existing bug)

Also adds FTS5 bug to Known Issues in CLAUDE.md and renumbers
experiment registry baselines.

Closes #262

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CalebisGross merged commit 574995f into main on Mar 20, 2026
6 checks passed
CalebisGross deleted the feat/lifecycle-baseline branch on March 20, 2026 17:38

Development

Run lifecycle test with real LLM and record baseline
