Context
Per phase-a curation log §10.5; recurs in phases b/c/d. Surfaced 2026-05-11.
`services/providers/suttaCentralDictionary.ts` returns opaque payloads for many lemmas, leaving `Sense.english` as `(no sense)` while `rawExcerpt` is populated. The raw form is fine for the LLM compiler (it can parse the excerpt) and for a future audit UI, but our structured `Sense` glosses currently cannot draw on SC.
Particularly painful for words like `evaṁ`, `Kammāsadhammaṁ`, `kurūnaṁ` where DPD lacks a direct entry or the entry is sparse, but PED (the Pali Text Society's Pali-English Dictionary, embedded in SC's payload) almost certainly has rich content.
What to investigate
SC's `/api/dictionary_full/{lemma}` returns a structured response. The current parser surfaces `rawExcerpt` but doesn't extract per-sense glosses. Possible reasons:
- The structure varies by entry (PED vs DPD vs Concise PED vs Buddhist Dictionary all merged); parser was conservative
- Some entries are HTML-formatted, others are JSON arrays
- The lemma normalization (niggahīta) might not be applied to the SC query
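
A minimal sketch for probing the last two hypotheses above, not the provider's actual code. The names `lemmaQueryVariants` and `extractGlosses` and the `RawDefinition` type are hypothetical, and the real shape of SC's payload still needs to be confirmed against live responses. It assumes a definition arrives either as an HTML string or as an array of strings, and that the niggahīta may be spelled `ṁ` (dot above) or `ṃ` (dot below).

```typescript
// Hypothetical shape: a definition is an HTML string or an array of strings.
type RawDefinition = string | string[];

// Return the lemma in NFC with both niggahīta spellings, so the query can be
// retried if the first spelling finds nothing. Case is left untouched.
function lemmaQueryVariants(lemma: string): string[] {
  const nfc = lemma.normalize("NFC").trim();
  const dotAbove = nfc.replace(/\u1E43/g, "\u1E41"); // ṃ -> ṁ
  const dotBelow = nfc.replace(/\u1E41/g, "\u1E43"); // ṁ -> ṃ
  return [...new Set([dotAbove, dotBelow])];
}

// Crude tag stripper: good enough for a probe, not a full HTML parser.
function stripTags(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    .replace(/\s+/g, " ")
    .trim();
}

// Turn one raw definition into a list of plain-text glosses.
// Arrays are used item by item; HTML is split on <li> elements when present,
// otherwise the whole string is treated as a single gloss.
function extractGlosses(raw: RawDefinition): string[] {
  const parts: string[] = Array.isArray(raw)
    ? raw
    : raw.match(/<li\b[^>]*>[\s\S]*?<\/li>/gi) ?? [raw];
  return parts.map(stripTags).filter((gloss) => gloss.length > 0);
}
```

If entries turn out to be nested more deeply, or to mark senses with something other than `<li>`, the split step is the part to replace. The regex-based stripping is deliberately simple and would likely give way to a proper HTML parser in the provider.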
Acceptance
- For at least 3 of the 4 MN10 phase-a/b/c/d senses where SC returned `(no sense)`, the provider now returns structured English glosses
- Provider tests added for the parser improvements
Hit count
4/4 phases (universal pattern — every phase has at least one SC `(no sense)` result)