fix: stabilize hunyuan pglite embeddings and retrieval by 313094319-sudo · Pull Request #765 · garrytan/gbrain

313094319-sudo · 2026-05-09T01:59:01Z

support custom embedding base URL, model, and dimensions from config/env
use raw HTTP embeddings for non-OpenAI compatible endpoints
add CJK-aware PGLite keyword fallback for Chinese retrieval
align chunk metadata and tests with dynamic embedding dimensions
document the verified local Hunyuan + PGLite recovery and validation flow
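The CJK-aware keyword fallback named above can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: `extractKeywords` and `keywordScore` are hypothetical names, and overlapping-bigram splitting is an assumed technique, chosen because Chinese text has no whitespace word boundaries for a plain keyword matcher to split on.

```typescript
// Hypothetical helpers; names and scoring are assumptions, not taken from the PR diff.

// Runs of CJK ideographs, kana, and hangul (all inside the Basic Multilingual Plane).
const CJK_RUN = /[\u3400-\u4dbf\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]+/g;
// Plain ASCII words, matched after lowercasing.
const WORD = /[a-z0-9_]+/g;

// Split a query into keyword terms: overlapping bigrams for CJK runs,
// whole words for everything else. Duplicates are dropped, order is preserved.
function extractKeywords(query: string): string[] {
  const text = query.toLowerCase();
  const terms = new Set<string>();

  for (const run of text.match(CJK_RUN) ?? []) {
    if (run.length === 1) {
      terms.add(run); // a lone character is kept as its own term
      continue;
    }
    for (let i = 0; i < run.length - 1; i++) {
      terms.add(run.slice(i, i + 2));
    }
  }
  for (const word of text.match(WORD) ?? []) {
    terms.add(word);
  }
  return [...terms];
}

// Fraction of query terms that occur in the chunk, in the range 0..1.
function keywordScore(query: string, chunk: string): number {
  const terms = extractKeywords(query);
  if (terms.length === 0) return 0;
  const haystack = chunk.toLowerCase();
  const hits = terms.filter((t) => haystack.includes(t)).length;
  return hits / terms.length;
}
```

A score like this would only serve as a fallback ranking signal when the regular keyword path returns nothing for a Chinese query; how the PR actually combines it with vector results is not shown on this page.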


…#121)

Two small ergonomics fixes folded together (#765 deferred, see TODOS.md follow-up; the CJK PGLite extraction was bigger than the plan estimated).

#779 reworked (alexandreroumieu-codeapprentice): silence the missing-max_batch_tokens startup warning for recipes with genuinely dynamic batch capacity. New `EmbeddingTouchpoint.no_batch_cap?: true` field. Set on ollama (capacity depends on locally loaded model + OLLAMA_NUM_PARALLEL), litellm-proxy (depends on backend), llama-server (set by --ctx-size at server launch). Three fewer stderr warnings on every gateway configure; google still warns (it's a real fixed-cap provider that ought to ship a max_batch_tokens declaration). Bonus: litellm-proxy now declares `user_provided_models: true`, removing the last consumer of the legacy `recipe.id === 'litellm'` hardcode in gateway.ts:223 (D8=A wire-through completion).

#121 reworked (vinsew): self-contained API keys. Two parts:

1. config.ts: ANTHROPIC_API_KEY env merge was silently missing. loadConfig() merged OPENAI_API_KEY but not ANTHROPIC_API_KEY into the file-config-shape result. One-line addition.
2. cli.ts:buildGatewayConfig: when ~/.gbrain/config.json declares openai_api_key / anthropic_api_key but the process env doesn't have those env vars set (common for launchd-spawned daemons, agent subprocess tools, containers that don't propagate ~/.zshrc), fold the config-file values into the gateway env snapshot. Process env still wins (loaded last) so per-process overrides keep working.

Tests (4 cases in test/ai/no-batch-cap-suppression.test.ts):

- Ollama / LiteLLM / llama-server all declare no_batch_cap: true
- configureGateway does NOT warn for those three
- configureGateway STILL warns for google (regression guard)
- Cross-cutting invariant: empty-models recipes declare user_provided_models

Tests: bun test test/ai/: 128/128 (4 new + 124 prior).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 9 of 11).

#765 (Hunyuan PGLite + CJK keyword fallback) deferred to TODOS.md follow-up; the CJK extraction (~150 lines + scoring logic + tests) is larger than the wave's adjacent-fix lane should carry. Closes that PR with a deferral note.

Co-Authored-By: alexandreroumieu-codeapprentice <noreply@github.com>
Co-Authored-By: vinsew <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
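The key-folding rule described in that commit message (config-file keys fill in when the process env lacks them, and process env still wins) can be sketched as below. This is an illustration under assumptions: `FileConfig`, `EnvSnapshot`, and `buildGatewayEnv` are hypothetical names, and the real `buildGatewayConfig` in cli.ts is not shown on this page and may be structured differently.

```typescript
// Hypothetical shapes; the real types in config.ts / cli.ts are not shown here.
interface FileConfig {
  openai_api_key?: string;
  anthropic_api_key?: string;
}

type EnvSnapshot = Record<string, string | undefined>;

// Build the env snapshot handed to the gateway.
// File-config keys are applied first; process env is applied last so it wins.
function buildGatewayEnv(fileConfig: FileConfig, processEnv: EnvSnapshot): EnvSnapshot {
  const fromFile: EnvSnapshot = {};
  if (fileConfig.openai_api_key) {
    fromFile.OPENAI_API_KEY = fileConfig.openai_api_key;
  }
  if (fileConfig.anthropic_api_key) {
    fromFile.ANTHROPIC_API_KEY = fileConfig.anthropic_api_key;
  }

  // Skip undefined entries so an unset variable cannot overwrite a file value.
  const fromProcess: EnvSnapshot = {};
  for (const [key, value] of Object.entries(processEnv)) {
    if (value !== undefined) fromProcess[key] = value;
  }

  return { ...fromFile, ...fromProcess };
}
```

With this ordering, a daemon started without shell exports still sees the keys from `~/.gbrain/config.json`, while an explicit per-process override keeps taking precedence.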

313094319-sudo force-pushed the wuyun/hunyuan-pglite-fix branch from ee0289b to 99bf659 Compare May 9, 2026 05:57

garrytan mentioned this pull request May 10, 2026

v0.32.0 feat: 5 new embedding recipes + discoverability pass (closes 17-PR cluster) #810

Open

8 tasks

313094319-sudo wants to merge 1 commit into garrytan:master from 313094319-sudo:wuyun/hunyuan-pglite-fix
