Add Streamlit prototype for sales instructor AI #2
Open
soy-tuber wants to merge 3 commits into
Multi-LLM (Claude/Gemini) + multi-DB (5 SQLite+FTS5 schemas) prototype deployable to Streamlit Cloud Community. Existing FastAPI SoyLM is untouched; new entry point is streamlit_app.py.
- sales_ai/llm_router.py: streaming chat for Anthropic + google-genai
- sales_ai/db_manager.py: per-schema SQLite with trigram FTS5 for JA
- sales_ai/pipeline.py: PDF/DOCX/PPTX/XLSX/CSV/TXT extraction + LLM JSON
- sales_ai/rag_engine.py: parallel multi-DB retrieval + context build
- streamlit_app.py: chat / ingest / data-management / settings pages
- .streamlit/: config + secrets.toml example (real secrets gitignored)
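The "parallel multi-DB retrieval + context build" step could look roughly like the sketch below. It is not the code in sales_ai/rag_engine.py: the `docs` table, the function names, and the `[schema] text` context format are all assumptions made for illustration, and the trigram tokenizer requires SQLite 3.34 or newer.

```python
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List

# Schema names taken from the PR text; one SQLite file per schema (assumed layout).
SCHEMAS = ["proposals", "talks", "reports", "results", "followups"]


def create_db(path: str, rows: List[str]) -> None:
    """Create a hypothetical per-schema database with one trigram FTS5 table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(body, tokenize='trigram')")
    conn.executemany("INSERT INTO docs(body) VALUES (?)", [(r,) for r in rows])
    conn.commit()
    conn.close()


def search_one(path: str, query: str, limit: int = 3) -> List[str]:
    """Search a single database. Each worker opens its own connection,
    because sqlite3 connections are not shared across threads by default."""
    phrase = '"' + query.replace('"', '""') + '"'  # quote as an FTS5 phrase
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT body FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
            (phrase, limit),
        ).fetchall()
    finally:
        conn.close()
    return [r[0] for r in rows]


def retrieve(db_paths: Dict[str, str], query: str) -> Dict[str, List[str]]:
    """Fan the same query out to every database in parallel."""
    with ThreadPoolExecutor(max_workers=max(1, len(db_paths))) as pool:
        futures = {
            name: pool.submit(search_one, path, query)
            for name, path in db_paths.items()
        }
        return {name: fut.result() for name, fut in futures.items()}


def build_context(hits: Dict[str, List[str]]) -> str:
    """Flatten per-schema hits into one prompt context, tagged by source."""
    return "\n".join(
        f"[{name}] {body}" for name, bodies in hits.items() for body in bodies
    )
```

Threads are a reasonable fit here because each search is short and I/O-bound; the context tags let the LLM (and the reader) see which schema a passage came from.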
Streamlit Cloud build was failing on the heavy SoyLM FastAPI side deps (playwright pulls browser binaries; fastapi/uvicorn/etc. are unused by the Streamlit prototype). Move them to requirements-soylm.txt so the prototype installs cleanly on Cloud.
Replace the freeform chat panel with three task-specific feature pages,
each backed by a structured form, champion-case retrieval, and an
output shape fixed by a per-feature system prompt:
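- Pre-visit briefing (訪問前ブリーフィング): keyword retrieval over
deals_fts, then output key messages / anticipated questions /
objection handling / differentiators / preparation checklist.
- Proposal drafting (提案書作成): bundles full A-E champion context into
Opus to emit a complete Markdown proposal with [案件 #ID] (deal ID)
citations.
- Post-visit follow-up (訪問後フォローアップ): takes a visit report and
returns deal assessment / strengths / risks / ToDo table (with
deadlines) / follow-up email draft.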
Data model: introduce a deals master ("案件マスタ") and move the five
existing schemas (proposals/talks/reports/results/followups) under it
via deal_id FKs. All tables now live in a single sales_ai.db so deal
bundles can be retrieved with a simple join, and FTS5 sticks with the
trigram tokenizer for Japanese substring matching.
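A minimal sketch of the data model described above, assuming a simplified layout: only two of the five child tables are shown, and the column names, the standalone `deals_fts` definition, and the helper functions are hypothetical rather than taken from sales_ai/db_manager.py. The sample rows are made up. The trigram tokenizer requires SQLite 3.34 or newer.

```python
import sqlite3
from typing import Dict, List

SCHEMA = """
CREATE TABLE deals (
    id       INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    summary  TEXT NOT NULL
);
CREATE TABLE proposals (
    id      INTEGER PRIMARY KEY,
    deal_id INTEGER NOT NULL REFERENCES deals(id),
    body    TEXT NOT NULL
);
CREATE TABLE followups (
    id      INTEGER PRIMARY KEY,
    deal_id INTEGER NOT NULL REFERENCES deals(id),
    body    TEXT NOT NULL
);
-- Trigram index over deal summaries; rowid mirrors deals.id.
CREATE VIRTUAL TABLE deals_fts USING fts5(summary, tokenize='trigram');
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def add_deal(conn: sqlite3.Connection, customer: str, summary: str) -> int:
    """Insert a deal and index its summary under the same rowid."""
    cur = conn.execute(
        "INSERT INTO deals(customer, summary) VALUES (?, ?)", (customer, summary)
    )
    deal_id = cur.lastrowid
    conn.execute(
        "INSERT INTO deals_fts(rowid, summary) VALUES (?, ?)", (deal_id, summary)
    )
    return deal_id


def find_bundles(conn: sqlite3.Connection, keyword: str) -> List[Dict]:
    """Keyword search over deals_fts, joined to child rows via deal_id.
    With several child rows per deal this returns one row per combination."""
    phrase = '"' + keyword.replace('"', '""') + '"'
    rows = conn.execute(
        """
        SELECT d.id, d.customer, p.body, f.body
        FROM deals_fts
        JOIN deals AS d ON d.id = deals_fts.rowid
        LEFT JOIN proposals AS p ON p.deal_id = d.id
        LEFT JOIN followups AS f ON f.deal_id = d.id
        WHERE deals_fts MATCH ?
        ORDER BY d.id
        """,
        (phrase,),
    ).fetchall()
    return [
        {"deal_id": r[0], "customer": r[1], "proposal": r[2], "followup": r[3]}
        for r in rows
    ]
```

Keeping everything in one file is what makes the plain `JOIN` possible; with one database per schema the same bundle would need `ATTACH` or application-side merging.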
Seed: ship 10 illustrative champion deals on first run so the
retrieval/feature pages have something to surface immediately.
Ingestion is now deal-scoped (pipeline.ingest takes deal_id). The data
management page becomes the deals master page (案件マスタ) with new/detail/export tabs.
Summary
Implements a Streamlit-based prototype of the sales instructor AI system, enabling document ingestion, multi-database RAG search, and interactive chat with Claude/Gemini. This is a Phase 1–3 implementation of the design specification, deployable directly to Streamlit Cloud.
Key Changes
- Streamlit UI (streamlit_app.py): Multi-page application with four sections
- Database Layer (sales_ai/db_manager.py): SQLite + FTS5 storage (data/sales_ai/)
- Schema Definitions (sales_ai/schemas.py): Five predefined schemas for sales data
- Ingestion Pipeline (sales_ai/pipeline.py): End-to-end document processing
- LLM Router (sales_ai/llm_router.py): Unified streaming interface for Claude and Gemini
- RAG Engine (sales_ai/rag_engine.py): Multi-database retrieval and context synthesis
- Configuration: Added Streamlit config (config.toml) and secrets template
Notable Implementation Details
- […] st.session_state and never written to disk, supporting both Streamlit Cloud Secrets and runtime input
- FTS5 uses the trigram tokenizer rather than unicode61 to enable proper Japanese substring matching
- […] .db files via the UI for backup/restore

https://claude.ai/code/session_01N4GWxopPmrYavJ9psb6y43
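The tokenizer point can be checked with a small experiment. The sketch below is illustrative only (the `notes` table, the `count_matches` helper, and the sample sentences are made up): it builds the same FTS5 table under each tokenizer and counts matches for a Japanese substring. unicode61 treats an unbroken run of Japanese characters as a single token, so a substring query finds nothing, while trigram indexes every three-character window. The trigram tokenizer requires SQLite 3.34 or newer, and its queries need at least three characters.

```python
import sqlite3
from typing import List

# Made-up sample sentences; only the first contains 見積書 (quotation/estimate).
SAMPLE_ROWS = [
    "新製品の見積書を送付しました",
    "来週の訪問日程を調整中です",
]


def count_matches(tokenizer: str, query: str, rows: List[str]) -> int:
    """Index `rows` with the given FTS5 tokenizer and count rows matching `query`."""
    conn = sqlite3.connect(":memory:")
    # The tokenizer name is interpolated into DDL, so pass only trusted values.
    conn.execute(
        f"CREATE VIRTUAL TABLE notes USING fts5(body, tokenize='{tokenizer}')"
    )
    conn.executemany("INSERT INTO notes(body) VALUES (?)", [(r,) for r in rows])
    phrase = '"' + query.replace('"', '""') + '"'  # quote as an FTS5 phrase
    (n,) = conn.execute(
        "SELECT count(*) FROM notes WHERE notes MATCH ?", (phrase,)
    ).fetchone()
    conn.close()
    return n


if __name__ == "__main__":
    for name in ("unicode61", "trigram"):
        print(name, count_matches(name, "見積書", SAMPLE_ROWS))
```

The trade-off is index size: a trigram index stores one entry per character position, so it grows faster than a word-level index on the same text.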