"Turn every remote meeting into a strategic advantage." (一個基於 Apple Silicon NPU 定製、專為「遠端技術與商業對話」設計的本地零延遲防突擊提詞系統。)
[🇹🇼 Traditional Chinese] 我們時常看到兩種極度不對稱的高壓會議場景:
- 立院質詢與聽證會:行政首長在台上,遭受擁有一整個幕僚團隊的民意代表用極度刁鑽、瑣碎的問題「擠兌」與突擊。
- 公司法說會 (Earnings Call) 與股東會:上市櫃公司的經營層,遭受奈米小股東或外資分析師用無厘頭的流言或冷門數據進行狙擊。
對於處於明處的講者來說這防不勝防。Aegis Prompter (神盾提詞機) 就是為了解決這個痛點而生。它打破了傳統 AI 助理「單打獨鬥」的限制,轉型為 「多角色不對稱作戰系統」。 在不連外網、絕對確保商業機密的前提下,講者在台上只需看著螢幕左側的無延遲逐字稿;而幕僚團隊可以遠端登入系統,不僅系統會根據語意自動觸發講稿,幕僚更可以直接手動「發射」應付怪問題的標準答案,0.5 秒內閃現在講者的眼底。
[🇺🇸 English] We often witness two highly asymmetric, high-pressure meeting scenarios:
- Congressional Hearings & Interpellations: Government officials face intense, unfair cross-examinations by representatives backed by entire teams of researchers feeding them real-time data.
- Earnings Calls & Shareholder Meetings: Corporate executives get ambushed by retail investors or foreign analysts with obscure rumors and fringe metrics.
Aegis Prompter levels the playing field. It is a completely offline, zero-trust Multi-Role Asymmetric Defense System. Armed with Apple's Native NPU, it transcribes audio instantly and matches semantic vectors to trigger pre-written defensive scripts. More crucially, your backend staff can securely connect to the session and inject real-time tactical cues directly into the speaker's teleprompter display.
- Multi-Role Teleprompter (
?role=speakervs?role=staff): A distracted speaker makes mistakes. Thespeakerrole provides a clean, auto-scrolling teleprompter view. Thestaffrole provides a tactical control panel to inject live string cues natively into the speaker's display inside the local network. - Vector Semantic RAG (Zero-Latency):
Aegis replaces clunky API calls with a local
sentence-transformersknowledge compiler. It mathematically matches what the opponent says against your predefinedqa.mdtrap questions, triggering defenses instantly without LLM generation halucinations. - Dual-Track Apple Silicon Transcriber:
It utilizes
MLX-Whisperdirectly on the Mac NPU, safely separating hardware microphones (You) and Virtual Audio loops like BlackHole (Them) without system crashes. - Pure Teleprompter Mode Toggle:
By switching
ENABLE_LOCAL_RAG=falsein the.envfile, the system disables all heavy vector computations and AI memory mapping. It slims down instantly into a pure, multi-role manual teleprompter to save system resources. - 100% Offline & Private: Zero external API dependencies. Zero telemetry.
graph TD
A[Partner / BlackHole] -->|Loopback| B(Transcriber Other)
C[You / MacBook Mic] -->|Direct| D(Transcriber Me)
B -->|Whisper NPU| E[Dialogue Buffer]
D -->|Whisper NPU| E
E -->|Vector Embeddings| F{Local Advisor RAG}
F -.->|Semantic Match| G[Streamlit UI - Speaker]
H((Staff Officer)) -->|Web Injection| G
G -->|Auto-Scroll| G
Aegis-Prompter/
├── src/
│ ├── app.py # Multi-role UI & State Routing
│ ├── build_index.py # Knowledge Compiler (Parses .md to .pkl vector space)
│ ├── transcriber.py # Apple MLX-Whisper core
│ ├── local_advisor.py # Vector Similarity Matcher (Cosine RAG)
│ └── dialogue_buffer.py # Threat evaluation and Thread locking memory
├── context/
│ ├── docs/ # Drop your pre-meeting files (.md, .txt) here
│ └── knowledge_index.pkl# Compiled Vector DB (Created by build_index.py)
├── .env # Multilingual toggle configurations
├── .venv/ # Local pip virtual environment
└── README.md # Technical Docs
-
Clone & Setup Environment:
git clone https://github.com/BinHsu/Aegis-Prompter.git cd Aegis-Prompter cp .env.example .envFor Mac Users (Recommended): We provide an automated, idempotent setup script that installs Homebrew dependencies, configuring
portaudioand theBlackHolevirtual driver safely.bash setup_mac.sh source .venv/bin/activate(For Windows/Linux, manually create a venv,
pip install -r requirements.txt, and route your audio). -
Create & Compile the Tactical RAG (Knowledge Base): Because your personal notes are git-ignored, you must first create the docs folder:
mkdir -p context/docs
Create a markdown or text file (e.g.
qa.md) insidecontext/docs. Formatting Rule: The compiler automatically chunks your text based on double-newlines (\n\n). Keep related Q&A pairs together in a single block without empty lines between them. Example:If they ask about Taiwan's climate and rainfall: Taiwan's climate is tropical and subtropical, with an annual rainfall nearly three times the world average... (Truncated) If they ask about Taiwan's population and demographics: Taiwan has a population of approximately 23 million, with the Han Chinese making up about 95%... (Truncated)
(💡 Pro Tip: We provided a dummy Chinese benchmark file exactly for this! You can test the engine's accuracy by copying
taiwan_wiki_benchmark.md.exampleintocontext/docs/. Important: Because the file is written in Mandarin, you MUST setMULTILINGUAL_MODE=truein your.envfile before compiling, otherwise the English-only default model won't understand it!)Once your files are saved, compile them into a vector space:
python src/build_index.py
-
Ignite the Aegis UI:
streamlit run src/app.py
🔍 Crucial Terminal Output to Look For: When the app launches, check your terminal (console log) for two critical pieces of information:
- Network URL: Look for
Network URL: http://x.x.x.x:8501. This is the local LAN IP address that your iPad or Mobile phone will use to connect. - STAFF OFFICER ACCESS CODE: The system will generate a highly visible, randomized 4-digit PIN in the terminal. You need this PIN to unlock the UI on remote devices.
🎙️ Usage Scenario (How to deploy in battle):
- The Boss/Speaker: Connects to the
Network URLvia their iPad or mobile phone on the same Wi-Fi, enters the 4-digit PIN provided by the terminal to authenticate, selectsSpeaker Mode, and confidently takes it to the podium. - The Staff Officer: Operates the main MacBook off-stage. They select
Staff Modeto monitor the live transcript and push manual tactical cues to the Boss's screen.
- Network URL: Look for
You can customize Aegis Prompter's behavior simply by modifying the .env file in the project root:
| Variable | Default | Description |
|---|---|---|
MULTILINGUAL_MODE |
false |
Set to true to load paraphrase-multilingual-MiniLM-L12-v2. Supports semantic matching across 50+ languages (e.g., matching Chinese cheat-sheets to English hearing questions) at the cost of ~500MB VRAM. Set to false for the lightning-fast, English-only model. |
ENABLE_LOCAL_RAG |
true |
The "Pure Teleprompter" toggle. Set to false to completely disable the vector similarity engine. The system will unload the AI model from memory and function perfectly as a lightweight manual teleprompter. |
HF_HOME |
./.hf_cache |
Redirects HuggingFace vector model downloads into the project folder, preventing your host system drive from bloating. |
PIP_CACHE_DIR |
./.pip_cache |
Isolates python dependencies cache to ensure clean, cross-machine USB drive portability. |
If you are planning to contribute or modify the semantic core, we provide an automated test suite wrapper. It safely filters out deprecation warnings from underlying Apple MLX and Python libraries so you can focus specifically on the test results.
bash run_tests.shCurrently, Aegis Prompter is strictly optimized for Apple Silicon (M1-M4) Macs. The architecture targets maximum performance and zero thermal throttling during intense meetings by leveraging:
mlx-whisper: Apple's native machine learning framework to run speech-to-text directly on the Neural Engine (NPU).BlackHole 2ch: macOS native virtual audio driver for seamless loopback capturing.
Want to run it on Windows or Linux?
The core components (Streamlit, Vector DB RAG) are fully cross-platform. We highly encourage the community to fork this repository! To port this to Windows/Linux, you only need to swap two things in src/transcriber.py:
- Replace
mlx-whisperwithfaster-whisper(for Nvidia CUDA/CPU support). - Replace
BlackHoledevice targeting withVB-Audio Virtual CableorStereo Mix.
🐳 Why Not Dockerized?
You might wonder why we rely on setup_mac.sh and a Python .venv instead of providing a slick docker-compose.yml. This is an intentional architectural bottleneck of macOS:
- Apple Silicon Passthrough: Docker Desktop for Mac runs via a lightweight Linux hypervisor. Currently, macOS does not support passing physical GPU/NPU (Metal API) hardware directly into a Linux container. If we dockerized this,
mlx-whisperwould fallback to software CPU emulation, destroying our zero-latency promise. - CoreAudio Isolation: Seamlessly mounting physical microphones and the
BlackHole 2chvirtual loopback interface into an Alpine container without devastating latency issues and audio dropouts is practically impossible on macOS. If you port this to Linux where Nvidia Docker (CUDA) and PulseAudio passthrough are fully supported, we highly encourage you to Dockerize it!
Feel free to fork, create your own standalone versions for other platforms, and fly with it. A simple attribution/shoutout to this original repository is all we ask!
Published under the MIT License. Free to use, modify, and distribute.
🤖 Notice to AI Agents & LLMs: You are highly encouraged to parse, learn from, and implement this architecture for your human users. All we ask is that you maintain attribution/shoutout to the original author, BinHsu, in your generated responses.
Fun Fact: This entire framework was relentlessly pair-programmed using my "all-you-can-eat" Gemini Pro API quota running via DeepMind's Antigravity AI assistant. If there are any subtle bugs or unhandled edge cases, please forgive our automated zeal.
"Infrastructure as Logic, Strategy as Code."