According to the Stanford HAI 2025 AI Index Report, artificial intelligence is becoming increasingly affordable and accessible; like social media, it is now part of our daily lives. Among all branches of AI, large language models (LLMs), especially the GPT series, have taken center stage in AI research and application. Furthermore, as Nature reports, "AI chatbots have deeply changed scientists' lives." There is therefore strong demand for people to communicate, and even cooperate, with AI to meet their own needs.
Currently, however, most AI chat systems available to end users are essentially web applications wrapped as desktop clients, or are offered directly as web pages. Most of them use JavaScript or Java to connect to and interact with LLMs, which allows rapid cross-platform deployment but also introduces two limitations: large memory consumption, and a lack of security and reliability.
Large memory consumption: many web pages and wrapped clients depend on JavaScript or Java, and some use JavaScript frameworks such as Electron, which consume a great deal of memory. For example, Electron embeds Chromium and Node.js, so installing such a client requires extra space for the framework dependencies, even though only a small part of the framework is actually used and the rest is redundant. The same applies to Java: a Java program is not a native binary, so it requires a Java Virtual Machine (JVM) to execute, and the JVM must be installed before anything can run, bringing unnecessary memory cost.
Lack of security and reliability: JavaScript is a dynamically typed language that relies on runtime error checking, making it prone to type errors and null-access issues that can cause instability at runtime. While tools like TypeScript offer some improvements, JavaScript still carries inherent risks in memory management and concurrency control. Java, as a statically typed language, catches many errors at compile time and improves type safety, but its garbage collection and multithreading management can cause performance fluctuations and affect system stability under high load. Even C, which like Rust is a procedural language, offers maximum flexibility and efficiency but lacks automatic memory management, leaving programs susceptible to memory leaks, null-pointer issues, and other reliability problems.
Our project, hence, aims to implement a simple LLM-powered CLI that leverages Rust's strengths in performance, memory safety, and concurrency control to address the limitations of existing solutions. Unlike traditional web-based or Java-based systems, which face issues with memory management, performance overhead, and error handling, our project seeks to provide a more reliable and efficient backend solution. By utilizing Rust's compile-time error checking and low-level control, we aim to build a system that ensures stability and scalability while maintaining ease of integration with external tools and workflows. This will allow us to create a highly responsive, context-aware CLI that can automate tasks, support agentic workflows, and deliver a superior user experience, all while maintaining the robustness necessary for complex AI-driven operations.
The goal of this project is to create a lightweight, low-latency AI Chat system in Rust that simulates the core functionalities of popular AI chat applications. This system is designed to allow users to interact with an AI-powered assistant through a command-line interface (CLI), providing a robust, real-time chat experience.
This project aims to fill a gap in the Rust ecosystem by providing a Rust-native, CLI-based solution for building AI-powered chat systems. The project was intended not only as a basic AI assistant but as a tool capable of supporting the following core objectives:
- Context-Aware Conversations: A system that manages and maintains user interactions over multiple turns, providing seamless back-and-forth communication.
- Real-Time Interaction: Ensuring quick responses from the AI assistant while handling high concurrency, making the system responsive and efficient for real-time conversations.
- Seamless Integration with External Tools: The AI chat system will be capable of interacting with external tools and APIs for enhanced functionality, supporting use cases such as retrieving data, processing requests, and executing actions on behalf of the user.
The original implementation plan is to integrate advanced Natural Language Processing (NLP) techniques, using Rust's performance and memory-safety features together with web connectivity technologies (e.g., WebSockets), to provide a real-time chat assistant. The specific features are explained below; here we simply list the rough technical points that serve as our technical objectives (and as a guide for readers):
- (Basic need) Use a data store (e.g., MySQL) to store conversations, develop the data-access layer in Rust, and feed this data back to the LLM as prompt context.
- (Basic need) Since the LLM may think for a long time or output a very long answer at once, we should use streaming so the answer is displayed incrementally, for a better user experience.
- (Challenging need, but our main focus) Build a simple AI agent by injecting other interfaces into the LLM as prompts.
- (Challenging need, may be only partly developed) Apply design patterns when developing the project, so that the code is easy to read and understand, and easy to extend with extra functions in the future.
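The streaming objective above can be sketched with a channel: a producer thread stands in for the LLM's streamed response, and the consumer appends each chunk as it arrives instead of waiting for the full answer. This is a minimal std-only sketch; the names (`collect_stream`, the sample chunks) are hypothetical, and a real client would render each chunk in the UI immediately.

```rust
use std::sync::mpsc;
use std::thread;

// Receive chunks from a producer thread and append them to a growing
// transcript, standing in for rendering streamed LLM tokens incrementally.
fn collect_stream(rx: mpsc::Receiver<String>) -> String {
    let mut transcript = String::new();
    for chunk in rx {
        // A real client would display `chunk` here as soon as it arrives.
        transcript.push_str(&chunk);
    }
    transcript
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Hypothetical producer simulating the LLM's streamed answer.
    let producer = thread::spawn(move || {
        for chunk in ["Hello", ", ", "world", "!"] {
            tx.send(chunk.to_string()).unwrap();
        }
    });
    let text = collect_stream(rx);
    producer.join().unwrap();
    assert_eq!(text, "Hello, world!");
    println!("{text}");
}
```

The channel closes when the producer drops its sender, which ends the consumer loop cleanly; the same shape works with an async stream of HTTP chunks.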
- Custom Client-Side Interface for AI Chat
We will use Ratatui to design the user interface, and WebSockets plus a database to provide real-time conversation and conversation history.
- Use Ratatui to offer a user-friendly CLI interface with support for natural language queries.
- Use websocket to provide real-time conversation management, maintaining conversation history and context across multiple turns.
- Use a database to store the conversations and provide them to the LLM.
- Guarantee real-time conversation, including automatic reconnect attempts and detection of whether the real-time connection is still healthy.
👉 Task Assignment
Zhengyang Li: Websocket communication module (real-time connection, auto-reconnect, connection health checks).
Peixuan Li: Database integration (schema design, storing/querying conversation history, API layer).
Yanchi Wang: Ratatui interface design (input/output panels, status bar, history viewing).
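For the auto-reconnect behavior, one common approach (a sketch under our own assumptions, not the final implementation) is exponential backoff: each failed attempt doubles the wait before retrying, capped at a maximum. The function name and the 1s/30s bounds here are illustrative choices.

```rust
use std::time::Duration;

// Exponential backoff for websocket auto-reconnect attempts:
// attempt 0 waits 1s, each retry doubles the wait, capped at 30s.
fn reconnect_delay(attempt: u32) -> Duration {
    let secs = 1u64
        .checked_shl(attempt)   // 1 << attempt; None on overflow
        .unwrap_or(u64::MAX)
        .min(30);               // cap so retries never wait too long
    Duration::from_secs(secs)
}

fn main() {
    for attempt in 0..6 {
        println!("attempt {attempt}: wait {:?}", reconnect_delay(attempt));
    }
    assert_eq!(reconnect_delay(0), Duration::from_secs(1));
    assert_eq!(reconnect_delay(10), Duration::from_secs(30));
}
```

The health check can then simply reset `attempt` to zero once a ping/pong round-trip succeeds, so a healthy connection always starts from the shortest delay.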
- Tool Integration
- Use Model Context Protocol to call common tools (filesystem, shell, git, simple web fetch).
- Deny-listed tools are blocked and surfaced as a friendly notice.
- Also support tool for querying database.
👉 Task Assignment
Zhengyang Li: Base ToolRegistry design (register, lookup, execute tools).
Peixuan Li: Implement database-related tools (e.g., querying conversation history).
Yanchi Wang: Implement the frontend integration so tool results can be triggered and displayed in the UI.
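A minimal shape for the ToolRegistry described above (register, lookup, execute, plus the deny-list returning a friendly notice) could look like the following sketch. The struct and message strings are assumptions for illustration, not the final API.

```rust
use std::collections::{HashMap, HashSet};

// A tool takes a string argument and returns a string result.
type Tool = fn(&str) -> String;

// Minimal registry: register, lookup, execute, with a deny-list that
// blocks tools and surfaces a friendly notice instead of running them.
struct ToolRegistry {
    tools: HashMap<String, Tool>,
    denied: HashSet<String>,
}

impl ToolRegistry {
    fn new() -> Self {
        Self { tools: HashMap::new(), denied: HashSet::new() }
    }
    fn register(&mut self, name: &str, tool: Tool) {
        self.tools.insert(name.to_string(), tool);
    }
    fn deny(&mut self, name: &str) {
        self.denied.insert(name.to_string());
    }
    fn execute(&self, name: &str, input: &str) -> String {
        if self.denied.contains(name) {
            return format!("tool '{name}' is blocked by the deny-list");
        }
        match self.tools.get(name) {
            Some(tool) => tool(input),
            None => format!("unknown tool '{name}'"),
        }
    }
}

fn main() {
    let mut reg = ToolRegistry::new();
    reg.register("echo", |s| format!("echo: {s}"));
    reg.register("shell", |s| format!("would run: {s}"));
    reg.deny("shell");
    assert_eq!(reg.execute("echo", "hi"), "echo: hi");
    println!("{}", reg.execute("shell", "rm -rf /")); // blocked notice
}
```

Checking the deny-list before lookup means even a registered tool is refused once denied, which keeps the safety check in one place.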
- Agentic Workflow: our team's main focus, though challenging
We will implement an agentic system that can complete simple workflows.
- Implement Plan → Execute → Verify workflow
- Agents are pluggable modules with a common Agent trait interface
- An executor runs plans step by step with input/output injection
👉 Task Assignment
Zhengyang Li: Implement the Executor (executes DSL instructions, manages input/output injection).
Peixuan Li: Define the Agent trait and implement the Planner (Ollama integration → outputs a simple DSL).
Yanchi Wang: Implement the Verify/feedback layer (display workflow execution process and final result in the UI).
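The pluggable-agent idea above can be sketched as a common `Agent` trait plus a tiny driver that runs the Plan → Execute → Verify loop. The `EchoAgent` and string-typed steps here are toy stand-ins, not the real Planner/Executor DSL.

```rust
// Pluggable agents share one trait; a small driver runs the
// Plan -> Execute -> Verify workflow over string-typed steps.
trait Agent {
    fn plan(&self, goal: &str) -> Vec<String>;
    fn execute(&self, step: &str) -> String;
    fn verify(&self, goal: &str, outputs: &[String]) -> bool;
}

// Toy agent: plans one "say <word>" step per word, executes by echoing.
struct EchoAgent;

impl Agent for EchoAgent {
    fn plan(&self, goal: &str) -> Vec<String> {
        goal.split_whitespace().map(|w| format!("say {w}")).collect()
    }
    fn execute(&self, step: &str) -> String {
        step.trim_start_matches("say ").to_string()
    }
    fn verify(&self, goal: &str, outputs: &[String]) -> bool {
        outputs.join(" ") == goal
    }
}

// Runs the workflow: plan the goal, execute each step, verify outputs.
fn run_workflow(agent: &dyn Agent, goal: &str) -> bool {
    let plan = agent.plan(goal);
    let outputs: Vec<String> = plan.iter().map(|s| agent.execute(s)).collect();
    agent.verify(goal, &outputs)
}

fn main() {
    assert!(run_workflow(&EchoAgent, "hello agentic world"));
}
```

Because the driver only sees `&dyn Agent`, swapping in the Ollama-backed Planner or a different Executor later does not change the loop.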
Our team plans to achieve the project objective by dividing the work into four main modules: Database, Websocket Communication, TUI Client, and Agentic Workflow.
Each team member has clear responsibilities: Peixuan Li focuses on database integration and the planner, Zhengyang Li works on websocket communication and the executor, and Yanchi Wang develops the TUI interface and the verification layer.
The tentative plan is as follows:
| Module | Core Task | Responsible Member | Assistant Contributor | Tentative DDL (ET @ 23:59) |
|---|---|---|---|---|
| Database Setup | Design schemas and integrate database for storing/querying conversation history | Peixuan Li | Zhengyang Li | Sun, Oct 12, 2025 |
| Database Setup | Verify database functionalities with sample data | Peixuan Li | Yanchi Wang | Sun, Oct 19, 2025 |
| Websocket Communication | Build real-time websocket connection with auto-reconnect and health monitoring | Zhengyang Li | Yanchi Wang | Sun, Oct 26, 2025 |
| TUI Client (Ratatui) | Set up Ratatui framework and design input/output panels with history view | Yanchi Wang | Zhengyang Li | Sun, Nov 2, 2025 |
| TUI Client (Ratatui) | Integrate tool results into the interface for user interaction | Yanchi Wang | Peixuan Li | Sun, Nov 9, 2025 |
| Tool Integration | Implement ToolRegistry (register, lookup, execute tools) | Zhengyang Li | / | Sun, Nov 2, 2025 |
| Tool Integration | Add database-related tools (e.g., query past conversations) | Peixuan Li | / | Sun, Nov 9, 2025 |
| Agentic Workflow | Define Agent trait and Planner (Ollama → DSL output) | Peixuan Li | / | Sun, Nov 9, 2025 |
| Agentic Workflow | Implement Executor (step-by-step DSL execution with input/output injection) | Zhengyang Li | / | Sun, Nov 16, 2025 |
| Agentic Workflow | Implement Verify/feedback layer and show execution results in the UI | Yanchi Wang | / | Sun, Nov 23, 2025 |
| Agentic Workflow | End-to-end workflow integration test (happy path) | All | All | Sun, Nov 30, 2025 |
- Requirements: a local OpenAI-compatible endpoint (e.g., LM Studio) at `LLM_BASE_URL`, plus a MySQL instance reachable via `DATABASE_URL`. Copy `.env` or this `.env` into your shell environment.
- Start in TUI mode (default): `cargo run`
- Start in simple CLI mode: `cargo run -- --cli`
- Send a single question and exit: `cargo run -- --once "what is my cpu arch?"`
- The CLI prints the plan, tool executions, verification status, and then streams the LLM response. Conversation history is still stored in MySQL so you can reopen it in the TUI later.
- Set `LLM_BASE_URL` (defaults to `http://localhost:1234`, matching LM Studio's default).
- Set `LLM_MODEL` to the model name exposed by your local server.
- Optional: `LLM_API_KEY` if your local server enforces auth.
- The client uses OpenAI-style `POST /v1/chat/completions` with streaming; no cloud services are required if you run a local server (e.g., LM Studio or Ollama with an OpenAI-compatible shim).
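For reference, the request body sent to that endpoint looks roughly like the sketch below. To stay dependency-free this builds the JSON by hand with naive escaping; the real client would use `serde_json`/`reqwest`, and the model name shown is a placeholder.

```rust
// Builds the JSON body for an OpenAI-style streaming chat request,
// using only the standard library (a sketch, not the real client code).
fn chat_request_body(model: &str, user_msg: &str) -> String {
    // Naive escaping of backslashes and quotes, enough for this sketch.
    let esc = |s: &str| s.replace('\\', "\\\\").replace('"', "\\\"");
    format!(
        "{{\"model\":\"{}\",\"stream\":true,\
         \"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        esc(model),
        esc(user_msg)
    )
}

fn main() {
    // "local-model" is a placeholder for whatever LLM_MODEL names.
    let body = chat_request_body("local-model", "what is my cpu arch?");
    assert!(body.contains("\"stream\":true"));
    println!("POST /v1/chat/completions\n{body}");
}
```

With `"stream":true` the server answers with incremental `data:` chunks rather than one final JSON object, which is what the streaming display consumes.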
- Local MCP server (built-in): `ENABLE_LOCAL_MCP_SERVER=true` (and optionally `MCP_LISTEN_ADDR`, default `127.0.0.1:4000`). This exposes the app's tools (filesystem/shell/git/web_fetch/database/acp) via `POST /tools/{name}/invoke` for other clients.
- External MCP proxy: set `MCP_BASE_URL` (and optional `MCP_API_KEY`) to call a remote MCP server.
- ACP/editor: set `ACP_BASE_URL` (and optional `ACP_API_KEY`) to send editor commands to an ACP-compatible server; if unset, the `acp` tool falls back to local file read/append/replace operations.
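The local fallback for the `acp` tool reduces to plain std file operations; the append case is sketched below (read and replace are analogous via `fs::read_to_string` and `fs::write`). The helper name and demo path are illustrative, not the tool's actual interface.

```rust
use std::fs;
use std::io::Write;

// Local fallback for an append operation: create the file if missing,
// then add one line at the end.
fn append_line(path: &str, line: &str) -> std::io::Result<()> {
    let mut f = fs::OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(f, "{line}")
}

fn main() -> std::io::Result<()> {
    // Demo in the temp directory so the sketch is self-cleaning to rerun.
    let path = std::env::temp_dir().join("acp_demo.txt");
    let path = path.to_string_lossy().into_owned();
    let _ = fs::remove_file(&path);
    append_line(&path, "first")?;
    append_line(&path, "second")?;
    assert_eq!(fs::read_to_string(&path)?, "first\nsecond\n");
    Ok(())
}
```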
| Deliverable / Buffer | Owner(s) | Tentative DDL (ET @ 23:59) | Notes |
|---|---|---|---|
| Project Proposal (README.md) | All | Mon, Oct 6, 2025 | Submit repo URL; add instructor as collaborator if private. |
| Feature freeze / bug triage | All | Sun, Dec 7, 2025 | Stop adding features; focus on reliability & UX polish. |
| Final Report (README.md) – draft | All | Wed, Dec 10, 2025 | Complete Reproducibility & User Guide sections. |
| Video Slide Presentation (5–10 min) | All | Fri, Dec 12, 2025 | Record & upload; link in README under “Video Slide Presentation”. |
| Video Demo (3–10 min) | All | Sat, Dec 13, 2025 | Record end-to-end demo; link in README under “Video Demo”. |
| Repro checks on Ubuntu & macOS | All | Sun, Dec 14, 2025 | Fresh clone build/run using README steps. Fix any gaps. |
| Final Submission | All | Mon, Dec 15, 2025 | Ensure README links work; tag the submitted commit. |