Rewind.tg - Local-AI Telegram Agent Search

A multi-agent system that lets a local LLM running on Ollama search, explore, and reason over your Telegram: your own chats, groups, channels, the public channel directory, and arbitrary links inside messages. No cloud, no vendor, no content filtering.

Designed around an agent harness (ReAct + JSON tool-calling) plus a multi-agent orchestrator that can spawn focused sub-agents (dialog_search, channel_explorer, link_crawler, generic). An optional MCP server exposes every tool to Cursor / VS Code Copilot.

Features

Uses MTProto (Telethon): the agent has the same reach as your own Telegram app: DMs, groups, joined channels, public channel directory, joining new channels, and resolving / following t.me/... links.
Multi-agent: orchestrator plans, dispatches spawn_subagent calls to specialists, then synthesises the result.
Local RAG index: SQLite FTS5 with CJK-aware segmentation (substring search works on Chinese / Japanese / Korean, not just whitespace-split Latin scripts) + Ollama dense embeddings (qwen3-embedding:0.6b by default), fused with reciprocal-rank fusion. Dense is gated by a cosine floor and weighted by how much of the corpus is embedded, so it degrades gracefully before a full backfill.
Ingest-time cleaning: exact + SimHash near-duplicate collapse (the same blurb reposted across dozens of channels becomes one result with a "+N more" tag) and purely structural spam down-ranking. No content moderation — your data, your rules.
Incremental sync: each dialog resumes from its last archived message, so refreshing the archive only fetches what's new.
Web tool: http_fetch for non-Telegram links found in messages, routed through your local proxy.
Optional vision: qwen3.5 already has built-in vision, so by default VISION_MODEL points at the same qwen3.5 model and the agent can call tg_describe_media on photos/videos it finds.
MCP server: expose the whole tool belt to Cursor / Copilot Chat.
Optional language packs (LOCALE_PACK= in .env): drop-in NLP data for intent classification, language-specific anti-spam, and deep-search triggers. Bundled: zh, en, ja, ko, ru, es — see Language packs to add your own. The search itself is fully Unicode-aware and works without any pack.

Requirements

Windows / Linux / macOS
Python >= 3.10 (conda / venv / pixi all work)
Ollama running locally at 127.0.0.1:11434
A Telegram account + API id/hash from https://my.telegram.org
Local proxy at 127.0.0.1:7890 (Clash-style HTTP proxy)

Recommended Models

Qwen3.5 (released April 2026) has tool-calling, vision, and thinking mode in one model, which is exactly what this system needs. The default setup uses the Qwen3.5 family throughout and does not require a separate vision model.

Role	Pick	~VRAM (Q4)
Default / low VRAM	`qwen3.5:0.8b`	~1 GB
Balanced / 8GB GPU	`qwen3.5:4b`	~3 GB
Best for 3060 12GB	`qwen3.5:9b`	~5-6 GB
Embeddings (default)	`qwen3-embedding:0.6b`	~0.6 GB
Embeddings (multilingual)	`bge-m3`	~1.2 GB
Heavy-duty (24GB+)	`qwen3.6:35b` or `qwen3:30b`	~15-20 GB

qwen3-embedding:0.6b is the recommended embedder: #1-for-its-size on the MTEB multilingual leaderboard, 32K context, same family as the LLM. bge-m3 is a strong alternative when you need maximum multilingual recall — swap it in via EMBED_MODEL in .env (dimensions are auto-detected, no code change).

Pulls:

ollama pull qwen3.5:0.8b
ollama pull qwen3-embedding:0.6b

If you want image understanding, point VISION_MODEL=qwen3.5:0.8b in .env. That reuses the same weights with zero extra VRAM.

Install

conda activate <your-env>   # or activate your venv
cd /path/to/Rewind.tg
pip install -r requirements.txt
copy .env.example .env
notepad .env   # fill TG_API_ID / TG_API_HASH / TG_PHONE, tweak models

First login (interactive; saves data/tg.session):

python -m tg_search.telegram_client

Enter the SMS / app code, and your 2FA password if enabled. The session is reused after that.

Usage

One-Shot

python cli.py "find every message mentioning 'stable diffusion LoRA' across all my chats, summarise the top 10 with links"

REPL

python cli.py --repl

Force A Specific Specialist Role

python cli.py --role channel_explorer "discover high-quality public channels about local LLM fine-tuning and index their last 500 msgs"

Build The Local RAG Archive

Search quality is bounded by how much history is local. The flow is sync → embed → search, all incremental and resumable:

# 1. Pull history (incremental: re-runs only fetch new messages).
#    Cleaned + deduped + spam-scored at ingest.
python -m scripts.download_history --all --limit 3000

# 2. Backfill dense vectors in batches (enables semantic retrieval;
#    needs Ollama running). Resumable — safe to Ctrl-C and rerun.
python -m scripts.download_history --embed-only

#    …or do both in one go:
python -m scripts.download_history --all --limit 3000 --embed

Inside the REPL the same thing is /sync then /embed, and /stats shows per-dialog freshness + how many messages still need embedding.

Until ~2% of the corpus is embedded the index is pure BM25 over the Unicode-segmented FTS5 table; dense retrieval (cross-language, paraphrase, semantic) ramps in automatically as the backfill progresses. tg_smart_search is local-first — it queries this archive before falling back to live Telegram search.

MCP Server (Cursor / VS Code Copilot)

Add to your MCP client config, for example ~/.cursor/mcp.json:

{
  "mcpServers": {
    "tg-agent": {
      "command": "python",
      "args": ["-m", "mcp_server.server"],
      "cwd": "/absolute/path/to/Rewind.tg",
      "env": { "PYTHONUNBUFFERED": "1" }
    }
  }
}

For VS Code Copilot Chat, add the same under the github.copilot.chat.mcpServers setting. Cursor / Copilot will then have tg_search_global, tg_join_channel, idx_semantic_search, tg_describe_media, spawn_subagent, and more.

Architecture

user query
  -> Orchestrator agent (ReAct loop, tool-calls)
  -> spawn_subagent(role=...)
     -> dialog_search
     -> channel_explorer
     -> link_crawler
  -> shared tool belt
     -> Telethon (MTProto, proxy)
     -> Ollama (local LLM / vision / embeddings)
     -> SQLite hybrid index (BM25 + cosine)
  -> final answer

Tool Belt (Agents + MCP)

Tool	Purpose
`tg_list_dialogs`	list user's chats / groups / channels / bots
`tg_search_global`	global search across all user's chats
`tg_search_in_dialog`	search inside one dialog
`tg_message_context`	expand around a hit
`tg_search_public_channels`	public directory search
`tg_join_channel`	join public / invite-only channels
`tg_resolve_link`	inspect a `t.me/...` link without joining
`tg_index_channel`	download channel history into local index
`idx_semantic_search`	BM25 + embedding hybrid search
`http_fetch`	fetch web URLs via proxy, cleaned HTML
`tg_describe_media`	download + describe media with the vision model
`spawn_subagent`	recursive delegation to a focused sub-agent

Language Packs

The search, index, and embeddings are all Unicode-aware out of the box — any script that Python and FTS5 handle (which is essentially every modern script) works without configuration. A language pack only adds nice-to-have NLP optimisations on top:

Signal	What it does
`geo_tail_chars` / `geo_substrs`	recognise location-style queries for intent classification
`verb_phrase_hints`	spot sentence / dialogue / idiom queries
`sentence_stopwords`	filter out generic noise terms in LLM-derived query expansions
`spam_phrase_pattern`	down-rank language-specific promo / CTA blasts at ingest
`deep_search_keywords`	trigger exhaustive cross-dialog search from the orchestrator

Enable one of the bundled packs by setting LOCALE_PACK=<code> in .env:

LOCALE_PACK=zh   # or: en, ja, ko, ru, es

Bundled packs live under locales/. To add a new language, copy locales/_template.py to locales/<code>.py, fill in any of the fields, and set LOCALE_PACK=<code>. All fields are optional — an empty pack is just the same as no pack.

If LOCALE_PACK is unset (the default) the tool is fully functional; intent classification just relies on the LLM's verdict without linguistic post-correction, and the structural spam score skips the language-specific CTA signal.

Safety / Data Policy

All data stays on your machine (SQLite file + Telethon session). The agent performs no content filtering: your Telegram, your rules. Inbound prompt injection from message content is possible in theory, so do not run this against an account whose tool belt you would not trust to run locally.

Your api_hash is a secret, so keep it in .env (git-ignored). If it ever leaks, rotate it at https://my.telegram.org. Never commit .env or data/ (session file + local index + downloaded media).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Rewind.tg - Local-AI Telegram Agent Search

Features

Requirements

Recommended Models

Install

Usage

One-Shot

REPL

Force A Specific Specialist Role

Build The Local RAG Archive

MCP Server (Cursor / VS Code Copilot)

Architecture

Tool Belt (Agents + MCP)

Language Packs

Safety / Data Policy

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.vscode		.vscode
locales		locales
mcp_server		mcp_server
scripts		scripts
tg_search		tg_search
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
cli.py		cli.py
config.py		config.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Rewind.tg - Local-AI Telegram Agent Search

Features

Requirements

Recommended Models

Install

Usage

One-Shot

REPL

Force A Specific Specialist Role

Build The Local RAG Archive

MCP Server (Cursor / VS Code Copilot)

Architecture

Tool Belt (Agents + MCP)

Language Packs

Safety / Data Policy

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages