English | 中文
Project Root: ../README.md
This document focuses on the core `autoskill/` package: SDK APIs, online retrieval and skill evolution, the Web UI / proxy runtime, and offline extraction from archived conversations and agentic trajectories.
- 1. Quick Start
- 2. Core Workflow
- 3. SkillBank Storage Layout
- 4. Package Structure
- 5. SDK and Offline Usage
- 6. Provider Setup
- 7. Runtime Workflows and APIs
## 1. Quick Start

Web UI:

```bash
python3 -m pip install -e .

export INTERNLM_API_KEY="YOUR_INTERNLM_API_KEY"
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"

python3 -m examples.web_ui \
  --host 127.0.0.1 \
  --port 8000 \
  --llm-provider internlm \
  --embeddings-provider qwen \
  --store-dir SkillBank \
  --user-id u1 \
  --skill-scope all \
  --rewrite-mode always \
  --extract-mode auto \
  --extract-turn-limit 1 \
  --min-score 0.4 \
  --top-k 1
```

Open http://127.0.0.1:8000.
AutoSkill can also be deployed as a reverse proxy that exposes OpenAI-compatible endpoints while transparently applying:
- skill retrieval and injection for each chat request
- asynchronous skill extraction and maintenance after responses
```bash
python3 -m pip install -e .

export INTERNLM_API_KEY="YOUR_INTERNLM_API_KEY"
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"

python3 -m examples.openai_proxy \
  --host 127.0.0.1 \
  --port 9000 \
  --llm-provider internlm \
  --embeddings-provider qwen \
  --served-model intern-s1-pro \
  --served-model gpt-5.2 \
  --store-dir SkillBank \
  --skill-scope all \
  --rewrite-mode always \
  --min-score 0.4 \
  --top-k 1
```

Endpoints:
- `POST /v1/chat/completions`
- `POST /v1/embeddings`
- `GET /v1/models`
- `GET /health`
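Because these endpoints are OpenAI-compatible, any standard OpenAI client can talk to the proxy. A minimal sketch using the official `openai` Python package, assuming the proxy started above and that your deployment enforces no upstream key check:

```python
from openai import OpenAI

# Point the standard client at the AutoSkill proxy instead of api.openai.com.
client = OpenAI(base_url="http://127.0.0.1:9000/v1", api_key="not-needed-locally")

resp = client.chat.completions.create(
    model="intern-s1-pro",
    messages=[{"role": "user", "content": "Write a concise summary of skill self-evolution."}],
)
print(resp.choices[0].message.content)
```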
Streaming chat example:
```bash
curl -N http://127.0.0.1:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "intern-s1-pro",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a concise summary of skill self-evolution."}
    ]
  }'
```

To run the Web UI and proxy together with Docker Compose:

```bash
cp .env.example .env
# edit .env and fill in API keys
docker compose up --build -d
```

After startup:
- Web UI: http://127.0.0.1:8000
- API Proxy: http://127.0.0.1:9000
Stop services:

```bash
docker compose down
```

## 2. Core Workflow

Skill evolution:

```
Experience (messages/events)
  -> Skill Extraction (candidate)
  -> Skill Maintenance (add / merge / discard)
  -> Skill Store (Agent Skill artifact + vector index)
```
- The extractor emits at most one high-quality candidate per attempt.
- The maintainer checks similarity against existing skills, then decides to add, merge, or discard (sketched below).
- Merge updates preserve and improve existing capabilities, then bump the patch version.
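A minimal sketch of that decision rule, with hypothetical thresholds and a caller-supplied similarity function (the real logic lives in `management/maintenance.py`):

```python
def decide(candidate, existing_skills, similarity,
           merge_threshold=0.85, dup_threshold=0.97):
    """Illustrative add/merge/discard decision for an extracted candidate."""
    best, best_sim = None, 0.0
    for skill in existing_skills:
        sim = similarity(candidate, skill)  # e.g. cosine over embeddings
        if sim > best_sim:
            best, best_sim = skill, sim
    if best_sim >= dup_threshold:
        return "discard", best   # near-duplicate: nothing new to keep
    if best_sim >= merge_threshold:
        return "merge", best     # fold the candidate into the existing skill
    return "add", None           # genuinely new capability
```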
Retrieval and response:

```
User Query (+ recent history)
  -> Query Rewrite (optional)
  -> Embedding + Search
  -> Skill Selection for Context
  -> LLM Response
```
- Retrieval runs every turn.
- The similarity threshold and `top_k` control precision and recall (see the sketch after this list).
- Retrieved skills are filtered again before context injection.
- The top-1 retrieved skill, if it passes `min_score`, is passed to extraction as auxiliary identity context.
- Retrieved skills are audited asynchronously for actual relevance and usage in the final reply.
- Usage counters are isolated per user and can auto-prune stale user skills with the defaults `retrieved >= 40` and `used <= 0`.
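A minimal sketch of the threshold-plus-`top_k` selection, assuming hits arrive as `(skill, score)` pairs from the search backend:

```python
def select_for_context(hits, min_score=0.4, top_k=1):
    # Drop hits below the similarity threshold, then keep at most top_k;
    # a second relevance filter still runs before actual context injection.
    passed = [(skill, score) for skill, score in hits if score >= min_score]
    passed.sort(key=lambda pair: pair[1], reverse=True)
    return passed[:top_k]
```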
Extraction policy:

- `extract_mode=auto`: attempt extraction every `extract_turn_limit` turns (see the cadence sketch after this list).
- `extract_mode=always`: attempt every turn.
- `extract_mode=never`: disable auto extraction.
- `/extract_now [hint]`: force immediate background extraction for the current context.
- Generic one-off requests without a durable correction should produce no extracted skill.
- User feedback that encodes stable preferences or constraints should trigger extraction or update.
- If a similar user skill already exists, prefer merge and update over creating duplicates.
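The cadence policy fits in a few lines; a hypothetical sketch, not the actual implementation:

```python
def should_attempt_extraction(mode: str, turn: int, extract_turn_limit: int) -> bool:
    # Mirrors the extract_mode options listed above.
    if mode == "never":
        return False
    if mode == "always":
        return True
    # mode == "auto": attempt every `extract_turn_limit` turns.
    return turn % max(1, extract_turn_limit) == 0
```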
Proxy request flow:

```
Client (OpenAI-compatible request)
  -> AutoSkill Proxy (/v1/chat/completions)
  -> Query rewrite + skill retrieval + context injection
  -> Upstream model generation
  -> Return response to client
  -> Async skill extraction and maintenance
```
- Client-visible latency covers only retrieval plus generation.
- Skill evolution runs asynchronously to avoid blocking the client.
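A minimal sketch of this synchronous-reply / asynchronous-evolution split, with placeholder helpers standing in for the real pipeline:

```python
import threading

def retrieve_and_generate(request):
    # Placeholder for the synchronous path: retrieval + upstream generation.
    return {"reply": f"echo: {request['message']}"}

def extract_and_maintain(request, response):
    # Placeholder for the asynchronous path: extraction + maintenance.
    pass

def handle_chat(request):
    response = retrieve_and_generate(request)
    # Skill evolution runs in the background so the client is never blocked.
    threading.Thread(
        target=extract_and_maintain, args=(request, response), daemon=True
    ).start()
    return response

print(handle_chat({"message": "hi"}))
```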
## 3. SkillBank Storage Layout

When using `store={"provider": "local", "path": "SkillBank"}`:

```
SkillBank/
  Users/
    <user_id>/
      <skill-slug>/
        SKILL.md
        scripts/        (optional)
        references/     (optional)
        assets/         (optional)
  Common/
    <skill-slug>/SKILL.md
    <library>/<skill-slug>/SKILL.md
  vectors/
    <embedding-signature>.meta.json
    <embedding-signature>.ids.txt
    <embedding-signature>.vecs.f32
  index/
    skills-bm25.*            (persistent BM25 index files)
    skill_usage_stats.json   (per-user retrieval and usage counters)
```
Notes:
- `Users/<user_id>/`: user-specific skills.
- `Common/`: shared library skills.
- `vectors/`: persistent vector cache keyed by embedding signature.
- `index/`: BM25 index and usage statistics used by retrieval and stale-skill pruning.
## 4. Package Structure

Top level:

- `client.py`: public SDK entrypoint (`ingest`, `search`, `render_context`, import and export).
- `config.py`: global config model.
- `models.py`: core data models (`Skill`, `SkillHit`, and related types).
- `render.py`: converts selected skills into injectable context.
- `skill_provenance.py`: local skill provenance tracking and version-history helpers.
- `llm/`: pluggable LLM backends.
- `embeddings/`: pluggable embedding backends.

`management/`:

- `management/extraction.py`: extraction logic and prompts.
- `management/maintenance.py`: merge, versioning, and add/discard decisions.
- `management/formats/agent_skill.py`: `SKILL.md` rendering and parsing.
- `management/stores/local.py`: directory-based storage plus vector mapping.
- `management/vectors/flat.py`: on-disk vector index backend.
- `management/importer.py`: imports external Agent Skills.

`interactive/`:

- `interactive/app.py`: terminal interactive app orchestration.
- `interactive/session.py`: headless session engine for Web and API usage.
- `interactive/server.py`: OpenAI-compatible reverse proxy runtime.
- `interactive/rewriting.py`: query rewriting for better retrieval.
- `interactive/selection.py`: optional LLM skill selection before injection.
- `interactive/unified.py`: unified composition root for interactive and proxy wiring.

`offline/`:

- `offline/conversation/extract.py`: imports OpenAI-format conversation `.json`/`.jsonl` files and extracts skills.
- `offline/trajectory/extract.py`: imports offline agentic trajectory data and extracts workflow skills.
- `offline/document` is no longer part of `autoskill/`; document extraction is maintained in ../AutoSkill4Doc/README.md.

Examples:

- `../examples/web_ui.py`: local Web UI server.
- `../examples/interactive_chat.py`: CLI interactive chat.
- `../examples/openai_proxy.py`: OpenAI-compatible proxy entrypoint.
- `../examples/auto_evalution.py`: automated LLM-vs-LLM evolution evaluation.
- `../examples/basic_ingest_search.py`: minimal SDK loop.
## 5. SDK and Offline Usage

Minimal SDK loop with mock providers (no API keys required):

```python
from autoskill import AutoSkill, AutoSkillConfig

sdk = AutoSkill(
    AutoSkillConfig(
        llm={"provider": "mock"},
        embeddings={"provider": "hashing", "dims": 256},
        store={"provider": "local", "path": "SkillBank"},
    )
)

sdk.ingest(
    user_id="u1",
    messages=[
        {"role": "user", "content": "Before each release: run regression -> canary -> monitor -> full rollout."},
        {"role": "assistant", "content": "Understood."},
    ],
)

hits = sdk.search("How should I do a safe release?", user_id="u1", limit=3)
for h in hits:
    print(h.skill.name, h.score)
```

Offline conversation import with real providers:

```python
from autoskill import AutoSkill, AutoSkillConfig

sdk = AutoSkill(
    AutoSkillConfig(
        llm={"provider": "internlm", "model": "intern-s1-pro"},
        embeddings={"provider": "qwen", "model": "text-embedding-v4"},
        store={"provider": "local", "path": "SkillBank"},
    )
)

result = sdk.import_openai_conversations(
    user_id="u1",
    file_path="./data/openai_dialogues.jsonl",
    hint="Focus on reusable user preferences and workflows.",
    continue_on_error=True,
    max_messages_per_conversation=100,
)
print("processed:", result["processed"], "upserted:", result["upserted_count"])
```

Notes:
- Input format should be OpenAI-style conversations (`.json`/`.jsonl` with `messages`).
- Offline extraction structures the payload into two parts (illustrated after this list):
  - Primary User Questions (main evidence)
  - Full Conversation (context reference)
- User turns are primary evidence; assistant-side artifacts are excluded from skill evidence.
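An illustrative sketch of that two-part layout (not the exact prompt format used internally):

```python
def build_extraction_payload(messages):
    # User turns form the primary evidence; the full transcript is kept
    # only as context reference.
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    return (
        "Primary User Questions (main evidence):\n"
        + "\n".join(f"- {q}" for q in user_turns)
        + "\n\nFull Conversation (context reference):\n"
        + transcript
    )
```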
Batch extraction from the CLI:

```bash
python3 -m autoskill.offline.conversation.extract \
  --file ./data/random_50 \
  --user-id u1 \
  --llm-provider dashscope \
  --embeddings-provider dashscope
```

Behavior:
- `--file` accepts one OpenAI-format `.json`/`.jsonl` file or a directory containing multiple files (see the loader sketch after this list).
- If a single `.json` file contains multiple conversations, the loader iterates over all conversation units.
- The runner prints per-file progress in real time, including file names and extracted skill names.
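A minimal sketch of that file/directory resolution, assuming a hypothetical helper name:

```python
from pathlib import Path

def iter_input_files(path: str):
    # --file may be a single .json/.jsonl file or a directory of such files.
    p = Path(path)
    if p.is_dir():
        yield from sorted(q for q in p.rglob("*") if q.suffix in {".json", ".jsonl"})
    else:
        yield p
```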
Trajectory extraction:

```bash
python3 -m autoskill.offline.trajectory.extract \
  --file ./data/traces \
  --user-id u1 \
  --llm-provider dashscope \
  --embeddings-provider dashscope \
  --success-only 1 \
  --include-tool-events 1
```

## 6. Provider Setup

DashScope (Qwen):

```bash
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"
python3 -m examples.interactive_chat --llm-provider dashscope
```

Zhipu GLM:

```bash
export ZHIPUAI_API_KEY="YOUR_ID.YOUR_SECRET"
python3 -m examples.interactive_chat --llm-provider glm
```

OpenAI:

```bash
export OPENAI_API_KEY="YOUR_OPENAI_KEY"
python3 -m examples.interactive_chat --llm-provider openai
```

Anthropic:

```bash
export ANTHROPIC_API_KEY="YOUR_ANTHROPIC_KEY"
python3 -m examples.interactive_chat --llm-provider anthropic
```

InternLM:

```bash
export INTERNLM_API_KEY="YOUR_INTERNLM_TOKEN"
python3 -m examples.interactive_chat --llm-provider internlm --llm-model intern-s1-pro
```

Generic OpenAI-compatible endpoint:

```bash
export AUTOSKILL_GENERIC_LLM_URL="http://XXX/v1"
export AUTOSKILL_GENERIC_LLM_MODEL="gpt-5.2"
export AUTOSKILL_GENERIC_EMBED_URL="http://XXX/v1"
export AUTOSKILL_GENERIC_EMBED_MODEL="embd_qwen3vl8b"
export AUTOSKILL_GENERIC_API_KEY=""
python3 -m examples.interactive_chat --llm-provider generic --embeddings-provider generic
```

## 7. Runtime Workflows and APIs

Interactive chat (CLI):

```bash
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"
python3 -m examples.interactive_chat --llm-provider dashscope
```

Useful commands:

- `/extract_now [hint]`
- `/extract_every <n>`
- `/extract auto|always|never`
- `/scope user|common|all`
- `/search <query>`
- `/skills`
- `/export <skill_id>`
Web UI:

```bash
export INTERNLM_API_KEY="YOUR_INTERNLM_API_KEY"
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"
python3 -m examples.web_ui --llm-provider internlm --embeddings-provider qwen
```

When a service runtime starts (web_ui, interactive_chat, openai_proxy), AutoSkill can automatically:

- normalize missing `id:` fields in `SKILL.md`
- optionally import external skill directories when `AUTOSKILL_AUTO_IMPORT_DIRS` is configured
Optional environment controls:
- `AUTOSKILL_AUTO_NORMALIZE_IDS`
- `AUTOSKILL_AUTO_IMPORT_DIRS`
- `AUTOSKILL_AUTO_IMPORT_SCOPE`
- `AUTOSKILL_AUTO_IMPORT_LIBRARY`
- `AUTOSKILL_AUTO_IMPORT_OVERWRITE`
- `AUTOSKILL_AUTO_IMPORT_INCLUDE_FILES`
- `AUTOSKILL_AUTO_IMPORT_MAX_DEPTH`
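The exact value formats are not documented here; a purely hypothetical sketch of consuming two of these controls, assuming a comma-separated directory list and a boolean-like flag:

```python
import os

# Assumption: AUTOSKILL_AUTO_IMPORT_DIRS is a comma-separated list of paths
# and AUTOSKILL_AUTO_NORMALIZE_IDS is a "1"/"0"-style flag.
raw_dirs = os.environ.get("AUTOSKILL_AUTO_IMPORT_DIRS", "")
import_dirs = [d.strip() for d in raw_dirs.split(",") if d.strip()]
normalize_ids = os.environ.get("AUTOSKILL_AUTO_NORMALIZE_IDS", "1") == "1"
print(import_dirs, normalize_ids)
```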
API proxy:

```bash
export INTERNLM_API_KEY="YOUR_INTERNLM_API_KEY"
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"
python3 -m examples.openai_proxy --llm-provider internlm --embeddings-provider qwen
```

Discoverability:

```bash
curl http://127.0.0.1:9000/v1/autoskill/capabilities
curl http://127.0.0.1:9000/v1/autoskill/openapi.json
```

OpenAI-compatible endpoints:
- `POST /v1/chat/completions`
- `POST /v1/embeddings`
- `GET /v1/models`
Skill APIs:
- `GET /v1/autoskill/skills`
- `GET /v1/autoskill/skills/{skill_id}`
- `GET /v1/autoskill/skills/{skill_id}/md`
- `PUT /v1/autoskill/skills/{skill_id}/md`
- `DELETE /v1/autoskill/skills/{skill_id}`
- `POST /v1/autoskill/skills/{skill_id}/rollback`
- `GET /v1/autoskill/skills/{skill_id}/versions`
- `GET /v1/autoskill/skills/{skill_id}/export`
- `POST /v1/autoskill/skills/search`
- `POST /v1/autoskill/skills/import`
- `POST /v1/autoskill/conversations/import`
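For example, skill search can be exercised over HTTP; a minimal sketch with `requests`, where the JSON payload shape is an assumption (consult `/v1/autoskill/openapi.json` for the authoritative schema):

```python
import requests

# Hypothetical payload; the real schema is served at /v1/autoskill/openapi.json.
resp = requests.post(
    "http://127.0.0.1:9000/v1/autoskill/skills/search",
    json={"query": "safe release workflow", "user_id": "u1", "limit": 3},
    timeout=30,
)
print(resp.json())
```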
Retrieval and extraction:
- `POST /v1/autoskill/retrieval/preview`
- `POST /v1/autoskill/extractions`
- `POST /v1/autoskill/extractions/simulate`
- `GET /v1/autoskill/extractions/latest`
- `GET /v1/autoskill/extractions`
- `GET /v1/autoskill/extractions/{job_id}`
- `GET /v1/autoskill/extractions/{job_id}/events`
Automated evolution evaluation:

```bash
python3 -m examples.auto_evalution \
  --mode eval \
  --eval-strategy evolution \
  --base-url http://127.0.0.1:9000 \
  --sim-provider qwen \
  --sim-api-key "$AUTOSKILL_PROXY_API_KEY" \
  --sim-model qwen-plus \
  --judge-provider qwen \
  --judge-model qwen-plus \
  --judge-api-key "$AUTOSKILL_PROXY_API_KEY" \
  --report-json ./proxy_eval_report.json
```