Phase 3 roadmap for the Felix Agent SDK. Builds on v0.2.1 (824 tests, 34 exports, 10 examples).
Confirmed
MCP Server (feat/mcp-server)
Wrap Felix workflows as MCP tools so Claude Desktop, Cursor, Windsurf, and other MCP clients can invoke them natively.
Three tools:
- run_workflow — accepts a task description + template name or full config JSON → runs the workflow → returns synthesis, confidence, metadata
- list_templates — returns available workflow templates (research, analysis, review) with descriptions and default configs
- validate_config — validates a workflow config without running it; returns errors/warnings
Uses the official Anthropic mcp SDK. Reuses the existing CLI yaml_loader for config parsing and run_command provider resolution. Adds a felix-mcp console script entry point.
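As a rough sketch of how the three tools could behave, the functions below implement them over a hypothetical in-memory template registry. The TEMPLATES contents, field names, and stub runner are all illustrative, not the SDK's actual API; in the real server each function would be registered as an MCP tool via the official mcp SDK and would delegate to yaml_loader and the workflow runner.

```python
"""Sketch of the three proposed MCP tools as plain functions (hypothetical data)."""
from typing import Any

# Hypothetical stand-in for templates the real server would load via yaml_loader.
TEMPLATES: dict[str, dict[str, Any]] = {
    "research": {"description": "Multi-round research workflow", "max_rounds": 3},
    "analysis": {"description": "Structured analysis workflow", "max_rounds": 2},
    "review":   {"description": "Code/document review workflow", "max_rounds": 1},
}

def list_templates() -> list[dict[str, Any]]:
    """Return available workflow templates with descriptions and defaults."""
    return [{"name": name, **cfg} for name, cfg in TEMPLATES.items()]

def validate_config(config: dict[str, Any]) -> dict[str, list[str]]:
    """Validate a workflow config without running it; return errors/warnings."""
    errors: list[str] = []
    warnings: list[str] = []
    if "task" not in config:
        errors.append("missing required field: task")
    template = config.get("template")
    if template is not None and template not in TEMPLATES:
        errors.append(f"unknown template: {template}")
    if config.get("max_rounds", 1) > 10:
        warnings.append("max_rounds > 10 may be slow and expensive")
    return {"errors": errors, "warnings": warnings}

def run_workflow(task: str, template: str = "research") -> dict[str, Any]:
    """Validate, then run a workflow; returns synthesis, confidence, metadata."""
    report = validate_config({"task": task, "template": template})
    if report["errors"]:
        raise ValueError("; ".join(report["errors"]))
    # The real tool would hand off to the SDK's workflow runner here.
    return {
        "synthesis": f"[stub] result for: {task}",
        "confidence": 0.0,
        "metadata": {"template": template},
    }
```

Keeping the tool bodies as plain functions like this also makes them easy to unit-test without an MCP client attached.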
Community Infrastructure (feat/community)
- Expanded CONTRIBUTING.md: how to add memory backends, observability adapters, and MCP tools; code style guide
- .github/ISSUE_TEMPLATE/: bug report + feature request templates
- .github/PULL_REQUEST_TEMPLATE.md: PR checklist
- New examples: 11_mcp_server.py, 12_vector_search.py, 13_observability.py
- CHANGELOG.md: v0.3.0 section, version bump
Under Discussion — Feedback Requested
Vector DB Backends (feat/vector-backends)
Add semantic similarity search to KnowledgeStore via pluggable vector backends (ChromaDB, Pinecone).
What it adds:
- ChromaBackend(BaseBackend) and PineconeBackend(BaseBackend) implementing the existing 11-method backend interface
- BaseEmbeddingProvider interface with embed(text) → list[float] for provider-agnostic embeddings
- search_vector(table, embedding, limit) method on BaseBackend (optional, default NotImplementedError)
- Wires into KnowledgeStore.semantic_search(), which currently raises NotImplementedError
Open question: Is this the right direction? The current BaseBackend.search_text() with FTS5 handles keyword search well. Adding vector search means users need an embedding model (OpenAI, local sentence-transformers) just to use the memory system. Is the complexity worth it at this stage, or should we focus on making the existing FTS5 search smarter first?
Observability Adapters (feat/observability)
Bridge EventBus events to production monitoring: OpenTelemetry spans/metrics, DataDog custom events/StatsD, Prometheus counters/histograms.
What it adds:
- BaseObservabilityAdapter(ABC) following the EventLogBridge pattern (subscribe to EventBus, transform events, provide detach())
- OpenTelemetryAdapter — workflow = parent span, rounds = child spans, tasks = leaf spans. Counters for rounds/tokens/confidence.
- DataDogAdapter — DogStatsD metrics + custom events with agent_type/phase tags
- PrometheusAdapter — felix_workflow_rounds_total, felix_task_confidence, felix_tokens_total gauges/counters
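The subscribe/transform/detach() shape could look roughly like this. The EventBus stub, event names, and payload fields below are placeholders (the real Felix EventBus and EventLogBridge signatures may differ); the CounterAdapter stands in for the Prometheus-style counters without any client library.

```python
"""Sketch of the EventLogBridge-style adapter pattern (stub bus and events)."""
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal stand-in bus; the real Felix EventBus API may differ."""
    def __init__(self) -> None:
        self._subs: list[Callable[[str, dict], None]] = []

    def subscribe(self, handler: Callable[[str, dict], None]) -> Callable[[str, dict], None]:
        self._subs.append(handler)
        return handler

    def unsubscribe(self, handler: Callable[[str, dict], None]) -> None:
        self._subs.remove(handler)

    def publish(self, event: str, payload: dict) -> None:
        for handler in list(self._subs):
            handler(event, payload)

class BaseObservabilityAdapter(ABC):
    """Subscribe to the bus, transform events, and detach() on shutdown."""
    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._handle = bus.subscribe(self._on_event)

    @abstractmethod
    def _on_event(self, event: str, payload: dict) -> None: ...

    def detach(self) -> None:
        self._bus.unsubscribe(self._handle)

class CounterAdapter(BaseObservabilityAdapter):
    """In-memory stand-in for the proposed Prometheus-style counters."""
    def __init__(self, bus: EventBus) -> None:
        self.counters: dict[str, float] = defaultdict(float)
        super().__init__(bus)

    def _on_event(self, event: str, payload: dict) -> None:
        if event == "round_completed":
            self.counters["felix_workflow_rounds_total"] += 1
        elif event == "task_completed":
            self.counters["felix_tokens_total"] += payload.get("tokens", 0)
```

A real OpenTelemetryAdapter would follow the same skeleton but open/close spans in _on_event instead of bumping counters.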
Open question: Are three adapters too many for v0.3.0? Should we ship just OpenTelemetry (the most universal) and let DataDog/Prometheus be community contributions? Or is there a different monitoring approach (structured JSON logs piped to any backend) that would be more practical?
Branching Strategy
Same as Phase 2: dev/phase-3 integration branch, feature PRs merge into it, final PR to main.
Stats Target
- Tests: 900+
- Public exports: ~42
- Examples: 13