## Summary
Replace custom Anthropic API calls in the Go worker with a dedicated Node.js service using the official Claude SDK. This enables access to SDK-exclusive features (prompt caching, better streaming, extended thinking) while preserving the existing Go infrastructure.
## Background
The current Go worker (~19K LOC) handles:

- **LLM calls** (~4K LOC in `pkg/llm/`) - Anthropic Claude + Groq
- **Helm SDK integration** - Chart parsing, rendering, validation
- **PostgreSQL persistence** - Workspaces, plans, files, chat history
- **Real-time updates** - Centrifugo for streaming to frontend
- **Queue/listener infrastructure** - Event-driven processing
The Anthropic Go SDK is community-maintained and lacks features available in the official Node.js/Python SDKs:
- Prompt caching
- Extended thinking (Claude 3.7)
- Better streaming primitives
- Faster feature parity with API releases
## Proposed Architecture

```
┌─────────────────────────────────────────────────────┐
│                      Go Worker                      │
│                                                     │
│  ┌────────────┐  ┌─────────────┐  ┌─────────────┐   │
│  │  Helm SDK  │  │ PostgreSQL  │  │ Centrifugo  │   │
│  │ Rendering  │  │ Persistence │  │  Real-time  │   │
│  └────────────┘  └─────────────┘  └─────────────┘   │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │               LLM Client (HTTP)               │  │
│  │  • Calls Node service for Claude              │  │
│  │  • Handles streaming via SSE/WebSocket        │  │
│  │  • Groq calls remain in Go (optional)         │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘
                           │
                           │ HTTP/SSE
                           ▼
┌─────────────────────────────────────────────────────┐
│               Node.js Claude Service                │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │           Official @anthropic-ai/sdk          │  │
│  │                                               │  │
│  │  • Streaming responses (SSE to Go)            │  │
│  │  • Prompt caching support                     │  │
│  │  • Extended thinking (Claude 3.7)             │  │
│  │  • Tool use / function calling                │  │
│  └───────────────────────────────────────────────┘  │
│                                                     │
│  Endpoints:                                         │
│    POST /v1/messages        - Standard completion   │
│    POST /v1/messages/stream - Streaming completion  │
│    GET  /health             - Health check          │
└─────────────────────────────────────────────────────┘
```
## API Interface

### Request (Go → Node)

```typescript
interface ClaudeRequest {
  model: string;            // "claude-3-7-sonnet-20250219"
  system?: string;          // System prompt
  messages: Message[];      // Conversation history
  max_tokens: number;
  tools?: Tool[];           // For tool use
  stream?: boolean;         // Enable streaming

  // SDK-specific features
  prompt_caching?: {
    cache_system?: boolean; // Cache system prompt
    cache_tools?: boolean;  // Cache tool definitions
  };
  thinking?: {              // Extended thinking (3.7)
    enabled: boolean;
    budget_tokens?: number;
  };
}

interface Message {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}
```
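On the Node side, this request shape has to be translated into the parameters the official SDK's `client.messages.create()` accepts, where prompt caching is expressed as `cache_control` breakpoints on system/tool blocks and extended thinking as a `thinking` block. A sketch of that mapping, assuming a hypothetical `toSdkParams` helper with simplified message/tool types inlined rather than imported from `@anthropic-ai/sdk` (the 1024-token default thinking budget is an assumption, not a requirement of the service):

```typescript
interface ClaudeRequest {
  model: string;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  max_tokens: number;
  tools?: { name: string; description?: string; input_schema: object }[];
  prompt_caching?: { cache_system?: boolean; cache_tools?: boolean };
  thinking?: { enabled: boolean; budget_tokens?: number };
}

function toSdkParams(req: ClaudeRequest): Record<string, unknown> {
  const params: Record<string, unknown> = {
    model: req.model,
    max_tokens: req.max_tokens,
    messages: req.messages,
  };
  if (req.system !== undefined) {
    // The SDK accepts `system` as an array of text blocks, which is where
    // cache_control breakpoints are attached for prompt caching.
    params.system = req.prompt_caching?.cache_system
      ? [{ type: "text", text: req.system, cache_control: { type: "ephemeral" } }]
      : req.system;
  }
  if (req.tools?.length) {
    // A cache breakpoint on the last tool caches all tool definitions up to it.
    params.tools = req.prompt_caching?.cache_tools
      ? req.tools.map((t, i) =>
          i === req.tools!.length - 1
            ? { ...t, cache_control: { type: "ephemeral" } }
            : t
        )
      : req.tools;
  }
  if (req.thinking?.enabled) {
    // Assumed default budget; callers should set budget_tokens explicitly.
    params.thinking = { type: "enabled", budget_tokens: req.thinking.budget_tokens ?? 1024 };
  }
  return params;
}
```

Keeping this translation a pure function keeps it unit-testable without an API key, and isolates SDK-specific shapes from the wire format the Go worker sees.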
### Response (Node → Go)

Non-streaming:

```typescript
interface ClaudeResponse {
  id: string;
  content: ContentBlock[];
  model: string;
  stop_reason: string;
  usage: {
    input_tokens: number;
    output_tokens: number;
    cache_creation_input_tokens?: number;
    cache_read_input_tokens?: number;
  };
}
```

Streaming (SSE):

```
event: message_start
data: {"type":"message_start","message":{...}}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"text":"Hello"}}

event: message_stop
data: {"type":"message_stop"}
```
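The streaming endpoint would frame events from the SDK's message stream in exactly this format. A minimal sketch, with `sseFrame` as a hypothetical helper:

```typescript
// Render one server-sent event frame.
function sseFrame(event: string, data: object): string {
  // Each SSE frame is "event: <name>\ndata: <json>\n\n"; the blank line
  // terminates the frame.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// In the streaming handler (sketch; assumes the SDK's message stream is
// iterated and each raw event forwarded to the HTTP response):
//
//   const stream = client.messages.stream(params);
//   for await (const event of stream) res.write(sseFrame(event.type, event));
//   res.end();
```

Forwarding the SDK's events verbatim keeps the Go client's SSE parser aligned with Anthropic's own event vocabulary (`message_start`, `content_block_delta`, `message_stop`, ...).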
## Migration Plan

### Phase 1: Create Node Service (New Package)

- New `claude-service/` directory
- `/v1/messages` endpoint (non-streaming)
- `/v1/messages/stream` endpoint (SSE)
### Phase 2: Go Client Abstraction

- `pkg/llm/claude/client.go` - HTTP client for the Node service
### Phase 3: Migrate LLM Calls (One at a Time)

Files to migrate in `pkg/llm/`:

| File | Priority | Streaming | Notes |
|------|----------|-----------|-------|
| `conversational.go` | High | Yes | Main chat flow |
| `execute-action.go` | High | Yes | Core action execution |
| `execute-plan.go` | High | Yes | Plan execution |
| `initial-plan.go` | High | Yes | Plan creation |
| `plan.go` | High | Yes | Plan updates |
| `expand.go` | Medium | No | Prompt expansion |
| `cleanup-converted-values.go` | Medium | No | Values cleanup |
| `convert-file.go` | Medium | No | File conversion |
| `summarize.go` | Low | No | Can stay Groq |
| `intent.go` | Low | No | Uses Groq, can stay |
### Phase 4: Enable SDK Features

### Phase 5: Cleanup
## Deployment Considerations

### Local Development

```yaml
# docker-compose.yml addition
services:
  claude-service:
    build: ./claude-service
    ports:
      - "3100:3100"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
```
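Since orchestration will depend on health checks (see Production below), the compose entry could also declare one against the service's `/health` endpoint. A sketch, assuming `curl` is available in the image:

```yaml
# added under services.claude-service
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3100/health"]
  interval: 10s
  timeout: 3s
  retries: 3
```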
### Production
- Deploy as sidecar container in same pod (lowest latency)
- Or as separate service if horizontal scaling needed
- Health checks required for orchestration
## Open Questions

- **Groq calls** - Move to Node too, or keep in Go?
- **Embedding calls** - Currently using Voyage API; include in Node service?
- **Protocol** - HTTP/SSE vs gRPC vs WebSocket for streaming?
- **Caching layer** - Add Redis for prompt cache persistence across restarts?
## Success Metrics

## References