Developer-facing notes on the internal design. Read this before making non-trivial changes — several subsystems have sharp edges that aren't obvious from the code alone.
Windsurf-to-OpenAI compatible proxy. Headless Node.js service that authenticates
with Codeium's backend via the Windsurf language server binary (language_server_linux_x64)
over gRPC, and exposes /v1/chat/completions to OpenAI-compatible clients.
node src/index.js # prod entry
node --watch src/index.js # dev (npm run dev)Requires Node >= 20. Zero npm deps — the whole project is pure node:* builtins.
Language server binary must exist at config.lsBinaryPath (default
/opt/windsurf/language_server_linux_x64). On Windows the LS won't start, but
the HTTP server and dashboard still come up for local dev of non-chat paths.
src/
index.js - entry point, boots LS + HTTP server
server.js - HTTP server, route dispatch, streaming
config.js - env + defaults (.env overrides)
auth.js - account pool, RPM tracking, tier/blocklist, credit refresh
models.js - model catalog + tier->models table (MODEL_TIER_ACCESS)
langserver.js - LS pool: one binary per unique egress proxy
client.js - WindsurfClient: StartCascade / Send / poll trajectory
windsurf.js - protobuf builders + parsers (exa.language_server_pb, exa.cortex_pb)
proto.js - minimal varint/length-prefixed reader/writer
grpc.js - gRPC-over-HTTP2 unary call helper
connect.js - Firebase + api.codeium.com sign-in (register_user)
conversation-pool.js - experimental cascade_id reuse pool (OFF by default)
runtime-config.js - runtime-config.json: experimental toggles
cache.js - in-memory exact-body response cache
handlers/
chat.js - /v1/chat/completions: model routing, retry, tool emulation
models.js - /v1/models
tool-emulation.js - prompt-level <tool_call> protocol for Cascade (no native slot)
dashboard/
api.js - /dashboard/api/* admin routes
index.html - single-page admin UI (shadcn-style dark theme)
logger.js - ring buffer + SSE log stream
proxy-config.js - global + per-account proxy config (proxy-config.json)
model-access.js - global model allow/blocklist (model-access.json)
stats.js - request counters
windsurf-login.js - direct Windsurf email+password sign-in flow
Request flow (chat): server.js → handlers/chat.js → account pick
(auth.getApiKey(tried, modelKey)) → langserver.getLsFor(proxy) → WindsurfClient.cascadeChat()
or .rawGetChatMessage() → gRPC unary calls to LanguageServerService/StartCascade,
SendUserCascadeMessage, polling GetCascadeTrajectorySteps/GetCascadeTrajectory.
Cascade vs legacy: Models with a modelUid go through the Cascade flow.
Models with only enumValue > 0 (no modelUid) use legacy RawGetChatMessage.
Newer models (gemini-3.0, gpt-5.2, etc.) have both enumValue AND modelUid —
they MUST use Cascade because the LS binary rejects their high enum values in the
legacy proto endpoint with "cannot parse invalid wire-format data".
LS pool: one LS process per unique outbound proxy URL. Mixing accounts with
different proxies in a single LS causes silent state pollution — InitializeCascadePanelState
starts failing with "The pending stream has been canceled" and all accounts look "expired".
Always route an account through getLsFor(acct.proxy).
Tool emulation: Cascade's protobuf has no per-request slot for client-defined
tool schemas (verified against the on-disk exa.cortex_pb.proto — SendUserCascadeMessageRequest
fields 1–6 are cascade_id / items / metadata / experiment_config / cascade_config / images,
nothing for tool defs). When a caller passes OpenAI-format tools[], we serialize
them into the user text as a <tool_call>{...}</tool_call> emission contract and
parse blocks back out of the Cascade text stream. Lives in src/handlers/tool-emulation.js.
Planner mode is load-bearing. buildCascadeConfig() sets
CascadeConversationalPlannerConfig.planner_mode = 3 (NO_TOOL) — not the
DEFAULT = 1 that all reference repos (pqhaz3925, AlexStrNik) use. The enum
exa.codeium_common.ConversationalPlannerMode has seven values:
UNSPECIFIED=0 DEFAULT=1 READ_ONLY=2 NO_TOOL=3 EXPLORE=4 PLANNING=5 AUTO=6.
DEFAULT keeps Cascade's IDE-agent loop hot even when CascadeToolConfig is
unset, so the planner reflexively fires edit_file /tmp/windsurf-workspace/...
on every turn, producing (a) 20 % stall_warm false-positives from silent
tool-execution trajectory steps where responseText stops growing while
status stays ACTIVE, (b) "Cascade cannot create foo because it already exists" errors on bursts that reuse filenames, (c) /tmp/windsurf-workspace/
path leaks narrated into response bodies. NO_TOOL skips the loop entirely.
Stress-tested 2026-04 on a remote host — 15-way opus concurrency went from
13/15 → 15/15 success, 99 s → 35 s wall time, 0 path leaks, 0 filename conflicts.
DO NOT confuse planner_mode with CascadeToolConfig.run_command. They
are different fields — planner_mode lives on CascadeConversationalPlannerConfig
(field 4), run_command lives on CascadeToolConfig (field 8 of the tool
config). NO_TOOL is the flag to set; tool_config must stay unset — turning
run_command on puts the agent into auto-execute mode and breaks things worse.
Single page at /dashboard. Auth: bearer token = config.API_KEY or configured
dashboard password. 8 panels: 總覽 / 登入取號 / 帳號管理 / 模型控制 / Proxy / 日誌 / 統計 / 封禁偵測.
Persisted state lives in JSON files next to src/:
accounts.json— account pool (tier, capabilities, blockedModels, credits, proxy)proxy-config.json— global + per-account proxy URLsmodel-access.json— global model allow/blocklistruntime-config.json— experimental toggles (cascadeConversationReuse, etc.)
- Language: Dashboard UI (
src/dashboard/index.html) uses 简体中文. README anddocs/GitHub Pages use 繁體中文. Code identifiers and comments stay in English. - Dashboard UI: shadcn-style dark theme via CSS variables (
--surface,--accent,--radius). NEVER use browseralert()/confirm()/prompt(). UseApp.confirm(title, desc, opts)andApp.prompt(title, desc, fields)— they render styled modal overlays.App.confirmsupportsopts.html,opts.wide,opts.titleHtml,opts.danger,opts.okText. - No npm deps. Stick to
node:*builtins. If you need protobuf, hand-roll it inproto.js. - Errors that must not burn account error counters: rate limits,
permission_denied,failed_precondition, and upstream "internal error occurred (error ID: ...)". SeereportInternalError/markRateLimitedinauth.js. - Logs: use
log.info/warn/error/debugfromconfig.js— these feed the dashboard log panel viadashboard/logger.js. - Git line endings: the repo has
text=auto; Windows checkouts will complain about LF→CRLF on save. Ignore the warnings.
Two supported flows; both avoid the zombie-process trap you get from pm2 restart:
PM2 (common):
pm2 stop windsurf-api && pm2 delete windsurf-api
fuser -k 3003/tcp 2>/dev/null
sleep 2
pm2 start src/index.js --name windsurf-api --cwd /path/to/WindsurfAPIDocker: docker compose up -d --build — see Dockerfile and docker-compose.yml.
You still need to mount the Windsurf Language Server binary into the container
(/opt/windsurf/language_server_linux_x64); it cannot be bundled for license reasons.
Do NOT use pm2 restart windsurf-api — on at least some Node/PM2 combos it leaves
the old process alive holding port 3003, and the new one silently falls back to
a different port without warning.
- Free tier only serves
gpt-4o-miniandgemini-2.5-flash— all Claude and premium models returnpermission_denied. Hardcoded inMODEL_TIER_ACCESS.free. - Workspace wipe on startup —
src/index.jsrunsrm -rf /tmp/windsurf-workspace/*before the LS starts. If you skip this, files created by Cascade's baked-in file-editing tools persist across restarts and the model starts narrating "edits" to files the caller never mentioned. responseTextvsmodifiedText: while streaming, preferresponseText(append-only). Only top up withmodifiedTextat idle if it's a strict prefix extension ofresponseText. See the big comment block inclient.jsaround the cascade polling loop.- Firebase API key for Windsurf auth is
AIzaSyDsOl-1XpT5err0Tcnx8FFod1H8gVGIycY(extracted fromhttps://windsurf.com/_next/static/chunks/46097-*.js). Three other keys have been tried and confirmed to NOT work — do not rotate to them. - Protobuf field numbers are hand-rolled in
src/windsurf.jsandsrc/proto.js. When Windsurf ships a new model or capability, diff against the LS binary'sexa.cortex_pb/exa.language_server_pbproto descriptors before adding fields — the varint wire format is unforgiving of silent schema drift.