Problem Statement
PR #97/#98 landed a stabilized LLM Binance Futures flow, but the input the model receives is still just 50 OHLCV candles plus symbol/interval. It is effectively deciding long/short and TP/SL from raw price alone, which means:
The model has to re-derive trend strength (EMA alignment / MACD momentum) from scratch on every request, which often ends in arithmetic mistakes.
Knowing nothing about volatility (ATR, Bollinger width), it sets TP/SL distances that are unrealistically tight or wide relative to actual market volatility.
It gets no sentiment context at all (funding rate / OI trend), missing a key input for judging whether a trend will persist.
Even when the user picks candleCount=50 on the slider, long-lookback indicators like EMA200 cannot be derived reliably (200+ candles of warmup are needed).
The net effect: the model produces a plausible-sounding line of analysis, but signal consistency and quality are erratic, and post-hoc review shows it missing obvious market signals (e.g., recommending long with RSI at 90, deep overbought).
Solution
We enrich the user prompt sent to the LLM with three layers:
Indicator-enriched candles: each candle row inlines OHLCV plus technical indicators (EMA20/50/200, RSI14, BB(20,2σ), MACD(12/26/9), ATR14), so the model can read from a single row how price and indicators moved at that bar.
Market-structure summary: a dedicated section with the recent swing high/low and an ATR-based recommended SL distance, giving TP/SL calculation an objective anchor.
Futures sentiment section: a time series of recent Funding Rate settlements plus the 24h Open Interest trend, each at its natural cadence, feeding the reasoning about trend persistence vs. reversal.
On the UI side, to reduce confusion, the candleCount label is clarified to "프롬프트 캔들 수" (prompt candle count) and its range narrowed to 30–120 (default 60), where the indicators stay meaningful. The backend always fetches 220+ candles even when the user picks 60, guaranteeing EMA200 stability.
User Stories
As a trader, I want each candle in the prompt to come with its EMA20/50/200 values so the LLM can immediately see whether price is above or below trend lines without recomputing.
As a trader, I want RSI14 in every candle row so the LLM can spot overbought/oversold sequences (e.g., "RSI was 85 three bars ago, now 60 — momentum cooling") without me asking.
As a trader, I want Bollinger Band upper/lower in each row so the LLM understands when price is statistically extended.
As a trader, I want MACD line + signal line in each row so the LLM can identify crossovers and divergences.
As a trader, I want ATR14 in each row so the LLM can calibrate TP/SL distance to actual volatility (no more "0.1% TP on a 2% ATR coin").
As a trader, I want a market-structure summary block with recent swing high/low so the LLM has hard reference points for SL placement.
As a trader, I want a recommended SL distance (e.g., 1.5× ATR) suggested in the structure block so the LLM has a sensible default to work from.
As a trader, I want the current Funding Rate plus its last 8 settlements so the LLM knows whether longs or shorts are paying and whether that's a recent shift or sustained.
As a trader, I want the current Open Interest plus a 24h history series so the LLM can see whether new money is flowing in or out.
As a trader, I want timestamps in human-readable ISO format (truncated to interval precision) so the LLM can reason about session boundaries (Asia open, US close, weekend) and the 8h funding cycle.
As a trader, I want indicators that need more lookback than I requested (e.g., EMA200 when I only asked for 60 candles) to still be accurate, because the backend silently fetched the warmup window.
As a trader, I want the slider to default to 60 and cap at 120 so I don't accidentally request a value where indicators are unstable or attention degrades.
As a trader, I want the slider's label to say "프롬프트 캔들 수" so I know it controls what the LLM sees, not what gets computed.
As a trader, I want a brand-new symbol with insufficient history to still get a signal — indicators that can't be computed appear as null in rows, and the system prompt tells the LLM how to interpret null.
As a trader, I want the LLM's reasoning to explicitly cite indicators when relevant ("RSI 78 + price at upper BB → likely pullback"), so I can audit signal quality post-hoc.
As a trader, I want Funding Rate fetches to be cached (~1h TTL) so rapid back-to-back signal requests don't re-hit Binance and risk rate limits.
As a trader, I want OI fetches to be cached (~1min TTL) so the same applies for high-frequency requests.
As a trader, I want a sentiment fetch failure (Binance temporarily 503) to NOT block my signal — the field appears as null and the LLM proceeds in degraded mode.
As a trader, I want the indicator computation logic to be unit-tested against fixed candle inputs so it doesn't silently regress when libraries upgrade.
As a developer, I want the technical-indicator computation to be a deep, reusable module so the worker (already using technicalindicators) and the api-server (signal flow) share the same implementation.
As a developer, I want each market-context section (candles, structure, sentiment) to be assembled by a single MarketContextBuilder method so the prompt builder stays simple and the build pipeline is testable end-to-end.
As a developer, I want the system prompt to declare the new schema explicitly (field names, units, null semantics) so the LLM doesn't hallucinate a different interpretation.
As an operator, I want indicator computation latency logged so I can confirm the pipeline doesn't blow the per-signal SLO.
As an operator, I want a structured log line per signal request showing how many candles were fetched, computed, sent in prompt, and which sentiment fields were null — so I can debug "why did this signal look weird".
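For instance, that per-request log line might look like the following sketch (the logger call and field names are assumptions, not the implemented shape):

```ts
// Illustrative structured log per signal request (field names assumed).
logger.info('llm_signal.market_context', {
  symbol: 'BTCUSDT', interval: '1h',
  candlesFetched: 260, candlesComputed: 260, candlesInPrompt: 60,
  nullSentimentFields: ['openInterest'], // degraded mode on this request
  indicatorComputeMs: 12,                // for the per-signal SLO check
});
```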
Implementation Decisions
New shared module
New workspace package @coin/indicators (extracts the existing technicalindicators dep from worker-service to a shared location). Surface a single deep entry point — computeIndicators(candles): IndicatorSeries — that returns all required series (EMA20/50/200, RSI14, BB(20,2σ), MACD(12/26/9), ATR14) aligned with the input. Internally wraps technicalindicators so the upstream library is not leaked beyond the package boundary.
Caller never sees raw library types; the package owns its own IndicatorSeries + IndicatorRow types in @coin/types.
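A minimal sketch of the package surface, assuming the documented technicalindicators calculate() APIs; the Candle import and the exact IndicatorSeries field names are illustrative, not the final types:

```ts
// @coin/indicators: single deep entry point; wraps `technicalindicators`
// so the upstream library never leaks past the package boundary.
import { ATR, BollingerBands, EMA, MACD, RSI } from 'technicalindicators';
import type { Candle, IndicatorSeries } from '@coin/types';

// Left-pad a series with nulls so index i always lines up with candles[i]
// (technicalindicators drops the warmup region from its output).
const align = <T>(len: number, series: T[]): (T | null)[] => [
  ...Array<null>(len - series.length).fill(null),
  ...series,
];

export function computeIndicators(candles: Candle[]): IndicatorSeries {
  const close = candles.map((c) => c.close);
  const high = candles.map((c) => c.high);
  const low = candles.map((c) => c.low);
  const n = candles.length;

  const macd = MACD.calculate({
    values: close,
    fastPeriod: 12, slowPeriod: 26, signalPeriod: 9,
    SimpleMAOscillator: false, SimpleMASignal: false,
  });
  const bb = BollingerBands.calculate({ period: 20, stdDev: 2, values: close });

  return {
    ema20: align(n, EMA.calculate({ period: 20, values: close })),
    ema50: align(n, EMA.calculate({ period: 50, values: close })),
    ema200: align(n, EMA.calculate({ period: 200, values: close })),
    rsi14: align(n, RSI.calculate({ period: 14, values: close })),
    bbUpper: align(n, bb.map((b) => b.upper)),
    bbMiddle: align(n, bb.map((b) => b.middle)),
    bbLower: align(n, bb.map((b) => b.lower)),
    macd: align(n, macd.map((m) => m.MACD ?? null)),
    macdSignal: align(n, macd.map((m) => m.signal ?? null)),
    atr14: align(n, ATR.calculate({ period: 14, high, low, close })),
  };
}
```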
Market context builder
New service in api-server: MarketContextService.build({ symbol, interval, promptCandleCount }) returns { enrichedCandles, structure, sentiment }.
Internally orchestrates four steps:
Fetch max(220, promptCandleCount + warmup) candles via existing BinanceRest.getCandles.
Run computeIndicators over the full window.
Build the prompt window: take the last promptCandleCount candles + their aligned indicator values into one EnrichedCandleRow[].
Compute structure (swing high/low over the prompt window, ATR-derived suggested SL distance) and sentiment (funding + OI series via new adapter methods, with Redis caching and degraded-mode fallback).
Single integration surface for LlmTradesService.signal — it just calls MarketContextService.build(...) and forwards the bundle to LlmCliService.decide.
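A condensed sketch of build() under the decisions above; truncateIso, fetchFundingCached/fetchOiCached, and the BuildArgs/MarketContext types are hypothetical placeholders, and swing detection is simplified:

```ts
import { computeIndicators } from '@coin/indicators';
import type { BuildArgs, IExchangeRest, MarketContext } from '@coin/types';

const WARMUP = 200; // EMA200 needs 200 prior bars to stabilize

class MarketContextService {
  constructor(private readonly rest: IExchangeRest) {}

  async build({ symbol, interval, promptCandleCount }: BuildArgs): Promise<MarketContext> {
    // 1. Fetch enough history for warmup, regardless of the slider value.
    const fetchCount = Math.max(220, promptCandleCount + WARMUP);
    const candles = await this.rest.getCandles(symbol, interval, fetchCount);

    // 2. Indicators over the full window, null-padded to align with input.
    const series = computeIndicators(candles);

    // 3. Prompt window: last N candles zipped with their indicator values.
    const offset = candles.length - promptCandleCount;
    const window = candles.slice(offset);
    const enrichedCandles = window.map((c, i) => ({
      t: truncateIso(c.openTime, interval), // hypothetical helper, see Further Notes
      o: c.open, h: c.high, l: c.low, c: c.close, v: c.volume,
      ema20: series.ema20[offset + i], // ...ema50/200, rsi14, bb_*, macd*, atr14
    }));

    // 4a. Structure: swing points over the prompt window + ATR-based SL hint.
    const lastClose = window[window.length - 1].close;
    const atr = series.atr14[series.atr14.length - 1] ?? 0;
    const structure = {
      swingHigh: Math.max(...window.map((c) => c.high)), // *At timestamps omitted here
      swingLow: Math.min(...window.map((c) => c.low)),
      atrPct: (atr / lastClose) * 100,
      suggestedSlPct: (1.5 * atr / lastClose) * 100, // the 1.5× ATR default
    };

    // 4b. Sentiment: degraded mode, a failed fetch becomes null, never an error.
    const [funding, oi] = await Promise.allSettled([
      this.fetchFundingCached(symbol),
      this.fetchOiCached(symbol),
    ]);
    const sentiment = {
      fundingRate: funding.status === 'fulfilled' ? funding.value : null,
      openInterest: oi.status === 'fulfilled' ? oi.value : null,
    };

    return { enrichedCandles, structure, sentiment };
  }
}
```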
Adapter additions
New methods on IExchangeRest (and BinanceRest impl):
getFundingRateHistory(symbol, opts) → backed by /fapi/v1/fundingRate. Returns last N settlements with timestamp + rate.
getCurrentFundingRate(symbol) → backed by /fapi/v1/premiumIndex. Returns the next-settlement projected rate.
getOpenInterest(symbol) → backed by /fapi/v1/openInterest. Returns the current snapshot.
getOpenInterestHistory(symbol, period, limit) → backed by /futures/data/openInterestHist. Returns the recent OI history at the specified period (e.g., 5m).
All four are public (no signature required) — signedRequest not needed.
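As a sketch of the adapter shape, one of the four methods might look like this, assuming a hypothetical unsigned publicRequest helper on BinanceRest; the endpoint and its response fields are Binance's documented public futures API:

```ts
// BinanceRest: funding-rate history via the public (unsigned) endpoint.
async getFundingRateHistory(
  symbol: string,
  opts: { limit?: number } = {},
): Promise<{ t: string; rate: number }[]> {
  // GET /fapi/v1/fundingRate?symbol=BTCUSDT&limit=8, no signature required.
  const rows = await this.publicRequest('/fapi/v1/fundingRate', {
    symbol,
    limit: opts.limit ?? 8,
  });
  // Binance returns fundingRate as a string and fundingTime as epoch ms.
  return rows.map((r: { fundingTime: number; fundingRate: string }) => ({
    t: new Date(r.fundingTime).toISOString(),
    rate: Number(r.fundingRate),
  }));
}
```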
Caching
Redis keys: funding:current:{symbol} (TTL 1h), funding:history:{symbol} (TTL 1h), oi:current:{symbol} (TTL 60s), oi:history:{symbol}:{period} (TTL 60s).
Read-through caching in MarketContextService; a cache miss falls through to the adapter.
Prompt schema
Each enrichedCandles row carries: t (ISO 8601 truncated to interval precision — e.g., 1m → 2026-05-02T15:23:00Z, 1d → 2026-05-02), o, h, l, c, v, ema20, ema50, ema200, rsi14, bb_u, bb_m, bb_l, macd, macd_sig, atr14. Indicator values that can't be computed yet (warmup) appear as null.
structure: { swingHigh, swingHighAt, swingLow, swingLowAt, atrPct, suggestedSlPct }. Scoped to the prompt window.
sentiment.fundingRate: { current, lastSettlements: [{t, rate}, ...up to 8], avg7d }.
sentiment.openInterest: { currentUsdt, change24hPct, history: [{t, oi}, ...up to 24] } at 1h granularity.
Each section is wrapped in a top-level key in the user prompt — {symbol, interval, candles: [...], structure: {...}, sentiment: {...}} — so the schema is unambiguous.
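An illustrative instance of the payload (all values invented; field names as specified above):

```ts
// Example user-prompt payload (values are invented; schema as above).
const userPrompt = {
  symbol: 'BTCUSDT',
  interval: '1h',
  candles: [
    // ...59 earlier rows...
    { t: '2026-05-02T15:00:00Z', o: 64100, h: 64550, l: 63980, c: 64420, v: 1843.2,
      ema20: 64010.5, ema50: 63550.1, ema200: 61200.8, rsi14: 63.4,
      bb_u: 64800.2, bb_m: 63900.4, bb_l: 63000.6,
      macd: 120.3, macd_sig: 98.7, atr14: 410.2 },
  ],
  structure: {
    swingHigh: 65210, swingHighAt: '2026-05-01T21:00:00Z',
    swingLow: 62340, swingLowAt: '2026-04-30T04:00:00Z',
    atrPct: 0.64, suggestedSlPct: 0.96, // 1.5 × ATR
  },
  sentiment: {
    fundingRate: {
      current: 0.0001,
      lastSettlements: [{ t: '2026-05-02T08:00:00Z', rate: 0.00012 } /* ...up to 8 */],
      avg7d: 0.00009,
    },
    openInterest: {
      currentUsdt: 8_410_000_000, change24hPct: 3.2,
      history: [{ t: '2026-05-01T16:00:00Z', oi: 8_150_000_000 } /* ...up to 24 */],
    },
  },
};
```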
System prompt update
Augment TRADING_SYSTEM_PROMPT with:
A schema description for the new fields and units.
Explicit "null = insufficient data, do not guess" rule.
Hint that the trader should reference indicators / structure / sentiment in reasoning when relevant.
Recommendation that TP/SL distances be calibrated against structure.suggestedSlPct unless the trader has a specific reason to deviate.
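One possible shape for the addendum (wording illustrative, not the final prompt text):

```ts
// Appended to TRADING_SYSTEM_PROMPT (illustrative wording).
const SCHEMA_ADDENDUM = `
INPUT SCHEMA
- candles[]: t (ISO 8601, UTC), o/h/l/c (price), v (base volume),
  ema20/ema50/ema200, rsi14, bb_u/bb_m/bb_l, macd/macd_sig, atr14.
- structure: swingHigh/swingLow (+ *At timestamps), atrPct, suggestedSlPct (% of price).
- sentiment: fundingRate {current, lastSettlements, avg7d},
  openInterest {currentUsdt, change24hPct, history}.

RULES
- null means insufficient data. Do not guess; never treat null as 0 or neutral.
- Cite concrete indicator/structure/sentiment values in reasoning when they drive the call.
- Calibrate TP/SL against structure.suggestedSlPct unless you state a reason to deviate.
`;
```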
UI
candleCount slider in LlmTradeForm — relabel to "프롬프트 캔들 수", min=30, max=120, default=60. Existing user-stored preferences outside the new range get clamped on first render (see the sketch after this section).
Tooltip explains that the backend always fetches enough history for accurate indicators regardless of this number.
No other UI surface changes — the enriched prompt is invisible to the user; only signal quality changes.
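The first-render clamp can be a one-liner (sketch; storedPreference is an assumed name):

```ts
// LlmTradeForm: clamp a stored preference into the new 30–120 range.
const CANDLE_MIN = 30, CANDLE_MAX = 120, CANDLE_DEFAULT = 60;
const initialCandleCount = Math.min(
  CANDLE_MAX,
  Math.max(CANDLE_MIN, storedPreference?.candleCount ?? CANDLE_DEFAULT),
);
```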
Compatibility
No DB schema changes.
Existing LlmDecisionLog rows continue to store the prompt as plain text — the new payload just makes the prompt longer, not differently shaped at the DB level.
No frontend type changes (signal response shape unchanged).
Testing Decisions
External-behavior tests only.
@coin/indicators unit (Vitest): given a fixed set of candle fixtures, computeIndicators returns expected EMA/RSI/BB/MACD/ATR values matching a reference vector (snapshot of technicalindicators output that we own). Catches accidental upgrade-driven regressions.
MarketContextService.build unit (Vitest, mocked adapter + Redis): given mocked candle/funding/OI responses, returns the documented schema. Sentiment section is null-filled when adapter throws (degraded mode). Indicator rows with insufficient history have null entries. Cache is hit on the second call with the same args.
BinanceRest.getFundingRateHistory / getOpenInterestHistory unit: mocked HTTP returns the expected mapped shape; URL contains the expected query params.
LlmCliService.decide tests: existing tests stay green. One added fixture confirms the LLM JSON parser still works against a prompt that contains the new schema (no regression in response handling; the change is upstream of decide).
End-to-end smoke test (existing pattern): LlmTradesService.signal returns a LlmDecision with non-null TP/SL when MarketContextService is wired with a mocked Binance backend that returns full candle/funding/OI fixtures.
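A sketch of the @coin/indicators regression test (fixture path and series field names assumed):

```ts
// @coin/indicators: pin output to a reference vector we own, so a
// `technicalindicators` upgrade that changes values fails loudly.
import { describe, expect, it } from 'vitest';
import { computeIndicators } from '@coin/indicators';
import candles from './fixtures/btcusdt-1h-260.json'; // hypothetical fixture

describe('computeIndicators', () => {
  it('matches the owned reference vector', () => {
    const series = computeIndicators(candles);
    // Warmup region is null and aligned with input indices:
    // EMA200 first resolves at index 199 (the 200th candle).
    expect(series.ema200[198]).toBeNull();
    expect(series.ema200[199]).not.toBeNull();
    // Full series pinned via snapshot (the "reference vector" we own).
    expect(series).toMatchSnapshot();
  });
});
```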
Prior art:
apps/worker-service/src/orders/sagas/close-position-saga.test.ts — hoisted-mock idiom for adapter mocks.
apps/worker-service/src/orders/reconciler/reconcile-order.test.ts — testing a deep, dep-injected module.
apps/api-server/src/portfolio/portfolio.service.test.ts — Redis + Prisma mocks alongside an adapter mock.
Out of Scope
Order book depth (bid/ask imbalance) — useful but requires a different data feed (@depth WebSocket) and adds complexity. Defer.
On-chain whale-flow data (Glassnode / Nansen) — out of scope, requires paid third-party APIs.
Per-user-customizable indicator parameters (e.g., user picks RSI period) — not requested; standard parameters keep the prompt schema stable.
Multi-timeframe context (e.g., 1h candles when prompting on 5m interval) — meaningful but doubles fetch + token cost. Defer to a follow-up PRD if signal quality demands it.
Auto-tuning of candleCount per interval — kept manual for now; users get a sensible default (60) and a constrained range.
Switching to a non-OAuth Anthropic API path — orthogonal; current Pro/Max OAuth pipeline is fine for added prompt size.
Storing computed indicators in the DB — they're derived data, recomputable; no need for persistence.
Further Notes
Token budget reality check: a 60-row enriched prompt (~200 chars/row × 60 = ~12K chars) plus structure + sentiment ≈ ~15K chars / ~4K tokens. Sonnet 4.6 has 200K context. Pro/Max rate limits are dominated by request count and time, not token count, so per-request token growth is essentially free. The grilling concluded that token economy is not a real constraint here.
Why inline indicators in every row instead of separate aligned arrays: the LLM reasoning model handles "this row is the truth at this timestamp" much more reliably than "indicator series at index i corresponds to candle at index i". The token overhead of repeating slowly-changing indicators (EMA200) per row is the cost we pay for prompt clarity — and it's tiny.
Why ISO timestamps with interval-truncated precision: enables the LLM to recognize trading-session, weekday, and 8h-funding-cycle patterns directly in the prompt without arithmetic. Truncation removes noise that doesn't matter for the candle's bucket.
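The truncation itself is trivial; a sketch of the truncateIso helper assumed in the builder sketch above:

```ts
// Truncate an epoch-ms timestamp to the precision of its candle bucket.
function truncateIso(ms: number, interval: string): string {
  const iso = new Date(ms).toISOString();        // 2026-05-02T15:23:45.000Z
  if (interval.endsWith('d')) return iso.slice(0, 10);            // 2026-05-02
  if (interval.endsWith('h')) return iso.slice(0, 13) + ':00:00Z'; // ...T15:00:00Z
  return iso.slice(0, 16) + ':00Z';          // minute intervals: ...T15:23:00Z
}
```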
Why funding/OI live in a separate sentiment section instead of being inlined: funding settles every 8h, so inlining it on a 5m candle prompt would repeat the same value 96 times — pure noise. OI changes by the minute but its long-trend interpretation reads cleaner as a 24-point time series than as a per-row column. The natural-cadence approach matches each data type's information density.
EMA200 warmup: requires 200 prior data points for stability; the backend always fetches at least promptCandleCount + 200 candles regardless of the user's slider. This is invisible to the user but guarantees indicator validity. Symbols with <200 history (very new listings) get null in the relevant rows; the LLM is instructed to treat null as "data unavailable, do not guess".
Cache TTL rationale: Funding rate updates exactly every 8h on Binance Futures, so 1h TTL is conservative-safe for cache; OI changes continuously but a 60s read-through cache matches realistic signal-request cadence (a single trader rarely fires more than a few signals per minute).
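The read-through pattern behind both TTLs, assuming an ioredis-style client (the helper name is illustrative):

```ts
import type Redis from 'ioredis';

// Read-through cache: serve from Redis, else fetch and store with a TTL.
async function cached<T>(
  redis: Redis, key: string, ttlSec: number, fetch: () => Promise<T>,
): Promise<T> {
  const hit = await redis.get(key);
  if (hit !== null) return JSON.parse(hit) as T;
  const value = await fetch();
  await redis.set(key, JSON.stringify(value), 'EX', ttlSec);
  return value;
}

// e.g. cached(redis, `oi:current:${symbol}`, 60, () => rest.getOpenInterest(symbol))
```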
Degraded mode semantics: if Binance's funding or OI endpoint times out, the field surfaces as null in the prompt — the system prompt instructs the LLM to treat null as missing data, not as "0" or "neutral". Signal still goes out. We log a warning so ops can spot recurring degradation.