feat: fee-adjusted edge, fractional-Kelly sizing, and risk-assessment endpoints #7
Open
JonathanW666 wants to merge 14 commits into
Three pure modules used by the signal and arbitrage code. fees.ts models per-platform taker fees with a bounded adverse-execution slippage term. edge.ts returns signed edge, fractional Kelly, and breakeven probability for a bet at a given price. risk.ts wraps both to produce EV, variance, and a TAKE/SCALE_DOWN/AVOID call on a proposed trade. No I/O in any of them; consumed by the commits that follow.
Replaces the bag-of-words scorer. Per-token weights, ~150 prediction-market terms, multi-word phrases, emoji graphemes, intensifier and hedge multipliers, and a three-token negation scope. ALL-CAPS and trailing "!!" bump magnitude. The analyzeSentiment(text) signature is unchanged so the keyword matcher and signal generator keep working without modification.
The previous formula was edge = confidence * |p - price|, which drops the sign. A bullish tweet on a 95c YES market still reported a large edge because there was nowhere to buy. Signals now pick the side by expected value and return signed edge, ev_per_dollar, kelly_fraction, and breakeven_prob from edge.ts.

Other changes here:
- sentimentToProbability range narrowed from ±0.45 to ±0.25; tweet-level evidence rarely justifies a sharper prior.
- The confidence passed to Kelly is capped at 0.6 for lexicon-only signals.
- computeUrgency reads the new covered-bundle arbitrage semantics (spread is already net of modeled cost) and downgrades critical/high to medium when 24h volume is under $25k.
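The side selection described above can be sketched roughly as follows. This is a minimal illustration, not the edge.ts implementation: `pickSide`, `SideChoice`, and the flat per-dollar `cost` parameter are hypothetical names, and the Kelly expression is the standard full-Kelly fraction for a binary contract, ignoring fees.

```typescript
// Hypothetical sketch; names and the flat `cost` term are assumptions.
type Side = "YES" | "NO" | "HOLD";

interface SideChoice {
  side: Side;
  signedEdge: number;    // trueProb - yesPrice, sign preserved
  evPerDollar: number;   // expected net profit per $1 staked on the chosen side
  kellyFraction: number; // full-Kelly fraction of bankroll (fees ignored)
}

// EV per $1 staked on a contract bought at `price` that pays $1 with
// probability `prob`; `cost` is charged per $1 staked.
function evPerDollar(prob: number, price: number, cost: number): number {
  return prob / price - 1 - cost;
}

function pickSide(trueProb: number, yesPrice: number, cost: number = 0): SideChoice {
  const signedEdge = trueProb - yesPrice;
  const evYes = evPerDollar(trueProb, yesPrice, cost);
  const evNo = evPerDollar(1 - trueProb, 1 - yesPrice, cost);
  const best = Math.max(evYes, evNo);
  if (best <= 0) {
    return { side: "HOLD", signedEdge, evPerDollar: 0, kellyFraction: 0 };
  }
  const side: Side = evYes >= evNo ? "YES" : "NO";
  // Probability and price as seen from the chosen side.
  const p = side === "YES" ? trueProb : 1 - trueProb;
  const c = side === "YES" ? yesPrice : 1 - yesPrice;
  // Full Kelly for a binary contract: (p - c) / (1 - c).
  return { side, signedEdge, evPerDollar: best, kellyFraction: (p - c) / (1 - c) };
}
```

With a signed edge, a 95c YES market and an estimated probability of 0.60 resolves to the NO side rather than reporting a large unsigned edge on YES.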
Leaves the covered-bundle detector from PR MusashiBot#2 alone and adds three optional fields on each opportunity: maxStake, expectedDollarProfit, and annualisedReturn. Sizing uses the impact-aware slippage model in fees.ts and clamps the stake to $10 when refined edge is non-positive. APR uses the earlier of the two endDates or falls back to 30 days.
analyze-text now returns ev_per_dollar, kelly_fraction, and breakeven_prob on data.suggested_action, plus implied_true_prob under metadata. markets/arbitrage accepts three new optional filters (minExpectedProfit, minAnnualisedReturn, minMaxStake) that read the fields added in the previous commit. health bumps the version to 2.1.0 and lists the new endpoints in its catalog. No existing field or query parameter changed shape.
POST /api/position-sizing returns a Kelly-optimal stake plus half- and quarter-Kelly alternatives given true_prob, yes_price, bankroll, and volume_24h. Defaults to quarter-Kelly with a 10%-of-bankroll hard cap. Runs a second pass so the stake feeds back into the slippage model.

POST /api/risk-assessment takes a proposed trade (side, price, stake, optional bankroll and expiry) and returns EV, variance, Sharpe, prob_profit, Kelly-suggested stake, and a TAKE/SCALE_DOWN/AVOID call.

Both are wired in vercel.json and the local server. API-REFERENCE is extended with request/response examples.
@LechenWang is attempting to deploy a commit to the Victor's projects Team on Vercel. A member of the Team first needs to authorize it.
c30f8e9 to
a26a0e9
The keyword matcher surfaces untradable matches: markets with
essentially no 24h volume, markets pinned near 1c or 99c, and single-
unigram broad hits (the typical case is a Fed-rate-cut tweet matching
an NBA penny market on the word "win"). The gate drops all three
before results are returned:
- volume floor: drop if 24h volume < $5k
- extreme-price: drop if yesPrice < 2% or > 98% unless the match has
a phrase hit and confidence >= 0.75
- strong-signal: require either a phrase hit or >= 2 matched keywords
at >= 0.55 confidence
The gate is on by default and can be disabled by passing false as the
KeywordMatcher's fourth constructor argument. The eval harness uses
that to produce its baseline row.
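The three rules above could be expressed along these lines. This is a sketch only: the `Match` shape and the `gate` function are hypothetical, and it reads the strong-signal rule as "a phrase hit, or at least two matched keywords with confidence >= 0.55".

```typescript
// Hypothetical match shape; the real matcher's types may differ.
interface Match {
  volume24h: number;       // USD traded over the last 24h
  yesPrice: number;        // 0..1
  confidence: number;      // 0..1
  hasPhraseHit: boolean;
  matchedKeywords: number;
}

type DropReason = "lowVolume" | "extremePrice" | "weakSignal";

// Returns the reason a match is dropped, or null if it passes all three rules.
function gate(m: Match): DropReason | null {
  // Volume floor: drop if 24h volume is under $5k.
  if (m.volume24h < 5000) return "lowVolume";
  // Extreme price: drop penny markets unless a phrase hit backs them at >= 0.75.
  const extreme = m.yesPrice < 0.02 || m.yesPrice > 0.98;
  if (extreme && !(m.hasPhraseHit && m.confidence >= 0.75)) return "extremePrice";
  // Strong signal: a phrase hit, or two or more keywords at >= 0.55 confidence.
  const strong = m.hasPhraseHit || (m.matchedKeywords >= 2 && m.confidence >= 0.55);
  return strong ? null : "weakSignal";
}
```

Returning the drop reason rather than a boolean keeps the per-rule counts available for an evaluation table.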
scripts/matcher-eval/ ships a reproducible evaluation:
npx tsx scripts/matcher-eval/snapshot-markets.ts # regen snapshot
npx tsx scripts/matcher-eval/run-eval.ts
On the 30-tweet x 1,857-market fixture:
                            before   after   delta
    total matches surfaced  99       86      -13
    junk rate (any rule)    63.6%    40.7%   -22.9 pp
    thin-market (<$5k)      4.0%     0.0%    -4.0 pp
    extreme-price           46.5%    25.6%   -20.9 pp
    cross-domain            29.3%    20.9%   -8.4 pp
    weak-signal             4.0%     0.0%    -4.0 pp
Recall (total matches surfaced) drops 13%, from 99 to 86, while the overall junk rate drops 36% in relative terms, from 63.6% to 40.7%.
Changes to api/lib/market-cache.ts:
- in-flight request deduplication so concurrent callers during a cache miss share a single fetch promise instead of each making their own Polymarket/Kalshi roundtrip
- stale-while-revalidate up to MARKET_CACHE_SWR_SECONDS (default 60s) past the 20s TTL; stale is returned immediately while a single background refresh runs
- refresh no longer overwrites the cache with two empty arrays on a full-platform outage; last-known-good is retained

New api/lib/rate-limit.ts: sliding-window per-IP counter backed by Vercel KV with an in-process Map fallback. Writes X-RateLimit-Limit / -Remaining / -Reset headers on every response, returns 429 with Retry-After once the bucket is exhausted, and fails open if KV is unavailable. Wired into analyze-text, markets/arbitrage, ground-probability, position-sizing, and risk-assessment with per-endpoint budgets.

Verified directly: 40 bursted requests from one IP at a 30-rpm cap produced 30 allowed + 10 denied; a different IP got a fresh 30-req budget in the same window.
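As a rough picture of the cache behaviour described above, the sketch below combines in-flight deduplication with stale-while-revalidate in a single-value cache. `SwrCache` is a hypothetical class written for illustration; it is not the code in api/lib/market-cache.ts, and it treats a rejected fetch as the outage case in which the last-known-good value is kept.

```typescript
// Hypothetical sketch of in-flight dedup plus stale-while-revalidate.
class SwrCache<T> {
  private value: T | undefined = undefined;
  private fetchedAt = 0;
  private inFlight: Promise<T> | null = null;
  private readonly fetcher: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly swrMs: number;

  constructor(fetcher: () => Promise<T>, ttlMs: number = 20000, swrMs: number = 60000) {
    this.fetcher = fetcher;
    this.ttlMs = ttlMs;
    this.swrMs = swrMs;
  }

  // All callers during a miss share one promise; a failed fetch leaves
  // the previous value untouched.
  private refresh(): Promise<T> {
    if (this.inFlight) return this.inFlight;
    this.inFlight = this.fetcher()
      .then((v) => {
        this.value = v;
        this.fetchedAt = Date.now();
        return v;
      })
      .finally(() => {
        this.inFlight = null;
      });
    return this.inFlight;
  }

  get(): Promise<T> {
    const age = Date.now() - this.fetchedAt;
    if (this.value !== undefined) {
      // Fresh: serve from cache.
      if (age <= this.ttlMs) return Promise.resolve(this.value);
      // Stale but inside the SWR window: serve stale, refresh in background.
      if (age <= this.ttlMs + this.swrMs) {
        void this.refresh().catch(() => undefined);
        return Promise.resolve(this.value);
      }
    }
    // Cold or too stale: wait for the (shared) fetch.
    return this.refresh();
  }
}
```

Because the shared promise lives in instance state, deduplication in this sketch only applies within one process.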
scripts/backtest/run-backtest.ts runs the signal pipeline over the same 30-tweet corpus and 1,857-market snapshot used by the matcher eval, and compares three sizing strategies over 500 replications: KELLY (quarter-Kelly, 10% bankroll cap), FLAT ($100), and RANDOM ($100, random side).

Execution realism:
- only trades signals on markets priced in [0.10, 0.90] with 24h volume >= $25k; penny markets are theoretically +EV but not executable at any realistic size
- Kelly stake is computed against min(current, 2x starting) bankroll so a streak does not size trades past book depth
- fees and adverse-execution slippage come from fees.ts, so the PnL is apples-to-apples with what the API reports

Results at calibration = 1.0 (pooled-trade Sharpe, 500 reps):

    strategy   return   Sharpe   maxDD   winRate   Brier
    KELLY      +17.4%   0.360    3.5%    17.8%     0.206
    FLAT       +5.6%    0.570    1.7%    54.2%     0.239
    RANDOM     +1.8%    0.228    1.9%    49.1%     0.240

Kelly sensitivity sweep:

    calibration   return   Sharpe
    0.00          -1.2%    -0.033
    0.25          +3.7%    0.090
    0.50          +8.3%    0.188
    0.75          +13.5%   0.289
    1.00          +18.4%   0.377

At calibration 0 (signals are noise) Kelly loses small rather than blowing up, because the EV check short-circuits to HOLD on negative-expectation trades. Results persisted to scripts/backtest/fixtures/result.json and regenerable from scratch.
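The execution-realism rules for the KELLY strategy can be sketched as two small functions. Both names are hypothetical and the full-Kelly fraction is taken as an input; the constants are the ones stated above (price band, $25k volume floor, quarter-Kelly, 10% cap, 2x starting-bankroll ceiling).

```typescript
// Hypothetical sketch of the back-test's tradability filter and stake rule.
function isTradable(yesPrice: number, volume24h: number): boolean {
  // Only markets priced in [0.10, 0.90] with at least $25k of 24h volume.
  return yesPrice >= 0.10 && yesPrice <= 0.90 && volume24h >= 25000;
}

function kellyStake(
  fullKelly: number,        // uncapped full-Kelly fraction for the signal
  currentBankroll: number,
  startingBankroll: number,
): number {
  // Size against min(current, 2x starting) so a winning streak does not
  // scale stakes past what the book could plausibly absorb.
  const sizingBankroll = Math.min(currentBankroll, 2 * startingBankroll);
  // Quarter-Kelly, then the 10%-of-bankroll hard cap.
  const fraction = Math.min(0.25 * Math.max(0, fullKelly), 0.10);
  return fraction * sizingBankroll;
}
```

For example, a full-Kelly fraction of 0.8 on a bankroll that has grown from $10,000 to $50,000 is sized as 10% of $20,000, not 20% of $50,000.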
874b73d to
2fe2a00
sentimentToProbability returned 0.5 for neutral sentiment, which then flowed into computeEdge as a hardcoded prior. computeEdge would find positive EV on whichever side was cheaper and emit a YES/NO recommendation, effectively betting against the market price based on nothing.

Neutral sentiment means we have no directional evidence. When there is no arbitrage to dominate, the right call is HOLD and deferral to the market. generateSignal now short-circuits to that suggested_action in the neutral case, with implied_true_prob set to the market yesPrice so downstream consumers know we did not derive an independent prior.

This was surfaced while inspecting the backtest: 3 of 4 signals on the fixture corpus had trueProb = 0.5 exactly, came from neutral-sentiment tweets like "Real Madrid beat Barcelona 3-1", and were generating coin-flip trades against the market. They are now filtered at signal generation, not at the strategy layer.
The original computeMetrics counted skipped-signal slots (stake=0) in
the denominators of both winRate and Sharpe. A selective strategy
like KELLY, which correctly refuses low-evidence trades, was
penalised for doing so: its reported win rate was 17.8% and its
Sharpe 0.36 while FLAT, which took every signal, showed 54.2% and
0.57. Both numbers were an artefact of the zero-pad on the pnl path.
Metrics are now split:
- totalReturn and maxDrawdown still walk the full pnlPath (zeros
do not move equity, so they're correct as a full-path metric)
- sharpe, winRate, meanPnl, stdPnl are computed over active trades
(entries with pnl != 0) only
- activeTrades and activeRate are reported alongside the other
numbers so a reviewer can see how selective each strategy was
After the fix plus the matching signal-generator change that removes
coin-flip trades on neutral-sentiment tweets, KELLY and FLAT
converge to the same 71% win rate (they take the same single
well-calibrated bet per replication) and near-identical Sharpe
(0.92 vs 0.98). The total return gap (17.4% vs 1.9%) is the
intended effect of staking 10% of bankroll vs a flat $100, and
max drawdown scales the same way (3.5% vs 0.3%).
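The split between full-path and active-trade metrics could look roughly like this. It is a sketch under assumptions: the field names follow the commit message, Sharpe is taken as mean over population standard deviation of per-trade PnL, and the real computeMetrics may differ on both points.

```typescript
// Hypothetical sketch; the exact Sharpe definition is an assumption.
interface Metrics {
  totalReturn: number;
  maxDrawdown: number;
  winRate: number;
  sharpe: number;
  activeTrades: number;
  activeRate: number;
}

function computeMetrics(pnlPath: number[], startingBankroll: number): Metrics {
  // Full-path metrics: zero entries leave equity unchanged, so they are harmless.
  let equity = startingBankroll;
  let peak = startingBankroll;
  let maxDrawdown = 0;
  for (const pnl of pnlPath) {
    equity += pnl;
    peak = Math.max(peak, equity);
    maxDrawdown = Math.max(maxDrawdown, (peak - equity) / peak);
  }

  // Active-trade metrics: skipped slots (pnl === 0) are excluded from denominators.
  const active = pnlPath.filter((pnl) => pnl !== 0);
  const n = active.length;
  const mean = n > 0 ? active.reduce((a, b) => a + b, 0) / n : 0;
  const variance = n > 0 ? active.reduce((a, b) => a + (b - mean) ** 2, 0) / n : 0;
  const std = Math.sqrt(variance);

  return {
    totalReturn: (equity - startingBankroll) / startingBankroll,
    maxDrawdown,
    winRate: n > 0 ? active.filter((pnl) => pnl > 0).length / n : 0,
    sharpe: std > 0 ? mean / std : 0,
    activeTrades: n,
    activeRate: pnlPath.length > 0 ? n / pnlPath.length : 0,
  };
}
```

On a path such as `[0, 10, 0, -5, 0, 15]`, the win rate is two of three active trades rather than two of six slots.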
…nd worst-case loss

Arbitrage `expectedDollarProfit` in `estimateExecutableSizing` used `refinedEdge * maxStake`. `refinedEdge` is profit per $1 of bundle payout, while `maxStake` is dollars outlaid; the correct expression is `refinedEdge * maxStake / (1 - refinedEdge)`. Under-statement was approximately 1-5% at typical edges and ~43% at 30%+ edges.

Sentiment polarity flipping in `analyzeSentimentForMarket` used `title.toLowerCase().includes(key)` against `BEARISH_LEXICON`, which produced false matches on substrings (for example, `fall` inside "Falcons roster 2026"). Replaced with a word-boundary regex that also supports multi-word keys.

`EdgeResult.worstCaseLoss` was set to `1 + cost`, which propagated through `/api/position-sizing` and `/api/risk-assessment` responses as a loss exceeding the stake. Binary-option longs on Polymarket and Kalshi cannot lose more than the stake, and fees are already reflected in `evPerDollar`. Corrected to `1`; `bestCaseGain` floored at `0` for symmetry.

Reproducibility unchanged: matcher evaluation junk-rate delta -22.9 pp; back-test KELLY total return 17.4%, Sharpe (active) 0.923, win rate (active) 71.0%. Typecheck and wallet tests pass.
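Two of the corrections above lend themselves to a short illustration. The sketch below is not the PR's code: `expectedDollarProfit`, `escapeRegExp`, and `containsTerm` are hypothetical helpers, and the "Fallout" title is an invented example of a substring false positive.

```typescript
// Profit on `maxStake` dollars outlaid when each $1 of bundle payout costs
// (1 - refinedEdge): the stake buys maxStake / (1 - refinedEdge) of payout.
function expectedDollarProfit(refinedEdge: number, maxStake: number): number {
  return (refinedEdge * maxStake) / (1 - refinedEdge);
}

// Escape regex metacharacters so lexicon keys are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Word-boundary match that also accepts multi-word keys with flexible spacing.
function containsTerm(title: string, key: string): boolean {
  const pattern = key.trim().split(/\s+/).map(escapeRegExp).join("\\s+");
  return new RegExp(`\\b${pattern}\\b`, "i").test(title);
}
```

At a 30% edge the corrected profit on a $100 stake is about $42.86 against $30 from the old expression, which is the ~43% gap quoted above. A plain `includes` check would find `fall` inside "Fallout season 2", whereas the word-boundary version does not.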
`getArbitrage` previously validated its cache by TTL alone. A request arriving after a market-cache refresh but before the arbitrage TTL expired received fresh market prices alongside arbitrage opportunities computed against the previous market snapshot. The arbitrage cache now records the `cacheTimestamp` under which it was computed and invalidates whenever that timestamp advances. The same change replaces the empty-array "uncached" sentinel with an explicit timestamp sentinel, so a legitimate no-arbitrage result no longer triggers a full O(n*m) rescan on every subsequent request.

`api/lib/rate-limit.ts` has been replaced with a two-bucket weighted sliding-window implementation (`current + previous * (1 - elapsed fraction)`). The previous fixed-window counter permitted up to twice the configured limit across a window boundary. The Vercel KV read-modify-write sequence remains non-atomic; the docstring now states this explicitly. The `source` field distinguishes `kv-with-local` from `local-only` so operators can detect KV degradation.

Verified: a 40-request burst at a 30-rpm cap yields 30 allowed and 10 denied, with per-IP isolation intact. Typecheck and wallet tests pass.
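The weighted estimate `current + previous * (1 - elapsed fraction)` can be sketched as a pure function over two buckets. This is an in-memory illustration only: `Buckets` and `hit` are hypothetical names, and KV storage, headers, and fail-open handling are left out.

```typescript
// Hypothetical two-bucket state for one client key.
interface Buckets {
  windowStart: number; // start of the current fixed window, in ms
  current: number;     // hits counted in the current window
  previous: number;    // hits counted in the window before it
}

function hit(
  prev: Buckets | undefined,
  nowMs: number,
  windowMs: number,
  limit: number,
): { allowed: boolean; state: Buckets } {
  const windowStart = Math.floor(nowMs / windowMs) * windowMs;
  let state: Buckets;
  if (prev && prev.windowStart === windowStart) {
    state = { ...prev };
  } else if (prev && prev.windowStart === windowStart - windowMs) {
    // Rolled into the next window: the old current bucket becomes previous.
    state = { windowStart, current: 0, previous: prev.current };
  } else {
    // First hit, or idle for more than a full window.
    state = { windowStart, current: 0, previous: 0 };
  }
  // Weight the previous bucket by the share of it still inside the sliding window.
  const elapsedFraction = (nowMs - windowStart) / windowMs;
  const estimate = state.current + state.previous * (1 - elapsedFraction);
  const allowed = estimate < limit;
  if (allowed) state.current += 1;
  return { allowed, state };
}
```

Right after a window boundary the previous bucket still counts almost in full, which is what prevents the double-limit burst a fixed-window counter allows.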
c81b7ba to
165edaf
…emantics

Three follow-up corrections surfaced by re-reading the public-facing response shapes after the previous review pass.

`src/analysis/risk.ts`: `SCALE_DOWN` was structurally unreachable via the bankroll-fraction branch whenever `bankroll` was omitted, because the inferred bankroll (`stake / maxFrac`) made `stake / bankroll` exactly equal to `maxFrac`. The module now tracks whether `bankroll` was supplied: when it is, both SCALE_DOWN branches apply as documented; when it is not, the bankroll-fraction check is skipped and a warning is surfaced telling the caller to pass `bankroll` for a complete assessment. The dollar worst-case is also bounded at `-stake` (previously `-stake * (1 + cost)`, which propagated an impossible "loss exceeds stake" figure to clients); `bestCase` is floored at `0`.

`src/analysis/signal-generator.ts`: when `computeEdge` recommends YES or NO on raw edge but fees and slippage push `evPerDollar` non-positive, `buildSuggestedAction` overrode the direction to `HOLD` but preserved the original reasoning string (for example, "YES underpriced at X%"). The payload is now internally consistent: on the override path the reasoning is rewritten to explain that raw edge favoured the side but net EV is non-positive.

`api/position-sizing.ts`: `alternative_sizing.half_kelly` and `quarter_kelly` were previously computed as `recommendedStake / 2` and `recommendedStake / 4`. Because `recommendedStake` is already capped at `min(kelly_cap, max_bankroll_fraction)` of bankroll, those values did not correspond to one-half and one-quarter of the full Kelly fraction. Both fields are now derived from the uncapped full Kelly fraction (`kellyFraction(shrunk_prob, yes_price)`), with the same outer risk cap applied so the "safer" sizings never exceed the recommended stake or the caller's hard risk limit. A new `full_kelly_fraction` field exposes the uncapped value for clients that want to reason about it explicitly.
Verified: typecheck and wallet tests pass; matcher evaluation delta unchanged at -22.9 pp junk rate; back-test KELLY total return 17.4%, Sharpe (active) 0.923, win rate (active) 71.0%. Direct smoke tests confirm all three behavioural corrections.
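To make the `alternative_sizing` change concrete, here is a small sketch of deriving the half- and quarter-Kelly stakes from the uncapped fraction and only then applying the cap. The function names and the single `riskCap` parameter (standing in for `min(kelly_cap, max_bankroll_fraction)`) are assumptions, not the endpoint's code.

```typescript
// Full-Kelly fraction for buying YES at `price` with win probability `prob`.
function kellyFraction(prob: number, price: number): number {
  return Math.max(0, (prob - price) / (1 - price));
}

interface AlternativeSizing {
  fullKellyFraction: number; // uncapped
  halfKelly: number;         // dollars
  quarterKelly: number;      // dollars
}

function alternativeSizing(
  fullKelly: number,
  bankroll: number,
  riskCap: number, // hypothetical stand-in for the outer cap, as a bankroll fraction
): AlternativeSizing {
  // Scale the uncapped fraction first, then apply the cap to each result.
  const capped = (fraction: number): number => Math.min(fraction, riskCap) * bankroll;
  return {
    fullKellyFraction: fullKelly,
    halfKelly: capped(fullKelly / 2),
    quarterKelly: capped(fullKelly / 4),
  };
}
```

With a full-Kelly fraction of 0.2, a $10,000 bankroll, and a 10% cap, this yields $1,000 and $500. Halving and quartering an already-capped $1,000 stake would instead give $500 and $250, which are a quarter and an eighth of full Kelly.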
Summary
This PR extends the signal pipeline so the API's trading primitives are
fee-aware, return Kelly-sized stakes, and ship with reproducible harnesses
for matcher quality and end-to-end back-test performance. All changes are
additive; no existing response field or query parameter was renamed or
removed. Version bumps from 2.0.0 to 2.1.0.
Fourteen commits, organised by concern:
- … rewritten sentiment analyzer; a signed-edge correction in the signal generator.
- `maxStake`, `expectedDollarProfit`, and `annualisedReturn` added to the covered-bundle opportunities introduced in PR #2 (Fix legacy arbitrage detector with covered-bundle pricing and strict contract matching). The underlying detector is unchanged.
- `analyze-text` now surfaces EV and Kelly fields; `markets/arbitrage` accepts three sizing filters; two new POST endpoints are added (`/api/position-sizing`, `/api/risk-assessment`).
- … reproducible evaluation harness.
- … deduplication on the market cache; a per-IP sliding-window rate limiter.
- … sensitivity, a signal-generator fix that routes neutral-sentiment tweets to `HOLD`, and a metrics fix that computes Sharpe and win rate over active trades.
- … identified during a full re-read of the PR. Detail in the "Review pass" section below.
- … response shapes (`SCALE_DOWN` reachability, HOLD-override reasoning, `alternative_sizing` semantics). Detail in the "Polish" section below.
Matcher evaluation
Evaluation on the 30-tweet × 1,857-market fixture
(`scripts/matcher-eval/run-eval.ts`):
Back-test
Monte Carlo over 500 replications at calibration = 1.0
(`scripts/backtest/run-backtest.ts`):
Kelly sensitivity to calibration:
Performance degrades monotonically with calibration error. At
calibration = 0 the EV filter routes zero-edge signals to `HOLD`, which
bounds the total-return loss to −1.2%.
Back-test interpretation
The single active signal on the current corpus is Ethereum at $2,600,
`yesPrice` = 0.24, implied `trueProb` = 0.691. KELLY and FLAT both take
the YES side on that signal, so the observed differences reflect stake
sizing rather than directional divergence.
- … bankroll and FLAT stakes 1%; the observed 17.4% / 1.9% ratio matches the 10× stake ratio.
- … 0.3% reflects position size, not additional per-dollar risk.
- … invariant to stake size. The 0.923 / 0.980 gap between KELLY and FLAT is attributable to the non-linear fee and slippage terms, not to strategy divergence.
- … side on the same signal.
RANDOM serves as a control: it chooses `YES` or `NO` uniformly and
therefore converges on a 50% win rate and a near-zero Sharpe.
New surfaces
- `POST /api/position-sizing` returns Kelly-optimal stake plus alternatives, given `true_prob`, `yes_price`, `bankroll`, and `volume_24h`.
- `POST /api/risk-assessment` returns a `TAKE` / `SCALE_DOWN` / `AVOID` recommendation together with EV, variance, Sharpe, `prob_profit`, and a Kelly suggestion for a proposed trade.
- `analyze-text` adds `ev_per_dollar`, `kelly_fraction`, and `breakeven_prob` under `data.suggested_action`, and `metadata.implied_true_prob` for debugging.
- `markets/arbitrage` adds `maxStake`, `expectedDollarProfit`, and `annualisedReturn` to `data.opportunities[]`, and accepts `minExpectedProfit`, `minAnnualisedReturn`, and `minMaxStake` as query parameters.
- … `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers; 429 responses additionally carry `Retry-After`.
Testing
- `npm run typecheck` passes for both tsconfigs.
- `npm run test:wallet` passes 5 of 5.
- `npx tsx scripts/matcher-eval/run-eval.ts` reproduces the matcher evaluation table above.
- `npx tsx scripts/backtest/run-backtest.ts` reproduces the back-test tables above.
- … for `infra: stale-while-revalidate cache and per-IP rate limiting` for the burst pattern and per-IP isolation check.
Backwards compatibility
- … `ArbitrageOpportunity`, `Market`, and `TradingSignal` are optional properties.
- `analyzeSentiment(text)` retains its legacy signature.
- `KeywordMatcher` accepts an optional fourth constructor argument to disable the quality gate; the evaluation harness uses this to produce the "before" baseline row.
- … on neutral sentiment the endpoint now returns `direction: 'HOLD'`. The prior behaviour issued a directional recommendation derived from the 50/50 prior against the market price, which is not a supported use of the signal.
Review pass
A full re-read of the PR identified six correctness and documentation
issues. They are addressed in the two commits at the tip of the branch
(`review: correct arbitrage profit formula, sentiment polarity match, and worst-case loss`, and `review: tighten arbitrage cache invalidation and rate-limit algorithm`).
1. Arbitrage profit formula
`estimateExecutableSizing` computed `expectedDollarProfit = refinedEdge × maxStake`. `refinedEdge` is profit
per $1 of bundle payout, while `maxStake` is dollars outlaid, so the
correct expression is `refinedEdge × maxStake / (1 − refinedEdge)`.
Understatement was approximately 1–5% at typical edges and ~43% at
30%+ edges. File: `src/api/arbitrage-detector.ts`.
2. Arbitrage cache invalidation
`cachedArbitrage` previously had an independent TTL. When the market
cache refreshed mid-window, a subsequent request could receive fresh
market prices alongside arbitrage opportunities computed against the
previous snapshot. The arbitrage cache now records the `cacheTimestamp`
under which it was computed and invalidates whenever that timestamp
advances. File: `api/lib/market-cache.ts`.
3. Empty-result arbitrage sentinel
The "is the arbitrage cache populated?" check used
`cachedArbitrage.length === 0` as the uninitialised sentinel, which
caused a legitimate no-arbitrage result to trigger a full O(n·m) rescan
on every subsequent request. An explicit timestamp sentinel now
distinguishes "not yet computed" from "computed and empty". File:
`api/lib/market-cache.ts` (same commit as item 2).
4. Rate-limit algorithm and documentation
The previous rate limiter was a fixed-window counter, not the
sliding-window implementation its docstring described; the
`INCR + EXPIRE` claim was also inaccurate, as the Vercel KV path performs a
non-atomic read-modify-write. The limiter has been replaced with a
two-bucket weighted sliding window, and the docstring updated to
describe its behaviour (non-atomic KV, fail-open on KV failure,
in-process counter as the authoritative limiter within a warm
instance). The `source` field on the result now distinguishes
`kv-with-local` from `local-only`. File: `api/lib/rate-limit.ts`.
5. Sentiment polarity matching
`analyzeSentimentForMarket` matched `BEARISH_LEXICON` keys with
`title.toLowerCase().includes(key)`, which produced false flips on
substrings (for example, `fall` inside "Falcons roster 2026"). Replaced
with a word-boundary regex that preserves multi-word key support.
File: `src/analysis/sentiment-analyzer.ts`.
6. Worst-case loss bound
`EdgeResult.worstCaseLoss` was `1 + cost`, which propagated to the
`/api/position-sizing` and `/api/risk-assessment` responses as a loss
exceeding the stake. Binary-option longs on Polymarket and Kalshi
cannot lose more than the stake, and fees are already accounted for in
`evPerDollar`. `worstCaseLoss` is now bounded at `1`; `bestCaseGain` is
floored at `0` for symmetry. File: `src/analysis/edge.ts`.
Polish
Three follow-up corrections to the public-facing response shapes,
delivered in the final commit on the branch.
1. `SCALE_DOWN` reachability without an explicit bankroll
`/api/risk-assessment` previously inferred `bankroll = stake / maxFrac`
when the caller omitted `bankroll`, which made `stake / bankroll`
identically equal to `maxFrac` and rendered the bankroll-fraction
branch of `SCALE_DOWN` structurally unreachable. The module now tracks
whether `bankroll` was supplied. When it is, both `SCALE_DOWN` branches
fire as documented. When it is not, the bankroll-fraction check is
skipped and a warning is appended instructing the caller to supply
`bankroll` for a complete assessment. The Kelly-vs-stake branch of
`SCALE_DOWN` continues to operate in both cases. The dollar worst-case
in `assessRisk` is also bounded at `-stake` (previously
`-stake * (1 + cost)`, consistent with the `worstCaseLoss` correction in the review
pass). File: `src/analysis/risk.ts`.
2. HOLD-override reasoning consistency
When `computeEdge` identifies raw edge on a side (YES or NO) but fees
and slippage push `evPerDollar` non-positive, `buildSuggestedAction`
overrode the direction to `HOLD` while leaving `reasoning` as the
original "YES underpriced at X%" string produced by `computeEdge`. The
override path now rewrites the reasoning to explain that raw edge
favoured the side but net EV is non-positive, so the returned payload
is internally consistent. File: `src/analysis/signal-generator.ts`.
3. `alternative_sizing` semantics
`alternative_sizing.half_kelly` and `alternative_sizing.quarter_kelly`
were computed as `recommendedStake / 2` and `recommendedStake / 4`.
Because `recommendedStake` is already capped at
`min(kelly_cap, max_bankroll_fraction)` of bankroll, those values did not correspond
to one-half and one-quarter of the full Kelly fraction. Both fields
are now derived from the uncapped full Kelly fraction (via
`kellyFraction(shrunk_prob, yes_price)`), with the same outer risk
cap applied so the "safer" sizings never exceed the recommended stake
or the caller's hard risk limit. A new `full_kelly_fraction` field is
exposed for clients that need the uncapped value. File:
`api/position-sizing.ts`.
Future improvements
The items below were identified during the review pass and are out of
scope for this PR. They are listed as a prioritised backlog for
subsequent iterations.
Correctness
- Back-test RNG drift across strategies. `KELLY` can skip signals (no `rng()` call), `FLAT` calls `rng()` once per signal, and `RANDOM` calls it twice. On a multi-signal corpus the three strategies would not observe the same coin flips on the same underlying bet. Innocuous with the present single-signal corpus; should be resolved before expanding the fixture set, either by pre-materialising outcomes per signal or by assigning a dedicated RNG stream per strategy. `scripts/backtest/run-backtest.ts`.
- `breakevenProb` semantics by side. The field is interpreted as "minimum `trueProb` for positive EV" on the YES side and "maximum `trueProb` for positive EV" on the NO side; the docstring states only "min". Proposed resolution: split into `breakevenMin` / `breakevenMax`, or add a side-aware docstring. `src/analysis/edge.ts`.
- Additive fee accounting. EV subtracts `cost` from the win-leg and adds it to the loss-leg, treating fees as paid on top of the stake. This slightly overstates EV under Polymarket's slippage-dominated model and slightly understates it under Kalshi's profit-based taker fees; net impact on typical markets is approximately 2%. A physical-model rewrite would shift the back-test headline numbers by 1–3 pp and belongs in a dedicated PR that also updates the back-test fixtures.
Observability and consistency
- Rate-limit KV TTL is `windowSeconds × 2`. `windowSeconds` is sufficient; the doubled TTL consumes additional KV memory without affecting correctness. `api/lib/rate-limit.ts`.
- `RateLimitResult.source` union includes an unused `'disabled'` variant. Either remove from the union type or introduce a disabled-by-config code path.
- `applyQualityGate.dropped` telemetry is discarded. The gate produces per-reason drop counts (`lowVolume`, `extremePrice`, `weakSignal`), but `KeywordMatcher.match` consumes only `.kept`. Either emit drops to logs or metrics, or remove the field if unused. `src/analysis/match-quality.ts`.
- Divergent NaN handling. `passesVolume` drops non-finite input while `passesExtremePrice` passes it through. A single rule should apply. `src/analysis/match-quality.ts`.
- Divergent validation between `position-sizing` and `risk-assessment`. `true_prob` is required by the former and optional by the latter; `yes_price` bounds are aligned. Document the intentional difference or unify.
- `generateEventId` uses a 32-bit hash. Birthday collisions become meaningful around 60,000–100,000 distinct tweets; adequate for current usage. Migrate to a base36-truncated SHA-256 before `event_id` is relied upon as a primary key. `src/analysis/signal-generator.ts`.
Deployment and operations
- SWR and in-flight dedup are per-instance. The module-level `cachedMarkets` and `inFlightFetch` state does not cross function boundaries. Under Vercel's multi-instance scaling, concurrent requests routed to different instances each trigger an independent refresh. Migrating to a shared backend (for example, KV with a short TTL) is the appropriate remediation if cross-instance consistency becomes a requirement. `api/lib/market-cache.ts`.
- `markets.snapshot.json` (1.8 MB) is checked into the repo. Reviewed manually: contents are public market data (titles, prices, URLs). No remediation required.
- `scripts/matcher-eval` and `scripts/backtest` are not wired into `npm run test`. Reviewers must invoke them directly. Adding `test:eval` and `test:backtest` targets would make the evidence available as a single command.
- Narrow back-test corpus. One active signal remains after the quality gate and tradability filter. The 500-replication Monte Carlo exercises the sizing math but does not robustly estimate production PnL. Calibration sensitivity partially compensates; expanding the fixture corpus is the next step.
- `KeywordMatcher` constructor takes four positional arguments. Backwards compatible for 0–3 arguments, but subclasses or mocks that match the constructor signature require updating. Consider documenting the new argument or migrating to an options object. `src/analysis/keyword-matcher.ts`.
None of the above blocks merge. Recommended sequencing for the next
iteration: the back-test RNG drift fix before any corpus expansion,
and the SWR/dedup deployment note before the next deployment cycle
on a multi-instance Vercel project.