feat: fee-adjusted edge, fractional-Kelly sizing, and risk-assessment endpoints by JonathanW666 · Pull Request #7 · MusashiBot/musashi-api

JonathanW666 · 2026-04-20T18:56:29Z

Summary

This PR extends the signal pipeline so the API's trading primitives are
fee-aware, return Kelly-sized stakes, and ship with reproducible harnesses
for matcher quality and end-to-end back-test performance. All changes are
additive; no existing response field or query parameter was renamed or
removed. Version bumps from 2.0.0 to 2.1.0.

Fourteen commits, organised by concern:

- Analysis math (commits 1–3): shared fee, Kelly, and risk modules; a
  rewritten sentiment analyzer; a signed-edge correction in the signal
  generator.
- Arbitrage sizing (commit 4): maxStake, expectedDollarProfit, and
  annualisedReturn added to the covered-bundle opportunities introduced
  in PR #2 (Fix legacy arbitrage detector with covered-bundle pricing
  and strict contract matching). The underlying detector is unchanged.
- Endpoints (commits 5–6): analyze-text now surfaces EV and Kelly
  fields; markets/arbitrage accepts three sizing filters; two new POST
  endpoints are added (/api/position-sizing, /api/risk-assessment).
- Matcher quality (commit 7): a post-match quality gate and a
  reproducible evaluation harness.
- Infrastructure (commit 8): stale-while-revalidate and in-flight request
  deduplication on the market cache; a per-IP sliding-window rate limiter.
- Validation (commits 9–11): a Monte Carlo back-test with calibration
  sensitivity, a signal-generator fix that routes neutral-sentiment tweets
  to HOLD, and a metrics fix that computes Sharpe and win rate over
  active trades.
- Review pass (commits 12–13): six correctness and documentation fixes
  identified during a full re-read of the PR. Detail in the "Review pass"
  section below.
- Polish (commit 14): three follow-up corrections to the public-facing
  response shapes (SCALE_DOWN reachability, HOLD-override reasoning,
  alternative_sizing semantics). Detail in the "Polish" section below.

Matcher evaluation

Evaluation on the 30-tweet × 1,857-market fixture
(scripts/matcher-eval/run-eval.ts):

metric	before	after	delta
total matches surfaced	99	86	−13
junk rate (any rule)	63.6%	40.7%	−22.9 pp
thin-market (<$5k volume)	4.0%	0.0%	−4.0 pp
extreme-price (<2% or >98%)	46.5%	25.6%	−20.9 pp
cross-domain	29.3%	20.9%	−8.4 pp
weak-signal	4.0%	0.0%	−4.0 pp

Back-test

Monte Carlo over 500 replications at calibration = 1.0
(scripts/backtest/run-backtest.ts):

strategy	active/slots	total return	Sharpe (active)	max DD	win rate (active)	Brier
KELLY	1/1	+17.4%	0.923	3.5%	71.0%	0.206
FLAT	1/1	+1.9%	0.980	0.3%	71.0%	0.206
RANDOM	1/1	+0.6%	0.331	0.5%	50.0%	0.209

Kelly sensitivity to calibration:

calibration	total return	Sharpe (active)	max DD	win rate (active)
0.00	−1.2%	−0.067	9.0%	26.2%
0.25	+3.7%	0.183	7.5%	38.0%
0.50	+8.3%	0.398	6.2%	49.0%
0.75	+13.5%	0.668	4.7%	61.6%
1.00	+18.4%	0.995	3.3%	73.2%

Performance degrades monotonically with calibration error. At
calibration = 0 the EV filter routes zero-edge signals to HOLD, which
bounds the total-return loss to −1.2%.

Back-test interpretation

The single active signal on the current corpus is Ethereum at $2,600,
yesPrice = 0.24, implied trueProb = 0.691. KELLY and FLAT both take
the YES side on that signal, so the observed differences reflect stake
sizing rather than directional divergence.

- Total return scales roughly linearly with stake size. KELLY stakes 10%
  of bankroll and FLAT stakes 1%; the observed 17.4% / 1.9% ratio (about
  9.2×) is close to the 10× stake ratio.
- Max drawdown scales similarly with stake. KELLY's 3.5% versus FLAT's
  0.3% reflects position size, not additional per-dollar risk.
- Per-trade Sharpe on a fixed Bernoulli payoff is approximately
  invariant to stake size. The 0.923 / 0.980 gap between KELLY and FLAT
  is attributable to the non-linear fee and slippage terms, not to
  strategy divergence.
- Win rate is identical (71.0%) because both strategies take the same
  side on the same signal.

RANDOM serves as a control: it chooses YES or NO uniformly and
therefore converges on a 50% win rate and a near-zero Sharpe.

New surfaces

- POST /api/position-sizing returns Kelly-optimal stake plus
  alternatives, given true_prob, yes_price, bankroll, and
  volume_24h.
- POST /api/risk-assessment returns a TAKE / SCALE_DOWN / AVOID
  recommendation together with EV, variance, Sharpe, prob_profit, and
  a Kelly suggestion for a proposed trade.
- analyze-text adds ev_per_dollar, kelly_fraction, and
  breakeven_prob under data.suggested_action, and
  metadata.implied_true_prob for debugging.
- markets/arbitrage adds maxStake, expectedDollarProfit, and
  annualisedReturn to data.opportunities[], and accepts
  minExpectedProfit, minAnnualisedReturn, and minMaxStake as query
  parameters.
- Every response carries X-RateLimit-Limit, X-RateLimit-Remaining,
  and X-RateLimit-Reset headers; 429 responses additionally carry
  Retry-After.

Testing

- npm run typecheck passes for both tsconfigs.
- npm run test:wallet passes 5 of 5.
- npx tsx scripts/matcher-eval/run-eval.ts reproduces the matcher
  evaluation table above.
- npx tsx scripts/backtest/run-backtest.ts reproduces the back-test
  tables above.
- The rate limiter is exercised by direct import; see the body of the
  commit "infra: stale-while-revalidate cache and per-IP rate limiting"
  for the burst pattern and per-IP isolation check.

Backwards compatibility

- Existing response fields and query parameters retain their shapes.
- Additions to ArbitrageOpportunity, Market, and TradingSignal
  are optional properties.
- analyzeSentiment(text) retains its legacy signature.
- KeywordMatcher accepts an optional fourth constructor argument to
  disable the quality gate; the evaluation harness uses this to produce
  the "before" baseline row.
- The neutral-sentiment change in commit 10 is a behaviour correction:
  on neutral sentiment the endpoint now returns
  direction: 'HOLD'. The prior behaviour issued a directional
  recommendation derived from the 50/50 prior against the market price,
  which is not a supported use of the signal.

Review pass

A full re-read of the PR identified six correctness and documentation
issues. They are addressed in the two commits at the tip of the branch:
"review: correct arbitrage profit formula, sentiment polarity match, and
worst-case loss" and "review: tighten arbitrage cache invalidation and
rate-limit algorithm".

1. Arbitrage profit formula

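estimateExecutableSizing computed
expectedDollarProfit = refinedEdge × maxStake. refinedEdge is profit
per $1 of bundle payout, while maxStake is dollars outlaid. Each $1 of
payout costs (1 − refinedEdge), so maxStake buys
maxStake / (1 − refinedEdge) of payout and the correct expression is
refinedEdge × maxStake / (1 − refinedEdge). The old figure understated
profit by a factor of (1 − refinedEdge): the corrected value is about
1–5% higher at typical edges and ~43% higher at a 30% edge, and more
above that. File: src/api/arbitrage-detector.ts.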
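
The size of the correction can be checked with a short sketch. Only the two formulas come from the PR text; the function names and the zero-profit guard for edges outside (0, 1) are assumptions made for illustration.

```typescript
// Hypothetical helper names; only the two formulas are taken from the PR.

/** Old estimate: treats edge per $1 of payout as if it were edge per $1 outlaid. */
function naiveDollarProfit(refinedEdge: number, maxStake: number): number {
  return refinedEdge * maxStake;
}

/** Corrected estimate: maxStake dollars buy maxStake / (1 - refinedEdge) of payout. */
function expectedDollarProfit(refinedEdge: number, maxStake: number): number {
  // Assumption: no profit is reported for edges outside (0, 1).
  if (refinedEdge <= 0 || refinedEdge >= 1) return 0;
  const payoutBought = maxStake / (1 - refinedEdge);
  return refinedEdge * payoutBought;
}

/** How much larger the corrected figure is, as a fraction of the old figure. */
function understatement(refinedEdge: number): number {
  return 1 / (1 - refinedEdge) - 1;
}
```

With a $100 outlay at a 30% refined edge the old expression gives about $30 and the corrected one about $42.86, which is where the ~43% figure comes from.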

2. Arbitrage cache invalidation

cachedArbitrage previously had an independent TTL. When the market
cache refreshed mid-window, a subsequent request could receive fresh
market prices alongside arbitrage opportunities computed against the
previous snapshot. The arbitrage cache now records the cacheTimestamp
under which it was computed and invalidates whenever that timestamp
advances. File: api/lib/market-cache.ts.

3. Empty-result arbitrage sentinel

The "is the arbitrage cache populated?" check used
cachedArbitrage.length === 0 as the uninitialised sentinel, which
caused a legitimate no-arbitrage result to trigger a full O(n·m) rescan
on every subsequent request. An explicit timestamp sentinel now
distinguishes "not yet computed" from "computed and empty". File:
api/lib/market-cache.ts (same commit as item 2).
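
Items 2 and 3 can be sketched together. The names cachedArbitrage, cacheTimestamp, and getArbitrage appear in the PR; the function body, the Opportunity type, and the scan counter are assumptions, and the real module may be structured differently.

```typescript
type Opportunity = { id: string };

// Hypothetical module-level state. `null` means "never computed";
// an empty array is a valid cached "no arbitrage" result.
let cachedArbitrage: Opportunity[] = [];
let arbitrageComputedFor: number | null = null;
let scans = 0;

/** Number of full detector scans run so far (exposed for illustration only). */
function scanCount(): number {
  return scans;
}

function getArbitrage(
  marketCacheTimestamp: number,
  detect: () => Opportunity[],
): Opportunity[] {
  // Recompute only when the market snapshot the result was built from has changed.
  if (arbitrageComputedFor !== marketCacheTimestamp) {
    cachedArbitrage = detect();
    arbitrageComputedFor = marketCacheTimestamp;
    scans += 1;
  }
  return cachedArbitrage;
}
```

Because freshness is decided by the recorded timestamp rather than by array length, an empty result is served from cache until the market snapshot changes.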

4. Rate-limit algorithm and documentation

The previous rate limiter was a fixed-window counter, not the
sliding-window implementation its docstring described; the INCR + EXPIRE claim was also inaccurate, as the Vercel KV path performs a
non-atomic read-modify-write. The limiter has been replaced with a
two-bucket weighted sliding window, and the docstring updated to
describe its behaviour (non-atomic KV, fail-open on KV failure,
in-process counter as the authoritative limiter within a warm
instance). The source field on the result now distinguishes
kv-with-local from local-only. File: api/lib/rate-limit.ts.
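
A minimal in-process sketch of a two-bucket weighted sliding window follows, using the estimate current + previous × (1 − elapsed fraction) quoted in the commit body. It omits the Vercel KV path entirely, and the function name, state shape, and Map are assumptions rather than the contents of api/lib/rate-limit.ts.

```typescript
interface WindowState {
  windowStart: number; // start of the current fixed bucket, in ms
  current: number;     // requests counted in the current bucket
  previous: number;    // requests counted in the bucket before it
}

const buckets = new Map<string, WindowState>();

/** Returns true if the request is allowed and counts it; false if it should get a 429. */
function allowRequest(ip: string, limit: number, windowMs: number, now: number): boolean {
  const start = Math.floor(now / windowMs) * windowMs;
  let state = buckets.get(ip);
  if (!state) {
    state = { windowStart: start, current: 0, previous: 0 };
    buckets.set(ip, state);
  } else if (start !== state.windowStart) {
    // Roll the buckets. If more than one window has passed, the previous bucket is empty.
    state.previous = start - state.windowStart === windowMs ? state.current : 0;
    state.current = 0;
    state.windowStart = start;
  }
  // Weight the previous bucket by how much of it still overlaps the sliding window.
  const elapsedFraction = (now - start) / windowMs;
  const estimated = state.current + state.previous * (1 - elapsedFraction);
  if (estimated >= limit) return false;
  state.current += 1;
  return true;
}
```

Under this sketch a client that exhausts a 30-request budget in one window is allowed only 15 more requests halfway through the next window, whereas a fixed-window counter would grant a fresh 30 at the boundary.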

5. Sentiment polarity matching

analyzeSentimentForMarket matched BEARISH_LEXICON keys with
title.toLowerCase().includes(key), which produced false flips on
substrings (for example, fall inside "Falcons roster 2026"). Replaced
with a word-boundary regex that preserves multi-word key support.
File: src/analysis/sentiment-analyzer.ts.
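
One way to build such a matcher is sketched below. The lexicon entries and helper names are hypothetical (the real BEARISH_LEXICON is not shown in the PR), and the actual regex may differ.

```typescript
// Hypothetical lexicon entries for illustration only.
const BEARISH_KEYS = ["fall", "crash", "sell off"];

/** Escape regex metacharacters so a key is matched literally. */
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** Build a case-insensitive pattern that matches the key only as whole words. */
function keyToPattern(key: string): RegExp {
  // Allow any run of whitespace between the words of a multi-word key.
  const body = key.trim().split(/\s+/).map(escapeRegExp).join("\\s+");
  return new RegExp(`\\b${body}\\b`, "i");
}

function matchesBearishKey(title: string): boolean {
  return BEARISH_KEYS.some((key) => keyToPattern(key).test(title));
}
```

A trade-off of strict word boundaries is that inflected forms such as "falls" no longer match the key "fall", so a lexicon used this way would need to list them explicitly.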

6. Worst-case loss bound

EdgeResult.worstCaseLoss was 1 + cost, which propagated to the
/api/position-sizing and /api/risk-assessment responses as a loss
exceeding the stake. Binary-option longs on Polymarket and Kalshi
cannot lose more than the stake, and fees are already accounted for in
evPerDollar. worstCaseLoss is now bounded at 1; bestCaseGain is
floored at 0 for symmetry. File: src/analysis/edge.ts.

Polish

Three follow-up corrections to the public-facing response shapes,
delivered in the final commit on the branch.

1. `SCALE_DOWN` reachability without an explicit bankroll

/api/risk-assessment previously inferred bankroll = stake / maxFrac
when the caller omitted bankroll, which made stake / bankroll
identically equal to maxFrac and rendered the bankroll-fraction
branch of SCALE_DOWN structurally unreachable. The module now tracks
whether bankroll was supplied. When it is, both SCALE_DOWN branches
fire as documented. When it is not, the bankroll-fraction check is
skipped and a warning is appended instructing the caller to supply
bankroll for a complete assessment. The Kelly-vs-stake branch of
SCALE_DOWN continues to operate in both cases. The dollar worst-case
in assessRisk is also bounded at -stake (previously -stake * (1 + cost)),
consistent with the worstCaseLoss correction in the review
pass. File: src/analysis/risk.ts.

2. HOLD-override reasoning consistency

When computeEdge identifies raw edge on a side (YES or NO) but fees
and slippage push evPerDollar non-positive, buildSuggestedAction
overrode the direction to HOLD while leaving reasoning as the
original "YES underpriced at X%" string produced by computeEdge. The
override path now rewrites the reasoning to explain that raw edge
favoured the side but net EV is non-positive, so the returned payload
is internally consistent. File: src/analysis/signal-generator.ts.

3. `alternative_sizing` semantics

alternative_sizing.half_kelly and alternative_sizing.quarter_kelly
were computed as recommendedStake / 2 and recommendedStake / 4.
Because recommendedStake is already capped at min(kelly_cap, max_bankroll_fraction) of bankroll, those values did not correspond
to one-half and one-quarter of the full Kelly fraction. Both fields
are now derived from the uncapped full Kelly fraction (via
kellyFraction(shrunk_prob, yes_price)), with the same outer risk
cap applied so the "safer" sizings never exceed the recommended stake
or the caller's hard risk limit. A new full_kelly_fraction field is
exposed for clients that need the uncapped value. File:
api/position-sizing.ts.
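
The difference between the two derivations can be illustrated with a sketch. The name kellyFraction and the field names come from the PR, but the body here is the textbook fee-free Kelly fraction for a YES purchase, the riskCapFraction parameter is hypothetical, and the real endpoint also bounds the alternatives by the recommended stake.

```typescript
/** Fee-free Kelly fraction for buying YES at yesPrice given trueProb (assumed form). */
function kellyFraction(trueProb: number, yesPrice: number): number {
  if (yesPrice <= 0 || yesPrice >= 1) return 0;
  return Math.max(0, (trueProb - yesPrice) / (1 - yesPrice));
}

interface AlternativeSizing {
  full_kelly_fraction: number; // uncapped fraction of bankroll
  half_kelly: number;          // dollars
  quarter_kelly: number;       // dollars
}

function alternativeSizing(
  trueProb: number,
  yesPrice: number,
  bankroll: number,
  riskCapFraction: number, // hypothetical stand-in for the outer risk cap
): AlternativeSizing {
  const full = kellyFraction(trueProb, yesPrice);
  // Scale the uncapped fraction first, then apply the cap to each alternative.
  const cappedDollars = (fraction: number): number =>
    Math.min(fraction, riskCapFraction) * bankroll;
  return {
    full_kelly_fraction: full,
    half_kelly: cappedDollars(full / 2),
    quarter_kelly: cappedDollars(full / 4),
  };
}
```

With true_prob 0.6, yes_price 0.5, a $1,000 bankroll, and a 10% cap, full Kelly is 0.2, so half-Kelly is about $100 and quarter-Kelly about $50. Halving and quartering a stake already capped at 10% would instead have produced $50 and $25.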

Future improvements

The items below were identified during the review pass and are out of
scope for this PR. They are listed as a prioritised backlog for
subsequent iterations.

Correctness

- Back-test RNG drift across strategies. KELLY can skip signals
  (no rng() call), FLAT calls rng() once per signal, and RANDOM
  calls it twice. On a multi-signal corpus the three strategies would
  not observe the same coin flips on the same underlying bet.
  Innocuous with the present single-signal corpus; should be resolved
  before expanding the fixture set, either by pre-materialising
  outcomes per signal or by assigning a dedicated RNG stream per
  strategy. scripts/backtest/run-backtest.ts.
- breakevenProb semantics by side. The field is interpreted as
  "minimum trueProb for positive EV" on the YES side and "maximum
  trueProb for positive EV" on the NO side; the docstring states
  only "min". Proposed resolution: split into breakevenMin /
  breakevenMax, or add a side-aware docstring.
  src/analysis/edge.ts.
- Additive fee accounting. EV subtracts cost from the win-leg
  and adds it to the loss-leg, treating fees as paid on top of the
  stake. This slightly overstates EV under Polymarket's
  slippage-dominated model and slightly understates it under Kalshi's
  profit-based taker fees; net impact on typical markets is
  approximately 2%. A physical-model rewrite would shift the
  back-test headline numbers by 1–3 pp and belongs in a dedicated PR
  that also updates the back-test fixtures.
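
For the RNG-drift item, the pre-materialisation option might look like the sketch below. The seeded generator (mulberry32) stands in for whatever rng() the harness uses, and SignalLike and materialiseOutcomes are hypothetical names.

```typescript
/** Small seeded PRNG (mulberry32) returning values in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface SignalLike {
  trueProb: number; // probability that the market resolves YES
}

/** Draw one YES/NO outcome per signal up front so every strategy sees the same coin flips. */
function materialiseOutcomes(signals: SignalLike[], seed: number): boolean[] {
  const rng = mulberry32(seed);
  return signals.map((signal) => rng() < signal.trueProb);
}
```

Each strategy would then index into the same outcome array for a given replication, and any randomness a strategy needs for its own decisions (such as RANDOM's side choice) would come from a separate stream.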

Observability and consistency

- Rate-limit KV TTL is windowSeconds × 2. windowSeconds is
  sufficient; the doubled TTL consumes additional KV memory without
  affecting correctness. api/lib/rate-limit.ts.
- RateLimitResult.source union includes an unused 'disabled'
  variant. Either remove from the union type or introduce a
  disabled-by-config code path.
- applyQualityGate.dropped telemetry is discarded. The gate
  produces per-reason drop counts (lowVolume, extremePrice,
  weakSignal), but KeywordMatcher.match consumes only .kept.
  Either emit drops to logs or metrics, or remove the field if unused.
  src/analysis/match-quality.ts.
- Divergent NaN handling. passesVolume drops non-finite input
  while passesExtremePrice passes it through. A single rule should
  apply. src/analysis/match-quality.ts.
- Divergent validation between position-sizing and
  risk-assessment. true_prob is required by the former and
  optional by the latter; yes_price bounds are aligned. Document the
  intentional difference or unify.
- generateEventId uses a 32-bit hash. Birthday collisions become
  meaningful around 60,000–100,000 distinct tweets; adequate for
  current usage. Migrate to a base36-truncated SHA-256 before
  event_id is relied upon as a primary key.
  src/analysis/signal-generator.ts.
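
The 60,000–100,000 range in the generateEventId item follows from the standard birthday approximation, sketched here. The function name is hypothetical and the formula assumes ids are uniformly distributed over the hash space.

```typescript
/**
 * Approximate probability of at least one collision among n ids drawn
 * uniformly from `space` possible values: 1 - exp(-n(n-1) / (2 * space)).
 */
function collisionProbability(n: number, space: number): number {
  return 1 - Math.exp((-n * (n - 1)) / (2 * space));
}

const SPACE_32_BIT = 2 ** 32;

// Collision odds for a 32-bit id at the two ends of the quoted range.
const atSixtyThousand = collisionProbability(60_000, SPACE_32_BIT);
const atHundredThousand = collisionProbability(100_000, SPACE_32_BIT);
```

Under this approximation a 32-bit id has roughly a 34% chance of at least one collision at 60,000 tweets and roughly 69% at 100,000, with the 50% point near 77,000.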

Deployment and operations

- SWR and in-flight dedup are per-instance. The module-level
  cachedMarkets and inFlightFetch state is not shared across function
  instances. Under Vercel's multi-instance scaling, concurrent
  requests routed to different instances each trigger an independent
  refresh. Migrating to a shared backend (for example, KV with a short
  TTL) is the appropriate remediation if cross-instance consistency
  becomes a requirement. api/lib/market-cache.ts.
- markets.snapshot.json (1.8 MB) is checked into the repo.
  Reviewed manually: contents are public market data (titles, prices,
  URLs). No remediation required.
- scripts/matcher-eval and scripts/backtest are not wired into
  npm run test. Reviewers must invoke them directly. Adding
  test:eval and test:backtest targets would make the evidence
  available as a single command.
- Narrow back-test corpus. One active signal remains after the
  quality gate and tradability filter. The 500-replication Monte Carlo
  exercises the sizing math but does not robustly estimate production
  PnL. Calibration sensitivity partially compensates; expanding the
  fixture corpus is the next step.
- KeywordMatcher constructor takes four positional arguments.
  Backwards compatible for 0–3 arguments, but subclasses or mocks that
  match the constructor signature require updating. Consider
  documenting the new argument or migrating to an options object.
  src/analysis/keyword-matcher.ts.

None of the above blocks merge. Recommended sequencing for the next
iteration: the back-test RNG drift fix before any corpus expansion,
and the SWR/dedup deployment note before the next deployment cycle
on a multi-instance Vercel project.

Three pure modules used by the signal and arbitrage code. fees.ts models per-platform taker fees with a bounded adverse-execution slippage term. edge.ts returns signed edge, fractional Kelly, and breakeven probability for a bet at a given price. risk.ts wraps both to produce EV, variance, and a TAKE/SCALE_DOWN/AVOID call on a proposed trade. No I/O in any of them; consumed by the commits that follow.

Replaces the bag-of-words scorer. Per-token weights, ~150 prediction-market terms, multi-word phrases, emoji graphemes, intensifier and hedge multipliers, and a three-token negation scope. ALL-CAPS and trailing "!!" bump magnitude. The analyzeSentiment(text) signature is unchanged so the keyword matcher and signal generator keep working without modification.

The previous formula was edge = confidence * |p - price|, which drops the sign. A bullish tweet on a 95c YES market still reported a large edge even though there was nowhere to buy. Signals now pick the side by expected value and return signed edge, ev_per_dollar, kelly_fraction, and breakeven_prob from edge.ts.

Other changes here:
- sentimentToProbability range narrowed from ±0.45 to ±0.25; tweet-level evidence rarely justifies a sharper prior.
- The confidence passed to Kelly is capped at 0.6 for lexicon-only signals.
- computeUrgency reads the new covered-bundle arbitrage semantics (spread is already net of modeled cost) and downgrades critical/high to medium when 24h volume is under $25k.
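
Side selection by expected value can be sketched as follows. This is not the code in edge.ts: the function name is hypothetical, and cost is an assumed flat per-dollar charge applied additively (subtracted on the win leg, added on the loss leg), which simplifies to the expressions below.

```typescript
type Side = "YES" | "NO" | "HOLD";

interface EdgeSketch {
  side: Side;
  evPerDollar: number; // expected profit per $1 staked on the better side
}

/**
 * Pick the side with the higher expected value per dollar staked.
 * `cost` is an assumed flat fee-plus-slippage charge per dollar.
 */
function pickSide(trueProb: number, yesPrice: number, cost = 0): EdgeSketch {
  if (!(yesPrice > 0 && yesPrice < 1)) {
    throw new RangeError("yesPrice must be strictly between 0 and 1");
  }
  // $1 buys 1/price shares; each share pays $1 if that side resolves true.
  const evYes = trueProb / yesPrice - 1 - cost;
  const evNo = (1 - trueProb) / (1 - yesPrice) - 1 - cost;
  const side: Side = evYes >= evNo ? "YES" : "NO";
  const evPerDollar = Math.max(evYes, evNo);
  // No positive-EV side after costs: defer to the market.
  return evPerDollar > 0 ? { side, evPerDollar } : { side: "HOLD", evPerDollar };
}
```

In this sketch a bullish estimate of 0.97 on a 95c market with a 3% cost returns HOLD, because neither side has positive EV after costs. The unsigned formula would have reported an edge in that case regardless of direction.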

Leaves the covered-bundle detector from PR MusashiBot#2 alone and adds three optional fields on each opportunity: maxStake, expectedDollarProfit, and annualisedReturn. Sizing uses the impact-aware slippage model in fees.ts and clamps the stake to $10 when refined edge is non-positive. APR uses the earlier of the two endDates or falls back to 30 days.

analyze-text now returns ev_per_dollar, kelly_fraction, and breakeven_prob on data.suggested_action, plus implied_true_prob under metadata. markets/arbitrage accepts three new optional filters (minExpectedProfit, minAnnualisedReturn, minMaxStake) that read the fields added in the previous commit. health bumps the version to 2.1.0 and lists the new endpoints in its catalog. No existing field or query parameter changed shape.

POST /api/position-sizing returns a Kelly-optimal stake plus half- and quarter-Kelly alternatives given true_prob, yes_price, bankroll, and volume_24h. Defaults to quarter-Kelly with a 10%-of-bankroll hard cap. Runs a second pass so the stake feeds back into the slippage model. POST /api/risk-assessment takes a proposed trade (side, price, stake, optional bankroll and expiry) and returns EV, variance, Sharpe, prob_profit, Kelly-suggested stake, and a TAKE/SCALE_DOWN/AVOID call. Both are wired in vercel.json and the local server. API-REFERENCE is extended with request/response examples.

vercel · 2026-04-20T18:56:37Z

@LechenWang is attempting to deploy a commit to the Victor's projects Team on Vercel.

A member of the Team first needs to authorize it.

The keyword matcher surfaces untradable matches: markets with essentially no 24h volume, markets pinned near 1c or 99c, and single-unigram broad hits (the typical case is a Fed-rate-cut tweet matching an NBA penny market on the word "win"). The gate drops all three before results are returned:
- volume floor: drop if 24h volume < $5k
- extreme-price: drop if yesPrice < 2% or > 98% unless the match has a phrase hit and confidence >= 0.75
- strong-signal: require either a phrase hit or >= 2 matched keywords at >= 0.55 confidence

The gate is on by default and can be disabled by passing false as the KeywordMatcher's fourth constructor argument. The eval harness uses that to produce its baseline row.

scripts/matcher-eval/ ships a reproducible evaluation:

    npx tsx scripts/matcher-eval/snapshot-markets.ts   # regen snapshot
    npx tsx scripts/matcher-eval/run-eval.ts

On the 30-tweet x 1,857-market fixture:

                              before    after     delta
    total matches surfaced    99        86        -13
    junk rate (any rule)      63.6%     40.7%     -22.9 pp
    thin-market (<$5k)        4.0%      0.0%      -4.0 pp
    extreme-price             46.5%     25.6%     -20.9 pp
    cross-domain              29.3%     20.9%     -8.4 pp
    weak-signal               4.0%      0.0%      -4.0 pp

Recall drops 13% while the overall junk rate drops 36% relative.
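
The three rules can be sketched as a single predicate. The thresholds and the drop-reason names (lowVolume, extremePrice, weakSignal) come from the PR; the MatchLike shape, the function names, the rule order, and the reading of the strong-signal rule as "phrase hit, or at least two keywords together with confidence >= 0.55" are assumptions.

```typescript
interface MatchLike {
  volume24h: number;       // 24h volume in dollars
  yesPrice: number;        // 0..1
  confidence: number;      // 0..1
  hasPhraseHit: boolean;
  matchedKeywords: number;
}

type DropReason = "lowVolume" | "extremePrice" | "weakSignal";

/** Returns the first rule a match fails, or null if it passes the gate. */
function gateReason(m: MatchLike): DropReason | null {
  if (m.volume24h < 5000) return "lowVolume";
  const extreme = m.yesPrice < 0.02 || m.yesPrice > 0.98;
  if (extreme && !(m.hasPhraseHit && m.confidence >= 0.75)) return "extremePrice";
  const strong = m.hasPhraseHit || (m.matchedKeywords >= 2 && m.confidence >= 0.55);
  if (!strong) return "weakSignal";
  return null;
}

/** Split matches into kept results and per-reason drop counts. */
function applyGate(matches: MatchLike[]): { kept: MatchLike[]; dropped: Record<DropReason, number> } {
  const kept: MatchLike[] = [];
  const dropped: Record<DropReason, number> = { lowVolume: 0, extremePrice: 0, weakSignal: 0 };
  for (const m of matches) {
    const reason = gateReason(m);
    if (reason === null) kept.push(m);
    else dropped[reason] += 1;
  }
  return { kept, dropped };
}
```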

Changes to api/lib/market-cache.ts:
- in-flight request deduplication so concurrent callers during a cache miss share a single fetch promise instead of each making their own Polymarket/Kalshi roundtrip
- stale-while-revalidate up to MARKET_CACHE_SWR_SECONDS (default 60s) past the 20s TTL; stale is returned immediately while a single background refresh runs
- refresh no longer overwrites the cache with two empty arrays on a full-platform outage; last-known-good is retained

New api/lib/rate-limit.ts: sliding-window per-IP counter backed by Vercel KV with an in-process Map fallback. Writes X-RateLimit-Limit / -Remaining / -Reset headers on every response, returns 429 with Retry-After once the bucket is exhausted, and fails open if KV is unavailable. Wired into analyze-text, markets/arbitrage, ground-probability, position-sizing, and risk-assessment with per-endpoint budgets.

Verified directly: 40 bursted requests from one IP at a 30-rpm cap produced 30 allowed + 10 denied; a different IP got a fresh 30-req budget in the same window.
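
The in-flight deduplication part can be sketched on its own (the stale-while-revalidate and last-known-good logic is not shown). inFlightFetch is the state name used in the PR; the Market type, the function name, and the injected fetcher are assumptions.

```typescript
type Market = { id: string; yesPrice: number };

// Module-level slot holding the fetch that is currently running, if any.
let inFlightFetch: Promise<Market[]> | null = null;

/** Concurrent callers during a cache miss share one fetch promise. */
function fetchMarketsDeduped(fetcher: () => Promise<Market[]>): Promise<Market[]> {
  if (inFlightFetch) return inFlightFetch;
  const pending: Promise<Market[]> = fetcher().finally(() => {
    // Clear the slot once settled so the next miss starts a new fetch.
    if (inFlightFetch === pending) inFlightFetch = null;
  });
  inFlightFetch = pending;
  return pending;
}
```

Because the slot lives in module scope, the sharing only applies to callers that land on the same warm instance.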

scripts/backtest/run-backtest.ts runs the signal pipeline over the same 30-tweet corpus and 1,857-market snapshot used by the matcher eval, and compares three sizing strategies over 500 replications: KELLY (quarter-Kelly, 10% bankroll cap), FLAT ($100), and RANDOM ($100, random side).

Execution realism:
- only trades signals on markets priced in [0.10, 0.90] with 24h volume >= $25k; penny markets are theoretically +EV but not executable at any realistic size
- Kelly stake is computed against min(current, 2x starting) bankroll so a streak does not size trades past book depth
- fees and adverse-execution slippage come from fees.ts, so the PnL is apples-to-apples with what the API reports

Results at calibration = 1.0 (pooled-trade Sharpe, 500 reps):

    strategy    return    Sharpe    maxDD    winRate    Brier
    KELLY       +17.4%    0.360     3.5%     17.8%      0.206
    FLAT        +5.6%     0.570     1.7%     54.2%      0.239
    RANDOM      +1.8%     0.228     1.9%     49.1%      0.240

Kelly sensitivity sweep:

    calibration    return    Sharpe
    0.00           -1.2%     -0.033
    0.25           +3.7%     0.090
    0.50           +8.3%     0.188
    0.75           +13.5%    0.289
    1.00           +18.4%    0.377

At calibration 0 (signals are noise) Kelly loses small rather than blowing up, because the EV check short-circuits to HOLD on negative-expectation trades. Results persisted to scripts/backtest/fixtures/result.json and regenerable from scratch.

sentimentToProbability returned 0.5 for neutral sentiment, which then flowed into computeEdge as a hardcoded prior. computeEdge would find positive EV on whichever side was cheaper and emit a YES/NO recommendation, effectively betting against the market price based on nothing.

Neutral sentiment means we have no directional evidence. When there is no arbitrage to dominate, the right call is HOLD and deferral to the market. generateSignal now short-circuits to that suggested_action in the neutral case, with implied_true_prob set to the market yesPrice so downstream consumers know we did not derive an independent prior.

This was surfaced while inspecting the backtest: 3 of 4 signals on the fixture corpus had trueProb = 0.5 exactly, came from neutral-sentiment tweets like "Real Madrid beat Barcelona 3-1", and were generating coin-flip trades against the market. They are now filtered at signal generation, not at the strategy layer.

The original computeMetrics counted skipped-signal slots (stake=0) in the denominators of both winRate and Sharpe. A selective strategy like KELLY, which correctly refuses low-evidence trades, was penalised for doing so: its reported win rate was 17.8% and its Sharpe 0.36 while FLAT, which took every signal, showed 54.2% and 0.57. Both numbers were an artefact of the zero-pad on the pnl path.

Metrics are now split:
- totalReturn and maxDrawdown still walk the full pnlPath (zeros do not move equity, so they're correct as a full-path metric)
- sharpe, winRate, meanPnl, stdPnl are computed over active trades (entries with pnl != 0) only
- activeTrades and activeRate are reported alongside the other numbers so a reviewer can see how selective each strategy was

After the fix plus the matching signal-generator change that removes coin-flip trades on neutral-sentiment tweets, KELLY and FLAT converge to the same 71% win rate (they take the same single well-calibrated bet per replication) and near-identical Sharpe (0.92 vs 0.98). The total return gap (17.4% vs 1.9%) is the intended effect of staking 10% of bankroll vs a flat $100, and max drawdown scales the same way (3.5% vs 0.3%).
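
The active-trade split can be illustrated with a sketch. The field names follow the commit message, but the function name and the use of population (rather than sample) variance are assumptions, and this is not the harness's computeMetrics.

```typescript
interface ActiveMetrics {
  activeTrades: number;
  activeRate: number; // share of slots in which a trade was actually taken
  winRate: number;
  meanPnl: number;
  stdPnl: number;
  sharpe: number;
}

/** Compute per-trade metrics over active trades only (entries with pnl != 0). */
function computeActiveMetrics(pnlPath: number[]): ActiveMetrics {
  const active = pnlPath.filter((pnl) => pnl !== 0);
  const n = active.length;
  if (n === 0) {
    return { activeTrades: 0, activeRate: 0, winRate: 0, meanPnl: 0, stdPnl: 0, sharpe: 0 };
  }
  const meanPnl = active.reduce((sum, pnl) => sum + pnl, 0) / n;
  // Assumption: population variance; the harness may use the sample estimator.
  const variance = active.reduce((sum, pnl) => sum + (pnl - meanPnl) ** 2, 0) / n;
  const stdPnl = Math.sqrt(variance);
  return {
    activeTrades: n,
    activeRate: n / pnlPath.length,
    winRate: active.filter((pnl) => pnl > 0).length / n,
    meanPnl,
    stdPnl,
    sharpe: stdPnl > 0 ? meanPnl / stdPnl : 0,
  };
}
```

For a path such as [0, 10, 0, -5, 10, 0], the win rate over active trades is 2/3, whereas dividing by all six slots would report 1/3. That is the zero-padding artefact described above.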

…nd worst-case loss

Arbitrage `expectedDollarProfit` in `estimateExecutableSizing` used `refinedEdge * maxStake`. `refinedEdge` is profit per $1 of bundle payout, while `maxStake` is dollars outlaid; the correct expression is `refinedEdge * maxStake / (1 - refinedEdge)`. Understatement was approximately 1-5% at typical edges and ~43% at 30%+ edges.

Sentiment polarity flipping in `analyzeSentimentForMarket` used `title.toLowerCase().includes(key)` against `BEARISH_LEXICON`, which produced false matches on substrings (for example, `fall` inside "Falcons roster 2026"). Replaced with a word-boundary regex that also supports multi-word keys.

`EdgeResult.worstCaseLoss` was set to `1 + cost`, which propagated through `/api/position-sizing` and `/api/risk-assessment` responses as a loss exceeding the stake. Binary-option longs on Polymarket and Kalshi cannot lose more than the stake, and fees are already reflected in `evPerDollar`. Corrected to `1`; `bestCaseGain` floored at `0` for symmetry.

Reproducibility unchanged: matcher evaluation junk-rate delta -22.9 pp; back-test KELLY total return 17.4%, Sharpe (active) 0.923, win rate (active) 71.0%. Typecheck and wallet tests pass.

`getArbitrage` previously validated its cache by TTL alone. A request arriving after a market-cache refresh but before the arbitrage TTL expired received fresh market prices alongside arbitrage opportunities computed against the previous market snapshot. The arbitrage cache now records the `cacheTimestamp` under which it was computed and invalidates whenever that timestamp advances. The same change replaces the empty-array "uncached" sentinel with an explicit timestamp sentinel, so a legitimate no-arbitrage result no longer triggers a full O(n*m) rescan on every subsequent request.

`api/lib/rate-limit.ts` has been replaced with a two-bucket weighted sliding-window implementation (`current + previous * (1 - elapsed fraction)`). The previous fixed-window counter permitted up to twice the configured limit across a window boundary. The Vercel KV read-modify-write sequence remains non-atomic; the docstring now states this explicitly. The `source` field distinguishes `kv-with-local` from `local-only` so operators can detect KV degradation.

Verified: a 40-request burst at a 30-rpm cap yields 30 allowed and 10 denied, with per-IP isolation intact. Typecheck and wallet tests pass.

…emantics

Three follow-up corrections surfaced by re-reading the public-facing response shapes after the previous review pass.

`src/analysis/risk.ts`: `SCALE_DOWN` was structurally unreachable via the bankroll-fraction branch whenever `bankroll` was omitted, because the inferred bankroll (`stake / maxFrac`) made `stake / bankroll` exactly equal to `maxFrac`. The module now tracks whether `bankroll` was supplied: when it is, both SCALE_DOWN branches apply as documented; when it is not, the bankroll-fraction check is skipped and a warning is surfaced telling the caller to pass `bankroll` for a complete assessment. The dollar worst-case is also bounded at `-stake` (previously `-stake * (1 + cost)`, which propagated an impossible "loss exceeds stake" figure to clients); `bestCase` is floored at `0`.

`src/analysis/signal-generator.ts`: when `computeEdge` recommends YES or NO on raw edge but fees and slippage push `evPerDollar` non-positive, `buildSuggestedAction` overrode the direction to `HOLD` but preserved the original reasoning string (for example, "YES underpriced at X%"). The payload is now internally consistent: on the override path the reasoning is rewritten to explain that raw edge favoured the side but net EV is non-positive.

`api/position-sizing.ts`: `alternative_sizing.half_kelly` and `quarter_kelly` were previously computed as `recommendedStake / 2` and `recommendedStake / 4`. Because `recommendedStake` is already capped at `min(kelly_cap, max_bankroll_fraction)` of bankroll, those values did not correspond to one-half and one-quarter of the full Kelly fraction. Both fields are now derived from the uncapped full Kelly fraction (`kellyFraction(shrunk_prob, yes_price)`), with the same outer risk cap applied so the "safer" sizings never exceed the recommended stake or the caller's hard risk limit. A new `full_kelly_fraction` field exposes the uncapped value for clients that want to reason about it explicitly.

Verified: typecheck and wallet tests pass; matcher evaluation delta unchanged at -22.9 pp junk rate; back-test KELLY total return 17.4%, Sharpe (active) 0.923, win rate (active) 71.0%. Direct smoke tests confirm all three behavioural corrections.

LechenWang added 6 commits April 20, 2026 13:54

JonathanW666 force-pushed the feat/trading-edge-and-sizing branch from c30f8e9 to a26a0e9 Compare April 20, 2026 19:02

JonathanW666 closed this Apr 20, 2026

LechenWang added 3 commits April 20, 2026 15:02

JonathanW666 reopened this Apr 20, 2026

JonathanW666 force-pushed the feat/trading-edge-and-sizing branch from 874b73d to 2fe2a00 Compare April 20, 2026 20:34

LechenWang added 4 commits April 20, 2026 16:06

JonathanW666 force-pushed the feat/trading-edge-and-sizing branch from c81b7ba to 165edaf Compare April 20, 2026 22:18

JonathanW666 wants to merge 14 commits into MusashiBot:main from JonathanW666:feat/trading-edge-and-sizing
