Skip to content

Aarav Jain improvements#8

Open
aaravjj2 wants to merge 6 commits into
MusashiBot:mainfrom
aaravjj2:aarav-jain-improvements
Open

Aarav Jain improvements#8
aaravjj2 wants to merge 6 commits into
MusashiBot:mainfrom
aaravjj2:aarav-jain-improvements

Conversation

@aaravjj2
Copy link
Copy Markdown

Summary\n- Expand test coverage across unit and API handler layers\n- Add shared test helpers and improve test scripts/docs\n- Include infrastructure/docs updates for performance, ML, risk, and backtesting\n\n## Validation\n- pnpm test:all (all suites passing)\n

Copilot AI review requested due to automatic review settings April 20, 2026 22:32
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 20, 2026

@aaravjj2 is attempting to deploy a commit to the Victor's projects Team on Vercel.

A member of the Team first needs to authorize it.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands Musashi’s operational + research loop: adds risk/session controls, performance tracking & resolution ingestion for signal_outcomes, introduces ML signal scoring scaffolding, and strengthens testing/docs/scripts for CI/backtesting and infra operations.

Changes:

  • Add new API endpoints/routes (risk session, performance metrics, internal resolve-market) plus rate limiting and cache/WS enhancements.
  • Introduce ML scoring module + schema to log outcomes for training, along with scripts to collect resolutions and run backtests.
  • Expand unit/smoke/testing ladder and add extensive deployment/environment documentation.

Reviewed changes

Copilot reviewed 81 out of 85 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
vercel.json Adds new API routes + CORS header for X-API-Key.
tests/unit/cache-utils.test.mjs Adds unit tests for rate limit + cache/KV helpers.
tests/helpers/test-helpers.mjs Adds shared response/fetch/KV mock helpers.
supabase/migrations/20260418000000_signal_outcomes.sql Creates signal_outcomes table + indexes for ML/metrics.
src/types/market.ts Extends market/arbitrage types; adds Kelly sizing types.
src/ml/signal-scorer-model.ts Adds ML model loading/inference + heuristic fallback.
src/ml/models/README.md Documents model-weights directory.
src/ml/models/.gitkeep Keeps weights directory in repo.
src/ml/index.ts Exposes ML public API exports.
src/ml/example-usage.ts Demonstrates end-to-end ML scoring workflow.
src/ml/README.md Full ML scorer documentation.
src/ml/QUICKSTART.md Quickstart guide for ML scoring.
src/db/signal-outcomes.example.ts Example usage for outcome logging + resolution + metrics.
src/db/README.md Documents outcome tracking system.
src/db/ARCHITECTURE.md Describes signal→DB→training/metrics architecture.
src/api/supabase-client.ts Adds typed signal_outcomes + connection test.
src/api/polymarket-price-poller.ts Adds REST order book depth fetcher.
src/analysis/sentiment-analyzer.ts Adds weighted multi-tweet sentiment aggregation.
src/analysis/semantic-matcher.ts Adds transformer-based semantic matching with caching.
src/analysis/semantic-matcher-example.ts Example usage for semantic matching.
src/analysis/kelly-sizing.ts Adds Kelly sizing + volatility/anomaly detectors.
src/analysis/README.md Documents semantic matcher design/perf.
scripts/test-smoke-imports.ts Adds import smoke tests for critical modules.
scripts/test-performance-endpoints.ts Adds integration script to validate performance endpoints.
scripts/test-agent-api.ts Increases default timeout to 30s.
scripts/ml/collect-resolutions.ts Batch job to backfill outcomes from venue resolutions.
scripts/lib/is-main-module.ts ESM-safe entrypoint detection helper.
scripts/interview-ready.ts Runs CI ladder + prints pitch prompts.
scripts/backtest/pnl-calculator.ts Adds P&L calculator utilities for backtests.
scripts/backtest/historical-data-fetcher.ts Fetches KV price snapshots for backtesting.
scripts/backtest/example-usage.ts Shows programmatic backtest usage.
package.json Adds test ladder scripts + ML/backtest/collector commands; deps for transformers/ws.
docs/WS_STRATEGY.md WS operations guidance.
docs/TESTING.md Documents testing ladder and scripts.
docs/SUBMISSION.md Submission guidance and commands.
docs/SLO.md Adds SLO ideas and metrics targets.
docs/QUICK_START_PERFORMANCE.md Setup guide for performance tracking endpoints.
docs/PORTFOLIO_RISK.md Notes on portfolio-level risk beyond session.
docs/PHASE_D_GATE.md Gate criteria for execution-layer work.
docs/NATIVE_DEPS.md Notes on transformers/sharp/native deps.
docs/ML_CALIBRATION.md Roadmap for calibration improvements.
docs/ENVIRONMENT.md Central env var reference (flags/keys/limits).
docs/DEPLOYMENT.md Deployment checklist (migrations/env/crons/workflows).
docs/ARBITRAGE_REALISM.md Notes on mid-price vs executable edge and scaling.
api/risk/session.ts New session-level circuit breaker endpoint.
api/metrics/performance.ts New endpoint for aggregated signal performance metrics.
api/markets/arbitrage.ts Adds rate limiting + new filtering params (net spread/opposed).
api/lib/rate-limit.ts Adds per-instance sliding window rate limiter helpers.
api/lib/market-cache.ts Integrates Polymarket WS price refresh + order book fetcher; makes arbitrage async.
api/internal/resolve-market.ts New internal endpoint to resolve markets & compute P&L.
api/health.ts Adds operational readiness flags + updated endpoint catalog/version.
api/analyze-text.ts Adds rate limiting + vol regime hint + optional ML scoring fields.
REAL_TIME_IMPLEMENTATION.md Real-time infra documentation.
README.md Updates repo pointers + adds env/testing/deploy docs links.
QUICKSTART_OUTCOME_TRACKING.md Quickstart for outcome tracking workflow.
IMPLEMENTATION_SUMMARY.md Summary documentation for real-time infra implementation.
DEPLOYMENT_READY.md Deployment readiness narrative for performance system.
.gitignore Ignores reports + ML weights JSON outputs.
.github/workflows/collect-resolutions.yml Adds workflow_dispatch job for resolution collection.
.github/workflows/ci.yml Adds CI workflow (typecheck + test ladder).
.github/workflows/backtest-report.yml Adds workflow_dispatch backtest report artifact job.
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +20 to +32
function isAuthorized(req: VercelRequest): boolean {
const apiKey = req.headers['x-api-key'] || req.headers['authorization']?.replace('Bearer ', '');
const expectedKey = process.env.INTERNAL_API_KEY;

if (!expectedKey) {
// If no key is configured, check if request is from internal network
const allowedIps = (process.env.INTERNAL_IPS || '').split(',');
const clientIp = req.headers['x-forwarded-for'] || req.socket?.remoteAddress || '';
return allowedIps.some(ip => clientIp.toString().includes(ip));
}

return apiKey === expectedKey;
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isAuthorized() falls back to an IP allowlist when INTERNAL_API_KEY is unset, but (process.env.INTERNAL_IPS || '').split(',') yields [''] when the env var is missing/empty; clientIp.includes('') will then return true for every request, effectively disabling auth. Default to deny when no API key is configured (or require a non-empty allowlist and perform exact/CIDR matching on the first x-forwarded-for IP).

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +129
const supabaseUrl = process.env.NEXT_PUBLIC_SUPABASE_URL;
const supabaseKey = process.env.SUPABASE_SERVICE_KEY || process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY;

if (!supabaseUrl || !supabaseKey) {
res.status(500).json({
success: false,
error: 'Supabase configuration missing',
});
return;
}

const supabase = createSupabaseBrowserClient(supabaseUrl, supabaseKey);

Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This endpoint only reads NEXT_PUBLIC_SUPABASE_URL (and NEXT_PUBLIC_SUPABASE_ANON_KEY), but the rest of the codebase/docs also use SUPABASE_URL / SUPABASE_ANON_KEY. As written, /api/internal/resolve-market will 500 in environments that set only the non-NEXT_PUBLIC_* vars. Consider using the same fallback chain as api/metrics/performance.ts (SUPABASE_URL || NEXT_PUBLIC_SUPABASE_URL, and prefer SUPABASE_SERVICE_KEY on the server).

Copilot uses AI. Check for mistakes.
Comment thread api/metrics/performance.ts Outdated
Comment on lines +87 to +90
// Fetch all signal outcomes
const { data: allSignals, error: allSignalsError } = await supabase
.from('signal_outcomes')
.select('*');
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This endpoint pulls the entire signal_outcomes table into memory (select('*')) and then filters in-process for 24h/7d/30d. As the table grows, this will become slow and can exceed serverless memory/time limits. Prefer querying only the needed time window (e.g., gte('created_at', day30Ago)) and selecting only required columns; consider separate lightweight queries/counts for totals and unresolved backlog.

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +162
// Recency decay — half-life of 15 minutes
const ageMins = Math.max(0, (now - t.timestamp) / 60_000);
const recencyDecay = Math.exp(-ageMins / 15);

Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says “half-life of 15 minutes”, but Math.exp(-ageMins / 15) uses 15 minutes as the decay constant (time to e^-1), not a half-life. If you intend a true 15-minute half-life, the exponent should be scaled by ln(2); otherwise update the comment to match the implemented decay.

Copilot uses AI. Check for mistakes.
Comment thread api/analyze-text.ts
Comment on lines +167 to 168
// Filter out anomalous markets from arbitrage consideration
const arbitrageOpportunities = await getArbitrage(0.03);
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment says “Filter out anomalous markets from arbitrage consideration”, but the code only filters is_directionally_opposed. Either update the comment to reflect what’s actually being filtered, or implement the anomalous-market filter here (e.g., based on market.is_anomalous).

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +54
function calculatePnL(
edge: number,
predictedProb: number,
wasCorrect: boolean,
bankroll: number = 1000 // Default bankroll
): number {
// Kelly Criterion: f* = (bp - q) / b
// where b = decimal odds - 1, p = win probability, q = 1 - p
// Simplified: bet size = edge * bankroll (fraction Kelly)

const kellyFraction = Math.abs(edge) * 0.25; // Quarter Kelly for safety
const betSize = kellyFraction * bankroll;

if (wasCorrect) {
// Return at fair odds based on predicted probability
// Profit = betSize / predictedProb - betSize
return betSize * (1 / predictedProb - 1);
} else {
// Loss: we lose the entire bet
return -betSize;
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calculatePnL() can divide by zero (or near-zero) when predictedProb is 0/very small, producing Infinity/huge P&L values. Clamp predictedProb to a safe epsilon range (and validate it’s finite) before using 1 / predictedProb so a single bad row can’t corrupt metrics.

Copilot uses AI. Check for mistakes.
Comment thread src/ml/models/README.md
Comment on lines +5 to +6
The file will be created after running the training script.

Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the extra space in “The file …” (double space) to avoid a typo in docs.

Copilot uses AI. Check for mistakes.
Comment thread src/ml/example-usage.ts
Comment on lines +222 to +225
// Run if executed directly
if (require.main === module) {
main().catch(console.error);
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script uses require.main === module as its entrypoint guard. If it’s executed under ESM (which is common with node --import tsx), require is undefined and the script will crash. Consider using the repo’s isMainModule() helper (scripts/lib/is-main-module.ts) or an import.meta.url-based guard for consistency.

Copilot uses AI. Check for mistakes.
Comment on lines +67 to +74
try {
const modelPath = path.join(__dirname, 'models', 'signal-scorer-v1.json');

if (!fs.existsSync(modelPath)) {
modelLoadError = 'Model file not found';
console.warn('[signal-scorer-model] Model file not found, using heuristic fallback');
return null;
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loadModel() builds the weights path using __dirname. If this module is ever executed as ESM (or bundled), __dirname is not defined and model loading will fail. Prefer deriving the directory from import.meta.url (via fileURLToPath) or otherwise ensure this module is CJS-only and documented as such.

Copilot uses AI. Check for mistakes.
Comment on lines +200 to +203
// Run examples if this file is executed directly
if (require.main === module) {
runAllExamples();
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example uses require.main === module to detect direct execution. If it’s run under ESM (common with node --import tsx), require is undefined. Consider using isMainModule() (scripts/lib/is-main-module.ts) or an import.meta.url-based guard for ESM compatibility.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1287f5e0e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread api/internal/resolve-market.ts Outdated
Comment on lines +26 to +28
const allowedIps = (process.env.INTERNAL_IPS || '').split(',');
const clientIp = req.headers['x-forwarded-for'] || req.socket?.remoteAddress || '';
return allowedIps.some(ip => clientIp.toString().includes(ip));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reject unauthenticated calls when INTERNAL_IPS is empty

The fallback auth path becomes effectively open when INTERNAL_API_KEY is unset and INTERNAL_IPS is missing/empty: (process.env.INTERNAL_IPS || '').split(',') yields [''], and clientIp.includes('') is always true, so any client is treated as authorized. In that deployment state, POST /api/internal/resolve-market can be called publicly and mutate signal outcomes, which is a production-impacting security regression.

Useful? React with 👍 / 👎.

Comment on lines +142 to +144
const exitPrice = direction === 'YES'
? exitSnapshot.yesPrice
: (1 - exitSnapshot.yesPrice);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Recompute exit price after stop-loss/take-profit updates

exitPrice is captured before the stop-loss/take-profit branch, but that branch can replace exitSnapshot and exitReason. Because exitPrice is never recomputed, downstream P&L, correctness, and price-change metrics still use the old expiry/resolution price instead of the triggered stop/take-profit price, making backtest results incorrect whenever those guards fire.

Useful? React with 👍 / 👎.

const { data: signals, error: fetchError } = await supabase
.from('signal_outcomes')
.select('*')
.eq('market_id', market.ticker)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Match Kalshi resolutions against stored market_id format

This lookup uses raw market.ticker, but signals are logged with market_id = topMarket.id (the normalized internal ID, e.g. kalshi-<ticker>), so Kalshi rows won't be found and remain unresolved. In practice the collector silently skips Kalshi outcome updates, which breaks win-rate/P&L tracking and ML labels for that platform.

Useful? React with 👍 / 👎.

aaravjj2 and others added 4 commits April 20, 2026 18:48
…dress review findings

- Remove Access-Control-Allow-Origin: * from risk/session endpoint
- Add API key auth + per-IP rate limiting to /api/risk/session
- Replace SELECT * with explicit column list in getRecentPerformance()
- Truncate raw tweet text to 280 chars before DB storage
- Replace 32-bit hash event IDs with crypto.randomUUID()
- Add ML circular-priors warning and MUSASHI_ML_ENABLED runtime gate
- Document fair-odds P&L assumption in calculatePnL and backtest README
- Update DEPLOYMENT.md with new env vars

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…cktest disclaimer

1. Event ID: replace crypto.randomUUID() with deterministic SHA-256 hash
   of (marketId + signalType + 5-min time bucket + text) for stable dedup
2. .gitignore: add *.docx, *.xlsx, *.pptx to prevent Office binaries in git
3. Dependency audit: all deps verified in-use, no removals needed
4. ML floor: add getMinRealSignals() clamped to min 50 with warning log
5. Retry-After: add 60-second header on 429 rate-limit responses
6. CORS: read ALLOWED_ORIGIN env var, reject wildcard in production
7. Backtest: prepend fair-odds disclaimer to all generated reports
8. Tests: add event ID determinism + ML floor clamp tests, fix risk/session
   tests to provide auth headers (81/81 passing)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…r format

1. Export generateEventId from signal-generator, rewrite unit tests to import
   the real function instead of inline re-implementations; add bucket-boundary
   test documenting 5-minute window dedup behavior
2. .gitignore: add *.doc, *.xls, *.ppt alongside existing Office patterns;
   add reference materials note in README
3. Rewrite getMinRealSignals test to import and call the actual exported
   function from signal-scorer-model.ts
4. Add Retry-After: 60 header test using pre-filled rate limit bucket
5. CORS: return 500 in production when ALLOWED_ORIGIN is unset instead of
   just logging; update ENVIRONMENT.md docs
6. Backtest disclaimer: switch to box-drawing character format matching spec
7. Dependency audit: all 4 new deps verified in-use, all MIT/Apache licensed

83/83 tests pass (35 unit + 39 API + 5 wallet + 4 smoke), typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aaravjj2 aaravjj2 force-pushed the aarav-jain-improvements branch from bb602aa to abc5afc Compare April 21, 2026 00:17
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants