30 | 30 | {"id":"node-360","title":"Wrap database .save() calls in try-catch (GCRIdentityRoutines)","description":"## Problem\n17 TypeORM .save() calls in GCRIdentityRoutines.ts have no error handling. Database errors (constraint violations, connection loss, disk full) crash the node.\n\n## Evidence\nUnguarded saves at lines: 106, 193, 291, 398, 480, 557, 639, 708, 741, 770, 1415, 1483, 1607, 1705, 1779, 1983, 2102\n\nExample vulnerable code:\n```typescript\n// Line 106 - no try-catch\nawait gcrMainRepository.save(gcr)\n```\n\n## Fix Location\nsrc/libs/blockchain/gcr/gcr_routines/GCRIdentityRoutines.ts\n\n## Code Snippet\nCreate a helper function and use it consistently:\n\n```typescript\n// Add helper at top of file\nasync function safeGCRSave(\n gcr: GCRMain, \n operation: string\n): Promise<{ success: boolean; error?: string }> {\n try {\n await gcrMainRepository.save(gcr)\n return { success: true }\n } catch (error) {\n logger.error(`GCR save failed during ${operation}`, {\n error: error instanceof Error ? error.message : String(error),\n gcrId: gcr.id,\n })\n return { \n success: false, \n error: error instanceof Error ? error.message : 'Database error' \n }\n }\n}\n\n// Usage at line 106:\nconst saveResult = await safeGCRSave(gcr, 'applyXmIdentityAdd')\nif (!saveResult.success) {\n return { error: saveResult.error }\n}\n```\n\n## Why This Helps Stability\n- 'No space left on device' won't crash the node\n- Connection timeouts return errors instead of crashing\n- Constraint violations are logged and handled\n- Other operations can continue even if one save fails\n- Node stays available to serve RPC requests","notes":"Added safeGCRSave() helper and applied to applyXmIdentityAdd. Pattern established for remaining 16 saves - can be applied incrementally.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-03-07T10:34:15.603828357Z","created_by":"tcsenpai","updated_at":"2026-03-07T11:59:59.423082328Z","closed_at":"2026-03-07T10:47:13.948562450Z","close_reason":"All 17 gcrMainRepository.save() calls wrapped with safeGCRSave helper","source_repo":".","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"node-360","depends_on_id":"node-214","type":"parent-child","created_at":"2026-03-07T11:59:59.422774198Z","created_by":"tcsenpai","metadata":"{}","thread_id":""}]} |
31 | 31 | {"id":"node-367","title":"Add progress reporting during loadgen runs","description":"## Problem\nDuring long load tests (60s+), there's no output until the test completes. No way to know if the test is working or stuck.\n\n## Fix\nAdd periodic progress reporting:\n```typescript\n// framework/progress.ts\nexport function startProgressReporter(counters: Counters, intervalMs = 5000) {\n return setInterval(() => {\n const elapsed = (nowMs() - counters.startMs) / 1000\n console.log(`[progress] ${elapsed.toFixed(1)}s | ok=${counters.ok} err=${counters.error} tps=${(counters.ok / elapsed).toFixed(1)}`)\n }, intervalMs)\n}\n```\n\n## Why\nBetter observability during test runs. Can catch stuck tests early instead of waiting for timeout.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-03-07T11:57:52.567272405Z","created_by":"tcsenpai","updated_at":"2026-03-08T12:30:42.649360368Z","closed_at":"2026-03-08T12:30:42.649098535Z","close_reason":"Added reusable progress reporter and wired it into long-running loadgen scenarios with opt-out env controls","source_repo":".","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"node-367","depends_on_id":"node-2t7","type":"parent-child","created_at":"2026-03-07T11:58:09.220528560Z","created_by":"tcsenpai","metadata":"{}","thread_id":""}]} |
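A minimal sketch of how such a reporter could be wired into a run, assuming `Counters` and `nowMs` have roughly the shapes implied by the snippet (the real framework types may differ):

```typescript
// Assumed shapes for illustration; not the repo's actual framework types.
interface Counters { startMs: number; ok: number; error: number }

const nowMs = (): number => Date.now()

function startProgressReporter(counters: Counters, intervalMs = 5000): ReturnType<typeof setInterval> {
  return setInterval(() => {
    const elapsed = (nowMs() - counters.startMs) / 1000
    const tps = elapsed > 0 ? counters.ok / elapsed : 0
    console.log(`[progress] ${elapsed.toFixed(1)}s | ok=${counters.ok} err=${counters.error} tps=${tps.toFixed(1)}`)
  }, intervalMs)
}

// Start the reporter before the load loop and always clear it in `finally`,
// so a throwing scenario does not leave a live timer keeping the process alive.
async function runWithProgress<T>(fn: (c: Counters) => Promise<T>): Promise<Counters> {
  const counters: Counters = { startMs: nowMs(), ok: 0, error: 0 }
  const timer = startProgressReporter(counters, 1000)
  try {
    await fn(counters)
  } finally {
    clearInterval(timer)
  }
  return counters
}
```

Clearing the interval in `finally` matters: an uncleared `setInterval` handle keeps the Node event loop alive, which is one way a "finished" load test can appear stuck.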
32 | 32 | {"id":"node-3b4","title":"Create minimal standalone repro for multi-instance wallet identity bleed","status":"closed","priority":1,"issue_type":"task","created_at":"2026-03-08T09:34:58.059267555Z","created_by":"tcsenpai","updated_at":"2026-03-08T09:41:29.001555778Z","closed_at":"2026-03-08T09:41:29.001196942Z","close_reason":"Standalone repro script added and validated against local devnet","source_repo":".","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"node-3b4","depends_on_id":"node-16j","type":"parent-child","created_at":"2026-03-08T09:34:58.059267555Z","created_by":"tcsenpai","metadata":"{}","thread_id":""}]} |
| 33 | +{"id":"node-3eo","title":"Clean stale better_testing docs and Serena memories","status":"in_progress","priority":2,"issue_type":"task","created_at":"2026-03-15T19:54:43.591693700Z","created_by":"tcsenpai","updated_at":"2026-03-15T19:54:53.659578144Z","source_repo":".","compaction_level":0,"original_size":0} |
33 | 34 | {"id":"node-3fv","title":"EVM XM identity verification compares targetAddress case-sensitively","description":"## Summary\nXM identity verification for EVM payloads currently compares the recovered signer address to `targetAddress` with raw string equality. This rejects otherwise valid Ethereum addresses when the payload uses a different letter casing than the recovered EIP-55 checksum form.\n\n## Repro\n1. Run `better_testing/scripts/run-scenario.sh gcr_identity_xm_smoke --build --run-id gcr-identity-xm-local-20260308-01` with a lowercase `targetAddress` recipe.\n2. Submit an `xm_identity_assign` payload where:\n - `chain = \"evm\"`\n - `subchain = \"mainnet\"`\n - `chainId = 1`\n - `signedData = <owner ed25519 address>`\n - `signature = await ethersWallet.signMessage(signedData)`\n - `targetAddress = ethersWallet.address.toLowerCase()`\n3. Observe confirm succeeds but broadcast rejects with `evm payload signature could not be verified. Transaction not applied.`\n\nObserved on 2026-03-08:\n- Lowercased address form fails.\n- The same payload succeeds when `targetAddress` uses the wallet's canonical checksummed `ethersWallet.address` string.\n- Successful control run artifact: `better_testing/runs/gcr-identity-xm-local-20260308-02/features/gcr/gcr_identity_xm_smoke.summary.json`\n\n## Why It Matters\nEthereum addresses are effectively case-insensitive for identity purposes, with checksum casing as an encoding aid. Raw case-sensitive comparison in node verification makes the XM payload contract unnecessarily brittle for SDKs, scripts, and external clients.\n\n## Likely Fix Area\n- `src/libs/blockchain/gcr/gcr_routines/identityManager.ts`\n - `verifyPayload()` EVM path\n\n## Suggested Fix Direction\n1. Normalize both the recovered EVM address and `targetAddress` before comparison.\n2. Prefer checksum-aware normalization if available; otherwise lowercase both safely.\n3. Add tests for checksummed and lowercase payload addresses.\n4. Keep stored display casing explicit while normalizing comparisons.\n\n## Acceptance Criteria\n1. A valid EVM XM payload verifies successfully regardless of checksum-vs-lowercase casing in `targetAddress`.\n2. Invalid signatures still fail.\n3. `gcr_identity_xm_smoke` passes when forced to use lowercase `targetAddress` in the payload.\n4. No regression for XM identity add/remove persistence or duplicate detection.","status":"closed","priority":2,"issue_type":"bug","created_at":"2026-03-08T13:46:16.457724286Z","created_by":"tcsenpai","updated_at":"2026-03-08T13:59:08.625786132Z","closed_at":"2026-03-08T13:59:08.625786132Z","close_reason":"Normalized EVM XM address verification in IdentityManager and validated lowercase targetAddress add/remove flow on all 4 nodes","source_repo":".","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"node-3fv","depends_on_id":"node-qza","type":"discovered-from","created_at":"2026-03-08T13:46:16.457724286Z","created_by":"tcsenpai","metadata":"{}","thread_id":""}]} |
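The suggested fix direction in node-3fv can be sketched as a small comparison helper. `sameEvmAddress` is a hypothetical name for illustration; the actual fix lives in the `verifyPayload()` EVM path, and a checksum-aware normalizer (e.g. ethers' `getAddress`) would be preferable when available:

```typescript
// Hypothetical helper sketching the fix direction: validate the hex shape,
// then compare case-insensitively so checksummed and lowercase forms of the
// same address are treated as equal. Malformed inputs never match.
const EVM_ADDRESS_RE = /^0x[0-9a-fA-F]{40}$/

function sameEvmAddress(recovered: string, target: string): boolean {
  if (!EVM_ADDRESS_RE.test(recovered) || !EVM_ADDRESS_RE.test(target)) {
    return false
  }
  return recovered.toLowerCase() === target.toLowerCase()
}
```

With this shape, a lowercase `targetAddress` matches the EIP-55 checksummed address recovered from the signature, while a signature recovering to a different address still fails, preserving the invalid-signature rejection path.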
34 | 35 | {"id":"node-3is","title":"Peer bootstrap exits on transient genesis-hash startup race","description":"## Summary\nPeer bootstrap could exit the whole node during cold start if a healthy peer had not yet materialized its genesis block for `getGenesisDataHash`.\n\n## Observed Repro\nAfter fixing the invalid genesis transaction write, a fresh devnet boot still showed `node-2` exiting during startup:\n- bootstrap contacted `node-1`\n- `node-1` returned `500 INTERNAL_ERROR` for `nodeCall(getGenesisDataHash)` during its own startup window\n- `src/libs/peer/routines/peerBootstrap.ts` treated any non-200 response as fatal and called `process.exit(1)`\n\n## Root Cause\n- `src/libs/network/manageNodeCall.ts` returns `500` if `Chain.getGenesisBlock()` is not ready yet.\n- `src/libs/peer/routines/peerBootstrap.ts` used `getGenesisDataHash` as a one-shot fatal check instead of a retryable readiness probe.\n- On staggered cold boot, that made peer bootstrap brittle even though the cluster became healthy seconds later.\n\n## Fix Applied\n- added bounded retry logic around `getGenesisDataHash` in peer bootstrap\n- preserved fatal behavior only after repeated failures, with response details logged\n\n## Validation\nAfter the fix:\n- fresh `docker compose up -d --build --force-recreate` leaves all 4 nodes up\n- all four nodes return the same genesis hash over RPC\n- representative consensus, Omni, and GCR scenarios pass on the rebuilt devnet\n\n## Acceptance Criteria\n1. A cold start with staggered peer readiness does not kill nodes during bootstrap.\n2. Transient `getGenesisDataHash` unavailability is retried instead of causing immediate process exit.\n3. All nodes converge and stay up after fresh devnet boot.","status":"closed","priority":1,"issue_type":"bug","created_at":"2026-03-09T16:12:40.402066642Z","created_by":"tcsenpai","updated_at":"2026-03-09T16:12:40.402066642Z","closed_at":"2026-03-09T16:12:40.402066642Z","source_repo":".","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"node-3is","depends_on_id":"node-4g1","type":"discovered-from","created_at":"2026-03-09T16:12:40.402066642Z","created_by":"tcsenpai","metadata":"{}","thread_id":""}]} |
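The bounded-retry shape described in node-3is can be sketched generically. Names, attempt counts, and delays here are illustrative, not the repo's actual values; the real bootstrap code wraps the `getGenesisDataHash` node call:

```typescript
// Generic readiness probe with bounded retries: transient failures are
// retried with a delay, and only after every attempt fails is the error
// surfaced to the caller (where fatal handling, e.g. process exit, belongs).
async function probeWithRetry<T>(
  probe: () => Promise<T>,
  attempts = 5,
  delayMs = 2000,
): Promise<T> {
  let lastError: unknown
  for (let i = 1; i <= attempts; i++) {
    try {
      return await probe()
    } catch (error) {
      lastError = error
      if (i < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs))
      }
    }
  }
  throw lastError
}
```

Keeping `process.exit(1)` out of the helper and in the caller preserves the "fatal only after repeated failures" behavior while letting other call sites choose a softer failure mode.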
35 | 36 | {"id":"node-3lh","title":"Replace silent .catch(() => {}) with logging","description":"## Problem\nSeveral places swallow errors completely with empty catch blocks. This makes debugging impossible - errors happen but leave no trace.\n\n## Evidence\n- L2PSConcurrentSync.ts:62-66 - `.catch(() => {})` completely empty\n- Sync.ts:579-583 - `.catch(error => {})` error captured but ignored\n\n```typescript\n// Current - error vanishes into the void\nsomeAsyncOperation().catch(() => {})\n```\n\n## Fix Location\n- src/libs/l2ps/L2PSConcurrentSync.ts:62-66\n- src/libs/peer/Sync.ts:579-583\n\n## Code Snippet\n```typescript\n// Before\nsomeAsyncOperation().catch(() => {})\n\n// After - at minimum log it\nsomeAsyncOperation().catch((error) => {\n logger.warn('Non-critical operation failed', {\n operation: 'someAsyncOperation',\n error: error instanceof Error ? error.message : String(error),\n })\n})\n\n// Or if truly fire-and-forget, document WHY\nsomeAsyncOperation().catch((error) => {\n // Intentionally ignored: peer notification is best-effort\n // Failure here doesn't affect local state\n logger.debug('Peer notification failed (non-critical)', { error })\n})\n```\n\n## Why This Helps Stability\n- No more 'mystery' failures\n- Debug logs show what's actually happening\n- Can identify patterns (same error 100x = real problem)\n- Distinguishes intentional ignoring from oversight","notes":"Fixed 2 silent catches in L2PSConcurrentSync.ts - now log debug messages. Remaining silent catches are intentional (gracefulShutdown).","status":"closed","priority":1,"issue_type":"task","created_at":"2026-03-07T10:34:18.391511597Z","created_by":"tcsenpai","updated_at":"2026-03-07T11:59:59.554171182Z","closed_at":"2026-03-07T10:42:35.697329881Z","close_reason":"Fixed silent catches in L2PSConcurrentSync.ts with debug logging","source_repo":".","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"node-3lh","depends_on_id":"node-214","type":"parent-child","created_at":"2026-03-07T11:59:59.553836402Z","created_by":"tcsenpai","metadata":"{}","thread_id":""}]} |