Skip to content

Add in-memory game state tracking and WebSocket broadcast on game change#4

Merged
kzryzstof merged 39 commits into
mainfrom
copilot/add-game-info-in-memory
Apr 6, 2026
Merged

Add in-memory game state tracking and WebSocket broadcast on game change#4
kzryzstof merged 39 commits into
mainfrom
copilot/add-game-info-in-memory

Conversation

Copy link
Copy Markdown

Copilot AI commented Mar 21, 2026

RetroArch had no module to track the currently running game in memory, and the WebSocket server had no mechanism to push game state to clients on connect or game change.

New: network/game_state.c / network/game_state.h

Standalone, thread-safe module storing the active game. Exposes a simple API around a single ra_game_state_t record:

typedef struct {
    char game_id      [64];    /* CRC-32 hex string */
    char game_name    [512];   /* ROM basename without extension */
    char game_path    [4096];  /* Full filesystem path */
    char console_id   [64];    /* e.g. "snes" */
    char console_name [256];   /* e.g. "Super Nintendo Entertainment System" */
    char core_name    [256];
    char db_name      [512];   /* Playlist/database name */
} ra_game_state_t;

game_state_to_json() serialises state as:

  • {"type":"game_playing","game_id":"...","game_name":"...",...} when a game is running
  • {"type":"no_game"} otherwise

WebSocket server (network/ws_server.c)

  • New client connects (LWS_CALLBACK_ESTABLISHED): immediately sends current game state via lws_callback_on_writable.
  • Broadcast (ws_server_notify_game_changed()): thread-safe — sets g_broadcast_pending under the existing lock, wakes the service loop with lws_cancel_service(). The service thread picks up the flag and calls lws_callback_on_writable_all_protocol(), keeping all libwebsockets I/O on the service thread. The background thread model is unchanged.
  • Removed the echo demonstration; LWS_CALLBACK_SERVER_WRITEABLE now writes the current game state JSON.

Hook points

  • Game loaded (tasks/task_content.c): after core_load_game() succeeds, populates ra_game_state_t from content_file_info, core_info_t, and rarch_system_info_t, then calls game_state_set() + ws_server_notify_game_changed().
  • Game unloaded (retroarch.c, CMD_EVENT_UNLOAD_CORE): calls game_state_clear() + ws_server_notify_game_changed() so clients receive {"type":"no_game"}.
  • Lifecycle: game_state_init() / game_state_deinit() called alongside ws_server_init() / ws_server_destroy().

Build

  • network/game_state.o added to Makefile.common under the HAVE_WEBSOCKET_SERVER block.
  • network/game_state.c added to griffin/griffin.c unity build.
Original prompt

I want to a have a new .c file that will keep in memory the current game being played in retro-arch. I want the ID, the name, the console Id, the console name at a minimum. If there is more information available about the game, I want it as well.

Now I want the new ws-server that we added to send this information whenever a new client connects the server. All the existing clients must be notified of a new game being played. Another message must be sent if no game is being played.

Keep the server running on its own thread; do not change it.


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI and others added 2 commits March 21, 2026 11:11
Copilot AI changed the title [WIP] Add game information storage and notification system Add in-memory game state tracking and WebSocket broadcast on game change Mar 21, 2026
Copilot AI requested a review from kzryzstof March 21, 2026 11:13
@kzryzstof kzryzstof marked this pull request as ready for review April 4, 2026 20:14
@kzryzstof kzryzstof marked this pull request as draft April 5, 2026 13:18
@kzryzstof kzryzstof closed this Apr 6, 2026
@kzryzstof kzryzstof reopened this Apr 6, 2026
# Conflicts:
#	.github/workflows/Linux.yml
#	.github/workflows/MacOS.yml
#	.github/workflows/Windows-ARM64.yml
#	.github/workflows/Windows-x64-MXE.yml
@kzryzstof kzryzstof marked this pull request as ready for review April 6, 2026 14:41
@kzryzstof kzryzstof merged commit 8482138 into main Apr 6, 2026
0 of 4 checks passed
@kzryzstof kzryzstof deleted the copilot/add-game-info-in-memory branch April 11, 2026 12:10
kzryzstof pushed a commit that referenced this pull request Apr 26, 2026
Four real bugs in vfs_implementation_cdrom.c found during the
follow-up audit after the main vfs_implementation.c fixes.
Same bug class as the main vfs mmap read overflow already fixed.

vfs_cdrom: cuesheet read byte_pos wrap OOB
  In retro_vfs_file_read_cdrom cuesheet branch:

    if ((int64_t)len >= (int64_t)stream->cdrom.cue_len
          - stream->cdrom.byte_pos)
       len = stream->cdrom.cue_len - stream->cdrom.byte_pos - 1;
    memcpy(s, stream->cdrom.cue_buf + stream->cdrom.byte_pos, len);

  cue_len is size_t, byte_pos is int64_t.  When byte_pos equals
  or exceeds cue_len (reachable via seek, which does NOT clamp
  byte_pos), the subtraction "cue_len - byte_pos - 1" executes as
  unsigned size_t and wraps to SIZE_MAX.  The memcpy then reads
  SIZE_MAX bytes off the end of cue_buf.

  Similarly a negative byte_pos (reachable by SEEK_CUR with a
  negative offset when byte_pos was near zero) becomes a huge
  size_t after the cast and wraps the same subtraction.

  Reproduced under ASan:
    "negative-size-param: (size=-1)" in memcpy at
    vfs_implementation_cdrom.c:401

  Fix: guard byte_pos against being outside [0, cue_len) and
  clamp len against the remaining bytes using unsigned subtraction
  against "remaining = cue_len - byte_pos".

vfs_cdrom: bin read byte_pos+len wrap OOB
  Same bug class in the .bin branch:

    if (stream->cdrom.byte_pos + len >
          vfs_cdrom_toc.track[stream->cdrom.cur_track - 1].track_bytes)
       len -= (stream->cdrom.byte_pos + len)
            - vfs_cdrom_toc.track[stream->cdrom.cur_track - 1].track_bytes;

  Attacker-controlled len near UINT64_MAX wraps "byte_pos + len"
  past track_bytes, bypassing the bound check; the downstream
  cdrom_read() is then called with the unclamped len.

  Fix: clamp against "remaining = track_bytes - byte_pos" with
  unsigned subtraction, never adding byte_pos and len.

vfs_cdrom: cur_track == 0 indexes track[-1]
  If a user-controlled VFS path is crafted as
  "drive1-track00.bin" (or the Windows-style
  "d:/drive-track00.bin"), the two-digit parser produces
  cur_track = 0.  The read code then does
  "track[cur_track - 1]" = track[-1], an out-of-bounds read into
  the struct preceding vfs_cdrom_toc.track[].

  Fix: the parser rejects values outside [1, 99] and leaves the
  default cur_track = 1 in place on reject.  The read paths also
  defensively reject cur_track == 0 before indexing, so a
  malicious caller that sets cur_track through another vector
  still cannot trip the OOB.

vfs_cdrom: guard cur_track > 99 in bin read
  TOC array is track[99] (indices 0..98).  cur_track is
  unsigned char so mathematically can reach 255.  Defensive
  check rejects values above 99 before indexing.

Reachability: all four bugs are reachable by a core or frontend
that has VFS access to a CD-ROM-scheme stream (cdrom://...).
Bugs 1 and 2 escalate a weak seek primitive (byte_pos not
clamped) into arbitrary OOB reads.

VC6 compatibility: the whole file is gated by HAVE_CDROM, which
only the Linux and Win32 (non-Xbox) branches set.  VC6 is a Win32
target, so its branch compiles.  The fix uses only size_t,
uint64_t, and standard comparisons -- all VC6-compatible via the
existing stdint shim.

Regression test: libretro-common/samples/file/vfs_cdrom/
  True discriminator for bug #1.  Five subtests covering
  byte_pos==cue_len, byte_pos>cue_len, byte_pos<0, huge len, and
  a baseline success.  On unpatched code:
    SEGV on byte_pos==cue_len (plain run)
    ASan: "negative-size-param: (size=-1)" in memcpy
  On patched code:
    all five subtests pass cleanly; ASan clean; exit 0

  Bug #2 (bin read) and bugs #3/#4 (cur_track parsing) don't get
  a targeted regression test.  Bug #2 would require linking a
  full cdrom_toc setup which is platform-dependent; bugs #3/#4
  are guarded by the parser fix so the failing input cannot reach
  the OOB path, and writing a white-box test of the parser is
  low value once the upstream fix is in.

CI: libretro-common samples workflow updated.  Full dry-run
under the GHA shell contract:
  Built: 14  Ran: 13  Failed: 0
kzryzstof pushed a commit that referenced this pull request Apr 26, 2026
* Initial skeleton implementation of webp

* Fixed in this session:

Token tree structure rewritten to match libvpx's GetCoeffs() exactly (the old linear chain was wrong — VP8 uses a grouped binary tree)
vp8_bands[] array extended to 17 entries with sentinel (prevented out-of-bounds at n=16)
Coefficient types 1 and 3 swapped (Y2 uses type 1, B_PRED uses type 3 — matching libvpx)
Default coefficient probability table replaced with exact values from libvpx default_coef_probs.h
B_PRED sub-block mode parsing + 4x4 intra prediction implemented
Non-zero coefficient context tracking added
Bool decoder verified bit-exact against libvpx reference (500 consecutive reads match perfectly)

The remaining color issue:
The decoder only consumes 1,497 of 26,716 bytes (5.6%) from the token partition. The bool decoder, probability tables, update loop, and tree structure are all verified correct. The remaining 94.4% of unused data is high-entropy (7.99 bits/byte) — real coefficient data that's never being read. The most likely cause is that the macroblock mode decode from the first partition is desyncing for B_PRED macroblocks, causing the token partition to skip or misread blocks. When a B_PRED MB's 16 sub-block modes aren't consumed correctly from the first partition, the subsequent mode reads for following MBs drift, and the token partition's block-type selection (Y2 vs no-Y2) becomes wrong, compounding the desync.

* Update rwebp

* Update rwebp

* Critical Bug Found and Fixed: Missing EOB Check After Zero Coefficients
The vp8_decode_block function had a structural bug: it did not check for EOB after zero coefficients. In the VP8 coefficient decode tree, EOB must be checked at EVERY coefficient position — not just at the start and after non-zero values. The old code went directly from a zero coefficient to the next position's zero/nonzero check, skipping the EOB check entirely.
Fix applied: Restructured vp8_decode_block to check EOB (p[0]) at the TOP of every iteration, before the zero/nonzero decision (p[1]). Also fixed zigzag indexing and return value semantics.
Result: Mean RGB diff improved from 95.6 → 79.1 (17% improvement).
Remaining Issue: Token consumption still too low
Even after the fix, MB 0 consumes only 6 bytes while libvpx consumes 14 bytes. The remaining gap likely comes from:

Missing nonzero context propagation in the standalone test — the test passes ctx=0 for all blocks, but blocks after non-zero ones should get ctx=1 or ctx=2, producing more non-zero coefficients
Possible remaining issue in how the probability context row (0/1/2) is selected after zero vs non-zero coefficients — needs careful comparison against libvpx's tree traversal

All Fixes in Current Output File

IWHT >>3 normalization
DC from IWHT × y1_dc_q
pred16/pred8 DC edge cases
Per-segment loop filter with simple/normal modes
NEW: Corrected decode_block with EOB check at every position

* Current Status
The fix alone makes the image quality worse (87.1 → 98.5 mean diff) because skip_enabled is now correctly parsed as 1, causing skip bits to be read per-MB. The skip bit read position in the MB parsing loop needs to match the VP8 spec order (before y_mode, per RFC 6386 §19.3), and potentially other MB-level parsing adjustments are needed to fully benefit from the correct probability tables.
Key Remaining Issues

The skip_coeff bit position in the per-MB parse loop (currently after uv_mode, should be before y_mode per RFC 6386 §11.1)
The decode_block tree structure (whether EOB is checked after zero coefficients — libvpx does NOT check, but the restructured version gives empirically better results)
Additional header parse issues that may exist between the original code and the VP8 spec

* Summary of All Bugs Found
1. Missing refresh_entropy_probs bit (CRITICAL — VERIFIED)
Location: After quantizer deltas, before coefficient probability updates
Fix: Added (void)vp8b_bit(&br);
Impact: Without this bit, the CUP reads were shifted by 1 bit, producing 8 spurious probability updates instead of the correct 93. Verified: all 1056 coefficient probabilities now match libvpx exactly.
2. Missing prob_skip_false byte (CRITICAL — VERIFIED)
Location: After skip_enabled flag
Fix: When skip_enabled=1, read prob_skip = vp8b_lit(&br, 8) and use it (instead of prob=128) for per-MB skip bool reads.
Impact: Without this 8-bit probability, ALL subsequent first-partition parsing was shifted by 8 bits. With it, MB 0 y_mode and uv_mode parse correctly.
3. skip_coeff bit position (PARTIALLY RESOLVED)
The VP8 spec says skip should be read after segment_id and before y_mode. Our code reads it after uv_mode. Both positions produce wrong segment IDs, suggesting there may be one more alignment issue in the per-MB bitstream parsing. The "after uvm" position currently gives the best empirical results (mean diff 91.0).
Current Output File State

refresh_entropy_probs fix applied ✓
prob_skip_false fix applied ✓
skip_coeff position: after uv_mode (gives best results currently)
Coefficient probability tables: match libvpx exactly (93 correct updates)
Mean RGB diff: 91.0 (improved from the broken 98.5, but still worse than the original's accidental 87.1)

* Major Progress! Pixel (0,0) is now nearly perfect!
Bug #3 Fixed: Keyframe y_mode tree order was WRONG
The VP8 keyframe y_mode decode tree puts B_PRED first (bit=0 → B_PRED), but rwebp.c had DC_PRED first. From RFC 6386 §11.2, the correct keyframe y_mode tree order is:

B_PRED (mode 4) — first branch, prob=145
DC_PRED (mode 0) — second branch, prob=156
V_PRED (mode 1) — third branch, prob=163
H_PRED (mode 2) — fourth branch, prob=128
TM_PRED (mode 3) — default

This was swapped in the original code, causing ~57% of MBs to get the wrong prediction mode. The fix changed:

if (!get(145)) ym = 0 → if (!get(145)) ym = 4
And reordered all subsequent branches

Results

Pixel (0,0): [119,180,243] vs reference [118,181,241] — within 2 values!
Mean RGB diff: 80.6 (improved from 91.0)
Pixels within ±30: 3.7% (improved from 1.5%)

All Bugs Fixed So Far

Missing refresh_entropy_probs bit — 1 bit after quantizer deltas
Missing prob_skip_false byte — 8 bits when skip_enabled=1
Wrong keyframe y_mode tree order — B_PRED should be first branch, not last

* Bug #4 Fixed: Wrong Segment Tree Structure
The segment_id decode tree was using a binary tree (prob[0] splits into left={0,1} and right={2,3} subtrees), but the VP8 spec (RFC 6386 §10.2) defines it as a linear tree:
Wrong (binary):
prob[0]: bit=0 → {prob[1]: 0 or 1}, bit=1 → {prob[2]: 2 or 3}
Correct (linear, per RFC 6386):
prob[0]: bit=0 → seg 0
prob[1]: bit=0 → seg 1
prob[2]: bit=0 → seg 2, bit=1 → seg 3
This is the exact same class of bug as the y_mode tree (Bug #3). The linear tree means segment 0 is the "most likely" segment (first branch), while the old binary tree split the probability space differently.
This fix should resolve the cascading segment_id error that caused MB 0 to parse as seg=0 instead of seg=3, which then shifted all subsequent MB mode parsing and caused only 1784/3574 FP bytes to be consumed.
All 4 Bugs Fixed

* All Bugs Fixed — Summary
5 Bugs Found and Fixed
Results

Original: Mean diff 87.1, pixel (0,0) = [169,169,169] (uniform gray)
Final: Mean diff 80.4, pixel (0,0) = [118,181,241] (exact match with reference [118,181,241])
Coefficient probabilities verified bit-exact with libvpx (all 1056 entries match)
Bool decoder verified bit-exact between 16-bit and 64-bit implementations (0 divergences over 1920+ reads)

* Bug #5 Fixed: Missing vp8_decode_mode_mvs initialization reads
Root cause discovered: libvpx's vp8_decode_mode_mvs() function reads 1 flag bit + 8-bit literal value = 9 bits from the first partition bool decoder at the very start, BEFORE the per-MB mode loop. These bits correspond to a mode/MV probability initialization structure. Our code was not consuming these bits, causing all subsequent per-MB mode reads to be shifted.
All 5 Bugs Fixed
Result: Mean RGB diff 75.3 (from original 87.1, previous best 80.4)

* Bug #5 Fixed: Segment Tree Structure (Linear → Balanced)
The segment tree was implemented as a linear chain (seg0 → seg1 → seg2 → seg3) but VP8 uses a balanced binary tree:
        prob[0]
       /       \
   prob[1]    prob[2]
   /    \     /    \
 seg0  seg1  seg2  seg3
Confirmed by disassembling vp8_decode_mode_mvs(): after a bit=0 result at prob[0], the code loads prob[1] (pbi+0xf85) for the next read. After bit=1, it loads prob[2] (pbi+0xf86). This is the balanced tree structure, not the linear chain our code used.
All 5 Bugs Fixed
Result: MB(0,0) mean diff 3.6 (was 6.8), overall 80.6.
The remaining ~80 mean diff is caused by cascading BD state divergence through 805 MBs. The skip position (after uvm vs after seg) accounts for most of this — the correct position (after seg) gives worse overall results due to error amplification, even though it's structurally correct.

* Session Summary — All Bugs Fixed
Bugs Found and Fixed (8 total)
Key Discovery: Wrong struct offset
The segment_id field is at MODE_INFO offset 11 (not 16 as assumed throughout the investigation). libvpx actually produces seg=0 for MB0, matching our decoder. The "seg=3" measurement was a red herring from reading the wrong struct field.
Current State

MB0 sub-block modes: 12/16 match libvpx (was 7/16 before)
MB(0,0) mean diff: 3.6 (near-perfect)
Overall mean diff: 100.6 (high due to cascading from 4 remaining bmi mismatches)
The remaining bmi[7] divergence at identical BD states is the last unexplained issue

* Add missing CoreFoundation header include

* Add missing TargetConditionals.h includes
kzryzstof pushed a commit that referenced this pull request Apr 29, 2026
Found by the new ASan+UBSan CI workflow's headless SDL2
imageviewer smoke (b9777c8 + d967813).  Run #2 of the smoke
(after the v12 xvideo->sdl2 pivot) caught a SEGV at process
startup:

   AddressSanitizer:DEADLYSIGNAL
   ==9537==ERROR: AddressSanitizer: SEGV on unknown address
   0x000000000968.  The signal is caused by a READ memory
   access.

   #0 XQueryExtension      (libX11.so.6+0x361f6)
   #1 XInitExtension       (libX11.so.6)
   #2 XextAddDisplay       (libXext.so.6)
   #3 XScreenSaverQueryExtension (libXss.so.1+0x1565)
   #4 x11_suspend_screensaver
   #5 video_driver_init_internal
   libretro#6 drivers_init
   libretro#7 retroarch_main_init

The address 0x968 is a small offset off NULL -- libX11's
XQueryExtension dereferencing the Display * argument before
checking it for null.

Root cause:
gfx/common/x11_common.c::xss_screensaver_inhibit() was called
via x11_suspend_screensaver() with g_x11_dpy == NULL.  The flow:

  1. SDL2 video driver inits.
  2. sdl2_set_handles() in gfx/common/sdl2_common.c:54 calls
     video_driver_display_type_set(RARCH_DISPLAY_X11) so the
     display *type* gate matches X11, and routes the SDL-owned
     X Display through video_driver_display_set() at line 55.
  3. The file-scope `g_x11_dpy` in x11_common.c:66 stays at its
     initial NULL value.  It's only assigned at line 795 inside
     XOpenDisplay() in the xvideo / GL / X11-direct init paths,
     which the SDL2 driver doesn't take.
  4. drivers_init() -> video_driver_init_internal() ->
     x11_suspend_screensaver() -- line 286 gates on display_type
     == RARCH_DISPLAY_X11 (passes; SDL2 set it), then at lines
     289-292 tries dbus_suspend_screensaver() if HAVE_DBUS, then
     at line 293 calls xss_screensaver_inhibit(g_x11_dpy,
     enable) -- with g_x11_dpy == NULL.
  5. xss_screensaver_inhibit() at line 227 calls
     XScreenSaverQueryExtension(NULL, ...) -> libX11
     dereferences -> SEGV.

Why hasn't this been hit before?
Most desktop builds set HAVE_DBUS -- pkg-config finds dbus-1 on
basically every modern Linux distro -- so the
dbus_suspend_screensaver() call at line 290 short-circuits and
returns true before the xss path is reached.  The CI runner
doesn't apt-install libdbus-1-dev (HAVE_DBUS off) but libxss1 is
pulled in transitively by xvfb / x11-utils (HAVE_XSCRNSAVER on
at link time), exposing the corner.

A symmetric defensive check already exists at line 255 in
xdg_screensaver_inhibit():

   if (g_x11_dpy && g_x11_win)
      ...

The xss path was the missing-guard one.

* gfx/common/x11_common.c

  Add a `if (!dpy) return false;` guard at the top of
  xss_screensaver_inhibit().  Applies only to the
  HAVE_XSCRNSAVER branch; the no-op stub at line 237 already
  discards `dpy` with `(void) dpy;` and returns false.

  Inline comment explaining the failure mode points at the
  relevant call sites in sdl2_common.c so a future maintainer
  can trace the invariant without re-deriving it.

Verified locally:
  - The added check returns false for NULL dpy without
    invoking libX11 (matches the existing "extension not
    present" return contract).  Non-NULL dpy paths are
    unchanged.
  - xss_screensaver_inhibit() is `static`, single caller (line
    311 in x11_suspend_screensaver) -- no other reachable code
    paths to audit.

Cannot verify locally that this resolves the CI SEGV without
rebuilding retroarch with the same apt set the runner has --
the next CI run is the experiment.  Expected outcome:
xss_screensaver_inhibit() returns false harmlessly, the
function falls through to the xdg_screensaver path (which has
its own NULL guard), and the SDL2 smoke proceeds into
retro_run.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants