4 RTSP tests fail in series — order/isolation issue manifesting as missing Session: header #77

@hyperpolymath

Background

After PR #71 (test-isolation: unique-name pattern for app-owned GenServers) lands, 4 RTSP tests still fail when run as part of the full suite but pass individually:

  • test/burble/transport/rtsp_test.exs:210 — standard UDP Transport header stores :udp and client_port pair
  • test/burble/transport/rtsp_test.exs:221 — RTP/AVP/TCP Transport header stores :tcp_interleaved
  • test/burble/transport/rtsp_test.exs:231 — explicit 'interleaved' token stores :tcp_interleaved and extracts port
  • test/burble/transport/rtsp_test.exs:242 — absent client_port token stores nil client_port

All four are in the describe "parse_transport_header/1 (via SETUP over TCP)" block and share one failure mode: the SETUP response carries no Session: header, so sid is nil, RTSP.get_session(server, sid) returns {:error, :not_found}, and the match in assert {:ok, session} = RTSP.get_session(server, sid) fails.

Diagnosis (revised from "parser bug")

The initial diagnosis was "pre-existing parser bugs (no Session: header on unusual Transport headers)". On investigation, the SETUP handler in lib/burble/transport/rtsp.ex:785-836 unconditionally emits a Session: header — the response template is hard-coded:

response =
  "RTSP/1.0 200 OK\r\n" <>
    "Transport: #{transport_header}\r\n" <>
    "Session: #{sid}\r\n" <>
    "\r\n"

parse_transport_header/1 (lines 905-924) returns a tuple for any input — no crash path. So the missing Session: header isn't from a parser failure.

The actual root cause appears to be test-ordering / isolation:

  1. In isolation (mix test test/burble/transport/rtsp_test.exs:210): the test passes.
  2. As a series (one mix test invocation with all four file:line targets, as under Reproduction): test 1 passes, tests 2-4 fail.
  3. Full suite: all 4 fail.

This suggests that test N's teardown leaks something into test N+1's namespace, leaving the test server's listener or named-process state in a bad state when test N+1's rtsp_setup connects.
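One way to probe this hypothesis is to make teardown wait until the previously registered process is really gone before the next test proceeds. The following is a minimal sketch, not code from the repository: the TeardownSketch module, await_unregistered/2, and the :rtsp_stand_in_a name are all hypothetical, and an Agent stands in for the RTSP server.

```elixir
defmodule TeardownSketch do
  # Hypothetical helper: block until `name` is no longer registered,
  # using a monitor instead of polling Process.alive?/1.
  def await_unregistered(name, timeout \\ 1_000) do
    case Process.whereis(name) do
      nil ->
        :ok

      pid ->
        ref = Process.monitor(pid)

        receive do
          # If the process died between whereis/1 and monitor/1,
          # the :DOWN message arrives immediately with reason :noproc.
          {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
        after
          timeout ->
            Process.demonitor(ref, [:flush])
            {:error, :timeout}
        end
    end
  end
end

# Stand-in for "test N": a named process that is stopped asynchronously.
{:ok, pid} = Agent.start(fn -> :ok end, name: :rtsp_stand_in_a)
spawn(fn -> Agent.stop(pid) end)

# Stand-in for "test N+1" setup: proceed only once the name is free.
:ok = TeardownSketch.await_unregistered(:rtsp_stand_in_a)
nil = Process.whereis(:rtsp_stand_in_a)
```

If the series failure disappears with a wait like this in the test setup, that would point at the name handoff rather than at the socket.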

Likely root cause candidates

  • with_named/2 (test helper, rtsp_test.exs:364-394): unregisters/re-registers __MODULE__ on a per-test basis. If the try/after cleanup runs out of order, or if Process.alive? checks race with supervisor teardown, the next test's Process.whereis(RTSP) could see stale state.
  • No def terminate/2 callback on the RTSP GenServer — listener socket relies on OS-level close on process death; possible delay before port reuse.
  • Burble.Transport.RTSP.handle_rtsp_session connection-handler processes spawned by test N may still be running after the test server is torn down; if they call GenServer.call(__MODULE__, ...) while __MODULE__ is mid-handoff, the call exits with :noproc (it does not return an error tuple), crashing the handler before it sends the SETUP response.
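The third candidate depends on how GenServer.call behaves when the target name is not registered. The sketch below illustrates that behaviour in isolation; the :rtsp_stand_in_b name is hypothetical, and an Agent stands in for the RTSP server.

```elixir
# Hypothetical name standing in for the RTSP server's registered name.
name = :rtsp_stand_in_b

# Start a named process, then stop it to mimic test teardown.
{:ok, pid} = Agent.start(fn -> :ok end, name: name)
:ok = Agent.stop(pid)

# Calling the now-unregistered name exits the caller instead of
# returning an error tuple, so an unguarded handler would die here.
result =
  try do
    GenServer.call(name, :ping)
  catch
    :exit, {:noproc, _details} -> {:error, :noproc}
  end

IO.inspect(result)
```

A handler that dies at this point never writes its response, which would be consistent with the client seeing no Session: header.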

Reproduction

cd server
# Apply PR-B (#71) version of the test:
git checkout <pr-b-head> -- test/burble/transport/rtsp_test.exs lib/burble/transport/rtsp.ex

# In isolation — passes:
mix test test/burble/transport/rtsp_test.exs:210 --seed 0
# 1 test, 0 failures

# As a series — fails:
mix test test/burble/transport/rtsp_test.exs:210 test/burble/transport/rtsp_test.exs:221 \
         test/burble/transport/rtsp_test.exs:231 test/burble/transport/rtsp_test.exs:242 \
         --seed 0
# 4 tests, 3 failures (test 1 passes, tests 2/3/4 fail)

Suggested fix direction

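  1. Add an explicit def terminate(_reason, state) to Burble.Transport.RTSP that closes the listener socket synchronously, so port 19554 is fully released before the next test's start_supervised! runs. Note that terminate/2 is only invoked on supervisor shutdown if the GenServer traps exits, so init/1 would also need Process.flag(:trap_exit, true).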
  2. In rtsp_setup, increase the :gen_tcp.recv timeout on SETUP response from 3 s — or add an explicit retry on :noproc-like timeout.
  3. Investigate with_named/2 for race against supervisor teardown — possibly hold a monitor or use Process.flag(:trap_exit, true) in the test setup.
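Fix direction 1 could look roughly like the sketch below. It is an illustration under assumptions, not the actual Burble.Transport.RTSP code: the ListenerSketch module, its state shape, and the listen options (including reuseaddr) are hypothetical, and port 0 is used so the OS picks a free port.

```elixir
defmodule ListenerSketch do
  # Hypothetical stand-in for the RTSP server; only the listener
  # socket lifecycle is modelled.
  use GenServer

  def start_link(port), do: GenServer.start_link(__MODULE__, port)
  def port(pid), do: GenServer.call(pid, :port)

  @impl true
  def init(port) do
    # Without trapping exits, a supervisor's shutdown signal kills the
    # process directly and terminate/2 is not invoked.
    Process.flag(:trap_exit, true)
    {:ok, socket} = :gen_tcp.listen(port, [:binary, active: false, reuseaddr: true])
    {:ok, %{socket: socket}}
  end

  @impl true
  def handle_call(:port, _from, state) do
    {:ok, port} = :inet.port(state.socket)
    {:reply, port, state}
  end

  @impl true
  def terminate(_reason, %{socket: socket}) do
    # Close the listener explicitly before the process exits.
    :gen_tcp.close(socket)
    :ok
  end
end

{:ok, pid} = ListenerSketch.start_link(0)
port = ListenerSketch.port(pid)

# GenServer.stop/1 returns only after the process has exited,
# which is after terminate/2 has run.
:ok = GenServer.stop(pid)

# Rebinding the same port right away succeeds in this sketch.
{:ok, again} = :gen_tcp.listen(port, [:binary, active: false, reuseaddr: true])
:ok = :gen_tcp.close(again)
```

This only shows the shape of the callback. Whether it removes the failures depends on whether the listener socket is actually what leaks between tests.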

Context

Filed after PR #72 (governance baseline rot) merged. PR #71 (unique-name pattern, Buckets A+C) is still open as of writing — once it merges, these 4 will be the named residue per its commit message.

Refs #62 (test-isolation campaign) and PR #71 (Buckets A+C).
