| 1 | +Here are the main issues / foot-guns I see in that snippet, grouped by category. |
| 2 | + |
| 3 | +1) Input validation & type safety problems |
| 4 | + |
| 5 | +Blindly trusting $body[...] shapes. |
| 6 | +participant_attributes, participant_metadata, room_config are assumed to be the correct types. If a client sends "participant_attributes": "lol" you’ll pass a string into setAttributes() and may get a runtime error or (worse) unexpected serialization. |
| 7 | + |
| 8 | +Fix: explicitly validate types: |
| 9 | + |
| 10 | +participant_identity, participant_name, room_name → strings, non-empty, length capped |
| 11 | + |
| 12 | +participant_metadata → string (or JSON string, depending on SDK expectation) |
| 13 | + |
| 14 | +participant_attributes → associative array of strings |
| 15 | + |
| 16 | +room_config → array / specific schema expected by SDK |
| 17 | + |
| 18 | +!empty() is the wrong check for some fields. |
| 19 | +empty() treats "0", 0, false, [] as empty. If someone intentionally sets metadata to "0" you’ll skip it. |
| 20 | + |
| 21 | +Fix: use array_key_exists() / isset() + type checks instead. |
| 22 | + |
| 23 | +No bounds on identity/name/metadata sizes. |
| 24 | +A client can send megabytes of metadata/attributes and you’ll happily embed it into a JWT → big CPU + big response + possible gateway/proxy issues. |
| 25 | + |
| 26 | +Fix: enforce max lengths (identity/name/metadata) and max attribute count/size. |
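The checks in this section can be sketched as one validation function. The endpoint under review is PHP; this sketch uses TypeScript only to show the logic, and the same checks map onto is_array(), is_string(), array_key_exists() and strlen(). The limits and the names `validateTokenRequest` and `isPlainObject` are hypothetical, not part of any SDK. Identity and name are deliberately left out, since they should come from the authenticated user rather than the request.

```typescript
// Hypothetical caps; pick values that suit your deployment.
const MAX_ROOM_NAME_LEN = 128;
const MAX_METADATA_LEN = 4096;
const MAX_ATTRIBUTES = 32;

interface TokenRequest {
  roomName: string;
  metadata: string | undefined;
  attributes: Record<string, string>;
}

type ValidationResult =
  | { ok: true; value: TokenRequest }
  | { ok: false; error: string };

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateTokenRequest(body: unknown): ValidationResult {
  if (!isPlainObject(body)) {
    return { ok: false, error: "body must be a JSON object" };
  }

  const roomName = body["room_name"];
  if (typeof roomName !== "string" || roomName.length === 0 || roomName.length > MAX_ROOM_NAME_LEN) {
    return { ok: false, error: "room_name must be a non-empty string within the length cap" };
  }

  // Presence check rather than a truthiness check: "0" is a legitimate value.
  let metadata: string | undefined = undefined;
  if ("participant_metadata" in body) {
    const m = body["participant_metadata"];
    if (typeof m !== "string" || m.length > MAX_METADATA_LEN) {
      return { ok: false, error: "participant_metadata must be a string within the length cap" };
    }
    metadata = m;
  }

  const attributes: Record<string, string> = {};
  if ("participant_attributes" in body) {
    const a = body["participant_attributes"];
    if (!isPlainObject(a) || Object.keys(a).length > MAX_ATTRIBUTES) {
      return { ok: false, error: "participant_attributes must be an object within the size cap" };
    }
    for (const key of Object.keys(a)) {
      const v = a[key];
      if (typeof v !== "string") {
        return { ok: false, error: "attribute values must be strings" };
      }
      attributes[key] = v;
    }
  }

  return { ok: true, value: { roomName, metadata, attributes } };
}
```

With this shape, a request such as {"participant_attributes": "lol"} is rejected before it reaches the SDK, while metadata of "0" is preserved.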
| 27 | + |
| 28 | +2) Security / abuse concerns |
| 29 | + |
| 30 | +Unauthenticated token minting endpoint (likely). |
| 31 | +If this is exposed publicly without auth/rate limiting, anyone can mint tokens and join any room name they choose (including “admin-ish” room names), and they can set arbitrary identity/name/metadata/attributes. |
| 32 | + |
| 33 | +Fix: require auth (session cookie, API key, JWT from your app, etc.) + rate limit + allowlist/validate room names and identities. |
| 34 | + |
| 35 | +Identity spoofing. |
| 36 | +Because identity comes from the request body, a malicious client can claim to be another user (participant_identity: "alice"). |
| 37 | + |
| 38 | +Fix: identity/name should come from your authenticated user context, not from client input. |
| 39 | + |
| 40 | +Room name injection / namespace collisions. |
| 41 | +Letting clients pick arbitrary room_name can cause collisions or unauthorized access patterns. |
| 42 | + |
| 43 | +Fix: server decides the room or validates it against what the authenticated user is allowed to join. |
| 44 | + |
| 45 | +3) Error handling & operational problems |
| 46 | + |
| 47 | +Missing checks for env vars. |
| 48 | +If LIVEKIT_API_KEY, LIVEKIT_API_SECRET, or LIVEKIT_URL are missing, you’ll mint invalid tokens or return bad data without a clear error. |
| 49 | + |
| 50 | +Fix: validate envs and return 500 with a clear message (don’t leak secrets). |
| 51 | + |
| 52 | +No try/catch around SDK calls. |
| 53 | +->toJwt() and some setters can throw. As-is, you may return HTML/500 with no JSON body. |
| 54 | + |
| 55 | +Fix: wrap token generation in try { ... } catch (\Throwable $e) { ... }. |
| 56 | + |
| 57 | +No response headers. |
| 58 | +You’re returning JSON but not setting Content-Type: application/json. |
| 59 | + |
| 60 | +Fix: header('Content-Type: application/json'); (and ideally charset). |
| 61 | + |
| 62 | +json_decode without checking for non-object JSON. |
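| 63 | +If the request body is valid JSON but not an object (e.g. "hi" or 42), or is not valid JSON at all (json_decode() then returns null), $body['room_name'] will emit warnings or, depending on the PHP version, throw, because $body isn’t an array. Note that [] decodes to an empty array and passes an is_array() check, so required keys still need their own checks. |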
| 64 | + |
| 65 | +Fix: after decode, ensure is_array($body). |
| 66 | + |
| 67 | +4) Grants / permissions clarity |
| 68 | + |
| 69 | +Grant is “join room” only, but still potentially too broad. |
| 70 | +setRoomJoin() allows joining the room; depending on defaults, users can publish/subscribe. That may be fine, but many apps want finer control. |
| 71 | + |
| 72 | +Fix: set explicit publish/subscribe/record/admin permissions if your SDK supports it and your use case needs it. |
| 73 | + |
| 74 | +RoomConfig trust boundary. |
| 75 | +If room_config can influence server-side room behavior, accepting it from clients is risky unless you strictly validate/allowlist. |
| 76 | + |
| 77 | +Fix: server-owned config only, or validate against a safe subset. |
| 78 | + |
| 79 | +5) Minor correctness / style nits |
| 80 | + |
| 81 | +Port 3000 comment is meaningless in PHP. |
| 82 | +PHP typically runs behind a web server or built-in server; the code can’t “make sure it’s on port 3000.” This is likely copy/paste from a Node quickstart. |
| 83 | + |
| 84 | +Potential JWT bloat. |
| 85 | +Attributes/metadata/config all contribute to token size. Large JWTs can exceed header limits if later used in headers, or simply become unwieldy. |
| 86 | + |
| 87 | +A “safer skeleton” checklist (no big rewrite) |
| 88 | + |
| 89 | +If you want quick improvements without changing the structure: |
| 90 | + |
| 91 | +Set headers: Content-Type: application/json |
| 92 | + |
| 93 | +Ensure $body is an array: if (!is_array($body)) ... |
| 94 | + |
| 95 | +Validate env vars are present |
| 96 | + |
| 97 | +Validate/normalize inputs (string lengths, allowed characters, attribute schema) |
| 98 | + |
| 99 | +Don’t take identity/name from the client; derive from auth |
| 100 | + |
| 101 | +Wrap minting in try/catch and return JSON errors |
| 102 | + |
| 103 | +Add auth + rate limiting |
| 104 | + |
| 105 | +If you tell me what LiveKit PHP SDK version you’re using and what types it expects for setMetadata, setAttributes, and setRoomConfig, I can point out the exact type mismatches and suggest concrete validation rules. |
| 106 | + |
| 107 | + |
| 108 | +Yes — an intentional reconnect (especially if it wasn’t fully handled / “reconnected” wasn’t applied cleanly) can explain exactly that pattern: B is publishing audio, C hears it, A doesn’t; A still sees B’s video; everyone else is fine. |
| 109 | + |
| 110 | +Here are the most common mechanisms that produce that “A can’t hear B, but everything else works” symptom, and how they relate to reconnect / missing handling. |
| 111 | + |
| 112 | +1) A’s receiver-side subscription for B’s audio got dropped or stuck |
| 113 | + |
| 114 | +After reconnect/resume, the SDK often has to re-sync: |
| 115 | + |
| 116 | +which tracks A is subscribed to, |
| 117 | + |
| 118 | +which track IDs/SIDs are current, |
| 119 | + |
| 120 | +and the receiver pipeline for each track. |
| 121 | + |
| 122 | +If the reconnect path misses “re-apply subscriptions” (or misses the audio subset), you can get: |
| 123 | + |
| 124 | +B’s video subscribed correctly (so A sees B), |
| 125 | + |
| 126 | +B’s audio not subscribed / not attached / not resumed (so A hears nothing), |
| 127 | + |
| 128 | +while C successfully re-subscribed (so C hears B). |
| 129 | + |
| 130 | +What you’d see in logs (often on A’s side, not B’s): |
| 131 | + |
| 132 | +track subscribed/unsubscribed events for B audio missing |
| 133 | + |
| 134 | +“muted”/“enabled=false”/“track not attached” for audio only |
| 135 | + |
| 136 | +receiver stats: video inbound bytes increasing; audio inbound bytes ~0 |
| 137 | + |
| 138 | +2) A is receiving B’s audio RTP, but decrypt/MLS state is wrong for that one stream |
| 139 | + |
| 140 | +If you’re using end-to-end encryption / MLS, a reconnect/desync can produce a selective decrypt failure: |
| 141 | + |
| 142 | +video might decrypt (different key usage / timing / SSRC mapping / separate sender keys) |
| 143 | + |
| 144 | +audio might fail decrypt (or fail key lookup) → silence |
| 145 | + |
| 146 | +other participants still fine (they have correct epoch/keys) |
| 147 | + |
| 148 | +This matches “C hears B, A doesn’t” because only A is out of sync. |
| 149 | + |
| 150 | +What you’d see: |
| 151 | + |
| 152 | +on A: “cannot decrypt frame”, “unknown key”, “epoch mismatch”, “discarding packet” for audio SSRC |
| 153 | + |
| 154 | +on B: usually nothing (B is just sending) |
| 155 | + |
| 156 | +on C: normal decrypt / no errors |
| 157 | + |
| 158 | +3) Track identity changed across reconnect and A is still bound to the old audio track |
| 159 | + |
| 160 | +An intentional reconnect can result in: |
| 161 | + |
| 162 | +B’s audio track being republished (new track SID / new transceiver / new SSRC), |
| 163 | + |
| 164 | +but A’s app logic or state machine still pointing at the old one. |
| 165 | + |
| 166 | +Result: |
| 167 | + |
| 168 | +UI shows B present + video (new video track handled) |
| 169 | + |
| 170 | +audio element for B is still bound to the old track (or never attached) |
| 171 | + |
| 172 | +C happened to bind to the new track |
| 173 | + |
| 174 | +Clues: |
| 175 | + |
| 176 | +two different audio track SIDs for B around the reconnect |
| 177 | + |
| 178 | +“unpublished old audio track” followed by “published new audio track” |
| 179 | + |
| 180 | +A never logs “subscribed to new audio track” |
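A minimal sketch of how an app could detect this stale-binding case on A, assuming the app keeps its own record of which track SID each participant's audio element is bound to. The names `AudioPublicationInfo` and `findStaleAudioBindings` are hypothetical, and the SID strings in the usage example are made up.

```typescript
// One entry per remote audio publication that is currently announced.
interface AudioPublicationInfo {
  identity: string;
  trackSid: string;
}

// bound: identity -> track SID the app last attached to an audio element.
// Returns identities whose binding is missing or points at an older SID.
function findStaleAudioBindings(
  bound: Record<string, string>,
  current: AudioPublicationInfo[],
): string[] {
  const stale: string[] = [];
  for (const pub of current) {
    if (bound[pub.identity] !== pub.trackSid) {
      stale.push(pub.identity);
    }
  }
  return stale;
}

// Example: B republished its audio after a reconnect, C did not.
const staleIdentities = findStaleAudioBindings(
  { B: "TR_old", C: "TR_c" },
  [
    { identity: "B", trackSid: "TR_new" },
    { identity: "C", trackSid: "TR_c" },
  ],
);
```

Any identity returned here needs its audio re-attached against the current publication.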
| 181 | + |
| 182 | +4) Audio receiver exists but is muted/disabled only on A due to state not re-applied |
| 183 | + |
| 184 | +During reconnect, some SDKs re-create track objects; app code that manages mute/unmute can accidentally: |
| 185 | + |
| 186 | +keep B’s audio track muted on A, |
| 187 | + |
| 188 | +while video remains enabled. |
| 189 | + |
| 190 | +Clues: |
| 191 | + |
| 192 | +on A: audio track enabled=false, muted=true, or volume set to 0 |
| 193 | + |
| 194 | +no inbound audio level, but subscription exists |
| 195 | + |
| 196 | +Why “B-side logs only” often can’t prove this |
| 197 | + |
| 198 | +Because the failure is usually on A’s receiving path: |
| 199 | + |
| 200 | +B is publishing and C hears it → B’s publisher is probably fine. |
| 201 | + |
| 202 | +The divergence is: A’s subscription/decrypt/attach path is broken. |
| 203 | + |
| 204 | +So B’s logs might show reconnect happened, but to explain “A can’t hear B” you almost always need: |
| 205 | + |
| 206 | +A-side track events + inbound-rtp stats + (if E2EE) decrypt logs. |
| 207 | + |
| 208 | +What to check quickly (high signal) |
| 209 | + |
| 210 | +If you can grab these, it will pinpoint which bucket you’re in: |
| 211 | + |
| 212 | +On participant A (receiver) |
| 213 | + |
| 214 | +Did A ever log Subscribed to B’s audio track (track SID)? |
| 215 | + |
| 216 | +Are inbound RTP audio bytes/packets increasing for B’s audio SSRC? |
| 217 | + |
| 218 | +Any decrypt/MLS errors tied to audio SSRC or key epoch? |
| 219 | + |
| 220 | +Is the audio track muted/disabled or not attached to an audio element? |
| 221 | + |
| 222 | +On participant B (publisher) |
| 223 | + |
| 224 | +Did B’s audio track get republished (new track SID) around reconnect? |
| 225 | + |
| 226 | +Any “audio track ended/restarted” events? |
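For the inbound RTP check on A, one possible approach is to take two stats snapshots a few seconds apart and flag audio SSRCs whose byte counters did not move. The sketch below works on plain objects carrying the standard WebRTC `inbound-rtp` fields (`type`, `kind`, `ssrc`, `bytesReceived`). How you obtain the stats entries from your SDK or peer connection is outside this sketch, and the function names are hypothetical.

```typescript
// Subset of the standard WebRTC inbound-rtp stats fields used here.
interface InboundRtpLike {
  type: string;
  kind?: string;
  ssrc?: number;
  bytesReceived?: number;
}

// Collect bytesReceived per audio SSRC from one stats snapshot.
function audioBytesBySsrc(stats: InboundRtpLike[]): Map<number, number> {
  const out = new Map<number, number>();
  for (const s of stats) {
    if (s.type === "inbound-rtp" && s.kind === "audio" && s.ssrc !== undefined) {
      out.set(s.ssrc, s.bytesReceived ?? 0);
    }
  }
  return out;
}

// SSRCs present in the later snapshot whose counters did not grow.
function stalledAudioSsrcs(before: Map<number, number>, after: Map<number, number>): number[] {
  const stalled: number[] = [];
  after.forEach((bytes, ssrc) => {
    if (bytes <= (before.get(ssrc) ?? 0)) {
      stalled.push(ssrc);
    }
  });
  return stalled;
}
```

A stalled audio SSRC alongside a growing video counter points at buckets 1 or 3; growing audio bytes with no sound point at decrypt, mute, or attach problems instead.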
| 227 | + |
| 228 | +Direct answer to your question |
| 229 | + |
| 230 | +Intentional reconnect + missing reconnected handling absolutely can explain this behavior, most commonly by leaving A’s audio subscription or receiver pipeline out of sync while video stays fine. |
| 231 | + |
| 232 | +If MLS/E2EE is in play, it can also explain it via a selective decrypt/key mismatch on A for B’s audio stream. |
| 233 | + |
| 234 | +If you paste: |
| 235 | + |
| 236 | +the reconnect section from B’s logs, and |
| 237 | + |
| 238 | +the same time window from A’s logs (even shorter), |
| 239 | +I can tell you which of the above is most likely and what exact log lines to look for next. |
| 240 | + |
| 241 | +Here’s a small, practical pattern that works well for rooms with 100+ participants when autoSubscribe is false, and that survives reconnect/resume without “A can’t hear B” type drift. |
| 242 | + |
| 243 | +Core idea |
| 244 | + |
| 245 | +Keep a single source of truth for “who should I be subscribed to right now?” |
| 246 | + |
| 247 | +On any event that can desync state (reconnect/resumed, participant joined/left, tracks published/unpublished, active speaker change), re-apply that desired subscription set. |
| 248 | + |
| 249 | +You don’t need to persist “current subscription state” perfectly — you can recompute desired state and call subscribe/unsubscribe idempotently. |
| 250 | + |
| 251 | +Minimal JS/TS snippet |
| 252 | +import { Room, RoomEvent, Track } from "livekit-client"; |
| 253 | + |
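| 254 | +const room = new Room({ /* ...room options */ }); |
| 255 | +// autoSubscribe is a connect option, not a Room option, so pass it when connecting |
| 256 | +// (url and token come from your own app): |
| 257 | +// await room.connect(url, token, { autoSubscribe: false }); |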
| 258 | + |
| 259 | +// Your policy knobs |
| 260 | +const MAX_AUDIO_SUBS = 12; // keep small in rooms with 100+ participants |
| 261 | +const pinnedIdentities = new Set<string>(); // optional: user pins |
| 262 | + |
| 263 | +function desiredAudioPublishers(): string[] { |
| 264 | + // 1) Pins always included |
| 265 | + const pins = [...pinnedIdentities]; |
| 266 | + |
| 267 | + // 2) Active speakers next (Room keeps this list updated) |
| 268 | + const speakers = room.activeSpeakers |
| 269 | + .map(p => p.identity) |
| 270 | + .filter(Boolean); |
| 271 | + |
| 272 | + // Merge, preserve order, cap |
| 273 | + const ordered = [...new Set([...pins, ...speakers])]; |
| 274 | + return ordered.slice(0, MAX_AUDIO_SUBS); |
| 275 | +} |
| 276 | + |
| 277 | +function applyAudioSubscriptions() { |
| 278 | + const wanted = new Set(desiredAudioPublishers()); |
| 279 | + |
| 280 | + for (const [, p] of room.remoteParticipants) { |
| 281 | + // Find this participant’s mic publication (if any) |
| 282 | + const micPub = [...p.audioTrackPublications.values()].find(pub => |
| 283 | + pub.source === Track.Source.Microphone |
| 284 | + ); |
| 285 | + if (!micPub) continue; |
| 286 | + |
| 287 | + const shouldSub = wanted.has(p.identity); |
| 288 | + |
| 289 | + // Idempotent: calling setSubscribed repeatedly is fine |
| 290 | + if (micPub.isSubscribed !== shouldSub) { |
| 291 | + micPub.setSubscribed(shouldSub); |
| 292 | + } |
| 293 | + } |
| 294 | +} |
| 295 | + |
| 296 | +// Re-apply policy on anything that can change reality/state |
| 297 | +room |
| 298 | + .on(RoomEvent.Connected, applyAudioSubscriptions) |
| 299 | + .on(RoomEvent.Reconnected, applyAudioSubscriptions) // fires after a resume or a full reconnect |
| 300 | + // (livekit-client exposes no separate RoomEvent.Resumed) |
| 301 | + .on(RoomEvent.ParticipantConnected, applyAudioSubscriptions) |
| 302 | + .on(RoomEvent.ParticipantDisconnected, applyAudioSubscriptions) |
| 303 | + .on(RoomEvent.TrackPublished, applyAudioSubscriptions) |
| 304 | + .on(RoomEvent.TrackUnpublished, applyAudioSubscriptions) |
| 305 | + .on(RoomEvent.ActiveSpeakersChanged, applyAudioSubscriptions); |
| 306 | + |
| 307 | +Do you need to track participant list / subscription state? |
| 308 | + |
| 309 | +You need less than you think: |
| 310 | + |
| 311 | +Participant list |
| 312 | + |
| 313 | +No — the room.remoteParticipants map is your participant list. You can iterate it whenever you re-apply. |
| 314 | + |
| 315 | +Subscription state |
| 316 | + |
| 317 | +Not really. |
| 318 | + |
| 319 | +You can compute desired state and call setSubscribed(true/false). |
| 320 | + |
| 321 | +Checking pub.isSubscribed is only to avoid spamming calls; it’s optional. |
| 322 | + |
| 323 | +What you do want to track |
| 324 | + |
| 325 | +Policy inputs you own: |
| 326 | + |
| 327 | +pinnedIdentities (if you support pins) |
| 328 | + |
| 329 | +maybe a “stage” list / visible tiles list |
| 330 | + |
| 331 | +MAX_AUDIO_SUBS |
| 332 | + |
| 333 | +Everything else can be derived from the room. |
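One way to keep those policy inputs testable is to factor the selection into a pure function that takes them as arguments. This is a sketch under assumptions: the `stage` list and the priority order (pins, then stage, then active speakers) are illustrative choices, and `selectAudioPublishers` is a hypothetical name.

```typescript
interface AudioPolicyInputs {
  pinned: string[];         // identities the user pinned
  stage: string[];          // identities on stage / in visible tiles
  activeSpeakers: string[]; // identities currently speaking, loudest first
  maxSubs: number;          // cap on concurrent audio subscriptions
}

// Merge the inputs in priority order, drop empty identities and duplicates,
// then cap the result. Set preserves first-insertion order.
function selectAudioPublishers(inputs: AudioPolicyInputs): string[] {
  const merged = [...inputs.pinned, ...inputs.stage, ...inputs.activeSpeakers]
    .filter(id => id.length > 0);
  const ordered = Array.from(new Set(merged));
  return ordered.slice(0, inputs.maxSubs);
}

// Example: "b" is pinned and also speaking; it is counted once.
const wantedIdentities = selectAudioPublishers({
  pinned: ["b"],
  stage: ["d"],
  activeSpeakers: ["a", "b", "c"],
  maxSubs: 3,
});
```

Because the function reads nothing from the room, it can be unit tested without a connection, and the event handlers only need to pass in the current values.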
| 334 | + |
| 335 | +Why this helps your “reconnect caused selective audio loss” case |
| 336 | + |
| 337 | +If a reconnect/resume accidentally leaves some audio subscriptions “off” (or bound to old track pubs), calling applyAudioSubscriptions() after Reconnected or TrackPublished forces the client back to the correct state. |
| 338 | + |
| 339 | +This is the key: treat reconnect as “my local state might be wrong; re-sync everything.” |
| 340 | + |
| 341 | +Two small extras that prevent common gotchas |
| 342 | + |
| 343 | +Handle track SID changes |
| 344 | +On reconnect, you can see new publications. Hooking TrackPublished and reapplying covers this. |
| 345 | + |
| 346 | +Make sure your “identity” is stable |
| 347 | +Use participant.identity (string) rather than SIDs that might change between sessions. |