fix(datachannel): checkpoint lobby token refresh and llm resubscribe …#4818

Merged
mickelr merged 7 commits into webex:next from Tianhui-Han:feat/datachannel_in_lobby_register_and_disconnect_reconnect_subscription
Apr 3, 2026

Conversation

Contributor

Tianhui-Han commented Mar 30, 2026

COMPLETES #< INSERT LINK TO ISSUE >

This pull request addresses

https://jira-eng-gpk2.cisco.com/jira/browse/SPARK-763776

by making the following changes

  • Lobby admission
    After being admitted from the lobby, the user must actively request and obtain the session token to connect to backend services.

  • Practice session promotion
    Attendees in a practice session do not receive PS tokens initially; when promoted to panelist they must actively request the PS token to gain panel access.

  • Reconnect and resubscribe
    On reconnect, the client re-establishes the LLM connection and may need to resubscribe to previously subscribed channels.

Change Type

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Tooling change
  • Internal code refactor

The following scenarios were tested

< ENUMERATE TESTS PERFORMED, WHETHER MANUAL OR AUTOMATED >

The GAI Coding Policy And Copyright Annotation Best Practices

  • GAI was not used (or, no additional notation is required)
  • Code was generated entirely by GAI
  • GAI was used to create a draft that was subsequently customized or modified
  • Coder created a draft manually that was non-substantively modified by GAI (e.g., refactoring was performed by GAI on manually written code)
  • Tool used for AI assistance (GitHub Copilot / Other - specify)
    • Github Copilot
    • Other - Please Specify
  • This PR is related to
    • Feature
    • Defect fix
    • Tech Debt
    • Automation

I certify that

  • I have read and followed contributing guidelines
  • I discussed changes with code owners prior to submitting this pull request
  • I have not skipped any automated checks
  • All existing and new tests passed
  • I have updated the documentation accordingly

Make sure to have followed the contributing guidelines before submitting.

@aws-amplify-us-east-2

This pull request is automatically being deployed by Amplify Hosting.

Access this pull request here: https://pr-4818.d3m3l2kee0btzx.amplifyapp.com

@Tianhui-Han force-pushed the feat/datachannel_in_lobby_register_and_disconnect_reconnect_subscription branch from 9500d92 to d04a422 on March 30, 2026 07:25
@Tianhui-Han force-pushed the feat/datachannel_in_lobby_register_and_disconnect_reconnect_subscription branch from d04a422 to 1348a53 on March 31, 2026 02:57
@Tianhui-Han marked this pull request as ready for review March 31, 2026 05:39
@Tianhui-Han requested review from a team as code owners March 31, 2026 05:39

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1348a535cd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

});
this.locusInfo.on(LOCUSINFO.EVENTS.SELF_ADMITTED_GUEST, async (payload) => {
  this.stopKeepAlive();
  await this.ensureDefaultDatachannelTokenAfterAdmit();

P2: Do not block admitted-guest flow on token prefetch

Awaiting ensureDefaultDatachannelTokenAfterAdmit() inside the SELF_ADMITTED_GUEST listener makes MEETING_SELF_GUEST_ADMITTED emission and subsequent updateLLMConnection() depend on a network token fetch. If /datachannel/token is slow or temporarily unavailable, the client can remain in a lobby-like state longer than necessary even though locus has already admitted the user. The admission event path should not be gated by this best-effort prefetch.


Contributor

mickelr commented Apr 2, 2026


We should not block the admit process on getting the token. Maybe call it just before updateLLMConnection().
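A fire-and-forget shape along those lines could look like the sketch below. All names here (`onSelfAdmittedGuest`, `fetchDatachannelToken`, `emitAdmitted`, the logger) are hypothetical stand-ins for illustration, not the SDK's actual API:

```typescript
// Hypothetical sketch: emit the admission event immediately and treat the
// token prefetch as best-effort, so a slow /datachannel/token call cannot
// keep the user in a lobby-like state after locus has admitted them.
function onSelfAdmittedGuest(
  fetchDatachannelToken: () => Promise<string>,
  emitAdmitted: () => void,
  log: (msg: string) => void
): void {
  // Locus has already admitted the user; surface that right away.
  emitAdmitted();

  // Best-effort prefetch: failures are logged, never awaited.
  fetchDatachannelToken().catch((err) => {
    log(`datachannel token prefetch failed: ${String(err)}`);
  });
}
```

The connection step can then fall back to fetching the token on demand if the prefetch has not completed by the time it runs.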

@mickelr mickelr added the validated If the pull request is validated for automation. label Apr 2, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50a55c2bae


Comment on lines +280 to 287
const refreshedPracticeSessionToken = await this.ensurePracticeSessionDatachannelToken(meeting);

if (refreshedPracticeSessionToken) {
  practiceSessionDatachannelToken = refreshedPracticeSessionToken;
}

// @ts-ignore - Fix type
return this.webex.internal.llm

P1: Re-check practice-session state after async token refresh

updatePSDataChannel() now awaits ensurePracticeSessionDatachannelToken() and then unconditionally calls registerAndConnect(...) using the earlier session state, but there is no second isJoinPracticeSessionDataChannel() guard after the await. If practice mode/panelist status flips while the refresh request is in flight (for example, host ends practice session), a stale invocation can reconnect the practice LLM channel after a newer invocation already cleaned it up, leaving captions/subscriptions attached to the wrong session.

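One way to guard against that race is to re-check eligibility after every await. This is a hypothetical sketch, not the SDK's code: `isEligible` stands in for `isJoinPracticeSessionDataChannel()`, and the other names are illustrative too:

```typescript
// Hypothetical sketch of a stale-invocation guard: state may flip while an
// async refresh is in flight, so eligibility is checked again after the await.
async function updatePracticeChannel(
  isEligible: () => boolean,
  refreshToken: () => Promise<string>,
  registerAndConnect: (token: string) => void
): Promise<void> {
  if (!isEligible()) return;

  const token = await refreshToken();

  // The host may have ended the practice session (or the user's role may have
  // changed) while the refresh was running; bail out instead of reconnecting.
  if (!isEligible()) return;

  registerAndConnect(token);
}
```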


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d70cb8651


Comment on lines +6405 to +6408
this.webex.internal.llm.setDatachannelToken(
  fetchedDatachannelToken,
  DataChannelTokenType.Default
);

P1: Revalidate meeting identity before caching fetched lobby token

ensureDefaultDatachannelTokenAfterAdmit() writes the fetched token into the shared LLM token cache after an awaited network call, but it never verifies that the meeting/locus is still the same one that initiated the fetch. If the user leaves (or switches meetings) while fetchDatachannelToken() is in flight, this stale token can overwrite the cache after teardown; the next admitted-guest flow then sees a token present and skips fetching the correct one, so updateLLMConnection() can reconnect with invalid credentials.

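A sequence-number guard of the kind suggested here might look like the following sketch. The `TokenCache` class and its methods are illustrative assumptions, not the SDK's actual cache:

```typescript
// Hypothetical sketch: tag each fetch with a sequence number and discard the
// result if a meeting leave/switch superseded it while the fetch was in flight.
class TokenCache {
  private token: string | null = null;
  private seq = 0;

  // Called on meeting leave/switch: invalidates any in-flight fetch.
  invalidate(): void {
    this.seq += 1;
    this.token = null;
  }

  async refresh(fetchToken: () => Promise<string>): Promise<string | null> {
    const mySeq = this.seq;
    const fetched = await fetchToken();
    // A leave/switch bumped the sequence mid-fetch; do not overwrite the
    // cache with a token for a meeting the user is no longer in.
    if (mySeq !== this.seq) return null;
    this.token = fetched;
    return fetched;
  }

  get(): string | null {
    return this.token;
  }
}
```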

Comment on lines 5999 to +6001
private handleLLMOnline = (): void => {
  // @ts-ignore
  this.webex.internal.llm.off('online', this.handleLLMOnline);
  this.restoreLLMSubscriptionsIfNeeded();


P2: Remove persistent LLM online listener on meeting cleanup

After this change, handleLLMOnline no longer unregisters itself, which makes the listener effectively persistent for the Meeting instance. Teardown still calls cleanupLLMConneciton() without removeOnlineListener, so ended/disposed meetings can keep online callbacks attached and receive future online events from later connections, causing duplicate MEETING_TRANSCRIPTION_CONNECTED triggers and stale-instance callbacks over time.

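The teardown concern can be illustrated with a minimal emitter sketch (all names hypothetical, not the SDK's); the point is that cleanup must `off()` the now-persistent listener or a disposed meeting keeps receiving `online` events:

```typescript
// Hypothetical sketch: a persistent listener must be explicitly unregistered
// during meeting teardown to avoid stale-instance callbacks.
type Handler = () => void;

class Emitter {
  private handlers = new Set<Handler>();
  on(h: Handler): void { this.handlers.add(h); }
  off(h: Handler): void { this.handlers.delete(h); }
  emit(): void { for (const h of this.handlers) h(); }
}

class MeetingLLM {
  onlineCount = 0;
  private llm: Emitter;
  private handleOnline = () => { this.onlineCount += 1; };

  constructor(llm: Emitter) {
    this.llm = llm;
    this.llm.on(this.handleOnline);
  }

  // Teardown unregisters the listener; without this, an ended meeting still
  // reacts to `online` events from later connections.
  cleanup(): void {
    this.llm.off(this.handleOnline);
  }
}
```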


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c5a1f77643


Comment on lines 3764 to 3768
this.ensureDefaultDatachannelTokenAfterAdmit().then(() => {
  this.updateLLMConnection();
});

this.updateLLMConnection();

P2: Serialize post-admit LLM reconnect attempts

This handler now invokes updateLLMConnection() twice (immediately and again in the ensureDefaultDatachannelTokenAfterAdmit().then(...) path). When prefetch resolves quickly (for example, token already cached or token-auth feature disabled), both calls can run concurrently while isConnected() is still false, causing two registerAndConnect() register requests for the same session. Because LLMChannel.register() persists whichever binding/websocket metadata returns last, this race can leave cached routing metadata out of sync with the actual connected socket and break later routed sends; these reconnect attempts should be serialized or the second call should only run when a new token was fetched.


Contributor


duplicated

Comment on lines +188 to +191
this.webex.internal.llm.setDatachannelToken(
datachannelToken,
dataChannelTokenType || DataChannelTokenType.PracticeSession
);

P1: Revalidate state before caching refreshed practice token

ensurePracticeSessionDatachannelToken() stores the refreshed token in the shared LLM cache before updatePSDataChannel() performs its stale-invocation checks. If practice eligibility or meeting lifecycle changes while refreshDataChannelToken() is in flight (e.g., user leaves and cleanup clears tokens), this late write can repopulate cache with a stale token; subsequent updates then read a cached token and skip refresh, leading to practice-session register attempts with invalid credentials. Guard this cache write with a fresh state/sequence check tied to the current meeting context.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 16219deb81


this._pendingOnlineListener = null;
}

const refreshedPracticeSessionToken = await this.ensurePracticeSessionDatachannelToken(meeting);

P1: Revalidate default LLM connectivity after token refresh await

In updatePSDataChannel(), default-session connectivity is checked and _pendingOnlineListener is removed before await ensurePracticeSessionDatachannelToken(meeting), but there is no second default isConnected() check afterward. If the default LLM session drops while the token refresh is in flight, this stale invocation can still run registerAndConnect(..., LLM_PRACTICE_SESSION) even though the prerequisite connection is gone, and because the pending online listener was already cleared there is no automatic retry when default comes back online.


}

// @ts-ignore
this.webex.internal.voicea.updateSubchannelSubscriptions({subscribe: ['transcription']});

P2: Await transcription resubscribe call in online handler

restoreLLMSubscriptionsIfNeeded() wraps updateSubchannelSubscriptions() in a synchronous try/catch, but updateSubchannelSubscriptions is async and its promise is not awaited or .catched here. If that promise rejects (for example, during reconnect races), the rejection bypasses this catch and becomes unhandled, so the intended warning log and graceful recovery path do not run.

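A minimal sketch of the fix, with `updateSubchannelSubscriptions` and the warn logger as illustrative stand-ins: awaiting the async call is what lets the surrounding catch observe its rejection:

```typescript
// Hypothetical sketch: without `await`, a rejection from the async call would
// bypass the catch block entirely and surface as an unhandled rejection.
async function restoreSubscriptions(
  updateSubchannelSubscriptions: () => Promise<void>,
  warn: (msg: string) => void
): Promise<void> {
  try {
    await updateSubchannelSubscriptions();
  } catch (err) {
    // Graceful recovery path: log and continue instead of crashing the
    // online handler during reconnect races.
    warn(`failed to restore transcription subscription: ${String(err)}`);
  }
}
```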


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d20822121


Comment on lines +297 to +299
  latestPracticeSessionDatachannelUrl !== practiceSessionDatachannelUrl
) {
  return undefined;

P2: Reconnect with latest practice URL after token refresh

The post-refresh stale check returns when latestPracticeSessionDatachannelUrl !== practiceSessionDatachannelUrl, but this branch does not schedule a follow-up reconnect. If the practice-session URL rotates while ensurePracticeSessionDatachannelToken() is in flight (and no role/practice-status transition happens), this invocation exits and the practice LLM channel can stay disconnected until an unrelated later trigger. Instead of returning here, continue with the latest URL or immediately re-invoke updatePSDataChannel().

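Retrying with the freshest URL could be sketched as below (all names hypothetical; a bounded `maxAttempts` keeps URL-rotation storms from spinning forever):

```typescript
// Hypothetical sketch: if the tracked URL rotates while the token refresh is
// in flight, loop and retry with the newest URL instead of silently returning
// and leaving the channel disconnected.
async function connectWithLatestUrl(
  getLatestUrl: () => string,
  refreshToken: (url: string) => Promise<string>,
  connect: (url: string, token: string) => void,
  maxAttempts = 3
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const url = getLatestUrl();
    const token = await refreshToken(url);
    // URL rotated mid-refresh: this token belongs to a stale URL; retry.
    if (getLatestUrl() !== url) continue;
    connect(url, token);
    return true;
  }
  return false;
}
```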

@mickelr mickelr merged commit 36fd83f into webex:next Apr 3, 2026
21 of 22 checks passed

github-actions bot commented Apr 3, 2026

🎉 Your changes are now available!
Released in: v3.12.0-next.11
📖 View full changelog →
Packages updated:
  webex                    3.12.0-next.11
  @webex/plugin-meetings   3.12.0-next.7

Thank you for your contribution!
🤖 This is an automated message. For questions, please refer to the documentation.


Labels

validated If the pull request is validated for automation.
