Skip to content

Fallback to sync invocation if streaming fails#135

Merged
hubertzub-db merged 8 commits intodatabricks:mainfrom
hubertzub-db:fix-stream-fallback
Mar 12, 2026
Merged

Fallback to sync invocation if streaming fails#135
hubertzub-db merged 8 commits intodatabricks:mainfrom
hubertzub-db:fix-stream-fallback

Conversation

@hubertzub-db
Copy link
Copy Markdown
Contributor

@hubertzub-db hubertzub-db commented Feb 25, 2026

Summary

  • When the upstream streamText call encounters an error mid-stream, the chat server now transparently falls back to a non-streaming generateText call so the user still gets a response instead of just an error message.
  • I had to replace the previous writer.merge() approach with a manual drainStreamToWriter helper that reads chunks individually, detects errors, and triggers the fallback path when needed.

Demo/test (simulated error in agent's stream_handler)

stream-fallback.mov

Regression test (streaming still works)

stream-regression.mov

Demo of complete failure (both stream_handler and invoke_handler in agent are failing)

stream-complete-failure.mov

Test plan

  • Verify normal streaming still works end-to-end (no regression)
  • Simulate a streaming failure (e.g. upstream timeout or model error) and confirm the fallback fires and the user receives a complete response
  • Confirm error logging appears in the server console when fallback is triggered
  • Verify that if both streaming and fallback fail, the client receives a data-error message

@hubertzub-db hubertzub-db changed the title nice1 Fix streaming fallback Feb 25, 2026
@hubertzub-db hubertzub-db changed the title Fix streaming fallback Fallback to sync invocation if streaming fails Feb 25, 2026
* Reads all chunks from a UI message stream, forwarding non-error parts to the
* writer. Returns whether the stream encountered any errors.
*/
async function drainStreamToWriter(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: unfortunately we have to merge the streams manually, otherwise we won't have enough control to transparently switch over to fallback :/

* Reads all chunks from a UI message stream, forwarding non-error parts to the
* writer. Returns whether the stream encountered any errors.
*/
async function drainStreamToWriter(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry i think i wasn't clear, but we only want to try the non-streaming path if the streaming path never starts -- for errors that are within the stream itself, we do not want to falback right? this could result in duplicate content

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ex. when there is a network disconnect, we shouldn't be retrying the synchronous path, we should be trying to resume the stream.

pr looking at how the stream resumption works for context: #121

we could maybe? look at whether or not any content was emitted (ex. there was never a SSE w/ content in it?)

Copy link
Copy Markdown
Contributor Author

@hubertzub-db hubertzub-db Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry i think i wasn't clear, but we only want to try the non-streaming path if the streaming path never starts

If the streaming never starts because agent's streaming handler crashes at the beginning, then the chunk.value.type in middleware's drainStreamToWriter loop will have be set to error. That's due to agent framework/aisdk detecting a chunked stream even if the agent's stream function was never registered. In other words, even if agent<->middleware streaming completely fails at the very start, then the stream is still detected and passes initial messages (handshakes etc).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm since that's not trivial, maybe let's discuss this offline

Copy link
Copy Markdown
Contributor

@bbqiu bbqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for working on this! i wasn't clear in specifying what type of fallback we wanted, but if it's possible to only fallback when the stream doesn't start in the first place, that would be very helpful

also, once we make the change, if we could write a playwright integration test for this, that would be super helpful

@bbqiu bbqiu self-requested a review February 26, 2026 21:09
@hubertzub-db
Copy link
Copy Markdown
Contributor Author

@bbqiu I've

Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
@hubertzub-db hubertzub-db force-pushed the fix-stream-fallback branch from 9b9b671 to 940cdb3 Compare March 5, 2026 12:18
@hubertzub-db hubertzub-db requested review from bbqiu and removed request for bbqiu March 5, 2026 12:18
Comment thread e2e-chatbot-app-next/server/src/routes/chat.ts

const partId = generateUUID();
writer.write({ type: 'text-start', id: partId });
writer.write({ type: 'text-delta', id: partId, delta: fallback.text });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we tried the case where the fallback is not just text? would this break for non text only output from the non streaming output?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so initially I didn't consider this scenario since we shouldn't really put other potentially heavy media into a single message (it's just a fallback after all), but now I realized that we should also handle other aisdk message types: reasoning, tool metadata etc.
Unfortunately, there's no way to reuse aisdk's internal code so all the message types are automatically handled 😭, so I handle all of them here:

https://github.com/databricks/app-templates/pull/135/changes#diff-c8036971ac7d2704cc9cd583b0015eb7f324212190f8a58ed4b5276363f3794dR74

Hope this works, at least for a fallback

Copy link
Copy Markdown
Contributor

@bbqiu bbqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks great! only one comment abt non-text output when not streaming

uiStream: ReadableStream,
writer: UIMessageStreamWriter,
): Promise<{ failed: boolean; errorText?: string }> {
const reader = uiStream.getReader();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we close the reader

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm we shouldn't need that - the reader would be either automatically closed in the happy path, or GCed when failed (there's no other readers for the stream). However, let me add an explicit lock release to keep things same as before the change

Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
@hubertzub-db hubertzub-db requested review from bbqiu and fanzeyi March 6, 2026 14:40
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Copy link
Copy Markdown
Contributor

@bbqiu bbqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR code lgtm -- have we had a chance to test this against an agent that actually calls tools etc? (asking mostly for the fallback generateText logic)

Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
@hubertzub-db
Copy link
Copy Markdown
Contributor Author

@bbqiu

have we had a chance to test this against an agent that actually calls tools etc? (asking mostly for the fallback generateText logic)

Seems to work fine as well, here's a result for generateText fallback w/ tool use (all the events arrive instant):

Screenshot 2026-03-12 at 10 18 14

@hubertzub-db hubertzub-db merged commit 99b58c6 into databricks:main Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants