feat: add streaming support to LLM API calls and configuration #122
lybtt wants to merge 7 commits into TencentCloudADP:main
Conversation
Thank you for your attention and contribution. For YouTube-GraphRAG, streaming is of limited value during graph construction, since that stage inherently requires the complete LLM output before processing can continue. During inference, the main factors affecting speed and user experience are question decomposition and the multiple retrieval iterations, where latency is dominated by the LLM's output generation; since these are intermediate steps whose full output must be parsed, switching them to streaming is probably unnecessary. Only the final step, where the LLM generates the answer from the retrieved context, would meaningfully benefit from streaming output. Additionally, if any files in this PR are unrelated to this specific change, it is recommended to remove them from the commit.
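To illustrate the split described above, here is a minimal sketch of streaming only the final answer step while keeping intermediate steps blocking. The function names (`llm_complete`, `llm_stream`, `answer_query`) and the prompt formats are hypothetical stand-ins, not the project's actual API:

```python
from typing import Callable, Iterator

def llm_complete(prompt: str) -> str:
    # Hypothetical blocking LLM call: returns the full completion at once,
    # as needed when the output must be parsed downstream.
    return f"decomposed::{prompt}"

def llm_stream(prompt: str) -> Iterator[str]:
    # Hypothetical streaming LLM call: yields the completion chunk by chunk.
    for word in ("final", "answer"):
        yield word

def answer_query(question: str, retrieve: Callable[[str], str]) -> Iterator[str]:
    # Intermediate steps (decomposition, retrieval) need the complete LLM
    # output for parsing, so they stay as blocking calls.
    sub_questions = llm_complete(question).split("::")
    context = "\n".join(retrieve(q) for q in sub_questions)
    # Only the user-facing answer generation is streamed.
    yield from llm_stream(f"{question}\n{context}")

chunks = list(answer_query("q", lambda q: "ctx"))
```

The caller can forward each yielded chunk to the UI as it arrives, while the pipeline internals are unchanged.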
Just wanted to add that for lengthy text outputs, especially with privately deployed models that often have shorter timeout settings, streaming can help avoid timeouts by delivering content incrementally.
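The timeout point can be sketched as follows: with streaming, the client only needs each chunk to arrive within a short window, instead of the whole generation finishing before one global timeout. This is a self-contained illustration; `fake_llm_stream` stands in for a real streaming client (e.g. an OpenAI-compatible client called with `stream=True`), and `collect_stream` is a hypothetical consumer:

```python
import time
from typing import Iterable, Iterator

def fake_llm_stream(chunks: Iterable[str], delay: float = 0.0) -> Iterator[str]:
    # Stand-in for a streaming LLM client: yields tokens incrementally.
    for chunk in chunks:
        time.sleep(delay)
        yield chunk

def collect_stream(stream: Iterator[str], per_chunk_timeout: float) -> str:
    # With a non-streaming call, the timeout must cover the entire
    # generation; here each chunk only has to arrive within
    # per_chunk_timeout, so long answers no longer trip short
    # proxy/server timeouts. (Simplified: the stall is detected after
    # the late chunk arrives rather than by interrupting the wait.)
    parts = []
    last = time.monotonic()
    for chunk in stream:
        now = time.monotonic()
        if now - last > per_chunk_timeout:
            raise TimeoutError("stream stalled between chunks")
        parts.append(chunk)
        last = now
    return "".join(parts)

answer = collect_stream(fake_llm_stream(["Graph", "RAG ", "answer"]),
                        per_chunk_timeout=1.0)
print(answer)  # prints "GraphRAG answer"
```

In practice one would also flush each chunk to the client as it arrives, which is what keeps reverse proxies with short idle timeouts from cutting the connection.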