fix: persist moving summary buffer to db and limit message fetch by date#5880
fix: persist moving summary buffer to db and limit message fetch by date#5880RenzoMXD wants to merge 2 commits intoFlowiseAI:mainfrom
Conversation
…h by date - Add 'summaryMessage' to MessageType in both server and components Interface - Load persisted summary from db on each request via summaryMessage row - Filter message query to only fetch messages after last summary date - Save updated summary back to db after pruning - Fix double-prepend of summary when no pruning is needed ~
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the efficiency and performance of conversation summary buffer memory by implementing database persistence for summaries and optimizing message retrieval. Previously, summaries were re-generated from scratch for every request, leading to high token usage and slow performance in long conversations. The changes ensure that summaries are stored, loaded, and used effectively, bounding the amount of text processed per turn and maintaining token usage within limits. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request effectively addresses a key issue with conversation summary memory by persisting the summary to the database, which improves performance, reduces token consumption in long conversations, and fixes the double-prepending of summaries. However, it introduces two significant security concerns: an Insecure Direct Object Reference (IDOR) vulnerability due to unvalidated sessionId usage in database queries, which violates the authorization rule, and a persistent Indirect Prompt Injection risk where malicious instructions can be stored in the conversation summary and affect all future interactions in the session. Additionally, consider wrapping the database operations for updating the summary in a transaction to prevent potential race conditions and ensure data integrity.
| const summaryRow = await this.appDataSource.getRepository(this.databaseEntities['ChatMessage']).findOne({ | ||
| where: { sessionId: id, chatflowid: this.chatflowid, role: 'summaryMessage' }, | ||
| order: { createdDate: 'DESC' } | ||
| }) |
There was a problem hiding this comment.
The getChatMessages method uses a user-supplied sessionId (derived from overrideSessionId or this.sessionId) to perform database operations including fetching chat history, fetching the conversation summary, deleting the summary, and saving a new summary. There is no validation to ensure that the requester is authorized to access or modify the data associated with the provided sessionId. An attacker who knows or can guess a sessionId can read private conversation history and summaries of other users, or even overwrite their summaries, leading to unauthorized data access and potential disruption of other users' conversations. While chatflowid is also used in the query, it is often a public identifier, making the sessionId the primary secret for session isolation. If sessionId is not a cryptographically strong, unguessable identifier, this poses a significant risk of Insecure Direct Object Reference (IDOR).
References
- When implementing authorization checks, differentiate between internal and external API endpoints. Internal endpoints might have different authentication flows (e.g., cookie-based JWT for UI) compared to external API endpoints (e.g., API key authentication). Ensure that the correct authentication middleware populates
req.useror equivalent for each endpoint type to avoid unintended authorization failures.
| @@ -163,11 +175,32 @@ class ConversationSummaryBufferMemoryExtended extends FlowiseSummaryBufferMemory | |||
| } | |||
| } | |||
| this.movingSummaryBuffer = await this.predictNewSummary(prunedMemory, this.movingSummaryBuffer) | |||
There was a problem hiding this comment.
The conversation summary (movingSummaryBuffer) is generated by an LLM and persisted in the database. This summary is subsequently loaded and prepended to the conversation history in future turns as trusted context. Because the summary is derived from untrusted conversation history and is not sanitized before being re-inserted into the prompt, an attacker can use prompt injection techniques to influence the summary. This allows the attacker to inject persistent malicious instructions that will control the LLM's behavior in all subsequent turns of the conversation, even after the original malicious messages have been pruned from the history. This is a form of Indirect Prompt Injection that is made persistent by the changes in this pull request.
| await this.appDataSource.getRepository(this.databaseEntities['ChatMessage']).delete({ | ||
| sessionId: id, | ||
| chatflowid: this.chatflowid, | ||
| role: 'summaryMessage' | ||
| }) | ||
|
|
||
| await this.appDataSource.getRepository(this.databaseEntities['ChatMessage']).save({ | ||
| sessionId: id, | ||
| chatflowid: this.chatflowid, | ||
| role: 'summaryMessage', | ||
| content: this.movingSummaryBuffer, | ||
| chatType: 'INTERNAL', | ||
| chatId: id, | ||
| createdDate: new Date() | ||
| }) |
There was a problem hiding this comment.
The current implementation of deleting and then saving the summary message is not atomic and could lead to a race condition if multiple requests for the same session are processed concurrently. This could result in losing a summary or saving an outdated one. To ensure atomicity, these two operations should be wrapped in a database transaction.
await this.appDataSource.transaction(async (transactionalEntityManager) => {
await transactionalEntityManager.delete(this.databaseEntities['ChatMessage'], {
sessionId: id,
chatflowid: this.chatflowid,
role: 'summaryMessage'
})
await transactionalEntityManager.save(this.databaseEntities['ChatMessage'], {
sessionId: id,
chatflowid: this.chatflowid,
role: 'summaryMessage',
content: this.movingSummaryBuffer,
chatType: 'INTERNAL',
chatId: id,
createdDate: new Date()
})
})
Root cause
The
getChatMessagesmethod fetched all messages from the db on every request with no limit. Additionally,movingSummaryBufferwas an in-memory instance variable that was lost between requests (the class isre-instantiated per request), so the summary was never actually reused — every turn re-summarized the full history from scratch.
Fix
Persist the summary to db — after pruning generates a summary, it is saved as a
summaryMessagerow in theChatMessagetable for the session.Load the persisted summary on startup — at the start of each request, the latest
summaryMessagerow is fetched and restored intomovingSummaryBuffer.Date-filter the message query — the main message query now only fetches messages created after the last summary's timestamp, so the number of messages processed stays bounded regardless of conversation length.
Fix double-prepend — a pre-existing issue where the summary was prepended twice when no pruning was needed is also resolved.
Result
Token usage now stays within
maxTokenLimitacross long conversations. The amount of text re-summarized per turn is bounded to only the messages since the last summary, not the full history.Closes #5873