Skip to content

fix: persist moving summary buffer to db and limit message fetch by date#5880

Open
RenzoMXD wants to merge 2 commits intoFlowiseAI:mainfrom
RenzoMXD:bugfix/Summary-Buffer-Memory-Unbounded-Fetch
Open

fix: persist moving summary buffer to db and limit message fetch by date#5880
RenzoMXD wants to merge 2 commits intoFlowiseAI:mainfrom
RenzoMXD:bugfix/Summary-Buffer-Memory-Unbounded-Fetch

Conversation

@RenzoMXD
Copy link

@RenzoMXD RenzoMXD commented Mar 2, 2026

Root cause

The getChatMessages method fetched all messages from the db on every request with no limit. Additionally, movingSummaryBuffer was an in-memory instance variable that was lost between requests (the class is
re-instantiated per request), so the summary was never actually reused — every turn re-summarized the full history from scratch.

Fix

  1. Persist the summary to db — after pruning generates a summary, it is saved as a summaryMessage row in the ChatMessage table for the session.

  2. Load the persisted summary on startup — at the start of each request, the latest summaryMessage row is fetched and restored into movingSummaryBuffer.

  3. Date-filter the message query — the main message query now only fetches messages created after the last summary's timestamp, so the number of messages processed stays bounded regardless of conversation length.

  4. Fix double-prepend — a pre-existing issue where the summary was prepended twice when no pruning was needed is also resolved.

Result

Token usage now stays within maxTokenLimit across long conversations. The amount of text re-summarized per turn is bounded to only the messages since the last summary, not the full history.

Closes #5873

…h by date

- Add 'summaryMessage' to MessageType in both server and components Interface
- Load persisted summary from db on each request via summaryMessage row
- Filter message query to only fetch messages after last summary date
- Save updated summary back to db after pruning
- Fix double-prepend of summary when no pruning is needed
~
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the efficiency and performance of conversation summary buffer memory by implementing database persistence for summaries and optimizing message retrieval. Previously, summaries were re-generated from scratch for every request, leading to high token usage and slow performance in long conversations. The changes ensure that summaries are stored, loaded, and used effectively, bounding the amount of text processed per turn and maintaining token usage within limits.

Highlights

  • Summary Persistence: The movingSummaryBuffer is now persisted to the database as a summaryMessage after pruning, ensuring summaries are reused across requests and not lost between instantiations.
  • Optimized Message Fetching: The getChatMessages method now fetches messages created after the last summary's timestamp, significantly reducing the number of messages processed and bounding the amount of text re-summarized.
  • Summary Loading: The latest persisted summary is loaded from the database at the start of each request to initialize movingSummaryBuffer, allowing for continuous conversation context.
  • Double-Prepend Fix: An issue where the summary was prepended twice when no pruning was needed has been resolved, preventing redundant summary additions.
Changelog
  • packages/components/nodes/memory/ConversationSummaryBufferMemory/ConversationSummaryBufferMemory.ts
    • Imported Not and MoreThan from typeorm for advanced database querying.
    • Implemented logic to fetch the latest summaryMessage from the database and populate this.movingSummaryBuffer at the start of a request.
    • Modified the chatMessage query to exclude summaryMessage types and filter messages by createdDate to only retrieve those newer than the last summary.
    • Added database operations to delete existing summaryMessage entries and save the newly generated movingSummaryBuffer as a summaryMessage after pruning.
    • Introduced a check to remove a potentially duplicate summary from baseMessages before further processing, addressing a double-prepend issue.
  • packages/components/src/Interface.ts
    • Extended the MessageType type to include 'summaryMessage' to support the new summary persistence.
  • packages/server/src/Interface.ts
    • Extended the MessageType type to include 'summaryMessage' to support the new summary persistence.
Activity
  • No activity has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses a key issue with conversation summary memory by persisting the summary to the database, which improves performance, reduces token consumption in long conversations, and fixes the double-prepending of summaries. However, it introduces two significant security concerns: an Insecure Direct Object Reference (IDOR) vulnerability due to unvalidated sessionId usage in database queries, which violates the authorization rule, and a persistent Indirect Prompt Injection risk where malicious instructions can be stored in the conversation summary and affect all future interactions in the session. Additionally, consider wrapping the database operations for updating the summary in a transaction to prevent potential race conditions and ensure data integrity.

Comment on lines +131 to +134
const summaryRow = await this.appDataSource.getRepository(this.databaseEntities['ChatMessage']).findOne({
where: { sessionId: id, chatflowid: this.chatflowid, role: 'summaryMessage' },
order: { createdDate: 'DESC' }
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The getChatMessages method uses a user-supplied sessionId (derived from overrideSessionId or this.sessionId) to perform database operations including fetching chat history, fetching the conversation summary, deleting the summary, and saving a new summary. There is no validation to ensure that the requester is authorized to access or modify the data associated with the provided sessionId. An attacker who knows or can guess a sessionId can read private conversation history and summaries of other users, or even overwrite their summaries, leading to unauthorized data access and potential disruption of other users' conversations. While chatflowid is also used in the query, it is often a public identifier, making the sessionId the primary secret for session isolation. If sessionId is not a cryptographically strong, unguessable identifier, this poses a significant risk of Insecure Direct Object Reference (IDOR).

References
  1. When implementing authorization checks, differentiate between internal and external API endpoints. Internal endpoints might have different authentication flows (e.g., cookie-based JWT for UI) compared to external API endpoints (e.g., API key authentication). Ensure that the correct authentication middleware populates req.user or equivalent for each endpoint type to avoid unintended authorization failures.

@@ -163,11 +175,32 @@ class ConversationSummaryBufferMemoryExtended extends FlowiseSummaryBufferMemory
}
}
this.movingSummaryBuffer = await this.predictNewSummary(prunedMemory, this.movingSummaryBuffer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The conversation summary (movingSummaryBuffer) is generated by an LLM and persisted in the database. This summary is subsequently loaded and prepended to the conversation history in future turns as trusted context. Because the summary is derived from untrusted conversation history and is not sanitized before being re-inserted into the prompt, an attacker can use prompt injection techniques to influence the summary. This allows the attacker to inject persistent malicious instructions that will control the LLM's behavior in all subsequent turns of the conversation, even after the original malicious messages have been pruned from the history. This is a form of Indirect Prompt Injection that is made persistent by the changes in this pull request.

Comment on lines +179 to +193
await this.appDataSource.getRepository(this.databaseEntities['ChatMessage']).delete({
sessionId: id,
chatflowid: this.chatflowid,
role: 'summaryMessage'
})

await this.appDataSource.getRepository(this.databaseEntities['ChatMessage']).save({
sessionId: id,
chatflowid: this.chatflowid,
role: 'summaryMessage',
content: this.movingSummaryBuffer,
chatType: 'INTERNAL',
chatId: id,
createdDate: new Date()
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of deleting and then saving the summary message is not atomic and could lead to a race condition if multiple requests for the same session are processed concurrently. This could result in losing a summary or saving an outdated one. To ensure atomicity, these two operations should be wrapped in a database transaction.

                await this.appDataSource.transaction(async (transactionalEntityManager) => {
                    await transactionalEntityManager.delete(this.databaseEntities['ChatMessage'], {
                        sessionId: id,
                        chatflowid: this.chatflowid,
                        role: 'summaryMessage'
                    })

                    await transactionalEntityManager.save(this.databaseEntities['ChatMessage'], {
                        sessionId: id,
                        chatflowid: this.chatflowid,
                        role: 'summaryMessage',
                        content: this.movingSummaryBuffer,
                        chatType: 'INTERNAL',
                        chatId: id,
                        createdDate: new Date()
                    })
                })

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant