fix(rag): fix document ID Python compatibility and respect defaultConfig limit #1362
Open
Fruank4 wants to merge 1 commit into
…ltConfig limit

- Document.generateDocumentId: use getContentText() instead of getContent() so the JSON key contains a plain string matching the Python implementation's _map_text_to_uuid, ensuring Java and Python generate identical UUIDs for the same document content when sharing a vector store.
- KnowledgeRetrievalTools.retrieveKnowledge: fall back to defaultConfig.getLimit() instead of the hardcoded 5 when the LLM omits the limit parameter, so the limit configured at construction time is honoured.

Add tests: DocumentTest.testDocumentIdUsesTextNotContentBlockObject and KnowledgeRetrievalToolsTest covering both limit fallback and explicit override.
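To illustrate why the shape of the JSON key matters, here is a minimal sketch using only the JDK. It assumes the document ID is a name-based UUID derived from the JSON key string; `DocumentIdSketch`, `idFor`, `nestedKey`, and `plainKey` are hypothetical names, and `UUID.nameUUIDFromBytes` is a stand-in for whatever hashing the real `generateDocumentId` and `_map_text_to_uuid` use. The point is only that two different key strings cannot be expected to hash to the same ID.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class DocumentIdSketch {
    // Hypothetical stand-in for the real ID derivation: a name-based UUID over the JSON key.
    static UUID idFor(String jsonKey) {
        return UUID.nameUUIDFromBytes(jsonKey.getBytes(StandardCharsets.UTF_8));
    }

    // Key shape produced when the ContentBlock object itself is serialized (old behaviour).
    // Hand-built for illustration; assumes the text needs no JSON escaping.
    static String nestedKey(String text) {
        return "{\"content\":{\"text\":\"" + text + "\",\"type\":\"text\"}}";
    }

    // Key shape with the content as a plain string (what the Python side is described as using).
    static String plainKey(String text) {
        return "{\"content\":\"" + text + "\"}";
    }

    public static void main(String[] args) {
        String text = "hello world";
        UUID fromNested = idFor(nestedKey(text));
        UUID fromPlain = idFor(plainKey(text));
        // Different key strings are hashed, so the two IDs are not expected to match.
        System.out.println("nested: " + fromNested);
        System.out.println("plain:  " + fromPlain);
        System.out.println("equal:  " + fromNested.equals(fromPlain));
    }
}
```

Matching the key shape is necessary but may not be sufficient: both sides also have to agree on the hashing scheme, field order, and string encoding, none of which this sketch verifies.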
24df01e to e14d687
Codecov Report
✅ All modified and coverable lines are covered by tests.
Summary
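- Document.generateDocumentId: replace getContent() (which returns a ContentBlock object) with getContentText() (which returns a plain text string). Previously, the JSON key produced by Jackson serialization was {"content":{"text":"...","type":"text"}}, whereas _map_text_to_uuid on the Python side uses {"content":"..."}. As a result, Java and Python generated different UUIDs for the same document, and documents could not be matched correctly when both languages shared a vector store.
- KnowledgeRetrievalTools.retrieveKnowledge: when the LLM does not pass the limit parameter, use defaultConfig.getLimit() instead of the hardcoded 5, so that the limit configured at construction time takes effect.

Test plan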
- DocumentTest#testDocumentIdUsesTextNotContentBlockObject: verifies that the ID is generated from the text string rather than from the ContentBlock object structure
- KnowledgeRetrievalToolsTest#testNullLimitFallsBackToDefaultConfig: verifies that defaultConfig.getLimit() is used when limit is null
- KnowledgeRetrievalToolsTest#testExplicitLimitOverridesDefault: verifies that an explicitly passed limit takes precedence over defaultConfig
- DocumentTest, KnowledgeTest, and ReActAgentRAGConfigTest all pass

🤖 Generated with Claude Code
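For reference, the limit fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `LimitFallbackSketch`, `RetrieveConfig`, and `resolveLimit` are hypothetical names, and only `getLimit()` and the null-means-omitted convention are taken from the description.

```java
class LimitFallbackSketch {
    // Hypothetical stand-in for the configuration object; only getLimit() is named in the PR.
    static final class RetrieveConfig {
        private final int limit;

        RetrieveConfig(int limit) {
            this.limit = limit;
        }

        int getLimit() {
            return limit;
        }
    }

    // A null requestedLimit models the LLM omitting the limit parameter.
    // An explicit value wins; otherwise the configured default is used instead of a hardcoded 5.
    static int resolveLimit(Integer requestedLimit, RetrieveConfig defaultConfig) {
        return requestedLimit != null ? requestedLimit : defaultConfig.getLimit();
    }

    public static void main(String[] args) {
        RetrieveConfig defaultConfig = new RetrieveConfig(10);
        System.out.println(resolveLimit(null, defaultConfig)); // falls back to the configured 10
        System.out.println(resolveLimit(3, defaultConfig));    // explicit 3 overrides the default
    }
}
```

The two calls in `main` mirror the two new test cases: a null limit falls back to the configured value, and an explicit limit overrides it.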