Merged
qin-ptr
approved these changes
Feb 7, 2026
qin-ctx
approved these changes
Feb 7, 2026
Description
Related Issue
Type of Change
Changes Made
1. Commit 4c5e187: refactor: bitmap filter (latest)
This commit mainly covers code refactoring, robustness improvements, and documentation updates.
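• Test coverage: Added tests/vectordb/test_openviking_vectordb.py, a full suite of test cases covering metadata creation, data insert/update/delete, and recall tests for a range of complex filter conditions (including prefix matching, regex, and time ranges).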
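• Data-processing robustness: Introduced TYPE_DEFAULTS in DataProcessor, which provides default values for types such as int64, float32, and string, so that validation no longer fails when upstream data is missing fields. Also removed the hard length limit on the string type.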
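• Index backend optimization: Modified VikingVectorIndexBackend so that index creation automatically selects the flat_hybrid or flat index type depending on whether sparse vectors are used, and excludes the abstract field from the scalar index.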
• C++ core optimization: Optimized the filter bitmap handling logic in IndexManagerImpl::search.
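• Documentation update: Updated README.md, removing the former "Real-world usage examples" section (document retrieval, recommendation systems, etc.) and replacing it with an "Advanced features" section that describes automatic ID generation and vector normalization in detail.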
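The data-processing and index-backend changes in this commit can be illustrated with a minimal sketch. It is not the code from the PR: the default values in TYPE_DEFAULTS and the helper names fill_defaults, choose_index_type, and scalar_index_fields are assumptions made for illustration, since the description names only TYPE_DEFAULTS, the two index types, and the excluded abstract field.

```python
from typing import Any

# Assumed default values; the PR description does not list the real ones.
TYPE_DEFAULTS: dict[str, Any] = {"int64": 0, "float32": 0.0, "string": ""}


def fill_defaults(record: dict[str, Any], schema: dict[str, str]) -> dict[str, Any]:
    """Return a copy of record with each missing schema field set to its type default."""
    filled = dict(record)
    for field, field_type in schema.items():
        if field not in filled and field_type in TYPE_DEFAULTS:
            filled[field] = TYPE_DEFAULTS[field_type]
    return filled


def choose_index_type(use_sparse: bool) -> str:
    """Select the hybrid index type only when sparse vectors are in use."""
    return "flat_hybrid" if use_sparse else "flat"


def scalar_index_fields(schema: dict[str, str], excluded: tuple[str, ...] = ("abstract",)) -> list[str]:
    """List the schema fields that go into the scalar index, skipping excluded ones."""
    return [name for name in schema if name not in excluded]
```

With this shape, a record that arrives without some fields still passes a later validation step, because every typed field has a value by the time it is checked.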
2. Commit 5150018: feat: use path field
This commit focuses on introducing new field types and improving the data-processing flow.
• Schema upgrade: In CollectionSchemas, changed the uri and parent_uri fields from string to the more specific path type, and changed created_at and updated_at from string to the date_time type.
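• Data processor integration: Integrated DataProcessor throughout the core storage classes, including LocalCollection, LocalIndex, and PersistentIndex, to handle data validation, type conversion, and default-value filling in one place.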
• Documentation expansion: Added filter syntax documentation to README.md for time_range (time range queries) and geo_range (geographic range queries).
• Metadata handling: Updated the IndexMeta class to use DataProcessor when building and parsing scalar index metadata.
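To make the date_time change concrete, here is a rough sketch of how a date_time field could be normalized and then checked against a time range. It rests on assumptions: the PR description does not say how date_time values are stored or what the time_range filter syntax looks like, so the epoch-millisecond representation, the gte/lt bounds, and the helper names to_epoch_ms and in_time_range are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def to_epoch_ms(value: str) -> int:
    """Convert an ISO 8601 string to epoch milliseconds (assumed storage format).

    Values without a UTC offset are treated as UTC, which is an assumption.
    """
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def in_time_range(record: dict, field: str,
                  gte: Optional[str] = None, lt: Optional[str] = None) -> bool:
    """Check a half-open range [gte, lt) on a date_time field (hypothetical semantics)."""
    ts = to_epoch_ms(record[field])
    if gte is not None and ts < to_epoch_ms(gte):
        return False
    if lt is not None and ts >= to_epoch_ms(lt):
        return False
    return True
```

Comparing integers rather than raw strings is one reason a dedicated date_time type helps: string comparison is only correct when every value uses exactly the same format and offset.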
Testing
Checklist
Screenshots (if applicable)
Additional Notes