fix(milvus): escape document source values in delete filter (CWE-89)#529
sebastiondev wants to merge 54 commits into
…recated SpanAttributes (NVIDIA-AI-Blueprints#377) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
Signed-off-by: Niyati Singal <nsingal@nvidia.com>
* Added MIG Slice support for RTX 6000 pro Signed-off-by: Punit Kumar <punitk@nvidia.com> * Changed to default config in MIG slicing in rtx6000pro config --------- Signed-off-by: Punit Kumar <punitk@nvidia.com> Co-authored-by: niyatisingal <nsingal@nvidia.com>
…VIDIA-AI-Blueprints#385) * changes to docs per bug 5767861 (NVIDIA-AI-Blueprints#328) * Updated launchable with v2.4.0 tag (NVIDIA-AI-Blueprints#318) * updated support matrix (NVIDIA-AI-Blueprints#321) * Document the end‑to‑end flow from query to answer and show how to measure time spent in each stage of the RAG pipeline. (NVIDIA-AI-Blueprints#317) * adding oberservablility * Update docs/debugging.md Co-authored-by: nkmcalli <nkmcalli@yahoo.com> * Update docs/observability.md Co-authored-by: nkmcalli <nkmcalli@yahoo.com> * Add query-to-answer-pipeline doc and observability/debugging updates * Trigger CI * getting build to kick in for observability file * Fix typos in query-to-answer-pipeline.md and ensure file in PR for link check * get rid of PULL_REQUEST_SUMMARY --------- Co-authored-by: nkmcalli <nkmcalli@yahoo.com> * fixed files associated with build (NVIDIA-AI-Blueprints#322) * Add multimodal query integration tests to CI pipeline * changes to docs per bug 5767861 * updated files per bug 5880717 (NVIDIA-AI-Blueprints#327) * updated files per bug 5880717 * Update CONTRIBUTING.md * Update README.md * Update python-client.md * Update readme.md * Update readme.md * Update docs/deploy-helm.md Co-authored-by: nkmcalli <nkmcalli@yahoo.com> * Update docs/deploy-helm.md Co-authored-by: nkmcalli <nkmcalli@yahoo.com> --------- Co-authored-by: rkharwar-nv <rkharwar@nvidia.com> Co-authored-by: nkmcalli <nkmcalli@yahoo.com> Co-authored-by: Pranjal Doshi <pranjald@nvidia.com> Co-authored-by: nv-pranjald <150428320+nv-pranjald@users.noreply.github.com> * Fix workflow rule and doc bugs (NVIDIA-AI-Blueprints#331) * Revert back milvus version in conf.md to v2.6.5 * Modify workflow to run on any branch * Fix workflow push rule to run on protected branches * Add files via upload (NVIDIA-AI-Blueprints#326) Found an error in the Q&A section where images in the citation were not being printed. 
* Doc bug fixes (NVIDIA-AI-Blueprints#339) * updated helm instructions (NVIDIA-AI-Blueprints#333) * updated helm instructions * Update deploy-helm.md * fix broken image link (NVIDIA-AI-Blueprints#334) * Add release note for Audio model deployment on Kubernetes on RTX‑6000 Pro is not supported in this release.heiss/5863956a (NVIDIA-AI-Blueprints#335) * Add release note for Audio model deployment on Kubernetes on RTX‑6000 Pro is not supported in this release. * Add release note for Audio model deployment on Kubernetes on RTX‑6000 Pro is not supported in this release. * Fix broken image link in observability file * Fix CPU seach with GPU index doc * Fix VLLM profile instruction for nemotron-3-nano --------- Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Updated troubleshoot documentation for Elasticsearch connection timeout (NVIDIA-AI-Blueprints#341) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com> * updated path to image files so that html output is rendered correctly (NVIDIA-AI-Blueprints#363) * Updated helm instructions for mig-deployment prerequisites (NVIDIA-AI-Blueprints#364) * Updated helm instructions for mig-deployment * Update mig-deployment.md * Doc enhancement for noteboook (NVIDIA-AI-Blueprints#361) * Doc enhancement for noteboook * Update release notes * Update launchable.ipynb (NVIDIA-AI-Blueprints#365) Updated branch name State name changed from "FAILURE"->"FAILED" * Fix typo in release notes --------- Co-authored-by: rkharwar-nv <rkharwar@nvidia.com> * fixed links in deploy-helm and mig-deploymnent (NVIDIA-AI-Blueprints#367) * update artifacts to GA version for v2.4.0 release (NVIDIA-AI-Blueprints#359) * updated files according to style guide (NVIDIA-AI-Blueprints#369) * Revert deploy-helm and mig-deployment to pre-11a31a4 versions (NVIDIA-AI-Blueprints#372) * Fix release date in changelog (NVIDIA-AI-Blueprints#373) * Bump up version to 2.5.0 --------- Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com> Co-authored-by: Kurt Heiss 
<kheiss@nvidia.com> Co-authored-by: rkharwar-nv <rkharwar@nvidia.com> Co-authored-by: nkmcalli <nkmcalli@yahoo.com> Co-authored-by: Pranjal Doshi <pranjald@nvidia.com> Co-authored-by: nv-pranjald <150428320+nv-pranjald@users.noreply.github.com> Co-authored-by: Swapnil Masurekar <smasurekar@nvidia.com>
…rints#351) * feat: add rag_event_ingest example - event-driven document/video ingestion pipeline - Kafka consumer that monitors MinIO object storage for new uploads - Routes documents to RAG Ingestor, videos to VSS for analysis - Docker Compose deployment for Kafka, MinIO, and consumer - Jupyter notebook for end-to-end deployment and testing - Sample test data (PDF document, MP4 video) tracked via Git LFS Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: polish rag_event_ingest notebook - fix sections, descriptions, TOC Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * refactor: consolidate Setup into single cell - clone, deps, API keys Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * refactor: inline check_rag/vss/aidp_status into their usage cells Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * docs: add markdown description before every code cell Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: notebook is standalone entry point, clones RAG repo to ~/rag Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: set COLLECTION_NAME, load .env, simplify query_rag, add expected logs Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: left-align markdown tables in notebook Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: use HTML tables to force left alignment Signed-off-by: Minh Nguyen 
<minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: replace API Keys markdown table with HTML for left alignment Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * feat: add storage verification, RAG frontend hints, and configurable consumer prompts - Add verify_file_in_storage() helper to confirm files landed in MinIO - Merge storage verification into document/video ingestion checks - Add RAG Frontend UI link (port 8090) to query sections - Make Kafka consumer VSS prompts configurable via env vars in docker-compose - Install git/git-lfs in notebook setup cell - Index cells in Deploy Continuous Ingestion section Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: skip RAG clone if directory already exists Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: url Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * feat: add continuous ingestion notebook for video and document pipeline Add rag_event_ingest.ipynb notebook that provides an end-to-end walkthrough for: - Deploying NVIDIA RAG stack (NIMs, Milvus, Ingestor, RAG Server) - Deploying NVIDIA VSS stack (VLM, LLM, Embedding, Reranker NIMs) - Deploying continuous ingestion pipeline (Kafka, MinIO, Kafka Consumer) - Configurable video analysis prompts for the Kafka consumer - Uploading documents and videos to MinIO with storage verification - Verifying ingestion via consumer logs - Querying ingested content via RAG API or Frontend UI Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: gpu Signed-off-by: Minh Nguyen <minhngu@nvidia.com> 
Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: add ensurepip, fix VSS tag to v2.4.1, use GPUs 2-3 for VSS, update hw req to 4 GPUs Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * fix: tag Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * feat: resolve comment * fix: patch VSS config to use host-mapped ports for shared RAG embedding/reranker The via-server runs on the local_deployment_single_gpu_default network, not nvidia-rag, so it cannot resolve nemoretriever-embedding-ms or nemoretriever-ranking-ms. Route through host.docker.internal with the correct host-mapped ports instead (9080 for embedding, 1976 for reranker). Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor --------- Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Co-authored-by: anngu <anngu@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
* Fix query decomp doc and prompt * fix prompt in helm as well
…s#371) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
* confirming presence of switcher text in conf.py file * docs: adjust conf.py for 2.5.0
…current batch ingestion having indexing issues (NVIDIA-AI-Blueprints#389)
…AI-Blueprints#386) * Prompt tuining, low reasoning and reasoning budget * Filter out think token when enable filter is on * Use default prompt * Fix unit test * Add doc for nemotron thinking budget * Add question back in prompt.yaml
… (NVIDIA-AI-Blueprints#392) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
…DIA-AI-Blueprints#395) * Add config to enable nemotron parse only extraction in nv-ingest * Refactor nemotron parse only documentation * Remove nemotron parse only references from the previous section
…rs (NVIDIA-AI-Blueprints#402) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
* Update langchain-nvidia-ai-endpointsto >=1.1.0 * security: Update langgraph to version 1.10.0
…ueprints#397) * Update NIM wait times and patch VSS embed/rerank models Adjust expected NIM model loading wait from 2-5 min to ~10 min for RTX PRO 6000 hardware. Add explicit patching of VSS config.yaml to align embedding and reranker model names with RAG stack defaults. Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * Update VSS prompts to match default format and use seconds-based queries - Align consumer VSS prompts with VSS config.yaml defaults (sports-adapted): caption, caption_summarization, summary_aggregation with proper dedup/merge logic - Extract RAG embed/rerank model names dynamically from compose file - Add parse_compose_default helper to avoid hardcoded model names - Change time-range query from MM:SS to seconds format for VSS compatibility Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * Clean up notebook: rename variable, remove config overrides - Rename _rag_compose to _rag_compose_path for clarity - Remove hardcoded max_tokens and batch_size patches from VSS config - Simplify time-range query cell comments Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * Rename helper function and revert prompts to MM:SS format - Rename parse_compose_default to extract_rag_default_compose_var - Revert VSS prompts to MM:SS timestamp conversion style - Revert time-range query to MM:SS format Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor * Add GPU assignment table and update NIM container names Update notebook to reflect renamed NIM containers (nemoretriever-* → nemotron-*) and add default GPU assignment table for RTX PRO 6000 / H100 hardware. Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor --------- Signed-off-by: Minh Nguyen <minhngu@nvidia.com>
Co-authored-by: Kurt Heiss <kheiss@nvidia.com>
…structure (NVIDIA-AI-Blueprints#403) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
…y given in argument and adding messages list in logs (NVIDIA-AI-Blueprints#404)
…oint (NVIDIA-AI-Blueprints#412) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
* Update: Remove Vss Signed-off-by: Minh Nguyen <minhngu@nvidia.com> * Remove video processing (VSS) from kafka consumer Video handler, video analyzer service, and all VSS-related configuration have been removed to simplify the event ingestion pipeline to document-only processing. Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Made-with: Cursor Signed-off-by: Minh Nguyen <minhngu@nvidia.com> * fix: add USERID Signed-off-by: Minh Nguyen <minhngu@nvidia.com> * fix: update document Signed-off-by: Minh Nguyen <minhngu@nvidia.com> --------- Signed-off-by: Minh Nguyen <minhngu@nvidia.com> Co-authored-by: Minh Nguyen <minhngu@nvidia.com>
* confirming presence of switcher text in conf.py file * Added chunking information
…-elements, table-structure (NVIDIA-AI-Blueprints#410) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com> Co-authored-by: Shubhadeep Das <149712532+shubhadeepd@users.noreply.github.com>
…VIDIA-AI-Blueprints#419) * notebook: Add notebook showcasing langchain retriever connector * Update langchain connector version to 1.2.0 * Fix broken link in notebook doc
…oud endp…" and add ranker endpoint in nvdev (NVIDIA-AI-Blueprints#421) This reverts commit 1a11733. Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
* confirming presence of switcher text in conf.py file * Update documentation to reflect name change to NeMo Retriever Library * Update api-rag.md removed GitHub markers * Update change-model.md Removed GitHub markers * Update deploy-helm.md removed GitHub markers * Update deploy-helm.md Remove GitHub markers * Update deploy-helm.md Remove GitHub markers * Update mig-deployment.md removed GitHub markers * Update deploy-helm.md removed extra space
…-AI-Blueprints#429) Co-authored-by: rkharwar-nv <rkharwar@nvidia.com>
* confirming presence of switcher text in conf.py file * Continuous ingestion topic * updated index for continuous ingestion * updated reademe for continuous ingestion * Update continuous-ingestion-object-storage.md added RAG Blueprint * Update continuous-ingestion-object-storage.md Converted first sentence into 2 sentences * Update index.md * Apply suggestion from @nkmcalli Co-authored-by: nkmcalli <nmcallister@nvidia.com> * Update continuous-ingestion-object-storage.md * Update continuous-ingestion-object-storage.md * Update continuous-ingestion-object-storage.md --------- Co-authored-by: nkmcalli <nmcallister@nvidia.com>
…prints#430) * Nemotron 3 super deployment guide and migration guide * Organize gpu requirement and heading for nemotron3 super * Add instruction for updating values.yaml and refractor doc * Add cloud endpoint url in env file * Instruction to export llm max token in docker flow * remove unrequired llm api key from doc * Seprate yaml for nemotron 3 deployment * Remove unnecessary information for local hosted * Simplify docker deployment logs * Remove cuda device from rtx 6000 pro * Add prompt customization instruction in nemotron3 helm section * Instruction for prompt customization * Remove heading for rtx 6000 pro
Signed-off-by: smasurekar <smasurekar@nvidia.com>
…#434) * docs: add RAG accuracy benchmarks documentation * docs: Fix broken links and format in accuracy benchmark doc * Update accuracy-benchmarks.md * Update accuracy-benchmarks.md * Update accuracy-benchmarks.md * Update accuracy-benchmarks.md * Update accuracy-benchmarks.md * Update accuracy-benchmarks.md implmented changes as instructed by Sumit in Slack thread: https://nvidia.slack.com/archives/C09HAQRT1UY/p1773470561423909 --------- Co-authored-by: Kurt Heiss <kheiss@nvidia.com>
* Update changelog to include new additions * Update containers to GA version
…#438) * adding missing accuracy benchmark documentation * Update docs/evaluate.md Co-authored-by: nkmcalli <nmcallister@nvidia.com> --------- Co-authored-by: nkmcalli <nmcallister@nvidia.com>
…-AI-Blueprints#440) Signed-off-by: Swapnil Masurekar <smasurekar@nvidia.com>
…raversal (CWE-22) The tool_upload_documents and tool_update_documents MCP tools accepted arbitrary file paths from MCP clients without validation. An attacker controlling the MCP client (or an LLM agent making tool calls) could supply paths like /etc/shadow, /proc/self/environ, or ../../sensitive.yaml to read arbitrary files from the server filesystem and exfiltrate them by uploading to the ingestor. Add _validate_file_path() helper that resolves paths via os.path.realpath() (following symlinks) and verifies they reside within the allowed upload directory (MCP_UPLOAD_DIR env var, defaults to cwd). Raises ValueError for paths outside the sandbox. Both tool_upload_documents and tool_update_documents now call this validator before reading any file. Signed-off-by: Sebastion <sebastiondev@users.noreply.github.com>
* Updated Vidore Dataset to Vidore V3 Dataset (NVIDIA-AI-Blueprints#443) * Kheiss/rm early access1 (NVIDIA-AI-Blueprints#445) * remove early access from title * remove early access from title * Update documentation per broken link reporting for Brev and support matrix (NVIDIA-AI-Blueprints#458) * updated versioning method for RAG documentation
…22-mcp-server-file-acbb fix: validate file paths in MCP upload/update tools to prevent path traversal (CWE-22)
Package the GCNV data ingestor deployment into a reusable Helm chart with PVC, service, and namespace templates plus installation guidance for Trident-backed storage. Made-with: Cursor Signed-off-by: Raj Sahoo <raj.sahoo@netapp.com>
…ueprints#490) * docs(perf): add RAG performance measurement methodology Add performance benchmarking documentation covering TTFT and ITL metrics across four datasets (KG-RAG, RagBattlePacket, HotPotQA, BO767) plus synthetic workloads, comparing LLM-49B and VLM nano configurations with reasoning on/off. Signed-off-by: Truong Nguyen <tgnguyen@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> * Update docs/perf-benchmarks.md Co-authored-by: Kurt Heiss <kheiss@nvidia.com> --------- Signed-off-by: Truong Nguyen <tgnguyen@nvidia.com> Co-authored-by: Kurt Heiss <kheiss@nvidia.com>
User-supplied document names sent to delete_documents were interpolated
directly into Milvus boolean filter expressions via f-strings:
collection.delete(f"source['source_name'] == '{source_value}'")
A document name containing a single quote could break out of the string
literal and inject arbitrary boolean expressions, causing unintended
documents to match the delete filter. The same problem affected the
fallback 'source == ...' filter and the document_info delete filter
built from the basename.
Escape backslashes and single quotes with a small helper before
interpolating user-controlled source values into the filter
expression. Adds a unit test that demonstrates the original boolean
injection payload no longer breaks out of the literal.
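
To make the breakout concrete, here is a minimal sketch that builds the filter the same way as the snippet quoted above and prints the result. The function name `build_naive_filter` is hypothetical and exists only for illustration; no Milvus connection is needed because only the expression string is inspected.

```python
def build_naive_filter(source_value: str) -> str:
    # Mirrors the vulnerable pattern quoted above: the raw value is
    # interpolated into a single-quoted literal with no escaping.
    return f"source['source_name'] == '{source_value}'"


# The payload closes the literal early and appends its own boolean clause.
payload = "evil.pdf' or '1'=='1"
naive = build_naive_filter(payload)
print(naive)
# prints: source['source_name'] == 'evil.pdf' or '1'=='1'
```

The printed expression contains two comparisons joined by `or`, so the second, always-true clause decides which rows match.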
Hi @sebastiondev thanks very much for your contribution! Please help rebase these changes on |
Hi @shubhadeepd, thanks for the feedback! I've rebased the fix onto However, I'm hitting a GitHub OAuth scope limitation when force-pushing: the I've already updated the PR base branch to |
Update: I've confirmed that the The rebase is done locally — the fix applies cleanly on
I've also updated the PR base to |
Summary
MilvusVDB.delete_documents builds Milvus boolean filter expressions by interpolating user-controlled document names directly into f-strings. A document name containing a single quote can break out of the surrounding string literal and inject arbitrary boolean clauses, broadening the delete predicate beyond the intended document.
* src/nvidia_rag/utils/vdb/milvus/milvus_vdb.py, function delete_documents
* DELETE /documents ingestor endpoint and the fault enables unintended / mass deletion of vectors and document metadata within a collection.
Data flow
DELETE /documents (FastAPI ingestor) → document_names: List[str] → MilvusVDB.delete_documents(..., source_values=document_names) → f-strings.
A name like evil.pdf' or '1'=='1 rewrites the predicate into source['source_name'] == 'evil.pdf' or '1'=='1', which matches every row in the collection.
Fix
Milvus does not expose parameterised queries for Collection.delete / boolean expressions, so the correct mitigation is to escape values being interpolated into single-quoted string literals. The patch adds a small helper that escapes backslashes first and then single quotes, and uses it for every user-controlled value flowing into a filter expression in delete_documents (primary delete, document_info delete, and the legacy fallback path).
The change is minimal and behaviour-preserving for legitimate document names (which do not contain single quotes or backslashes).
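
The escaping order described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the names `escape_filter_literal` and `build_delete_filter` are hypothetical, and the sketch assumes a backslash-escaped quote stays inside a single-quoted literal, as the fix relies on.

```python
def escape_filter_literal(value: str) -> str:
    """Escape a value for use inside a single-quoted filter literal.

    Hypothetical helper for illustration; the helper in the patch may
    differ in name and details.
    """
    # Backslashes go first. If quotes were escaped first, the backslashes
    # added for them would be doubled by the second pass.
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_delete_filter(source_value: str) -> str:
    # Same shape as the primary delete filter, with the value escaped.
    return f"source['source_name'] == '{escape_filter_literal(source_value)}'"


print(build_delete_filter("report.pdf"))
print(build_delete_filter("evil.pdf' or '1'=='1"))
```

For an ordinary name such as report.pdf the expression is unchanged. For the payload, every quote supplied by the user is preceded by a backslash, so the only unescaped quotes left are the two that delimit the literal.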
Tests
Added test_delete_documents_escapes_filter_injection in tests/unit/test_utils/test_vdb/test_milvus_vdb.py. It feeds the classic payload evil.pdf' or '1'=='1 into delete_documents, captures the expression handed to Collection.delete, and asserts that or '1'=='1' is not present in the filter expression.
The existing delete_documents tests continue to pass, confirming the escape is transparent for normal inputs.
Security analysis
* DELETE /documents route forwards document_names straight into delete_documents with no sanitisation in between; we traced the data flow through the route handler to the sink.
* collection_name (typically the default from environment configuration) can submit {"document_names": ["x' or '1'=='1"], "collection_name": "..."} and trigger deletion of every vector matching the broadened predicate, plus the corresponding document_info rows. There is no authentication on this route in the shipped configuration.
* source_value and doc_name can no longer terminate the string literal, so the user-controlled portion of the expression is confined to a literal string compared by ==. The integrity-impacting primitive (mass delete) is removed.
Adversarial review
Before submitting we tried to disprove this finding. We checked whether Milvus might already escape filter-expression literals server-side (it does not; the expression language treats ' as a literal terminator), whether any upstream FastAPI validator constrains document_names to filename-safe characters (none does; the field is a free-form List[str]), and whether the route requires authentication (it does not in the default deployment). We also looked for parameterised-query support for Collection.delete to avoid manual escaping; the Milvus Python SDK does not provide one for boolean expressions, which is why escaping is the appropriate fix.
cc @lewiswigmore