From 75e30ba879a9f182542850d36fa86c419cd672c5 Mon Sep 17 00:00:00 2001
From: Shepard2154
Date: Sun, 31 May 2026 09:40:47 +0300
Subject: [PATCH 1/3] docs: remove extra info from CONTRIBUTING guide

---
 CONTRIBUTING.md | 91 +++++++++++++------------------------------------
 1 file changed, 24 insertions(+), 67 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1a49057..48c53be 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,6 +1,6 @@
 # Contributing
 
-## Development Setup
+## Development setup
 
 From the repository root:
 
@@ -8,89 +8,46 @@ From the repository root:
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -e ".[dev]"
+pre-commit install
 ```
 
-See the [Development setup](README.md#development-setup) section in the root README for a quick overview.
-
 ## Tests
 
-Layout mirrors the `vexrag/` package under `tests/` (e.g. `tests/core/`, `tests/cli/`, `tests/e2e/`). Shared fixtures live in `tests/conftest.py`; CLI/scan stubs in `tests/mocks.py`.
-
-```bash
-poe test # fast unit tests (excludes integration and e2e)
-poe test-integration # wiring tests (mocked HTTP, etc.)
-poe test-e2e # full vx scan against rag_examples (see below)
-poe cov # unit tests with coverage (excludes e2e)
-```
-
-### E2E smoke scans (`poe test-e2e`)
-
-Prerequisites:
+| Command | Scope |
+| --- | --- |
+| `poe test` | Unit tests (default for PRs) |
+| `poe test-integration` | Wiring tests with mocked HTTP |
+| `poe test-e2e` | Full `vx scan` smoke tests |
+| `poe cov` | Unit tests with coverage |
 
-- Ollama running at `http://localhost:11434` with `llama3:8b`:
+E2E tests require Ollama with `llama3:8b`, port `8080`, and deps from `rag_examples/` (see [README](README.md)). They skip cleanly when prerequisites are missing.
 
-  ```bash
-  ollama pull llama3:8b
-  ```
+## Code quality
 
-- Repo dev install: `pip install -e ".[dev]"` and `.venv/bin/vx` available.
-- Port `8080` free (all e2e cases share one RAG target).
-- RAG example dependencies installed in the example directory or current Python env (see each `rag_examples/*/requirements.txt`).
-- For **native** vector-DB poisoner cases (`ollama-smoke-native-poisoner.yaml`), install optional extras as needed:
-
-  ```bash
-  pip install -e ".[dev,sentence-transformers,faiss,chroma,qdrant]"
-  ```
-
-Expect ~7 sequential smoke scans (several minutes each with LLM calls). Tests skip cleanly when Ollama, models, deps, or port `8080` are unavailable — normal in CI without GPU/Ollama.
-
-The **`medium_qdrant:native`** case uses a Qdrant **server** (`qdrant/qdrant` via Docker) so the RAG target and `vx scan` do not contend for an embedded `qdrant_data/` lock. Prerequisites:
-
-- Docker installed and running; image `qdrant/qdrant` (pulled on first run).
-- Same Ollama, `vx`, port `8080`, and native extras as other native cases.
+CI runs the same checks as a local PR gate:
 
 ```bash
-poe test-e2e -k "medium_qdrant:native"
+poe check # ruff + mypy
+poe test
 ```
 
-Without Docker, that single test is **skipped**; the other six e2e cases are unchanged.
-
-## Code Quality
-
-Run checks before committing:
+Auto-fix formatting and lint issues:
 
 ```bash
-ruff check .
-ruff format --check .
+poe fix
 ```
 
-Auto-fix issues:
+Or run hooks manually: `pre-commit run --all-files`.
 
-```bash
-ruff check --fix .
-ruff format .
-```
-
-## Pre-commit Hooks
-
-Install hooks once per clone:
-
-```bash
-pre-commit install
-```
+## Pull requests
 
-Run hooks manually on all files:
-
-```bash
-pre-commit run --all-files
-```
-
-## Git Commits
+- Keep changes minimal and focused on one concern.
+- Avoid unrelated refactors in the same PR.
 
-- **Commits are human-only.** The Cursor agent must not run `git commit` or `git push` unless you explicitly ask it to.
-- When changes are ready, review the diff and commit locally yourself.
+## Releasing (maintainers)
 
-## Commit Scope
+1. Bump `version` in `pyproject.toml`.
+2. Merge to `master` and wait for CI to pass.
+3. `git tag vX.Y.Z && git push origin vX.Y.Z`.
 
-- Keep changes minimal and focused on one concern.
-- Avoid unrelated refactors in the same commit.
+The [package-release workflow](.github/workflows/package-release.yml) builds `dist/*`, creates a GitHub Release, and publishes stable tags to PyPI. Prerelease tags (e.g. `v1.0.0-rc1`) skip PyPI.

From e62193210969fbc78962056202c42fa74cad6f57 Mon Sep 17 00:00:00 2001
From: Shepard2154
Date: Sun, 31 May 2026 09:54:08 +0300
Subject: [PATCH 2/3] feat: add workflow for publishing vexrag

---
 .../{ci.yml => lint-typecheck-test.yml}       |  4 +-
 .github/workflows/package-release.yml         | 49 +++++++++++++++++++
 2 files changed, 51 insertions(+), 2 deletions(-)
 rename .github/workflows/{ci.yml => lint-typecheck-test.yml} (91%)
 create mode 100644 .github/workflows/package-release.yml

diff --git a/.github/workflows/ci.yml b/.github/workflows/lint-typecheck-test.yml
similarity index 91%
rename from .github/workflows/ci.yml
rename to .github/workflows/lint-typecheck-test.yml
index c6f7c3f..abce8eb 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/lint-typecheck-test.yml
@@ -1,4 +1,4 @@
-name: CI
+name: lint-typecheck-test
 
 on:
   push:
@@ -7,7 +7,7 @@ on:
   pull_request:
 
 jobs:
-  check-and-test:
+  lint-typecheck-test:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
diff --git a/.github/workflows/package-release.yml b/.github/workflows/package-release.yml
new file mode 100644
index 0000000..8664dfd
--- /dev/null
+++ b/.github/workflows/package-release.yml
@@ -0,0 +1,49 @@
+name: package-release
+
+on:
+  push:
+    tags:
+      - "v*"
+
+permissions:
+  contents: write
+  id-token: write
+
+jobs:
+  package-release:
+    runs-on: ubuntu-latest
+    environment: pypi
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Install build
+        run: pip install build
+
+      - name: Build sdist and wheel
+        run: python -m build
+
+      - name: Verify tag matches package version
+        run: |
+          TAG_VERSION="${GITHUB_REF_NAME#v}"
+          PYPROJECT_VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml','rb'))['project']['version'])")
+          if [ "$TAG_VERSION" != "$PYPROJECT_VERSION" ]; then
+            echo "Tag version ($TAG_VERSION) does not match pyproject.toml ($PYPROJECT_VERSION)"
+            exit 1
+          fi
+
+      - name: Create GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          generate_release_notes: true
+          files: dist/*
+          prerelease: ${{ contains(github.ref_name, '-') }}
+
+      - name: Publish to PyPI
+        if: ${{ !contains(github.ref_name, '-') }}
+        uses: pypa/gh-action-pypi-publish@release/v1

From 892d96050a539546f8cb2bd1d247cca9d5969634 Mon Sep 17 00:00:00 2001
From: Shepard2154
Date: Sun, 31 May 2026 10:40:50 +0300
Subject: [PATCH 3/3] docs: update README and vexrag version (0.2.0)

---
 README.md      | 122 +++++++++++++++++++++++++++++++++----------------
 pyproject.toml |   2 +-
 2 files changed, 83 insertions(+), 41 deletions(-)

diff --git a/README.md b/README.md
index 621c2f4..e919fa3 100644
--- a/README.md
+++ b/README.md
@@ -4,57 +4,113 @@

Project: in development - Python 3.11 + Python 3.11+ PyPI Downloads Telegram chat

-Most RAG security tools focus on jailbreaking or prompt injection. VexRAG is different: it injects poisoned passages directly into the retrieval index and measures whether the system’s answers remain factually correct. It is not about safety refusals — it’s about functional correctness under adversarial data manipulation.
+Most RAG security tools focus on jailbreaking or prompt injection. VexRAG is different: it injects poisoned passages directly into the retrieval index and measures the functional correctness of the system’s answers under adversarial data manipulation.
 
-> **Stability notice (pre-0.2.0):** VexRAG is currently test-stage software and is **not production-ready**.
-> Until version `0.2.0`, backward compatibility is **not guaranteed** and updates may include **breaking changes**.
+## Threat model — when to use VexRAG
 
-**Sample RAG stacks** for getting started: [rag_examples](rag_examples/README.md).
+VexRAG is for **security testing your own RAG** in a **controlled, isolated environment** (not production).
 
-## Quickstart
+**Use VexRAG if:**
 
-### Prerequisites
+- Retrieval data may be **untrusted or partially adversarial** (uploads, crawls, third-party corpora).
+- You need to measure behavior when an attacker can **poison or skew the index**.
+- You are building **trust-aware RAG** and want evidence of resilience, not only prompt-level guards.
+
+**You probably do not need VexRAG if:**
+
+- Every indexed document is fully trusted, ingestion is strictly controlled, and you accept that risk without red-team validation.
+- You only care about **query-time** prompt injection or jailbreaks (VexRAG targets **retrieval and corpus** attacks).
+
+## Science-first approach
+
+VexRAG implements **paper-backed** attacks, not ad-hoc heuristics:
+
+<table>
+  <thead>
+    <tr>
+      <th>Method</th>
+      <th>Paper</th>
+      <th>Summary</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>PoisonedRAG</td>
+      <td>arXiv:2402.07867</td>
+      <td>Poisoning the retrieval corpus</td>
+    </tr>
+    <tr>
+      <td>HijackRAG</td>
+      <td>arXiv:2410.22832</td>
+      <td>Hijacking retrieved contexts</td>
+    </tr>
+  </tbody>
+</table>
+
+See [`vexrag/attack_algorithms/`](vexrag/attack_algorithms/) for implementation details and fidelity notes.
+
+> **Warning — real corpus mutation**
+> Scans write poisoned passages into the target retrieval index. Configs default to `cleanup: true`, but interrupted runs may still leave poison behind.
+> Never target production or shared indexes. Back up the retrieval database before testing on your own data.
+
+## Versioning Policy
+
+VexRAG is an early-stage library. Until `1.0.0`, treat **any release as potentially breaking** — configs, CLI flags, and APIs may change without a major version bump.
+
+When we deprecate public functionality, it stays available for **two minor releases** before removal (e.g. deprecated in `0.3.0`, removed in `0.5.0`).
+
+From `1.0.0` onward we plan to follow [SemVer](https://semver.org/) (breaking changes in major releases only).
+
+## Prerequisites
+
+Use a sample RAG target from [rag_examples](rag_examples/README.md): start the example app (each folder’s README has the command), then scan it with `vx`.
+
+For the default Ollama-based configs you need **Python 3.11+** and a running Ollama daemon. CI and releases are tested on **3.11** only; newer versions may work but are not officially supported yet.
 
 ```bash
-python --version # requires 3.11
+python --version # 3.11+ required; 3.11 tested in CI
 ollama list
 ```
 
-Install/pull required Ollama models:
+Install/pull Ollama models for scan configs:
 
 ```bash
 ollama pull llama3:8b
-ollama pull nomic-embed-text:latest
 ```
 
-You also need a running target API endpoint (for the small example: `http://localhost:8080`).
+For full benchmarks (`ollama-default.yaml` and some advanced configs), also pull `nomic-embed-text:latest`.
 
-### 1) Install VexRAG
+All [rag_examples](rag_examples/README.md) targets default to `http://localhost:8080`; run one example at a time.
+
+## Installation
 
 ```bash
 pip install vexrag
 ```
 
-For vector DB-specific extras:
+Optional extras (install what your stack needs):
 
 ```bash
 pip install "vexrag[qdrant]"
 pip install "vexrag[chroma]"
 pip install "vexrag[faiss]"
+pip install "vexrag[sentence-transformers]"
 ```
 
-### 2) Verify installation
+The `sentence-transformers` extra enables the in-process embedding provider in scan configs; model weights download from Hugging Face on first use.
+
+Verify the CLI:
 
 ```bash
 vx --help
 ```
 
-### 3) Run a scan from config
+## Run a scan
 
 ```bash
 vx scan --config path/to/scan.yaml
@@ -62,41 +118,22 @@ vx scan --config path/to/scan.yaml
 
 Use sample configs from `rag_examples/` as a starting point.
 
-### 4) First successful scan (small local example)
+## First successful scan
 
 From `rag_examples/small/rag_01_in_memory_en`:
 
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
+pip install vexrag
 pip install -r requirements.txt
-python small_rag.py
+python3 small_rag.py
 vx scan --config scan_configs_examples/ollama-smoke.yaml
 ```
 
 Expected outcome:
-- `small_rag.py` serves the target API on `http://localhost:8080`.
-- `vx scan` completes and prints a scan report with attack/evaluation results (no connection/preflight errors).
-
-## Development setup
-
-From the repository root:
-
-```bash
-python3 -m venv .venv
-source .venv/bin/activate
-pip install -e ".[dev]"
-pre-commit install # optional
-```
-
-Run quality checks:
-
-```bash
-ruff check .
-ruff format --check .
-```
-
-See [CONTRIBUTING.md](CONTRIBUTING.md) for commit and workflow notes.
+- `python3 small_rag.py` serves the target API on `http://localhost:8080` (embeddings via Hugging Face; first run may download model weights).
+- `vx scan` finishes with a scan report. Smoke config needs only `ollama pull llama3:8b`.
 
 ## Project roadmap
 
@@ -107,11 +144,16 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for commit and workflow notes.
 - [x] Support for vLLM and Ollama
 - [x] Simple RAG examples for quick onboarding to VexRAG
 - [x] Support for Qdrant, FAISS, Chroma, and file-based retrieval backends
+- [x] Codebase hardening: refactors, typing, tooling, *removing AI slop*
 
 ### In Progress
-- [ ] Codebase hardening: refactors, typing, tooling, *removing AI slop*
+- [ ] Stable Python API to run scans and generate cases from code, not only via `vx`
 
 ### Ideas / Backlog
 - [ ] Expand red-team methods in VexRAG
 - [ ] Expand supported retrieval backends
 - [ ] Implement a web version of VexRAG
+
+## Feedback
+
+Feel free to [open a GitHub issue](https://github.com/Shepard2154/VexRAG/issues) for bugs, questions, or attack methods you would like to see in VexRAG. Pull requests and local development notes are in [CONTRIBUTING.md](CONTRIBUTING.md).
diff --git a/pyproject.toml b/pyproject.toml
index 7a46674..9c36b53 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "vexrag"
-version = "0.1.3"
+version = "0.2.0"
 description = "A Red Team framework that evaluates RAG functional correctness when the retrieval backend contains poisoned passages."
 readme = "README.md"
 requires-python = ">=3.11"