Shepard2154 · May 31, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/lint-typecheck-test.yml
@@ -1,4 +1,4 @@
-name: CI
+name: lint-typecheck-test
 
 on:
   push:
@@ -7,7 +7,7 @@ on:
   pull_request:
 
 jobs:
-  check-and-test:
+  lint-typecheck-test:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4

diff --git a/.github/workflows/package-release.yml b/.github/workflows/package-release.yml
@@ -0,0 +1,49 @@
+name: package-release
+
+on:
+  push:
+    tags:
+      - "v*"
+
+permissions:
+  contents: write
+  id-token: write
+
+jobs:
+  package-release:
+    runs-on: ubuntu-latest
+    environment: pypi
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Install build
+        run: pip install build
+
+      - name: Build sdist and wheel
+        run: python -m build
+
+      - name: Verify tag matches package version
+        run: |
+          TAG_VERSION="${GITHUB_REF_NAME#v}"
+          PYPROJECT_VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml','rb'))['project']['version'])")
+          if [ "$TAG_VERSION" != "$PYPROJECT_VERSION" ]; then
+            echo "Tag version ($TAG_VERSION) does not match pyproject.toml ($PYPROJECT_VERSION)"
+            exit 1
+          fi
+
+      - name: Create GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          generate_release_notes: true
+          files: dist/*
+          prerelease: ${{ contains(github.ref_name, '-') }}
+
+      - name: Publish to PyPI
+        if: ${{ !contains(github.ref_name, '-') }}
+        uses: pypa/gh-action-pypi-publish@release/v1
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -1,96 +1,53 @@
 # Contributing
 
-## Development Setup
+## Development setup
 
 From the repository root:
 
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -e ".[dev]"
+pre-commit install
 ```
 
-See the [Development setup](README.md#development-setup) section in the root README for a quick overview.
-
 ## Tests
 
-Layout mirrors the `vexrag/` package under `tests/` (e.g. `tests/core/`, `tests/cli/`, `tests/e2e/`). Shared fixtures live in `tests/conftest.py`; CLI/scan stubs in `tests/mocks.py`.
-
-```bash
-poe test              # fast unit tests (excludes integration and e2e)
-poe test-integration  # wiring tests (mocked HTTP, etc.)
-poe test-e2e          # full vx scan against rag_examples (see below)
-poe cov               # unit tests with coverage (excludes e2e)
-```
-
-### E2E smoke scans (`poe test-e2e`)
-
-Prerequisites:
+| Command | Scope |
+| --- | --- |
+| `poe test` | Unit tests (default for PRs) |
+| `poe test-integration` | Wiring tests with mocked HTTP |
+| `poe test-e2e` | Full `vx scan` smoke tests |
+| `poe cov` | Unit tests with coverage |
 
-- Ollama running at `http://localhost:11434` with `llama3:8b`:
+E2E tests require a running Ollama with `llama3:8b`, a free port `8080`, and dependencies from `rag_examples/` (see [README](README.md)). They skip cleanly when prerequisites are missing.
 
-  ```bash
-  ollama pull llama3:8b
-  ```
+## Code quality
 
-- Repo dev install: `pip install -e ".[dev]"` and `.venv/bin/vx` available.
-- Port `8080` free (all e2e cases share one RAG target).
-- RAG example dependencies installed in the example directory or current Python env (see each `rag_examples/*/requirements.txt`).
-- For **native** vector-DB poisoner cases (`ollama-smoke-native-poisoner.yaml`), install optional extras as needed:
-
-  ```bash
-  pip install -e ".[dev,sentence-transformers,faiss,chroma,qdrant]"
-  ```
-
-Expect ~7 sequential smoke scans (several minutes each with LLM calls). Tests skip cleanly when Ollama, models, deps, or port `8080` are unavailable — normal in CI without GPU/Ollama.
-
-The **`medium_qdrant:native`** case uses a Qdrant **server** (`qdrant/qdrant` via Docker) so the RAG target and `vx scan` do not contend for an embedded `qdrant_data/` lock. Prerequisites:
-
-- Docker installed and running; image `qdrant/qdrant` (pulled on first run).
-- Same Ollama, `vx`, port `8080`, and native extras as other native cases.
+CI runs the same checks, so use them as a local gate before opening a PR:
 
 ```bash
-poe test-e2e -k "medium_qdrant:native"
+poe check   # ruff + mypy
+poe test
 ```
 
-Without Docker, that single test is **skipped**; the other six e2e cases are unchanged.
-
-## Code Quality
-
-Run checks before committing:
+Auto-fix formatting and lint issues:
 
 ```bash
-ruff check .
-ruff format --check .
+poe fix
 ```
 
-Auto-fix issues:
+Or run hooks manually: `pre-commit run --all-files`.
 
-```bash
-ruff check --fix .
-ruff format .
-```
-
-## Pre-commit Hooks
-
-Install hooks once per clone:
-
-```bash
-pre-commit install
-```
+## Pull requests
 
-Run hooks manually on all files:
-
-```bash
-pre-commit run --all-files
-```
-
-## Git Commits
+- Keep changes minimal and focused on one concern.
+- Avoid unrelated refactors in the same PR.
 
-- **Commits are human-only.** The Cursor agent must not run `git commit` or `git push` unless you explicitly ask it to.
-- When changes are ready, review the diff and commit locally yourself.
+## Releasing (maintainers)
 
-## Commit Scope
+1. Bump `version` in `pyproject.toml`.
+2. Merge to `master` and wait for CI to pass.
+3. `git tag vX.Y.Z && git push origin vX.Y.Z`.
 
-- Keep changes minimal and focused on one concern.
-- Avoid unrelated refactors in the same commit.
+The [package-release workflow](.github/workflows/package-release.yml) builds `dist/*`, creates a GitHub Release, and publishes stable tags to PyPI. Prerelease tags (e.g. `v1.0.0-rc1`) skip PyPI.
diff --git a/README.md b/README.md
@@ -4,99 +4,136 @@
 
 <p align="center">
   <a href="https://github.com/Shepard2154/VexRAG"><img src="https://img.shields.io/badge/project-in%20development-F59E0B?style=flat-square" alt="Project: in development" height="28"></a>
-  <a href="https://www.python.org/"><img src="https://img.shields.io/badge/python-3.11-blue?style=flat-square" alt="Python 3.11" height="28"></a>
+  <a href="https://www.python.org/"><img src="https://img.shields.io/badge/python-3.11+-blue?style=flat-square" alt="Python 3.11+" height="28"></a>
   <a href="https://pepy.tech/projects/vexrag"><img src="https://static.pepy.tech/personalized-badge/vexrag?period=total&units=ABBREVIATION&left_color=BLACK&right_color=GREEN&left_text=downloads" alt="PyPI Downloads" height="28"></a>
   <a href="https://t.me/vexrag"><img src="https://img.shields.io/badge/chat-join-blue?style=flat-square&logo=telegram" alt="Telegram chat" height="28"></a>
 </p>
 
-Most RAG security tools focus on jailbreaking or prompt injection. VexRAG is different: it injects poisoned passages directly into the retrieval index and measures whether the system’s answers remain factually correct. It is not about safety refusals — it’s about functional correctness under adversarial data manipulation.
+Most RAG security tools focus on jailbreaking or prompt injection. VexRAG is different: it injects poisoned passages directly into the retrieval index and measures the functional correctness of the system’s answers under adversarial data manipulation.
 
-> **Stability notice (pre-0.2.0):** VexRAG is currently test-stage software and is **not production-ready**.
-> Until version `0.2.0`, backward compatibility is **not guaranteed** and updates may include **breaking changes**.
+## Threat model — when to use VexRAG
 
-**Sample RAG stacks** for getting started: [rag_examples](rag_examples/README.md).
+VexRAG is for **security testing your own RAG** in a **controlled, isolated environment** (not production).
 
-## Quickstart
+**Use VexRAG if:**
 
-### Prerequisites
+- Retrieval data may be **untrusted or partially adversarial** (uploads, crawls, third-party corpora).
+- You need to measure behavior when an attacker can **poison or skew the index**.
+- You are building **trust-aware RAG** and want evidence of resilience, not only prompt-level guards.
+
+**You probably do not need VexRAG if:**
+
+- Every indexed document is fully trusted, ingestion is strictly controlled, and you accept that risk without red-team validation.
+- You only care about **query-time** prompt injection or jailbreaks (VexRAG targets **retrieval and corpus** attacks).
+
+## Science-first approach
+
+VexRAG implements **paper-backed** attacks, not ad-hoc heuristics:
+
+<table border="1" cellpadding="8" cellspacing="0">
+  <thead>
+    <tr>
+      <th align="left">Method</th>
+      <th align="left">Paper</th>
+      <th align="left">Summary</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>PoisonedRAG</strong></td>
+      <td><a href="https://arxiv.org/abs/2402.07867">arXiv:2402.07867</a></td>
+      <td>Poisoning the retrieval corpus</td>
+    </tr>
+    <tr>
+      <td><strong>HijackRAG</strong></td>
+      <td><a href="https://arxiv.org/abs/2410.22832">arXiv:2410.22832</a></td>
+      <td>Hijacking retrieved contexts</td>
+    </tr>
+  </tbody>
+</table>
+
+See [`vexrag/attack_algorithms/`](vexrag/attack_algorithms/) for implementation details and fidelity notes.
+
+> **Warning — real corpus mutation**  
+> Scans write poisoned passages into the target retrieval index. Configs default to `cleanup: true`, but interrupted runs may still leave poison behind.  
+> Never target production or shared indexes. Back up the retrieval database before testing on your own data.
+
+## Versioning policy
+
+VexRAG is an early-stage library. Until `1.0.0`, treat **any release as potentially breaking** — configs, CLI flags, and APIs may change without a major version bump.
+
+When we deprecate public functionality, it stays available for **two minor releases** before removal (e.g. deprecated in `0.3.0`, removed in `0.5.0`).
+
+From `1.0.0` onward we plan to follow [SemVer](https://semver.org/) (breaking changes in major releases only).
+
+## Prerequisites
+
+Use a sample RAG target from [rag_examples](rag_examples/README.md): start the example app (each folder’s README has the command), then scan it with `vx`.
+
+For the default Ollama-based configs you need **Python 3.11+** and a running Ollama daemon. CI and releases are tested on **3.11** only; newer versions may work but are not officially supported yet.
 
 ```bash
-python --version  # requires 3.11
+python --version  # 3.11+ required; 3.11 tested in CI
 ollama list
 ```
 
-Install/pull required Ollama models:
+Install/pull Ollama models for scan configs:
 
 ```bash
 ollama pull llama3:8b
-ollama pull nomic-embed-text:latest
 ```
 
-You also need a running target API endpoint (for the small example: `http://localhost:8080`).
+For full benchmarks (`ollama-default.yaml` and some advanced configs), also pull `nomic-embed-text:latest`.
 
-### 1) Install VexRAG
+All [rag_examples](rag_examples/README.md) targets default to `http://localhost:8080`; run one example at a time.
+
+## Installation
 
 ```bash
 pip install vexrag
 ```
 
-For vector DB-specific extras:
+Optional extras (install what your stack needs):
 
 ```bash
 pip install "vexrag[qdrant]"
 pip install "vexrag[chroma]"
 pip install "vexrag[faiss]"
+pip install "vexrag[sentence-transformers]"
 ```
 
-### 2) Verify installation
+The `sentence-transformers` extra enables the in-process embedding provider in scan configs; model weights download from Hugging Face on first use.
+
+Verify the CLI:
 
 ```bash
 vx --help
 ```
 
-### 3) Run a scan from config
+## Run a scan
 
 ```bash
 vx scan --config path/to/scan.yaml
 ```
 
 Use sample configs from `rag_examples/` as a starting point.
 
-### 4) First successful scan (small local example)
+## First successful scan
 
 From `rag_examples/small/rag_01_in_memory_en`:
 
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
+pip install vexrag
 pip install -r requirements.txt
-python small_rag.py
+python3 small_rag.py
 vx scan --config scan_configs_examples/ollama-smoke.yaml
 ```
 
 Expected outcome:
-- `small_rag.py` serves the target API on `http://localhost:8080`.
-- `vx scan` completes and prints a scan report with attack/evaluation results (no connection/preflight errors).
-
-## Development setup
-
-From the repository root:
-
-```bash
-python3 -m venv .venv
-source .venv/bin/activate
-pip install -e ".[dev]"
-pre-commit install   # optional
-```
-
-Run quality checks:
-
-```bash
-ruff check .
-ruff format --check .
-```
-
-See [CONTRIBUTING.md](CONTRIBUTING.md) for commit and workflow notes.
+- `python3 small_rag.py` serves the target API on `http://localhost:8080` (embeddings via Hugging Face; first run may download model weights).
+- `vx scan` finishes with a scan report. The smoke config needs only the `llama3:8b` model (`ollama pull llama3:8b`).
 
 ## Project roadmap
 
@@ -107,11 +144,16 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for commit and workflow notes.
 - [x] Support for vLLM and Ollama
 - [x] Simple RAG examples for quick onboarding to VexRAG
 - [x] Support for Qdrant, FAISS, Chroma, and file-based retrieval backends
+- [x] Codebase hardening: refactors, typing, tooling, *removing AI slop*
 
 ### In Progress
-- [ ] Codebase hardening: refactors, typing, tooling, *removing AI slop*
+- [ ] Stable Python API to run scans and generate cases from code, not only via `vx`
 
 ### Ideas / Backlog
 - [ ] Expand red-team methods in VexRAG
 - [ ] Expand supported retrieval backends
 - [ ] Implement a web version of VexRAG
+
+## Feedback
+
+Feel free to [open a GitHub issue](https://github.com/Shepard2154/VexRAG/issues) for bugs, questions, or attack methods you would like to see in VexRAG. Pull requests and local development notes are in [CONTRIBUTING.md](CONTRIBUTING.md).
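
Review note: the deprecation window in the README's versioning policy (deprecated functionality stays available for two minor releases) can be made concrete with a short calculation. This is an illustrative sketch only; `removal_version` is a hypothetical helper, and it assumes plain `MAJOR.MINOR.PATCH` version strings with no major bump inside the window.

```python
def removal_version(deprecated_in: str, window: int = 2) -> str:
    """Return the first release in which a deprecated feature may be removed.

    Assumes a plain MAJOR.MINOR.PATCH string; the feature stays available
    for `window` minor releases, counting the one that deprecates it.
    """
    major, minor, _patch = (int(part) for part in deprecated_in.split("."))
    return f"{major}.{minor + window}.0"


print(removal_version("0.3.0"))  # 0.5.0
```

This matches the README's own example: something deprecated in `0.3.0` remains in the `0.3.x` and `0.4.x` releases and is removed in `0.5.0`.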
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "vexrag"
-version = "0.1.3"
+version = "0.2.0"
 description = "A Red Team framework that evaluates RAG functional correctness when the retrieval backend contains poisoned passages."
 readme = "README.md"
 requires-python = ">=3.11"