46 commits
95365cb
chore: improve coverage configuration for more accurate metrics
CaptainDriftwood Jan 16, 2026
733c356
test: add coverage tests and improve exclusion patterns
CaptainDriftwood Jan 16, 2026
8305455
test: add error handling tests for receive operations
CaptainDriftwood Jan 16, 2026
cc53224
fix: detect pre-started Docker containers in test fixture
CaptainDriftwood Jan 16, 2026
80addf2
docs: add LLMS.txt and README badge
CaptainDriftwood Feb 28, 2026
59de894
fix: resolve test warnings for async mocks and pytest-asyncio
CaptainDriftwood Feb 28, 2026
cb369bf
chore: add pytest-xdist for parallel test execution
CaptainDriftwood Feb 28, 2026
4b3e96d
chore: update uv.lock dependencies
CaptainDriftwood Mar 1, 2026
1ab9fe1
fix: use correct state reset in AsyncIcapClient connection error handler
CaptainDriftwood Mar 1, 2026
268e1d0
fix: use USER_AGENT constant instead of hardcoded string
CaptainDriftwood Mar 1, 2026
641c60f
docs: update installation instructions for PyPI availability
CaptainDriftwood Mar 1, 2026
85c633b
docs: fix project structure to show correct pytest_plugin location
CaptainDriftwood Mar 1, 2026
c53ef2f
fix: add missing AsyncIterator return type to _iter_chunks
CaptainDriftwood Mar 1, 2026
0ae71dc
fix: upgrade testcontainers to 4.x for Docker Compose v2 support
CaptainDriftwood Mar 1, 2026
a99106d
feat: add max_response_size parameter to prevent DoS attacks
CaptainDriftwood Mar 1, 2026
7b8dd77
fix: make ICAP headers case-insensitive per RFC 3507
CaptainDriftwood Mar 1, 2026
73efe42
docs: improve docstrings for options(), chunk_size, and timeout
CaptainDriftwood Mar 1, 2026
9bd9e8a
chore: add __all__ to response.py
CaptainDriftwood Mar 1, 2026
6b8da63
fix: validate status code range (100-599) in response parsing
CaptainDriftwood Mar 1, 2026
6384543
feat: add header validation to prevent CRLF injection
CaptainDriftwood Mar 1, 2026
8f4bdf6
feat: add header section size limit to prevent DoS
CaptainDriftwood Mar 1, 2026
96e8749
test: add preview mode and HTTP encapsulation edge case tests
CaptainDriftwood Mar 1, 2026
daee025
docs: update basic_example.py to use high-level API
CaptainDriftwood Mar 1, 2026
a3d2339
docs: add Quick Reference and Preview Mode sections to README
CaptainDriftwood Mar 1, 2026
3d0f84f
test: add shared fixtures and utilities for integration tests (Phase 1)
CaptainDriftwood Mar 2, 2026
12ba274
test: add large file handling integration tests (Phase 2)
CaptainDriftwood Mar 2, 2026
9a7c1bd
test: add concurrent load testing (Phase 3)
CaptainDriftwood Mar 2, 2026
5b65101
test: add connection robustness tests (Phase 4)
CaptainDriftwood Mar 2, 2026
4f14870
fix: correct API usage in integration tests
CaptainDriftwood Mar 2, 2026
d62eb20
refactor: normalize API parity and add Encapsulated header parsing
CaptainDriftwood Mar 2, 2026
b0a9090
refactor: split pytest plugin mock.py into focused modules
CaptainDriftwood Mar 2, 2026
cbdf08f
refactor: extract shared protocol logic to _protocol.py utilities
CaptainDriftwood Mar 2, 2026
f46ec23
feat: add ISTag header support to IcapResponse
CaptainDriftwood Mar 2, 2026
fb66c9b
test: add property-based fuzzing tests with hypothesis
CaptainDriftwood Mar 2, 2026
0e2218f
fix: validate negative Content-Length and Encapsulated offsets
CaptainDriftwood Mar 2, 2026
642576c
fix: resolve lint and type annotation issues
CaptainDriftwood Mar 2, 2026
346a6a7
test: add performance benchmarks with pytest-benchmark
CaptainDriftwood Mar 2, 2026
2d7a6b8
docs: update LLMS.txt with new API documentation
CaptainDriftwood Mar 2, 2026
17d8f1f
fix: reject negative chunk sizes in chunked transfer encoding
CaptainDriftwood Mar 4, 2026
2953a5e
docs: add max_response_size parameter to LLMS.txt
CaptainDriftwood Mar 4, 2026
e113123
chore: remove rembg dev dependency (breaks CI on Python 3.8/3.10)
CaptainDriftwood Mar 4, 2026
dd85241
chore: upgrade setuptools to fix CVE (GHSA-5rjg-fvgr-3xxf)
CaptainDriftwood Mar 4, 2026
1226ddd
fix: consume trailing CRLF after chunked body terminator
CaptainDriftwood Mar 4, 2026
41390e9
test: skip flaky CI integration tests with TODO to fix later
CaptainDriftwood Mar 4, 2026
5601a35
fix: move imports before module-level variable to fix E402
CaptainDriftwood Mar 4, 2026
c6cf4db
test: skip flaky benchmark test in CI environment
CaptainDriftwood Mar 5, 2026
235 changes: 235 additions & 0 deletions LLMS.txt
@@ -0,0 +1,235 @@
# python-icap

> Pure Python ICAP client library with no external dependencies

python-icap is a Python library for communicating with ICAP (Internet Content Adaptation Protocol) servers. It implements RFC 3507 and integrates with servers such as c-icap and SquidClamav for antivirus scanning, content filtering, and data loss prevention.

## Key Features

- Pure Python implementation with zero runtime dependencies
- Sync (`IcapClient`) and async (`AsyncIcapClient`) clients with full API parity
- High-level file scanning: `scan_file()`, `scan_bytes()`, `scan_stream()`
- SSL/TLS support with custom certificates and mutual TLS
- Bundled pytest plugin for testing without a live server
- Python 3.8+ support

## Installation

```bash
pip install python-icap
```

## Quick Start

```python
from icap import IcapClient

with IcapClient('localhost', port=1344) as client:
    response = client.scan_file('/path/to/file.pdf')
    if response.is_no_modification:
        print("File is clean")
    else:
        print("Threat detected")
```

Async usage:

```python
import asyncio
from icap import AsyncIcapClient

async def scan():
    async with AsyncIcapClient('localhost') as client:
        response = await client.scan_bytes(b"content")
        print(f"Clean: {response.is_no_modification}")

asyncio.run(scan())
```

## Public API

### Main Classes

Import from `icap`:

- `IcapClient(address, port=1344, timeout=10, ssl_context=None, max_response_size=104857600)` - Synchronous client
- `AsyncIcapClient(address, port=1344, timeout=10.0, ssl_context=None, max_response_size=104857600)` - Async client
- `max_response_size` (both clients) - Caps the response size in bytes to prevent DoS; the default of 104857600 is 100 MB
- `IcapResponse` - Response object with `status_code`, `headers`, `body`, `is_success`, `is_no_modification`, `encapsulated`, `istag`
- `EncapsulatedParts` - Dataclass for parsed Encapsulated header offsets (`req_hdr`, `req_body`, `res_hdr`, `res_body`, `null_body`, `opt_body`)
- `CaseInsensitiveDict` - Case-insensitive dictionary for ICAP headers (per RFC 3507)
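
To illustrate what case-insensitive header access means in practice, here is a minimal stdlib-only sketch. The class name `HeaderDict` is hypothetical and this is not the library's `CaseInsensitiveDict` implementation; it only shows one common way to get the behavior (lowercased lookup keys, original spelling preserved for iteration).

```python
from collections.abc import MutableMapping


class HeaderDict(MutableMapping):
    """Illustrative stand-in, not the library's CaseInsensitiveDict."""

    def __init__(self, data=None):
        # Maps lowercased name -> (name as last written, value)
        self._store = {}
        if data:
            self.update(data)

    def __setitem__(self, key, value):
        self._store[key.lower()] = (key, value)

    def __getitem__(self, key):
        return self._store[key.lower()][1]

    def __delitem__(self, key):
        del self._store[key.lower()]

    def __iter__(self):
        # Yield header names with the spelling they were last set with
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)


headers = HeaderDict({"ISTag": "abc123"})
print(headers["istag"], "ISTAG" in headers)
```

With a mapping like this, `headers.get("preview")` and `headers.get("Preview")` resolve to the same entry, which is the property the header list above relies on.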

### High-Level Methods (Recommended)

- `scan_file(filepath, service='avscan')` - Scan file by path
- `scan_bytes(data, service='avscan', filename=None)` - Scan bytes directly
- `scan_stream(stream, service='avscan', filename=None, chunk_size=0)` - Scan file-like objects

### Low-Level Methods

- `options(service)` - Query server capabilities
- `respmod(service, http_request, http_response, headers=None, preview=None)` - Response modification
- `reqmod(service, http_request, http_body=None, headers=None)` - Request modification

### IcapResponse Properties

- `status_code` - ICAP status code (200, 204, etc.)
- `headers` - Case-insensitive header dictionary
- `body` - Response body bytes
- `is_success` - True for 2xx status codes
- `is_no_modification` - True for 204 (content is clean)
- `encapsulated` - Parsed `EncapsulatedParts` from Encapsulated header (or None)
- `istag` - ISTag header value for cache validation (RFC 3507 Section 4.7)
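
As a rough illustration of what parsing the Encapsulated header involves, the sketch below turns a header value such as `req-hdr=0, res-hdr=137, res-body=296` into named offsets. The function name `parse_encapsulated` and the plain-dict return type are assumptions made for this example; the library's `EncapsulatedParts` dataclass may be built differently.

```python
from typing import Dict


def parse_encapsulated(value: str) -> Dict[str, int]:
    """Parse an Encapsulated header value into {part_name: byte_offset}.

    Hypothetical helper for illustration only. Part names are converted
    from the wire form (res-hdr) to attribute style (res_hdr).
    """
    parts: Dict[str, int] = {}
    for item in value.split(","):
        name, _, offset = item.strip().partition("=")
        number = int(offset)  # raises ValueError on missing or non-numeric offsets
        if number < 0:
            raise ValueError(f"negative offset for {name!r}: {number}")
        parts[name.replace("-", "_")] = number
    return parts


print(parse_encapsulated("req-hdr=0, res-hdr=137, res-body=296"))
```

Rejecting negative offsets mirrors the validation mentioned in the commit list, though the exact checks the library performs are not shown in this diff.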

### Connection Management

- `connect()` / `disconnect()` - Manual connection control
- `is_connected` - Property to check connection status
- Context manager support: `with IcapClient(...) as client:`

### Exceptions

Import from `icap.exception`:

- `IcapException` - Base exception class
- `IcapConnectionError` - Connection failures
- `IcapTimeoutError` - Operation timeouts
- `IcapProtocolError` - Malformed responses
- `IcapServerError` - Server 5xx errors
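
One way callers might use this hierarchy is to retry transient failures while letting protocol errors surface immediately. The sketch below defines local stand-in exception classes so it runs without the library; that the specific exceptions subclass `IcapException` is an assumption based on the "Base exception class" wording, and `scan_with_retry` is a hypothetical helper, not part of python-icap.

```python
import time


# Local stand-ins mirroring the names above (assumed hierarchy).
class IcapException(Exception): ...
class IcapConnectionError(IcapException): ...
class IcapTimeoutError(IcapException): ...
class IcapProtocolError(IcapException): ...


def scan_with_retry(scan, attempts=3, delay=0.0):
    """Call scan() until it succeeds, retrying only transient errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return scan()
        except (IcapConnectionError, IcapTimeoutError) as exc:
            last_error = exc  # transient: wait, then try again
            time.sleep(delay)
    raise last_error
```

In real code, `scan` would be something like `lambda: client.scan_bytes(data)`, with the exceptions imported from `icap.exception` instead of being defined locally.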

## Pytest Plugin

The bundled pytest plugin provides fixtures for testing ICAP integrations.

### Mock Fixtures (no server required)

- `mock_icap_client` - Returns clean (204) responses
- `mock_async_icap_client` - Async version
- `mock_icap_client_virus` - Returns virus detection responses
- `mock_icap_client_timeout` - Raises `IcapTimeoutError`
- `mock_icap_client_connection_error` - Raises `IcapConnectionError`

### Response Builder

```python
from icap.pytest_plugin import IcapResponseBuilder

response = IcapResponseBuilder().clean().build() # 204 No Modification
response = IcapResponseBuilder().virus("Trojan.Gen").build() # Virus detected
response = IcapResponseBuilder().error(503, "Unavailable").build() # Error
```

### Pre-built Response Fixtures

- `icap_response_clean` - 204 No Modification response
- `icap_response_virus` - Virus detection response (X-Virus-ID: EICAR-Test)
- `icap_response_options` - OPTIONS response with server capabilities
- `icap_response_error` - 500 Server Error response
- `icap_response_builder` - Factory fixture returning `IcapResponseBuilder`

### MockIcapClient

```python
from icap.pytest_plugin import MockIcapClient, IcapResponseBuilder

client = MockIcapClient()
client.on_respmod(IcapResponseBuilder().virus("Trojan").build())
response = client.scan_bytes(b"content")
client.assert_called("scan_bytes", times=1)
```

### Markers

```python
import pytest

@pytest.mark.icap_mock(response="clean")
def test_clean(icap_mock):
    assert icap_mock.scan_bytes(b"data").is_no_modification

@pytest.mark.icap_mock(response="virus", virus_name="EICAR")
def test_virus(icap_mock):
    assert not icap_mock.scan_bytes(b"data").is_no_modification
```

### Advanced Mock Features

```python
import pytest
from icap.pytest_plugin import MockIcapClient, IcapResponseBuilder, MatcherBuilder

# Conditional responses based on content
client = MockIcapClient()
client.when(
    MatcherBuilder().body_contains(b"EICAR").build()
).respond(IcapResponseBuilder().virus("EICAR-Test").build())
client.default_response(IcapResponseBuilder().clean().build())

# Call inspection
response = client.scan_bytes(b"test")
call = client.last_call
assert call.method == "scan_bytes"
assert call.args[0] == b"test"

# Strict mode (fails if no matching response)
@pytest.mark.icap_mock(strict=True)
def test_strict(icap_mock):
    icap_mock.on_respmod(IcapResponseBuilder().clean().build())
    # MockResponseExhaustedError if called more times than configured
```

## Project Structure

```
src/icap/
├── __init__.py # Public API exports
├── icap.py # Synchronous IcapClient
├── async_icap.py # AsyncIcapClient
├── response.py # IcapResponse, EncapsulatedParts, CaseInsensitiveDict
├── exception.py # Custom exceptions
├── _protocol.py # Shared protocol utilities
└── pytest_plugin/ # Bundled pytest plugin
    ├── plugin.py # Fixtures and markers
    ├── builder.py # IcapResponseBuilder
    ├── mock_client.py # MockIcapClient (sync)
    ├── mock_async.py # MockAsyncIcapClient
    ├── matchers.py # ResponseMatcher, MatcherBuilder
    ├── call_record.py # MockCall for call inspection
    ├── protocols.py # ResponseCallback protocols
    └── mock.py # Re-exports for backward compatibility
```

## Common Service Names

- `"avscan"` or `"srv_clamav"` - ClamAV virus scanning
- `"squidclamav"` - SquidClamav service
- `"echo"` - Echo service (testing)

## Testing

```bash
# Unit tests
pytest -m "not integration"

# Integration tests (requires Docker)
docker compose -f docker/docker-compose.yml up -d
pytest -m integration
```

## EICAR Test String

Standard test string for triggering antivirus detection:

```python
EICAR = b'X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*'
```
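
A quick sanity check on the literal can catch escaping mistakes: the standard EICAR test file is exactly 68 bytes and contains a single backslash, which is why the backslash is doubled in a non-raw Python bytes literal. The helper `write_eicar` below is a hypothetical convenience for producing a test file, not something the library provides.

```python
import os
import tempfile

EICAR = b'X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*'

# The escaped "\\" is one byte on the wire, giving the standard 68-byte string.
assert len(EICAR) == 68
assert EICAR.count(b"\\") == 1


def write_eicar(directory: str) -> str:
    """Write the test string to a file in `directory` and return its path."""
    path = os.path.join(directory, "eicar.com.txt")
    with open(path, "wb") as handle:
        handle.write(EICAR)
    return path


with tempfile.TemporaryDirectory() as tmp:
    print(os.path.getsize(write_eicar(tmp)))
```

Local antivirus software may quarantine the file as soon as it is written, so a temporary directory that is cleaned up afterwards is usually the safer choice.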

## Protocol Reference

- RFC 3507: Internet Content Adaptation Protocol
- Default port: 1344
- Methods: OPTIONS, REQMOD, RESPMOD
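
For orientation, this is roughly what an OPTIONS request looks like on the wire under RFC 3507: a request line with an `icap://` URI and the `ICAP/1.0` version, followed by CRLF-terminated headers and a blank line. The function `build_options_request` is a hypothetical sketch, not the library's internal code, and sending `Encapsulated: null-body=0` on OPTIONS is an assumption about common client practice.

```python
def build_options_request(host: str, service: str, port: int = 1344) -> bytes:
    """Build a minimal ICAP OPTIONS request (illustrative sketch only)."""
    lines = [
        f"OPTIONS icap://{host}:{port}/{service} ICAP/1.0",
        f"Host: {host}:{port}",
        "Encapsulated: null-body=0",  # assumption: no encapsulated body on OPTIONS
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_options_request("localhost", "avscan").decode("ascii"))
```

The high-level `options(service)` method listed earlier is the intended way to do this; the sketch only shows the shape of the message that travels to port 1344.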

## Links

- Repository: https://github.com/CaptainDriftwood/python-icap
- PyPI: https://pypi.org/project/python-icap/
- c-icap: https://c-icap.sourceforge.net/
- SquidClamav: https://squidclamav.darold.net/
- ClamAV: https://www.clamav.net/
92 changes: 87 additions & 5 deletions README.md
@@ -12,6 +12,7 @@
[![Python 3.8 | 3.9 | 3.10 | 3.11 | 3.12 | 3.13 | 3.14](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11%20%7C%203.12%20%7C%203.13%20%7C%203.14-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![No Dependencies](https://img.shields.io/badge/dependencies-none-brightgreen.svg)](pyproject.toml)
[![LLMS.txt](https://img.shields.io/badge/LLMS-txt-blue)](https://github.com/CaptainDriftwood/python-icap/blob/master/LLMS.txt)

[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
@@ -21,6 +22,7 @@ A pure Python ICAP (Internet Content Adaptation Protocol) client with no externa

## Table of Contents

- [Quick Reference](#quick-reference)
- [Overview](#overview)
- [What is ICAP?](#what-is-icap)
- [Key Differences from HTTP](#key-differences-from-http)
@@ -39,6 +41,7 @@ A pure Python ICAP (Internet Content Adaptation Protocol) client with no externa
- [Scanning Content with RESPMOD](#scanning-content-with-respmod)
- [Scanning Files](#scanning-files)
- [Manual File Scanning (lower-level API)](#manual-file-scanning-lower-level-api)
- [Preview Mode](#preview-mode)
- [Async Usage](#async-usage)
- [Basic Async Example](#basic-async-example)
- [Concurrent Scanning](#concurrent-scanning)
@@ -75,6 +78,39 @@ python-icap provides a clean, Pythonic API for integrating ICAP into your applic
- **Pytest plugin** - Mock clients and fixtures for testing without a live server
- **Zero dependencies** - Pure Python stdlib implementation

## Quick Reference

```python
# Common imports
from icap import IcapClient, AsyncIcapClient, IcapResponse
from icap.exception import IcapException, IcapConnectionError, IcapTimeoutError

# Scan bytes (simplest)
with IcapClient("localhost") as client:
    response = client.scan_bytes(b"content")
    is_clean = response.is_no_modification

# Scan file
with IcapClient("localhost") as client:
    response = client.scan_file("/path/to/file.pdf")

# Async scan (must run inside an async function)
async with AsyncIcapClient("localhost") as client:
    response = await client.scan_bytes(b"content")

# Check server capabilities
with IcapClient("localhost") as client:
    response = client.options("avscan")
    preview_size = response.headers.get("Preview")  # bytes for preview mode

# Response properties
response.status_code # 200, 204, etc.
response.is_success # True for 2xx
response.is_no_modification # True for 204 (clean)
response.headers # Dict of ICAP headers
response.body # Response body bytes
```

## What is ICAP?

**ICAP (Internet Content Adaptation Protocol)** is a simple protocol that lets network devices (like proxies) send HTTP content to a separate server for inspection or modification before passing it along.
@@ -188,10 +224,11 @@ This allows the ICAP server to efficiently parse the message without scanning th

## Installation

-> **Note:** This package is not yet published to PyPI due to a name collision. Install directly from source.
-
```bash
-# Standard installation
+# Install from PyPI
+pip install python-icap
+
+# Or install from source
pip install .

# Development installation (editable)
@@ -322,6 +359,51 @@ else:
    print("File contains threats")
```

### Preview Mode

ICAP servers can advertise a preview size via OPTIONS, allowing clients to send just the beginning of a file for initial scanning. If the server can determine the content is clean from the preview alone, it returns 204; otherwise it requests the full content with 100 Continue.

```python
from icap import IcapClient

with IcapClient('localhost', port=1344) as client:
    # Step 1: Query server capabilities to get preview size
    options_response = client.options('avscan')
    preview_size = options_response.headers.get('Preview')

    if preview_size:
        print(f"Server supports preview mode: {preview_size} bytes")
    else:
        print("Server does not advertise preview support")

    # Step 2: Send RESPMOD with preview enabled
    # The client handles the 100 Continue flow automatically
    content = b"Large file content here..." * 1000

    response = client.scan_bytes(
        content,
        service='avscan',
        preview=int(preview_size) if preview_size else None
    )

    if response.is_no_modification:
        print("Content is clean")
    else:
        print("Threat detected")
```

**How Preview Mode Works:**

1. Client sends OPTIONS to discover the server's preview size
2. Client sends RESPMOD with only the first N bytes (preview)
3. Server analyzes the preview:
   - Returns **204 No Modification** if the preview is enough to determine content is clean
   - Returns **100 Continue** to request the remaining data
4. If 100 Continue received, client sends the rest of the content
5. Server returns final verdict (200 or 204)

Preview mode reduces bandwidth and latency when scanning large files that are obviously clean (or obviously malicious) from their headers.
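
The preview step in the flow above can be sketched at the byte level. Under RFC 3507, the preview is sent as a chunked body ending in a zero-length chunk, and the `ieof` chunk extension signals that the preview already contains the entire body. The helper name `split_for_preview` is hypothetical, and this is a simplified illustration rather than the client's actual implementation.

```python
from typing import Tuple


def split_for_preview(body: bytes, preview_size: int) -> Tuple[bytes, bytes]:
    """Return (chunk-encoded preview, bytes held back until 100 Continue)."""
    head, rest = body[:preview_size], body[preview_size:]
    # Chunk sizes are written in hexadecimal, followed by CRLF, data, CRLF.
    chunk = b"%x\r\n%s\r\n" % (len(head), head) if head else b""
    # "0; ieof" tells the server nothing follows the preview.
    terminator = b"0\r\n\r\n" if rest else b"0; ieof\r\n\r\n"
    return chunk + terminator, rest


preview, remainder = split_for_preview(b"hello world", 5)
print(preview, remainder)
```

When `remainder` is non-empty, it is sent only if the server answers the preview with 100 Continue; a 204 at that point ends the exchange without the rest of the file ever leaving the client.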

## Async Usage

python-icap includes an async client (`AsyncIcapClient`) for use with `asyncio`. The async client provides the same API as the sync client but with `async`/`await` syntax.
@@ -596,8 +678,8 @@ python-icap/
│ ├── async_icap.py # Asynchronous ICAP client
│ ├── _protocol.py # Shared protocol constants
│ ├── response.py # Response handling
-── exception.py # Custom exceptions
-── pytest_src/icap/ # Pytest plugin for ICAP testing
+│ ├── exception.py # Custom exceptions
+│ └── pytest_plugin/ # Bundled pytest plugin for testing
├── tests/ # Unit tests
├── examples/ # Usage examples
├── docker/ # Docker setup for integration testing