| Field | Value |
|---|---|
| Document ID | SPEC-TESTINGKIT-001 |
| Version | 1.0.0 |
| Status | Draft |
| Author | Phenotype Architecture Team |
| Created | 2026-04-05 |
| Last Updated | 2026-04-05 |
| Target Release | 1.0.0 |
- SOTA.md - State-of-the-art research
- ADR.md - Architecture decision records
- README.md - Quick start guide
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0.0 | 2026-04-05 | Architecture Team | Initial specification |
TestingKit is a comprehensive, multi-language testing framework designed for the Phenotype ecosystem. It provides:
- Language-native testing utilities for Rust, Python, and Go
- Cross-language testing patterns for unified developer experience
- Code quality analysis integrated with testing workflows
- Mocking and test doubles with language-idiomatic APIs
- Test fixtures and data generation for reproducible tests
- Performance and integration testing infrastructure
In Scope:
- Unit testing utilities and patterns
- Integration testing infrastructure
- Mocking frameworks
- Test fixtures and builders
- Code quality analysis (code smells, patterns)
- Performance testing support
- CI/CD integration
- Cross-language coordination
Out of Scope:
- GUI testing (use dedicated tools like Playwright)
- Mobile testing
- Hardware-in-the-loop testing
- Compliance certification frameworks
- Phenotype Contributors - Testing their contributions
- Ecosystem Developers - Building on Phenotype
- CI/CD Systems - Automated testing pipelines
- Quality Engineers - Code quality enforcement
| Metric | Target |
|---|---|
| Test execution speed | <10ms per unit test |
| Mock setup size | <5 lines of code |
| Code smell detection | 90%+ accuracy |
| Documentation coverage | 100% public APIs |
| CI integration time | <2 minutes setup |
┌─────────────────────────────────────────────────────────────────┐
│                      TestingKit Ecosystem                       │
├─────────────────────────────────────────────────────────────────┤
│      ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│      │     Rust     │  │    Python    │  │      Go      │       │
│      │   Testing    │  │   Testing    │  │   Testing    │       │
│      └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│             │                 │                 │               │
│             └─────────────────┼─────────────────┘               │
│                               │                                 │
│                  ┌────────────┴────────────┐                    │
│                  │  Shared Patterns Layer  │                    │
│                  │  • Test Data Formats    │                    │
│                  │  • Result Aggregation   │                    │
│                  │  • CI/CD Integration    │                    │
│                  └─────────────────────────┘                    │
└─────────────────────────────────────────────────────────────────┘
| Component | Purpose | Dependencies | Lines of Code |
|---|---|---|---|
| phenotype-testing | Core utilities | tokio, tracing, rand | ~500 |
| phenotype-mock | Mocking framework | parking_lot | ~400 |
| phenotype-test-fixtures | Test data | chrono, uuid, serde | ~200 |
| phenotype-test-infra | Integration infra | tokio, tempfile | ~300 |
| phenotype-compliance-scanner | Quality checks | syn, quote | ~400 |
| Component | Purpose | Dependencies | Lines of Code |
|---|---|---|---|
| pheno-testing | Core utilities | pytest, anyio | ~800 |
| pheno-quality | Code quality | ast, pylint | ~1000 |
| Component | Purpose | Dependencies | Lines of Code |
|---|---|---|---|
| phenotype-testing | Core utilities | testify | ~200 |
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Test Source   │────▶│ Test Discovery  │────▶│ Test Execution  │
│   Code Files    │     │ Language-native │     │ Parallel/Serial │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
         ┌───────────────────────┬───────────────────────┤
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Mock Context   │     │  Fixture Setup  │     │  Quality Check  │
│   (if needed)   │     │   (if needed)   │     │   (optional)    │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │   Test Result   │
                        │   Aggregation   │
                        └────────┬────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    JUnit XML    │     │    Coverage     │     │      CI/CD      │
│     Report      │     │     Report      │     │   Integration   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
//! Phenotype Testing - Core testing utilities
pub mod assertions; // Assertion helpers
pub mod generators; // Test data generators
pub mod runtime; // Test runtime setup
// Core functions
pub async fn timeout<F, T>(future: F, duration: Duration) -> Result<T, Elapsed>;
pub async fn retry_async<F, Fut, T, E>(operation: F, max_attempts: u32, base_delay: Duration) -> Result<T, E>;
pub fn block_on<F, T>(future: F) -> T;
pub fn test_id() -> String;
pub fn random_port() -> u16;
pub async fn wait_for<F, Fut>(condition: F, timeout: Duration) -> bool;
Signature:
pub async fn timeout<F, T>(future: F, duration: Duration) -> Result<T, tokio::time::error::Elapsed>
where
F: Future<Output = T>,
Behavior:
- Executes `future` with the specified timeout
- Returns `Ok(result)` if the future completes within the timeout
- Returns `Err(Elapsed)` if the timeout expires
- Uses tokio's timer for accuracy
Examples:
#[tokio::test]
async fn test_with_timeout() {
let result = timeout(
async { expensive_operation().await },
Duration::from_secs(5)
).await;
assert!(result.is_ok(), "Operation timed out");
}
Error Handling:
- Elapsed error contains no additional information
- Caller responsible for interpreting timeout as failure
Signature:
pub async fn retry_async<F, Fut, T, E>(
mut operation: F,
max_attempts: u32,
base_delay: Duration,
) -> Result<T, E>
where
F: FnMut() -> Fut,
Fut: Future<Output = Result<T, E>>,
Behavior:
- Retries the operation up to `max_attempts` times
- Uses exponential backoff: the delay after attempt n (counting from 1) is `base_delay * 2^(n - 1)`
- Returns the first Ok result
- Returns the last Err if all attempts fail
Retry Strategy:
| Attempt | Delay Formula | Example (base=100ms) |
|---|---|---|
| 1 | base_delay | 100ms |
| 2 | base_delay * 2 | 200ms |
| 3 | base_delay * 4 | 400ms |
| 4 | base_delay * 8 | 800ms |
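The delays in the table above can be reproduced with a small helper. This is a minimal sketch using only the standard library; `backoff_delay` is a hypothetical name, not part of the phenotype-testing API, and the saturation behaviour for very large attempt numbers is an assumption.

```rust
use std::time::Duration;

/// Delay to wait after the given 1-based attempt fails:
/// base_delay * 2^(attempt - 1). Hypothetical helper for illustration.
/// Attempt 0 is treated like attempt 1, and the multiplier is capped at
/// 2^31 so the shift cannot overflow.
fn backoff_delay(base_delay: Duration, attempt: u32) -> Duration {
    let exponent = attempt.saturating_sub(1).min(31);
    // saturating_mul clamps to Duration::MAX instead of panicking.
    base_delay.saturating_mul(1u32 << exponent)
}
```

With `base_delay = 100ms`, attempts 1 through 4 yield 100ms, 200ms, 400ms, and 800ms, matching the table.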
Examples:
#[tokio::test]
async fn test_with_retry() {
let result = retry_async(
|| async { flaky_network_call().await },
5, // Max 5 attempts
Duration::from_millis(100), // Start with 100ms
).await;
assert!(result.is_ok());
}
Random String Generator:
pub fn random_string(len: usize) -> String
- Uses alphanumeric charset: A-Z, a-z, 0-9
- Cryptographically insecure (for testing only)
- Thread-safe
Random Email Generator:
pub fn random_email() -> String
- Format: `{random(10)}@example.com`
- Guaranteed valid email format
Random UUID Generator:
pub fn random_uuid() -> String
- RFC 4122 version 4 UUID format
- Example: `550e8400-e29b-41d4-a716-446655440000`
Signature:
pub fn random_port() -> u16
- Returns ports from dynamic range: 49152-65535
- Uses thread_rng for distribution
- No guarantee of availability
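Because a randomly chosen port may already be in use, a test that needs a guaranteed-free port can ask the operating system for one instead. The sketch below is an alternative technique, not the behaviour of `random_port`; `bind_free_port` is a hypothetical helper name.

```rust
use std::net::TcpListener;

/// Hypothetical helper: bind to port 0 so the OS picks a free port.
/// The listener is returned together with the port; the port stays
/// reserved for as long as the listener is kept alive.
fn bind_free_port() -> std::io::Result<(TcpListener, u16)> {
    let listener = TcpListener::bind(("127.0.0.1", 0))?;
    let port = listener.local_addr()?.port();
    Ok((listener, port))
}
```

Handing the bound listener to the server under test avoids the gap between picking a port and binding it, which is where collisions tend to occur.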
Usage Pattern:
#[tokio::test]
async fn test_server() {
let port = random_port();
let server = TestServer::bind(port).await;
// Test server...
}
CallRecord:
#[derive(Debug, Clone, Default)]
pub struct CallRecord {
pub method: String,
pub args: Vec<String>,
pub return_value: Option<String>,
pub count: usize,
}
Matcher:
#[derive(Debug, Clone, Default)]
pub struct Matcher {
pub method: String,
pub expected_args: Option<Vec<String>>,
}
Expectation:
#[derive(Debug, Clone, Default)]
pub struct Expectation {
pub matcher: Matcher,
pub return_value: Option<String>,
pub times: Option<usize>,
pub called_count: usize,
}
Construction:
impl MockContext {
pub fn new() -> Self;
}
Call Recording:
pub fn record_call(&self, method: impl Into<String>, args: Vec<String>);
Verification:
pub fn verify_called(&self, method: impl AsRef<str>) -> bool;
pub fn verify_called_with(&self, method: impl AsRef<str>, args: &[&str]) -> bool;
pub fn verify_call_count(&self, method: impl AsRef<str>, expected: usize) -> bool;
pub fn call_count(&self, method: impl AsRef<str>) -> usize;
Expectations:
pub fn expect(&self, method: impl Into<String>) -> ExpectationBuilder;
pub fn get_return_value(&self, method: impl AsRef<str>, args: &[String]) -> Option<String>;
pub fn verify_all(&self) -> Result<(), Vec<String>>;
Fluent Interface:
impl ExpectationBuilder {
pub fn with_args(mut self, args: Vec<impl Into<String>>) -> Self;
pub fn returns<T: Into<String>>(mut self, value: T) -> Self;
pub fn times(mut self, count: usize) -> Self;
pub fn build(self);
}
Usage Example:
let ctx = MockContext::new();
ctx.expect("get_user")
.with_args(vec!["123"])
.returns(r#"{"id": "123", "name": "Alice"}"#)
.times(1)
.build();
// Use mock...
ctx.verify_all().expect("All expectations met");
Purpose: Generate mock struct boilerplate
Syntax:
mock_trait!(
MockName for TraitPath {
fn method_name(arg1: Type1, arg2: Type2) -> ReturnType;
}
);
Expansion:
// Input:
mock_trait!(MockDatabase for Database {
fn get(&self, key: &str) -> Option<String>;
});
// Expands to:
pub struct MockDatabase {
context: phenotype_mock::MockContext,
}
impl MockDatabase {
pub fn new() -> Self {
Self {
context: phenotype_mock::MockContext::new(),
}
}
pub fn context(&self) -> &phenotype_mock::MockContext {
&self.context
}
}
impl Default for MockDatabase {
fn default() -> Self {
Self::new()
}
}
Definition:
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TestData<T> {
pub id: Uuid,
pub name: String,
pub value: T,
pub created_at: DateTime<Utc>,
pub metadata: HashMap<String, String>,
}
Builder Pattern:
impl<T: Default> TestData<T> {
pub fn new(name: impl Into<String>, value: T) -> Self;
pub fn with_metadata(mut self, key: impl Into<String>, value: impl Into<String>) -> Self;
}
Usage:
let data = TestData::new("test-user", User::default())
.with_metadata("source", "fixture")
.with_metadata("version", "1.0");
Definition:
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TestScenario {
pub name: String,
pub description: String,
pub setup: Vec<TestStep>,
pub execution: Vec<TestStep>,
pub teardown: Vec<TestStep>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TestStep {
pub name: String,
pub action: String,
pub expected_result: String,
}
Purpose: HTTP test server for integration tests
API:
pub struct TestServer {
pub addr: SocketAddr,
pub base_url: String,
_temp_dir: TempDir,
}
impl TestServer {
pub async fn new() -> std::io::Result<Self>;
pub fn url(&self, path: &str) -> String;
}
Lifecycle:
- Create temp directory for server files
- Bind to random port on localhost
- Store address and base URL
- Clean up temp directory on drop
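The four lifecycle steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not the phenotype-test-infra implementation: `SketchServer` is a hypothetical type, it uses a plain directory under the system temp dir instead of the `tempfile` crate, and it only reserves the port without serving HTTP.

```rust
use std::fs;
use std::io;
use std::net::{SocketAddr, TcpListener};
use std::path::PathBuf;
use std::sync::atomic::{AtomicU64, Ordering};

// Counter that keeps directory names unique within one process.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical stand-in for TestServer; it reserves resources only.
struct SketchServer {
    addr: SocketAddr,
    base_url: String,
    temp_dir: PathBuf,
    _listener: TcpListener,
}

impl SketchServer {
    fn new() -> io::Result<Self> {
        // 1. Create a temp directory for server files.
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let temp_dir = std::env::temp_dir()
            .join(format!("sketch-server-{}-{}", std::process::id(), id));
        fs::create_dir_all(&temp_dir)?;
        // 2. Bind to an OS-assigned port on localhost.
        let bound = TcpListener::bind(("127.0.0.1", 0))
            .and_then(|l| l.local_addr().map(|a| (l, a)));
        let (listener, addr) = match bound {
            Ok(pair) => pair,
            Err(e) => {
                // Do not leak the directory if binding fails.
                let _ = fs::remove_dir_all(&temp_dir);
                return Err(e);
            }
        };
        // 3. Store the address and base URL.
        let base_url = format!("http://{}", addr);
        Ok(Self { addr, base_url, temp_dir, _listener: listener })
    }

    fn url(&self, path: &str) -> String {
        format!("{}{}", self.base_url, path)
    }
}

impl Drop for SketchServer {
    // 4. Clean up the temp directory on drop (errors are ignored here).
    fn drop(&mut self) {
        let _ = fs::remove_dir_all(&self.temp_dir);
    }
}
```

Tying cleanup to `Drop` means the directory is removed even when a test panics and unwinds, which is the main reason to prefer it over an explicit teardown call.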
Purpose: Temporary database for integration tests
API:
pub struct TestDatabase {
pub connection_string: String,
_temp_dir: TempDir,
}
impl TestDatabase {
pub fn new() -> std::io::Result<Self>;
pub async fn setup(&self) -> Result<(), Box<dyn std::error::Error>>;
pub async fn teardown(&self) -> Result<(), Box<dyn std::error::Error>>;
}
Default Configuration:
- SQLite in temp directory
- Connection string: `sqlite:{temp_path}/test.db`
- Auto-cleanup on drop
Purpose: Aggregate all test resources
API:
pub struct TestContext {
pub server: Option<TestServer>,
pub database: Option<TestDatabase>,
pub temp_dir: TempDir,
}
impl TestContext {
pub fn new() -> std::io::Result<Self>;
pub async fn with_server(mut self) -> std::io::Result<Self>;
pub fn with_database(mut self) -> std::io::Result<Self>;
}
Builder Pattern Usage:
let ctx = TestContext::new()?
.with_server().await?
.with_database()?;
// Use ctx.server, ctx.database, ctx.temp_dir
// All resources cleaned up when ctx drops
MCP (Model Context Protocol) Testing:
# Process monitoring
from pheno_testing.mcp_qa.process import ProcessMonitor
monitor = ProcessMonitor(pid=1234)
monitor.start_monitoring()
metrics = monitor.get_metrics()
# metrics.cpu_percent, metrics.memory_mb, metrics.status
# Structured logging
from pheno_testing.mcp_qa.logging import MCPFormatter, MCPLogger
logger = MCPLogger(formatter=MCPFormatter(
    include_context=True,
    include_timestamp=True,
))
# Connection management
from pheno_testing.mcp_qa.monitoring import ConnectionManager
manager = ConnectionManager(
    max_connections=10,
    connection_timeout=30.0,
)
Benchmark Decorator:
from pheno_testing.performance import Benchmark
@Benchmark(warmup=5, iterations=100, timeout=60.0)
def test_database_query():
    return db.query("SELECT * FROM large_table")
Load Testing:
from pheno_testing.performance import LoadTester
tester = LoadTester(
    target=test_function,
    concurrent_users=10,
    duration=60.0,
)
results = tester.run()
# results.requests_per_second
# results.average_latency
# results.error_rate
from pheno_testing.fixtures import async_fixture
@async_fixture
async def async_client():
    client = await Client.connect()
    yield client
    await client.close()
Supported Smells:
| Smell | Description | Detection Method |
|---|---|---|
| God Object | Class with too many responsibilities | Method/field count |
| Feature Envy | Method using another class's data | Data flow analysis |
| Data Clump | Related data appearing together | Co-occurrence analysis |
| Shotgun Surgery | Change requires many modifications | Change coupling |
| Divergent Change | Class modified for different reasons | Change history |
| Message Chain | Excessive method chaining | Call chain length |
| Duplicate Code | Similar code blocks | AST comparison |
| Lazy Class | Minimal functionality class | Method complexity |
| Refused Bequest | Unused inheritance | Override analysis |
| Middle Man | Excessive delegation | Call forwarding |
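To make the "Method/field count" detection method concrete, here is a language-neutral sketch written in Rust. The types `ClassStats` and `GodObjectRule` are hypothetical and do not mirror the pheno-quality API; the `max_fields` threshold in particular is an assumption, since the spec only shows `max_methods`.

```rust
/// Hypothetical per-class summary; a real detector would derive these
/// counts from the parsed source (for example, from the AST).
struct ClassStats {
    name: String,
    methods: usize,
    fields: usize,
}

/// Hypothetical threshold rule for the God Object smell.
struct GodObjectRule {
    max_methods: usize,
    max_fields: usize, // assumed threshold, not taken from the spec
}

impl GodObjectRule {
    /// Returns a message when the class exceeds either threshold.
    fn check(&self, class: &ClassStats) -> Option<String> {
        if class.methods > self.max_methods || class.fields > self.max_fields {
            Some(format!(
                "{}: possible God Object ({} methods, {} fields)",
                class.name, class.methods, class.fields
            ))
        } else {
            None
        }
    }
}
```

Count-based rules like this are cheap to evaluate, but they are heuristics: a class can exceed the thresholds for legitimate reasons, so findings are better treated as prompts for review than as hard failures.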
Detector Interface:
from pheno_quality.tools import CodeSmellDetector
detector = CodeSmellDetector(
    rules=[
        GodObjectRule(max_methods=20),
        FeatureEnvyRule(threshold=0.7),
    ]
)
issues = detector.analyze_file("src/service.py")
for issue in issues:
    print(f"{issue.location}: {issue.severity} - {issue.message}")
Supported Patterns:
| Pattern | Validation Approach |
|---|---|
| Clean Architecture | Dependency direction |
| Domain-Driven Design | Aggregate boundaries |
| SOLID | Interface analysis |
| Hexagonal | Port/adapter matching |
| Layered | Layer dependency rules |
| Microservices | Service boundary detection |
Validator Interface:
from pheno_quality.tools import ArchitecturalValidator
validator = ArchitecturalValidator(
    patterns=[CleanArchitecture(), DDD()]
)
report = validator.validate_project("src/")
for violation in report.violations:
    print(f"{violation.rule}: {violation.location}")
# conftest.py
import pytest
from pheno_quality.pytest_plugin import QualityPlugin
def pytest_configure(config):
    config.pluginmanager.register(QualityPlugin(
        rules="pheno_quality.rules.STANDARD",
        fail_on="error",
    ))
CLI Usage:
# Run tests with quality checks
pytest --quality
# Quality-only run
pytest --quality-only
# Fail on warnings too
pytest --quality --quality-fail-level=warning
| Function | Signature | Purpose |
|---|---|---|
| timeout | `async fn<F,T>(F, Duration) -> Result<T, Elapsed>` | Execute with timeout |
| timeout_default | `async fn<F,T>(F) -> Result<T, Elapsed>` | Execute with 5s timeout |
| block_on | `fn<F,T>(F) -> T` | Block on async in sync context |
| test_id | `fn() -> String` | Generate unique test ID |
| random_port | `fn() -> u16` | Generate random port |
| wait_for | `async fn<F,Fut>(F, Duration) -> bool` | Wait for condition |
| retry_async | `async fn<F,Fut,T,E>(F, u32, Duration) -> Result<T,E>` | Retry with backoff |
| Type | Purpose |
|---|---|
| CallRecord | Record of mock invocation |
| Matcher | Argument matching specification |
| Expectation | Expected call specification |
| MockContext | Thread-safe mock state |
| ExpectationBuilder | Fluent expectation construction |
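The "thread-safe mock state" role of MockContext can be illustrated with a reduced sketch that only records and verifies calls. `SketchMockContext` is a hypothetical type: it uses `std::sync::Mutex` rather than `parking_lot`, takes `&str` instead of the generic parameters in the spec, and omits expectations entirely.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for MockContext: records calls only.
#[derive(Default)]
struct SketchMockContext {
    // method name -> argument lists, one entry per recorded call
    calls: Mutex<HashMap<String, Vec<Vec<String>>>>,
}

impl SketchMockContext {
    fn record_call(&self, method: &str, args: Vec<String>) {
        let mut calls = self.calls.lock().unwrap();
        calls.entry(method.to_string()).or_default().push(args);
    }

    fn call_count(&self, method: &str) -> usize {
        self.calls.lock().unwrap().get(method).map_or(0, |c| c.len())
    }

    fn verify_called(&self, method: &str) -> bool {
        self.call_count(method) > 0
    }

    /// True if at least one recorded call has exactly these arguments.
    fn verify_called_with(&self, method: &str, args: &[&str]) -> bool {
        let calls = self.calls.lock().unwrap();
        calls.get(method).map_or(false, |recorded| {
            recorded
                .iter()
                .any(|call| call.iter().map(String::as_str).eq(args.iter().copied()))
        })
    }
}
```

Keeping the state behind a mutex and exposing only `&self` methods lets a single context be shared by reference between the mock and the test body, which is the shape the verification API above implies.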
| Type | Purpose |
|---|---|
| TestData | Generic test data container |
| TestScenario | Multi-step test definition |
| TestStep | Single test step |
| TestEnv | Isolated test environment |
| Type | Purpose |
|---|---|
| TestServer | HTTP server for integration tests |
| TestDatabase | Temporary database |
| TestContext | Aggregated test resources |
| PortAllocator | Sequential port allocation |
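Sequential port allocation, as listed for PortAllocator, might look like the following sketch. The type name `SketchPortAllocator` and its methods are hypothetical; the spec does not define the real API, and this version makes no attempt to check whether a port is actually free.

```rust
use std::sync::atomic::{AtomicU16, Ordering};

/// Hypothetical sequential allocator shared between concurrent tests.
struct SketchPortAllocator {
    next: AtomicU16,
}

impl SketchPortAllocator {
    fn new(start: u16) -> Self {
        Self { next: AtomicU16::new(start) }
    }

    /// Hands out start, start + 1, and so on. Returns None instead of
    /// wrapping around once the counter reaches u16::MAX.
    fn allocate(&self) -> Option<u16> {
        self.next
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |port| port.checked_add(1))
            .ok()
    }
}
```

The atomic counter guarantees that two tests in the same process never receive the same port, which a random choice cannot promise.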
| Module | Purpose |
|---|---|
| mcp_qa | MCP testing framework |
| performance | Benchmarking utilities |
| fixtures | Test fixtures |
| markers | Custom pytest markers |
| Class | Purpose |
|---|---|
| CodeSmellDetector | Detect code smells |
| ArchitecturalValidator | Validate architecture |
| PatternDetector | Detect patterns |
name: TestingKit CI
on: [push, pull_request]
jobs:
  rust-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable
      - name: Install Nextest
        run: cargo install cargo-nextest --locked
      - name: Run Rust Tests
        run: cargo nextest run --profile ci
      - name: Code Quality
        run: cargo run -p phenotype-compliance-scanner
  python-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install Dependencies
        run: |
          pip install -e "python/pheno-testing"
          pip install -e "python/pheno-quality"
      - name: Run Python Tests
        run: pytest python/ --cov --cov-report=xml
      - name: Code Quality
        run: pytest python/ --quality
<?xml version="1.0" encoding="UTF-8"?>
<testsuites>
<testsuite name="phenotype-testing" tests="42" failures="0" errors="0" time="1.23">
<testcase name="test_timeout_success" time="0.01"/>
<testcase name="test_retry_async" time="0.05">
<system-out>Attempt 1 failed, retrying...</system-out>
</testcase>
<testcase name="test_mock_context" time="0.001"/>
</testsuite>
</testsuites>
# Generate coverage
cargo tarpaulin --out Xml --out Html
# Or with llvm-cov
cargo llvm-cov --html
# Generate coverage
pytest --cov=pheno_testing --cov-report=xml --cov-report=html
| Metric | Requirement | Measurement |
|---|---|---|
| Unit test execution | < 10ms/test | Mean across suite |
| Mock setup | < 1ms | From construction to first use |
| Fixture creation | < 5ms | Simple fixture |
| Test discovery | < 1s per 1000 tests | Cold start |
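The "mean across suite" measurement for unit test execution can be checked with a few lines of code. This is a sketch with hypothetical helper names (`mean_duration`, `within_budget`); how the per-test durations are collected is left open.

```rust
use std::time::Duration;

/// Mean duration across a suite; None for an empty suite.
fn mean_duration(durations: &[Duration]) -> Option<Duration> {
    if durations.is_empty() {
        return None;
    }
    let total: Duration = durations.iter().sum();
    Some(total / durations.len() as u32)
}

/// True when the suite mean is strictly below the budget.
/// An empty suite trivially satisfies the budget.
fn within_budget(durations: &[Duration], budget: Duration) -> bool {
    mean_duration(durations).map_or(true, |mean| mean < budget)
}
```

For example, durations of 4ms, 8ms, and 12ms have a mean of 8ms and pass a 10ms budget, even though one individual test exceeds it; a per-test limit would need a separate check.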
| Metric | Requirement | Measurement |
|---|---|---|
| File analysis | < 100ms per 1000 LOC | Single file |
| Project analysis | < 5s per 10K LOC | Entire project |
| Incremental analysis | < 1s | Changed files only |
| Resource | Limit | Notes |
|---|---|---|
| Memory per test | 100MB | Soft limit |
| Disk per test | 50MB | Temp files |
| Concurrent tests | CPU cores | Configurable |
| Test timeout | 60s | Default |
Process Isolation:
- Each test should run in isolation
- No shared mutable state between tests
- Mock contexts are thread-safe but not process-safe
File System Isolation:
- Use TempDir for all file operations
- Clean up on test completion
- Unique directories per test
Network Isolation:
- Use random ports for test servers
- Prefer loopback (127.0.0.1) only
- Mock external services
Sensitive Data:
- Never commit real credentials
- Use fixture generators for test data
- Sanitize logs and reports
Random Data:
- Generators are not cryptographically secure
- Not suitable for security-critical code
- Use proper crypto for production
Detection Rules:
- Flag hardcoded secrets
- Detect unsafe patterns
- Validate input sanitization
┌─────────────────────────────────────┐
│ E2E Tests (5%) │
│ Cross-language integration │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ Integration Tests (15%) │
│ Component interactions │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ Unit Tests (80%) │
│ Individual functions/types │
└─────────────────────────────────────┘
| Category | Percentage | Tools |
|---|---|---|
| Unit | 80% | Built-in + Nextest/pytest |
| Integration | 15% | TestServer, TestDatabase |
| E2E | 5% | Full stack scenarios |
| Property-based | 10% | proptest, Hypothesis |
| Mutation | Periodic | cargo-mutants, mutmut |
All Code Must Have:
- Unit tests for public APIs
- Integration tests for I/O operations
- Documentation tests (Rust)
- Property-based tests for algorithms
Quality Gates:
- 80% line coverage minimum
- No flaky tests in CI
- All code smells resolved or documented
- Mutation score > 50%
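The four quality gates can be evaluated mechanically from a run summary. The sketch below assumes a hypothetical `RunSummary` struct; the field names and the way each value is measured are illustrative and not defined by the spec.

```rust
/// Hypothetical summary of one CI run.
struct RunSummary {
    line_coverage: f64,       // percent, 0.0..=100.0
    flaky_tests: usize,       // tests observed to be flaky in CI
    unresolved_smells: usize, // smells neither resolved nor documented
    mutation_score: f64,      // percent, 0.0..=100.0
}

/// Returns the failed gates; an empty list means the run passes.
fn failed_gates(run: &RunSummary) -> Vec<&'static str> {
    let mut failed = Vec::new();
    if run.line_coverage < 80.0 {
        failed.push("line coverage below 80%");
    }
    if run.flaky_tests > 0 {
        failed.push("flaky tests present");
    }
    if run.unresolved_smells > 0 {
        failed.push("unresolved code smells");
    }
    if run.mutation_score <= 50.0 {
        failed.push("mutation score not above 50%");
    }
    failed
}
```

Returning every failed gate, rather than stopping at the first, gives contributors the full list to fix in one pass.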
Rust (crates.io):
phenotype-testing@1.0.0
phenotype-mock@1.0.0
phenotype-test-fixtures@1.0.0
phenotype-test-infra@1.0.0
Python (PyPI):
pheno-testing==1.0.0
pheno-quality==1.0.0
Go (proxy):
github.com/phenotype/testing
Semantic Versioning:
- MAJOR: Breaking API changes
- MINOR: New features, backward compatible
- PATCH: Bug fixes
Pre-release:
- `-alpha.X` - Early testing
- `-beta.X` - Feature complete
- `-rc.X` - Release candidate
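The ordering implied by this scheme (alpha before beta before release candidate before the final release) can be modelled with derived comparisons. The `Stage` and `Version` types are hypothetical and cover only the three pre-release labels listed above, not the full Semantic Versioning grammar.

```rust
/// Release stage, ordered from least to most mature.
/// derive(Ord) compares variants in declaration order, then the number.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Stage {
    Alpha(u32),
    Beta(u32),
    Rc(u32),
    Release,
}

/// Fields are compared in declaration order: major, minor, patch, stage.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
    stage: Stage,
}
```

With these definitions `1.0.0-alpha.2 < 1.0.0-beta.1 < 1.0.0-rc.1 < 1.0.0`, and any `1.0.1` pre-release still sorts after the final `1.0.0`.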
| Metric | Type | Alert Threshold |
|---|---|---|
| Test duration | Histogram | P99 > 10s |
| Test flakiness | Gauge | > 1% |
| Coverage | Gauge | < 80% |
| Mock usage | Counter | N/A |
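The flakiness gauge needs a definition before it can be alerted on. The sketch below assumes one possible definition: a test is flaky if it both passed and failed across reruns of the same code. The function name `flakiness_rate` and the input shape are hypothetical.

```rust
use std::collections::HashMap;

/// Fraction of tests that both passed and failed across reruns.
/// `runs` holds (test name, passed) pairs; an empty input yields 0.0.
fn flakiness_rate(runs: &[(&str, bool)]) -> f64 {
    // test name -> (seen a pass, seen a failure)
    let mut seen: HashMap<&str, (bool, bool)> = HashMap::new();
    for &(name, passed) in runs {
        let entry = seen.entry(name).or_insert((false, false));
        if passed {
            entry.0 = true;
        } else {
            entry.1 = true;
        }
    }
    if seen.is_empty() {
        return 0.0;
    }
    let flaky = seen.values().filter(|(p, f)| *p && *f).count();
    flaky as f64 / seen.len() as f64
}
```

Comparing the result against `0.01` gives the 1% alert threshold from the table; a test that fails consistently is broken rather than flaky and does not count toward the rate.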
Test Execution Trace:
use tracing::{info_span, instrument};
#[instrument]
#[test]
fn test_with_tracing() {
let span = info_span!("test_execution", test_name = "test_with_tracing");
let _enter = span.enter();
// Test code with automatic tracing
}
| Feature | Priority | Target |
|---|---|---|
| Visual test diff | Medium | 1.1.0 |
| Snapshot testing | High | 1.1.0 |
| Fuzzing integration | Medium | 1.2.0 |
| WebAssembly testing | Low | 1.3.0 |
| AI test generation | Research | TBD |
- AI-Assisted Test Generation
  - LLM-based test case generation
  - Automated test repair
  - Smart test selection
- Distributed Testing
  - Remote test execution
  - Test result aggregation
  - Cluster-based parallelization
| Term | Definition |
|---|---|
| Fixture | Reusable test setup/teardown |
| Mock | Test double with verification |
| Stub | Test double with canned responses |
| Code Smell | Indicator of deeper problems |
| Property-based | Testing via generated inputs |
| Mutation Testing | Testing by mutating code |
| Flaky Test | Non-deterministic test |
From mockall (Rust):
// mockall
#[automock]
trait Database { }
// phenotype-mock
mock_trait!(MockDatabase for Database { });
From unittest.mock (Python):
# unittest.mock
mock = Mock()
mock.method.return_value = 42
# pheno-testing
# Use fixture-based approach with type safety
Issue: Tests time out unexpectedly
- Check for blocking operations in async tests
- Verify Tokio runtime configuration
- Use the `timeout()` wrapper
Issue: Mock verification fails
- Ensure `verify_all()` is called
- Check argument matching (exact vs. partial)
- Verify method names match exactly
Issue: Quality checks too slow
- Enable incremental analysis
- Exclude generated code
- Tune rule thresholds
See CONTRIBUTING.md for:
- Development setup
- Code style guidelines
- Test requirements
- PR process
End of Specification Document
Pattern 1: Verify Method Called
#[test]
fn test_service_calls_repository() {
let ctx = MockContext::new();
let mock_repo = MockRepository::with_context(&ctx);
let service = UserService::new(mock_repo);
service.get_user(1);
assert!(ctx.verify_called("get_user"));
assert!(ctx.verify_called_with("get_user", &["1"]));
}
Pattern 2: Stub Return Values
#[test]
fn test_service_uses_repository_result() {
let ctx = MockContext::new();
ctx.expect("get_user")
.with_args(vec!["1"])
.returns(r#"{"id":1,"name":"Alice"}"#)
.build();
let mock_repo = MockRepository::with_context(&ctx);
let service = UserService::new(mock_repo);
let user = service.get_user(1);
assert_eq!(user.name, "Alice");
}
Pattern 3: Mock Sequence
#[test]
fn test_service_calls_in_order() {
let ctx = MockContext::new();
// First call
ctx.expect("begin_transaction").times(1).build();
// Second call
ctx.expect("save").times(1).build();
// Third call
ctx.expect("commit").times(1).build();
let mock_repo = MockRepository::with_context(&ctx);
let service = UserService::new(mock_repo);
service.create_user("Alice");
ctx.verify_all().expect("All expectations met");
}
Pattern 1: Basic Async Test
#[tokio::test]
async fn test_async_operation() {
let result = async_operation().await;
assert!(result.is_ok());
}
Pattern 2: Async with Timeout
#[tokio::test]
async fn test_async_with_timeout() {
let result = timeout(
async_operation(),
Duration::from_secs(5)
).await;
assert!(result.is_ok());
}
Pattern 3: Concurrent Operations
#[tokio::test]
async fn test_concurrent_operations() {
let handles: Vec<_> = (0..10)
.map(|i| tokio::spawn(async move {
operation(i).await
}))
.collect();
let results = futures::future::join_all(handles).await;
for result in results {
assert!(result.is_ok());
}
}
Pattern 1: Database Setup
struct DatabaseFixture {
db: TestDatabase,
connection: Connection,
}
impl DatabaseFixture {
async fn new() -> Self {
let db = TestDatabase::new().unwrap();
let connection = create_connection(&db.connection_string).await;
run_migrations(&connection).await;
Self { db, connection }
}
async fn seed_data(&self) {
// Insert test data
}
}
#[tokio::test]
async fn test_with_database() {
let fixture = DatabaseFixture::new().await;
fixture.seed_data().await;
// Run tests
}
Pattern 2: HTTP Server Fixture
struct ServerFixture {
server: TestServer,
client: TestClient,
}
impl ServerFixture {
async fn new() -> Self {
let server = TestServer::new().await.unwrap();
let client = TestClient::new(&server.base_url);
Self { server, client }
}
}
#[tokio::test]
async fn test_api_endpoint() {
let fixture = ServerFixture::new().await;
let response = fixture.client.get("/api/users").await;
assert_eq!(response.status, 200);
}
use proptest::prelude::*;
proptest! {
#[test]
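fn test_sort_reverses_reverse(input in prop::collection::vec(1..100i32, 0..100)) {
// Sorting is independent of input order: reversing a sorted
// vector and sorting it again must give the sorted vector back.
let mut sorted = input.clone();
sorted.sort();
let mut resorted = sorted.clone();
resorted.reverse();
resorted.sort();
// Compare against the sorted input, not the raw (unsorted) input.
prop_assert_eq!(sorted, resorted);
}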
#[test]
fn test_merge_preserves_elements(
left in prop::collection::vec(1..100i32, 0..50),
right in prop::collection::vec(1..100i32, 0..50)
) {
let mut merged = left.clone();
merged.extend(right.clone());
merged.sort();
let total_len = left.len() + right.len();
prop_assert_eq!(merged.len(), total_len);
}
}
#![cfg(fuzzing)]
use libfuzzer_sys::fuzz_target;
fuzz_target!(|data: &[u8]| {
if let Ok(s) = std::str::from_utf8(data) {
// Test parser with random input
let _ = parser::parse(s);
}
});
#[tokio::test]
async fn test_under_load() {
let metrics = Arc::new(Mutex::new(LoadMetrics::default()));
let start = Instant::now();
let handles: Vec<_> = (0..100)
.map(|_| {
let metrics = metrics.clone();
tokio::spawn(async move {
let req_start = Instant::now();
let result = make_request().await;
let duration = req_start.elapsed();
metrics.lock().unwrap().record(duration, result.is_ok());
})
})
.collect();
futures::future::join_all(handles).await;
let total_duration = start.elapsed();
let final_metrics = metrics.lock().unwrap();
println!("Total time: {:?}", total_duration);
println!("Success rate: {:.2}%", final_metrics.success_rate() * 100.0);
println!("Avg latency: {:?}", final_metrics.avg_latency());
}
name: TestingKit CI
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
env:
  CARGO_TERM_COLOR: always
  RUST_BACKTRACE: 1
jobs:
  rust-tests:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        rust: [stable, nightly]
    steps:
      - uses: actions/checkout@v4
      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: ${{ matrix.rust }}
      - name: Install cargo-nextest
        uses: taiki-e/install-action@nextest
      - name: Cache cargo registry
        uses: actions/cache@v3
        with:
          path: ~/.cargo/registry
          key: ${{ runner.os }}-cargo-registry-${{ hashFiles('**/Cargo.lock') }}
      - name: Cache cargo index
        uses: actions/cache@v3
        with:
          path: ~/.cargo/git
          key: ${{ runner.os }}-cargo-index-${{ hashFiles('**/Cargo.lock') }}
      - name: Cache cargo build
        uses: actions/cache@v3
        with:
          path: target
          key: ${{ runner.os }}-cargo-build-target-${{ hashFiles('**/Cargo.lock') }}
      - name: Check formatting
        run: cargo fmt -- --check
      - name: Run clippy
        run: cargo clippy -- -D warnings
      - name: Build
        run: cargo build --verbose
      - name: Run tests with nextest
        run: cargo nextest run --profile ci
      - name: Generate coverage
        run: |
          cargo install cargo-tarpaulin
          cargo tarpaulin --out Xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
          files: ./cobertura.xml
          fail_ci_if_error: true
  python-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.10', '3.11', '3.12']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          pip install -e "python/pheno-testing"
          pip install -e "python/pheno-quality"
          pip install pytest pytest-cov pytest-asyncio hypothesis
      - name: Run tests with coverage
        run: |
          pytest python/ --cov=pheno_testing --cov=pheno_quality \
            --cov-report=xml --cov-report=html
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage.xml
  quality-gates:
    runs-on: ubuntu-latest
    needs: [rust-tests, python-tests]
    steps:
      - uses: actions/checkout@v4
      - name: Check code quality
        run: |
          cd python/pheno-quality
          python -m pheno_quality.tools.code_smell_detector --fail-on-error
stages:
  - test
  - coverage
  - quality
variables:
  CARGO_HOME: $CI_PROJECT_DIR/.cargo
  RUST_BACKTRACE: 1
cache:
  paths:
    - .cargo/
    - target/
rust:test:
  stage: test
  image: rust:latest
  script:
    # The JUnit report below is written by nextest, not by plain cargo test
    - cargo install cargo-nextest --locked
    - cargo nextest run --profile ci
  artifacts:
    reports:
      junit: target/nextest/ci/junit.xml
python:test:
  stage: test
  image: python:3.12
  script:
    - pip install -e "python/pheno-testing"
    - pytest --junitxml=report.xml
  artifacts:
    reports:
      junit: report.xml
coverage:
  stage: coverage
  image: rust:latest
  script:
    - cargo install cargo-tarpaulin
    - cargo tarpaulin --out Xml
  coverage: '/\d+\.?\d*%/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: cobertura.xml
Issue 1: Flaky Tests
- Cause: Timeouts, race conditions, external dependencies
- Solution: Use TestServer, MockContext, deterministic timeouts
Issue 2: Slow Tests
- Cause: Database setup, network calls, file I/O
- Solution: Mock external deps, use in-memory databases, parallel execution
Issue 3: Test Isolation Failures
- Cause: Global state, shared resources
- Solution: Use TestEnv, clean up in teardown, process-per-test
Issue 4: Async Test Failures
- Cause: Runtime not initialized, timing issues
- Solution: Use #[tokio::test], proper timeout handling
- Use RUST_BACKTRACE=1 for Rust test failures
- Use pytest -v --tb=long for detailed Python tracebacks
- Run single test to isolate issues
- Add logging for complex scenarios
- Use test profiling to find slow tests
From mockall (Rust):
// Before: mockall
#[automock]
trait Database { }
// After: phenotype-mock
mock_trait!(MockDatabase for Database { });
From unittest.mock (Python):
# Before
from unittest.mock import Mock
mock = Mock()
mock.method.return_value = 42
# After: Use fixture-based patterns with pheno-testing
- Start with new tests using TestingKit
- Migrate critical tests first
- Maintain legacy tests during transition
- Use adapter patterns for integration
| Term | Definition |
|---|---|
| Fixture | Reusable test setup/teardown |
| Mock | Test double with verification |
| Stub | Test double with canned responses |
| Spy | Test double that records calls |
| Fake | Working simplified implementation |
| Code Smell | Indicator of deeper problems |
| Property-based | Testing via generated inputs |
| Mutation Testing | Testing by mutating code |
| Flaky Test | Non-deterministic test |
| Coverage | Percentage of code exercised by tests |
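The difference between a stub, a fake, and a spy is easier to see side by side. The sketch below uses a hypothetical `UserStore` trait that is not part of TestingKit; each double implements it in the way the glossary describes.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical dependency used only for illustration.
trait UserStore {
    fn name_of(&self, id: u32) -> Option<String>;
}

/// Stub: returns a canned response regardless of input.
struct StubStore;

impl UserStore for StubStore {
    fn name_of(&self, _id: u32) -> Option<String> {
        Some("Alice".to_string())
    }
}

/// Fake: a working but simplified in-memory implementation.
struct FakeStore {
    users: HashMap<u32, String>,
}

impl UserStore for FakeStore {
    fn name_of(&self, id: u32) -> Option<String> {
        self.users.get(&id).cloned()
    }
}

/// Spy: wraps another store and records the ids it was asked for.
struct SpyStore<S: UserStore> {
    inner: S,
    seen: RefCell<Vec<u32>>,
}

impl<S: UserStore> UserStore for SpyStore<S> {
    fn name_of(&self, id: u32) -> Option<String> {
        self.seen.borrow_mut().push(id);
        self.inner.name_of(id)
    }
}
```

A mock, in the glossary's sense, would add verification on top of the spy: the test declares which calls it expects and fails if the recorded calls differ.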
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-04-05 | Initial release |

Scenario 1: Mocking Async Functions

use phenotype_mock::{MockContext, mock_async_trait};
use async_trait::async_trait;
// `Row`, `Error`, and `Value` are assumed to be in scope from the application's data layer.

#[async_trait]
trait AsyncDatabase {
    async fn query(&self, sql: &str) -> Result<Vec<Row>, Error>;
    async fn execute(&self, sql: &str) -> Result<u64, Error>;
}

mock_async_trait!(MockAsyncDatabase for AsyncDatabase {
    async fn query(&self, sql: &str) -> Result<Vec<Row>, Error>;
    async fn execute(&self, sql: &str) -> Result<u64, Error>;
});

#[tokio::test]
async fn test_async_mock() {
    let ctx = MockContext::new();
    // Expect one query and return a canned JSON payload.
    ctx.expect("query")
        .with_args(vec!["SELECT * FROM users"])
        .returns(r#"[{"id":1,"name":"Alice"}]"#)
        .build();

    let db = MockAsyncDatabase::with_context(&ctx);
    let rows = db.query("SELECT * FROM users").await.unwrap();

    assert_eq!(rows.len(), 1);
    assert_eq!(rows[0].get("name"), Some(&Value::String("Alice".to_string())));
}

Scenario 2: Mock with Side Effects

use std::sync::Arc;
use std::sync::atomic::{AtomicU32, Ordering};

#[test]
fn test_mock_with_side_effects() {
    let ctx = MockContext::new();
    let call_count = Arc::new(AtomicU32::new(0));

    ctx.expect("process")
        .times(3)
        .with_side_effect({
            // Clone the counter so the closure owns its own handle.
            let count = call_count.clone();
            move || {
                count.fetch_add(1, Ordering::SeqCst);
            }
        })
        .build();

    // Execute multiple times
    for _ in 0..3 {
        ctx.trigger("process");
    }

    assert_eq!(call_count.load(Ordering::SeqCst), 3);
}

Scenario 3: Conditional Mock Responses

#[test]
fn test_conditional_responses() {
    let ctx = MockContext::new();

    // Return different values based on input
    ctx.expect("get")
        .with_args(vec!["user:1"])
        .returns(r#"{"id":1,"name":"Alice"}"#)
        .build();
    ctx.expect("get")
        .with_args(vec!["user:2"])
        .returns(r#"{"id":2,"name":"Bob"}"#)
        .build();
    ctx.expect("get")
        .with_args(vec!["user:999"])
        .returns("") // Simulate not found
        .build();

    // Verify different responses
    assert!(ctx.get_return_value("get", &["user:1".to_string()]).is_some());
    assert!(ctx.get_return_value("get", &["user:2".to_string()]).is_some());
    assert!(ctx.get_return_value("get", &["user:999".to_string()]).is_some());
    assert!(ctx.get_return_value("get", &["user:unknown".to_string()]).is_none());
}

Scenario 1: Multi-Service Test Environment

struct MicroserviceTestEnv {
    api_server: TestServer,
    database: TestDatabase,
    cache: TestCache,
    message_queue: TestMessageQueue,
}

impl MicroserviceTestEnv {
    async fn new() -> Result<Self, Error> {
        Ok(Self {
            api_server: TestServer::new().await?,
            database: TestDatabase::new()?,
            cache: TestCache::new(),
            message_queue: TestMessageQueue::new().await?,
        })
    }

    async fn setup_integration(&self) -> Result<(), Error> {
        // Configure services to communicate
        self.api_server.configure_db(&self.database).await;
        self.api_server.configure_cache(&self.cache).await;
        self.api_server.configure_mq(&self.message_queue).await;
        Ok(())
    }
}

#[tokio::test]
async fn test_end_to_end_flow() {
    let env = MicroserviceTestEnv::new().await.unwrap();
    env.setup_integration().await.unwrap();

    // Seed test data
    env.database.seed("tests/fixtures/integration_data.sql").await;

    // Illustrative request body; the field names are placeholders, not a defined schema.
    let order_request = serde_json::json!({ "item": "example", "quantity": 1 });

    // Execute API call
    let response = env.api_server
        .client()
        .post("/api/orders")
        .json(&order_request)
        .send()
        .await;

    // Verify response
    assert_eq!(response.status(), 201);

    // Verify database state
    let orders = env.database.query("SELECT * FROM orders").await;
    assert_eq!(orders.len(), 1);

    // Verify cache
    let cached = env.cache.get("order:latest").await;
    assert!(cached.is_some());

    // Verify message published
    let messages = env.message_queue.consume("orders.created").await;
    assert_eq!(messages.len(), 1);
}

Scenario 2: Performance Test Environment

use std::time::{Duration, Instant};

struct PerformanceTestEnv {
    server: TestServer,
    load_generator: LoadGenerator,
    metrics_collector: MetricsCollector,
}

// `PerformanceTestEnv::new()` is assumed to be defined alongside `run_load_test`.
impl PerformanceTestEnv {
    async fn run_load_test(&self, config: LoadTestConfig) -> PerformanceReport {
        let start = Instant::now();

        // Generate load
        self.load_generator.run(config).await;

        // Collect metrics
        let metrics = self.metrics_collector.collect().await;

        PerformanceReport {
            duration: start.elapsed(),
            total_requests: metrics.total_requests,
            successful_requests: metrics.successful,
            failed_requests: metrics.failed,
            avg_latency: metrics.avg_latency(),
            p95_latency: metrics.p95_latency(),
            p99_latency: metrics.p99_latency(),
            throughput: metrics.throughput(),
        }
    }
}

#[tokio::test]
async fn test_api_performance() {
    let env = PerformanceTestEnv::new().await.unwrap();
    let config = LoadTestConfig {
        concurrent_users: 100,
        duration: Duration::from_secs(60),
        ramp_up: Duration::from_secs(10),
    };

    let report = env.run_load_test(config).await;

    // Assert performance criteria
    assert!(report.avg_latency < Duration::from_millis(100));
    assert!(report.p95_latency < Duration::from_millis(200));
    assert!(report.p99_latency < Duration::from_millis(500));
    assert_eq!(report.failed_requests, 0);
}

Pattern 1: Parametrized Fixtures

use phenotype_test_fixtures::ParameterizedFixture;

struct UserFixture {
    user_type: UserType,
    permissions: Vec<Permission>,
}

#[fixture(param = "admin")]
fn admin_user() -> UserFixture {
    UserFixture {
        user_type: UserType::Admin,
        permissions: vec![Permission::All],
    }
}

#[fixture(param = "regular")]
fn regular_user() -> UserFixture {
    UserFixture {
        user_type: UserType::User,
        permissions: vec![Permission::Read, Permission::Write],
    }
}

#[fixture(param = "guest")]
fn guest_user() -> UserFixture {
    UserFixture {
        user_type: UserType::Guest,
        permissions: vec![Permission::Read],
    }
}

#[test]
#[parametrized_fixture(user_fixture, ["admin", "regular", "guest"])]
fn test_user_permissions(user: UserFixture) {
    match user.user_type {
        UserType::Admin => assert!(user.permissions.contains(&Permission::All)),
        UserType::User => {
            assert!(user.permissions.contains(&Permission::Read));
            assert!(user.permissions.contains(&Permission::Write));
        }
        UserType::Guest => {
            assert!(user.permissions.contains(&Permission::Read));
            assert!(!user.permissions.contains(&Permission::Write));
        }
    }
}

Pattern 2: Hierarchical Fixtures

// Base fixture
#[fixture]
fn database() -> TestDatabase {
    TestDatabase::new().unwrap()
}

// Derived fixture
#[fixture]
fn populated_database(database: TestDatabase) -> TestDatabase {
    database.seed(include_str!("fixtures/users.sql"));
    database.seed(include_str!("fixtures/orders.sql"));
    database
}

// Further derived
#[fixture]
fn database_with_orders(populated_database: TestDatabase) -> TestDatabase {
    populated_database.execute("INSERT INTO orders ...");
    populated_database
}

#[test]
fn test_with_populated_db(database_with_orders: TestDatabase) {
    let orders = database_with_orders.query("SELECT * FROM orders");
    assert!(!orders.is_empty());
}

Pattern 3: Scoped Fixtures

use phenotype_testing::fixtures::{fixture, Scope};

// Function-scoped: fresh for each test
#[fixture(scope = Scope::Function)]
fn temp_file() -> TempFile {
    TempFile::new()
}

// Module-scoped: shared within module
#[fixture(scope = Scope::Module)]
fn module_cache() -> Cache {
    Cache::new()
}

// Session-scoped: shared across all tests
#[fixture(scope = Scope::Session)]
fn test_config() -> Config {
    Config::load("tests/config.yaml")
}

Factory with Builders

use phenotype_test_fixtures::{Factory, Builder};

// `new()` and the field setters are assumed to be generated by `#[derive(Builder)]`.
#[derive(Builder)]
struct UserBuilder {
    id: u64,
    name: String,
    email: String,
    role: Role,
    created_at: DateTime<Utc>,
}

impl Factory for UserBuilder {
    type Product = User;

    fn default() -> Self {
        // Generate the ID once so the default email matches the user's ID.
        let id = generate_id();
        Self {
            id,
            name: "Test User".to_string(),
            email: format!("user{}@example.com", id),
            role: Role::User,
            created_at: Utc::now(),
        }
    }

    fn build(self) -> User {
        User {
            id: self.id,
            name: self.name,
            email: self.email,
            role: self.role,
            created_at: self.created_at,
        }
    }
}

#[test]
fn test_user_factory() {
    // Default user
    let user1 = UserBuilder::new().build();
    assert_eq!(user1.role, Role::User);

    // Custom user
    let admin = UserBuilder::new()
        .name("Admin User")
        .email("admin@example.com")
        .role(Role::Admin)
        .build();
    assert_eq!(admin.role, Role::Admin);

    // Multiple users
    let users: Vec<_> = (0..100)
        .map(|i| UserBuilder::new().name(format!("User {}", i)).build())
        .collect();
    assert_eq!(users.len(), 100);
}

Custom Assertion Macros

#[macro_export]
macro_rules! assert_within_range {
    ($value:expr, $min:expr, $max:expr) => {{
        // Borrow each argument once so expressions are not evaluated repeatedly.
        let (value, min, max) = (&$value, &$min, &$max);
        assert!(
            value >= min && value <= max,
            "Expected {} to be within range [{}, {}], but got {}",
            stringify!($value),
            min,
            max,
            value
        );
    }};
}

#[macro_export]
macro_rules! assert_matches {
    ($result:expr, $pattern:pat) => {
        match $result {
            $pattern => (),
            // Bind by reference so the value is evaluated once and not moved.
            ref actual => panic!(
                "Expected {} to match {}, but got {:?}",
                stringify!($result),
                stringify!($pattern),
                actual
            ),
        }
    };
}

#[test]
fn test_custom_assertions() {
    let score = 85;
    assert_within_range!(score, 0, 100);

    let result = process_data();
    assert_matches!(result, Ok(Data { status: Status::Active, .. }));
}

Rule 1: God Object Detection

# Detector implementation
from typing import List


class GodObjectDetector(CodeSmellDetector):
    MAX_METHODS = 20
    MAX_FIELDS = 15
    MAX_DEPENDENCIES = 10

    def analyze(self, cls: ClassDef) -> List[CodeSmell]:
        smells = []
        method_count = len(cls.methods)
        field_count = len(cls.fields)
        dependency_count = len(self.get_dependencies(cls))

        if method_count > self.MAX_METHODS:
            smells.append(CodeSmell(
                type=SmellType.GOD_OBJECT,
                message=f"Class has {method_count} methods (max {self.MAX_METHODS})",
                severity=Severity.WARNING,
                location=cls.location
            ))

        if field_count > self.MAX_FIELDS:
            smells.append(CodeSmell(
                type=SmellType.GOD_OBJECT,
                message=f"Class has {field_count} fields (max {self.MAX_FIELDS})",
                severity=Severity.WARNING,
                location=cls.location
            ))

        # The dependency limit is declared above, so it is checked here as well.
        if dependency_count > self.MAX_DEPENDENCIES:
            smells.append(CodeSmell(
                type=SmellType.GOD_OBJECT,
                message=f"Class has {dependency_count} dependencies (max {self.MAX_DEPENDENCIES})",
                severity=Severity.WARNING,
                location=cls.location
            ))

        return smells

Rule 2: Feature Envy Detection

from collections import defaultdict
from typing import List


class FeatureEnvyDetector(CodeSmellDetector):
    THRESHOLD = 0.7  # 70% of method calls on other class

    def analyze(self, method: MethodDef) -> List[CodeSmell]:
        smells = []

        # Count method calls on different classes
        external_calls = defaultdict(int)
        total_calls = 0
        for call in method.method_calls:
            if call.receiver != "self" and call.receiver != method.class_name:
                external_calls[call.receiver_type] += 1
            # Count every call, internal or external, so the ratio is meaningful.
            total_calls += 1

        if total_calls > 0:
            for class_name, count in external_calls.items():
                ratio = count / total_calls
                if ratio > self.THRESHOLD:
                    smells.append(CodeSmell(
                        type=SmellType.FEATURE_ENVY,
                        message=f"Method makes {ratio:.0%} calls on {class_name}",
                        severity=Severity.INFO,
                        location=method.location,
                        suggestion=f"Consider moving method to {class_name}"
                    ))

        return smells

Rule 3: Duplicate Code Detection

from typing import List


class DuplicateCodeDetector(CodeSmellDetector):
    SIMILARITY_THRESHOLD = 0.8
    MIN_LINES = 5  # Presumably applied by extract_code_blocks to skip short blocks

    def analyze_module(self, module: Module) -> List[CodeSmell]:
        smells = []
        code_blocks = self.extract_code_blocks(module)

        # Compare each unordered pair of blocks exactly once.
        for i, block1 in enumerate(code_blocks):
            for block2 in code_blocks[i+1:]:
                similarity = self.calculate_similarity(block1, block2)
                if similarity > self.SIMILARITY_THRESHOLD:
                    smells.append(CodeSmell(
                        type=SmellType.DUPLICATE_CODE,
                        message=f"{similarity:.0%} similar code blocks detected",
                        severity=Severity.WARNING,
                        locations=[block1.location, block2.location],
                        suggestion="Consider extracting common logic to a function"
                    ))

        return smells

    def calculate_similarity(self, block1: CodeBlock, block2: CodeBlock) -> float:
        # Use AST-based or token-based similarity
        tokens1 = self.tokenize(block1)
        tokens2 = self.tokenize(block2)
        # jaccard_similarity is a helper not shown here: |A ∩ B| / |A ∪ B| over the token sets.
        return jaccard_similarity(tokens1, tokens2)

| Metric | Formula | Target |
|---|---|---|
| Line Coverage | Lines executed / Total lines | > 80% |
| Branch Coverage | Branches taken / Total branches | > 75% |
| Function Coverage | Functions called / Total functions | > 90% |
| Statement Coverage | Statements executed / Total statements | > 80% |
| Condition Coverage | Boolean conditions evaluated / Total conditions | > 70% |
| MC/DC | Conditions shown to independently affect the decision outcome / Total conditions | > 90% (safety-critical) |
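
Each ratio in the table above reduces to a covered count divided by a total. The sketch below shows one way a CI gate might compare such ratios against the targets; `CoverageCounts`, `failing_metrics`, and the sample counts are hypothetical and are not TestingKit APIs or measured results.

```python
from dataclasses import dataclass

# Targets taken from the table above, as fractions; a metric must exceed its target.
TARGETS = {"line": 0.80, "branch": 0.75, "function": 0.90}


@dataclass
class CoverageCounts:
    covered: int
    total: int

    @property
    def ratio(self) -> float:
        # Treat an empty denominator as fully covered rather than dividing by zero.
        return 1.0 if self.total == 0 else self.covered / self.total


def failing_metrics(counts):
    """Return the names of metrics whose ratio does not exceed the target."""
    return sorted(
        name for name, target in TARGETS.items()
        if counts[name].ratio <= target
    )


# Made-up counts, purely for illustration.
report = {
    "line": CoverageCounts(covered=850, total=1000),    # 85%
    "branch": CoverageCounts(covered=140, total=200),   # 70%
    "function": CoverageCounts(covered=95, total=100),  # 95%
}
print(failing_metrics(report))  # ['branch']
```
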
| Metric | Description | Target |
|---|---|---|
| Test Execution Time | Time to run full suite | < 5 minutes |
| Test Maintainability | Ease of test updates | < 10 min per change |
| Flaky Test Rate | % of non-deterministic tests | < 1% |
| Test Documentation | % of tests with docstrings | > 80% |
| Assertion Density | Assertions per test | 3-5 |
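
The flaky test rate can be estimated from repeated runs on unchanged code: a test counts as flaky when its outcomes differ between runs. This is a minimal sketch under that assumption; `flaky_rate` and the sample history are hypothetical, not a TestingKit API or real data.

```python
def flaky_rate(history):
    """history maps test name -> list of pass/fail outcomes on unchanged code."""
    if not history:
        return 0.0
    # A test is flaky when it has both passing and failing runs.
    flaky = [name for name, runs in history.items() if len(set(runs)) > 1]
    return len(flaky) / len(history)


# Made-up run history, purely for illustration.
runs = {
    "test_login": [True, True, True],
    "test_upload": [True, False, True],    # mixed outcomes: flaky
    "test_export": [False, False, False],  # consistently failing, not flaky
    "test_search": [True, True, True],
}
rate = flaky_rate(runs)
print(f"{rate:.0%}")  # 25%

# The target in the table is < 1%, so this sample suite would be flagged.
exceeds_target = rate >= 0.01
```

Note that a consistently failing test is broken rather than flaky, so it does not count toward the rate.
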
TestingKit follows Semantic Versioning (SemVer):
- MAJOR: Breaking API changes
- MINOR: New features, backward compatible
- PATCH: Bug fixes, backward compatible
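
The three bump rules can be sketched as a small function. This is an illustrative helper for plain `MAJOR.MINOR.PATCH` strings only (no pre-release or build metadata); `bump` is a hypothetical name, not part of TestingKit's release tooling.

```python
def bump(version: str, change: str) -> str:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if change == "minor":
        # Backward-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(bump("1.0.0", "patch"))  # 1.0.1
print(bump("1.0.1", "minor"))  # 1.1.0
print(bump("1.1.0", "major"))  # 2.0.0
```
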
- All tests passing
- Documentation updated
- CHANGELOG.md updated
- Version bumped in Cargo.toml and the Python package metadata
- Git tag created
- Published to crates.io and PyPI
- GitHub release notes published
| Target | Platform | Priority |
|---|---|---|
| crates.io | Rust | Primary |
| PyPI | Python | Primary |
| GitHub Releases | All | Secondary |
We welcome contributions! Please see our Contributing Guide for details on:
- Code of Conduct
- Development setup
- Pull request process
- Coding standards
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: Questions and ideas
- Discord: Real-time community chat
Thanks to all contributors who have helped make TestingKit better!