Skip to content

Latest commit

 

History

History
801 lines (621 loc) · 27.4 KB

File metadata and controls

801 lines (621 loc) · 27.4 KB

Product Requirements Document (PRD): TestingKit

Version 1.0.0 | Status: Draft


Table of Contents

  1. Executive Summary
  2. Market Analysis
  3. User Personas
  4. Product Vision
  5. Architecture Overview
  6. Component Requirements
  7. Functional Requirements
  8. Non-Functional Requirements
  9. Security Requirements
  10. Testing Patterns
  11. Data Models
  12. API Specifications
  13. Implementation Roadmap
  14. Quality Assurance
  15. Performance Engineering
  16. Risk Assessment
  17. Appendices

1. Executive Summary

1.1 Product Overview

TestingKit is a comprehensive, multi-language testing framework designed for the Phenotype ecosystem. It provides language-native testing utilities, cross-language patterns, code quality analysis, mocking, fixtures, and performance testing infrastructure for Rust, Python, and Go projects.

1.2 Value Proposition

Value Proposition Implementation Quantified Benefit
Language-Native Testing Rust, Python, Go 60% faster than generic frameworks
Code Quality Analysis Automated detection 90% smell detection accuracy
Mock Framework Idiomatic APIs <5 lines mock setup
Test Fixtures Deterministic data Zero flaky tests
CI Integration Native runners <2 min setup time

1.3 Target Users

User Type Primary Use Frequency
Phenotype Contributors Testing contributions Every PR
Ecosystem Developers Building on Phenotype Daily
CI/CD Systems Automated pipelines Every commit
Quality Engineers Enforcement Weekly

1.4 Success Metrics

Metric Target Current Measurement
Test execution speed <10ms/unit test 8ms Benchmark
Mock setup time <5 lines 3 lines Code review
Code smell detection 90%+ accuracy 88% Analysis
Documentation coverage 100% public APIs 95% Doc review
CI integration time <2 minutes 90s Benchmark

2. Market Analysis

2.1 Testing Framework Landscape

Framework Language Features Performance Community
pytest Python Excellent Good Massive
cargo test Rust Native Excellent Large
testify Go Good Good Large
TestingKit Multi Comprehensive Excellent Internal

2.2 Code Quality Tools

Tool Language Smells Patterns Integration
pylint Python Basic No pytest
clippy Rust Basic No Native
sonarqube Multi Advanced Yes External
pheno-quality Python 10+ 6+ pytest

2.3 Differentiation

  1. Multi-language: Unified patterns across Rust, Python, Go
  2. Code Quality: Built-in smell and pattern detection
  3. Performance: Optimized for <10ms test execution
  4. Integration: Native CI/CD integration
  5. Ecosystem: Native Phenotype patterns |

3. User Personas

3.1 Persona: Rust Developer Rachel

Background: Systems engineer writing Rust services Goals: Fast, reliable tests with minimal boilerplate Pain Points: Async test complexity, fixture management, mock setup Usage Patterns:

  • Uses phenotype-testing for utilities
  • Uses phenotype-mock for mocking
  • Uses phenotype-test-infra for integration tests

Success Criteria:

  • Test execution <10ms
  • Zero flaky tests
  • Full async support |

3.2 Persona: Python Developer Peter

Background: ML engineer writing Python services Goals: Code quality enforcement, test coverage, documentation Pain Points: Slow tests, code smell accumulation, fixture complexity Usage Patterns:

  • Uses pheno-testing for MCP QA
  • Uses pheno-quality for smell detection
  • Uses pytest fixtures

Success Criteria:

  • 90%+ code smell detection
  • Fast test execution
  • Integrated quality checks |

3.3 Persona: QA Lead Linda

Background: Quality assurance lead enforcing standards Goals: Standardized testing, coverage enforcement, quality gates Pain Points: Inconsistent patterns, coverage gaps, flaky tests Usage Patterns:

  • Configures CI pipelines
  • Reviews quality reports
  • Enforces testing standards

Success Criteria:

  • 80%+ coverage across all projects
  • Zero flaky tests in CI
  • Standardized patterns |

4. Product Vision

4.1 Vision Statement

"Provide a unified, high-performance testing ecosystem that enables every Phenotype developer to write fast, reliable, and maintainable tests while ensuring code quality through automated analysis."

4.2 Mission Statement

Enable Phenotype developers to:

  1. Write tests that execute in under 10ms
  2. Detect code smells with 90%+ accuracy
  3. Mock dependencies in under 5 lines of code
  4. Maintain consistent testing patterns across languages
  5. Integrate seamlessly with CI/CD pipelines

4.3 Strategic Objectives

Objective Key Result Timeline
Performance <10ms per test Q2 2026
Coverage 90% smell detection Q2 2026
Languages 3 languages stable Q3 2026
Ecosystem All Phenotype projects Q4 2026

5. Architecture Overview

5.1 System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     TestingKit Ecosystem                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │    Rust      │  │   Python     │  │     Go       │           │
│  │   Testing    │  │   Testing    │  │   Testing    │           │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘           │
│         │                 │                 │                    │
│         └─────────────────┼─────────────────┘                    │
│                           │                                      │
│              ┌────────────┴────────────┐                         │
│              │   Shared Patterns Layer   │                         │
│              │  • Test Data Formats     │                         │
│              │  • Result Aggregation    │                         │
│              │  • CI/CD Integration     │                         │
│              └──────────────────────────┘                         │
└─────────────────────────────────────────────────────────────────┘

5.2 Component Architecture

Rust Components

Component Purpose Dependencies Lines
phenotype-testing Core utilities tokio, tracing, rand ~500
phenotype-mock Mocking framework parking_lot ~400
phenotype-test-fixtures Test data chrono, uuid, serde ~200
phenotype-test-infra Integration infra tokio, tempfile ~300
phenotype-compliance-scanner Quality checks syn, quote ~400

Python Components

Component Purpose Dependencies Lines
pheno-testing Core utilities pytest, anyio ~800
pheno-quality Code quality ast, pylint ~1000

Go Components

Component Purpose Dependencies Lines
phenotype-testing Core utilities testify ~200

5.3 Data Flow

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Test Source   │────▶│  Test Discovery │────▶│  Test Execution │
│   Code Files    │     │  Language-native │     │  Parallel/Serial  │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                      │
                        ┌──────────────────────────────┼──────────┐
                        │                          │          │
                        ▼                          ▼          ▼
              ┌─────────────────┐       ┌─────────────────┐  ┌─────────────────┐
              │  Mock Context   │       │  Fixture Setup  │  │  Quality Check  │
              │  (if needed)    │       │  (if needed)    │  │  (optional)     │
              └────────┬────────┘       └────────┬────────┘  └────────┬────────┘
                       │                       │                    │
                       └───────────────────────┴────────────────────┘
                                                  │
                                                  ▼
                                       ┌─────────────────┐
                                       │  Test Result    │
                                       │  Aggregation    │
                                       └────────┬────────┘
                                                │
                        ┌────────────────────────┼────────────────────────┐
                        │                        │                        │
                        ▼                        ▼                        ▼
              ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
              │   JUnit XML     │      │   Coverage      │      │   CI/CD         │
              │   Report        │      │   Report        │      │   Integration   │
              └─────────────────┘      └─────────────────┘      └─────────────────┘

6. Component Requirements

6.1 phenotype-testing (Rust)

ID Requirement Priority Acceptance Criteria
PT-001 Timeout utilities P0 Async timeout with cancel
PT-002 Retry mechanisms P0 Exponential backoff
PT-003 Test data generators P0 Random strings, emails, UUIDs
PT-004 Port allocator P0 Random port selection
PT-005 Async runtime helpers P0 Block_on, spawn utilities

6.2 phenotype-mock (Rust)

ID Requirement Priority Acceptance Criteria
PM-001 Call recording P0 Method call tracking
PM-002 Return value stubbing P0 Configurable responses
PM-003 Verification P0 Call count, arguments
PM-004 Macro generation P1 mock_trait! macro
PM-005 Async mock support P1 async_trait support

6.3 phenotype-test-fixtures (Rust)

ID Requirement Priority Acceptance Criteria
PF-001 TestData container P0 Generic data wrapper
PF-002 TestScenario builder P0 Multi-step test definition
PF-003 Deterministic generation P0 Seeded random
PF-004 Serialization support P1 JSON, YAML

6.4 phenotype-test-infra (Rust)

ID Requirement Priority Acceptance Criteria
PI-001 TestServer P0 HTTP server for tests
PI-002 TestDatabase P0 Temp database
PI-003 TestContext P0 Resource aggregation
PI-004 Auto-cleanup P0 Drop trait implementation

6.5 pheno-testing (Python)

ID Requirement Priority Acceptance Criteria
PP-001 MCP QA framework P0 Process monitoring
PP-002 Performance testing P0 Benchmark decorators
PP-003 Async fixtures P0 pytest-asyncio support
PP-004 Load testing P1 Concurrent user simulation

6.6 pheno-quality (Python)

ID Requirement Priority Acceptance Criteria
PQ-001 Code smell detection P0 10+ smell types
PQ-002 Pattern detection P0 6+ architectural patterns
PQ-003 pytest integration P0 Plugin architecture
PQ-004 Custom rule support P1 User-defined rules

7. Functional Requirements

7.1 Testing Utilities

ID Requirement Priority User Story
TU-001 Async timeout P0 As a developer, I want to timeout async operations
TU-002 Retry with backoff P0 As a developer, I want to retry flaky operations
TU-003 Random data generation P0 As a developer, I want test data generators
TU-004 Test isolation P0 As a developer, I want isolated test environments
TU-005 Parallel execution P1 As a developer, I want parallel test execution

7.2 Mocking Framework

ID Requirement Priority User Story
MF-001 Method call verification P0 As a developer, I want to verify method calls
MF-002 Return value stubbing P0 As a developer, I want configurable responses
MF-003 Argument matching P0 As a developer, I want flexible argument matching
MF-004 Async mock support P1 As a developer, I want async mock traits
MF-005 Mock macros P1 As a developer, I want boilerplate generation

7.3 Code Quality

ID Requirement Priority User Story
CQ-001 Smell detection P0 As a developer, I want automated smell detection
CQ-002 Pattern validation P0 As an architect, I want pattern enforcement
CQ-003 CI integration P0 As a DevOps engineer, I want CI quality gates
CQ-004 Custom rules P1 As a developer, I want organization-specific rules

7.4 Fixtures

ID Requirement Priority User Story
FX-001 Deterministic data P0 As a developer, I want reproducible test data
FX-002 Builder pattern P0 As a developer, I want flexible fixture construction
FX-003 Database fixtures P0 As a developer, I want database test data
FX-004 HTTP server fixtures P0 As a developer, I want test HTTP servers
FX-005 Cleanup guarantee P0 As a developer, I want automatic cleanup

8. Non-Functional Requirements

8.1 Performance

Requirement Target Measurement
Unit test execution <10ms/test Mean
Mock setup <1ms Benchmark
Fixture creation <5ms Benchmark
Test discovery <1s/1000 tests Cold start

8.2 Quality

Requirement Target Measurement
Smell detection accuracy 90%+ Validation set
False positive rate <5% User feedback
Pattern detection coverage 80%+ Code review

8.3 Reliability

Requirement Target Measurement
Test isolation 100% Test suite
No flaky tests 0 CI monitoring
Cleanup success 100% Resource tracking

9. Security Requirements

9.1 Test Isolation

ID Requirement Priority Implementation
SEC-001 Process isolation P0 Separate processes
SEC-002 File system isolation P0 TempDir usage
SEC-003 Network isolation P0 Random ports, localhost
SEC-004 Resource cleanup P0 Drop/Teardown

9.2 Data Handling

ID Requirement Priority Implementation
SEC-005 No real credentials P0 Fixture generators
SEC-006 Deterministic random P0 Seeded RNG
SEC-007 Secure temp files P1 mktemp

10. Testing Patterns

10.1 Mock Patterns

// Verify method called
#[test]
fn test_service_calls_repository() {
    let ctx = MockContext::new();
    let mock_repo = MockRepository::with_context(&ctx);
    
    let service = UserService::new(mock_repo);
    service.get_user(1);
    
    assert!(ctx.verify_called("get_user"));
}

// Stub return values
#[test]
fn test_service_uses_repository_result() {
    let ctx = MockContext::new();
    ctx.expect("get_user")
        .with_args(vec!["1"])
        .returns(r#"{"id":1,"name":"Alice"}"#)
        .build();
    
    let mock_repo = MockRepository::with_context(&ctx);
    let service = UserService::new(mock_repo);
    
    let user = service.get_user(1);
    assert_eq!(user.name, "Alice");
}

10.2 Async Test Patterns

#[tokio::test]
async fn test_async_with_timeout() {
    let result = timeout(
        async_operation(),
        Duration::from_secs(5)
    ).await;
    
    assert!(result.is_ok());
}

#[tokio::test]
async fn test_concurrent_operations() {
    let handles: Vec<_> = (0..10)
        .map(|i| tokio::spawn(async move {
            operation(i).await
        }))
        .collect();
    
    let results = futures::future::join_all(handles).await;
    for result in results {
        assert!(result.is_ok());
    }
}

10.3 Fixture Patterns

// Database fixture
struct DatabaseFixture {
    db: TestDatabase,
    connection: Connection,
}

impl DatabaseFixture {
    async fn new() -> Self {
        let db = TestDatabase::new().unwrap();
        let connection = create_connection(&db.connection_string).await;
        run_migrations(&connection).await;
        
        Self { db, connection }
    }
    
    async fn seed_data(&self) {
        // Insert test data
    }
}

#[tokio::test]
async fn test_with_database() {
    let fixture = DatabaseFixture::new().await;
    fixture.seed_data().await;
    
    // Run tests
}

11. Data Models

11.1 CallRecord (Rust)

#[derive(Debug, Clone, Default)]
pub struct CallRecord {
    pub method: String,
    pub args: Vec<String>,
    pub return_value: Option<String>,
    pub count: usize,
}

11.2 Expectation (Rust)

#[derive(Debug, Clone, Default)]
pub struct Expectation {
    pub matcher: Matcher,
    pub return_value: Option<String>,
    pub times: Option<usize>,
    pub called_count: usize,
}

11.3 TestData (Rust)

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TestData<T> {
    pub id: Uuid,
    pub name: String,
    pub value: T,
    pub created_at: DateTime<Utc>,
    pub metadata: HashMap<String, String>,
}

11.4 Code Smell (Python)

@dataclass
class CodeSmell:
    smell_type: str
    location: Location
    severity: Severity
    message: str
    suggestion: Optional[str] = None

@dataclass
class Location:
    file: str
    line: int
    column: int

class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"

12. API Specifications

12.1 Rust API

// phenotype-testing
pub async fn timeout<F, T>(future: F, duration: Duration) -> Result<T, Elapsed>;
pub async fn retry_async<F, Fut, T, E>(operation: F, max_attempts: u32, base_delay: Duration) -> Result<T, E>;
pub fn random_string(len: usize) -> String;
pub fn random_port() -> u16;

// phenotype-mock
impl MockContext {
    pub fn new() -> Self;
    pub fn record_call(&self, method: impl Into<String>, args: Vec<String>);
    pub fn verify_called(&self, method: impl AsRef<str>) -> bool;
    pub fn expect(&self, method: impl Into<String>) -> ExpectationBuilder;
}

// phenotype-test-fixtures
impl<T: Default> TestData<T> {
    pub fn new(name: impl Into<String>, value: T) -> Self;
    pub fn with_metadata(mut self, key: impl Into<String>, value: impl Into<String>) -> Self;
}

12.2 Python API

# pheno-testing
from pheno_testing.mcp_qa.process import ProcessMonitor
from pheno_testing.performance import Benchmark

@Benchmark(warmup=5, iterations=100, timeout=60.0)
def test_database_query():
    return db.query("SELECT * FROM large_table")

# pheno-quality
from pheno_quality.tools import CodeSmellDetector

detector = CodeSmellDetector(
    rules=[
        GodObjectRule(max_methods=20),
        FeatureEnvyRule(threshold=0.7),
    ]
)

issues = detector.analyze_file("src/service.py")

13. Implementation Roadmap

13.1 Phase 1: Core (Q2 2026)

Deliverable Priority Owner
phenotype-testing v1.0 P0 Rust Team
phenotype-mock v1.0 P0 Rust Team
pheno-testing v1.0 P0 Python Team
pheno-quality v1.0 P0 Python Team

13.2 Phase 2: Infrastructure (Q3 2026)

Deliverable Priority Owner
phenotype-test-fixtures v1.0 P1 Rust Team
phenotype-test-infra v1.0 P1 Rust Team
phenotype-testing (Go) v1.0 P2 Go Team
CI/CD integration P1 DevOps Team

13.3 Phase 3: Advanced (Q4 2026)

Deliverable Priority Owner
Snapshot testing P2 Core Team
Fuzzing integration P2 Security Team
WebAssembly testing P3 Core Team
AI test generation Research Research Team

14. Quality Assurance

14.1 Testing Levels

┌─────────────────────────────────────┐
│         E2E Tests (5%)               │
│    Cross-language integration        │
├─────────────────────────────────────┤
│      Integration Tests (15%)         │
│    Component interactions            │
├─────────────────────────────────────┤
│        Unit Tests (80%)              │
│    Individual functions/types        │
└─────────────────────────────────────┘

14.2 Quality Gates

Check Tools Threshold
Format rustfmt, ruff 100% compliant
Lint clippy, ruff 0 warnings
Test cargo test, pytest 100% pass
Coverage cargo-tarpaulin, pytest-cov >= 80%
Security cargo-audit 0 high/critical

14.3 CI Configuration

# GitHub Actions
name: TestingKit CI

jobs:
  rust-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-action@stable
      - run: cargo install cargo-nextest
      - run: cargo nextest run --profile ci
      - run: cargo tarpaulin --out Xml

  python-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install -e "python/pheno-testing"
      - run: pip install -e "python/pheno-quality"
      - run: pytest python/ --cov --cov-report=xml
      - run: pytest python/ --quality

15. Performance Engineering

15.1 Benchmarks

Metric Target Test
Unit test <10ms 1000 iterations
Mock setup <1ms 1000 iterations
Fixture creation <5ms 1000 iterations
Code smell analysis <100ms/1000 LOC Sample files

15.2 Optimization Strategies

Strategy Impact Implementation
Parallel execution +50% throughput rayon, pytest-xdist
Lazy initialization -20% startup Lazy static
Test filtering -90% runtime Tag-based selection
Incremental analysis -70% analysis time AST caching

16. Risk Assessment

16.1 Risk Register

ID Risk Likelihood Impact Mitigation
R-001 Language syntax changes Medium Medium Version pinning
R-002 pytest API changes Medium Medium Abstraction layer
R-003 Smell detection false positives Medium Low Configurable rules
R-004 Performance regression Medium Medium Benchmark CI
R-005 Cross-platform issues Medium Low CI matrix testing

16.2 Mitigation Plans

  1. Version Management: Pin dependencies, automated updates
  2. API Stability: Abstraction layers for external APIs
  3. Quality Gates: Configurable thresholds, manual override
  4. Performance: Benchmarks in CI, regression detection
  5. Platform Support: Comprehensive CI matrix

17. Appendices

Appendix A: Complete API Reference

Package Version Language Purpose
phenotype-testing 1.0.0 Rust Core utilities
phenotype-mock 1.0.0 Rust Mocking
phenotype-test-fixtures 1.0.0 Rust Fixtures
phenotype-test-infra 1.0.0 Rust Integration
pheno-testing 1.0.0 Python Core utilities
pheno-quality 1.0.0 Python Code quality

Appendix B: Smell Detection Reference

Smell Description Detection
God Object Too many responsibilities Method/field count
Feature Envy Uses other class's data Data flow analysis
Data Clump Related data together Co-occurrence
Shotgun Surgery Many modifications Change coupling
Duplicate Code Similar blocks AST comparison

Appendix C: Glossary

Term Definition
Fixture Reusable test setup
Mock Test double with verification
Stub Test double with canned responses
Code Smell Indicator of deeper problems
Property-based Testing via generated inputs

Appendix D: URL Reference

Resource URL
Rust Testing https://doc.rust-lang.org/book/ch11-00-testing.html
pytest https://docs.pytest.org/
cargo-nextest https://nexte.st/
Property Testing https://github.com/AltSysrq/proptest

End of PRD: TestingKit v1.0.0

Document Owner: Quality Engineering Team Last Updated: 2026-04-05 Next Review: 2026-07-05