This document outlines the comprehensive testing strategy for the MarkItDown MCP server to ensure reliability, security, and compatibility before release.
- MCPRequest/MCPResponse serialization/deserialization
  - Valid JSON-RPC 2.0 format
  - Invalid JSON handling
  - Missing required fields
  - Type validation
- Request routing
  - `initialize` method handling
  - `tools/list` method handling
  - `tools/call` method handling
  - Unknown method handling
  - Invalid method names
- Error handling
  - Internal server errors
  - Request validation errors
  - Tool execution errors
  - Timeout handling
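
The protocol checks above can be sketched as a small validation helper. This is a minimal illustration, not the server's actual code: `check_request` and `KNOWN_METHODS` are hypothetical names, and only the error codes come from the JSON-RPC 2.0 specification.

```python
import json

# Standard JSON-RPC 2.0 error codes (from the JSON-RPC 2.0 specification).
PARSE_ERROR = -32700
INVALID_REQUEST = -32600
METHOD_NOT_FOUND = -32601

# Methods named in this plan; the real server may route more (assumption).
KNOWN_METHODS = {"initialize", "tools/list", "tools/call"}


def check_request(raw: str):
    """Hypothetical helper: return None if valid, else a JSON-RPC error code."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return PARSE_ERROR
    # A single request must be an object carrying the version marker.
    if not isinstance(msg, dict) or msg.get("jsonrpc") != "2.0":
        return INVALID_REQUEST
    method = msg.get("method")
    if not isinstance(method, str):
        return INVALID_REQUEST
    if method not in KNOWN_METHODS:
        return METHOD_NOT_FOUND
    return None
```

Each bullet in the list maps to one input class: well-formed requests, unparseable JSON, missing or mistyped fields, and unknown methods.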
- convert_file tool
  - File path validation
  - Base64 content decoding
  - File existence checks
  - Permission validation
  - Return format validation
- list_supported_formats tool
  - Format list accuracy
  - Categorization correctness
  - Response structure
- convert_directory tool
  - Directory traversal logic
  - File filtering
  - Progress tracking
  - Error aggregation
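
As one example of the tool-level checks, base64 content decoding can be tested against a strict decoder. The sketch below assumes a hypothetical `decode_content` helper; how the real `convert_file` tool decodes its input is not specified in this plan.

```python
import base64
import binascii


def decode_content(encoded: str) -> bytes:
    """Decode base64 text strictly; raise ValueError on malformed input."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid base64 content: {exc}") from exc
```

A unit test would then cover a round trip of known bytes, the empty string, and at least one malformed payload.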
- Document conversion
  - Success path testing
  - Error handling
  - Result formatting
  - Memory management
- Server lifecycle
  - Initialization sequence
  - Clean shutdown
  - Graceful error recovery
  - Connection state management
- Tool execution flow
  - Request parsing → Tool execution → Response formatting
  - Concurrent request handling
  - Request timeout behavior
  - Resource cleanup
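
Request timeout behavior can be exercised with a slow stand-in tool. The sketch below is illustrative only: `call_with_timeout` and `slow_tool` are hypothetical, and the error code is an assumption chosen from the range JSON-RPC 2.0 reserves for implementation-defined server errors.

```python
import asyncio


async def call_with_timeout(coro, seconds: float):
    """Await coro; return a JSON-RPC-style error dict if it takes too long."""
    try:
        result = await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        # -32000 is in the implementation-defined server error range;
        # the code the real server uses is an assumption here.
        return {"error": {"code": -32000, "message": "tool execution timed out"}}
    return {"result": result}


async def slow_tool(delay: float) -> str:
    """Stand-in for a tool call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "done"
```

For example, `asyncio.run(call_with_timeout(slow_tool(1.0), 0.01))` takes the timeout branch, while a generous limit returns the tool result.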
- File operations
  - Read permissions
  - Path traversal security
  - Symbolic link handling
  - Network drive compatibility
  - Large file handling
- Directory operations
  - Recursive traversal
  - Mixed file types
  - Empty directories
  - Nested structures
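
The directory cases above (recursive traversal, mixed types, empty and nested directories) can be checked against a small collector. This is a sketch under assumptions: `collect_files` is a hypothetical helper, not the `convert_directory` implementation.

```python
from pathlib import Path


def collect_files(root: Path, extensions: set[str]) -> list[Path]:
    """Return files under root (recursively) whose suffix is in extensions."""
    return sorted(
        p for p in root.rglob("*")
        # Compare suffixes case-insensitively so b.DOCX matches ".docx".
        if p.is_file() and p.suffix.lower() in extensions
    )
```

A test can build a temporary tree with an empty directory, a deeply nested file, and an unsupported extension, then assert that only the supported files are returned.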
For each supported format, test:
- Valid files: Typical use cases
- Edge cases: Empty files, minimal content
- Large files: Memory and performance impact
- Corrupted files: Graceful error handling
- Special cases: Password-protected, encrypted
- Simple text PDFs
- Complex layouts with tables/images
- Scanned PDFs (image-based)
- Password-protected PDFs
- Corrupted PDF files
- Multi-page documents
- Large PDFs (100+ pages)
- Excel (.xlsx, .xls)
  - Multiple worksheets
  - Formulas and calculations
  - Charts and graphs
  - Large spreadsheets
  - Password-protected files
- Word (.docx)
  - Simple text documents
  - Complex formatting
  - Images and tables
  - Track changes/comments
- PowerPoint (.pptx)
  - Text-heavy slides
  - Image-heavy presentations
  - Animations and transitions
- EXIF metadata extraction
  - Photos with full EXIF data
  - Images without metadata
  - Corrupted EXIF data
- Format variety
  - JPG, PNG, GIF, BMP, TIFF, WebP
  - Different resolutions
  - Color vs. grayscale
- Speech recognition
  - Clear speech recordings
  - Multiple speakers
  - Background noise
  - Different audio qualities
- Format support
  - MP3, WAV, FLAC, M4A, OGG, WMA
  - Different bitrates
  - Mono vs. stereo
- Web formats: HTML, XML, JSON, CSV
- Text formats: TXT, MD, RST
- Archives: ZIP files with mixed content
- E-books: EPUB files
- Concurrent requests
  - Multiple simultaneous conversions
  - Resource contention
  - Memory usage patterns
  - CPU utilization
- Large file handling
  - Files > 100MB
  - Memory efficiency
  - Streaming vs. loading
  - Timeout behavior
- Resource limits
  - Maximum concurrent requests
  - Memory exhaustion scenarios
  - CPU-bound vs. I/O-bound operations
  - Recovery from resource exhaustion
- Load testing
  - Sustained high request rates
  - Gradual load increase
  - Peak load handling
  - Performance degradation patterns
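
A simple starting point for the concurrency cases is to fire many conversions at once and confirm every result comes back intact. The sketch below uses a stand-in `fake_convert` (hypothetical); a real test would call the server's conversion path instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_convert(name: str) -> str:
    """Stand-in for a real conversion; sleeps briefly to simulate I/O."""
    time.sleep(0.01)
    return f"# {name}"


def run_concurrently(names, max_workers: int = 4):
    """Run all conversions on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_convert, names))
```

Raising `max_workers` and the number of inputs gives a rough way to probe resource contention before moving to a dedicated load-generation tool.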
- Path traversal attacks
  - `../../../etc/passwd` attempts
  - Absolute path handling
  - Symbolic link exploitation
  - Network path attempts
- Malicious content
  - Files with embedded scripts
  - Zip bombs
  - Files with excessive metadata
  - Binary files disguised as text
- Denial of Service (DoS)
  - Large file uploads
  - Infinite loop scenarios
  - Memory exhaustion attempts
  - CPU exhaustion attacks
- Information disclosure
  - Error message content
  - File path leakage
  - System information exposure
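
The path traversal cases can be tested against a containment check. This is a minimal sketch that assumes the server restricts access to a base directory, which this plan does not state; `is_within` is a hypothetical helper.

```python
from pathlib import Path


def is_within(base: Path, requested: str) -> bool:
    """Return True if `requested` resolves to a location inside `base`."""
    base_resolved = base.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison is made on the real target. An absolute `requested`
    # replaces the base entirely when joined.
    target = (base_resolved / requested).resolve()
    return target.is_relative_to(base_resolved)
```

Security tests would feed this kind of check relative escapes, absolute paths, and symlinks that point outside the base directory, and expect all of them to be rejected.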
- Operating Systems
  - macOS (Intel/Apple Silicon)
  - Windows 10/11
  - Ubuntu/Debian Linux
  - CentOS/RHEL
- Python Versions
  - Python 3.10, 3.11, 3.12, 3.13
  - Virtual environments
  - System Python vs. user installations
- Optional dependencies
  - Missing dependencies behavior
  - Partial dependency installation
  - Version compatibility ranges
  - Dependency conflict resolution
- Claude Desktop Integration
  - Different Claude Desktop versions
  - Configuration variations
  - Network conditions
  - Error recovery scenarios
- File not found
  - Non-existent paths
  - Deleted files during processing
  - Network disconnections
- Permission errors
  - Read-only files
  - Protected directories
  - Insufficient privileges
- Format errors
  - Unsupported file types
  - Corrupted files
  - Incomplete files
- System failures
  - Out of memory
  - Disk full
  - Network timeouts
  - Process kills
- Dependency failures
  - Missing libraries
  - Version conflicts
  - Runtime errors
- Tool discovery
  - Tools appear in interface
  - Descriptions are clear
  - Parameter hints work
- Conversion workflows
  - Single file conversion
  - Batch directory conversion
  - Error reporting clarity
  - Progress indication
- User-friendly errors
  - Clear problem descriptions
  - Actionable solutions
  - No technical jargon
  - Helpful suggestions
Create a comprehensive test dataset including:
- Small files (< 1KB) of each format
- Medium files (1KB - 10MB) representing typical use
- Large files (> 10MB) for performance testing
- Edge cases: Empty files, single character, maximum size
- Corrupted files: Intentionally broken formats
- Special characters: Unicode filenames, spaces, symbols
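
Part of this dataset can be generated rather than collected by hand. The sketch below writes a few of the edge-case files listed above; `build_fixtures` and the file names are hypothetical, and real format samples (PDF, DOCX, audio) would still need to be sourced separately.

```python
from pathlib import Path


def build_fixtures(root: Path) -> dict[str, Path]:
    """Write a few edge-case files under root and return them by label."""
    root.mkdir(parents=True, exist_ok=True)
    specs = {
        "empty": ("empty.txt", b""),
        "single_char": ("single.txt", b"a"),
        "unicode_name": ("résumé.txt", "héllo".encode("utf-8")),
        "spaces_and_symbols": ("my file (v2) #1.txt", b"data"),
        # A ZIP local-file signature followed by padding, standing in
        # for a corrupted Office document.
        "corrupted": ("broken.docx", b"PK\x03\x04" + b"\x00" * 8),
    }
    paths = {}
    for label, (name, payload) in specs.items():
        path = root / name
        path.write_bytes(payload)
        paths[label] = path
    return paths
```

Generating these files at test time keeps the repository small and makes the edge cases explicit in code.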
- Happy path: Ideal conditions, all dependencies available
- Error paths: Missing dependencies, invalid inputs
- Edge cases: Boundary conditions, unusual inputs
- Real-world: Typical user files and workflows
- Unit tests: pytest framework
- Integration tests: Full MCP protocol simulation
- Performance tests: Load generation and metrics
- CI/CD: GitHub Actions for multiple environments
- Claude Desktop integration: Real environment testing
- User workflow validation: End-to-end scenarios
- Exploratory testing: Edge cases and creative usage
- ✅ All 29+ file formats convert successfully
- ✅ All MCP tools work as documented
- ✅ Error handling is graceful and informative
- ✅ Performance meets acceptable thresholds
- ✅ No crashes under normal usage
- ✅ Graceful degradation under stress
- ✅ Memory leaks eliminated
- ✅ Resource cleanup on errors
- ✅ No path traversal vulnerabilities
- ✅ No information disclosure
- ✅ DoS protection mechanisms
- ✅ Safe handling of malicious files
- ✅ Works on all target platforms
- ✅ Compatible with all supported Python versions
- ✅ Handles missing dependencies gracefully
- ✅ Integrates properly with Claude Desktop
- Set up test framework and infrastructure
- Implement unit tests for core functionality
- Create basic test data set
- Complete unit test coverage
- Implement integration tests
- File format testing for major formats
- Complete file format coverage
- Performance and stress testing
- Security testing
- End-to-end testing with Claude Desktop
- Multi-platform compatibility testing
- User experience validation
- Bug fixes and retesting
- Large file handling - Memory issues, timeouts
- Concurrent requests - Resource contention, race conditions
- Dependency management - Missing/incompatible packages
- Security vulnerabilities - Path traversal, DoS attacks
- Comprehensive performance testing with realistic data
- Security review of all input handling
- Dependency testing across multiple environments
- Staged rollout with monitoring and rollback capability
- Test suite - Comprehensive automated tests
- Test data - Representative file collection
- Performance benchmarks - Baseline metrics
- Security assessment - Vulnerability analysis
- Compatibility matrix - Platform/version support
- User testing report - Real-world validation results