Mint-Claw · dagangtj · Feb 26, 2026
diff --git a/QUALITY_SCORING.md b/QUALITY_SCORING.md
@@ -0,0 +1,328 @@
+# Multi-Dimensional Quality Scoring
+
+## Overview
+
+This module automatically assesses the quality of structured outputs (JSON, Markdown, code, and plain text). It scores each submission on five weighted dimensions and combines them into a single score between 0.0 and 1.0.
+
+## Features
+
+- **Automatic format detection**: Classifies content as JSON, Markdown, code, or text
+- **5-dimensional scoring**: Completeness, format compliance, coverage, clarity, and validity
+- **Fast performance**: 10,000 submissions in under 1s on a MacBook Pro M1
+- **Actionable feedback**: Specific improvement suggestions
+- **Threshold validation**: Pass/fail against a 0.70 weighted score
+
+## Scoring Dimensions
+
+| Dimension | Weight | Description |
+|-----------|--------|-------------|
+| **Completeness** | 30% | Required fields/sections present |
+| **Format Compliance** | 20% | Valid syntax, proper structure |
+| **Coverage** | 25% | Depth and breadth of content |
+| **Clarity** | 15% | Readability, organization |
+| **Validity** | 10% | Logical consistency |
+
+**Pass Threshold**: 0.70 (70%). A submission passes when its weighted score is at least 0.70.
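+
+The weights combine into a single score as a weighted sum. A minimal sketch, assuming the dimension keys match those in the example output (`completeness`, `format`, `coverage`, `clarity`, `validity`); the names `WEIGHTS`, `weighted_score`, and `passes` are illustrative and are not taken from `quality_scorer.py`:
+
+```python
+# Illustrative only: weights copied from the Scoring Dimensions table.
+WEIGHTS = {
+    "completeness": 0.30,
+    "format": 0.20,
+    "coverage": 0.25,
+    "clarity": 0.15,
+    "validity": 0.10,
+}
+PASS_THRESHOLD = 0.70
+
+
+def weighted_score(scores: dict[str, float]) -> float:
+    """Return the weighted sum of per-dimension scores (each in 0.0-1.0)."""
+    return round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS), 3)
+
+
+def passes(scores: dict[str, float]) -> bool:
+    """True when the weighted score reaches the pass threshold."""
+    return weighted_score(scores) >= PASS_THRESHOLD
+```
+
+For dimension scores of 0.90, 1.00, 0.85, 0.75, and 0.80 (in table order), the sum is 0.27 + 0.20 + 0.2125 + 0.1125 + 0.08 = 0.875, which passes.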
+
+## Quality Ratings
+
+| Score Range | Rating | Description |
+|-------------|--------|-------------|
+| ≥ 0.90 | A+ | Excellent |
+| 0.85 to < 0.90 | A | Very Good |
+| 0.80 to < 0.85 | B+ | Good |
+| 0.75 to < 0.80 | B | Above Average |
+| 0.70 to < 0.75 | C+ | Acceptable |
+| 0.65 to < 0.70 | C | Below Average |
+| 0.60 to < 0.65 | D | Poor |
+| < 0.60 | F | Failing |
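+
+A minimal sketch of the score-to-rating lookup, assuming each band is closed at its lower bound, so a score such as 0.895 falls in A rather than between bands. `RATING_BOUNDS` and `quality_rating` are hypothetical names used for illustration:
+
+```python
+# Lower bound of each band, checked from highest to lowest.
+RATING_BOUNDS = [
+    (0.90, "A+"),
+    (0.85, "A"),
+    (0.80, "B+"),
+    (0.75, "B"),
+    (0.70, "C+"),
+    (0.65, "C"),
+    (0.60, "D"),
+]
+
+
+def quality_rating(score: float) -> str:
+    """Map a weighted score in 0.0-1.0 to its letter rating."""
+    for lower_bound, rating in RATING_BOUNDS:
+        if score >= lower_bound:
+            return rating
+    return "F"  # anything below 0.60
+```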
+
+## Installation
+
+No external dependencies required. Uses Python 3.10+ standard library only.
+
+```bash
+# Copy the module
+cp quality_scorer.py your_project/
+
+# Run tests
+python3 test_quality_scorer.py
+
+# Run examples
+python3 examples.py
+```
+
+## Usage
+
+### Basic Usage
+
+```python
+from quality_scorer import QualityScorer
+
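+scorer = QualityScorer()
+your_content = '{"title": "Launch post", "tags": ["ai", "tools"]}'  # any JSON, Markdown, code, or text string
+result = scorer.score(your_content)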
+
+print(f"Score: {result.weighted_score}")
+print(f"Rating: {result.quality_rating}")
+print(f"Pass: {result.pass_threshold}")
+```
+
+### Output Structure
+
+```python
+from dataclasses import dataclass
+from typing import Dict, List
+
+@dataclass
+class QualityScore:
+    weighted_score: float          # 0.0-1.0
+    quality_rating: str            # A+, A, B+, B, C+, C, D, F
+    scores: Dict[str, float]       # Individual dimension scores
+    feedback: List[str]            # Improvement suggestions
+    pass_threshold: bool           # True if >= 0.70
+```
+
+### Example Output
+
+```json
+{
+  "weighted_score": 0.875,
+  "quality_rating": "A",
+  "scores": {
+    "completeness": 0.900,
+    "format": 1.000,
+    "coverage": 0.850,
+    "clarity": 0.750,
+    "validity": 0.800
+  },
+  "feedback": [
+    "Detected format: json",
+    "JSON structure has good depth",
+    "Well-formatted with proper indentation"
+  ],
+  "pass_threshold": true
+}
+```
+
+## Format-Specific Scoring
+
+### JSON
+
+**Completeness**:
+- Non-empty objects/arrays
+- Nested structures present
+- Reasonable key count (≥3)
+
+**Format**:
+- Valid JSON syntax
+- Proper nesting
+
+**Coverage**:
+- Structure depth (≥2 levels)
+- Key count (≥5)
+- Content length (≥200 chars)
+
+**Clarity**:
+- Formatted with indentation
+- Descriptive key names
+
+**Validity**:
+- No null/empty values
+- No placeholder text
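+
+The JSON coverage criteria above (structure depth, key count, content length) could be measured roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and whether the real scorer counts keys at every nesting level or only at the top level is not stated here.
+
+```python
+import json
+
+
+def json_depth(value) -> int:
+    """Nesting depth: scalars are 0, a flat object or array is 1."""
+    if isinstance(value, dict):
+        return 1 + max((json_depth(v) for v in value.values()), default=0)
+    if isinstance(value, list):
+        return 1 + max((json_depth(v) for v in value), default=0)
+    return 0
+
+
+def count_keys(value) -> int:
+    """Total number of object keys at every nesting level (an assumption)."""
+    if isinstance(value, dict):
+        return len(value) + sum(count_keys(v) for v in value.values())
+    if isinstance(value, list):
+        return sum(count_keys(v) for v in value)
+    return 0
+
+
+def json_coverage_checks(content: str) -> dict[str, bool]:
+    """Evaluate the three JSON coverage criteria for a JSON string."""
+    data = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
+    return {
+        "depth>=2": json_depth(data) >= 2,
+        "keys>=5": count_keys(data) >= 5,
+        "length>=200": len(content) >= 200,
+    }
+```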
+
+### Markdown
+
+**Completeness**:
+- Headers present
+- Sufficient content (>100 chars)
+- Lists or structure
+
+**Format**:
+- Valid header levels (≤6)
+- No broken links
+- Proper list syntax
+
+**Coverage**:
+- Multiple sections (≥3)
+- List items (≥5)
+- Word count (≥200)
+
+**Clarity**:
+- Logical header hierarchy
+- Reasonable line length (<120)
+- Whitespace separation
+
+**Validity**:
+- No placeholder text
+- No empty sections
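+
+The Markdown coverage criteria (sections, list items, word count) can be approximated with two regular expressions. A hedged sketch: the patterns and the function name are assumptions, and this version also counts `#` lines and list markers that appear inside fenced code blocks.
+
+```python
+import re
+
+# ATX headers: 1-6 '#' characters, then spaces or tabs, then text.
+HEADER_RE = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)
+# Bulleted (-, *, +) or numbered (1.) list items, optionally indented.
+LIST_ITEM_RE = re.compile(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+\S", re.MULTILINE)
+
+
+def markdown_coverage_checks(content: str) -> dict[str, bool]:
+    """Evaluate the three Markdown coverage criteria."""
+    headers = len(HEADER_RE.findall(content))
+    list_items = len(LIST_ITEM_RE.findall(content))
+    words = len(content.split())
+    return {
+        "sections>=3": headers >= 3,
+        "list_items>=5": list_items >= 5,
+        "words>=200": words >= 200,
+    }
+```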
+
+### Code
+
+**Completeness**:
+- Functions/classes present
+- Comments/documentation
+- Multi-line structure (>5 lines)
+
+**Format**:
+- Balanced braces/parentheses
+- Proper syntax
+
+**Coverage**:
+- Multiple functions (≥3)
+- Comment lines (≥5)
+- Total lines (≥50)
+
+**Clarity**:
+- Proper indentation
+- Reasonable line length (<100)
+- Blank line separation
+
+**Validity**:
+- No syntax error markers
+- No placeholder code
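+
+The "balanced braces/parentheses" check under Format is commonly done with a stack. The sketch below illustrates that technique; it is not taken from `quality_scorer.py`, and `brackets_balanced` is a hypothetical name.
+
+```python
+PAIRS = {")": "(", "]": "[", "}": "{"}
+
+
+def brackets_balanced(code: str) -> bool:
+    """Return True when every bracket closes in the right order.
+
+    Simplification: brackets inside string literals and comments are
+    counted too, so valid code can occasionally be reported as unbalanced.
+    """
+    stack = []
+    for char in code:
+        if char in PAIRS.values():
+            stack.append(char)  # opening bracket
+        elif char in PAIRS:
+            # closing bracket must match the most recent opener
+            if not stack or stack.pop() != PAIRS[char]:
+                return False
+    return not stack  # leftover openers mean unbalanced
+```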
+
+### Text
+
+**Completeness**:
+- Adequate word count (≥50)
+- Multiple paragraphs
+- Proper punctuation
+
+**Format**:
+- Proper spacing
+- Capitalization
+- No excessive newlines
+
+**Coverage**:
+- Multiple paragraphs (≥3)
+- Sentences (≥10)
+- Word count (≥200)
+
+**Clarity**:
+- Reasonable sentence length (10-25 words)
+- Paragraph breaks
+- Line length (<100)
+
+**Validity**:
+- No placeholder text
+- No empty sections
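+
+The sentence-length criterion under Clarity (10-25 words) could be checked as sketched below. Two assumptions are made here: sentences end at `.`, `!`, or `?` followed by whitespace, and the range applies to the average sentence length rather than to every sentence. The function names are hypothetical.
+
+```python
+import re
+
+
+def sentence_lengths(text: str) -> list[int]:
+    """Word count of each sentence, splitting after '.', '!', or '?'."""
+    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
+    return [len(s.split()) for s in sentences if s]
+
+
+def average_sentence_length_ok(text: str, low: int = 10, high: int = 25) -> bool:
+    """True when the mean sentence length lies within [low, high] words."""
+    lengths = sentence_lengths(text)
+    if not lengths:
+        return False
+    average = sum(lengths) / len(lengths)
+    return low <= average <= high
+```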
+
+## Performance
+
+Tested on MacBook Pro M1:
+- **100 submissions**: < 0.01s
+- **1,000 submissions**: < 0.1s
+- **10,000 submissions**: < 1s
+
+Meets requirement: **100 submissions < 10s** ✅
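+
+To reproduce timings like these on other hardware, a small harness around `time.perf_counter` is enough. This is a sketch, not the project's own benchmark: `time_batch` is a hypothetical helper, and the built-in `len` stands in for the scorer so the snippet runs on its own.
+
+```python
+import time
+from typing import Callable
+
+
+def time_batch(score_fn: Callable[[str], object], submissions: list[str]) -> float:
+    """Return the wall-clock seconds needed to score every submission."""
+    start = time.perf_counter()
+    for content in submissions:
+        score_fn(content)
+    return time.perf_counter() - start
+
+
+if __name__ == "__main__":
+    # Stand-in scorer; pass QualityScorer().score to measure the real module.
+    elapsed = time_batch(len, ["# Title\n\nSome text."] * 100)
+    print(f"100 submissions in {elapsed:.4f}s")
+```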
+
+## API Reference
+
+### `QualityScorer`
+
+Main scoring class.
+
+#### Methods
+
+##### `detect_format(content: str) -> str`
+
+Auto-detect content format.
+
+**Returns**: `'json'`, `'markdown'`, `'code'`, or `'text'`
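+
+The detection rules themselves are not documented here. As an illustration only, a heuristic detector that returns the same four labels might look like the sketch below; the regular expressions, the check order, and the name `detect_format_sketch` are all assumptions, and the real `detect_format` may behave differently (for instance, this version labels Markdown that contains an `import` line as code).
+
+```python
+import json
+import re
+
+# Hypothetical hints; not taken from quality_scorer.py.
+CODE_HINT_RE = re.compile(
+    r"^[ \t]*(?:def |class |import |from \S+ import |function |#include )",
+    re.MULTILINE,
+)
+MARKDOWN_HINT_RE = re.compile(
+    r"^(?:#{1,6}[ \t]+\S|[-*+][ \t]+\S|\d+\.[ \t]+\S)", re.MULTILINE
+)
+
+
+def detect_format_sketch(content: str) -> str:
+    """Return 'json', 'code', 'markdown', or 'text' using simple heuristics."""
+    stripped = content.strip()
+    if stripped[:1] in ("{", "["):
+        try:
+            json.loads(stripped)
+            return "json"
+        except json.JSONDecodeError:
+            pass  # looked like JSON but did not parse; keep checking
+    if CODE_HINT_RE.search(content):
+        return "code"
+    if MARKDOWN_HINT_RE.search(content):
+        return "markdown"
+    return "text"
+```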
+
+##### `score(content: str) -> QualityScore`
+
+Score content across all dimensions.
+
+**Args**:
+- `content`: Content to score
+
+**Returns**: `QualityScore` object
+
+### `score_submission(content: str) -> dict`
+
+Module-level convenience function: scores `content` and returns the result as a plain `dict` instead of a `QualityScore` dataclass.
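+
+One plausible way to turn the dataclass into a dict is `dataclasses.asdict`; whether `score_submission` is implemented this way is an assumption. The sketch uses a local copy of the fields listed under Output Structure, and `to_dict` is a hypothetical helper name.
+
+```python
+from dataclasses import asdict, dataclass, field
+
+
+@dataclass
+class QualityScore:
+    """Local copy of the documented fields, for illustration only."""
+    weighted_score: float
+    quality_rating: str
+    scores: dict[str, float] = field(default_factory=dict)
+    feedback: list[str] = field(default_factory=list)
+    pass_threshold: bool = False
+
+
+def to_dict(result: QualityScore) -> dict:
+    """Convert a QualityScore into a plain, JSON-serializable dict."""
+    return asdict(result)
+```
+
+The resulting dict can be passed straight to `json.dumps`, which is convenient for API responses.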
+
+## Testing
+
+```bash
+# Run all tests
+python3 test_quality_scorer.py
+
+# Expected output:
+# ✓ Format detection tests passed
+# ✓ JSON scoring tests passed
+# ✓ Markdown scoring tests passed
+# ✓ Code scoring tests passed
+# ✓ Text scoring tests passed
+# ✓ Performance test passed: 100 submissions in 0.00s
+# ✓ Edge case tests passed
+# ✓ Dimension scoring tests passed
+# ✓ Quality rating tests passed
+# ✓ Feedback generation tests passed
+# ✅ All tests passed!
+```
+
+## Examples
+
+See `examples.py` for comprehensive usage examples:
+
+```bash
+python3 examples.py
+```
+
+Includes:
+1. JSON content scoring
+2. Markdown content scoring
+3. Code content scoring
+4. Batch scoring
+5. Quality comparison
+
+## Integration with ContentSplit API
+
+```python
+from quality_scorer import QualityScorer
+
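+# In your API endpoint. `app`, `RepurposeRequest`, and `generate_content`
+# are assumed to be defined by the existing ContentSplit API code.
+@app.post("/api/repurpose")
+async def repurpose_content(request: RepurposeRequest):
+    # Generate content: a dict mapping platform name to generated text
+    results = generate_content(request)
+
+    # Score each platform's content; build a new dict rather than
+    # overwriting `results` while iterating over it
+    scorer = QualityScorer()
+    scored = {}
+    for platform, content in results.items():
+        quality = scorer.score(content)
+        scored[platform] = {
+            "content": content,
+            "quality_score": quality.weighted_score,
+            "quality_rating": quality.quality_rating,
+        }
+
+    return scored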
+```
+
+## Limitations
+
+- **Language**: English-optimized (works with other languages but may need tuning)
+- **Context**: No semantic understanding (syntax/structure only)
+- **Domain**: General-purpose (not specialized for specific domains)
+
+## Future Enhancements
+
+Potential improvements (not in scope for bounty):
+
+- NLP-based feedback generation
+- ML classifier for format detection
+- Domain-specific rubrics
+- Multi-language support
+- Semantic similarity scoring
+
+## License
+
+MIT License - Free to use and modify
+
+## Author
+
+Built for Mint-Claw/content-split bounty #1
+
+## Support
+
+For issues or questions, open an issue on GitHub.