Commit efa3edf

Implement high-performance Raw API with safe wrapper
- Added `CSVParser.parse(data:body:)` for zero-copy parsing with memory safety
- Optimized parser with SIMD quoted field scanning and smart unescaping
- Added `Raw Parse` benchmarks showing 2.5x speedup
- Updated README with Raw API usage guide and benchmark results
- Added tests for safe parser wrapper
1 parent 99ea911 commit efa3edf

4 files changed

Lines changed: 280 additions & 117 deletions

README.md

Lines changed: 34 additions & 0 deletions
@@ -410,6 +410,40 @@ Benchmark results on Apple Silicon (M-series):

 *Parallel decoding has overhead for smaller files. Benefits appear with larger datasets (1M+ rows) or complex parsing.

+### Raw High-Performance API (Codable Bypass)
+
+For performance-critical tasks (pre-processing, filtering, or massive datasets), you can bypass `Codable` overhead entirely using the zero-copy `CSVParser` API. In the benchmarks below this achieves **2.5x to 2.8x higher throughput** (~650K to ~885K rows/s).
+
+**Safe Usage:**
+Use the `CSVParser.parse(data:body:)` wrapper to ensure memory safety.
+
+```swift
+let data = try Data(contentsOf: bigFile)
+
+// Count rows where age > 18
+let count = try CSVParser.parse(data: data) { parser in
+    var validCount = 0
+    for row in parser {
+        // 'row' is a zero-allocation View
+        // Access fields by index (0-based)
+        if let ageStr = row.string(at: 1), let age = Int(ageStr), age > 18 {
+            validCount += 1
+        }
+    }
+    return validCount
+}
+```
+
+This approach avoids allocating `struct` or `class` instances for every row, drastically reducing ARC traffic.
+
+#### Raw API Benchmarks (1M Rows)
+
+| Benchmark | Time | Throughput | Speedup vs Codable |
+|-----------|------|------------|-------------------|
+| Raw Parse (Iterate Only) | 1.49 s | **~670K rows/s** | **2.6x** |
+| Raw Parse (Iterate + String) | 1.54 s | **~650K rows/s** | **2.5x** |
+| Raw Parse (Quoted Fields) | - | **~885K rows/s** | **2.8x** |
+
 ### Special Strategies (1K rows)

 | Benchmark | Time | Throughput |
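
To illustrate why a closure-scoped entry point such as `CSVParser.parse(data:body:)` can keep zero-copy row views memory-safe, here is a minimal sketch. It is not the library's implementation: `RowView` and `forEachRow(in:_:)` are hypothetical names introduced only for this example. The sketch splits on bare commas and `\n`, and ignores the quoting and unescaping that the real parser handles.

```swift
import Foundation

/// Hypothetical borrowed view over one row; not the library's row type.
/// It stores only a pointer into the input buffer plus a byte range.
struct RowView {
    let bytes: UnsafeRawBufferPointer   // the whole input buffer
    let range: Range<Int>               // byte range of this row

    /// Returns the field at `index` (0-based), splitting on ','.
    /// Assumption: no quoted fields. A String is created only on request.
    func string(at index: Int) -> String? {
        var current = 0
        var start = range.lowerBound
        for i in range.lowerBound...range.upperBound {
            if i == range.upperBound || bytes[i] == UInt8(ascii: ",") {
                if current == index {
                    return String(decoding: bytes[start..<i], as: UTF8.self)
                }
                current += 1
                start = i + 1
            }
        }
        return nil
    }
}

/// Hypothetical closure-scoped iteration: the raw pointer is valid only
/// while `withUnsafeBytes` is running, so views are handed out inside it.
func forEachRow(in data: Data, _ body: (RowView) throws -> Void) rethrows {
    try data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) in
        var start = 0
        for i in 0..<buffer.count where buffer[i] == UInt8(ascii: "\n") {
            try body(RowView(bytes: buffer, range: start..<i))
            start = i + 1
        }
        if start < buffer.count {       // final row without a trailing newline
            try body(RowView(bytes: buffer, range: start..<buffer.count))
        }
    }
}

// Usage: count rows where the second field parses as an age above 18.
// The header row is skipped because "age" does not parse as an Int.
let csv = Data("name,age\nAda,36\nBob,17\nCy,42\n".utf8)
var adults = 0
forEachRow(in: csv) { row in
    if let ageStr = row.string(at: 1), let age = Int(ageStr), age > 18 {
        adults += 1
    }
}
print(adults)
```

The views exist only inside the `withUnsafeBytes` closure, so the borrowed pointer cannot outlive the `Data` it points into, provided the caller does not copy a `RowView` out of the closure. This is the general idea behind a scoped wrapper. How the actual `CSVParser` enforces it is not shown in this diff.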
