@gmanvel gmanvel commented Dec 8, 2025

Description

This PR introduces performance optimizations to TokenTextChunker that significantly reduce memory allocations while maintaining correctness. The optimizations target the hot path of document chunking during indexing pipelines.

Problem Description

The original Chunk method had several allocation-heavy patterns:

  1. List<T>.GetRange() - Allocates a new List + backing array per chunk iteration
  2. new int[] for token values - Allocates per chunk, immediately becomes garbage
  3. LINQ chain for document IDs - .Select().Distinct().ToArray() creates multiple intermediate allocations
  4. No capacity pre-allocation - Lists grow via repeated reallocation
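
The four patterns above can be illustrated with a minimal, self-contained sketch (hypothetical names; this is not the actual TokenTextChunker source):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of the allocation-heavy baseline described above.
public static class BaselineSketch
{
    public static (int[] Tokens, string[] DocIds) ChunkWindow(
        List<(int SliceIndex, int Token)> flattened, int start, int count)
    {
        // 1) GetRange allocates a new List<T> plus a backing array per chunk.
        var window = flattened.GetRange(start, count);

        // 2) A fresh int[] per chunk that becomes garbage right after use.
        var tokens = new int[window.Count];
        for (int i = 0; i < window.Count; i++) tokens[i] = window[i].Token;

        // 3) LINQ chain allocating iterators, an intermediate set, and an array.
        var docIds = window.Select(t => "doc" + t.SliceIndex).Distinct().ToArray();
        return (tokens, docIds);
    }

    public static void Main()
    {
        var flattened = new List<(int SliceIndex, int Token)>();
        for (int i = 0; i < 8; i++) flattened.Add((i / 4, i)); // 4) grows by reallocation
        var (tokens, docIds) = ChunkWindow(flattened, 0, 4);
        Console.WriteLine(string.Join(",", tokens)); // 0,1,2,3
        Console.WriteLine(string.Join(",", docIds)); // doc0
    }
}
```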

Solution

Optimizations Applied
| Optimization | Before | After |
|---|---|---|
| Chunk token access | `GetRange()` (allocates `List`) | `CollectionsMarshal.AsSpan().Slice()` (zero-alloc view) |
| Token value buffer | `new int[n]` per chunk | `ArrayPool<int>.Shared.Rent`/`Return` |
| Document ID collection | LINQ `.Distinct().ToArray()` | Reusable `HashSet<string>` with `Clear()` |
| Document ID iteration | Check every token | Check only on slice-boundary transitions |
| Results list | Default capacity | Pre-calculated capacity estimate |
| Flattened list | New per call | Option to reuse across calls (singleton scenario, see Additional Notes) |
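
The span and array-pool rows above can be sketched as follows (a minimal illustration under assumed shapes, not the PR's exact code):

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Sketch of the CollectionsMarshal + ArrayPool replacements.
public static class SpanPoolSketch
{
    public static int[] CopyTokens(List<(int SliceIndex, int Token)> flattened, int start, int count)
    {
        // Zero-allocation view over the list's backing array; only valid as
        // long as the list is not resized while the span is in use.
        ReadOnlySpan<(int SliceIndex, int Token)> window =
            CollectionsMarshal.AsSpan(flattened).Slice(start, count);

        // Rent a buffer instead of allocating new int[count] per chunk.
        // Note: Rent may return a larger array than requested.
        int[] buffer = ArrayPool<int>.Shared.Rent(count);
        try
        {
            for (int i = 0; i < count; i++) buffer[i] = window[i].Token;
            return buffer[..count]; // copy out for the demo; real code would consume in place
        }
        finally
        {
            ArrayPool<int>.Shared.Return(buffer);
        }
    }

    public static void Main()
    {
        var flattened = new List<(int SliceIndex, int Token)>();
        for (int i = 0; i < 10; i++) flattened.Add((i / 4, i));
        Console.WriteLine(string.Join(",", CopyTokens(flattened, 4, 4))); // 4,5,6,7
    }
}
```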

Proposed Changes

Reduce allocations by:

  • Using a Span-based API to process chunks of the flattened list without allocations
  • Using a pooled array for token values instead of allocating a new one per chunk
  • Removing LINQ from the document ID lookup to cut intermediate allocations
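
The LINQ-free document ID lookup can be sketched like this (hypothetical names; the set is an instance field in the real chunker, static here only for the demo):

```csharp
using System;
using System.Collections.Generic;

// Sketch: reuse a HashSet via Clear() instead of .Select().Distinct().ToArray(),
// and only test the ID when the slice index changes, not on every token.
public static class DocIdSketch
{
    private static readonly HashSet<string> _seen = new();
    private static readonly List<string> _result = new();

    public static string[] CollectDocIds((int SliceIndex, string DocId)[] window)
    {
        _seen.Clear();   // Clear keeps the backing storage, no reallocation
        _result.Clear();
        int lastSlice = -1;
        foreach (var (sliceIndex, docId) in window)
        {
            if (sliceIndex == lastSlice) continue; // only on boundary transitions
            lastSlice = sliceIndex;
            if (_seen.Add(docId)) _result.Add(docId);
        }
        return _result.ToArray();
    }

    public static void Main()
    {
        var window = new[] { (0, "a"), (0, "a"), (1, "b"), (2, "a") };
        Console.WriteLine(string.Join(",", CollectDocIds(window))); // a,b
    }
}
```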

Benchmark Results

Scenario 1: Thread-Safe (New List Per Call, implementation in this PR)

Overall Improvement
| Document Size | Avg Perf Improvement | Avg Allocation Reduction |
|---|---|---|
| Small | ~7.5% faster | ~24.5% less alloc |
| Medium | ~7.0% faster | ~20.7% less alloc |
| Large | ~9.0% faster | ~23.5% less alloc |

Small Documents
| Chunk | Baseline (µs) | Optimized (µs) | Perf Gain | Baseline (KB) | Optimized (KB) | Alloc Reduction |
|---|---|---|---|---|---|---|
| 512/0 | 14.83 | 13.75 | 7.3% | 11.84 | 8.95 | 24.4% |
| 512/64 | 14.93 | 13.69 | 8.3% | 11.84 | 8.95 | 24.4% |
| 512/128 | 14.95 | 13.84 | 7.4% | 11.84 | 8.95 | 24.4% |
| 1024/0 | 15.02 | 13.78 | 8.2% | 11.84 | 8.95 | 24.4% |
| 1024/64 | 15.06 | 13.88 | 7.8% | 11.84 | 8.95 | 24.4% |
| 1024/128 | 14.92 | 13.71 | 8.1% | 11.84 | 8.95 | 24.4% |
| 2048/0 | 14.88 | 13.71 | 7.9% | 11.84 | 8.95 | 24.4% |
| 2048/64 | 14.92 | 13.61 | 8.8% | 11.84 | 8.95 | 24.4% |
| 2048/128 | 15.06 | 13.74 | 8.8% | 11.84 | 8.95 | 24.4% |

Medium Documents
| Chunk | Baseline (µs) | Optimized (µs) | Perf Gain | Baseline (KB) | Optimized (KB) | Alloc Reduction |
|---|---|---|---|---|---|---|
| 512/0 | 1423.50 | 1345.73 | 5.4% | 1229.23 | 972.81 | 20.8% |
| 512/64 | 1482.25 | 1366.19 | 7.8% | 1294.57 | 1001.84 | 22.6% |
| 512/128 | 1539.59 | 1405.99 | 8.7% | 1379.28 | 1039.42 | 24.7% |
| 1024/0 | 1416.45 | 1331.83 | 6.0% | 1218.56 | 968.75 | 20.5% |
| 1024/64 | 1482.40 | 1339.31 | 9.7% | 1247.44 | 981.60 | 21.3% |
| 1024/128 | 1452.28 | 1348.64 | 7.1% | 1279.12 | 995.73 | 22.1% |
| 2048/0 | 1419.49 | 1326.13 | 6.6% | 1213.16 | 966.71 | 20.3% |
| 2048/64 | 1426.52 | 1331.34 | 6.7% | 1226.86 | 972.87 | 20.7% |
| 2048/128 | 1438.68 | 1350.76 | 6.1% | 1240.25 | 978.79 | 21.1% |

Large Documents
| Chunk | Baseline (µs) | Optimized (µs) | Perf Gain | Baseline (KB) | Optimized (KB) | Alloc Reduction |
|---|---|---|---|---|---|---|
| 512/0 | 14431.18 | 13195.54 | 8.6% | 10740.19 | 8181.02 | 23.8% |
| 512/64 | 14857.46 | 13865.31 | 6.7% | 11394.85 | 8471.49 | 25.7% |
| 512/128 | 15668.28 | 14183.00 | 9.5% | 12275.82 | 8858.92 | 27.9% |
| 1024/0 | 14254.90 | 13182.59 | 7.5% | 10633.63 | 8139.95 | 23.5% |
| 1024/64 | 14326.83 | 13505.65 | 5.7% | 10931.39 | 8272.53 | 24.3% |
| 1024/128 | 14614.92 | 13603.63 | 6.9% | 11273.66 | 8424.56 | 25.3% |
| 2048/0 | 14105.07 | 13348.39 | 5.4% | 10580.09 | 8119.37 | 23.3% |
| 2048/64 | 14231.75 | 13290.38 | 6.6% | 10722.42 | 8182.49 | 23.7% |
| 2048/128 | 14467.38 | 13382.53 | 7.5% | 10873.30 | 8249.97 | 24.2% |

Scenario 2: Singleton Pattern (Reused List, not implemented)

Overall Improvement
| Document Size | Avg Perf Improvement | Avg Allocation Reduction |
|---|---|---|
| Small | ~4–6% faster | ~59–60% less alloc |
| Medium | ~6–8% faster | ~62–65% less alloc |
| Large | ~5–8% faster | ~61–63% less alloc |

Small Documents
| Chunk | Baseline (µs) | Optimized (µs) | Perf Gain | Baseline (KB) | Optimized (KB) | Alloc Reduction |
|---|---|---|---|---|---|---|
| 512/0 | 14.74 | 14.33 | 2.8% | 11.84 | 4.79 | 59.5% |
| 512/64 | 15.01 | 14.34 | 4.5% | 11.84 | 4.79 | 59.5% |
| 512/128 | 15.01 | 14.07 | 6.3% | 11.84 | 4.79 | 59.5% |
| 1024/0 | 15.25 | 14.28 | 6.3% | 11.84 | 4.79 | 59.5% |
| 1024/64 | 14.97 | 14.30 | 4.5% | 11.84 | 4.79 | 59.5% |
| 1024/128 | 14.96 | 14.34 | 4.1% | 11.84 | 4.79 | 59.5% |
| 2048/0 | 14.89 | 14.28 | 4.1% | 11.84 | 4.79 | 59.5% |
| 2048/64 | 14.94 | 14.20 | 5.0% | 11.84 | 4.79 | 59.5% |
| 2048/128 | 15.09 | 14.33 | 5.0% | 11.84 | 4.79 | 59.5% |

Medium Documents
| Chunk | Baseline (µs) | Optimized (µs) | Perf Gain | Baseline (KB) | Optimized (KB) | Alloc Reduction |
|---|---|---|---|---|---|---|
| 512/0 | 1433.14 | 1341.80 | 6.4% | 1229.23 | 460.47 | 62.5% |
| 512/64 | 1470.74 | 1375.65 | 6.5% | 1294.57 | 489.50 | 62.2% |
| 512/128 | 1529.06 | 1406.22 | 8.0% | 1379.28 | 527.08 | 61.8% |
| 1024/0 | 1411.96 | 1341.78 | 5.0% | 1218.55 | 456.41 | 62.6% |
| 1024/64 | 1441.19 | 1363.08 | 5.4% | 1247.44 | 469.25 | 62.4% |
| 1024/128 | 1445.92 | 1375.79 | 4.8% | 1279.12 | 483.38 | 62.2% |
| 2048/0 | 1410.96 | 1342.18 | 4.9% | 1213.17 | 454.36 | 62.5% |
| 2048/64 | 1420.30 | 1357.43 | 4.4% | 1226.86 | 460.53 | 62.5% |
| 2048/128 | 1438.00 | 1364.08 | 5.1% | 1240.25 | 466.45 | 62.4% |

Large Documents
| Chunk | Baseline (µs) | Optimized (µs) | Perf Gain | Baseline (KB) | Optimized (KB) | Alloc Reduction |
|---|---|---|---|---|---|---|
| 512/0 | 14333.60 | 13579.96 | 5.2% | 10740.19 | 4084.48 | 62.0% |
| 512/64 | 14889.53 | 13956.63 | 6.3% | 11394.80 | 4374.99 | 61.6% |
| 512/128 | 15610.82 | 14532.23 | 6.9% | 12275.76 | 4762.45 | 61.2% |
| 1024/0 | 14293.71 | 13700.90 | 4.2% | 10633.63 | 4043.39 | 62.0% |
| 1024/64 | 14435.81 | 13786.02 | 4.5% | 10931.39 | 4175.95 | 61.8% |
| 1024/128 | 14579.98 | 13913.99 | 4.6% | 11273.66 | 4328.09 | 61.6% |
| 2048/0 | 13971.48 | 13534.53 | 3.1% | 10580.09 | 4022.77 | 62.0% |
| 2048/64 | 14157.79 | 13762.70 | 2.8% | 10722.42 | 4085.99 | 61.9% |
| 2048/128 | 14464.31 | 13927.94 | 3.7% | 10873.30 | 4153.47 | 61.8% |

Checklist

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have updated the documentation (if necessary).
  • I have added appropriate unit tests (if applicable).

Additional Notes

NOTE: In scenario 2, with a reusable flattened list, the allocation improvements are huge, but the implication is that TokenTextChunker would no longer be thread-safe. That is a design choice I don't feel comfortable making, given my limited knowledge of the intended use and its trade-offs.

The implementation would look like this:

```csharp
private List<(int SliceIndex, int Token)> _flattened;

public IReadOnlyList<TextChunk> Chunk(IReadOnlyList<ChunkSlice> slices, ChunkingConfig config)
{
    ...
    _flattened ??= new List<(int SliceIndex, int Token)>(capacity: 4096);
    _flattened.Clear();
}
```
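
For what it's worth, a hypothetical middle ground (not proposed in this PR, purely a sketch with assumed names) would be per-thread reuse via `[ThreadStatic]`, which keeps the chunker thread-safe while still amortizing the list allocation per thread:

```csharp
using System;
using System.Collections.Generic;

// Sketch only: each thread gets its own reusable buffer, so concurrent
// Chunk calls cannot corrupt shared state, and Clear() keeps the backing
// array so no reallocation happens after the first call per thread.
public static class PerThreadBufferSketch
{
    [ThreadStatic]
    private static List<(int SliceIndex, int Token)>? _flattened;

    public static List<(int SliceIndex, int Token)> GetFlattened()
    {
        var list = _flattened ??= new List<(int SliceIndex, int Token)>(capacity: 4096);
        list.Clear();
        return list;
    }

    public static void Main()
    {
        var a = GetFlattened();
        a.Add((0, 42));
        var b = GetFlattened(); // same instance on this thread, cleared
        Console.WriteLine(ReferenceEquals(a, b) && b.Count == 0); // True
    }
}
```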

codecov bot commented Dec 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.46%. Comparing base (feb6294) to head (22e277d).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main       #8      +/-   ##
==========================================
+ Coverage   75.24%   75.46%   +0.22%     
==========================================
  Files         115      115              
  Lines        4751     4757       +6     
  Branches      798      798              
==========================================
+ Hits         3575     3590      +15     
+ Misses        861      854       -7     
+ Partials      315      313       -2     

@KSemenenko
Member

As usual, amazing PR
thanks a lot!

@KSemenenko KSemenenko merged commit 45b370f into managedcode:main Dec 9, 2025
5 checks passed