Open
Labels
enhancement: New feature or request
Description
Summary
Implement object pooling for frequently allocated request and response objects to reduce GC pressure and improve performance in high-throughput scenarios.
Current Behavior
Every API call creates new objects:
var request = new ChatCompletionRequest
{
    Model = "model",
    Messages = new List<ChatMessage> { ... } // New allocation
};
// Request object becomes garbage after use
Proposed Solution
Use Microsoft.Extensions.ObjectPool or a custom pooling implementation:
using Microsoft.Extensions.ObjectPool;

// All members are static, so the class is declared static as well.
public static class ChatCompletionRequestPool
{
    private static readonly ObjectPool<ChatCompletionRequest> Pool =
        ObjectPool.Create<ChatCompletionRequest>();

    public static ChatCompletionRequest Rent() => Pool.Get();

    public static void Return(ChatCompletionRequest request)
    {
        request.Reset(); // Clear for reuse
        Pool.Return(request);
    }
}
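// Usage with IDisposable pattern.
// Assumes ChatCompletionRequest implements IDisposable and that Dispose()
// calls ChatCompletionRequestPool.Return(this); otherwise `using` will not compile.
using var request = ChatCompletionRequestPool.Rent();
request.Model = "model";
request.Messages.Add(ChatMessage.User("Hello"));
var response = await client.CreateChatCompletionAsync(request);
Alternative: Struct-Based DTOs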
For simple requests, consider struct-based DTOs to avoid heap allocations entirely:
public readonly struct ChatCompletionRequestBuilder
{
    // Build request without allocations
}
Expected Benefits
- ~10% reduction in allocated memory in high-throughput scenarios
- Reduced GC pauses - Fewer Gen0/Gen1 collections
- Lower latency variance - More consistent performance
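The benefits listed above are estimates, so it may help to measure allocations before and after pooling. The following is a minimal sketch of one way to do that with `GC.GetAllocatedBytesForCurrentThread()`. `AllocationProbe` and `ProbeDemo` are hypothetical helper names, not part of this library, and the list-based workload only stands in for building a real request. Actual numbers depend on the runtime version and workload.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the library.
public static class AllocationProbe
{
    // Returns the bytes allocated on the current thread while running
    // `action` `iterations` times (after one warm-up call).
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up: keeps JIT and one-time allocations out of the result
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}

public static class ProbeDemo
{
    public static void Run()
    {
        object sink = null;                 // keeps each fresh list reachable
        var reused = new List<string>(4);   // stands in for a pooled object

        // Fresh allocation on every call, as in the current behavior.
        long fresh = AllocationProbe.Measure(
            () => sink = new List<string>(4) { "Hello" }, 10_000);

        // Reuse of one instance, as a pool would allow.
        long pooled = AllocationProbe.Measure(
            () => { reused.Clear(); reused.Add("Hello"); }, 10_000);

        Console.WriteLine($"fresh: {fresh} bytes, reused: {pooled} bytes");
        GC.KeepAlive(sink);
    }
}
```

For publishable numbers, a dedicated benchmarking harness with a memory diagnoser would be more reliable than this hand-rolled probe, which only counts allocations on the calling thread.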
Priority
🟢 P2 - Medium Impact
Considerations
- Pooling adds complexity - only worth it for high-frequency usage
- Need to ensure proper reset/cleanup of pooled objects
- Consider making this opt-in for users who need it
- Thread-safety must be handled correctly
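The considerations above could be addressed along these lines. This is a sketch of the "custom pooling implementation" option, not the library's actual design: `SimplePool<T>`, its `Lease` type, and the simplified `ChatCompletionRequest` stand-in (with `Messages` as a list of strings) are all hypothetical. It resets objects on return, relies on `ConcurrentBag<T>` for thread safety, and caps how many instances are retained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the library's request type; the real shape may differ.
public sealed class ChatCompletionRequest
{
    public string Model { get; set; } = "";
    public List<string> Messages { get; } = new List<string>();

    // Clears per-call state but keeps the list's backing array for reuse.
    public void Reset()
    {
        Model = "";
        Messages.Clear();
    }
}

// Minimal thread-safe pool; ConcurrentBag handles concurrent Rent/Return calls.
public sealed class SimplePool<T> where T : class
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();
    private readonly Func<T> _create;
    private readonly Action<T> _reset;
    private readonly int _maxRetained;
    private int _count; // approximate number of retained items

    public SimplePool(Func<T> create, Action<T> reset, int maxRetained = 64)
    {
        _create = create;
        _reset = reset;
        _maxRetained = maxRetained;
    }

    public T Rent()
    {
        if (_items.TryTake(out var item))
        {
            Interlocked.Decrement(ref _count);
            return item;
        }
        return _create(); // pool empty: fall back to a normal allocation
    }

    public void Return(T item)
    {
        _reset(item); // reset before the object can be handed out again
        if (Interlocked.Increment(ref _count) <= _maxRetained)
        {
            _items.Add(item);
        }
        else
        {
            Interlocked.Decrement(ref _count); // pool full: let the GC reclaim it
        }
    }

    // Wraps a rented object so that `using` returns it to the pool.
    public Lease RentLease() => new Lease(this, Rent());

    public readonly struct Lease : IDisposable
    {
        private readonly SimplePool<T> _pool;
        public T Value { get; }

        internal Lease(SimplePool<T> pool, T value)
        {
            _pool = pool;
            Value = value;
        }

        public void Dispose() => _pool.Return(Value);
    }
}
```

Callers who opt in would create one shared pool and write `using var lease = pool.RentLease();`, then work with `lease.Value`. Everyone else keeps allocating requests with `new`, which keeps pooling opt-in. A request must not be used after its lease is disposed, since another thread may already have rented it.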