Add object pooling for request/response DTOs #4

@GmausDev

Summary

Implement object pooling for frequently allocated request and response objects to reduce GC pressure and improve performance in high-throughput scenarios.

Current Behavior

Every API call creates new objects:

var request = new ChatCompletionRequest 
{
    Model = "model",
    Messages = new List<ChatMessage> { ... }  // New allocation
};
// Request object becomes garbage after use

Proposed Solution

Use Microsoft.Extensions.ObjectPool or a custom pooling implementation:

using Microsoft.Extensions.ObjectPool;

public static class ChatCompletionRequestPool
{
    private static readonly ObjectPool<ChatCompletionRequest> Pool =
        ObjectPool.Create<ChatCompletionRequest>();

    public static ChatCompletionRequest Rent() => Pool.Get();

    public static void Return(ChatCompletionRequest request)
    {
        request.Reset(); // Proposed new method: clears all state before reuse
        Pool.Return(request);
    }
}

// Usage: rent, then always return in finally (`using var` would need IDisposable)
var request = ChatCompletionRequestPool.Rent();
try
{
    request.Model = "model";
    request.Messages.Add(ChatMessage.User("Hello"));
    var response = await client.CreateChatCompletionAsync(request);
}
finally { ChatCompletionRequestPool.Return(request); }
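
If a `using`-style call site is preferred, one option is to hand out a small disposable lease and move the reset logic into a pool policy. The following is only a sketch under assumptions: the `ChatMessage` and `ChatCompletionRequest` classes here are simplified stand-ins for the real DTOs, and `ChatCompletionRequestPolicy` and `PooledRequest` are hypothetical names. It relies on the `Microsoft.Extensions.ObjectPool` NuGet package and its `IPooledObjectPolicy<T>` interface.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.ObjectPool; // NuGet package: Microsoft.Extensions.ObjectPool

// Simplified stand-ins for the library's DTOs (assumed shapes).
public sealed class ChatMessage
{
    public string Role { get; init; } = "";
    public string Content { get; init; } = "";
    public static ChatMessage User(string content) => new() { Role = "user", Content = content };
}

public sealed class ChatCompletionRequest
{
    public string? Model { get; set; }
    public List<ChatMessage> Messages { get; } = new();
}

// Hypothetical policy: the pool calls Return() before it stores an object again.
public sealed class ChatCompletionRequestPolicy : IPooledObjectPolicy<ChatCompletionRequest>
{
    public ChatCompletionRequest Create() => new();

    public bool Return(ChatCompletionRequest request)
    {
        request.Model = null;
        request.Messages.Clear(); // keeps the list's backing array for reuse
        return true;              // returning false tells the pool to drop the object
    }
}

// Hypothetical lease: returns the request to the pool on Dispose, so `using var` works.
public readonly struct PooledRequest : IDisposable
{
    private static readonly ObjectPool<ChatCompletionRequest> Pool =
        ObjectPool.Create<ChatCompletionRequest>(new ChatCompletionRequestPolicy());

    public ChatCompletionRequest Value { get; }

    private PooledRequest(ChatCompletionRequest value) => Value = value;

    public static PooledRequest Rent() => new(Pool.Get());

    public void Dispose() => Pool.Return(Value);
}
```

A call site would then read `using var lease = PooledRequest.Rent();` followed by work on `lease.Value`. The lease must not be copied and disposed twice, and `lease.Value` must not be used after disposal, since another caller may already have rented the same object.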

Alternative: Struct-Based DTOs

For simple requests, consider struct-based DTOs, which avoid a heap allocation for the request object itself:

public readonly struct ChatCompletionRequestBuilder
{
    // Build request without allocations
}
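
To make the stub above more concrete, here is one possible shape, offered as an assumption rather than a settled design: a single model name plus a single user message, with no collections. The members `Model`, `UserMessage`, `Temperature`, and `WithTemperature` are hypothetical. The strings it references still live on the heap, and serializing the request will still allocate; only the wrapper object is avoided.

```csharp
using System;

// Hypothetical shape: one model name plus one user message, no collections.
public readonly struct ChatCompletionRequestBuilder
{
    public string Model { get; }
    public string UserMessage { get; }
    public float? Temperature { get; }

    public ChatCompletionRequestBuilder(string model, string userMessage, float? temperature = null)
    {
        Model = model ?? throw new ArgumentNullException(nameof(model));
        UserMessage = userMessage ?? throw new ArgumentNullException(nameof(userMessage));
        Temperature = temperature;
    }

    // Returns a modified copy; copying a small struct does not touch the heap.
    public ChatCompletionRequestBuilder WithTemperature(float temperature) =>
        new(Model, UserMessage, temperature);
}
```

Passing such a struct as an `in` parameter avoids copying it at each call. Casting it to `object` or to an interface would box it and reintroduce the allocation.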

Expected Benefits

  • ~10% memory reduction in high-throughput scenarios
  • Reduced GC pauses - Fewer Gen0/Gen1 collections
  • Lower latency variance - More consistent performance
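
These expectations could be checked before and after the change with a rough allocation probe. The helper below is a sketch, and `AllocationProbe` is a hypothetical name. It uses `GC.GetAllocatedBytesForCurrentThread()`, so it only sees allocations made on the calling thread. A benchmarking library such as BenchmarkDotNet with its memory diagnoser would give more rigorous numbers.

```csharp
using System;

public static class AllocationProbe
{
    // Bytes allocated on the current thread while running `action` `iterations` times.
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up, so one-time JIT and lazy-init allocations are excluded
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Running `Measure` once over a delegate that builds a fresh request and once over a delegate that rents and returns a pooled one gives a first-order comparison of bytes allocated per call.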

Priority

🟢 P2 - Medium Impact

Considerations

  • Pooling adds complexity - only worth it for high-frequency usage
  • Need to ensure proper reset/cleanup of pooled objects
  • Consider making this opt-in for users who need it
  • Thread-safety must be handled correctly
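
The considerations above (opt-in, reset on return, thread safety) can be sketched together in a small custom pool that needs only the base class library. `SimplePool<T>` and its `enabled` flag are hypothetical names, not an existing API. When pooling is disabled, the type degrades to plain allocation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical opt-in pool: with enabled == false it simply allocates.
public sealed class SimplePool<T> where T : class
{
    private readonly ConcurrentBag<T> _items = new();
    private readonly Func<T> _create;
    private readonly Action<T> _reset;
    private readonly bool _enabled;
    private readonly int _maxRetained;
    private int _count; // upper bound on the number of retained items

    public SimplePool(Func<T> create, Action<T> reset, bool enabled, int maxRetained = 64)
    {
        _create = create ?? throw new ArgumentNullException(nameof(create));
        _reset = reset ?? throw new ArgumentNullException(nameof(reset));
        _enabled = enabled;
        _maxRetained = maxRetained;
    }

    public T Rent()
    {
        if (_enabled && _items.TryTake(out var item))
        {
            Interlocked.Decrement(ref _count);
            return item;
        }
        return _create();
    }

    public void Return(T item)
    {
        if (!_enabled) return; // opt-out: let the GC reclaim the object

        _reset(item); // reset before the object becomes visible to other threads

        if (Interlocked.Increment(ref _count) <= _maxRetained)
        {
            _items.Add(item);
        }
        else
        {
            Interlocked.Decrement(ref _count); // pool is full: drop the object
        }
    }
}
```

`ConcurrentBag<T>` handles concurrent `Rent` and `Return` calls. The pool cannot protect against a caller that keeps using an object after returning it, so that rule would need to be documented for users who opt in.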

Metadata

Assignees

No one assigned

Labels

enhancement: New feature or request
