Celery Integration for Poolboy #147
Open
Overview
This PR introduces Celery-based distributed task processing for Poolboy, enabling horizontal scalability and improved performance through asynchronous background processing of resource management operations.
Motivation
Poolboy supports a manager mode architecture where multiple operator replicas distribute event handling across pods. This approach provides good reliability and basic load distribution.
Celery integration complements this architecture by adding asynchronous background processing, horizontal worker scaling, and improved throughput under load.
Solution Architecture
The implementation adds three new Kubernetes components alongside the existing Poolboy operator:
Processing Flow
When workers are enabled:
- Uses Celery's `group` primitive to batch-process multiple resources in parallel

The operator falls back to synchronous processing when workers are disabled, ensuring backward compatibility.
Key Features
Feature Flags per Resource Type
Each resource class can be individually enabled for worker processing:
- `useWorkers.resourcePool.enabled`
- `useWorkers.resourceHandle.enabled`
- `useWorkers.resourceClaim.enabled`

This allows gradual migration and rollback per component.
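A gradual rollout might then look like the following `values.yaml` fragment; only the `useWorkers.*` key names come from this PR, the surrounding structure and values are assumptions.

```yaml
# Illustrative values.yaml fragment; values shown are assumptions.
useWorkers:
  resourcePool:
    enabled: true
  resourceHandle:
    enabled: true
  resourceClaim:
    enabled: false  # keep synchronous while migrating one component at a time
```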
Operation Modes
Three modes control how periodic reconciliation is handled:
- `scheduler`
- `daemon`
- `both`

Partitioned Queues
Tasks are routed to partitioned queues using consistent hashing of `namespace/name`.

Distributed Locking
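A minimal sketch of that routing idea: hash the `namespace/name` key and map it to a fixed set of queues, so all tasks for one resource always land on the same queue. The queue-name pattern and partition count are assumptions.

```python
# Sketch of routing by hashing "namespace/name"; partition count and
# queue-name pattern are assumptions, not Poolboy's actual values.
import hashlib

NUM_PARTITIONS = 4  # assumed


def route_task(namespace: str, name: str) -> str:
    """Return the queue name for a resource, stable across calls."""
    key = f"{namespace}/{name}".encode()
    partition = int(hashlib.sha256(key).hexdigest(), 16) % NUM_PARTITIONS
    return f"poolboy-{partition}"
```

Since the mapping is deterministic, updates for a given resource are serialized through a single queue, which reduces the chance of two workers touching the same object at once.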
Redis-based distributed locks prevent concurrent processing of the same resource across workers:
Shared Redis Cache
A unified cache system enables state sharing between operator and workers:
Horizontal Pod Autoscaler
Workers support HPA based on CPU/memory utilization for automatic scaling under load.
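For reference, a CPU-based HPA for the worker Deployment might look like the manifest below; the object names, replica bounds, and utilization target are assumptions, not the chart's actual values.

```yaml
# Illustrative HPA for the worker Deployment; names and thresholds assumed.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: poolboy-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: poolboy-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
```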
Components Migrated
Note: ResourceClaim binding still occurs in the operator to leverage in-memory cache for pool handle discovery. Once bound, subsequent updates are processed by workers.
Configuration
All configuration is managed through Helm `values.yaml`.

Observability