fix(pool): preserve Cache readiness with clones#297
Conversation
340c0c8 to
a9052dc
Compare
|
I looked at this and it fixes the tower contract, but I don't think it solves the underlying issue I hit trying to put connection limits in the connect stack below the cache. I had a connection limit layer in front of the connector, reserves a semaphore permit in // Cache::poll_ready
if let Some(svc) = self.shared.lock().unwrap().take() {
self.ready = Ready::Cached(svc);
return Poll::Ready(Ok(()));
}
self.connector.poll_ready(cx) // <-- capped worker parks hereUnder sustained reuse (512 concurrent, connection limit@64, H1) it deadlocks — zero throughput, 64/65 worker threads in // Shared::put (idle return)
while let Some(tx) = self.waiters.pop() { ... } // wakes a waiter
// Cache::call, miss path
locked.waiters.push(tx); // only reachable AFTER poll_ready returned ReadySo a worker parked in This PR is right for what it fixes, a clone can't |
Previously, a Cache was considered ready if the shared list of services was not empty. But, since Cache could be cloned, it was possible to observe on one clone that services was non-empty, a second clone could take a cached service, and then the first clone at
callwould see the list empty and need to thenconnector.call()without having polled its readiness.The proposed solution here is that if a Cache polls for readiness and sees that there is a service available, it moves that out of the list and into a per-clone slot. Then a later
callcan be sure that either a service is in its owned slot, or it calledconnector.poll_ready().cc @aajtodd