Idle connection management #3866
base: master
Conversation
Hi @praboud, thank you for your contribution! We will take a look at it soon.
Pull request overview
This PR adds idle connection management to ConnectionPool and BlockingConnectionPool to automatically close connections that haven't been used for a configurable timeout period. This feature is disabled by default and activated by setting the idle_connection_timeout parameter. A singleton IdleConnectionCleanupManager with a background thread handles cleanup for all pools efficiently, reducing connection overhead during low-traffic periods.
Key Changes:
- Introduced PooledConnection wrapper to track last-used timestamps
- Implemented IdleConnectionCleanupManager singleton with heapq-based scheduling
- Added idle_connection_timeout and idle_check_interval parameters to connection pools and the Redis client (see the usage sketch below)
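As a quick illustration of the new knobs, a hedged usage sketch. The parameter names come from this PR; the host/port values and the specific timeouts chosen are placeholders:

```python
import redis

# Opt in: close connections that sit unused for more than 5 minutes,
# checking for idle connections every 30 seconds.
pool = redis.ConnectionPool(
    host="localhost",
    port=6379,
    idle_connection_timeout=300.0,  # None (the default) disables cleanup
    idle_check_interval=30.0,
)
client = redis.Redis(connection_pool=pool)

# The same parameters are also accepted by the Redis client constructor,
# which forwards them to the pool it creates (per the redis/client.py change).
client = redis.Redis(host="localhost", idle_connection_timeout=300.0)
```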
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| redis/connection.py | Core implementation: added PooledConnection class, IdleConnectionCleanupManager singleton, and _cleanup_idle_connections methods to both pool types |
| redis/client.py | Added idle_connection_timeout and idle_check_interval parameters to Redis client constructor |
| redis/cluster.py | Added new parameters to cluster connection pool configuration allowlist |
| tests/test_connection_pool.py | Comprehensive test suite for idle connection cleanup including mock time utilities and integration tests |
| tests/test_multiprocessing.py | Updated to handle PooledConnection wrapper in assertions |
| tests/test_maint_notifications.py | Updated to extract connection from PooledConnection wrapper |
| tests/test_maint_notifications_handling.py | Refactored to use new _get_free_connections() method |
Comments suppressed due to low confidence (1)
redis/connection.py:551
- This import of module logging is redundant, as it was previously imported on line 4.
```python
import logging
```
```python
                e,
                exc_info=True,
            )
```
Copilot AI · Dec 5, 2025
Missing return statement. The function signature declares -> Optional[float] and the docstring states it returns "Timestamp of the oldest remaining connection, or None if pool is empty." However, the function doesn't return oldest_connection_time at the end. It should return this value so the cleanup manager can properly reschedule the pool.
Add at the end of the function:
```python
return oldest_connection_time
```
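For orientation, a hedged sketch of the contract the comment describes, shown as it would sit on the pool class. The method name and _get_free_connections() appear elsewhere in this PR, but the last_used attribute, the timeout lookup, and the use of time.monotonic() are assumptions for illustration only:

```python
import time
from typing import Optional

def _cleanup_idle_connections(self) -> Optional[float]:
    """Close idle connections; return the timestamp of the oldest
    remaining connection, or None if the pool is empty."""
    now = time.monotonic()
    oldest_connection_time: Optional[float] = None
    for conn in list(self._get_free_connections()):
        if now - conn.last_used >= self.idle_connection_timeout:
            conn.disconnect()  # close the idle socket
        elif oldest_connection_time is None or conn.last_used < oldest_connection_time:
            oldest_connection_time = conn.last_used
    # The fix the review asks for: without this return, the cleanup
    # manager always sees None and cannot reschedule the pool.
    return oldest_connection_time
```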
redis/connection.py
Outdated
```python
try:
    if self._in_maintenance:
        self._lock.acquire()
        self._locked = True
    yield
finally:
    if self._locked:
        try:
            self._lock.release()
        except Exception:
            pass
        self._locked = False
```
Copilot AI · Dec 5, 2025
Potential concurrency issue with _locked flag. The _locked flag is used to track whether the lock was acquired, but it's not thread-safe. If multiple threads call _maintenance_lock() simultaneously when _in_maintenance is True, they could both see _locked=False, both acquire the lock (one will block), and both set _locked=True. Then in the finally block, the first thread to exit will set _locked=False, causing the second thread to potentially not release the lock or encounter issues.
Consider using a thread-local variable or a more robust locking pattern. For example:
```python
@contextmanager
def _maintenance_lock(self):
    acquired = False
    try:
        if self._in_maintenance:
            self._lock.acquire()
            acquired = True
        yield
    finally:
        if acquired:
            try:
                self._lock.release()
            except Exception:
                pass
```
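The key difference in the suggested fix: `acquired` is a local variable, so each call (and therefore each thread) gets its own copy on its own stack frame, whereas `self._locked` is a single attribute shared and mutated by every concurrent caller.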
```python
idle_connection_timeout: Optional[float] = None,
idle_check_interval: float = 60.0,
```
Copilot AI · Dec 5, 2025
Missing documentation for new parameters. The idle_connection_timeout and idle_check_interval parameters are added to the __init__ method but are not documented in the docstring. These parameters should be documented in the Args section to help users understand their purpose and usage.
Consider adding documentation like:
```python
idle_connection_timeout:
    If set, connections that have been idle (not used) for longer than
    this timeout (in seconds) will be automatically closed. If None
    (the default), idle connections are never closed. Only used when
    connection_pool is not provided.
idle_check_interval:
    How frequently (in seconds) to check for idle connections. Defaults
    to 60 seconds. Only used when idle_connection_timeout is set and
    connection_pool is not provided.
```
redis/connection.py
Outdated
```python
if wait_time > 0:
    # Sleep until next scheduled check (or until notified)
    self._condition.wait(timeout=wait_time)
```
Copilot AI · Dec 5, 2025
Missing return statement after wait(). When wait_time > 0, the function waits but then falls through to execute the code below (lines 2536-2548) instead of returning None. This could cause the function to pop from the schedule and process the pool immediately after waking up, even if the wait was interrupted before the scheduled time.
Add return None after the wait:
```python
if wait_time > 0:
    # Sleep until next scheduled check (or until notified)
    self._condition.wait(timeout=wait_time)
    return None
```
redis/connection.py
Outdated
```python
with self._condition:
    self._reschedule_pool(metadata, oldest_conn_time)
    # the pool after the pool is rescheduled, we can clean up the WAL
```
Copilot AI · Dec 5, 2025
Comment has a grammatical error. Should be "After the pool is rescheduled" instead of "the pool after the pool is rescheduled".
```python
# After the pool is rescheduled, we can clean up the WAL
```
redis/connection.py
Outdated
```python
self.timeout = timeout
self._in_maintenance = False
self._locked = False
self.pool: Queue[PooledConnection | None] = self.queue_class(max_connections)
```
Copilot AI · Dec 5, 2025
Incorrect initialization order. The self.pool attribute is assigned on line 3071 before calling super().__init__() on line 3072. However, super().__init__() calls reset() (via the parent ConnectionPool), and reset() (defined at line 3080) also assigns self.pool at line 3083. This means:
- Line 3071 initializes self.pool
- Line 3072 calls super().__init__(), which eventually calls reset()
- Line 3083 (in reset()) overwrites self.pool, making line 3071 redundant
The assignment on line 3071 should be removed, as reset() will properly initialize the pool.
```python
self.pool: Queue[PooledConnection | None] = self.queue_class(max_connections)
```
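To make the ordering concrete, a simplified, runnable sketch of the pattern being flagged. The class and method names follow the PR, but the bodies are illustrative, not the actual implementation:

```python
from queue import Queue


class ConnectionPool:
    def __init__(self, max_connections=50):
        self.max_connections = max_connections
        self.reset()  # base __init__ ends by resetting pool state

    def reset(self):
        self._available_connections = []


class BlockingConnectionPool(ConnectionPool):
    def __init__(self, max_connections=50, timeout=20, queue_class=Queue):
        self.queue_class = queue_class
        self.timeout = timeout
        # Redundant: super().__init__() below triggers reset(), which
        # immediately overwrites this attribute.
        self.pool = self.queue_class(max_connections)
        super().__init__(max_connections)

    def reset(self):
        # The assignment that actually takes effect.
        self.pool = self.queue_class(self.max_connections)


pool = BlockingConnectionPool()  # pool.pool is the Queue created in reset()
```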
tests/test_connection_pool.py
Outdated
```python
mock_datetime.datetime.side_effect = lambda *args, **kwargs: datetime.datetime(
    *args, **kwargs
)
```
Copilot AI · Dec 5, 2025
This 'lambda' is just a simple wrapper around a callable object. Use that object directly.
```python
mock_datetime.datetime.side_effect = datetime.datetime
```
redis/connection.py
Outdated
```python
except Exception:
    pass
```
Copilot AI · Dec 5, 2025
'except' clause does nothing but pass and there is no explanatory comment.
```python
except Exception as e:
    logger.warning(
        "Error releasing maintenance lock in BlockingConnectionPool: %s",
        e,
        exc_info=True,
    )
```
Description of change
Adds idle connection management to ConnectionPool. This is disabled by default, but clients can configure an idle_connection_timeout. If set, connections that aren't used for longer than the configured timeout are closed automatically by a background thread. (There's a singleton IdleConnectionCleanupManager which schedules these checks for all connection pools, to avoid needing to allocate a separate thread for each pool; a sketch of this pattern follows below.)

Right now, because connections are never closed once opened, clients have a "high water mark" property: the client will hold open as many connections as it needed during the busiest period that client has experienced. This can cause far more connections to be used than are really needed, as well as making the number of connections used difficult to reason about. I've deployed this patch to production and seen a roughly 2x reduction in connections used in practice, on a large deployment of clients.
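To illustrate the scheduling pattern described above, a minimal, self-contained sketch of a heapq-based singleton cleanup manager. The class name matches the PR; the pool interface (idle_check_interval, _cleanup_idle_connections) and all internals here are assumptions for illustration. The PR also uses the sweep's return value to reschedule; this sketch reschedules at a fixed interval for brevity:

```python
import heapq
import itertools
import threading
import time


class IdleConnectionCleanupManager:
    """Single background thread that sweeps idle connections for all pools."""

    _instance = None
    _instance_lock = threading.Lock()

    @classmethod
    def instance(cls):
        # Singleton accessor guarded by a class-level lock.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def __init__(self):
        self._condition = threading.Condition()
        self._counter = itertools.count()  # tie-breaker: pools never compared
        self._schedule = []  # min-heap of (next_check_time, seq, pool)
        self._thread = None

    def register(self, pool):
        """Schedule a pool for periodic idle checks; start the thread lazily."""
        with self._condition:
            heapq.heappush(
                self._schedule,
                (time.monotonic() + pool.idle_check_interval,
                 next(self._counter), pool),
            )
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()
            self._condition.notify()

    def _run(self):
        while True:
            with self._condition:
                if not self._schedule:
                    self._condition.wait()
                    continue
                next_check, _, pool = self._schedule[0]
                wait_time = next_check - time.monotonic()
                if wait_time > 0:
                    # Sleep until the next check is due (or until notified),
                    # then loop back to re-examine the heap; continuing here
                    # is what prevents sweeping a pool ahead of schedule.
                    self._condition.wait(timeout=wait_time)
                    continue
                heapq.heappop(self._schedule)
            # Sweep outside the condition so socket teardown doesn't block
            # other threads registering pools.
            pool._cleanup_idle_connections()
            with self._condition:
                heapq.heappush(
                    self._schedule,
                    (time.monotonic() + pool.idle_check_interval,
                     next(self._counter), pool),
                )
```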
Pull Request check-list
Please make sure to review and check all of these items:
NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.