Currently, if the InfiniStore server restarts, clients lose their connection and cannot reconnect automatically, even if the server keeps the same IP and port. This hurts system resilience and availability during routine restarts and failure recovery.
Expected Behavior
- The client should detect the disconnection.
- Once the server is back online (same IP/port), the client should attempt to reconnect without requiring manual intervention.
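
The behavior above could be sketched as a retry loop with exponential backoff. This is only an illustration of the requested behavior, not InfiniStore's actual client API: `reconnect_with_backoff`, the `connect` callable, and all parameter names and defaults are hypothetical, and the sketch assumes a failed connection attempt surfaces as an `OSError` (which includes `ConnectionError`).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def reconnect_with_backoff(
    connect: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, doubling the wait after each failure.

    `connect` is a hypothetical zero-argument callable that opens a
    connection to the same IP/port and raises OSError while the server
    is still down.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            sleep(delay)
            # Exponential backoff, capped so waits do not grow unbounded.
            delay = min(delay * 2, max_delay)

    raise AssertionError("unreachable")
```

A client would call this when an operation fails with a connection error, passing a callable that re-establishes the connection, and then retry the failed operation. Capping the delay keeps recovery time bounded once the server is back online.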
Use Case
This enhancement is critical for vLLM deployments that rely on InfiniStore, where uptime and robustness are important. For example, rolling out a new version of the KV cache server should not require restarting the engine.