Investigate RDMA-to-TCP Fallback Mode for Broader Availability #171

@Jeffwan

Description

@DwyaneShi proposed this earlier in an offline discussion; I am creating an issue to track it. In environments where RDMA is not consistently available, the system currently cannot operate at all if RDMA initialization fails. One possible improvement is a fallback mode that uses TCP instead.

Open Questions & Discussion Points

  • Should the client exit or fall back when RDMA is unavailable?
  • If fallback is allowed, how do we inform the user or log the performance degradation clearly?
  • Are there use cases where TCP fallback is preferred for availability over strict RDMA-only operation?
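
To make the first two questions concrete, here is a minimal sketch of one possible shape for the behavior. It is an illustration only, not the project's API: `FallbackPolicy`, `RdmaUnavailableError`, `select_transport`, and the `init_rdma` / `init_tcp` callables are all hypothetical names. It assumes RDMA initialization signals failure by raising an exception, and that the choice between exiting and falling back is an explicit opt-in setting.

```python
import logging
from enum import Enum

logger = logging.getLogger("transport")


class FallbackPolicy(Enum):
    """Hypothetical setting: what to do when RDMA cannot be initialized."""
    STRICT = "strict"      # propagate the error (RDMA-only operation)
    FALLBACK = "fallback"  # degrade to TCP and warn loudly


class RdmaUnavailableError(RuntimeError):
    """Hypothetical error raised by init_rdma when RDMA setup fails."""


def select_transport(init_rdma, init_tcp, policy=FallbackPolicy.STRICT):
    """Try RDMA first; on failure, either re-raise or fall back to TCP.

    init_rdma and init_tcp are caller-supplied callables (assumed here)
    that return a connection object. Returns (transport_name, connection).
    """
    try:
        return "rdma", init_rdma()
    except RdmaUnavailableError as exc:
        if policy is FallbackPolicy.STRICT:
            # RDMA-only mode: surface the failure to the caller.
            raise
        # Fallback mode: make the degradation visible in the logs.
        logger.warning(
            "RDMA initialization failed (%s); falling back to TCP. "
            "Expect significantly lower performance.",
            exc,
        )
        return "tcp", init_tcp()
```

Defaulting to `STRICT` in this sketch keeps today's behavior unless the user opts in, so a silent performance regression cannot happen by accident. Whether that is the right default is exactly what the questions above are asking.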

Tradeoffs

  • TCP fallback ensures availability but significantly degrades performance.
  • RDMA-only mode is optimal for performance but less fault-tolerant.

I am opening this issue for discussion only. It is a low-priority item, and we can discuss it later.
