prov/rxd: optimize TX critical path #11967
I see that the CI failures are related. I'll fix them in the next commit.
To get you started with fixing the errors: please add descriptions of the changes to the commits, along with sign-off messages (that should quickly fix your DCO failure). Here are some of the common errors from Intel CI:
Here are the AppVeyor ones:
@zachdworkin would you please share the Intel Jenkins logs with me? Thanks!
You are only creating failures in fabtests. The other middlewares are reporting passes (for now).
Fabtests UDP failures: 1 (reg, dbg, dl builds all the same)
Fabtests verbs;ofi_rxd failures: 2 (reg, dbg, dl builds all the same)
server: fi_rdm_atomic -o all -I 1000 -U -p "verbs;ofi_rxd" -b -s n1
Thank you! I can reproduce it locally.
@zachdworkin I've been debugging this locally, and the fi_rdm_atomic tests should hopefully be fixed now. Am I understanding correctly that fi_rdm_tagged_peek with UDP still fails? For some reason I can't reproduce it on my M1 Mac or x86/ConnectX-5 testbeds. Thank you!
The rdm_tagged_peek test is passing now. I ran it 50 times without failure. The only failing test now is inside ubertest, and it's #14. My log isn't telling me why it's failing. I can look into it more if you need.
This PR introduces several optimizations in the RxD provider datapath to improve throughput and latency:
Baseline performance on CX-5 100G testbed:
Performance after this PR: