Spent a bit of time looking at some parts of the benchmarking setup, and had a couple of notes and comments:
- I think we're using iperf wrong. We are reading the sender's numbers, but we should be looking at the receiver's. Notice how different the Bitrate is for sender vs. receiver in this example:
$ iperf -c 127.0.0.1 -u -b 10g
Connecting to host 127.0.0.1, port 5201
[ 5] local 127.0.0.1 port 50191 connected to 127.0.0.1 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-1.00 sec 1.16 GBytes 10.0 Gbits/sec 38132
[ 5] 1.00-2.00 sec 1.16 GBytes 10.0 Gbits/sec 38161
[ 5] 2.00-3.00 sec 1.16 GBytes 9.99 Gbits/sec 38106
[ 5] 3.00-4.00 sec 1.17 GBytes 10.0 Gbits/sec 38184
[ 5] 4.00-5.00 sec 1.16 GBytes 10.0 Gbits/sec 38151
[ 5] 5.00-6.00 sec 1.16 GBytes 10.0 Gbits/sec 38143
[ 5] 6.00-7.00 sec 1.16 GBytes 9.99 Gbits/sec 38114
[ 5] 7.00-8.00 sec 1.16 GBytes 10.0 Gbits/sec 38165
[ 5] 8.00-9.00 sec 1.16 GBytes 9.99 Gbits/sec 38140
[ 5] 9.00-10.00 sec 1.16 GBytes 10.0 Gbits/sec 38169
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.00 sec 11.6 GBytes 10.0 Gbits/sec 0.000 ms 0/381465 (0%) sender
[ 5] 0.00-10.00 sec 8.43 GBytes 7.24 Gbits/sec 0.023 ms 105024/381411 (28%) receiver
We need to use the bitrate on the receiver side. The sender can push as much data as it wants (10.0 Gbits/sec above), but for these measurements we care about the data that was actually received (7.24 Gbits/sec, with 28% of datagrams lost). Look at the difference here: https://github.com/libp2p/test-plans/actions/runs/5466146370/jobs/9950640038#step:12:29
- The theoretical max for this use case should be 50% of the instance bandwidth (3.4 Gbps), according to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html. I think it's worth linking this doc somewhere.
- The "local" vs "remote" backends are a bit confusing, since both run on AWS hardware. Could we consolidate them or rename them? I would suggest alternative names, but I don't understand the distinction well enough.
- What's the AMI of the short-lived module? It doesn't seem to be set, and I can't find the default.
- Should we make sure to set the MTU to 1500? (This might not be the default)
- Do we need to bump the UDP send window as well? I'm not sure, but it might be fine since quic-go doesn't complain about it. Any insight here @marten-seemann?
- Can we add comments around the AMI IDs to describe them? It wasn't clear that these were the Amazon Linux AMIs.
- Maybe include this one-liner:
aws ec2 describe-images \
--image-ids ami-06e46074ae430fba6 \
--query "Images[*].Description[]" \
--output text \
--region us-east-1
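
On the iperf point above, here is a rough sketch of how the receiver bitrate could be pulled out of the summary instead of the sender's. `receiver_bitrate` is a made-up helper name, not something in the repo, and it assumes the human-readable output format shown in the example (a summary line ending in `receiver`, with the bitrate right before the `bits/sec` unit). It prints the unit as well, since iperf may scale it (Mbits/sec vs. Gbits/sec).

```shell
# Hypothetical helper (not part of the existing scripts): print the bitrate
# from the receiver summary line of iperf's human-readable output.
receiver_bitrate() {
  awk '
    # Only the final summary line for the receiving side ends in "receiver".
    $NF == "receiver" {
      # The bitrate value is the field just before the one ending in "bits/sec".
      for (i = 2; i <= NF; i++) {
        if ($i ~ /bits\/sec$/) { print $(i - 1), $i; exit }
      }
    }
  '
}

# Intended usage (assumes the same invocation as in the example above):
#   iperf -c 127.0.0.1 -u -b 10g | receiver_bitrate
```

Fed the example output above, this should pick the receiver line rather than the sender line. If the benchmark already consumes JSON output, reading the equivalent receiver-side field there would be more robust than parsing text.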