We recently needed to check RDMA ping between nodes, and since we have quite a few of them, running nd_send_bw.exe by hand was getting a little fiddly. While waiting for storage repair jobs to finish between restarts, I wrote a few lines to automate the task.
As you might know, Mellanox ships this tool for checking RDMA ping between network cards. You run it on one node in server (listener) mode and on a second node as the client. Typical output, when it works, is:
[FOR LISTENING SERVER]
C:\Program Files\Mellanox\MLNX_VPI\IB\Tools>.\nd_read_bw.exe -S 10.102.22.15
Listening for incoming connection request...
Connection accepted.
[FOR CLIENT SERVER]
C:\Program Files\Mellanox\MLNX_VPI\IB\Tools>nd_send_bw.exe -C 10.102.22.15
#qp #bytes #iterations MR [Mmps] Gb/s CPU Util.
0 65536 100000 0.017 8.84 100.00
Test finished. Releasing resources...
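If you want to do more than eyeball that output, the result row can be turned into an object. This is only a rough sketch and not part of the original function: the helper name ConvertFrom-NdSendBwOutput and the property names are made up for illustration, and it assumes the result row always has the six numeric columns shown above (reading "MR [Mmps]" as a message rate is also my guess).

```powershell
# Hypothetical helper (not part of Get-MellanoxNdSendBw): parses the client
# output of nd_send_bw.exe as captured above into objects.
function ConvertFrom-NdSendBwOutput {
    param(
        [string[]]$Lines   # raw output lines captured from the client run
    )

    foreach ($line in $Lines) {
        # Assumption: a result row is exactly six whitespace-separated numbers
        # (#qp, #bytes, #iterations, MR [Mmps], Gb/s, CPU Util.).
        if ($line -match '^\s*(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$') {
            [pscustomobject]@{
                QueuePair   = [int]$Matches[1]
                Bytes       = [int]$Matches[2]
                Iterations  = [int]$Matches[3]
                MessageRate = [double]$Matches[4]   # "MR [Mmps]" column
                Gbps        = [double]$Matches[5]
                CpuUtil     = [double]$Matches[6]
            }
        }
    }
}

# Usage with the sample output from above
$sample = @(
    '#qp #bytes #iterations MR [Mmps] Gb/s CPU Util.',
    '0 65536 100000 0.017 8.84 100.00',
    'Test finished. Releasing resources...'
)
$result = ConvertFrom-NdSendBwOutput -Lines $sample
$result.Gbps
```

The header and the "Test finished" line simply do not match the pattern, so only the numeric row comes back. A run that returns no object at all could then be treated as a failed connection.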
To test connections properly it needs to be run on each cluster node against each IP. Yeah…
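To give an idea of how many runs that is, here is a minimal sketch of the "every node against every IP" loop. It is not the actual function body: Get-RdmaTestPlan is a hypothetical name, the node-to-IP table is passed in by hand instead of being read from the cluster, and I am assuming the listener runs on the target node's IP while the other node connects as the client.

```powershell
# Hypothetical sketch: builds the list of (source, target, target IP) runs.
# It only plans the tests; it does not start nd_send_bw.exe anywhere.
function Get-RdmaTestPlan {
    param(
        # Assumption: node name -> list of RDMA-capable IPs on that node
        [System.Collections.IDictionary]$NodeAddresses
    )

    foreach ($source in $NodeAddresses.Keys) {
        foreach ($target in $NodeAddresses.Keys) {
            # A node is never tested against itself
            if ($source -eq $target) { continue }

            foreach ($ip in $NodeAddresses[$target]) {
                [pscustomobject]@{
                    Source   = $source   # would run the client (-C)
                    Target   = $target   # would run the listener (-S)
                    TargetIP = $ip
                }
            }
        }
    }
}

# Usage: node names taken from the example output, IPs invented for the demo
$nodes = [ordered]@{
    NODEHV1 = @('10.102.22.11')
    NODEHV2 = @('10.102.22.12')
    NODEHV3 = @('10.102.22.13')
}
Get-RdmaTestPlan -NodeAddresses $nodes | Format-Table
```

With N nodes and one RDMA IP each this gives N x (N - 1) runs, which is exactly why doing it by hand gets old quickly.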
The function Get-MellanoxNdSendBw will do that whole job for you. It is dirty, but I guess you will adjust it to your own needs anyway.
PS C:\Windows\system32> Get-MellanoxNdSendBw -ClusterName HyperVCluster1
Connection NODEHV1 to NODEHV2 on NIC 10.102.22.11 is OK
Connection NODEHV1 to NODEHV3 on NIC 10.102.22.11 is OK
Connection NODEHV2 to NODEHV1 on NIC 10.102.22.11 is OK
Connection NODEHV2 to NODEHV3 on NIC 10.102.22.11 is OK
…and so on, till all nodes are checked against all interfaces.
So once again, how to use it? Easy! Like this:
Get-MellanoxNdSendBw -ClusterName <ClusterName>