After updating to CentOS 7.6, OFED stopped working properly, so I am going to get InfiniBand running another way.
yum groupinstall "Infiniband Support"
yum install infiniband-diags
# ibstat
CA 'mlx4_0'
	CA type: MT26428
	Number of ports: 2
	Firmware version: 2.9.1000
	Hardware version: b0
	Node GUID: 0x0021280001fca108
	System image GUID: 0x0021280001fca10b
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 40
		Base lid: 1
		LMC: 0
		SM lid: 1
		Capability mask: 0x0259086a
		Port GUID: 0x0021280001fca109
		Link layer: InfiniBand
	Port 2:
		State: Down
		Physical state: Polling
		Rate: 10
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x0259086a
		Port GUID: 0x0021280001fca10a
		Link layer: InfiniBand
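Only the two "Port GUID" values are needed for the next step. As a rough sketch, they could be pulled out of the ibstat output like this. port_guids is a helper name I made up (it is not part of infiniband-diags), and it assumes ibstat prints one "Port GUID: ..." field per line.

```shell
#!/bin/sh
# port_guids: hypothetical helper, not part of infiniband-diags.
# Reads ibstat-style text on stdin and prints each "Port GUID" value,
# one per line. "Node GUID" and "System image GUID" lines are skipped.
port_guids() {
    awk -F': *' '/^[ \t]*Port GUID:/ { print $2 }'
}

# On a real host with infiniband-diags installed:
#   ibstat | port_guids
```

With the output above, this should print 0x0021280001fca109 and 0x0021280001fca10a.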
Once you have checked the port GUIDs, add them to the opensm configuration.
# vi /etc/sysconfig/opensm
Add the following line.
GUIDS="0x0021280001fca109 0x0021280001fca10a"
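If you would rather not edit the file by hand, the GUIDS line could be written with a small script. This is only a sketch: set_guids is a name I made up, and it assumes the file contains plain shell-style KEY="value" lines with at most one uncommented GUIDS= entry.

```shell
#!/bin/sh
# set_guids FILE GUID...: hypothetical helper.
# Drops any existing uncommented GUIDS= line from FILE and appends a new
# one listing the given GUIDs, separated by spaces, in plain ASCII quotes.
set_guids() {
    conf=$1
    shift
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        # grep exits non-zero when nothing is left; that is fine here.
        grep -v '^GUIDS=' "$conf" > "$tmp" || true
    fi
    printf 'GUIDS="%s"\n' "$*" >> "$tmp"
    # Overwrite in place so the original file's permissions are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# As root, on a real host:
#   set_guids /etc/sysconfig/opensm 0x0021280001fca109 0x0021280001fca10a
```

Try it on a copy of the file first, since it overwrites the target.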
Enable and start the service.
# systemctl enable opensm.service
# systemctl start opensm.service
# systemctl status opensm.service
● opensm.service - Starts the OpenSM InfiniBand fabric Subnet Manager
   Loaded: loaded (/usr/lib/systemd/system/opensm.service; enabled; vendor preset: disabled)
   Active: active (running) since 日 2019-06-09 21:23:48 JST; 2min 24s ago
     Docs: man:opensm
  Process: 14266 ExecStart=/usr/libexec/opensm-launch (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/opensm.service
           ├─14268 /bin/bash /usr/libexec/opensm-launch
           ├─14270 /usr/sbin/opensm -g 0x0021280001fca109 --subnet_prefix 0xf...
           ├─14271 /bin/bash /usr/libexec/opensm-launch
           └─14272 /usr/sbin/opensm -g 0x0021280001fca10a --subnet_prefix 0xf...
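The status output shows one opensm process per GUID. To confirm the subnet manager is actually doing its job, check the port state again with ibstat: a cabled port without a subnet manager normally stays in Initializing, and should show Active once opensm is managing the fabric. The sketch below summarizes the state of each port. port_states is a helper name I made up, and it assumes the usual one-field-per-line ibstat layout.

```shell
#!/bin/sh
# port_states: hypothetical helper, not part of infiniband-diags.
# Reads ibstat-style text on stdin and prints "<port number> <State>"
# for each port. "Physical state" lines are ignored.
port_states() {
    awk '
        /^[ \t]*Port [0-9]+:/ { sub(/:.*/, "", $2); port = $2 }
        /^[ \t]*State:/       { print port, $2 }
    '
}

# On a real host with infiniband-diags installed:
#   ibstat | port_states
```

With the ibstat output from earlier, this should print "1 Active" and "2 Down". Port 2 has no link, so Down is expected there.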