After updating to CentOS 7.6, OFED stopped working properly, so I decided to bring up InfiniBand another way, using the packages that ship with the distribution.
# yum groupinstall "Infiniband Support"
# yum install infiniband-diags
# ibstat
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 2
    Firmware version: 2.9.1000
    Hardware version: b0
    Node GUID: 0x0021280001fca108
    System image GUID: 0x0021280001fca10b
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x0259086a
        Port GUID: 0x0021280001fca109
        Link layer: InfiniBand
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 10
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x0259086a
        Port GUID: 0x0021280001fca10a
        Link layer: InfiniBand
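The only values needed from this output are the two "Port GUID" lines. As a minimal sketch, assuming the output format shown above, the helper below pulls them out with awk. The function name port_guids is hypothetical and not part of infiniband-diags.

```shell
# Hypothetical helper: print every "Port GUID" value found in
# ibstat-style text read from stdin, one GUID per line.
# "Node GUID" and "System image GUID" lines are deliberately ignored.
port_guids() {
    awk -F': *' '/Port GUID:/ { print $2 }'
}

# Intended usage on a host with infiniband-diags installed:
#   ibstat | port_guids
```

With the output above, this should print the GUIDs of port 1 and port 2, in that order.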
Once you have the port GUIDs, add them to the opensm configuration.
# vi /etc/sysconfig/opensm
Add the following line.
GUIDS="0x0021280001fca109 0x0021280001fca10a"
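Note that the quotes must be plain ASCII double quotes, since /etc/sysconfig/opensm is read as shell syntax and typographic quotes would end up inside the value. To avoid typing the line by hand, here is a small sketch that builds it from a list of GUIDs, one per line on stdin. The function name guids_line is hypothetical.

```shell
# Hypothetical helper: join GUIDs (one per line on stdin) with spaces
# and wrap them in straight double quotes, matching the GUIDS= line above.
guids_line() {
    printf 'GUIDS="%s"\n' "$(paste -sd' ' -)"
}

# Example with the two port GUIDs from this host:
#   printf '0x0021280001fca109\n0x0021280001fca10a\n' | guids_line
```

The result can be reviewed and then pasted into /etc/sysconfig/opensm. The sketch does not write to the file itself.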
Enable and start the service, then check its status.
# systemctl enable opensm.service
# systemctl start opensm.service
# systemctl status opensm.service
● opensm.service - Starts the OpenSM InfiniBand fabric Subnet Manager
   Loaded: loaded (/usr/lib/systemd/system/opensm.service; enabled; vendor preset: disabled)
   Active: active (running) since 日 2019-06-09 21:23:48 JST; 2min 24s ago
     Docs: man:opensm
  Process: 14266 ExecStart=/usr/libexec/opensm-launch (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/opensm.service
           ├─14268 /bin/bash /usr/libexec/opensm-launch
           ├─14270 /usr/sbin/opensm -g 0x0021280001fca109 --subnet_prefix 0xf...
           ├─14271 /bin/bash /usr/libexec/opensm-launch
           └─14272 /usr/sbin/opensm -g 0x0021280001fca10a --subnet_prefix 0xf...
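The status output shows one opensm process per GUID. As a follow-up check, which is my own addition and not a step from the original procedure, you can run ibstat again and confirm which ports report "State: Active". The sketch below assumes the ibstat format shown earlier, and the function name active_ports is hypothetical.

```shell
# Hypothetical helper: read ibstat-style text from stdin and print the
# number of each port whose "State:" line says Active.
active_ports() {
    awk '
        # Remember the current port number from lines such as "Port 1:".
        /^[ \t]*Port [0-9]+:/ { gsub(/[^0-9]/, "", $0); port = $0 }
        # "Physical state:" lines do not match because they start differently.
        /^[ \t]*State:/ && $NF == "Active" { print port }
    '
}

# Intended usage:
#   ibstat | active_ports
```

On this host, port 2 has no link (Physical state: Polling), so only port 1 would be expected in the list.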