Building GlusterFS on Raspberry Pi 4

I bought three Raspberry Pi 4 (8GB) boards to study cluster building, so I'm going to use two of them to set up distributed storage.

The storage to be shared is a GW2.5OR-U3 enclosure with a GX2 512GB SSD. It is recognized as sda.

$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0 476.9G  0 disk
mmcblk0     179:0    0  57.9G  0 disk
├─mmcblk0p1 179:1    0   512M  0 part /boot/firmware
└─mmcblk0p2 179:2    0  57.4G  0 part /
$ lsusb
Bus 002 Device 002: ID 0080:a001 Unknown JMS578 based SATA bridge

Create a partition, format it as ext4, create a mount point, and mount it. For the fstab entry, specify the PARTUUID reported by blkid.

# parted /dev/sda mklabel gpt
# parted /dev/sda mkpart brick 0% 100%
# mkfs.ext4 /dev/sda1
# mkdir -p /gfs/brick
# blkid | grep sda1
/dev/sda1: UUID="5cf44f34-f462-4ac4-9a16-2b0623becd53" BLOCK_SIZE="4096" TYPE="ext4" PARTLABEL="brick" PARTUUID="8d963c2f-e4c2-4ea1-b67d-5b3b435d015d"
# echo 'PARTUUID="8d963c2f-e4c2-4ea1-b67d-5b3b435d015d" /gfs/brick ext4 nofail  0       1' >> /etc/fstab
# systemctl daemon-reload
# mount /gfs/brick
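Before relying on the new fstab entry, it may be worth checking that it parses cleanly and that the brick really is mounted; util-linux's findmnt covers both (a quick sanity check, assuming a reasonably recent util-linux with the --verify mode):

```shell
# Check /etc/fstab for syntax and consistency problems
findmnt --verify
# Confirm the brick filesystem is mounted where we expect
findmnt /gfs/brick
```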

Install glusterfs-server and enable the service so it starts at boot.

# apt-get -y install glusterfs-server
# systemctl enable glusterd --now

Probe node 2 from node 1 and check the peer status.

llcpi01 # gluster peer probe llcpi02
peer probe: success
llcpi01 # gluster peer status
Number of Peers: 1

Hostname: llcpi02
Uuid: 8000483e-efb7-46af-adf4-0768e1ff65ac
State: Peer in Cluster (Connected)

Create the volume. Note that plain replica 2 volumes are prone to split-brain; the gluster CLI warns about this at creation time and asks for confirmation (replica 3 or an arbiter brick is recommended), but for a two-node study setup like this one I'll accept it.

llcpi01 # gluster volume create gvol1 replica 2 llcpi01:/gfs/brick/gvol1 llcpi02:/gfs/brick/gvol1
llcpi01 # gluster volume start gvol1
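Before mounting, it's worth confirming that the volume came up with both bricks online; gluster's own status commands cover this (just a sanity check on the freshly created volume):

```shell
# Show the volume definition: type, brick list, options
gluster volume info gvol1
# Show per-brick runtime status: each brick should be listed as Online
gluster volume status gvol1
```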

Set up the mount. On node 02, do the same but change backup-volfile-servers to llcpi01.

llcpi01 # mkdir /shared
llcpi01 # echo 'localhost:/gvol1 /shared glusterfs defaults,_netdev,backup-volfile-servers=llcpi02 0 0' >> /etc/fstab
llcpi01 # mount /shared
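A quick way to confirm replication is actually working is to write a file through the mount on one node and read it from the other (using the hostnames llcpi01/llcpi02 from above; assumes SSH access between the nodes):

```shell
# On llcpi01: write a test file through the GlusterFS mount
echo "hello from llcpi01" > /shared/repl-test.txt
# On llcpi02 (here via ssh): the same file should be visible immediately
ssh llcpi02 cat /shared/repl-test.txt
```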

GlusterFS should now be mounted at /shared.

Since we've come this far, let's take some benchmarks. Reads from the SSD turn out to be remarkably slow, around 338 kB/s. Some searching suggests UASP is the likely culprit.

# hdparm -t /dev/sda1
/dev/sda1:
 Timing buffered disk reads:  18 MB in 54.47 seconds = 338.41 kB/sec

And indeed, lsusb -t shows Driver=uas.

# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
    |__ Port 2: Dev 2, If 0, Class=Mass Storage, Driver=uas, 5000M
:
# lsusb
Bus 002 Device 002: ID 0080:a001 Unknown JMS578 based SATA bridge

Disabling UAS via a kernel boot parameter is apparently the fix, so let's do that. Append usb-storage.quirks=0080:a001:u to /boot/cmdline.txt and reboot.
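As a sketch, the append can be done with sed; cmdline.txt must remain a single line, so the flag is appended to line 1 rather than added as a new line (on newer Raspberry Pi OS releases the file may live at /boot/firmware/cmdline.txt instead):

```shell
CMDLINE=/boot/cmdline.txt
# Append the quirk only if it is not already present; keep everything on one line
grep -q 'usb-storage.quirks' "$CMDLINE" || \
  sed -i '1s/$/ usb-storage.quirks=0080:a001:u/' "$CMDLINE"
reboot
```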

After the reboot, check again; if it now shows Driver=usb-storage, we're good.

# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
    |__ Port 2: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M

Benchmark reads again. This time we get 192 MB/s, a respectable figure.

# hdparm -t /dev/sda1
/dev/sda1:
 Timing buffered disk reads: 578 MB in  3.00 seconds = 192.49 MB/sec

Running fio directly against the SSD (the brick filesystem) gives R=219 MB/s and W=210 MB/s. About the speed of a SATA 2.0-era SSD.

# wget http://www.winkey.jp/downloads/visit.php/fio-crystaldiskmark
# cp -p fio-crystaldiskmark fio-crystaldiskmark-sda1
# sed -i s@directory=/tmp/@directory=/gfs/brick/@g fio-crystaldiskmark-sda1
# fio fio-crystaldiskmark-sda1
:
Run status group 0 (all jobs):
   READ: bw=208MiB/s (219MB/s), 208MiB/s-208MiB/s (219MB/s-219MB/s), io=1024MiB (1074MB), run=4912-4912msec

Run status group 1 (all jobs):
  WRITE: bw=200MiB/s (210MB/s), 200MiB/s-200MiB/s (210MB/s-210MB/s), io=1024MiB (1074MB), run=5119-5119msec

Run status group 2 (all jobs):
   READ: bw=175MiB/s (183MB/s), 175MiB/s-175MiB/s (183MB/s-183MB/s), io=1024MiB (1074MB), run=5863-5863msec

Run status group 3 (all jobs):
  WRITE: bw=201MiB/s (211MB/s), 201MiB/s-201MiB/s (211MB/s-211MB/s), io=1024MiB (1074MB), run=5100-5100msec

Run status group 4 (all jobs):
   READ: bw=14.9MiB/s (15.6MB/s), 14.9MiB/s-14.9MiB/s (15.6MB/s-15.6MB/s), io=892MiB (935MB), run=60001-60001msec

Run status group 5 (all jobs):
  WRITE: bw=24.9MiB/s (26.2MB/s), 24.9MiB/s-24.9MiB/s (26.2MB/s-26.2MB/s), io=1024MiB (1074MB), run=41057-41057msec

Run status group 6 (all jobs):
   READ: bw=16.5MiB/s (17.3MB/s), 16.5MiB/s-16.5MiB/s (17.3MB/s-17.3MB/s), io=990MiB (1039MB), run=60001-60001msec

Run status group 7 (all jobs):
  WRITE: bw=27.9MiB/s (29.2MB/s), 27.9MiB/s-27.9MiB/s (29.2MB/s-29.2MB/s), io=1024MiB (1074MB), run=36750-36750msec

Disk stats (read/write):
  sda: ios=490008/531162, merge=20/89, ticks=186180/123165, in_queue=310445, util=99.17%

Through GlusterFS we get R=125 MB/s and W=78 MB/s. On 1GbE the theoretical maximum is 125 MB/s, so reads are running right at the wire limit.
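That 125 MB/s figure is just the 1GbE line rate converted to bytes (ignoring Ethernet/IP/TCP framing overhead, so real payload throughput is slightly lower):

```shell
# 1GbE = 10^9 bits/s; divide by 8 for bytes, then by 10^6 for MB
echo $(( 1000000000 / 8 / 1000000 ))   # → 125
```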

# cp -p fio-crystaldiskmark fio-crystaldiskmark-shared
# sed -i s@directory=/tmp/@directory=/shared/@g fio-crystaldiskmark-shared
# fio fio-crystaldiskmark-shared
Run status group 0 (all jobs):
   READ: bw=119MiB/s (125MB/s), 119MiB/s-119MiB/s (125MB/s-125MB/s), io=1024MiB (1074MB), run=8578-8578msec

Run status group 1 (all jobs):
  WRITE: bw=74.4MiB/s (78.1MB/s), 74.4MiB/s-74.4MiB/s (78.1MB/s-78.1MB/s), io=1024MiB (1074MB), run=13756-13756msec

Run status group 2 (all jobs):
   READ: bw=114MiB/s (120MB/s), 114MiB/s-114MiB/s (120MB/s-120MB/s), io=1024MiB (1074MB), run=8949-8949msec

Run status group 3 (all jobs):
  WRITE: bw=69.7MiB/s (73.1MB/s), 69.7MiB/s-69.7MiB/s (73.1MB/s-73.1MB/s), io=1024MiB (1074MB), run=14682-14682msec

Run status group 4 (all jobs):
   READ: bw=5604KiB/s (5739kB/s), 5604KiB/s-5604KiB/s (5739kB/s-5739kB/s), io=328MiB (344MB), run=60001-60001msec

Run status group 5 (all jobs):
  WRITE: bw=7250KiB/s (7424kB/s), 7250KiB/s-7250KiB/s (7424kB/s-7424kB/s), io=425MiB (445MB), run=60001-60001msec

Run status group 6 (all jobs):
   READ: bw=13.2MiB/s (13.8MB/s), 13.2MiB/s-13.2MiB/s (13.8MB/s-13.8MB/s), io=791MiB (830MB), run=60004-60004msec

Run status group 7 (all jobs):
  WRITE: bw=7309KiB/s (7485kB/s), 7309KiB/s-7309KiB/s (7485kB/s-7485kB/s), io=428MiB (449MB), run=60001-60001msec

As a bonus, here is the boot SD card (a KIOXIA KLMEA064G). Its rated read speed is 100 MB/s, but the measured figures are R=41 MB/s and W=19 MB/s.

# cp -p fio-crystaldiskmark fio-crystaldiskmark-root
# sed -i s@directory=/tmp/@directory=/@g fio-crystaldiskmark-root
# fio fio-crystaldiskmark-root
:
Run status group 0 (all jobs):
   READ: bw=41.1MiB/s (43.1MB/s), 41.1MiB/s-41.1MiB/s (43.1MB/s-43.1MB/s), io=1024MiB (1074MB), run=24940-24940msec

Run status group 1 (all jobs):
  WRITE: bw=18.6MiB/s (19.5MB/s), 18.6MiB/s-18.6MiB/s (19.5MB/s-19.5MB/s), io=1024MiB (1074MB), run=54977-54977msec

Run status group 2 (all jobs):
   READ: bw=37.8MiB/s (39.6MB/s), 37.8MiB/s-37.8MiB/s (39.6MB/s-39.6MB/s), io=1024MiB (1074MB), run=27086-27086msec

Run status group 3 (all jobs):
  WRITE: bw=18.5MiB/s (19.4MB/s), 18.5MiB/s-18.5MiB/s (19.4MB/s-19.4MB/s), io=1024MiB (1074MB), run=55439-55439msec

Run status group 4 (all jobs):
   READ: bw=9314KiB/s (9537kB/s), 9314KiB/s-9314KiB/s (9537kB/s-9537kB/s), io=546MiB (572MB), run=60001-60001msec

Run status group 5 (all jobs):
  WRITE: bw=5133KiB/s (5256kB/s), 5133KiB/s-5133KiB/s (5256kB/s-5256kB/s), io=301MiB (315MB), run=60001-60001msec

Run status group 6 (all jobs):
   READ: bw=11.8MiB/s (12.4MB/s), 11.8MiB/s-11.8MiB/s (12.4MB/s-12.4MB/s), io=709MiB (743MB), run=60011-60011msec

Run status group 7 (all jobs):
  WRITE: bw=6086KiB/s (6232kB/s), 6086KiB/s-6086KiB/s (6232kB/s-6232kB/s), io=357MiB (374MB), run=60016-60016msec

Disk stats (read/write):
  mmcblk0: ios=325247/172468, merge=28/67, ticks=2030607/2102035, in_queue=4132642, util=99.62%
