GlusterFS: Basic setup on CentOS

In this blog post we will describe how to set up GlusterFS and test its basic features.

GlusterFS is an open-source distributed filesystem capable of rapidly provisioning additional storage as needs grow.

Today we look at how to install and configure GlusterFS and get familiar with some basic operations on the different volume types it provides.

1. Resources

We are going to use three CentOS 7.1 systems to set up GlusterFS: two systems will form the cluster and one will act as a client, on which we will mount the volumes.

Below we describe the details of the 3 systems used in this setup:

node1.example.com 192.168.0.1
node2.example.com 192.168.0.2
client1.example.com 192.168.0.3

Note: This setup assumes an environment with working DNS. For a test setup you can instead add entries to /etc/hosts, as shown below.
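
For example, entries like the following (using the hostnames and addresses listed above) could be appended to /etc/hosts on all three systems:

# cat >> /etc/hosts << EOF
192.168.0.1 node1.example.com
192.168.0.2 node2.example.com
192.168.0.3 client1.example.com
EOF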

Configure the data disk:

In this section we will configure a data disk on each of the servers running the GlusterFS software.

We will be using an additional disk (/dev/sdb) exclusively for GlusterFS. A dedicated disk is recommended; otherwise you need at least a separate partition for GlusterFS.

In GlusterFS terminology, the export directory we create on this XFS-formatted data disk is called a brick.

  • Create a single partition on the available raw disk (/dev/sdb):
# fdisk /dev/sdb
  • Format the partition with an XFS filesystem:
# mkfs.xfs -i size=512 /dev/sdb1
  • Mount the partition on a mount point:
# mkdir -p /export/sdb1
# mount /dev/sdb1 /export/sdb1
  • Add an entry to /etc/fstab so the disk gets mounted automatically after a reboot:
# echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab

Repeat these steps on the second server running GlusterFS.
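
Once this is done on both servers, a quick check along these lines (illustrative) confirms the brick filesystem is mounted and recorded in /etc/fstab:

# df -hT /export/sdb1
# grep sdb1 /etc/fstab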

2. Installation

  • Add the GlusterFS repository to yum so the required packages can be installed (do this on all three hosts):
# wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/RHEL/glusterfs-epel.repo

As of this writing, this installs version 3.7.1-1.el7.x86_64.
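
To confirm the repository is enabled and to see which version yum would install, checks along these lines can be used:

# yum repolist | grep -i gluster
# yum info glusterfs-server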

On GlusterFS server systems:
  • Install the packages on both systems, start the glusterd service, and enable it to start automatically at boot:
# yum -y install glusterfs-server
# systemctl start glusterd
# systemctl status glusterd
# systemctl enable glusterd
On the GlusterFS client system, install the client packages:
# yum -y install glusterfs glusterfs-fuse
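
A quick way to confirm the client bits are in place is to check the reported version; it should match the server version noted above:

[root@client ~]# glusterfs --version
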
3. Creating a storage pool

Before configuring a GlusterFS volume we need to create a storage pool consisting of the storage servers.

A storage pool is a trusted network of storage servers. Additional storage servers are added to the pool using the peer probe command.

In our case there are two servers dedicated for this purpose, so we run the probe from one of them to peer it with the other, as shown below:

[root@node1 ~]# gluster peer probe node2.example.com
Probe successful  

To confirm peer status:

[root@node1 ~]# gluster peer status
Number of Peers: 1

Hostname: node2.example.com  
Uuid: cb553ff1-8572-4624-998d-d7247750b5ad  
State: Peer in Cluster (Connected)  

The above can also be checked from the other server host.
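
For example, running the same command on node2 should report node1.example.com as a connected peer:

[root@node2 ~]# gluster peer status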

4. GlusterFS Volumes

A volume is a logical collection of bricks where each brick is an export directory on a server in the trusted storage pool. Most of the gluster management operations happen on the volume.

In this section we will cover only the following volume types:

* Distributed Volume
* Replicated Volume
* Striped Volume

Creating Distributed Volumes:

A distributed volume spreads files across the bricks in the volume (based on a hash of the file name), so each file resides in full on exactly one brick.

Create the volume from node1.example.com:

[root@node1 ~]# gluster volume create vol1 node1.example.com:/export/sdb1/dist node2.example.com:/export/sdb1/dist

The above command creates the bricks (that is, the export directories) named dist under /export/sdb1 on both systems; the directory names may differ between the two systems.

Before the volume can be accessed it must be started:

[root@node1 ~]# gluster volume start vol1
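
Optionally, gluster volume status can be used to confirm that the brick processes are online and to see which ports they are listening on:

[root@node1 ~]# gluster volume status vol1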

To verify the volume information:

[root@node1 ~]# gluster volume info vol1

Volume Name: vol1
Type: Distribute
Volume ID: 121db153-5c39-43f7-acdb-b0f0ce0d8ec7
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: node1.example.com:/export/sdb1/dist
Brick2: node2.example.com:/export/sdb1/dist
Options Reconfigured:
performance.readdir-ahead: on

Mount the volume on the client system and create some files to verify that they are distributed across both GlusterFS servers:

[root@client ~]# mkdir -p /mnt/dist-gluster
[root@client ~]# mount -t glusterfs node1.example.com:/vol1 /mnt/dist-gluster/
[root@client ~]# cd /mnt/dist-gluster
[root@client dist-gluster]# touch t1 t2 t3 t4 t5 t6
[root@client dist-gluster]# ll
total 0
-rw-r--r-- 1 root root 0 Jun 12 21:47 t1
-rw-r--r-- 1 root root 0 Jun 12 21:47 t2
-rw-r--r-- 1 root root 0 Jun 12 21:47 t3
-rw-r--r-- 1 root root 0 Jun 12 21:47 t4
-rw-r--r-- 1 root root 0 Jun 12 21:47 t5
-rw-r--r-- 1 root root 0 Jun 12 21:47 t6

Check where the files landed on the two Gluster server nodes:

[root@node1 ~]# ll /export/sdb1/dist
total 0
-rw-r--r-- 2 root root 0 Jun 12 21:47 t1
-rw-r--r-- 2 root root 0 Jun 12 21:47 t3
-rw-r--r-- 2 root root 0 Jun 12 21:47 t4
-rw-r--r-- 2 root root 0 Jun 12 21:47 t5
-rw-r--r-- 2 root root 0 Jun 12 21:47 t6

[root@node2 ~]# ll /export/sdb1/dist
total 0
-rw-r--r-- 2 root root 0 Jun 12 21:47 t2

The files are distributed between the two GlusterFS servers.

Creating Replicated Volumes:

Replicated volumes replicate, or mirror, data across two or more nodes in the cluster. They are suited to environments where high availability and high reliability are critical.

Create the replicated volume from node1.example.com and start it:

[root@node1 ~]# gluster volume create vol2 replica 2 node1.example.com:/export/sdb1/rep node2.example.com:/export/sdb1/rep

[root@node1 ~]# gluster volume start vol2

We specify the replication factor as 2 in the create volume command above.
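
For reference, the general form of the create command for a replicated volume is as follows (angle-bracketed items are placeholders, not literal values):

# gluster volume create <VOLNAME> replica <COUNT> <server1>:<brick-path> ... <serverN>:<brick-path>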

Verify the status of the volume by running the command below:

[root@node1 ~]# gluster volume info vol2

Volume Name: vol2
Type: Replicate
Volume ID: b73cf0d1-513d-4f60-981d-5deb01a5d234
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node1.example.com:/export/sdb1/rep
Brick2: node2.example.com:/export/sdb1/rep
Options Reconfigured:
performance.readdir-ahead: on

On the client system, create a directory and mount the volume:

[root@client ~]# mkdir -p /mnt/rep-gluster
[root@client ~]# mount -t glusterfs node1.example.com:/vol2 /mnt/rep-gluster/

Create files to test replication on server nodes.

[root@client rep-gluster]# touch t1 t2 t3 t4 t5 t6
[root@client rep-gluster]# ll
total 0
-rw-r--r-- 1 root root 0 Jun 13 02:27 t1
-rw-r--r-- 1 root root 0 Jun 13 02:27 t2
-rw-r--r-- 1 root root 0 Jun 13 02:27 t3
-rw-r--r-- 1 root root 0 Jun 13 02:27 t4
-rw-r--r-- 1 root root 0 Jun 13 02:27 t5
-rw-r--r-- 1 root root 0 Jun 13 02:27 t6

Verify on both Gluster server nodes:

[root@node1 ~]# ll /export/sdb1/rep
total 0
-rw-r--r-- 2 root root 0 Jun 13 02:27 t1
-rw-r--r-- 2 root root 0 Jun 13 02:27 t2
-rw-r--r-- 2 root root 0 Jun 13 02:27 t3
-rw-r--r-- 2 root root 0 Jun 13 02:27 t4
-rw-r--r-- 2 root root 0 Jun 13 02:27 t5
-rw-r--r-- 2 root root 0 Jun 13 02:27 t6

[root@node2 ~]# ll /export/sdb1/rep
total 0
-rw-r--r-- 2 root root 0 Jun 13 02:27 t1
-rw-r--r-- 2 root root 0 Jun 13 02:27 t2
-rw-r--r-- 2 root root 0 Jun 13 02:27 t3
-rw-r--r-- 2 root root 0 Jun 13 02:27 t4
-rw-r--r-- 2 root root 0 Jun 13 02:27 t5
-rw-r--r-- 2 root root 0 Jun 13 02:27 t6

Here the files are replicated on both Gluster nodes.

Creating Striped Volumes:

A striped volume stripes data across the bricks in the volume. It is suited to high-concurrency environments that access very large files.

Create the striped volume on node1.example.com:

[root@node1 ~]# gluster volume create vol3 stripe 2 node1.example.com:/export/sdb1/strip node2.example.com:/export/sdb1/strip

Start the volume and verify the volume information:

[root@node1 ~]# gluster volume start vol3
[root@node1 ~]# gluster volume info vol3

Volume Name: vol3
Type: Stripe
Volume ID: 39c2a2e0-e2bf-4aed-b23e-d24e3abbada8
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node1.example.com:/export/sdb1/strip
Brick2: node2.example.com:/export/sdb1/strip
Options Reconfigured:
performance.readdir-ahead: on

Create a directory and mount the volume on the client system:

[root@client ~]# mkdir -p /mnt/strip-gluster
[root@client ~]# mount -t glusterfs node1.example.com:/vol3 /mnt/strip-gluster/

Create a large file using fallocate for testing; here we create a 512 MB file.

[root@client ~]# cd /mnt/strip-gluster
[root@client strip-gluster]# fallocate -l 512M file.txt
[root@client strip-gluster]# ll
total 524288
-rw-r--r-- 1 root root 536870912 Jun 12 22:12 file.txt

Verify on both Gluster server nodes:

[root@node1 ~]# ll /export/sdb1/strip/
total 262144
-rw-r--r-- 2 root root 268435456 Jun 12 22:12 file.txt
[root@node2 ~]# ll /export/sdb1/strip/
total 262144
-rw-r--r-- 2 root root 268435456 Jun 12 22:12 file.txt

Note: After mounting a volume on the client system, add a corresponding entry at the end of /etc/fstab so the volume is mounted automatically on the client after every reboot. An example is shown below.
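
A minimal example entry for the distributed volume mounted earlier (mount point and volume name as used in this setup; _netdev delays the mount until the network is up):

node1.example.com:/vol1  /mnt/dist-gluster  glusterfs  defaults,_netdev  0 0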

5. Conclusion

In this blog post we showed how one can easily get started with GlusterFS, demonstrating the basic volume types and the commands to work with them. In the next blog post we will touch upon some of the advanced features offered by GlusterFS, such as Access Control Lists (ACLs), disk encryption, and secure/authorized communication between the servers.