Pacemaker NFS Cluster on RHEL 7 | CentOS 7

In a production environment, a single server going down for any reason can take the entire service with it. To avoid this, we configure servers in a cluster so that if one node goes down, the other available node takes over the production load. This article provides complete configuration details for setting up a two-node active/passive Pacemaker NFS cluster on RHEL 7 | CentOS 7.


Topic

  • How to configure a Pacemaker NFS active/passive cluster on CentOS 7?
  • How to configure a Pacemaker NFS active/passive cluster on RHEL 7?
  • Configure a Pacemaker NFS active/passive cluster on Linux/Ubuntu/Debian/CentOS/RHEL
  • NFS active/passive cluster with Pacemaker
  • NFS Pacemaker cluster



Solution


In this demonstration, we will configure a two-node active/passive NFS cluster with the Pacemaker cluster utility.

Cluster node information

Node name: node1.example.local, node2.example.local
Node IP: 192.168.5.20, 192.168.5.21
Virtual IP: 192.168.5.23
Cluster Name: cluster1

Prerequisites
  • Minimal or base installation of CentOS 7 on KVM virtualization
  • Shared ISCSI SAN storage setup
  • Fencing
  • LVM
  • Volume group exclusive activation
  • Virtual or floating IP Address


Cluster Configuration

The following is the step-by-step procedure for setting up a two-node active/passive Pacemaker NFS cluster on RHEL 7 | CentOS 7.

DNS Host Entry [1]
  • If you do not have a DNS server, add hostname entries for all cluster nodes to the /etc/hosts file on each cluster node.
Node 1 host entry:
[root@node1 ~]# cat /etc/hosts
192.168.5.20    node1.example.local    node1
192.168.5.21    node2.example.local    node2

Node 2 host entry:
[root@node2 ~]# cat /etc/hosts
192.168.5.20    node1.example.local     node1
192.168.5.21    node2.example.local     node2

Package installation [2]
  • Install the following packages on each node.
On node 1:
[root@node1 ~]# yum install pcs pacemaker corosync fence-agents-virsh fence-virt \
pacemaker-remote fence-agents-all lvm2-cluster resource-agents \
psmisc policycoreutils-python gfs2-utils

On node 2:
[root@node2 ~]# yum install pcs pacemaker corosync fence-agents-virsh \
fence-virt pacemaker-remote fence-agents-all lvm2-cluster resource-agents \
psmisc policycoreutils-python gfs2-utils

Hacluster Password setup [3]
  • Set a password for the hacluster user on each node. The password must be the same on every node and should never expire (a quick way to verify this is shown after the commands below).
On node 1:
[root@node1 ~]# echo "centos" | passwd hacluster --stdin
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.

On node 2:
[root@node2 ~]# echo "centos" | passwd hacluster --stdin
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
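
  • A quick way to verify the password-ageing policy for the hacluster account is with chage; chage -M -1 removes any maximum password age if one happens to be set. This is an optional check, shown here as a sketch:
On node 1:
[root@node1 ~]# chage -l hacluster
[root@node1 ~]# chage -M -1 hacluster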

Enable Cluster Service [4]
  • Start the pcsd service on all nodes and enable it to start during system startup.
On node1:
[root@node1 ~]# systemctl start pcsd.service; systemctl enable pcsd.service

On node2:
[root@node2 ~]# systemctl start pcsd.service; systemctl enable pcsd.service

Authorize cluster nodes [5]
  • On any one of the nodes, execute the following command to authorize the cluster nodes before we create the cluster.
On node1:
[root@node1 ~]# pcs cluster auth node1.example.local node2.example.local -u hacluster -p centos
node1.example.local: Authorized
node2.example.local: Authorized

Cluster Setup [6]
  • Follow the steps below to set up the cluster. Here we define the cluster name as cluster1.
On node1:
[root@node1 ~]# pcs cluster setup --start --name cluster1 node1.example.local node2.example.local
Destroying cluster on nodes: node1.example.local, node2.example.local...
node1.example.local: Stopping Cluster (pacemaker)...
node2.example.local: Stopping Cluster (pacemaker)...
node1.example.local: Successfully destroyed cluster
node2.example.local: Successfully destroyed cluster

Sending cluster config files to the nodes...
node1.example.local: Succeeded
node2.example.local: Succeeded

Starting cluster on nodes: node1.example.local, node2.example.local...
node1.example.local: Starting Cluster...
node2.example.local: Starting Cluster...

Synchronizing pcsd certificates on nodes node1.example.local, node2.example.local...
node1.example.local: Success
node2.example.local: Success

Restarting pcsd on the nodes in order to reload the certificates...
node1.example.local: Success
node2.example.local: Success

Note
If the above command fails with an error, execute the following command to set up the cluster with the --force switch.

 # pcs cluster setup --start --name cluster1 node1.example.local node2.example.local --force

Firewall configuration – Optional [7]
  • Execute the following command on any one of the nodes to enable the Pacemaker cluster services to start during system startup, then add the firewall rules on each node (a quick verification command follows the rules below). The firewall rules are only needed if firewalld is enabled.
On node1:
[root@node1 ~]# pcs cluster enable --all
node1.example.local: Cluster Enabled
node2.example.local: Cluster Enabled

[root@node1 ~]# firewall-cmd --permanent --add-service=high-availability
[root@node1 ~]# firewall-cmd --reload

On Node 2:
[root@node2 ~]# firewall-cmd --permanent --add-service=high-availability
[root@node2 ~]# firewall-cmd --reload
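
  • You can confirm that the rule is active by listing the allowed services on each node; the high-availability service should appear in the output.
[root@node1 ~]# firewall-cmd --list-services
[root@node2 ~]# firewall-cmd --list-services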

Verify Cluster Service [8]
  • Execute the following command on any cluster node to check cluster service status.
[root@node1 ~]# pcs status 
Cluster name: cluster1
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: node2.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sat Nov 16 17:40:27 2019      Last change: Sat Nov 16 17:00:37 2019 by hacluster via crmd on node2.example.local

2 nodes and 0 resources configured

Online: [ node1.example.local node2.example.local ]

No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

  • We can also execute the following two commands to check cluster status.
[root@node1 ~]# crm_mon -r1
Stack: corosync
Current DC: node2.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sat Nov 16 17:40:52 2019          Last change: Sat Nov 16 17:00:37 2019 by hacluster via crmd on node2.example.local

2 nodes and 0 resources configured

Online: [ node1.example.local node2.example.local ]

No active resources

OR 

[root@node1 ~]# pcs cluster status 
Cluster Status:
 Stack: corosync
 Current DC: node2.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
 Last updated: Sat Nov 16 17:43:09 2019     Last change: Sat Nov 16 17:00:37 2019 by hacluster via crmd on node2.example.local
 2 nodes and 0 resources configured

PCSD Status:
  node1.example.local: Online
  node2.example.local: Online


Fencing configuration

A cluster is nothing without fencing. Fencing is a mechanism that protects a cluster from data corruption. Suppose we are using shared storage: if the shared storage were mounted on both cluster nodes at the same time, there would be a possibility of data corruption. If the cluster notices that one of the cluster nodes has lost connectivity to the shared storage, or detects any kind of cluster communication issue, the remaining node will fence the problematic node to avoid data corruption. Linux supports many kinds of fencing mechanisms; you can execute the pcs stonith list command to list the supported fence agents. In this demonstration, we will configure KVM fencing.
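
As a quick reference, the first command below (run on any cluster node) lists all available fence agents, and the second shows the parameters accepted by the fence_xvm agent used in this demonstration.

[root@node1 ~]# pcs stonith list
[root@node1 ~]# pcs stonith describe fence_xvm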


KVM Fencing configuration on KVM Host [9]
  • To configure KVM fencing, install the following packages on the physical KVM host.
# yum install fence-virt fence-virtd fence-virtd-libvirt fence-virtd-multicast fence-virtd-serial

KVM Key Generation on KVM Host [10]
  • Generate a fence key on the KVM host and copy that key to all the nodes in the cluster.
# mkdir /etc/cluster
# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4k count=1

  • Create a directory /etc/cluster on all the cluster nodes and then copy fence_xvm.key from the physical KVM host to /etc/cluster directory on all the cluster nodes.

  • On RHEL 6 systems, the /etc/cluster directory is created automatically during cluster setup.

On Physical KVM Host:
# for cnodes in 192.168.5.{20..21}; do ssh root@$cnodes "mkdir /etc/cluster"; done
# for cnodes in 192.168.5.{20..21}; do scp /etc/cluster/fence_xvm.key root@$cnodes:/etc/cluster; done
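
  • Optionally, verify that the key on each node matches the one on the KVM host; a simple checksum comparison run from the KVM host is enough (shown here as a sketch).
# md5sum /etc/cluster/fence_xvm.key
# for cnodes in 192.168.5.{20..21}; do ssh root@$cnodes "md5sum /etc/cluster/fence_xvm.key"; done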

Fence Configuration on KVM host [11]
  • Edit /etc/fence_virt.conf file on the physical KVM host and apply the following changes:
ON Physical KVM Host:

# cat /etc/fence_virt.conf

        backends {
            libvirt {
                uri = "qemu:///system";
            }

        }

        listeners {
            multicast {
                key_file = "/etc/cluster/fence_xvm.key";
                interface = "virbr0";
                port = "1229";
                address = "225.0.0.12";
                family = "ipv4";
            }

        }

        fence_virtd {
            backend = "libvirt";
            listener = "multicast";
            module_path = "/usr/lib64/fence-virt";
        }

OR

  • Execute the following command to create /etc/fence_virt.conf file on the physical KVM host.
On Physical KVM Host:

# fence_virtd -c
        Parsing of /etc/fence_virt.conf failed.
        Start from scratch [y/N]? y
        Module search path [/usr/lib64/fence-virt]: 
        Listener module [multicast]: 
        Multicast IP Address [225.0.0.12]: 
        Multicast IP Port [1229]: 
        Interface [none]: virbr0      <---- Interface used for communication between the cluster nodes.
        Key File [/etc/cluster/fence_xvm.key]: 
        Backend module [libvirt]: 
        Libvirt URI [qemu:///system]: 

        Configuration complete.

        === Begin Configuration ===
        backends {
            libvirt {
                uri = "qemu:///system";
            }

        }

        listeners {
            multicast {
                key_file = "/etc/cluster/fence_xvm.key";
                interface = "virbr0";
                port = "1229";
                address = "225.0.0.12";
                family = "ipv4";
            }

        }

        fence_virtd {
            backend = "libvirt";
            listener = "multicast";
            module_path = "/usr/lib64/fence-virt";
        }

        === End Configuration ===
        Replace /etc/fence_virt.conf with the above [y/N]? y

Firewall Configuration for Fencing – Optional [12]
  • Add the following firewall rule on the physical KVM host, then start and enable the fence_virtd service. The firewall rule is only needed if the KVM host has a firewall enabled.
  # firewall-cmd --add-service=fence_virt --permanent
  # firewall-cmd --reload
  # systemctl start fence_virtd.service
  # systemctl enable fence_virtd.service
  # systemctl status fence_virtd.service

Fence Client Configuration [13]
  • The fence-virt package must be installed on each cluster node. It is normally pulled in by the fence-agents-all package, but verify that it is present.
[root@node1 ~]# rpm -qa | grep -i fence-virt
fence-virt-0.3.2-5.el7.x86_64

[root@node2 ~]# rpm -qa | grep -i fence-virt
fence-virt-0.3.2-5.el7.x86_64
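
  • Also confirm that the fence key copied earlier from the KVM host is present on both nodes.
[root@node1 ~]# ls -l /etc/cluster/fence_xvm.key
[root@node2 ~]# ls -l /etc/cluster/fence_xvm.key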

Fencing Test [14]
  • Execute the command below on the physical KVM host and on all cluster nodes to validate fencing.
ON Physical KVM Host:
# fence_xvm -o list
c7node1                          3edcd0e9-1455-4c2a-a7be-9bff108c73b6 on
c7node2                          545fc023-7d6c-4b58-9b61-546c36a2b1c4 on

ON Cluster Nodes:
[root@node1 ~]# fence_xvm -o list
c7node1                          3edcd0e9-1455-4c2a-a7be-9bff108c73b6 on
c7node2                          545fc023-7d6c-4b58-9b61-546c36a2b1c4 on

[root@node2 ~]# fence_xvm -o list
c7node1                          3edcd0e9-1455-4c2a-a7be-9bff108c73b6 on
c7node2                          545fc023-7d6c-4b58-9b61-546c36a2b1c4 on

  • To fence a node, use the VM UUID shown above next to the node name.

  • Execute the following command to fence a node.

[root@node1 ~]# fence_xvm -o reboot -H 545fc023-7d6c-4b58-9b61-546c36a2b1c4

  • After executing the above command, you will see that node2 (c7node2) is rebooted automatically.
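
  • fence_xvm can also address a guest by its libvirt domain name instead of the UUID. For example, the following sketch checks the status of the c7node2 domain and then reboots it.

[root@node1 ~]# fence_xvm -o status -H c7node2
[root@node1 ~]# fence_xvm -o reboot -H c7node2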

Cluster fence configuration [15]
  • To create a fence device for each node, run the following commands on any one of the cluster nodes.
ON Node1:
# pcs stonith create fencedev1  fence_xvm key_file=/etc/cluster/fence_xvm.key
# pcs stonith create fencedev2  fence_xvm key_file=/etc/cluster/fence_xvm.key

OR

  • If the fencing configuration does not work, remove the above configuration and execute the following commands to create the fence devices with an explicit host map.
ON Node1:
[root@node1 ~]# pcs stonith create fencedev1 fence_xvm pcmk_host_map="node1:c7node1 node2:c7node2" key_file=/etc/cluster/fence_xvm.key 
[root@node1 ~]# pcs stonith create fencedev2 fence_xvm pcmk_host_map="node1:c7node1 node2:c7node2" key_file=/etc/cluster/fence_xvm.key 

  • Pin each of the fence devices created above to its respective node with a location constraint. This prevents the cluster from rebooting all cluster nodes when something goes wrong on one of them, so only the affected node is fenced.
ON Node1:
[root@node1 ~]# pcs constraint location fencedev1 prefers node1.example.local
[root@node1 ~]# pcs constraint location fencedev2 prefers node2.example.local

  • Execute the following command to view the constraint order.
ON Node1:
[root@node1 ~]# pcs constraint list
Location Constraints:
  Resource: fencedev1
    Enabled on: node1.example.local (score:INFINITY)
  Resource: fencedev2
    Enabled on: node2.example.local (score:INFINITY)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:

  • Verify stonith and cluster status.
ON Node1:
[root@node1 ~]# pcs stonith
 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local

  • View Cluster Status:
[root@node1 ~]# pcs status

OR

[root@node1 ~]# crm_mon -r1
Stack: corosync
Current DC: node1.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sat Nov 16 20:54:11 2019      Last change: Sat Nov 16 20:49:45 2019 by root via cibadmin on node1.example.local

2 nodes and 2 resources configured

Online: [ node1.example.local node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local

Note
The crm_mon -r command is used to monitor cluster activity in real time.

Fence a node from cluster [16]

Execute the following command on any one of the cluster nodes to fence a node from the cluster for testing purposes.

[root@node1 ~]# pcs stonith fence node2
Node: node2 fenced

After Node 2 comes up:
[root@node2 ~]# pcs stonith fence node1
Node: node1 fenced

  • With the above command, the target node is rebooted and removed from the cluster. If you have not enabled the cluster service for system startup, manually start the cluster service on the fenced node with the following command.
# pcs cluster start 

  • Execute the below command to view stonith configuration.
[root@node1 ~]# pcs stonith show --full
 Resource: fencedev1 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_map="node1:c7node1 node2:c7node2" key_file=/etc/cluster/fence_xvm.key
  Operations: monitor interval=60s (fencedev1-monitor-interval-60s)
 Resource: fencedev2 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_map="node1:c7node1 node2:c7node2" key_file=/etc/cluster/fence_xvm.key
  Operations: monitor interval=60s (fencedev2-monitor-interval-60s)

  • Execute the below command to view stonith property.
[root@node1 ~]# pcs property --all | grep -i stonith
 stonith-action: reboot
 stonith-enabled: true
 stonith-timeout: 60s
 stonith-watchdog-timeout: (null)

  • Execute the below command to check all property for pcs cluster.
[root@node1 ~]# pcs property --all | less
[....]

NFS Resource Configuration

NFS resource configuration requires the following prerequisites:

  • Shared Storage: the shared SAN storage from the storage server, available to all cluster nodes through iSCSI or FCoE.
  • NFS Server: the nfsserver resource that runs the NFS service.
  • Virtual IP Address: all NFS clients connect to the NFS server using this virtual IP.

In this demonstration, we will use iSCSI storage configured on another node. We will configure the iSCSI initiator on both nodes and create filesystems on the shared LUN.

iSCSI Target and Initiator Configuration [17]

The following are the iSCSI LUN details from the iSCSI target (SAN) server.

Server IP Address: 192.168.5.24
ACL Name: iqn.2017-12.local.srv1:test
Target Name: iqn.2017-12.local.srv1:target1

  • Install the iscsi-initiator-utils package on both cluster nodes.
[root@node1 ~]# yum install iscsi-initiator-utils
[root@node2 ~]# yum install iscsi-initiator-utils

  • Click the link Shared ISCSI SAN storage setup for ISCSI Target server configuration.

  • Edit the /etc/iscsi/initiatorname.iscsi file on both cluster nodes and set the initiator IQN (the ACL name configured on the target).

[root@node1 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2017-12.local.srv1:test

[root@node2 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2017-12.local.srv1:test

  • Enable and restart the initiator service on both nodes.
On Node 1:
[root@node1 ~]# systemctl enable iscsid.service
Created symlink from /etc/systemd/system/multi-user.target.wants/iscsid.service to /usr/lib/systemd/system/iscsid.service.
[root@node1 ~]# systemctl restart iscsid.service

On Node 2:
[root@node2 ~]# systemctl enable iscsid.service
Created symlink from /etc/systemd/system/multi-user.target.wants/iscsid.service to /usr/lib/systemd/system/iscsid.service.
[root@node2 ~]# systemctl restart iscsid.service

  • Discover the target on both nodes using the command below:
[root@node1 ~]# iscsiadm --mode discoverydb --type sendtargets --portal 192.168.5.24 --discover
192.168.5.24:3260,1 iqn.2017-12.local.srv1:target1

[root@node2 ~]# iscsiadm --mode discoverydb --type sendtargets --portal 192.168.5.24 --discover
192.168.5.24:3260,1 iqn.2017-12.local.srv1:target1

  • Log in to the discovered target on both nodes:
On Node 1:
[root@node1 ~]# iscsiadm --mode node --targetname iqn.2017-12.local.srv1:target1 --portal 192.168.5.24:3260 --login
Logging in to [iface: default, target: iqn.2017-12.local.srv1:target1, portal: 192.168.5.24,3260] (multiple)
Login to [iface: default, target: iqn.2017-12.local.srv1:target1, portal: 192.168.5.24,3260] successful.

On Node 2:
[root@node2 ~]# iscsiadm --mode node --targetname iqn.2017-12.local.srv1:target1 --portal 192.168.5.24:3260 --login
Logging in to [iface: default, target: iqn.2017-12.local.srv1:target1, portal: 192.168.5.24,3260] (multiple)
Login to [iface: default, target: iqn.2017-12.local.srv1:target1, portal: 192.168.5.24,3260] successful.

  • Execute the lsblk command on both nodes; you will see that the LUN is discovered as a new block device. In our case it is sda (a quick way to confirm the mapping is shown after the output below).

[root@node1 ~]# lsblk 
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0    5G  0 disk     <<<<<<<<<<<<<<<<<<< Here
sr0              11:0    1 1024M  0 rom  
vda             252:0    0   10G  0 disk 
├─vda1          252:1    0  500M  0 part /boot
└─vda2          252:2    0  9.5G  0 part 
  ├─centos-root 253:0    0  8.5G  0 lvm  /
  └─centos-swap 253:1    0    1G  0 lvm  [SWAP]

[root@node2 ~]# lsblk 
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0    5G  0 disk          <<<<<<<<<<<<< Here
sr0              11:0    1 1024M  0 rom  
vda             252:0    0   10G  0 disk 
├─vda1          252:1    0  500M  0 part /boot
└─vda2          252:2    0  9.5G  0 part 
  ├─centos-root 253:0    0  8.5G  0 lvm  /
  └─centos-swap 253:1    0    1G  0 lvm  [SWAP]
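
  • If it is not obvious which block device belongs to the iSCSI LUN, the session details map the target to the attached SCSI disk; a quick check (run on either node) is sketched below.
[root@node1 ~]# iscsiadm -m session -P 3 | grep -E "Target|Attached scsi disk"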


LVM Configuration [18]

Execute the following steps on any one of the cluster nodes:

  • Create a physical volume (PV) on the shared LUN (block device sda).
[root@node1 ~]# pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created.

[root@node1 ~]# pvs
  PV         VG     Fmt  Attr PSize PFree 
  /dev/sda          lvm2 ---  5.00g  5.00g
  /dev/vda2  centos lvm2 a--  9.51g 40.00m

  • Create a volume group.
[root@node1 ~]# vgcreate NFSVG /dev/sda
  Volume group "NFSVG" successfully created

[root@node1 ~]# vgs
  VG     #PV #LV #SN Attr   VSize VFree 
  NFSVG    1   0   0 wz--n- 4.97g  4.97g
  centos   1   2   0 wz--n- 9.51g 40.00m

  • Create the logical volumes.
[root@node1 ~]# lvcreate -L +1G -n lv1 NFSVG
  Logical volume "lv1" created.

[root@node1 ~]# lvcreate -l 100%FREE -n lv2 NFSVG
  Logical volume "lv2" created.

[root@node1 ~]# lvs
  LV   VG     Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1  NFSVG  -wi-a----- 1.00g                                                    
  lv2  NFSVG  -wi-a----- 3.97g                                                    
  root centos -wi-ao---- 8.47g                                                    
  swap centos -wi-ao---- 1.00g             

  • Create filesystems on the logical volumes.
[root@node1 ~]# mkfs.xfs /dev/NFSVG/lv1 
meta-data=/dev/NFSVG/lv1         isize=256    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@node1 ~]# mkfs.xfs /dev/NFSVG/lv2 
meta-data=/dev/NFSVG/lv2         isize=256    agcount=4, agsize=260096 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=1040384, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


Volume group exclusive activation [19]

There is a risk of corrupting the volume group’s metadata if the volume group is activated outside of the cluster. To avoid this, add a volume_list entry to the /etc/lvm/lvm.conf file on each cluster node so that only the cluster can activate the volume group. Volume group exclusive activation is not needed with clvmd.

Before configuring exclusive activation, make sure the cluster service is stopped. Follow the same process if you configured the iSCSI storage with the targetcli package on a Red Hat/CentOS distribution; apply it once the iSCSI target is configured.

  • Stop the cluster service on all nodes by running the following from one of the cluster nodes.
[root@node1 ~]# pcs cluster stop --all
node1.example.local: Stopping Cluster (pacemaker)...
node2.example.local: Stopping Cluster (pacemaker)...
node1.example.local: Stopping Cluster (corosync)...
node2.example.local: Stopping Cluster (corosync)...

  • Use the following command on both cluster nodes to disable and stop the lvm2-lvmetad service and change use_lvmetad = 1 to use_lvmetad = 0 in the /etc/lvm/lvm.conf file.
[root@node1 ~]# lvmconf --enable-halvm --services --startstopservices
[root@node2 ~]# lvmconf --enable-halvm --services --startstopservices

  • Execute the following command to list the volume groups (VGs).
On Node 1:
[root@node1 ~]# vgs --noheadings -o vg_name
  NFSVG  <<<<<<<<<< Cluster VG
  centos 

On Node 2:
  [root@node2 ~]# vgs --noheadings -o vg_name
  NFSVG  <<<<<<<<<< Cluster VG
  centos  

  • Edit the /etc/lvm/lvm.conf file on each cluster node and set volume_list to the volume groups (the OS VGs) that are not part of the cluster storage. This tells LVM not to activate the cluster VG during system startup.

  • In our setup there is one OS volume group, centos, and no other non-cluster volume groups.

[root@node1 ~]# vim /etc/lvm/lvm.conf
volume_list = [ "centos" ]

[root@node2 ~]# vim /etc/lvm/lvm.conf
volume_list = [ "centos" ]

Note
If the operating system (OS) does not use LVM, configure the volume_list parameter in lvm.conf as below:

  volume_list = []
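
  • After editing lvm.conf, a quick way to confirm that the volume_list entry is in place (and not commented out) on each node is a simple grep; in our setup it should return the line shown below.
[root@node1 ~]# grep -E '^[[:space:]]*volume_list' /etc/lvm/lvm.conf
volume_list = [ "centos" ]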

  • Execute the commands below on both cluster nodes to back up and rebuild the initramfs boot image, then reboot the cluster nodes. Once this is done, the OS will no longer try to activate the volume group (VG) controlled by the cluster.
On Node 1:
[root@node1 ~]# cp -a /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
[root@node1 ~]# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
[root@node1 ~]# reboot 

On Node 2:
[root@node2 ~]# cp -a /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
[root@node2 ~]# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
[root@node2 ~]# reboot

  • After the system reboot, run the lvscan command on all cluster nodes; the lv1 and lv2 logical volumes should be shown as inactive.

On Node 1:
[root@node1 ~]# lvscan 
  ACTIVE            '/dev/centos/swap' [1.00 GiB] inherit
  ACTIVE            '/dev/centos/root' [8.47 GiB] inherit
  inactive          '/dev/NFSVG/lv1' [1.00 GiB] inherit
  inactive          '/dev/NFSVG/lv2' [3.97 GiB] inherit

On Node 2:
[root@node2 ~]# lvscan
  ACTIVE            '/dev/centos/swap' [1.00 GiB] inherit
  ACTIVE            '/dev/centos/root' [8.47 GiB] inherit
  inactive          '/dev/NFSVG/lv1' [1.00 GiB] inherit
  inactive          '/dev/NFSVG/lv2' [3.97 GiB] inherit

  • Now start the cluster service on one of the cluster nodes.
On Node 1:
[root@node1 ~]# pcs cluster start --all
node1.example.local: Starting Cluster...
node2.example.local: Starting Cluster...

Configure NFS Resource [20]
  • Create the volume group resource and filesystem resources for the cluster on one of the cluster nodes:
On Node 1:
[root@node1 ~]# pcs resource create lvm-res LVM volgrpname="NFSVG" exclusive=true --group nfs-group
[root@node1 ~]# pcs resource create nfs-fs1_res Filesystem  device="/dev/NFSVG/lv1" directory="/nfsserverdir" fstype="xfs" --group nfs-group
[root@node1 ~]# pcs resource create nfs-fs2_res Filesystem  device="/dev/NFSVG/lv2" directory="/nfssharedir" fstype="xfs" --group nfs-group

  • Once the nfs-fs1_res and nfs-fs2_res resources are created, the cluster creates the /nfsserverdir and /nfssharedir directories and mounts the shared storage. Execute the df -h command to validate that the shared storage is mounted (a quick resource check follows the output below).
On Node 1:
[root@node1 ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  8.5G  1.2G  7.4G  14% /
devtmpfs                 911M     0  911M   0% /dev
tmpfs                    921M   54M  867M   6% /dev/shm
tmpfs                    921M  8.4M  912M   1% /run
tmpfs                    921M     0  921M   0% /sys/fs/cgroup
/dev/vda1                497M  123M  375M  25% /boot
tmpfs                    185M     0  185M   0% /run/user/0
/dev/mapper/NFSVG-lv1   1014M   33M  982M   4% /nfsserverdir
/dev/mapper/NFSVG-lv2    4.0G   33M  4.0G   1% /nfssharedir
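
  • To review the resources created so far and their parameters, you can inspect the resource group with pcs (on the RHEL 7 / CentOS 7 pcs 0.9 series the subcommand is show).
[root@node1 ~]# pcs resource show nfs-group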

Create Floating or Virtual IP [21]
  • Create a resource for the floating/virtual IP address of the NFS cluster, then create the NFS server resource. Execute the following commands on one of the cluster nodes.
On Node 1:
[root@node1 ~]# pcs resource create NFS-VIP ocf:heartbeat:IPaddr2 ip=192.168.5.23 nic="eth0" cidr_netmask=24 op monitor interval=30s --group nfs-group

[root@node1 ~]# pcs resource create nfsserver-res nfsserver nfs_shared_infodir=/nfsserverdir nfs_ip=192.168.5.23  --group nfs-group

  • The major benefit of the virtual IP is that if one cluster node goes down, the virtual IP resource is moved automatically to the other cluster node, so users working with the NFS server can keep accessing it without interruption (a quick check of the virtual IP is shown after the service status output below).

  • Verify the status of the NFS server.

On Node 1:
[root@node1 ~]# systemctl status nfs-server.service
● nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
   Active: active (exited) since Sat 2019-11-16 23:41:45 IST; 49min ago
  Process: 8548 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
  Process: 8544 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
 Main PID: 8548 (code=exited, status=0/SUCCESS)

Nov 16 23:41:45 node1.example.local systemd[1]: Starting NFS server and services...
Nov 16 23:41:45 node1.example.local systemd[1]: Started NFS server and services.
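
  • As mentioned above, you can optionally confirm that the virtual IP is plumbed on the active node and reachable from the network; the interface name eth0 comes from the NFS-VIP resource definition.
On Node 1:
[root@node1 ~]# ip addr show eth0 | grep 192.168.5.23
On a client in the 192.168.5.0/24 network:
# ping -c 3 192.168.5.23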

Export nfs shares [22]
  • Create the /nfssharedir/shares1 and /nfssharedir/shares2 directories on the node where the resource group is active, then run the commands below (an export check follows):
On Node 1:
[root@node1 ~]# mkdir -p /nfssharedir/shares1 /nfssharedir/shares2
[root@node1 ~]# pcs resource create nfs-share1 exportfs clientspec=192.168.5.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfssharedir/shares1 fsid=0 --group nfs-group
[root@node1 ~]# pcs resource create nfs-share2 exportfs clientspec=192.168.5.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfssharedir/shares2 fsid=1 --group nfs-group
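
  • As noted above, on the node where nfs-group is running the kernel export table should now list both shares; exportfs -v is a quick way to confirm.
[root@node1 ~]# exportfs -v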

  • Create an nfsnotify resource for sending NFSv3 reboot notifications once the entire NFS deployment has initialized.
[root@node1 ~]# pcs resource create notify-nfs nfsnotify source_host=192.168.5.23 --group nfs-group

Set Resource Order [23]
  • Set ordering constraints so the cluster resources start in the correct sequence. Run the following on one of the cluster nodes.
On Node 1:
[root@node1 ~]# pcs constraint order start lvm-res then nfs-fs1_res
Adding lvm-res nfs-fs1_res (kind: Mandatory) (Options: first-action=start then-action=start)

[root@node1 ~]# pcs constraint order start nfs-fs2_res then NFS-VIP
Adding nfs-fs2_res NFS-VIP (kind: Mandatory) (Options: first-action=start then-action=start)

[root@node1 ~]# pcs constraint order start NFS-VIP then nfsserver-res
Adding NFS-VIP nfsserver-res (kind: Mandatory) (Options: first-action=start then-action=start)

[root@node1 ~]# pcs constraint order start nfsserver-res then nfs-share1
Adding nfsserver-res nfs-share1 (kind: Mandatory) (Options: first-action=start then-action=start)

[root@node1 ~]# pcs constraint order start nfsserver-res then nfs-share2
Adding nfsserver-res nfs-share2 (kind: Mandatory) (Options: first-action=start then-action=start)

[root@node1 ~]# pcs constraint order start nfsserver-res then notify-nfs
Adding nfsserver-res notify-nfs (kind: Mandatory) (Options: first-action=start then-action=start)

  • Execute the below command on one of the cluster nodes to view cluster resource order.
On Node 1:
[root@node1 ~]# pcs constraint list
Location Constraints:
  Resource: fencedev1
    Enabled on: node1.example.local (score:INFINITY)
  Resource: fencedev2
    Enabled on: node2.example.local (score:INFINITY)
Ordering Constraints:
  start lvm-res then start nfs-fs1_res (kind:Mandatory)
  start nfs-fs2_res then start NFS-VIP (kind:Mandatory)
  start NFS-VIP then start nfsserver-res (kind:Mandatory)
  start nfsserver-res then start nfs-share1 (kind:Mandatory)
  start nfsserver-res then start nfs-share2 (kind:Mandatory)
  start nfsserver-res then start notify-nfs (kind:Mandatory)
Colocation Constraints:
Ticket Constraints:

  • Verify cluster status:
[root@node1 ~]# pcs status 
Cluster name: cluster1
Stack: corosync
Current DC: node1.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sun Nov 17 01:00:25 2019      Last change: Sun Nov 17 00:56:59 2019 by root via cibadmin on node1.example.local

2 nodes and 10 resources configured

Online: [ node1.example.local node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local
 Resource Group: nfs-group
     lvm-res    (ocf::heartbeat:LVM):   Started node1.example.local
     nfs-fs1_res    (ocf::heartbeat:Filesystem):    Started node1.example.local
     nfs-fs2_res    (ocf::heartbeat:Filesystem):    Started node1.example.local
     NFS-VIP    (ocf::heartbeat:IPaddr2):   Started node1.example.local
     nfsserver-res  (ocf::heartbeat:nfsserver): Started node1.example.local
     nfs-share1 (ocf::heartbeat:exportfs):  Started node1.example.local
     nfs-share2 (ocf::heartbeat:exportfs):  Started node1.example.local
     notify-nfs (ocf::heartbeat:nfsnotify): Started node1.example.local

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

  • Now run the lvscan command on the active cluster node to validate that lv1 and lv2 are shown as ACTIVE.
[root@node1 ~]# lvscan 
  ACTIVE            '/dev/centos/swap' [1.00 GiB] inherit
  ACTIVE            '/dev/centos/root' [8.47 GiB] inherit
  ACTIVE            '/dev/NFSVG/lv1' [1.00 GiB] inherit
  ACTIVE            '/dev/NFSVG/lv2' [3.97 GiB] inherit

Firewall Configuration on Cluster nodes [24]
  • Add the following firewall rules on each cluster node to allow NFS traffic. If firewalld is disabled on the cluster nodes, these commands are not needed.
On Node 1:
[root@node1 ~]# firewall-cmd --permanent --add-service=nfs
[root@node1 ~]# firewall-cmd --permanent --add-service=mountd
[root@node1 ~]# firewall-cmd --permanent --add-service=rpc-bind
[root@node1 ~]# firewall-cmd --reload

On Node 2:
[root@node2 ~]# firewall-cmd --permanent --add-service=nfs
[root@node2 ~]# firewall-cmd --permanent --add-service=mountd
[root@node2 ~]# firewall-cmd --permanent --add-service=rpc-bind
[root@node2 ~]# firewall-cmd --reload


Cluster Configuration Validation and Testing

  • It should be standard practice to verify the cluster configuration after making any changes. Use the command below to check it.
[root@node1 ~]# crm_verify -L -V

Test NFS Share
  • Mount the NFS shares on a client system with the following steps.
View NFS exports:
# showmount -e 192.168.5.23
Export list for 192.168.5.23:
/nfssharedir/shares1 192.168.5.0/255.255.255.0
/nfssharedir/shares2 192.168.5.0/255.255.255.0
  • Create the mount point directories on the client system and mount the server exports.
# mkdir /tmp/test_nfs /tmp/test_nfs2
# mount -t nfs 192.168.5.23:/nfssharedir/shares1 /tmp/test_nfs
# mount -t nfs 192.168.5.23:/nfssharedir/shares2 /tmp/test_nfs2/

 # df -h
 Filesystem                                            Size   Used Avail Use%   Mounted on
 192.168.5.23:/nfssharedir/shares1  4.0G   32M  8.0G   1% /tmp/test_nfs
 192.168.5.23:/nfssharedir/shares2  4.0G   32M  8.0G   1% /tmp/test_nfs2
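
  • A simple write test from the client confirms that the export options (rw, no_root_squash) behave as expected.
 # touch /tmp/test_nfs/file1 /tmp/test_nfs2/file2
 # ls -l /tmp/test_nfs /tmp/test_nfs2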


Cluster Failover Test [28]
  • View cluster status
[root@node1 ~]# crm_mon -r1
Stack: corosync
Current DC: node1.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sun Nov 17 09:43:29 2019      Last change: Sun Nov 17 01:27:43 2019 by root via crm_attribute on node1.example.local

2 nodes and 10 resources configured

Online: [ node1.example.local node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local
 Resource Group: nfs-group
     lvm-res    (ocf::heartbeat:LVM):   Started node1.example.local
     nfs-fs1_res    (ocf::heartbeat:Filesystem):    Started node1.example.local
     nfs-fs2_res    (ocf::heartbeat:Filesystem):    Started node1.example.local
     NFS-VIP    (ocf::heartbeat:IPaddr2):   Started node1.example.local
     nfsserver-res  (ocf::heartbeat:nfsserver): Started node1.example.local
     nfs-share1 (ocf::heartbeat:exportfs):  Started node1.example.local
     nfs-share2 (ocf::heartbeat:exportfs):  Started node1.example.local
     notify-nfs (ocf::heartbeat:nfsnotify): Started node1.example.local

  • In the above output, we see that the cluster resources are running on node1.

  • Run the command below to put node1 into standby mode, then run crm_mon -r1 again to view the cluster status.

  • You will see that once node1 is put into standby mode, the resources move from node1 to node2 within a few seconds.

On Node 1:
[root@node1 ~]# pcs cluster standby node1.example.local

[root@node1 ~]# crm_mon -r1
Stack: corosync
Current DC: node1.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sun Nov 17 09:46:54 2019      Last change: Sun Nov 17 09:46:31 2019 by root via crm_attribute on node1.example.local

2 nodes and 10 resources configured

Node node1.example.local: standby
Online: [ node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node2.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local
 Resource Group: nfs-group
     lvm-res    (ocf::heartbeat:LVM):   Started node2.example.local
     nfs-fs1_res    (ocf::heartbeat:Filesystem):    Started node2.example.local
     nfs-fs2_res    (ocf::heartbeat:Filesystem):    Started node2.example.local
     NFS-VIP    (ocf::heartbeat:IPaddr2):   Started node2.example.local
     nfsserver-res  (ocf::heartbeat:nfsserver): Started node2.example.local
     nfs-share1 (ocf::heartbeat:exportfs):  Started node2.example.local
     nfs-share2 (ocf::heartbeat:exportfs):  Started node2.example.local
     notify-nfs (ocf::heartbeat:nfsnotify): Started node2.example.local

  • Run the following command to remove node1 from standby mode then verify the cluster status.
On Node 1:
[root@node1 ~]# pcs cluster unstandby node1.example.local

[root@node1 ~]# crm_mon -r1
Stack: corosync
Current DC: node1.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sun Nov 17 09:49:28 2019      Last change: Sun Nov 17 09:49:20 2019 by root via crm_attribute on node1.example.local

2 nodes and 10 resources configured

Online: [ node1.example.local node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local
 Resource Group: nfs-group
     lvm-res    (ocf::heartbeat:LVM):   Started node2.example.local
     nfs-fs1_res    (ocf::heartbeat:Filesystem):    Started node2.example.local
     nfs-fs2_res    (ocf::heartbeat:Filesystem):    Started node2.example.local
     NFS-VIP    (ocf::heartbeat:IPaddr2):   Started node2.example.local
     nfsserver-res  (ocf::heartbeat:nfsserver): Started node2.example.local
     nfs-share1 (ocf::heartbeat:exportfs):  Started node2.example.local
     nfs-share2 (ocf::heartbeat:exportfs):  Started node2.example.local
     notify-nfs (ocf::heartbeat:nfsnotify): Started node2.example.local

  • We can also test failover by rebooting the cluster node where the resources are running and then verifying the cluster status (an alternative using pcs resource move is shown after the output below).
On Node 2:
[root@node2 ~]# reboot

On Node 1:
[root@node1 ~]# crm_mon -r1
Stack: corosync
Current DC: node1.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sun Nov 17 09:53:06 2019      Last change: Sun Nov 17 09:49:20 2019 by root via crm_attribute on node1.example.local

2 nodes and 10 resources configured

Online: [ node1.example.local node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local
 Resource Group: nfs-group
     lvm-res    (ocf::heartbeat:LVM):   Started node1.example.local
     nfs-fs1_res    (ocf::heartbeat:Filesystem):    Started node1.example.local
     nfs-fs2_res    (ocf::heartbeat:Filesystem):    Started node1.example.local
     NFS-VIP    (ocf::heartbeat:IPaddr2):   Started node1.example.local
     nfsserver-res  (ocf::heartbeat:nfsserver): Started node1.example.local
     nfs-share1 (ocf::heartbeat:exportfs):  Started node1.example.local
     nfs-share2 (ocf::heartbeat:exportfs):  Started node1.example.local
     notify-nfs (ocf::heartbeat:nfsnotify): Started node1.example.local
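
  • As referenced above, another way to trigger a controlled failover is to move the resource group explicitly. pcs resource move creates a temporary location constraint pinning the group to the target node; pcs resource clear removes that constraint afterwards so the cluster can manage placement again. A sketch:
On Node 1:
[root@node1 ~]# pcs resource move nfs-group node2.example.local
[root@node1 ~]# crm_mon -r1
[root@node1 ~]# pcs resource clear nfs-group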


It took me a few days to learn the Pacemaker cluster and create this document. I would appreciate your suggestions and comments on this article.


