Replace Volume|Disk On DRBD Pacemaker Cluster

A volume or disk backing a DRBD cluster can fail at any time. To maintain high availability, the failed volume or disk must be replaced without affecting the cluster service. In this article we demonstrate how to replace a failed volume or disk on a DRBD Pacemaker cluster.


Topic

  • How to replace a failed Volume|Disk on DRBD Pacemaker Cluster?
  • DRBD Replace Volume|Disk on Pacemaker Cluster
  • DRBD Replace failed Volume|Disk


Solution


Stop the Pacemaker Cluster Service

  • Before making any changes, first stop the cluster service on all nodes.
[root@node2 ~]# pcs cluster stop --all
node1.example.local: Stopping Cluster (pacemaker)...
node2.example.local: Stopping Cluster (pacemaker)...
node1.example.local: Stopping Cluster (corosync)...
node2.example.local: Stopping Cluster (corosync)...
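
  • Optionally, confirm that Pacemaker and Corosync are really down on both nodes before touching the storage; the standard systemd check below should report inactive on each node:
[root@node1 ~]# systemctl is-active pacemaker corosync
[root@node2 ~]# systemctl is-active pacemaker corosync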

  • Detach the DRBD resource on the node with the failed volume and check the DRBD resource status on both nodes.
[root@node1 ~]# drbdadm detach mysql
[root@node1 ~]# drbdadm status
mysql role:Secondary
  disk:Diskless
  node2.example.local role:Secondary
    peer-disk:UpToDate

[root@node2 ~]# drbdadm status
mysql role:Secondary
  disk:UpToDate
  node1.example.local role:Secondary
    peer-disk:Diskless
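
  • If you are unsure which backing device the mysql resource uses (and therefore which physical disk needs to be replaced), dump the resource configuration and look at the disk definition:
[root@node1 ~]# drbdadm dump mysql | grep -i disk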

  • Replace the failed device on the cluster node and start the DRBD service on both nodes (a sketch of the device replacement step follows the note below).
[root@node1 ~]# systemctl start drbd
[root@node2 ~]# systemctl start drbd

[root@node1 ~]# drbdadm status
mysql role:Secondary
  disk:Diskless
  node2.example.local role:Secondary
    peer-disk:UpToDate

[root@node2 ~]# drbdadm status
mysql role:Secondary
  disk:UpToDate
  node1.example.local role:Secondary
    peer-disk:Diskless

  • Note: Once we start the DRBD service, the output shows that the DRBD resource is UpToDate on node2 and Diskless on node1.
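
  • The device replacement itself depends on the local storage layout and is not shown above. As a minimal sketch, assuming the DRBD backing device is an LVM logical volume (the names vg_drbd, lv_mysql and the new physical disk /dev/sdb here are only examples), the replacement disk could be prepared like this before starting the DRBD service:
[root@node1 ~]# pvcreate /dev/sdb
[root@node1 ~]# vgcreate vg_drbd /dev/sdb
[root@node1 ~]# lvcreate -n lv_mysql -l 100%FREE vg_drbd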

  • Create new metadata on the node with the replaced volume and re-attach the resource.

[root@node1 ~]# drbdadm create-md mysql
initializing activity log
initializing bitmap (320 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.
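
  • If the replacement device is not completely empty, drbdadm create-md may detect an existing filesystem or old metadata and ask for confirmation before overwriting it. On a device you are sure is the new replacement disk, the prompt can be answered with yes, or skipped non-interactively with the --force option (use with care):
[root@node1 ~]# drbdadm create-md --force mysql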

[root@node1 ~]# drbdadm attach mysql

[root@node1 ~]# drbdadm status
mysql role:Secondary
  disk:Inconsistent
  node2.example.local role:Secondary
    replication:SyncTarget peer-disk:UpToDate done:1.81

  • Note: Once we re-attach the DRBD resource, volume synchronization starts automatically. Run the drbdsetup status -vs command under watch to view the synchronization in real time; a scripted alternative is sketched below.
[root@node1 ~]# watch -n .2 drbdsetup status -vs
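
  • Instead of watching the status by hand, the wait for resynchronization can also be scripted. A rough sketch that simply polls drbdadm status for the mysql resource until neither Inconsistent nor SyncTarget is reported:
while drbdadm status mysql | grep -qE 'Inconsistent|SyncTarget'; do
    sleep 10
done
echo "mysql resource is fully synchronized"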

  • The DRBD synchronization process takes time. Monitor the synchronization status and, once the synchronization has completed, promote one of the cluster nodes to primary.
[root@node1 ~]# drbdadm status
mysql role:Secondary
  disk:UpToDate
  node2.example.local role:Secondary
    peer-disk:UpToDate

[root@node2 ~]# drbdadm status
mysql role:Secondary
  disk:UpToDate
  node1.example.local role:Secondary
    peer-disk:UpToDate

[root@node1 ~]# drbdadm primary mysql 
[root@node1 ~]# drbdadm status
mysql role:Primary
  disk:UpToDate
  node2.example.local role:Secondary
    peer-disk:UpToDate

  • Mount the DRBD volume on the primary node and check the data.
[root@node1 ~]# mount /dev/drbd0 /mnt/
[root@node1 ~]# ls /mnt/
aria_log.00000001  ibdata1      ib_logfile1  performance_schema  test1  test3
aria_log_control   ib_logfile0  mysql        run                 test2  test4

  • The above output confirms that the data on the volume is intact and accessible.

  • Unmount the DRBD volume, demote the resource to secondary, and stop the DRBD service on both cluster nodes.
[root@node1 ~]# umount /mnt/
[root@node1 ~]# drbdadm secondary mysql 
[root@node1 ~]# drbdadm status
mysql role:Secondary
  disk:UpToDate
  node2.example.local role:Secondary
    peer-disk:UpToDate

[root@node1 ~]# systemctl stop drbd
[root@node2 ~]# systemctl stop drbd
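
  • Because DRBD is managed by Pacemaker through the drbd_master_slave resource, the drbd systemd service should remain disabled at boot and only be started manually for maintenance like this. A quick check on both nodes (expected to report disabled):
[root@node1 ~]# systemctl is-enabled drbd
[root@node2 ~]# systemctl is-enabled drbd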

  • Now start the cluster service and check the resources.
[root@node1 ~]# pcs cluster start --all

[root@node1 ~]# pcs status
Cluster name: mysqlcluster
Stack: corosync
Current DC: node2.example.local (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sun Dec 15 21:29:35 2019      Last change: Sun Dec 15 20:26:38 2019 by root via cibadmin on node1.example.local

2 nodes and 7 resources configured

Online: [ node1.example.local node2.example.local ]

Full list of resources:

 fencedev1  (stonith:fence_xvm):    Started node1.example.local
 fencedev2  (stonith:fence_xvm):    Started node2.example.local
 Master/Slave Set: drbd_master_slave [drbd_mysql]
     Masters: [ node1.example.local ]
     Slaves: [ node2.example.local ]
 Resource Group: mysql-group
     mysql_fs   (ocf::heartbeat:Filesystem):    Started node1.example.local
     mysql_vip  (ocf::heartbeat:IPaddr2):   Started node1.example.local
     mysql-server   (ocf::heartbeat:mysql): Started node1.example.local

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
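
  • As a final sanity check on the active node, confirm that the cluster has mounted the DRBD device and brought up the floating IP managed by mysql_vip (the mount point and address depend on your configuration):
[root@node1 ~]# mount | grep drbd0
[root@node1 ~]# ip addr show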

