
MySQL HA on SUSE with Pacemaker + Corosync + DRBD


2014-12-16 by dongnan

Environment

Storage

# Use partitions or disks of the same size on both nodes; this example uses a whole disk:
Physical disk: /dev/sdb
DRBD device: /dev/drbd0

Network

# Host names
grep -E "^172\." /etc/hosts
172.27.233.33   pn33
172.27.233.34   pn34

# Pacemaker network
NIC  : eth0
CIDR : 172.27.233.0/24
pn33 : 172.27.233.33
pn34 : 172.27.233.34
VIP  : 172.27.233.35

# DRBD network
NIC  : eth1
CIDR : 10.0.0.0/24
pn33 (primary node): 10.0.0.33
pn34 (backup node) : 10.0.0.34
Port : TCP 20000

Software versions

# Operating system
lsb_release -d
Description:    SUSE Linux Enterprise Server 11 (x86_64)

# Software bundle
suse-mysql-ha.tar.gz

# drbd
drbdadm --version | tail -n1
DRBDADM_VERSION=8.4.4

# mysql
rpm -qa | grep mysql-server
mysql-server-5.1.52-1.el6_0.1.x86_64

# pacemaker
rpm -qa | grep pacemaker-
pacemaker-1.1.9-1.15

# corosync
rpm -qa | grep corosync-
corosync-1.4.7-1.1

Time synchronization

Synchronize the clocks of all nodes against an NTP server:

sntp -P no -r 172.27.233.45

DRBD

Installation

tar zxvf drbd-8.4.4.tar.gz
cd drbd-8.4.4/
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-km
make
make install

Loading the module fails with an error

modprobe drbd

FATAL: module '/lib/modules/3.0.13-0.27-default/updates/drbd.ko' is unsupported
Use --allow-unsupported or set allow_unsupported_modules to 1 in
/etc/modprobe.d/unsupported-modules

Allow unsupported modules

sed -i 's/allow_unsupported_modules 0/allow_unsupported_modules 1/' /etc/modprobe.d/unsupported-modules

Load the module

modprobe drbd
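
The sed edit above can be wrapped so that it is safe to re-run and reports whether the setting actually took effect. This is a minimal sketch: the function name `allow_unsupported` is hypothetical, and the default path is the one named in the error message.

```shell
#!/bin/sh
# Hypothetical helper: enable unsupported kernel modules in the given
# modprobe config file (default path taken from the error message above).
allow_unsupported() {
    conf="${1:-/etc/modprobe.d/unsupported-modules}"
    # Flip 0 -> 1 only on the active (non-comment) setting line.
    sed -i 's/^allow_unsupported_modules[[:space:]]*0/allow_unsupported_modules 1/' "$conf"
    # Succeed only if the setting is now 1.
    grep -q '^allow_unsupported_modules[[:space:]]*1' "$conf"
}

# Usage on each node (as root):
#   allow_unsupported && modprobe drbd
```

Running it a second time changes nothing and still returns success, so it can sit in a provisioning script for both nodes.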

Module information

modinfo drbd

filename:       /lib/modules/3.0.13-0.27-default/updates/drbd.ko
alias:          block-major-147-*
license:        GPL
version:        8.4.4
description:    drbd - Distributed Replicated Block Device v8.4.4
....

Note: run all of the steps above on both the primary node and the backup node.

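Configuration

Global configuration file

# Edit the file
vim /etc/drbd.d/global_common.conf

# The file already contains some predefined values.
# Edit the startup, disk and net sections and add values like the following: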

startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        wfc-timeout 120;
        degr-wfc-timeout 120;
}

disk {
        # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
        # no-disk-drain no-md-flushes max-bio-bvecs
        on-io-error detach;
}

net {
        protocol C;
        verify-alg md5;
        # ... (rest omitted)
}

Create the resource file

# Edit the file
vim /etc/drbd.d/mysql.res

# Add the following content
resource mysql {

     on  pn33 {
         device /dev/drbd0;
         disk /dev/sdb;
         meta-disk internal;
         address 10.0.0.33:20000;
     }

     on  pn34 {
         device /dev/drbd0;
         disk /dev/sdb;
         meta-disk internal;
         address 10.0.0.34:20000;
     }
}

Check the configuration syntax

drbdadm dump all

If the command returns an error, recheck the configuration files.

Backup node

Copy the configuration files to pn34:

scp -r /etc/drbd.conf root@pn34:/etc/
scp -r /etc/drbd.d/* root@pn34:/etc/drbd.d/

Start DRBD

Run the following commands on each of the two nodes:

drbdadm create-md mysql
/etc/init.d/drbd start

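Note: run these commands on both nodes. A node started on its own waits for its peer to come online until the timeout expires (wfc-timeout, 120 seconds in the configuration above).

Initial DRBD state

cat /proc/drbd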

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@fat-tyre, 2013-09-20 17:45:16
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:5242684

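Note: the disks of the two nodes are not yet synchronized (Inconsistent), and both nodes are in the Secondary role. The sample output reports DRBD 8.3.16, so it appears to have been captured on a different build than the 8.4.4 installed above; the state fields read the same way.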
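
To check these fields from a script rather than by eye, the status line can be split into its connection state (cs), roles (ro) and disk states (ds). A minimal sketch, assuming the /proc/drbd layout shown above; the function name `drbd_state` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "cs ro ds" for one DRBD minor.
# $1 = minor number (default 0), $2 = status file (default /proc/drbd).
drbd_state() {
    minor="${1:-0}"
    file="${2:-/proc/drbd}"
    awk -v m="${minor}:" '
        $1 == m {
            # Pick the three key:value fields out of the status line.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, ro, ds
        }' "$file"
}

# Usage: drbd_state 0
```

In each of the ro and ds pairs, the value before the slash describes the local node and the value after it describes the peer.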

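Synchronize the data

Make pn33 the primary node and start the initial synchronization (run this on pn33 only):

drbdadm -- --overwrite-data-of-peer primary mysql

DRBD state during synchronization

cat /proc/drbd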

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@fat-tyre, 2013-09-20 17:45:16
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:685376 nr:0 dw:0 dr:685376 al:0 bm:42 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4412348
    [=>..................] sync'ed: 13.5% (4308/4976)M
    finish: 0:02:08 speed: 34,268 (34,268) K/sec

Synchronization complete

cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@fat-tyre, 2013-09-20 17:45:16
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:5097724 nr:0 dw:0 dr:5097724 al:0 bm:312 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: the disk state changes to UpToDate/UpToDate.
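
Instead of re-running cat by hand, a script can poll until both disks report UpToDate before it goes on to create the file system. A minimal sketch, assuming a single DRBD resource as in this article; the function name `wait_for_sync` and its defaults are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: poll until the status file shows ds:UpToDate/UpToDate,
# or give up after a number of tries.
# $1 = status file (default /proc/drbd), $2 = max tries, $3 = seconds between tries.
wait_for_sync() {
    file="${1:-/proc/drbd}"
    tries="${2:-60}"
    delay="${3:-5}"
    while [ "$tries" -gt 0 ]; do
        if grep -q 'ds:UpToDate/UpToDate' "$file"; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage: wait_for_sync && echo "both disks are UpToDate"
```

With several resources, the grep would need to be restricted to one minor; this sketch does not do that.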

Create a file system on the DRBD device

Once synchronization has finished, create the file system on pn33, whose DRBD role is Primary:

mkfs.ext3 /dev/drbd0

Test DRBD replication

Primary node

mkdir /mysqldata
mount /dev/drbd0 /mysqldata

Write a file

hostname > /mysqldata/file

Unmount the device and demote the node from primary to secondary

umount /mysqldata
drbdadm secondary mysql

Backup node

Promote pn34 to primary and mount the drbd0 device.

mkdir /mysqldata
drbdadm primary mysql
mount /dev/drbd0 /mysqldata

Read/write test

# Read the data
cat /mysqldata/file
pn33    # output

# Write data
hostname >> /mysqldata/file

# Success
cat /mysqldata/file

pn33
pn34

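Unmount the DRBD device

umount /mysqldata
drbdadm secondary mysql

Note: data written on pn33 can be read and extended on pn34 once pn34 mounts the device, which shows that replication works.

Notes

  • Only one of the two nodes may be Primary at any time; the other is Secondary.
  • Switch a node to Primary before mounting the DRBD device on it.
  • The DRBD device cannot be mounted on a node in the Secondary role.
  • Make sure the DRBD device is unmounted before changing roles.
  • Disable the drbd service at boot on both nodes; Pacemaker starts DRBD.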
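
The rule about unmounting before a role change is easy to enforce in a small wrapper. A minimal sketch: `is_mounted` and `safe_demote` are hypothetical names, and the check simply looks for the device in /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device appears as a mount source
# in the given mounts table (default /proc/mounts).
is_mounted() {
    dev="$1"
    table="${2:-/proc/mounts}"
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Hypothetical wrapper: demote the resource only when its device is unmounted.
safe_demote() {
    res="$1"
    dev="$2"
    if is_mounted "$dev"; then
        echo "refusing to demote $res: $dev is still mounted" >&2
        return 1
    fi
    drbdadm secondary "$res"
}

# Usage: safe_demote mysql /dev/drbd0
```

Once Pacemaker manages the resource, role changes should go through the cluster, so a wrapper like this is only meant for the manual tests in this section.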

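MySQL

Installation

MySQL is installed automatically by the installation package.

Configuration

Stop the service

/etc/init.d/mysql stop

Change the data directory

# Edit the configuration file
vim /etc/my.cnf

# Point the data directory at the DRBD mount point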
[mysqld]
datadir=/mysqldata

Note: make sure the MySQL data directory is located on the DRBD device.
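
A quick way to confirm the setting on both nodes is to read datadir back out of the [mysqld] section. A minimal sketch, assuming a plain key=value my.cnf without include directives; the function name `mysql_datadir` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the datadir value from the [mysqld] section.
# $1 = config file (default /etc/my.cnf).
mysql_datadir() {
    cnf="${1:-/etc/my.cnf}"
    awk -F= '
        # Track whether the current section is exactly [mysqld].
        /^\[/ { in_mysqld = ($0 ~ /^\[mysqld\]/) }
        in_mysqld && $1 ~ /^[ \t]*datadir[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
        }' "$cnf"
}

# Usage: [ "$(mysql_datadir)" = "/mysqldata" ] && echo "datadir is on the DRBD mount point"
```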

Verification

Mount the DRBD device

# If pn33 is not already in the primary role, promote it first:
drbdadm primary mysql
mount /dev/drbd0 /mysqldata

Copy the data to drbd0

cp -a /var/lib/mysql/* /mysqldata/
chown -R mysql:mysql /mysqldata/

Note: this assumes MySQL has been started before and has already created its data files; if it has never been started, skip this step.

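Test MySQL

# Start the database
/etc/init.d/mysql start

# Check the log
tail /var/log/mysql/mysqld.log

# Stop the database
/etc/init.d/mysql stop

Note: no error messages in the log means MySQL is working; stop the mysql service when you are done.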

Backup node

On pn34:

  • Stop MySQL
  • Change the MySQL data directory
  • Change the owner: chown -R mysql:mysql /mysqldata/

Notes

  • On SUSE the MySQL service is named mysql (not mysqld).
  • Disable the mysql service at boot; Pacemaker starts MySQL.

Corosync

This part is lengthy; see the PDF document for details.
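
For orientation only, here is what a minimal corosync.conf for this two-node layout might look like, written out by a small shell function so it can be reviewed before use. This is a sketch, not the configuration from the PDF: only the 172.27.233.0 network comes from this article, and the multicast address, port and remaining settings are illustrative assumptions. The function name `write_corosync_conf` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal corosync.conf sketch to the given path.
# ASSUMPTIONS: mcastaddr, mcastport and the other tunables are illustrative;
# only bindnetaddr (the Pacemaker network) is taken from this article.
write_corosync_conf() {
    cat > "$1" <<'EOF'
totem {
    version: 2
    secauth: off
    interface {
        ringnumber: 0
        bindnetaddr: 172.27.233.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
}
logging {
    to_syslog: yes
}
service {
    # Load Pacemaker as a plugin ("with plugin" in the crm status output).
    name: pacemaker
    ver: 0
}
EOF
}

# Usage: write_corosync_conf /tmp/corosync.conf.sketch
# Review the result, then merge what applies into
# /etc/corosync/corosync.conf on both nodes.
```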

Pacemaker

This part is lengthy; see the PDF document for details.
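
As a rough guide to what the PDF covers, the sketch below writes a crm configuration that would produce the resources seen in the failover test output (ms-drbd-mysql, fs-drbd-mysql, service-mysql, vip-mysql). The resource names, the Filesystem, lsb:mysql and IPaddr agents, the device, mount point and VIP come from this article. The ocf:linbit:drbd agent, monitor intervals, constraint names and the two cluster properties are assumptions, and `write_crm_config` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: write a crm configuration sketch to the given path.
# ASSUMPTIONS: the ocf:linbit:drbd agent, monitor intervals, constraint names
# and the property line are illustrative, not taken from the original setup.
write_crm_config() {
    cat > "$1" <<'EOF'
primitive drbd-mysql ocf:linbit:drbd \
    params drbd_resource="mysql" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
ms ms-drbd-mysql drbd-mysql \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive fs-drbd-mysql ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/mysqldata" fstype="ext3"
primitive service-mysql lsb:mysql
primitive vip-mysql ocf:heartbeat:IPaddr \
    params ip="172.27.233.35" nic="eth0" cidr_netmask="24"
colocation fs-with-drbd-master inf: fs-drbd-mysql ms-drbd-mysql:Master
order fs-after-drbd-promote inf: ms-drbd-mysql:promote fs-drbd-mysql:start
colocation mysql-with-fs inf: service-mysql fs-drbd-mysql
order mysql-after-fs inf: fs-drbd-mysql service-mysql
colocation vip-with-mysql inf: vip-mysql service-mysql
property stonith-enabled="false" no-quorum-policy="ignore"
EOF
}

# Usage: write_crm_config /tmp/mysql-ha.crm
# After reviewing the file: crm configure load update /tmp/mysql-ha.crm
```

The constraints keep the file system on the DRBD master, MySQL on the node with the file system, and the VIP with MySQL, which is the chain the failover test exercises. Disabling STONITH keeps the sketch short; a production cluster would normally configure fencing.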

Failover test

Precondition: check the boot-time state of the services.

chkconfig | grep -E "mysql|drbd|openais"
drbd                 off
mysql                off
openais              on
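
The same precondition can be asserted in a script before each test run. A minimal sketch, assuming chkconfig prints one "name state" pair per line as shown above; the function name `check_boot_services` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "service state" lines on stdin and verify
# that drbd and mysql are off and openais is on.
check_boot_services() {
    awk '
        $1 == "drbd"    { seen++; if ($2 != "off") bad = 1 }
        $1 == "mysql"   { seen++; if ($2 != "off") bad = 1 }
        $1 == "openais" { seen++; if ($2 != "on")  bad = 1 }
        # Fail if any state is wrong or one of the three services is missing.
        END { exit (bad || seen != 3) }'
}

# Usage: chkconfig | check_boot_services && echo "boot settings OK"
```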

Scenario: reboot the master

Primary node

crm(live)# status

Last updated: Fri Nov  7 11:59:27 2014
Last change: Fri Nov  7 11:46:29 2014 by root via cibadmin on pn33
Stack: classic openais (with plugin)
Current DC: pn34 - partition with quorum
Version: 1.1.9-2db99f1
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ pn33 pn34 ]

 Master/Slave Set: ms-drbd-mysql [drbd-mysql]
     Masters: [ pn33 ]
     Slaves: [ pn34 ]
 fs-drbd-mysql    (ocf::heartbeat:Filesystem):    Started pn33
 service-mysql    (lsb:mysql):    Started pn33
 vip-mysql    (ocf::heartbeat:IPaddr):    Started pn33

Reboot the system

sync && reboot

Backup node

crm status

Last updated: Fri Nov  7 12:03:21 2014
Last change: Fri Nov  7 11:46:29 2014 by root via cibadmin on pn33
Stack: classic openais (with plugin)
Current DC: pn34 - partition WITHOUT quorum
Version: 1.1.9-2db99f1
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ pn34 ]
OFFLINE: [ pn33 ]

 Master/Slave Set: ms-drbd-mysql [drbd-mysql]
     Masters: [ pn34 ]
     Stopped: [ drbd-mysql:1 ]
 fs-drbd-mysql    (ocf::heartbeat:Filesystem):   Started pn34
 service-mysql    (lsb:mysql):    Started pn34
 vip-mysql    (ocf::heartbeat:IPaddr):    Started pn34

The kernel log records the role switch

Nov  7 17:44:04 pn34 kernel: [690180.686342] block drbd0: peer( Primary -> Secondary )
Nov  7 17:44:05 pn34 kernel: [690180.788265] drbd mysql: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Nov  7 17:44:05 pn34 kernel: [690180.788286] drbd mysql: asender terminated
Nov  7 17:44:05 pn34 kernel: [690180.788289] drbd mysql: Terminating drbd_a_mysql
Nov  7 17:44:05 pn34 kernel: [690180.789834] drbd mysql: Connection closed
Nov  7 17:44:05 pn34 kernel: [690180.789840] drbd mysql: conn( TearDown -> Unconnected )
Nov  7 17:44:05 pn34 kernel: [690180.789841] drbd mysql: receiver terminated
Nov  7 17:44:05 pn34 kernel: [690180.789844] drbd mysql: Restarting receiver thread
Nov  7 17:44:05 pn34 kernel: [690180.789846] drbd mysql: receiver (re)started
Nov  7 17:44:05 pn34 kernel: [690180.789849] drbd mysql: conn( Unconnected -> WFConnection )
Nov  7 17:44:05 pn34 kernel: [690180.920674] block drbd0: role( Secondary -> Primary )
Nov  7 17:44:05 pn34 kernel: [690180.921877] block drbd0: new current UUID B53401839EF2D3DF:633AB21C6BAEC118:6EE1E4BB4DEF0160:6EE0E4BB4DEF0160
Nov  7 17:44:06 pn34 kernel: [690182.127729] kjournald starting.  Commit interval 15 seconds
Nov  7 17:44:06 pn34 kernel: [690182.137531] EXT3-fs (drbd0): using internal journal
Nov  7 17:44:06 pn34 kernel: [690182.137534] EXT3-fs (drbd0): mounted filesystem with ordered data mode
Nov  7 17:44:44 pn34 kernel: [690220.039499] drbd mysql: Handshake successful: Agreed network protocol version 101
Nov  7 17:44:44 pn34 kernel: [690220.039502] drbd mysql: Agreed to support TRIM on protocol level
Nov  7 17:44:44 pn34 kernel: [690220.039534] drbd mysql: conn( WFConnection -> WFReportParams )
Nov  7 17:44:44 pn34 kernel: [690220.039536] drbd mysql: Starting asender thread (from drbd_r_mysql [19024])
Nov  7 17:44:44 pn34 kernel: [690220.040081] block drbd0: drbd_sync_handshake:
Nov  7 17:44:44 pn34 kernel: [690220.040084] block drbd0: self B53401839EF2D3DF:633AB21C6BAEC118:6EE1E4BB4DEF0160:6EE0E4BB4DEF0160 bits:51 flags:0
Nov  7 17:44:44 pn34 kernel: [690220.040086] block drbd0: peer 633AB21C6BAEC118:0000000000000000:6EE1E4BB4DEF0160:6EE0E4BB4DEF0160 bits:0 flags:0
Nov  7 17:44:44 pn34 kernel: [690220.040089] block drbd0: uuid_compare()=1 by rule 70
Nov  7 17:44:44 pn34 kernel: [690220.040092] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Nov  7 17:44:44 pn34 kernel: [690220.040206] block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 42(1), total 42; compression: 100.0%
Nov  7 17:44:44 pn34 kernel: [690220.040991] block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 42(1), total 42; compression: 100.0%
Nov  7 17:44:44 pn34 kernel: [690220.040998] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
Nov  7 17:44:44 pn34 kernel: [690220.041920] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
Nov  7 17:44:44 pn34 kernel: [690220.041930] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
Nov  7 17:44:44 pn34 kernel: [690220.041935] block drbd0: Began resync as SyncSource (will sync 204 KB [51 bits set]).
Nov  7 17:44:44 pn34 kernel: [690220.041959] block drbd0: updated sync UUID B53401839EF2D3DF:633BB21C6BAEC118:633AB21C6BAEC118:6EE1E4BB4DEF0160
Nov  7 17:44:44 pn34 kernel: [690220.550876] block drbd0: Resync done (total 1 sec; paused 0 sec; 204 K/sec)
Nov  7 17:44:44 pn34 kernel: [690220.550880] block drbd0: updated UUIDs B53401839EF2D3DF:0000000000000000:633BB21C6BAEC118:633AB21C6BAEC118
Nov  7 17:44:44 pn34 kernel: [690220.550886] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )

Scenario: pull the power

Primary node

crm status

Last updated: Fri Nov  7 18:00:12 2014
Last change: Fri Nov  7 15:25:43 2014 by root via cibadmin on pn33
Stack: classic openais (with plugin)
Current DC: pn34 - partition with quorum
Version: 1.1.9-2db99f1
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ pn33 pn34 ]

 Master/Slave Set: ms-drbd-mysql [drbd-mysql]
     Masters: [ pn34 ]
     Slaves: [ pn33 ]
 fs-drbd-mysql    (ocf::heartbeat:Filesystem):    Started pn34
 service-mysql    (lsb:mysql):    Started pn34
 vip-mysql    (ocf::heartbeat:IPaddr):    Started pn34

Pull the power

# Simulate a crash of the primary node's server

Backup node

# DRBD state
cat /proc/drbd

version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@pn34, 2014-10-30 17:45:12
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:348 dw:600 dr:5247217 al:5 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:208

# Mount state
mount | tail -n1
/dev/drbd0 on /mysqldata type ext3 (rw)

# MySQL state
mysqladmin ping
mysqld is alive

# VIP state
ifconfig eth0:0

eth0:0 Link encap:Ethernet HWaddr 00:50:56:9C:00:1X
      inet addr:172.27.233.35 Bcast:172.27.233.255 Mask:255.255.255.0
      UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
