MHA介绍
MHA是一位日本MySQL大牛用Perl写的一套MySQL故障切换方案,来保证数据库系统的高可用.在宕机的时间内(通常10—30秒内),完成故障切换,部署MHA,可避免主从一致性问题,节约购买新服务器的费用,不影响服务器性能,易安装,不改变现有部署。
还支持在线切换,从当前运行master切换到一个新的master上面,只需要很短的时间(0.5-2秒内),此时仅仅阻塞写操作,并不影响读操作,便于主机硬件维护。
在有高可用,数据一致性要求的系统上,MHA 提供了有用的功能,几乎无间断的满足维护需要。
优点:
1.master自动监控和故障转移
在当前已存在的主从复制环境中,MHA可以监控master主机故障,并且故障自动转移。
即使有一些slave没有接受新的relay log events,MHA也会从最新的slave自动识别差异的relay log events,并apply差异的event到其他slaves。因此所有的slave都是一致的。MHA秒级别故障转移(9-12秒监测到主机故障,任选7秒钟关闭电源主机避免脑裂,接下来apply差异relay logs,注册到新的master,通常需要时间10-30秒即total downtime)。另外,在配置文件里可以配置一个slave优先成为master。因为MHA修复了slave之间的一致性,dba就不用去处理一致性问题。
当迁移新的master之后,并行恢复其他slave。即使有成千上万的slave,也不会影响恢复master时间,slave也很快完成。
DeNA公司在150+主从环境中用MHA。当其中一个master崩溃,MHA4秒完成故障转移,这是主动/被动集群解决方案无法完成的。
2.互动(手动)master故障转移
MHA可以用来只做故障转移,而不监测master,MHA只作为故障转移的交互。
3.非交互式故障转移
非交互式的故障转移也提供(不监控master,自动故障转移)。这个特性很有用,特别是你已经安装了其他软件监控master。比如,用Pacemaker(Heartbeat)监测master故障和vip接管,用MHA故障转移和slave提升。
4.在线切换master到不同主机
在很多情况下,有必要将master转移到其他主机上(如替换raid控制器,提升master机器硬件等等)。这并不是master崩溃,但是计划维护必须去做。计划维护导致downtime,必须尽可能快的恢复。快速的master切换和优雅的阻塞写操作是必需的,MHA提供了这种方式。优雅的master切换, 0.5-2秒内阻塞写操作。在很多情况下0.5-2秒的downtime是可以接受的,并且即使不在计划维护窗口。这意味着当需要更换更快机器,升级高版本时,dba可以很容易采取动作。
5.master crash不会导致主从数据不一致性
当master crash后,MHA自动识别slave间relay logevents的不同,然后应用与不同的slave,最终所有slave都同步。结合通过半同步一起使用,几乎没有任何数据丢失。
其他高可用方案
6.MHA部署不影响当前环境设置
MHA最重要的一个设计理念就是尽可能使用简单。使用与5.0+以上主从环境,其他HA方案需要改变mysql部署设置,MHA不会让dba做这些部署配置,同步和半同步环境都可以用。启动/停止/升级/降级/安装/卸载 MHA都不用改变mysql主从(如启动/停止)。
当你需要升级MHA到新版本时,不需要停止mysql,仅仅更新HMA版本,然后重新启动MHAmanger即可。
MHA 支持包含5.0/5/1/5.5(应该也支持5.6,翻译文档时MHA开发者没更新对于5.6版本)。有些HA方案要求特定的mysql版本(如mysqlcluster,mysql with global transaction id 等),而且你可能不想仅仅为了MasterHA而迁移应用。很多情况下,公司已经部署了许多传统的mysql应用,开发或dba不想花太多时间迁移到不同的存储引擎或新的特性(newer bleeding edge distributions 不知道这个是否该这么翻译)。
7.不增加服务器费用
MHA 包含MHA Manager和MHA node。MHA node运行在每台mysql服务器上,Manager可以单独部署一台机器,监控100+以上master,总服务器数量不会有太大增加。需要注意的是Manager也可以运行在slaves中的一台机器上。
8.性能无影响
当监控master,MHA只是几秒钟(默认3秒)发送ping包,不发送大的查询。主从复制性能不受影响
9.适用任何存储引擎
Mysql不仅仅适用于事务安全的innodb引擎,在主从中适用的引擎,MHA都可以适用。即使用遗留环境的mysiam引擎,不进行迁移,也可以用MHA。
安装配置方法
1、示例环境介绍
- centos 6.4 x64
- mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz
- mha4mysql-manager-0.54.tar.gz
- mha4mysql-node-0.54.tar.gz
2、服务器说明
- masnager 192.168.216.50
- master 192.168.216.51
- slave 192.168.216.52
三台服务器都安装mysql,将manager作为管理节点
3、配置等价性
manager:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
ssh -keyger -t rsa ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.50 ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.51 ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.52 ssh 192.168.216.50 date ssh 192.168.216.51 date ssh 192.168.216.52 date |
master:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
ssh -keyger -t rsa ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.50 ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.51 ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.52 ssh 192.168.216.50 date ssh 192.168.216.51 date ssh 192.168.216.52 date |
slave:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
ssh -keyger -t rsa ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.50 ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.51 ssh -coyp- id -i /root/ . ssh /id_rsa .pub root@192.168.216.52 ssh 192.168.216.50 date ssh 192.168.216.51 date ssh 192.168.216.52 date |
4、安装mysql
1
2
3
4
5
6
7
8
9
10
11
12
|
groupadd mysql useradd -g mysql -s /bin/nologin -M mysql mkdir -pv /data/mysql tar -zxvf mysql-5.6.16-linux-glibc2.5-x86_64. tar .gz mv mysql-5.6.16-linux-glibc2.5-x86_64 /usr/local/mysql cd /usr/local/mysql/script ./ mysql_install_db –user=mysql –basedir= /usr/local/mysql –datadair= /data/mysql |
创建mysql配置文件
1
|
vim /usr/local/mysql/my .cnf |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
|
[mysql] # CLIENT # port = 3306 socket = /tmp/mysql.sock [mysqld] # GENERAL # user = mysql default-storage-engine = InnoDB socket = /tmp/mysql.sock pid-file = /data/mysql/mysql.pid # MyISAM # key-buffer-size = 32M myisam-recover = FORCE,BACKUP # SAFETY # max-allowed-packet = 16M max-connect-errors = 1000000 # DATA STORAGE # datadir = /data/mysql/ # BINARY LOGGING # server_id = 1 #不同服务器不一样 log-bin = /data/mysql/mysql-bin expire-logs-days = 14 sync-binlog = 1 # CACHES AND LIMITS # tmp-table-size = 32M max-heap-table-size = 32M query-cache-type = 0 query-cache-size = 0 max-connections = 500 thread-cache-size = 50 open-files-limit = 65535 table-definition-cache = 1024 table-open-cache = 2048 # INNODB # innodb-flush-method = O_DIRECT innodb-log-files-in-group = 2 innodb-log-file-size = 64M innodb-flush-log-at-trx-commit = 1 innodb-file-per-table = 1 innodb-buffer-pool-size = 592M # LOGGING # log-error = /data/mysql/mysql-error.log log-queries-not-using-indexes = 1 slow-query-log = 1 slow-query-log-file = /data/mysql/mysql-slow.log |
5、配置权限
manager:
1
2
3
4
5
6
7
8
9
10
11
|
grant all privileges on *.* to root@ '127.0.0.1' identified by 'root' ; grant all privileges on *.* to root@ 'localhost' identified by 'root' ; grant all privileges on *.* to root@ '192.168.216.50' identified by 'root' ; grant replication slave on *.* to slave@ '192.168.216.50' identified by 'slave' ; grant replication slave on *.* to slave@ '192.168.216.51' identified by 'slave' ; grant replication slave on *.* to slave@ '192.168.216.52' identified by 'slave' ; |
master:
1
2
3
4
5
6
7
8
9
10
11
|
grant all privileges on *.* to root@ '127.0.0.1' identified by 'root' ; grant all privileges on *.* to root@ 'localhost' identified by 'root' ; grant all privileges on *.* to root@ '192.168.216.50' identified by 'root' ; grant replication slave on *.* to slave@ '192.168.216.50' identified by 'slave' ; grant replication slave on *.* to slave@ '192.168.216.51' identified by 'slave' ; grant replication slave on *.* to slave@ '192.168.216.52' identified by 'slave' ; |
slave:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
grant all privileges on *.* to root@ '127.0.0.1' identified by 'root' ; grant all privileges on *.* to root@ 'localhost' identified by 'root' ; grant all privileges on *.* to root@ '192.168.216.50' identified by 'root' ; grant all privileges on *.* to root@ '192.168.216.51' identified by 'root' ; grant replication slave on *.* to slave@ '192.168.216.50' identified by 'slave' ; grant replication slave on *.* to slave@ '192.168.216.51' identified by 'slave' ; grant replication slave on *.* to slave@ '192.168.216.52' identified by 'slave' ; |
6、安装依赖包
管理服务器manager需要安装以下全部
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
|
perl-Config-Tiny perl-Params-Validate perl-Parallel-ForkManager perl-Log-Dispatch |- perl-MIME-Lite-3.027-2.el6.noarch.rpm |- perl-MIME-Types-1.28-2.el6.noarch.rpm |- perl-Email-Date-Format-1.002-5.el6.noarch.rpm |- perl-MailTools-2.04-4.el6.noarch.rpm |- perl-TimeDate-1.16-11.1.el6.noarch.rpm |- perl-Data-ShowTable-3.3-3.4.noarch.rpm |- perl-Mail-Sender-0.8.22-21.1.noarch.rpm |- perl-IO-Socket-SSL-1.31-2.el6.noarch.rpm |- perl-Net-LibIDN-0.12-3.el6.x86_64.rpm |- perl-Net-SSLeay-1.35-9.el6.x86_64.rpm |-perl-Win32API-Registry |- perl-Mail-Sendmail-0.79_16-4.2.noarch.rpm |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
|
rpm -ivh ncftp-debuginfo-3.2.3-1.3.x86_64.rpm rpm -ivh perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm rpm -ivh perl-Params-Validate-0.91-2.4.x86_64.rpm rpm -ivh perl-Config-Tiny-2.12-7.1.el6.noarch.rpm rpm -ivh perl-MIME-Types-1.28-2.el6.noarch.rpm rpm -ivh perl-Email-Date-Format-1.002-5.el6.noarch.rpm rpm -ivh perl-TimeDate-1.16-11.1.el6.noarch.rpm rpm -ivh perl-Data-ShowTable-3.3-3.4.noarch.rpm rpm -ivh perl-MailTools-2.04-4.el6.noarch.rpm rpm -ivh perl-MIME-Lite-3.027-2.el6.noarch.rpm rpm -ivh perl-Net-LibIDN-0.12-3.el6.x86_64.rpm rpm -ivh perl-Net-SSLeay-1.35-9.el6.x86_64.rpm rpm -ivh perl-IO-Socket-SSL-1.31-2.el6.noarch.rpm rpm -ivh perl-Mail-Sendmail-0.79_16-4.2.noarch.rpm rpm -ivh perl-Mail-Sender-0.8.22-21.1.noarch.rpm rpm -ivh perl-Log-Dispatch-2.22-7.3.noarch.rpm |
如果安perl-DBD-MySQL-4.013-3.el6.x86_64.rpm在检查rep是报错则需要手动编译安装
1
2
3
4
5
6
7
|
tar -zxvf DBD-mysql-4.027. tar .gz cd DBD-mysql-4.0.27 perl Makefile.PL make && make install |
节点服务器安装
1
2
3
|
rpm -ivh ncftp-debuginfo-3.2.3-1.3.x86_64.rpm rpm -ivh perl-DBD-MySQL-4.013-3.el6.x86_64.rpm |
7、创建软连接
1
2
3
|
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql |
导出mysql库搜索路径
1
2
3
4
5
|
vim /etc/ld .so.conf.d /mysql-x86_64 .conf /usr/local/mysql/lib ldconfig |
8、在所有服务器上安装mha4mysql-node-0.54.tar.gz
1
2
3
4
5
6
7
|
tar -zxvf mha4mysql-node-0.54. tar .gz cd mha4mysql-node-0.54 perl Makefile.PL make && make install |
9、在管理服务上安装 mha4mysql-manager-0.54.tar.gz
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
tar -zxvf mha4mysql-manager-0.54. tar .gz cd mha4mysql-manager-0.54 perl Makefile.PL make && make install mkdir -pv /etc/masterha mkdir -pv /masterha/app1 cp samples /conf/ * /etc/masterha cp samples /scripts/ * /usr/local/bin |
1
|
vim /etc/masterha/app1 .cnf |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
|
[server default] manager_workdir=/masterha/app1 manager_log=/masterha/app1/manager.log user=root password=root ssh_user=root repl_user=slave repl_password=slave shutdown_script="" #master_ip_failover_script="/usr/local/bin/masterha_ip_failover" master_ip_online_change_script="/usr/local/bin/masterha_ip_failover " report_script="" [server1] hostname=192.168.216.50 master_binlog_dir="/data/mysql/" candidate_master=1 [server2] hostname=192.168.216.51 master_binlog_dir="/data/mysql/" candidate_master=1 [server3] hostname=192.168.216.52 master_binlog_dir="/data/mysql/" candidate_master=1 |
10、测试ssh连接
1
|
masterha_check_ssh –conf= /etc/masterha/app1 .cnf |
11、测试replication
1
|
masterha_check_repl –conf= /etc/masterha/app1 .cnf |
12、开启管理节点进程
1
|
masterha_manager –conf= /etc/masterha/app1 .cnf |
13、测试故障转移
关闭主库mysql,查看从库的的状态是否将同步ip切换到新的主库
14、设置故障转移的ip
1
|
vim /etc/masterha/app1 .cnf |
1
2
3
|
master_ip_failover_script="/usr/local/bin/masterha_ip_failover " master_ip_online_change_script="/usr/local/bin/masterha_ip_failover " |
编辑故障转移脚本,将vip设置成192.168.216.100
1
|
vim /usr/local/bin/masterha_ip_failover |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
|
#!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port ); # my $vip = '172.16.21.119/24'; # Virtual IP my $vip = '192.168.216.100/24'; # Virtual IP my $key = "1"; my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down"; GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, ); exit &main(); sub main { print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { # $orig_master_host, $orig_master_ip, $orig_master_port are passed. # If you manage master ip address at global catalog database, # invalidate orig_master_ip here. my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { # all arguments are passed. # If you manage master ip address at global catalog database, # activate new_master_ip here. # You can also grant write access (create user, set read_only=0, etc) here. my $exit_code = 10; eval { print "Enabling the VIP – $vip on the new master – $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK \n"; `ssh $ssh_user\@cluster1 \" $ssh_start_vip \"`; exit 0; } else { &usage(); exit 1; } } # A simple system call that enable the VIP on the new master sub start_vip() { `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; } # A simple system call that disable the VIP on the old_master sub stop_vip() { `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub usage { print "Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=port –new_master_host=host –new_master_ip=ip –new_master_port=port\n"; } |
测试脚本
1
|
/usr/local/bin/masterha_ip_failover – command =status –ssh_user=root –orig_master_host=192.168.216.51 –orig_master_ip=192.168.216.51 –orig_master_port=3306 |
虚拟ip要手动启动,在主库故障后会自动转移。
1
|
/usr/local/bin/masterha_ip_failover – command =start –ssh_user=root –orig_master_host=192.168.216.51 –orig_master_ip=192.168.216.51 –orig_master_port=3306 –new_master_host=192.168.216.51 |
测试ip故障转移:
关闭主库mysql,查看vip是否转移的新的主库上。