I finally got around to implementing database replication for all of my production servers today and I have to say, it was much easier than I thought it would be. No server reboots were required; the only impact on application availability was the few seconds it took to restart MariaDB on each of the master database servers.
Most of my production servers are hosted with Linode at their Fremont, CA data centre, so I decided to spin up one replication server per master at Digital Ocean's San Francisco region. There are only six hops between the two data centres and the ping times are incredibly low (around 2-3ms) so I’m quite confident this will be sufficient for replication. (Addendum: Two weeks later and replication has been running flawlessly.)
All of my servers run Debian, but the process should be very similar on other distros.
Here’s the approach I used for each production server:
I spun up a new VPS and initialised it using a custom script that I have carefully crafted over the last few years. It takes around 10 minutes to run and sets up everything I need on a production server, including MariaDB, Apache, PHP, SSH, fully-configured IPTables, etc. It creates a non-root user with my CLI environment all set up the way I like it. Nice!
I set up SSH access from the slave server to the production server, mainly for the purpose of transferring the initial database dump.
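For reference, a minimal sketch of that step, run on the slave (the user and hostname here are placeholders, and I'm assuming key-based authentication):

ssh-keygen -t rsa
ssh-copy-id someuser@master.example.com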
I opened port 3306 on the production server, but only for the IP address of the slave server. A future enhancement is to enable SSL for the client connections.
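The firewall rule itself is a one-liner, something along these lines, with 1.2.3.4 standing in for the slave's IP (exactly where it slots in depends on your existing chain):

iptables -A INPUT -p tcp -s 1.2.3.4 --dport 3306 -j ACCEPT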
I made the following changes to /etc/mysql/my.cnf (in the [mysqld] section) on the production server:
Changed: bind-address from 127.0.0.1 to 0.0.0.0
Uncommented: server-id=1
Uncommented: log-bin=/var/log/mysql/mysql-bin.log
Then I restarted MariaDB so the changes would take effect. The longest this took on any of the servers was 10 seconds.
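For what it's worth, on Debian the init script is still named mysql even for MariaDB, so the restart is just:

service mysql restart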
Next, on the master, I created a replication user that is only allowed to connect from the slave's IP address:

grant replication slave on *.* to 'repluser'@'1.2.3.4' identified by 'password';
The next bit is really cool. Because I'm running MariaDB 10.0.2, I could use the Global Transaction ID (GTID) option for replication. All this really does is simplify the process of setting the right log position on the slave(s), but given how important it is to get that right, the feature is very handy indeed. Best of all, you can include the --gtid option in your database backup command and the dump will record the correct log position for you.
mysqldump -uroot -ppassword --gtid --master-data --databases db1 db2 db3 > backup.sql
I haven’t worked out how to replicate a single database yet, only entire servers, so I replicated all of the user databases on each production server to the corresponding slave. I did exclude the information_schema, performance_schema and mysql databases though, figuring that I probably did not want to overwrite those on the slave. Once the dump file was created, I copied it to the slave server using scp.
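Run from the slave, that's something like this (user and hostname hypothetical):

scp someuser@master.example.com:backup.sql .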
Before doing anything on the slave server I edited /etc/mysql/my.cnf and made the following changes:
Set server-id=2
Uncommented log-bin=/var/log/mysql/mysql-bin.log
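As on the master, the config change only takes effect once MariaDB is restarted:

service mysql restart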
Then I created each of the databases:
create database db1;
create database db2;
create database db3;
Then I imported the dump into the slave. Simples:
mysql -uroot -ppassword < backup.sql
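As an aside, because the dump was taken with --gtid, importing it should also have set the slave's GTID position; if memory serves, you can check it with:

select @@global.gtid_slave_pos;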
Next I set up the connection to the master server:
change master to master_host='hostname', master_port=3306, master_user='repluser', master_password='password', master_use_gtid=slave_pos;
Finally, I issued this command:
start slave;
And voilà! Replication was up and running!
The show slave status command is used to see the status of replication on the slave server. If you run it immediately after issuing the start slave command, you may actually see messages in the output indicating that the slave is applying changes to bring it up to date with the master.
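A tip: ending the command with \G in the mysql client prints one field per line, which makes the long output much easier to scan:

show slave status\G

The fields worth watching are Slave_IO_Running and Slave_SQL_Running (both should say Yes) and Seconds_Behind_Master (which should sit at or near 0).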
The show master status command, issued on the master server, shows the binary log filename and position on the master. The important thing to check is that this position matches the log position on the slave. I have been checking it periodically to ensure that the slave is keeping up with the master, and it hasn’t faltered once. Note that the output of show slave status on the slave is quite a bit longer and refers to a few different logs, so you need to be careful that you’re looking at the right field.
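For reference, on the master:

show master status;

As far as I can tell, the File and Position columns from that output should correspond to Relay_Master_Log_File and Exec_Master_Log_Pos in the slave's show slave status output once the slave has caught up.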
That was it! Although it took a few hours to figure out the general approach, once I had it sorted out I managed to do the whole thing from scratch on my very last production server in about 45 minutes. This set-up makes a great (and cheap!) addition to my backup strategy and means that should catastrophe strike at the Fremont data centre, I should have a fairly recent, intact copy of all my databases elsewhere.