Testing a replicated backup system
If you have a high need for availability for a production application, but don’t have the money to invest in full fault tolerance networking, machinery, and access, you’ll need to have some sort of realtime backup system in place should your primary environment go down. This poses the issue of how do you keep one system on standbye, but be sure that it’s ready to go at all times. The image below shows one possible setup.
The first thing to notice, is that there is a minimum of 3 systems here. The production environment can be one machine, on a separate network as the backups. The Backup application could be housed on one of the slave machines, but that means if you have an issue with the slave, your application is unusable, where if you had two application servers that are load balanced, and two separate slaves, if one of the slaves goes down, you can route the traffic to the other. How fault tolerant you get with the backup is up to you and how many resources you have.
This article is about testing your environment though, not setting it up. To test, follow the steps below.
- Set your backup application to use Slave 1 as the database connection.
- Allow login to the backup application. If you’re a web application, you’ll probably want to disable login to the backup application so that no user mistakenly logs into the backup instead of the production environment. If a user does so, not only will replication on slave 1 start throwing errors, the user will be upset that the work they did will not be in the production environment.
- Turn off replication on Slave 1 using the “stop slave” command. I would at this point use the “reset slave” command as well as you will need to resync the database anyways. Slave 1 will no longer receive updates from the master.
- Log into your backup application and do any testing. All updates will go to Slave 1. Slave 2 will continue to receive the real-time replication from the master.
- When you are done testing, disable your login to the backup system and follow the steps below for resyncing a double replicated database that is out of sync.
Resync the slaves
- Stop replication on both slaves. (log into mysql and run “stop slave”)
- On slave 2 – the good slave – run “show slave status\G” – Take note of the master log file and position.
- On slave 1 – the bad slave – run “reset slave”. This will make sure that replication does not automatically restart when you restart MySQL. If replication automatically restarts, it will be starting from the wrong position on the master binaries.
- Copy the good database from slave 2 to slave 1. If you have a large database, run mysqlhotcopy, or stop MySQL and copy the raw database files. If you have a small database, you could use a text file mysql dump.
- Restart MySQL on both slaves.
- On Slave 2 – the good slave – you should just have to run “start slave”. This will pick up the replication from the position where you stopped it at.
- On Slave 1 – you will need to run the commands below. If you run “SHOW SLAVE STATUS \G” afterwards, it should match Slave 2
If you do this correctly, both slaves should now be sync’d with the master server. They may have to catch up if it took a while to complete the above actions, but your master will never be bothered while this is going on. That’s the great thing about multiple slaves.