MySQL Database Replication: Fixing Problems!

G

gcawood

Guest
Setting up MySQL to replicate is a fairly easy task. However, invalid MySQL queries will cause the replication to break and you should know how to fix the replication if it fails. The most popular method of fixing queries is to simply skip them. This makes sense if maintaining the consistency of your two database is not critical to your operation, as you are telling the slave to simply skip over the problem query.

The best solution requires you to re-sync the slave from scratch, ensuring that your data is 100% consistent. However, this can be a tedious and unnecessary task if the query was inconsequential, or can be re-run on the database successfully.

Below is the process for skipping the failed queries, and noting which ones failed so that you can diagnose the impact on your data.


Step One: Confirm that problem is in the replication and not somewhere else.

To do this, first log into MySQL as root.
Code:
mysql -u root -p

Step Two: Output the status of your replication

To do this, run the following command at the MySQL prompt.
Code:
mysql> SHOW SLAVE STATUS \G

A happy MySQL database cluster will show something like...
Code:
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 111.222.333.444
                Master_User: replication
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.005773
        Read_Master_Log_Pos: 41065629
             Relay_Log_File: webdb1-relay-bin.010119
              Relay_Log_Pos: 3173532
      Relay_Master_Log_File: mysql-bin.005773
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 41065629
            Relay_Log_Space: 3173532
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 0

An unhappy database will also show
Slave_SQL_Running: No
An unhappy database will show an error in
Last_Error:

The "Last_Error:" should show you the query that failed. This should give you a good in at what software caused the problem, and if you need to alter code elsewhere to avoid the problem in the future. Make a note of this query so that you can check to see if it ran successfully on the master, but failed on the slave, or if it simply failed on both.

Step Three: Stop The Slave
Code:
mysql> STOP Slave;

Step Four: Tell the replication to skip the invalid SQL query.
Code:
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;

Step Five: Start The Slave Up Again!
Code:
mysql> START Slave;


Step Six: Check to see if the problem is fixed!
Re-run the following command.
Code:
mysql> SHOW SLAVE STATUS \G

If everything is happy, check your servers log file /var/log/syslog, looking for a confirmation that replication has started again.
Code:
less /var/log/syslog

Else, you need to repeat this process! Normally it will take a couple skips to get the job done.
Good luck!
 

Members online


Top