MySQL Database Replication: Fixing Problems!

Discussion in 'Database' started by gcawood, Nov 1, 2011.

  1. gcawood

    gcawood Administrator Staff Member

    Messages:
    49
    Likes Received:
    19
    Trophy Points:
    0
    Setting up MySQL to replicate is a fairly easy task. However, invalid MySQL queries will cause the replication to break and you should know how to fix the replication if it fails. The most popular method of fixing queries is to simply skip them. This makes sense if maintaining the consistency of your two database is not critical to your operation, as you are telling the slave to simply skip over the problem query.

    The best solution requires you to re-sync the slave from scratch, ensuring that your data is 100% consistent. However, this can be a tedious and unnecessary task if the query was inconsequential, or can be re-run on the database successfully.

    Below is the process for skipping the failed queries, and noting which ones failed so that you can diagnose the impact on your data.


    Step One: Confirm that problem is in the replication and not somewhere else.

    To do this, first log into MySQL as root.
    Code:
     
    mysql -u root -p
    
    Step Two: Output the status of your replication


    To do this, run the following command at the MySQL prompt.
    Code:
    mysql> SHOW SLAVE STATUS \G
    
    A happy MySQL database cluster will show something like...
    Code:
    
    *************************** 1. row ***************************
                 Slave_IO_State: Waiting for master to send event
                    Master_Host: 111.222.333.444
                    Master_User: replication
                    Master_Port: 3306
                  Connect_Retry: 60
                Master_Log_File: mysql-bin.005773
            Read_Master_Log_Pos: 41065629
                 Relay_Log_File: webdb1-relay-bin.010119
                  Relay_Log_Pos: 3173532
          Relay_Master_Log_File: mysql-bin.005773
               Slave_IO_Running: Yes
              Slave_SQL_Running: Yes
                Replicate_Do_DB:
            Replicate_Ignore_DB:
             Replicate_Do_Table:
         Replicate_Ignore_Table:
        Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
                     Last_Errno: 0
                     Last_Error:
                   Skip_Counter: 0
            Exec_Master_Log_Pos: 41065629
                Relay_Log_Space: 3173532
                Until_Condition: None
                 Until_Log_File:
                  Until_Log_Pos: 0
             Master_SSL_Allowed: No
             Master_SSL_CA_File:
             Master_SSL_CA_Path:
                Master_SSL_Cert:
              Master_SSL_Cipher:
                 Master_SSL_Key:
          Seconds_Behind_Master: 0
    
    An unhappy database will also show
    An unhappy database will show an error in
    The "Last_Error:" should show you the query that failed. This should give you a good in at what software caused the problem, and if you need to alter code elsewhere to avoid the problem in the future. Make a note of this query so that you can check to see if it ran successfully on the master, but failed on the slave, or if it simply failed on both.

    Step Three: Stop The Slave
    Code:
    mysql> STOP Slave;
    
    Step Four: Tell the replication to skip the invalid SQL query.
    Code:
    mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
    
    Step Five: Start The Slave Up Again!
    Code:
    mysql> START Slave;
    

    Step Six: Check to see if the problem is fixed!
    Re-run the following command.
    Code:
    mysql> SHOW SLAVE STATUS \G
    
    If everything is happy, check your servers log file /var/log/syslog, looking for a confirmation that replication has started again.
    Code:
    less /var/log/syslog
    
    Else, you need to repeat this process! Normally it will take a couple skips to get the job done.
    Good luck!

Share This Page