Replication broke in the beta cluster because of the combination of T183242 (DB handles obtained with DB_REPLICA should not allow writes), T183245 (Ensure replica DB in labs is read-only), and some broken code in MW core that was recently merged into master:
wikiadmin@deployment-db04[deploymentwiki]> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: deployment-db03.eqiad.wmflabs
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: deployment-db03-bin.000086
          Read_Master_Log_Pos: 489306779
               Relay_Log_File: deployment-db04-relay-bin.000283
                Relay_Log_Pos: 312519
        Relay_Master_Log_File: deployment-db03-bin.000086
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1062
                   Last_Error: Error 'Duplicate entry '20972' for key 'PRIMARY'' on query. Default database: 'deploymentwiki'. Query: 'INSERT /* AbuseFilter::storeVarDump */ INTO `text` (old_id,old_text,old_flags) VALUES (NULL,'DB://cluster1/8224','nativeDataArray,gzip,external')'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 489296618
              Relay_Log_Space: 323284
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Error 'Duplicate entry '20972' for key 'PRIMARY'' on query. Default database: 'deploymentwiki'. Query: 'INSERT /* AbuseFilter::storeVarDump */ INTO `text` (old_id,old_text,old_flags) VALUES (NULL,'DB://cluster1/8224','nativeDataArray,gzip,external')'
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 172234526
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:
1 row in set (0.00 sec)
The text table has diverged:
wikiadmin@deployment-db03[deploymentwiki]> select * from text where old_id > 20965;
+--------+--------------------+-------------------------------+
| old_id | old_text           | old_flags                     |
+--------+--------------------+-------------------------------+
|  20966 | DB://cluster1/8207 | nativeDataArray,gzip,external |
|  20967 | DB://cluster1/8208 | nativeDataArray,gzip,external |
|  20968 | DB://cluster1/8209 | nativeDataArray,gzip,external |
|  20969 | DB://cluster1/8210 | nativeDataArray,gzip,external |
|  20970 | DB://cluster1/8211 | utf-8,gzip,external           |
|  20971 | DB://cluster1/8218 | nativeDataArray,gzip,external | (DIFFERENT)
|  20972 | DB://cluster1/8224 | nativeDataArray,gzip,external | (DIFFERENT)
+--------+--------------------+-------------------------------+
7 rows in set (0.00 sec)

wikiadmin@deployment-db04[deploymentwiki]> select * from text where old_id > 20965;
+--------+--------------------+-------------------------------+
| old_id | old_text           | old_flags                     |
+--------+--------------------+-------------------------------+
|  20966 | DB://cluster1/8207 | nativeDataArray,gzip,external |
|  20967 | DB://cluster1/8208 | nativeDataArray,gzip,external |
|  20968 | DB://cluster1/8209 | nativeDataArray,gzip,external |
|  20969 | DB://cluster1/8210 | nativeDataArray,gzip,external |
|  20970 | DB://cluster1/8211 | utf-8,gzip,external           |
|  20971 | DB://cluster1/8212 | utf-8,gzip,external           | (DIFFERENT)
|  20972 | DB://cluster1/8213 | utf-8,gzip,external           | (DIFFERENT)
|  20973 | DB://cluster1/8214 | utf-8,gzip,external           | (DIFFERENT)
|  20974 | DB://cluster1/8215 | utf-8,gzip,external           | (DIFFERENT)
|  20975 | DB://cluster1/8216 | utf-8,gzip,external           | (DIFFERENT)
|  20976 | DB://cluster1/8217 | utf-8,gzip,external           | (DIFFERENT)
|  20977 | DB://cluster1/8219 | utf-8,gzip,external           | (DIFFERENT)
|  20978 | DB://cluster1/8220 | utf-8,gzip,external           | (DIFFERENT)
|  20979 | DB://cluster1/8221 | utf-8,gzip,external           | (DIFFERENT)
|  20980 | DB://cluster1/8222 | utf-8,gzip,external           | (DIFFERENT)
|  20981 | DB://cluster1/8223 | utf-8,gzip,external           | (DIFFERENT)
|  20982 | DB://cluster1/8225 | utf-8,gzip,external           | (DIFFERENT)
+--------+--------------------+-------------------------------+
17 rows in set (0.01 sec)
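The divergence in the text table can also be summarized mechanically. The following is only a rough sketch: the rows are copied by hand from the two query results above, and the names `master`, `replica`, and `diff_rows` are hypothetical helpers for illustration, not part of MediaWiki or any existing tooling.

```python
# Rows copied from the two `select * from text where old_id > 20965` results:
# old_id -> (old_text, old_flags).
NATIVE = "nativeDataArray,gzip,external"
UTF8 = "utf-8,gzip,external"

# Rows 20966-20970 are identical on both hosts.
shared = {
    20966: ("DB://cluster1/8207", NATIVE),
    20967: ("DB://cluster1/8208", NATIVE),
    20968: ("DB://cluster1/8209", NATIVE),
    20969: ("DB://cluster1/8210", NATIVE),
    20970: ("DB://cluster1/8211", UTF8),
}

# deployment-db03 (master)
master = dict(shared)
master[20971] = ("DB://cluster1/8218", NATIVE)
master[20972] = ("DB://cluster1/8224", NATIVE)

# deployment-db04 (replica)
replica = dict(shared)
replica_blobs = [8212, 8213, 8214, 8215, 8216, 8217,
                 8219, 8220, 8221, 8222, 8223, 8225]
for old_id, blob in zip(range(20971, 20983), replica_blobs):
    replica[old_id] = (f"DB://cluster1/{blob}", UTF8)


def diff_rows(a, b):
    """Return (conflicting, only_in_a, only_in_b) as sorted lists of keys."""
    conflicting = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    only_in_a = sorted(a.keys() - b.keys())
    only_in_b = sorted(b.keys() - a.keys())
    return conflicting, only_in_a, only_in_b


conflicting, only_master, only_replica = diff_rows(master, replica)
print("same old_id, different content:", conflicting)
print("only on master:", only_master)
print("only on replica:", only_replica)
```

With the rows above, this separates the IDs that exist on both hosts with different content (20971 and 20972) from the IDs that exist only on the replica (20973 through 20982).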
The revision table is in better shape. The master and replica agree on all rows except one row that is present on the master and missing from the replica. This row also has a rev_text_id (20972) that refers to a text row that the master and replica disagree on (and it looks like the replica's version is the right one), and that same ID is mentioned in the error that broke replication.
wikiadmin@deployment-db03[deploymentwiki]> select rev_id, rev_text_id, rev_timestamp from revision order by rev_id desc limit 15;
+--------+-------------+----------------+
| rev_id | rev_text_id | rev_timestamp  |
+--------+-------------+----------------+
|   7944 |       20982 | 20171219010811 | (MISSING ON REPLICA)
|   7943 |       20981 | 20171219002242 |
|   7942 |       20980 | 20171218193859 |
|   7941 |       20979 | 20171218193849 |
|   7940 |       20978 | 20171218193735 |
|   7939 |       20977 | 20171218193138 |
|   7938 |       20976 | 20171218183128 |
|   7937 |       20975 | 20171218182715 |
|   7936 |       20974 | 20171218180348 |
|   7935 |       20973 | 20171218174138 |
|   7934 |       20972 | 20171218152013 |
|   7917 |       20900 | 20171211164804 |
|   7892 |       20783 | 20171202182607 |
|   7891 |       20782 | 20171202154042 |
|   7890 |       20336 | 20171202153840 |
+--------+-------------+----------------+
15 rows in set (0.00 sec)

wikiadmin@deployment-db04[deploymentwiki]> select rev_id, rev_text_id, rev_timestamp from revision order by rev_id desc limit 15;
+--------+-------------+----------------+
| rev_id | rev_text_id | rev_timestamp  |
+--------+-------------+----------------+
|   7943 |       20981 | 20171219002242 |
|   7942 |       20980 | 20171218193859 |
|   7941 |       20979 | 20171218193849 |
|   7940 |       20978 | 20171218193735 |
|   7939 |       20977 | 20171218193138 |
|   7938 |       20976 | 20171218183128 |
|   7937 |       20975 | 20171218182715 |
|   7936 |       20974 | 20171218180348 |
|   7935 |       20973 | 20171218174138 |
|   7934 |       20972 | 20171218152013 |
|   7917 |       20900 | 20171211164804 |
|   7892 |       20783 | 20171202182607 |
|   7891 |       20782 | 20171202154042 |
|   7890 |       20336 | 20171202153840 |
|   7848 |       20696 | 20171126213200 |
+--------+-------------+----------------+
15 rows in set (0.00 sec)