A few days ago I saw a duplicate-entry replication error on one of the sanitarium hosts (reminder: they use row-based replication) for the frwiki database (s6), which coincided with the start of the new writes to change_tag.
I thought it was just a one-time thing, so I rebuilt the table for that one.
However, today I have seen it happen again on bgwiki (s2), on both sanitarium hosts (eqiad and codfw), and even for different entries:
On db1125: Last_Error: Could not execute Update_rows_v1 event on table bgwiki.change_tag; Duplicate entry '6114880-visualeditor' for key 'change_tag_rev_tag', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log db1074-bin.003024, end_log_pos 818204967
And on db2095: Last_SQL_Error: Could not execute Update_rows_v1 event on table bgwiki.change_tag; Duplicate entry '3964232-CategoryMaster' for key 'change_tag_log_tag', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log db2063-bin.004170, end_log_pos 638399139
I am pretty sure we fixed all the inconsistencies on the change_tag table (T161510, T160509), but I could be wrong. However, it is very suspicious that this shows up right after we started writing more to the change_tag table everywhere (https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/446804/), as it had been like this for more than a year without any issues. // cc @Ladsgroup
We should go ahead and do a general check on the change_tag table across all the wikis again.
As bgwiki.change_tag was pretty small, I have rebuilt it on eqiad to make sure the labs hosts are not affected for long. For the codfw host, however, we can take our time and check what's wrong.
Update
Looks like there were some inconsistencies left on bgwiki.change_tag and frwiki.change_tag that were not fixed when we did the massive clean-up: for some time we were using pt-table-checksum, which uses a PK, and change_tag didn't have a PK at the time. Until Jaime developed compare.py, we excluded change_tag for some sections. We should run a quick round of compare.py for change_tag across all sections.
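To illustrate why a table without a PK needs a different kind of comparison, here is a minimal sketch of a key-less row diff. It is not compare.py and makes no claim about how compare.py works; the function name `diff_rows`, the row layout and the sample values are all hypothetical. It treats each host's rows as a multiset, so it does not need a PK and also notices rows that appear a different number of times on each host.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Row = Tuple  # one table row as a tuple of column values (layout is an assumption)


def diff_rows(master_rows: Iterable[Row], replica_rows: Iterable[Row]) -> Dict[str, List[Row]]:
    """Compare two row sets as multisets, so no primary key is required."""
    master = Counter(master_rows)
    replica = Counter(replica_rows)
    return {
        # Rows (or extra copies of rows) present on the master only.
        "missing_on_replica": sorted((master - replica).elements(), key=repr),
        # Rows (or extra copies of rows) present on the replica only.
        "extra_on_replica": sorted((replica - master).elements(), key=repr),
    }


# Made-up sample data: (revision id, tag) pairs, not real change_tag contents.
master_sample = [(1, "visualeditor"), (2, "mobile edit")]
replica_sample = [(1, "visualeditor"), (3, "mobile edit")]
sample_diff = diff_rows(master_sample, replica_sample)
```

In practice the rows would come from the same query run against both hosts, and a large table would have to be compared in chunks rather than loaded into memory at once.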
Wikis with inconsistencies detected on the change_tag table and still pending a fix:
- s1 is fine - no differences detected
- bgwiki (s2) T154485
- cswiki (s2)
- enwiktionary (s2)
- eowiki (s2)
- fiwiki (s2)
- idwiki (s2)
- itwiki (s2)
- nlwiki (s2)
- nowiki (s2)
- plwiki (s2)
- ptwiki (s2)
- svwiki (s2)
- thwiki (s2)
- trwiki (s2)
- zhwiki (s2)
- ckbwiki (s3)
- dawiki (s3)
- elwiki (s3)
- eswikivoyage (s3)
- euwiki (s3)
- fawikivoyage (s3)
- glwiki (s3)
- itwikiversity (s3)
- jvwiki (s3)
- kshwiki (s3)
- mediawikiwiki (s3)
- mswiki (s3)
- orwiki (s3)
- outreachwiki (s3)
- ruwiktionary (s3)
- skwiki (s3)
- specieswiki (s3)
- uawikimedia (s3)
- urwiki (s3)
- zhwikisource (s3)
- commonswiki (s4)
- s5 is fine - no differences detected
- s7 is fine - no differences detected
- s8 is fine - no differences detected