Page MenuHomePhabricator

Fix Plagiabot DB corruption
Closed, InvalidPublic

Description

It seems there is a corrupted index on the diff_timestamp column of the copyright_diffs table, s51306__copyright_p database, on enwiki.labsdb.

The query we normally run:

MariaDB [s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY diff_timestamp DESC LIMIT 1;
ERROR 1030 (HY000): Got error 176 "Read page with wrong checksum" from storage engine Aria

If I do the same with the id then all is well:

MariaDB [s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY id DESC LIMIT 1;
...
1 row in set (0.72 sec)

INSERTs and UPDATEs are also apparently working, leading us to believe it's the INDEX on diff_timestamp that's causing the problem.

This issue effects the WikiEd Dashboard and the CopyPatrol tool on Tool Labs.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

(Possibly unbreak now?)

Apparently this only happens when you ORDER BY diff_timestamp. One quick and dirty workaround is to order by id, which should in theory should be ordered by date, too. INSERTs are still working it seems, the most recent was around 25 minutes ago, after we were first informed about the issue. I just tried doing an UPDATE and that also worked.

This of course is not a solution, still need to figure out what's going wrong, but this might at least get us up and running again in the interim.

I changed the query in CopyPatrol and it is at least functional now: https://tools.wmflabs.org/copypatrol/en

MusikAnimal lowered the priority of this task from High to Medium.Apr 13 2017, 7:02 PM
Ragesoss raised the priority of this task from Medium to High.Apr 13 2017, 7:40 PM
Ragesoss subscribed.

Not any more.

jcrespo subscribed.

Cannot reproduce:

root@labsdb1001[s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY diff_timestamp DESC LIMIT 1;
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| id    | project   | lang | diff      | diff_timestamp | page_title  | page_ns | ithenticate_id | report                                            
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| 42584 | wikipedia | en   | 775296835 | 20170413232233 | Jorge_Brugo |       0 |       28211468 | <div class="mw-ui-button">[//tools.wmflabs.org/era
* I  55% 174 words at [https://docs.com/the-next-miami/0dee3a92-4ac6-4fd1-b213-97c7a3f4381b/pi-attc-at-mdc https://docs.com/the-next-miami/0dee3a92-4
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
1 row in set (0.06 sec)

root@labsdb1001[s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY id DESC LIMIT 1;
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| id    | project   | lang | diff      | diff_timestamp | page_title  | page_ns | ithenticate_id | report                                            
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| 42584 | wikipedia | en   | 775296835 | 20170413232233 | Jorge_Brugo |       0 |       28211468 | <div class="mw-ui-button">[//tools.wmflabs.org/era
* I  55% 174 words at [https://docs.com/the-next-miami/0dee3a92-4ac6-4fd1-b213-97c7a3f4381b/pi-attc-at-mdc https://docs.com/the-next-miami/0dee3a92-4
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
1 row in set (0.00 sec)

root@labsdb1001[s51306__copyright_p]> REPAIR TABLE copyright_diffs;
+-------------------------------------+--------+----------+----------+
| Table                               | Op     | Msg_type | Msg_text |
+-------------------------------------+--------+----------+----------+
| s51306__copyright_p.copyright_diffs | repair | status   | OK       |
+-------------------------------------+--------+----------+----------+
1 row in set (1.15 sec)

We do not support Aria or MyISAM tables, in fact, we strongly recommend against using engines other than InnoDB- the only engine that guarantees corruption-free tables (and accomplishes it). Creating tables on those other engines is as safe as storing your data to /dev/null. However, we cannot disable them because many people still used them.

In addition to what Jaime said looks like the server has been crashing lately: https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=labsdb1001&from=1492008553177&to=1492181353177

And it looks memory related (possibly OOM+swapping): https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=labsdb1001&var-network=eth0&from=1492008634955&to=1492181434955

So probably the index related error was due to myisam being corrupted while starting up

I haven't looked further as I am checking emails/graphs from my phone and saw this ticket so quickly checked the graphs to see if there was something obvious.

I may have fixed it:

repair table copyright_diffs;
+-------------------------------------+--------+----------+------------------------------------------------------------------+
| Table                               | Op     | Msg_type | Msg_text                                                         |
+-------------------------------------+--------+----------+------------------------------------------------------------------+
| s51306__copyright_p.copyright_diffs | repair | info     | Wrong CRC on datapage at 5727                                    |
| s51306__copyright_p.copyright_diffs | repair | info     | Wrong CRC on datapage at 5728                                    |
| s51306__copyright_p.copyright_diffs | repair | info     | Wrong CRC on datapage at 5729                                    |
| s51306__copyright_p.copyright_diffs | repair | warning  | Duplicate key  1 for record at 1466114 against record at 1466113 |
| s51306__copyright_p.copyright_diffs | repair | warning  | 1 records have been removed                                      |
| s51306__copyright_p.copyright_diffs | repair | status   | OK                                                               |
+-------------------------------------+--------+----------+------------------------------------------------------------------+

Cannot reproduce:

root@labsdb1001[s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY diff_timestamp DESC LIMIT 1;
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| id    | project   | lang | diff      | diff_timestamp | page_title  | page_ns | ithenticate_id | report                                            
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| 42584 | wikipedia | en   | 775296835 | 20170413232233 | Jorge_Brugo |       0 |       28211468 | <div class="mw-ui-button">[//tools.wmflabs.org/era
* I  55% 174 words at [https://docs.com/the-next-miami/0dee3a92-4ac6-4fd1-b213-97c7a3f4381b/pi-attc-at-mdc https://docs.com/the-next-miami/0dee3a92-4
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
1 row in set (0.06 sec)

root@labsdb1001[s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY id DESC LIMIT 1;
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| id    | project   | lang | diff      | diff_timestamp | page_title  | page_ns | ithenticate_id | report                                            
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
| 42584 | wikipedia | en   | 775296835 | 20170413232233 | Jorge_Brugo |       0 |       28211468 | <div class="mw-ui-button">[//tools.wmflabs.org/era
* I  55% 174 words at [https://docs.com/the-next-miami/0dee3a92-4ac6-4fd1-b213-97c7a3f4381b/pi-attc-at-mdc https://docs.com/the-next-miami/0dee3a92-4
+-------+-----------+------+-----------+----------------+-------------+---------+----------------+---------------------------------------------------
1 row in set (0.00 sec)

root@labsdb1001[s51306__copyright_p]> REPAIR TABLE copyright_diffs;
+-------------------------------------+--------+----------+----------+
| Table                               | Op     | Msg_type | Msg_text |
+-------------------------------------+--------+----------+----------+
| s51306__copyright_p.copyright_diffs | repair | status   | OK       |
+-------------------------------------+--------+----------+----------+
1 row in set (1.15 sec)

We do not support Aria or MyISAM tables, in fact, we strongly recommend against using engines other than InnoDB- the only engine that guarantees corruption-free tables (and accomplishes it). Creating tables on those other engines is as safe as storing your data to /dev/null. However, we cannot disable them because many people still used them.

I don't think we care much on the DB engine for CopyPatrol / plagiabot. Would it be OK to change the engine to supported one? Should it be done using alter the table to ALTER TABLE s51306__copyright_p.copyright_diffs ENGINE=InnoDB; ?

Yes, you can use alter table s51306__copyright_p.copyright_diffs ENGINE=InnoDB; to migrate it to InnoDB if you like.

OK, I changed the engine InnoDB. Hopefully this should make it more stable to prevent further cases

OK, I changed the engine InnoDB. Hopefully this should make it more stable to prevent further cases

Just to be clear, probably this was a colateral damage from the server crashing a few times as stated here: T162932#3183044 and here T163001
This will not prevent the server from crashing or plagiabot from having an outage if the labs server is down. However it will prevent table corruption which can happen with Aria (MyISAM) but not with InnoDB.
To sum up, it was a good move to migrate the table from Aria to InnoDB to prevent further corruptions :-)
Thanks a lot!