Recovery text table in a couple of wikis
Closed, Resolved · Public

Description

As part of the cleanup of legacy encodings (T128150: Stop needing to use wgLegacyEncoding in Wikimedia cluster production), we made a mess in a couple of wikis (T337700: Exception: "Malformed UTF-8 characters" in Parser\MagicWordArray (via LqtVIew)). I need to investigate further what happened and revert some rows, so I need the text table of dawiki, dawiktionary, svwiktionary and svwiki from two weeks ago. The main complicating factor is that the rows we will be recovering are not UTF-8 but windows-1252 encoded.

Can you do that please? @jcrespo 🥺
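To illustrate why the encoding matters here: a windows-1252 row containing a non-ASCII character is generally not valid UTF-8, so a strict UTF-8 decode fails on it. The following is only a minimal sketch of a "try UTF-8, fall back to cp1252" check; the function name `decode_legacy_text` is hypothetical and is not part of MediaWiki or of the recovery tooling used in this task.

```python
def decode_legacy_text(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_used) for a raw text-table value.

    Hypothetical helper: tries strict UTF-8 first and falls back to
    windows-1252 (Python codec name "cp1252") when the bytes are not
    valid UTF-8.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Note: Python's cp1252 codec itself raises UnicodeDecodeError on
        # the few byte values that windows-1252 leaves undefined (0x81 etc.).
        return raw.decode("cp1252"), "cp1252"


if __name__ == "__main__":
    legacy = "Alnön".encode("cp1252")  # ö becomes the single byte 0xF6
    modern = "Alnön".encode("utf-8")   # ö becomes the two bytes 0xC3 0xB6
    print(decode_legacy_text(legacy))
    print(decode_legacy_text(modern))
```

Pure-ASCII rows decode identically under both encodings, so only rows with bytes above 0x7F are affected by this distinction.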

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2023-06-05T12:17:08Z] <jynus> creating a copy of db1157 binlogs on dbprov1004 T338128

Recovered all except svwiki to db1133 (test host); I made a mistake and restored last week's backup for svwiki. I'm looking into overwriting that table with the one from two weeks ago.

dawiki, dawiktionary and svwiktionary are from 2023-05-16, at approximately 00:00.

✔️ root@dbprov1004:/srv/backups/dumps/ongoing$ mini_loader.sh text_tables
Starting recovery at 2023-06-05 13:41:59+00:00
Creating database svwiktionary...
Database svwiktionary created successfully
Creating database dawiktionary...
Database dawiktionary created successfully
Creating database dawiki...
Database dawiki created successfully
Creating database svwiki...
Database svwiki created successfully
Creating table dawiki.text...
Table dawiki.text created successfully
Creating table dawiktionary.text...
Table dawiktionary.text created successfully
Creating table svwiktionary.text...
Table svwiktionary.text created successfully
Creating table svwiki.text...
Table svwiki.text created successfully
Importing data to table dawiktionary.text...
Table dawiktionary.text imported successfully
Importing data to table svwiktionary.text...
Table svwiktionary.text imported successfully
Importing data to table dawiki.text...
Table dawiki.text imported successfully
Importing data to table svwiki.text.00001...
Table svwiki.text.00001 imported successfully
Importing data to table svwiki.text.00000...
Table svwiki.text.00000 imported successfully
Finishing recovery at 2023-06-05 13:49:51+00:00
Remember to remove /root/.my.cnf to prevent accidental loads
✔️

I wanted to give you an update, since you can probably start working with this already while I finish reloading svwiki.

Sorry, I forgot to mention that I had already reloaded svwiki on db1133 with the backup from the same date (2023-05-16):

$ mini_loader.sh pendint_tables/
Starting recovery at 2023-06-05 18:32:06+00:00
Creating database svwiki...
Database svwiki created successfully
Creating table svwiki.text...
Table svwiki.text created successfully
Importing data to table svwiki.text.00001...
Table svwiki.text.00001 imported successfully
Importing data to table svwiki.text.00000...
Table svwiki.text.00000 imported successfully
Finishing recovery at 2023-06-05 18:39:32+00:00
Remember to remove /root/.my.cnf to prevent accidental loads
✔️

Thanks. I'm putting back the rows now

Double-checking that those values are indeed cp1252-encoded:

mysql:root@localhost [svwiki]> select hex(old_text) FROM text where old_id = 199;
+--------------------------------------------+
| hex(old_text)                              |
+--------------------------------------------+
| 235245444952454354205B5B416C6EF66E5D5D0D0A |
+--------------------------------------------+
1 row in set (0.000 sec)

which is #REDIRECT [[Alnön]] (followed by CRLF) in windows-1252: the F6 byte is ö.
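The same check can be reproduced outside MySQL. This is a small sketch that decodes the hex value from the query output above using Python's standard `cp1252` codec; the variable and function names are illustrative only.

```python
# Hex string copied from the hex(old_text) output for old_id = 199.
HEX_VALUE = "235245444952454354205B5B416C6EF66E5D5D0D0A"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = bytes.fromhex(HEX_VALUE)
text = raw.decode("cp1252")  # windows-1252: byte 0xF6 maps to "ö"

print(repr(text))          # repr() makes the trailing CRLF visible
print(is_valid_utf8(raw))  # a lone 0xF6 byte is not valid UTF-8
```

The value decodes cleanly as cp1252 and is rejected as UTF-8, which is consistent with the row having been restored in its original legacy encoding.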

The recovery is done. Thank you! Do you want to clean up db1133?

Yes, but given that it was already unclean (it was being used to test recoveries), that can be done outside of this ticket. This can be resolved.

Should I keep the original files for longer? They will be purged 3 months after they were taken if no action is taken.

Yeah, let's keep the file for a bit longer. Thanks.

Perfect. Otherwise, marking this as done from my side.

Ladsgroup assigned this task to jcrespo.

Let's call this done.