
Translate syntax version update and translation-aware transclusion lost
Closed, Resolved · Public · BUG REPORT

Description

I have no idea what’s going on, but either Translate’s source code is quite broken, or the database has been corrupted.

When I try to mark m:Template:Navigation header/COVID-19 for translation, I have the option to turn on translation-aware transclusion (it’s off) and to upgrade to the latest syntax version. However, the syntax version was already upgraded in October (example edit by FuzzyBot adding the new markup), and the main COVID-19 page’s May 8, 08:35 version (the one before @1234qwer1234qwer4’s revert today) wasn’t full of raw <translate> tags when archive.org saved it a few hours after that May 8 edit, meaning that translation-aware transclusion was turned on at that time.

Translation-aware transclusion could in theory have been turned off since archive.org saved the page, but the syntax version update cannot be undone on-wiki, only directly in the DB, so something’s broken for sure.

Outcome

Translatable page settings are no longer disappearing unexpectedly.

A bug, which recently became more severe, silently and permanently deleted the settings of a translatable page whenever an associated translation page was deleted (see https://www.mediawiki.org/wiki/Help:Extension:Translate/Glossary). These settings include priority languages, syntax version, transclusion support, and other metadata.

Event Timeline

Restricted Application added a subscriber: Aklapper.

m:Wikimedia CEE Spring 2021 is affected as well: the French translation was updated just a few days ago to include the new syntax version’s markup, yet I’m asked whether I want to upgrade it to the new syntax version. (This is not a template, so I don’t have any evidence about whether translation-aware transclusion was enabled there, but the syntax version issue is certain.)

Marostegui added a subscriber: Marostegui.

The database is only accessible via code, so there's not much we can do at the moment. Removing the tag and adding platform to see if they can narrow the issue.
I will stay subscribed to the task in case something is needed from the DBAs

Nikerabbit added subscribers: abi_, Nikerabbit.

Looking at Wikimedia CEE Spring 2021

Metadata table does not have anything for it:

wikiadmin@10.64.0.97(metawiki)> select * from translate_metadata where tmd_group like 'page-Wikimedia CE%' limit 10;
+------------------------------------------+--------------+-----------+
| tmd_group                                | tmd_key      | tmd_value |
+------------------------------------------+--------------+-----------+
| page-Wikimedia CEE Meeting 2014          | maxid        | 19        |
| page-Wikimedia CEE Meeting 2014          | transclusion | 0         |
| page-Wikimedia CEE Meeting 2018/Schedule | maxid        | 95        |
| page-Wikimedia CEE Spring 2016           | maxid        | 54        |
| page-Wikimedia CEE Spring 2016/Timeline  | maxid        | 68        |
| page-Wikimedia CEE Spring 2017           | maxid        | 13        |
| page-Wikimedia CEE Spring 2017/Timeline  | maxid        | 6         |
+------------------------------------------+--------------+-----------+

The page has been marked for translation three times:

wikiadmin@10.64.0.97(metawiki)> select * from revtag where rt_page = '11244048';
+---------+----------+-------------+----------+
| rt_type | rt_page  | rt_revision | rt_value |
+---------+----------+-------------+----------+
| tp:mark | 11244048 |    21309462 | NULL     |
| tp:mark | 11244048 |    21330965 | NULL     |
| tp:mark | 11244048 |    21412775 | NULL     |
| tp:tag  | 11244048 |    21293400 | NULL     |
| tp:tag  | 11244048 |    21294059 | NULL     |
| tp:tag  | 11244048 |    21294061 | NULL     |
| tp:tag  | 11244048 |    21309461 | NULL     |
| tp:tag  | 11244048 |    21309462 | NULL     |
| tp:tag  | 11244048 |    21330421 | NULL     |
| tp:tag  | 11244048 |    21330965 | NULL     |
| tp:tag  | 11244048 |    21404545 | NULL     |
| tp:tag  | 11244048 |    21412774 | NULL     |
| tp:tag  | 11244048 |    21412775 | NULL     |
| tp:tag  | 11244048 |    21458427 | NULL     |
| tp:tag  | 11244048 |    21458434 | NULL     |
+---------+----------+-------------+----------+
15 rows in set (2.39 sec)

Page id is from https://meta.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&titles=Wikimedia%20CEE%20Spring%202021

The same is confirmed by the logging table: https://meta.wikimedia.org/w/index.php?title=Special:Log&page=Wikimedia+CEE+Spring+2021

Unfortunately we do not store the "changed params" anywhere other than the metadata table. I am unable to tell whether the metadata was saved-but-deleted or not saved at all.
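The check above (a page marked for translation with no rows at all in translate_metadata) could be sketched roughly as follows. This is only an illustration against an in-memory SQLite stand-in, not the production database: the column names and the "page-<title>" group id format are taken from the query output above, while the function name and the sample rows are made up for the example.

```python
import sqlite3

def pages_without_metadata(conn, marked_titles):
    """Return the titles that have no rows at all in translate_metadata."""
    missing = []
    for title in marked_titles:
        # Group id format as it appears in the query output: "page-<title>"
        group_id = "page-" + title
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM translate_metadata WHERE tmd_group = ?",
            (group_id,),
        ).fetchone()
        if count == 0:
            missing.append(title)
    return missing

# In-memory stand-in for the real table (assumption: simplified column types).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translate_metadata (tmd_group TEXT, tmd_key TEXT, tmd_value TEXT)"
)
conn.executemany(
    "INSERT INTO translate_metadata VALUES (?, ?, ?)",
    [
        ("page-Wikimedia CEE Spring 2016", "maxid", "54"),
        ("page-Wikimedia CEE Spring 2017", "maxid", "13"),
    ],
)

marked = [
    "Wikimedia CEE Spring 2016",
    "Wikimedia CEE Spring 2017",
    "Wikimedia CEE Spring 2021",
]
print(pages_without_metadata(conn, marked))  # ['Wikimedia CEE Spring 2021']
```

Since the other marked pages in the output carry at least a maxid row, a marked page with no rows at all is a plausible sign of lost metadata, though not proof of it.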

I combed through Logstash at the time when the page was last marked for translation on May 1st/2nd and could not find anything exceptional there.

We definitely need better debug logging in our code. I will also stare at the relevant code for a while to see if I spot anything.

Thanks for the analysis!

Unfortunately we do not store the "changed params" anywhere other than the metadata table.

See also T279495, which, if it had been implemented, could have helped tracking this bug.

I am unable to tell whether the metadata was saved-but-deleted or not saved at all.

The main COVID-19 page was parsed as if translation-aware transclusion was turned on. Is it possible that it happened without this setting actually being in the DB? I don’t think so; they are separate pages and should communicate with each other only through parser cache invalidation.

Also, I want to add that most pages I looked at are fine. This is good news as not the whole wiki is broken, but probably bad news as it makes locating the bug even harder.

If you encounter any new pages with this issue, do report it here as that will help us to investigate.

Thanks to the help of the DBAs, I was able to figure out that the metadata for Wikimedia CEE Spring 2021 was removed at around 2021-05-08T07:49:11Z. This corresponds very accurately with 2021-05-08T10:54:40 Tulsi Bhagat talk contribs completed deletion of translation page Wikimedia CEE Spring 2021/tr (Not a translation) which can be seen in https://meta.wikimedia.org/wiki/Special:Log?type=pagetranslation&user=&page=&wpdate=2021-05-08&tagfilter=. There is only a 5-minute difference between those (once my timezone is taken into account), which is a bit long but not unheard of.
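The timestamp comparison can be reproduced with a small calculation. The sketch assumes that the log entry was displayed in a UTC+3 local timezone; the offset is not stated in the comment and is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the log page rendered times in UTC+3 (not stated in the thread).
LOCAL_TZ = timezone(timedelta(hours=3))

# Metadata removal time reported by the DBAs (UTC).
metadata_removed = datetime(2021, 5, 8, 7, 49, 11, tzinfo=timezone.utc)
# Deletion log entry as displayed, interpreted in the assumed local timezone.
deletion_logged = datetime(2021, 5, 8, 10, 54, 40, tzinfo=LOCAL_TZ)

gap = deletion_logged - metadata_removed
print(gap)  # 0:05:29
```

Under that assumption the two events are 5 minutes 29 seconds apart, which matches the roughly 5-minute difference mentioned above.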

The same goes for Template:Policy: https://meta.wikimedia.org/wiki/Special:Log?type=pagetranslation&user=&page=&wpdate=&tagfilter=&subtype=delete

I'll make a fix right away.

Change 692312 had a related patch set uploaded (by Nikerabbit; author: Nikerabbit):

[mediawiki/extensions/Translate@master] Translation page deletion should not clear metadata

https://gerrit.wikimedia.org/r/692312

I made this issue more severe in commit rETRA7d00a65af1a6: Fix metadata handling for translatable page moves and deletions in late February by making it delete more metadata keys apart from priority language settings, but this issue seems to have been present since 2012.
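To make the failure mode concrete, here is a toy model of it. It is not the Translate extension's code: the dictionary store and all function names are hypothetical, and only the "page-<title>" group id format and the maxid and transclusion keys come from the query output earlier in this task.

```python
# Toy settings store keyed by group id; stands in for the metadata table.
def make_store():
    return {"page-Example": {"maxid": "19", "transclusion": "1"}}

def delete_translation_page_buggy(store, translation_title):
    # "Example/tr" -> "Example": the parent translatable page.
    parent = translation_title.rsplit("/", 1)[0]
    # Bug: deleting one translation wipes the settings of the whole
    # translatable page.
    store.pop("page-" + parent, None)

def delete_translation_page_fixed(store, translation_title):
    # The settings belong to the translatable page, so deleting a single
    # translation page leaves them untouched.
    return None

def delete_translatable_page(store, title):
    # Clearing the settings is only correct when the translatable page
    # itself is deleted.
    store.pop("page-" + title, None)

buggy = make_store()
delete_translation_page_buggy(buggy, "Example/tr")
print(buggy)  # {}

fixed = make_store()
delete_translation_page_fixed(fixed, "Example/tr")
print(fixed)  # {'page-Example': {'maxid': '19', 'transclusion': '1'}}
```

In this model the buggy handler leaves the translatable page in place but without any settings, which is consistent with the symptom reported in the description: the page is offered a syntax upgrade and transclusion option that had already been applied.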

Change 692312 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] Translation page deletion should not clear metadata

https://gerrit.wikimedia.org/r/692312

but this issue seems to have been present since 2012.

Wow, that’s pretty old, thanks for finding and fixing it!

Change 694306 had a related patch set uploaded (by Nikerabbit; author: Nikerabbit):

[mediawiki/extensions/Translate@mleb] Translation page deletion should not clear metadata

https://gerrit.wikimedia.org/r/694306

Change 694306 merged by Abijeet Patro:

[mediawiki/extensions/Translate@mleb] Translation page deletion should not clear metadata

https://gerrit.wikimedia.org/r/694306

Nikerabbit closed this task as Resolved. (Edited Tue, Jun 1, 12:21 PM)
Nikerabbit claimed this task.

This was deployed and tested on translatewiki.net. Also deployed to WMF production.

Can we get a list of pages that were affected by this bug in some way?

The closest I have is to look at https://meta.wikimedia.org/wiki/Special:Log?type=pagetranslation&user=&page=&wpdate=&tagfilter=&subtype=delete (per wiki), ignoring the translatable page deletions and looking only at the translation page deletions.
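If the log entries are copied out as plain text, that filtering could be sketched like this. The wording "deletion of translation page" is taken from the log line quoted earlier in this task; the wording of the translatable page entry and the second sample line are assumptions, and the function name is made up for the example.

```python
import re

# Matches only translation page deletions; an optional trailing
# "(reason)" is not captured as part of the title.
TRANSLATION_PAGE_DELETION = re.compile(
    r"deletion of translation page (.+?)(?: \(.*\))?$"
)

def possibly_affected_pages(log_lines):
    """Return parent translatable pages of deleted translation pages."""
    pages = set()
    for line in log_lines:
        match = TRANSLATION_PAGE_DELETION.search(line)
        if match:
            # "Title/tr" -> "Title": strip the language-code subpage.
            pages.add(match.group(1).rsplit("/", 1)[0])
    return sorted(pages)

sample = [
    "Tulsi Bhagat completed deletion of translation page "
    "Wikimedia CEE Spring 2021/tr (Not a translation)",
    # Hypothetical entry; the exact wording is an assumption.
    "ExampleUser completed deletion of translatable page Some other page",
]
print(possibly_affected_pages(sample))  # ['Wikimedia CEE Spring 2021']
```

The result is only a list of candidates: a page on it may since have been re-marked with its settings restored.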