
Special:LanguageStats fails to update statistics in various cases
Closed, ResolvedPublic

Description

Special:LanguageStats has become unusable on Meta-wiki. It doesn't show the total number of translated/untranslated messages, doesn't show statistics for some other messages, and is generally slow and buggy. It is also extremely cluttered.

Much the same applies to Special:AggregateGroups: the page is extremely slow. Yesterday I tried to organize some newly added translatable messages into new and existing message groups, but adding a single message name to an aggregate group took far too long.

Possible solutions:

  • Enable subgroups on Special:LanguageStats, which will reduce the clutter. E.g. there is an aggregate group called "Affiliates" and a separate group called "Wikimedia France" (there are also "Wikimedia Russia", "Wikimedia Chile", "Maithili Wikimedians User Group", etc.). "Wikimedia France" should be a subgroup of "Affiliates".
  • If this is possible, make all the aggregate groups collapsed both on Special:LanguageStats and on Special:AggregateGroups, and make their content load only after the "expand" link is pressed – that will make those pages load really fast, even for the users with a very slow connection.
  • If you have any other/better ideas, or if any of the above isn't possible to achieve, try to find some technical solution to address this issue.

Currently, the more new translatable messages appear on Special:LanguageStats on Meta-wiki, the less usable it becomes.

Event Timeline

Sometimes I need to purge Special:LanguageStats to update statistics that don't update on their own. I do it this way:

  • I add ?&action=purge to the page url and press enter
  • if the purge was successful (sometimes there's an error) I press any of the message groups to open them in the translation system
  • then I return to Special:LanguageStats by manually typing it in the search field and press Enter.
  • there I get a red message saying "Some of the statistics on this page are incomplete. Please reload to get more statistics." I reload until it disappears. But if my connection is a bit slow, all or nearly all the message groups (including the 100% translated ones) show three dots instead of a message count
    special18.jpg (768×1 px, 190 KB)
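
For reference, the first step above amounts to adding an `action=purge` query parameter to the page URL; the leading `&` in `?&action=purge` is redundant. A minimal sketch, assuming the usual `https://meta.wikimedia.org/wiki/...` URL layout; `with_purge` is a hypothetical helper name, not part of MediaWiki:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_purge(url: str) -> str:
    """Return `url` with action=purge set in its query string."""
    parts = urlsplit(url)
    # Keep any existing parameters, but drop a previous "action" value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "action"
    ]
    query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_purge("https://meta.wikimedia.org/wiki/Special:LanguageStats"))
# https://meta.wikimedia.org/wiki/Special:LanguageStats?action=purge
```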
Nemo_bis renamed this task from Make Special:LanguageStats great again! to Make Special:LanguageStats load again. Mar 15 2017, 10:38 PM
Nemo_bis triaged this task as Medium priority.

The group "Stewards elections 2016" doesn't show statistics for Italian either. And there are still no totals. Besides, this is not only my problem. Today I asked another translator to tell me how many untranslated messages there were, and she sent me a PDF of Special:LanguageStats that contained no totals and no data for certain message groups.

Right now I can see the complete data (including totals) for Russian and Polish, but not for Ukrainian, Italian, French and German.

It has been broken for me for years too. I confirm everything Piramidion says. @Nemo_bis can that testing thing make screenshots of the whole page rather than just of the first screen? Some equivalent of screenshot --fullpage in Firefox.

can that testing thing make screenshots of the whole page rather than just of the first screen

I don't know, try asking on their forum.

http://i.imgur.com/0lzZSaz.png Here is a full-page screenshot of the page as it looked for me just a moment ago.

I think I've ruined stats for Italian too
https://www.webpagetest.org/result/170317_7D_DTG/

?&action=purge does that (I changed my interface language to Italian first). Please fix that in the first place or make the stats update on its own, without the need to manually purge the page.

But don't pretermit the other issues I mentioned in the description – we'll have to do something about that sooner or later, and it's better to start sooner.

I guess the devs should do something about how the stats are collected. There's a noticeable lag in that. For example, if I open a single large aggregate message group and press the "Message group statistics" tab, I get the red message saying that the statistics are incomplete, and a lot of grey rows with three dots instead of data. Then I reload again and again until the statistics are complete (the page adds data for ~5 languages at a time).

And what's interesting: after I looked at the statistics of a single aggregate group that way, that group suddenly started to display data on Special:LanguageStats (broken with ?&action=purge):

aa2.jpg (768×1 px, 198 KB)

aa3.jpg (768×1 px, 206 KB)

And what's more interesting – that reload thing doesn't work for CentralNotice Banners aggregate group – it just stopped when it came to retrieving stats for Ukrainian, which should show 100% translated:

aa4.jpg (768×1 px, 142 KB)

I noticed some of these problems much earlier on translatewiki where if you want to check the translation statistics for MediaWiki aggregate group (which contains over 30,000 messages), you usually need to reload it multiple times just to get the complete stats.

Perhaps it's better to create separate tickets for some of the issues mentioned here? And shouldn't the priority of this one be set to "high"? I need those stats to work for translation purposes. I feel like I'm walking blindfolded through the large building that is Meta-wiki, bumping into random pages, unable to prioritize the translations.

?&action=purge does that (I changed my interface language to Italian first).

Yes, one shouldn't purge the stats pages without a good reason. Once the statistics are purged, it takes a while to populate them. How is this unexpected?

One wouldn't need to use purge if the statistics were updating themselves with a reasonable delay. They do not. What is more of a problem is that re-population of the statistics for the most global case never completes. (I would rather ask how any of this can be considered expected.)

Yes, one shouldn't purge the stats pages without a good reason.

There is a good reason to purge: like Base said – the statistics don't update on their own. I like to do the translation work by choosing an aggregate group and translating everything it contains. After that I want to see it vanish from Special:LanguageStats. This is why I purge that page.

I didn't have any problems with that while working on Commons, where I translated around 7000 messages in two months (just so you understand how much work I could do if I kept up this level of activity and had up-to-date statistics). I had to reload that page several times, but that's all. That reloading doesn't help on Meta-wiki.

Besides, there are messages that don't want to show stats no matter what. You'll see them if you expand the "2017 Wikimedia movement strategy process updates" for Ukrainian. I checked some other languages – they show the statistics for these messages as expected, but Ukrainian has some problems with that for some reason.

There is a good reason to purge: like Base said – the statistics don't update on their own.

Sure. But after purging you have to reload the page a few dozen times, that's the current system.

After that I want to see it vanish from Special:LanguageStats. This is why I purge that page.

If you want to see it vanish *immediately*, then what you need is most definitely not a way to update *all* statistics at once (which is necessarily slower than updating a single group).

I didn't have any problems with that while working on Commons

Commons has very little translatable material compared to Meta-Wiki, it's a different beast.

Ukrainian has some problems with that for some reason

Maybe for excess purging. :)

One wouldn't need to use purge if the statistics were updating themselves with a reasonable delay.

It would help to have something concrete to support this claim. From what I have seen, in the general case, this is not true. If there are exceptions, we need to identify them and figure out the reasons.

I'll repeat what @Nemo_bis said: purging will not make stats update faster, it only slows it down! Its sole purpose is to force recalculation of statistics in case they are incorrect. From this discussion it seems it is being overused (causing issues to the users and servers) and thus this feature could be removed altogether.

No, it's not about "updating faster", I'm purging the page to make the translated aggregate groups disappear. They just don't go away after I finish the translation, and if I don't purge, I have to wait for several days to see them vanish. Are you trying to tell me this is OK?

Sure. But after purging you have to reload the page a few dozen times, that's the current system.

RELOADING DOESN'T WORK
Yesterday I reloaded that page dozens of times, but the number of aggregate groups showing stats didn't increase. Then I started opening those groups in separate tabs and pressing «Message group statistics»; every aggregate group opened that way started to show stats on Special:LanguageStats. I did that for quite a long time, and after most of the aggregate groups' stats had been updated that way, it suddenly triggered stats population for the rest of the messages. But I repeat: it seems that simple reloading doesn't trigger a stats update on Special:LanguageStats.

If you want to see it vanish *immediately*, then what you need is most definitely not a way to update *all* statistics at once (which is necessarily slower than updating a single group).

That doesn't work either, as in the case of the CentralNotice Banners aggregate group.

Before I started to translate "Affiliates" message group yesterday, it'd been showing 1055 untranslated messages. I've translated several more messages today but it still shows 1055 untranslated messages. No need to purge you say? Special:MessageGroupStats shows the same. How am I supposed to see the actual stats? I don't overuse that "purge" feature – I usually use it only after the whole large message group or several smaller are translated. Do you want to remove that feature without addressing the problems mentioned or what? Something like "out of sight, out of mind"?

I have to wait for several days to see them vanish.

This does sound like a speed problem to me. :)

My point is simple: you've written a lot about the need to have a powerful workaround for the failed updates (i.e. something to clear all statistics at once), while the actual solution is something that updates the statistics where needed (i.e. when one translates).

It seems clear to me that whatever is supposed to update the statistics after new translations are saved is failing to do so. We have several reports on the failure to update statistics: T118681#2495103, T105856, T145295, T102229#2190683, T53410#1068804. Probably we'd better spend time on this actual issue rather than quarreling about the workarounds for it.

Last year there were several changes to disempower the statistics updates, see https://gerrit.wikimedia.org/r/#/q/owner:aaron+project:mediawiki/extensions/Translate .

Nemo_bis renamed this task from Make Special:LanguageStats load again to Special:LanguageStats fails to update statistics in various cases. Mar 20 2017, 10:32 AM

Well, I'm not looking for a workaround, I want a complex solution. If you provide some ways to update the statistics for separate aggregate groups manually or fix the automatic update – that would be really nice, but the problem is more complex than this.

I think I remember the time when there were about 30,000 messages to translate on Meta-wiki. Now there are 70,000 and this number is constantly growing. Pages like Special:LanguageStats and Special:AggregateGroups are becoming extremely cumbersome and slow. I don't think that simply enabling stats updates for separate aggregate groups would solve all these issues. There are a lot of translation admins on Meta-wiki, but they didn't use Special:AggregateGroups to clean up Special:LanguageStats, and I perfectly understand their reasons.

On Meta-wiki people discuss the ways to attract more translators, to somehow simplify their work, to somehow encourage them, but here it looks like you're trying to kind of shut those translators' mouths.

Of course, it might be the case that you're just unable to find a complex solution, as it might need not one or two devs but a team of devs working on a solution together for a decent period of time. If that is the case, it might be worth raising this issue in the next community wishlist survey. But I believe that the tasks you mentioned and the proposed subgroup system could be solved earlier, if only someone started to work on them.

Usually tasks are built around targets, rather than means. Discussing whether we want a "complex solution" or not doesn't sound especially fruitful.

The "low" and "normal" priority tasks you mentioned, created up to 1.5 years ago, don't seem to have been fruitful either. I'm not a coder; I don't know how to formulate the current problem so that someone can start working on a solution.

Change 343638 had a related patch set uploaded (by Nikerabbit):
[mediawiki/extensions/Translate] Fix message group stats caching issue

https://gerrit.wikimedia.org/r/343638

Is it possible to add a purge function for a single aggregate group and a single language? I noticed that if I open such a group on Special:MessageGroupStats (for example, the "Affiliates" group mentioned above, which has been showing 1055 untranslated messages all the time) and purge that single page, the stats on Special:LanguageStats get updated too. But if the group is a large one, it still takes some time to populate the stats for all the languages.

Can you provide some means to purge the stats for a single language instead of all of them?

I could, but I want to use my limited time to make it unnecessary in the first place. My patch above is a step in that direction when reviewed.

Change 343638 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Handle message group stats caching for long IDs

https://gerrit.wikimedia.org/r/343638

Conclusions:

  • There should no longer be any rows stuck in gray
  • Updates are slow (fewer than ten pages per second)
    • Given that the special page has a time limit of 2 seconds for processing, and Meta has ~6000 groups, it will take a while
    • The API has a 10-second limit (8 by default).
    • Lots of Duplicate get(): "metawiki:SpecialLanguageStats%3A%3AmakeGroupRow:----progress-page-###-fi-uk" fetched N times in the logs. I have a patch in progress that uses getMulti, which will also avoid this issue.
    • Profiling shows some hot spots:
      • MediaWiki\Logger\Monolog\LegacyHandler::write 4.2s (perhaps due to above log messages?)
      • StringMatcher::match 4s due to inefficient implementation of MessageGroups::expandWildcards
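
To put rough numbers on "it will take a while", here is a back-of-the-envelope sketch using only the figures above. It assumes that one "page" means the stats of one group for one language, that every request uses its full time budget, and that ten per second is an upper bound, so the results are lower bounds per language. `min_requests` is a hypothetical helper name.

```python
import math


def min_requests(groups: int, max_groups_per_second: float, time_limit_s: float) -> int:
    """Lower bound on the requests needed to compute stats for every group once."""
    groups_per_request = max_groups_per_second * time_limit_s
    return math.ceil(groups / groups_per_request)


GROUPS = 6000  # "~6000 groups" on Meta
RATE = 10      # upper bound, from "less than ten pages per second"

print(min_requests(GROUPS, RATE, 2))   # special page, 2 s limit  -> 300
print(min_requests(GROUPS, RATE, 8))   # API, 8 s default limit   -> 75
print(min_requests(GROUPS, RATE, 10))  # API, 10 s limit          -> 60
```

Under these assumptions, a fully purged language needs at least 300 reloads of the special page before every row is filled in.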

Change 346244 had a related patch set uploaded (by Nikerabbit):
[mediawiki/extensions/Translate@master] Optimize expandWildcards

https://gerrit.wikimedia.org/r/346244
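
For readers wondering why wildcard expansion can be a hot spot: matching every pattern against every group ID one at a time costs patterns × IDs matcher calls. One common remedy is to compile all patterns into a single regular expression and scan the ID list once. The sketch below illustrates that general idea only; it is not taken from the patch above, and the function names and group IDs are made up for the example.

```python
import re
from typing import Iterable, List


def compile_wildcards(patterns: Iterable[str]) -> re.Pattern:
    """Combine patterns (where * matches any run of characters) into one regex."""
    alternatives = []
    for pattern in patterns:
        # Escape the literal pieces and join them with ".*" for each "*".
        literal_parts = (re.escape(part) for part in pattern.split("*"))
        alternatives.append(".*".join(literal_parts))
    return re.compile("(?:" + "|".join(alternatives) + ")", re.DOTALL)


def expand_wildcards(patterns: List[str], all_ids: Iterable[str]) -> List[str]:
    """Return the IDs matching at least one pattern, in their original order."""
    if not patterns:
        return []
    matcher = compile_wildcards(patterns)
    return [group_id for group_id in all_ids if matcher.fullmatch(group_id)]


# Hypothetical group IDs, for illustration only.
ids = ["page-Affiliates", "page-Wikimedia France", "other-group", "agg-Affiliates"]
print(expand_wildcards(["page-*", "agg-Affiliates"], ids))
# ['page-Affiliates', 'page-Wikimedia France', 'agg-Affiliates']
```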

The Special:AggregateGroups slowness can be solved using pagination, as mentioned in T90511

Change 346244 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Optimize expandWildcards

https://gerrit.wikimedia.org/r/346244

MessageGroups::expandWildcards has been optimized (should do a new profiling run to check the current status). Duplicate get(): issue has not yet been fixed.
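
To illustrate the remaining "Duplicate get()" issue and why a getMulti-style call helps: fetching keys one at a time costs one cache round trip per lookup, repeated keys included, while a batched fetch of the de-duplicated keys costs one. The sketch below uses a small in-memory stand-in rather than the real cache; `CountingCache` and the helper functions are hypothetical names.

```python
from typing import Dict, Iterable, List, Optional


class CountingCache:
    """In-memory stand-in for a remote cache that counts round trips."""

    def __init__(self, data: Dict[str, int]) -> None:
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key: str) -> Optional[int]:
        self.round_trips += 1
        return self._data.get(key)

    def get_multi(self, keys: Iterable[str]) -> Dict[str, int]:
        self.round_trips += 1  # one trip, however many keys are requested
        return {key: self._data[key] for key in keys if key in self._data}


def rows_one_by_one(cache: CountingCache, keys: List[str]) -> Dict[str, Optional[int]]:
    # Every occurrence of a key triggers its own lookup.
    return {key: cache.get(key) for key in keys}


def rows_batched(cache: CountingCache, keys: List[str]) -> Dict[str, Optional[int]]:
    # De-duplicate first, then fetch everything in a single call.
    found = cache.get_multi(set(keys))
    return {key: found.get(key) for key in keys}


keys = ["a", "b", "a", "c"]
slow, fast = CountingCache({"a": 1, "b": 2}), CountingCache({"a": 1, "b": 2})
assert rows_one_by_one(slow, keys) == rows_batched(fast, keys)
print(slow.round_trips, fast.round_trips)  # 4 1
```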

I am not aware of any bugs on the page anymore, just that it is not as fast as it could be.

@Krinkle and @Nikerabbit Unfortunately, I got this error message after your patches:

[W0CHIQpAIC8AAKPYmWEAAADY] 2018-07-07 09:25:54: Fatal exception of type "Exception"

It's not related to this task:
Key contains invalid characters: mediawikiwiki:SpecialLanguageStats%3A%3AmakeGroupRow:19-0-0-0--page-Extension%3AAzhàr_Authentication-en-zh
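
The key in that error contains a non-ASCII character (`à`), which is presumably what the check rejects. One general way to keep cache keys valid is to replace any key that is too long or contains unusual characters with a hash of itself. The sketch below shows that technique only and is not what the Translate extension does; the 250-byte limit is memcached's, while the "printable ASCII only" rule and the `hashed:` prefix are assumptions made for this example.

```python
import hashlib

MAX_KEY_BYTES = 250  # memcached's key length limit


def safe_cache_key(key: str) -> str:
    """Return `key` unchanged if it is short, printable ASCII; otherwise a hash of it."""
    # Assumption: only printable, non-space ASCII characters are acceptable.
    is_plain = all(33 <= ord(ch) <= 126 for ch in key)
    if is_plain and len(key) <= MAX_KEY_BYTES:
        return key
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return "hashed:" + digest


print(safe_cache_key("stats:page-Affiliates:uk"))
print(safe_cache_key("stats:page-Extension:Azhàr_Authentication:zh"))
```

The trade-off is that hashed keys are no longer human-readable in logs.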

Nikerabbit claimed this task.

The statistics code has been optimized quite a bit recently (not necessarily much faster, but more reliable and using job queue effectively). Since this is a "Bag of Issues" type of task, I'm going to close this as resolved. There is one good suggestion here for which I will open a separate task.