Caching issue for title descriptions in German Wikipedia
Open, High, Public

Description

There seems to be an update / cache issue in the apps for certain Wikidata descriptions. Check out the different title descriptions in this example:

https://de.wikipedia.org/wiki/Wikipedia:Hauptseite

Web: de.m.wikipedia.org_wiki_Schwuchtel(Pixel 3).png
iOS: IMG_0398C92CE289-1.jpeg
Android: Screenshot_20201130-155513.png

“Schwuchtel” is a curse word in German.

This bug on iOS and Android is especially bad since both apps show “Description for Lucas Feindt” (an insult aimed at a person) instead of “Curse word” as the title description.

This is an absolutely high-priority issue; it appears to affect the apps only.

Another example, that @Johan mentioned today on Slack:

https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team#Vandalism_of_Angela_Merkel_persisted_in_the_Wikipedia_App

This seems to have been corrected as of November 30, 2020. Still, according to the discussion linked above, it took quite some time to update.

Event Timeline

Quoting from the linked Wikidata discussion:

Some more background: the German description of Angela Merkel (Q567) was vandalized here and reverted here some 50 minutes later. The Wikipedia app for Android continued to show the vandalized description for some hours on the article page of Merkel (but not in the search results, as far as I am aware); this was confirmed by several users, many of whom had not opened the corresponding article on their smartphone earlier. In other words: apparently some server-side cache was not renewed when the vandalism was removed.
Today I looked into this a little more based on other German description vandalism cases, and I was able to reproduce the observed behavior in other items as well. My current hypothesis is that use of rollback when removing vandalism does not trigger the app to load the changed description, but use of undo does (and any modification in the web UI also makes the app use the new description immediately). I thus suggest reviewing whether old content in caches is properly invalidated when rollback was used, as this is sort of the standard tool to combat vandalism and it would be a pretty blunt one if it does not reliably remove vandalism everywhere. —MisterSynergy (disk) 13:23, 29 November 2020 (UTC)

LGoto triaged this task as High priority.Nov 30 2020, 5:11 PM
LGoto moved this task from Needs Triage to Tracking on the Wikipedia-Android-App-Backlog board.

This is important, but seems to be a caching issue in mobile-html.

Pchelolo subscribed.

I know what's going on. As an optimization, we only run updates for Wikidata edits whose edit comment matches; the comments are automatic in Wikidata, so this is generally safe. For reverts, however, it seems the user provided a custom comment here, so the update didn't run. Given how critical it is to process reverts reliably, we need to modify the change-prop config so that the wikidata_description_on_edit rule has one more match clause, matching on rev_is_revert = true.
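To make the proposed change concrete, here is a minimal Python sketch of the matching logic described above. It is not the change-prop implementation or its config syntax. The event field name `comment`, the function `should_update`, the `honor_reverts` switch, and the summary regex are all assumptions for illustration; only `rev_is_revert` comes from this thread.

```python
import re

# Assumed shape of an automatic Wikidata description summary, e.g.
# "/* wbsetdescription-set:1|de */ Schimpfwort". Illustrative only,
# not the pattern used in production.
AUTO_DESCRIPTION_COMMENT = re.compile(
    r"^/\* wbsetdescription-(add|set|remove):\d+\|[\w-]+ \*/"
)


def should_update(event: dict, honor_reverts: bool = True) -> bool:
    """Decide whether an edit event should trigger a description update."""
    comment = event.get("comment", "")
    # Existing behaviour: only automatic description summaries match.
    if AUTO_DESCRIPTION_COMMENT.match(comment):
        return True
    # Proposed extra clause: a revert qualifies whatever its comment says.
    return honor_reverts and event.get("rev_is_revert") is True
```

With `honor_reverts=False` a revert carrying a custom comment is skipped, which is the failure reported here; with the extra clause enabled the same event triggers the update.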

Glad to hear we know how to fix it, thanks for the quick attention on this.

So we can appropriately set expectations if we get more questions about this, do you know when you'd expect the config update to be deployed?

Perhaps @Clarakosi, as the Platform Team Workboards (Clinic Duty Team) leader, can answer when this will be scheduled.

I know what's going on. For optimization purposes we limit the updates based on Wikidata edits based on the edit comment - the comments are automatic in Wikidata, so generally it's safe.

I hope you’re checking not only for wbsetdescription edit summaries but also wbeditentity. (And probably also wbmergeitems, since I assume that API can also affect the description.)
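The coverage concern above can be sketched as a small check on the API module named in an automatic summary. This is a hypothetical Python illustration, not the rule actually deployed: it assumes automatic summaries begin with `/* <module>-<action>:`, and the helper names `summary_module` and `may_change_description` are made up for this example.

```python
import re
from typing import Optional

# Assumption: automatic summaries start with "/* <module>-<action>:...",
# so the module name is the word before the first hyphen.
MODULE_PREFIX = re.compile(r"^/\* (\w+)-")

# Modules named in the comment above as able to change a description.
DESCRIPTION_MODULES = {"wbsetdescription", "wbeditentity", "wbmergeitems"}


def summary_module(comment: str) -> Optional[str]:
    """Return the API module named in an automatic summary, if any."""
    match = MODULE_PREFIX.match(comment)
    return match.group(1) if match else None


def may_change_description(comment: str) -> bool:
    """True if the summary names a module that can alter descriptions."""
    return summary_module(comment) in DESCRIPTION_MODULES
```

A summary written by hand has no module prefix, so it never matches here; that is the gap a separate revert check would need to cover.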

Clarakosi added a subscriber: holger.knust.

@holger.knust is picking this up now

Change 644542 had a related patch set uploaded (by Holger Knust; owner: Holger Knust):
[operations/deployment-charts@master] Configuration chage to allow custom comment reverts on Wikidata

https://gerrit.wikimedia.org/r/644542

This seems similar to T220829 where the rollback in https://www.wikidata.org/w/index.php?title=Q105338&diff=900796369&oldid=890602022 on 2 April 2019 wasn't propagated to Wikipedias which used a wrong image from Wikidata. It was closed because it couldn't be reproduced.

And similar to T207651 which also was declined.

Change 644542 merged by jenkins-bot:
[operations/deployment-charts@master] Configuration chage to allow custom comment reverts on Wikidata

https://gerrit.wikimedia.org/r/644542

The change has been deployed, so I'm marking this as resolved. Please re-open if the issue continues.

Following up on my earlier comment about wbeditentity and wbmergeitems: wbsetlabeldescriptionaliases should be covered as well, though that’s not relevant to the current Drachenlord case as far as I can tell. (Edit: In the Drachenlord case, the vandalism was addressed using a standard revert rollback, which should’ve been caught by the rev_is_revert match in the attached Gerrit change, IIUC.)

Another Android app user is reporting the same problem: vandalism is still shown in the article description even though it was reverted days ago.

From a quick look at changeprop rules:

on_wikidata_description_change:
  topic: change-prop.wikidata.resource-change

We do purge mobile-html output on Wikidata resource changes. Is there something more systematic we need to look at?