File captions sometimes do not display
Closed, Resolved · Public

Description

On some files with German captions, the caption is displayed (as a collapsed language):

On other files with a German caption, it is not:

Event Timeline

Abit created this task. Jan 17 2019, 12:31 AM
Restricted Application added a subscriber: Aklapper. Jan 17 2019, 12:31 AM
Abit added a comment. Jan 17 2019, 12:34 AM

Strangely, I now do not see a German caption for the image in my original screenshot:


But I still do in another image:

Abit updated the task description. Jan 17 2019, 12:34 AM
Abit renamed this task from I see German captions on some but not all files to I see German captions on some but not all files with German captions. Jan 17 2019, 12:55 AM
Ramsey-WMF triaged this task as Normal priority. Jan 17 2019, 4:01 AM
Ramsey-WMF added subscribers: Cparle, Jdforrester-WMF.

Initial testing shows this one to be kind of odd. It seems sometimes this works and sometimes it doesn't, and it's not entirely clear why, even with files that were uploaded by the same user around the same time.

Perhaps the best way to test this is to look at all the files uploaded by GPSLeo on Jan 13 (https://commons.wikimedia.org/wiki/Special:Log?type=upload&user=GPSLeo&page=&wpdate=2019-01-13&tagfilter=&subtype=upload)

All of those files display the problem in question EXCEPT the Lippendorf Power Station XX series. For all of those, the German caption appears just fine. Only difference I can initially see between the sets is that the Lippendorf files are used by a page, and the others are not.

As far as I can tell, this problem does not occur with newly uploaded files.

Pinging @Cparle and @Jdforrester-WMF for thoughts?

Ramsey-WMF moved this task from Untriaged to Triaged on the Multimedia board. Jan 17 2019, 4:10 AM
Cparle added a subscriber: daniel. Jan 17 2019, 12:29 PM

There's something up with the parser cache, afaics: if you purge the cache on any page with the caption missing, then it shows up. I was actually talking to Matthias about this earlier. It's the same for captions I uploaded last week: in some cases they are missing, but they show up when you do ?action=purge

Here's an example (please don't purge the cache on this page, or nobody else will be able to see what's wrong): https://commons.wikimedia.org/wiki/File:Landscape_around_surface_mine_Schleenhain_10.jpg. If you click on 'history' you can see that a caption was uploaded, but it's not getting shown. The fact that it works for new captions means the parser cache is being invalidated when something is first added, which suggests that the cache is later being overwritten by some process that is not MCR-aware

We had a similar problem before with RefreshLinksJob, which @daniel fixed with this patch https://gerrit.wikimedia.org/r/c/mediawiki/core/+/465157

Could there be some other job that's not MCR-aware?

GPSLeo added a subscriber: GPSLeo. Jan 18 2019, 9:11 PM

I think it seems fixed now?
Just as additional information (I do not know if it is relevant): when I uploaded the files, the caption field in UploadWizard was the same type as the description field, not the one-line field it is now.

Cparle added a comment. Feb 7 2019, 1:46 PM

Ok I purged all the Landscape_around_surface_mine_Schleenhain pages *except* for _10, and they all display properly now.

Let's see if they stay in an ok state - if not then there's something weird going on

Cparle renamed this task from I see German captions on some but not all files with German captions to File captions sometimes do not display. Feb 7 2019, 1:51 PM
Cparle updated the task description.

Another thing: I wonder whether @GPSLeo's action on Jan 18 (see https://commons.wikimedia.org/w/index.php?title=File:Lippendorf_power_station_04.jpg&action=history; according to @Abit's comment above, this image used to be ok) could have triggered a non-MCR-aware job that corrupted the parser cache. @Jdforrester-WMF @MarkTraceur @matthiasmullie do you know of any jobs that might be triggered by such an action?

Cparle claimed this task. Feb 7 2019, 3:03 PM

Not off the top of my head. I'm minded to mark this as Resolved unless we're getting on-going issues.

I don't think this is a matter of parser cache getting corrupted.

Right now, I'm getting a JS notice that's essentially caused by node .wbmi-entityview-content not being found.
We recently changed that classname: that used to be .filepage-mediainfo-entityview.
.filepage-mediainfo-entityview is what is still being output, because that's still in cache.

Rewinding to a few weeks ago, this is the diff of what got deployed.
Part of those changes were the .caption -> .filepage-mediainfo- renames. So, exact same scenario.

I believe it's just a matter of our new JavaScript (expecting new classnames) being served immediately after deploy, while ignoring that we have caches that won't expire for another while (24 days, apparently)
We should 1/ try to minimize markup changes (in PHP output), and 2/ when we do, make our relevant JS backwards-compatible for a little while longer.
And ideally, to be able to close this ticket, 3/ purge all existing pages with captions, just to make sure they're good right now.
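
A minimal sketch of what point 2/ could look like: look up the node under the new class name first and fall back to the class name that may still be present in cached markup. The two selectors come from this thread; the helper name `findEntityView` and the `QueryRoot` type are hypothetical and are not taken from the actual extension code.

```typescript
// Minimal structural type so the helper works with Document, Element, or a test stub.
interface QueryRoot {
  querySelector(selector: string): unknown;
}

// Class names mentioned in this thread, newest first.
const ENTITY_VIEW_SELECTORS: readonly string[] = [
  ".wbmi-entityview-content",       // current markup
  ".filepage-mediainfo-entityview", // previous markup, may still be served from cache
];

// Hypothetical helper: return the first node matched by any selector, or null.
function findEntityView(
  root: QueryRoot,
  selectors: readonly string[] = ENTITY_VIEW_SELECTORS,
): unknown {
  for (const selector of selectors) {
    const node = root.querySelector(selector);
    if (node !== null && node !== undefined) {
      return node;
    }
  }
  return null;
}
```

In a browser this would be called as `findEntityView(document)`. Once the old renders have expired from cache, the fallback entry can be dropped from the list.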

We changed the name in d1738c877111e273c2c6e33b4223c640fb66a82a, which just missed wmf.13; wmf.14 was deployed to Commons on Wednesday, 23 January, so the Varnish cache should theoretically by now be emptying and being replaced with new renders in which the DOM is labelled wbmi-entityview-content and not filepage-mediainfo-entityview, unless our transforms are entering the parser cache (which I thought we avoided with the hooks we were using).

Agreed about the DOM changes, I didn't think about the way the JS interacts with stale content.

matthiasmullie closed this task as Resolved. Feb 11 2019, 8:55 AM

All existing caches have been purged, which should solve the effects we were seeing here.
It appears there may not have been an actual issue at play, so there's nothing left to do (just need to be aware of different versions of markup in cache)

Ramsey-WMF reopened this task as Open. Feb 13 2019, 10:22 PM

Unfortunately, we're still getting sporadic instances of this bug on some pages. Either not all the caches actually got purged, or this is a more insidious bug.

Cparle added a comment. Edited Feb 14 2019, 10:27 AM

@Ramsey-WMF got some examples?

edit: sorry, never mind, https://commons.wikimedia.org/wiki/File:Lippendorf_power_station_04.jpg is banjaxed again

(sighs)

I think we'll need to do another purge - markup has changed again because of depicts, and then when that stuff was deployed for the CC0 issue it'll have caused this all over again

(sighs)

we're gonna have to manage markup changes a bit better than this ...

@matthiasmullie could you kick off another purge?

matthiasmullie closed this task as Resolved. Edited Feb 14 2019, 5:41 PM

Ran the purge again on the same dataset. A few pages from between when that dataset was generated and when the deploy was executed might have been missed: I suggest manually purging those on the off-chance you stumble upon one (before it falls out of cache anyway)
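
For anyone who needs to purge a handful of missed pages by script rather than by hand, here is a rough sketch that builds request bodies for the MediaWiki API's `action=purge` module. It is not how the purge above was run (the thread does not say). The helper names and the batch size of 50 titles per request are assumptions for illustration; each body would be sent as a form-encoded POST to the wiki's `api.php`.

```typescript
// Split a list into chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical helper: one form-encoded body per batch of page titles.
// batchSize = 50 is an assumed per-request title limit, not a value from this thread.
function buildPurgeBodies(titles: readonly string[], batchSize = 50): string[] {
  return chunk(titles, batchSize).map((batch) =>
    [
      "action=purge",
      "format=json",
      "titles=" + encodeURIComponent(batch.join("|")), // titles are pipe-separated
    ].join("&"),
  );
}
```

Building the bodies separately from sending them keeps the batching easy to check before anything touches the live cache.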