wbeditentity doesn't trigger page update on Wikimedia Commons
Closed, Duplicate · Public

Description

My bot is adding structured data on Commons to files. I notice an edit using wbeditentity doesn't trigger a page update. Because of this quite a few files are still in the wrong tracker category https://commons.wikimedia.org/wiki/Category:Pages_with_local_coordinates_and_missing_SDC_coordinates instead of in https://commons.wikimedia.org/wiki/Category:Pages_with_local_coordinates_and_matching_SDC_coordinates .

Would be better if wbeditentity on Commons triggers an update so that categorylinks etc. get updated.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 15 2020, 7:11 PM

Null edit triggers update, but looks like a purge also does that. At least that is a less intensive operation. As a workaround we can just mass purge suspected files.
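The mass-purge workaround could be scripted roughly as below. This is only a sketch: it assumes the standard MediaWiki `action=purge` module with `forcelinkupdate`, a limit of 50 titles per request (accounts with higher API limits may send more), and a hypothetical helper layout (`chunked`, `purge_params`, `purge_requests`, `send`). Authentication, rate limiting and error handling are left out.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, Iterator, List

COMMONS_API = "https://commons.wikimedia.org/w/api.php"
BATCH_SIZE = 50  # assumption: title limit per request without higher API limits


def chunked(titles: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` titles."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def purge_params(titles: List[str]) -> Dict[str, str]:
    """Build the POST parameters for one purge request."""
    return {
        "action": "purge",
        "titles": "|".join(titles),
        # Ask MediaWiki to refresh the links tables (categorylinks etc.),
        # not only the cached HTML.
        "forcelinkupdate": "1",
        "format": "json",
    }


def purge_requests(titles: List[str]) -> List[Dict[str, str]]:
    """Split a long title list into one parameter dict per request."""
    return [purge_params(batch) for batch in chunked(titles)]


def send(params: Dict[str, str], user_agent: str) -> dict:
    """POST one request to the Commons API (purge must be sent as POST)."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    request = urllib.request.Request(
        COMMONS_API, data=data, headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A bot would loop over `purge_requests(suspected_titles)` and call `send()` for each batch, pausing between requests as the wiki's bot policy requires.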

Multichill, I think this might be a duplicate of T237991. As someone who works with Commons categories based on SDC and Wikidata, I always seem to have a bot running null-edit batch tasks. I use AWB for that. Those edits do not show up in the page history or in the bot's edits, but I do a lot of them.

Another category that needs constant refreshing is c:Category:Pages_with_script_errors. There are often files there, but they all disappear after a null edit.

Multichill, I think this might be a duplicate of T237991.

I think you're right, but that bug hasn't even been touched by the SDC team. @Ramsey-WMF @Keegan who is supposed to manage these boards? It's a bit pointless of us filing bugs for the SDC team if nobody is noticing it.

I've added tags for the main boards the SDC team monitors. We do monitor as many relevant boards as possible, but incoming SDC work across WMDE and WMF focuses mainly on these at the moment.

Our team is small and, as you're aware, we're working on other important tasks at the moment. But we will get to this one as soon as we can. Thank you for your patience.

Restricted Application added a project: Structured-Data-Backlog. · Mar 16 2020, 3:31 PM

I just ran into this issue again. c:Category:Files with PermissionOTRS template but without P6305 SDC statement is added to files with an OTRS template but without a P6305 statement, and it has 47k files. I just added P6305 to 20k of them, and there are still 47k files in the category. I would like them to disappear from the category so I can use it to find more candidates.

I just ran into this issue again. c:Category:Files with PermissionOTRS template but without P6305 SDC statement is added to files with an OTRS template but without a P6305 statement, and it had 47k files on 4/14. I added P6305 to 20k of them, and afterwards there were still 47k files in the category. I would like them to disappear from the category so I can use it to find the next 20k batch. Otherwise I might have to apply a "touch" operation to the folder, which now has 70k files.
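One way to narrow down which files need a refresh, instead of touching the whole category, might be a search for files that are still listed in the tracking category although they already carry the statement. The sketch below only builds the API parameters. It assumes the CirrusSearch keywords `incategory:` and `haswbstatement:` are available on Commons; `stale_member_search` is a hypothetical helper name. The search index can itself lag behind, so the results are a hint rather than an authoritative list.

```python
from typing import Dict


def stale_member_search(category: str, prop: str, limit: int = 50) -> Dict[str, str]:
    """Build list=search parameters for files that are in `category`
    (name without the "Category:" prefix) and already have a statement
    for `prop`."""
    query = f'incategory:"{category}" haswbstatement:{prop}'
    return {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srnamespace": "6",  # File namespace
        "srlimit": str(limit),
        "format": "json",
    }


params = stale_member_search(
    "Files with PermissionOTRS template but without P6305 SDC statement",
    "P6305",
)
```

The titles returned for such a query could then be fed to a purge or null-edit batch.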

Change 602053 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[mediawiki/core@master] Remove unwanted parse step

https://gerrit.wikimedia.org/r/602053

Change 602053 merged by jenkins-bot:
[mediawiki/core@master] Remove unwanted parse step

https://gerrit.wikimedia.org/r/602053

This (finally - thanks @Tgr!) got merged.
@Jarekt or @Multichill, any chance either of you could verify that this got fixed? (happy to wait if it's too much hassle to try to reproduce!)

The easy cases shouldn't be hard to verify. Is this change live on Commons now?

Yes, that has already been deployed to Commons, so I'm hoping that this issue is now gone :)

It's not just an Oct. 18th thing - the incategory search shows files before that date as well, at a pretty consistent daily rate.
I can't figure out why it suddenly stopped on Oct. 18, though - did the relevant bot stop running?
It looks like this is not a regression, as it appears to have already been happening weeks ago (before this patch was even deployed)

Curious observation: when I do a null edit on those files now, the category disappears from the page & the elastic index, and it's gone from the search results. I have no idea why the bot's null edit didn't seem to have had that same effect.
Regardless - a null edit shouldn't be required, and I bet it's a similar issue that caused categories to not be updated. Looking into it.

Tgr added a comment. · Oct 22 2020, 5:51 PM

Presumably the bot doesn't do an actual null edit but calls the purge API? Those things are supposed to do the same but the actual code path is fairly different.

Nope, it's a real edit. It does even show up in the edit history every once in a while (some one byte glitch).
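For readers unfamiliar with the distinction being discussed, the two operations correspond to different API modules. The sketch below shows how the requests might differ. It assumes that an `action=edit` call with an empty `appendtext` is saved as a no-change (null) edit, and both helper names are hypothetical. It does not show what the bot in question actually sends.

```python
from typing import Dict


def purge_request(title: str) -> Dict[str, str]:
    """Purge: no revision is saved and no edit token is needed."""
    return {
        "action": "purge",
        "titles": title,
        "forcelinkupdate": "1",
        "format": "json",
    }


def null_edit_request(title: str, csrf_token: str) -> Dict[str, str]:
    """Null edit: goes through the normal page save path with unchanged
    content (assumption: an empty appendtext leaves the text as it is)."""
    return {
        "action": "edit",
        "title": title,
        "appendtext": "",
        "nocreate": "1",  # never create the page if it is missing
        "token": csrf_token,
        "format": "json",
    }
```

Because the edit request runs the page save path, save hooks fire for it, whereas the purge request bypasses them.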

matthiasmullie added a comment. · Edited · Oct 23 2020, 11:51 AM

I am able to reproduce in production, but things now work fine on a similar local setup.
However, if I call $mainContent->getParserOutput() in a PageContentSave (deprecated) hook, I manage to get a similar behavior.

ATM, I suspect that an extension hooks into the page save process and causes a premature parse, which ends up getting cached (causing behaviour similar to what the earlier patch addressed).
Still trying to figure out exactly how & where that happens, and what would solve it...

(also, clueless about why the null edit has no effect, but a manual purge/null edit later on does)