
[investigate] purging strategy
Closed, Resolved · Public

Description

When an editor makes an edit through the Wikidata Bridge we want to reflect their change in the article as soon as possible and as seamlessly as possible. How can we do this?

Issues to consider:

  • there is a time delay between the edit being made on the repository and that edit being dispatched to the client wiki
  • try to get a sustainable solution
  • how does this influence the possibility of reloading vs. in-place replacement (of the infobox)

Timebox this to max 4 hours.

Event Timeline

Note to researcher.
The normal way of this happening would be via the dispatch system.
I imagine whoever does this research will need to look into that a fair bit! :)

Charlie_WMDE renamed this task from research purging strategy to [investigate] purging strategy. Jul 24 2019, 3:12 PM

So if we purge the page via action=purge (probably with forcelinkupdate=1, though tbh I’m not quite sure what that controls) before reloading it, we shouldn’t have to worry about dispatch lag. That still leaves the issue of replication lag on the repo wiki, though: if the client wiki page is purged and re-rendered based on data from a replica database on the repo wiki that hasn’t seen the edit yet, there will still be stale data.

Usually, I believe MediaWiki guards against this with the ChronologyProtector, which saves the current replication position in the user’s session and waits until the replica selected for a request has at least caught up with that position. I don’t think that works very well cross-wiki; we could try to make it work, but it’s probably easier to duplicate its functionality client-side.

We can get the current maximum replication lag from the repo wiki’s API using [action=query&meta=siteinfo&siprop=dbrepllag](https://www.wikidata.org/w/api.php?action=query&meta=siteinfo&siprop=dbrepllag); if we repeat that request every 0.2 s or so until the max replication lag is less than the time since the edit was made (measuring that using the client-side clock should be good enough), and then we purge the page via the API and reload afterwards, then the user should be guaranteed to see fresh data. (There should probably be some limit here so that even if replication lag is sky-high, or the user’s clock jumps backwards or something, we don’t wait for more than, dunno, ten seconds.)
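
For illustration, here is a minimal TypeScript sketch of that polling-then-purge idea, assuming it runs in the browser on the client wiki; the URLs, function names and the ten-second cap are placeholders, not actual Wikidata Bridge code:

```typescript
// Sketch only: poll the repo for replication lag, purge the client page once
// every replica has caught up past the edit, then reload.
const repoApiUrl = 'https://www.wikidata.org/w/api.php';
const clientApiUrl = '/w/api.php';

async function getMaxReplicationLag(): Promise<number> {
	const url = repoApiUrl +
		'?action=query&meta=siteinfo&siprop=dbrepllag&format=json&origin=*';
	const response = await fetch( url );
	const data = await response.json();
	// dbrepllag reports the most lagged replica (lag in seconds)
	return data.query.dbrepllag[ 0 ].lag;
}

async function waitForReplication( editTime: number, timeoutMs = 10000 ): Promise<void> {
	const start = Date.now();
	while ( Date.now() - start < timeoutMs ) {
		const lagSeconds = await getMaxReplicationLag();
		const secondsSinceEdit = ( Date.now() - editTime ) / 1000;
		if ( lagSeconds < secondsSinceEdit ) {
			return; // every replica should have seen the edit by now
		}
		await new Promise( ( resolve ) => setTimeout( resolve, 200 ) );
	}
	// give up after the timeout, e.g. if lag is sky-high or the clock jumped
}

async function purgeAndReload( pageTitle: string, editTime: number ): Promise<void> {
	await waitForReplication( editTime );
	await fetch( clientApiUrl, {
		method: 'POST',
		headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
		body: new URLSearchParams( {
			action: 'purge',
			titles: pageTitle,
			forcelinkupdate: '1',
			format: 'json',
		} ),
	} );
	location.reload();
}
```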

One issue with that might be that the page could later be re-rendered again through the regular dispatch system, which would waste some server resources. We should check if dispatch has any checks built-in for whether a purge is necessary or not, and possibly add them. On the other hand, perhaps it’s not really a big deal?

Also, it would probably be good to investigate the ChronologyProtector a bit more instead of just writing it off as I did above :) sadly, it’s not really documented on mw.org.

T228896: investigate page reload vs. in place replacement after edit was a competitor ticket for that topic. During story writing we decided to close this and see if this investigation surfaces a solution for the question of completely reloading vs in-place replacement (of the infobox), too.

Usually, I believe MediaWiki guards against this with the ChronologyProtector, which saves the current replication position in the user’s session and waits until the replica selected for a request has at least caught up with that position. I don’t think that works very well cross-wiki; we could try to make it work, but it’s probably easier to duplicate its functionality client-side.

As far as I know, ChronologyProtector in RDBMS is set up to work with multiple DBs / connections, so it might be fine to use this in some way.
It would be worth checking with @aaron though.

One issue with that might be that the page could later be re-rendered again through the regular dispatch system, which would waste some server resources. We should check if dispatch has any checks built-in for whether a purge is necessary or not, and possibly add them. On the other hand, perhaps it’s not really a big deal?

AFAIK there are currently no such checks.
The parser output will have the timestamp of the parse / render.
If the dispatching knows the time at which the change it is dispatching for was made (not sure if it does), then a check could be performed.
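
Purely as an illustration of that check (not existing dispatch code, and written in TypeScript rather than the PHP the dispatcher actually uses), it could look roughly like:

```typescript
// Hypothetical sketch: skip re-rendering if the cached parser output is
// already newer than the change being dispatched.
interface CachedRender {
	renderTimestamp: string; // timestamp of the parse/render, e.g. '2019-07-24T15:12:00Z'
}

function needsRerender( changeTimestamp: string, cached: CachedRender | null ): boolean {
	if ( cached === null ) {
		return true; // nothing cached yet, must render
	}
	// only re-render if the change is newer than the cached render
	return Date.parse( changeTimestamp ) > Date.parse( cached.renderTimestamp );
}
```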

Also, it would probably be good to investigate the ChronologyProtector a bit more instead of just writing it off as I did above :) sadly, it’s not really documented on mw.org.

Yup

The termbox team's experience with the parser cache and their ADR could be related? See https://gerrit.wikimedia.org/r/530092

Note that CdnCacheUpdate queues a purge to happen X seconds later to help deal with lag (mediawiki-config has $wgCdnReboundPurgeDelay at 11). If lag gets near that amount, then $wgCdnMaxageLagged will kick in.

Okay, I’ve looked at ChronologyProtector in some more detail; the result is Manual:ChronologyProtector. As far as I can tell, if users don’t roam between multiple IP addresses, chronology protection should work even across multiple wikis and domains. (It would be nice to verify this experimentally, but that would probably be even more work.)

Edit: I just tried it out, and the client ID part of the cpPosIndex cookie is indeed the same between www.wikidata.org and test.wikidata.org.

So the recommended workflow is:

  1. User clicks save button
  2. API call to save the change
  3. On Success, another API call to purge the page with action=purge (and forcelinkupdate=1 in case the user's edit added or removed links, for example to other items)
  4. Once that returns, we reload the page or otherwise refresh the content (T228896 exists to follow up on this)

Is that correct?

(Side note: We might want to have another "loading" screen after clicking the save button, as those two requests could take some time?)
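
For illustration, the four steps could be wired together roughly like this (a sketch only: saveChange() stands in for whatever repo API call performs the edit, and purgeAndReload() is the lag-aware purge sketched further up in this task; all names are placeholders):

```typescript
// Placeholders for the pieces assumed above (not actual Bridge code).
declare function showLoadingIndicator(): void;
declare function saveChange(): Promise<void>;
declare function purgeAndReload( pageTitle: string, editTime: number ): Promise<void>;

async function onSaveButtonClicked( pageTitle: string ): Promise<void> {
	showLoadingIndicator(); // the extra "loading" screen from the side note

	// 1. + 2. user clicked save, perform the edit on the repo
	await saveChange();
	const editTime = Date.now();

	// 3. purge the client page (forcelinkupdate happens inside purgeAndReload),
	// 4. then reload or otherwise refresh the content
	await purgeAndReload( pageTitle, editTime );
}
```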

I think the forcelinkupdate option is just a very confusing name for a fuller purge (I’ll investigate if we need it), but apart from that, yes.

I think the forcelinkupdate option is just a very confusing name for a fuller purge (I’ll investigate if we need it), but apart from that, yes.

https://www.mediawiki.org/wiki/Manual:Purge#Null_edits seems to indicate that it updates the "what links here" tables and others. I thought this could be relevant when the user adds/deletes references, because parts of those references might be rendered as links to other pages due to existing sitelinks.

OTOH: That probably happens via the usual dispatch refresh anyway, and we don't care whether it happens some seconds sooner or later? Not sure 🤷

Well, I was more worried that a non-forcelinkupdate purge might not update the parser cache. But as far as I can tell:

  • A regular purge invalidates the parser cache. The page will be re-parsed the next time it is requested. (Code can still get the cached parser output via ParserCache::getDirty(). RefreshLinksJob uses this and is happy to use a dirty output as long as its timestamp is recent enough.) This is also what ?action=purge does.
  • A purge with forcelinkupdate immediately updates the parser cache (for the canonical parser options), and then performs “secondary data updates”, including but not limited to a LinksUpdate. Content and ContentHandler instances can register any other updates they want here, and the AbstractContent implementation further delegates this to the [SecondaryDataUpdates hook](https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:Hooks/SecondaryDataUpdates).
  • A purge with forcerecursivelinkupdate does the same thing, but sets the flag $recursive = true for the secondary data updates. For LinksUpdate, this means to enqueue jobs for other pages linking to the linked-to pages via templatelinks or imagelinks (those jobs are in turn also recursive). The flag is also passed into the now-deprecated Content::getSecondaryDataUpdates() (and the aforementioned hook), but not into its replacement ContentHandler::getSecondaryDataUpdates().

A regular purge might be enough for us, but that just means that we’ll wait for the parser on the page reload, whereas with forcelinkupdate the parse happens while we’re still showing the dialog – I think that would be better for the user. I’m also uneasy about RefreshLinksJob using ParserCache::getDirty().
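
To make the three flavours concrete, here is a small sketch of the corresponding API requests (the API URL and page title are made up; the three parameters are the real action=purge options described in the list above):

```typescript
// Sketch: the same action=purge request with the three different flags.
async function purge( apiUrl: string, title: string, extra: Record<string, string> = {} ): Promise<void> {
	await fetch( apiUrl, {
		method: 'POST',
		headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
		body: new URLSearchParams( { action: 'purge', titles: title, format: 'json', ...extra } ),
	} );
}

async function demo(): Promise<void> {
	// regular purge: invalidates the parser cache, re-parse happens on the next request
	await purge( '/w/api.php', 'Example' );
	// forcelinkupdate: re-parses immediately and runs secondary data updates (LinksUpdate etc.)
	await purge( '/w/api.php', 'Example', { forcelinkupdate: '1' } );
	// forcerecursivelinkupdate: same, but the secondary data updates are recursive
	await purge( '/w/api.php', 'Example', { forcerecursivelinkupdate: '1' } );
}
```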

Pablo-WMDE added a subscriber: Charlie_WMDE.

These are sound explanations - updated our sequence diagram [0] to contain the "purging strategy". Should be actionable in a story => T235208: Step 1: Show updated information after saving (impact: high).

We mentioned that T128486: [Story] Make Special:EntityData be up to date after an edit needs additional love (will not automagically be covered by the new story) to overcome the entity being cached in the client (e.g. when re-opening bridge).

Michael wrote

We might want to have another "loading" screen after clicking the save button, as those two requests could take some time

I bounced this question off of @Charlie_WMDE in T232468: Illustrate loading (which is related, to a point) - I consider our job done in that regard.

[0] Untitled(3).png (sequence diagram attachment, 130 KB)