
[Story] Purge Special:EntityData JSON after edit
Open, High · Public

Description

When an entity is edited users expect the data they get via Special:EntityData to change as well. We need to purge the caches there after an edit.

Event Timeline

Restricted Application added subscribers: StudiesWorld, Aklapper. Mar 1 2016, 6:05 PM
Legoktm added a subscriber: Legoktm. Mar 2 2016, 6:37 PM

Use the TitleSquidURLs hook?

Smalyshev added a comment. Edited Mar 2 2016, 9:04 PM

Does TitleSquidURLs require the full list of URLs?
Because if we have something like https://www.wikidata.org/wiki/Special:EntityData/Q3361378.ttl?flavor=dump, then the ttl part can be any of a bunch of formats, and so can the flavor.

There's EntityDataRequestHandler::purgeWebCache, which is supposed to do the purging, and it uses EntityDataUriManager::getCacheableUrls, but I can't tell whether it handles flavors.

I did some quick testing, and it looks like action=purge indeed does not purge URLs like https://www.wikidata.org/wiki/Special:EntityData/Q4115189.ttl?flavor=dump. This looks like an independent bug.

If we cache them we should purge them. But I'm worried about the performance implications of sending a purge for every possible combination of parameters.
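To get a feel for how many purges "every possible combination" means, here is a rough sketch that enumerates the format/flavor variants for a single entity. The format and flavor lists are illustrative assumptions, not the authoritative sets supported by Special:EntityData, and `variant_urls` is a hypothetical helper, not part of Wikibase.

```python
from itertools import product
from urllib.parse import urlencode

BASE = "https://www.wikidata.org/wiki/Special:EntityData"

# Assumed, illustrative lists; the real sets of formats and flavors may differ.
FORMATS = ["json", "ttl", "rdf", "nt"]
FLAVORS = [None, "dump", "simple", "full"]  # None means no flavor parameter


def variant_urls(entity_id, formats=FORMATS, flavors=FLAVORS):
    """Return every format/flavor URL variant for one entity."""
    urls = []
    for fmt, flavor in product(formats, flavors):
        url = f"{BASE}/{entity_id}.{fmt}"
        if flavor is not None:
            url += "?" + urlencode({"flavor": flavor})
        urls.append(url)
    return urls


urls = variant_urls("Q4115189")
print(len(urls), "URLs would need purging after one edit")
```

The count is the product of the two list lengths, so every additional cacheable parameter multiplies the number of purges sent per edit.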

I hear the new Varnish version (was it 4?) allows you to put multiple "variants" of a URL into a single "bucket". That would help.

See T128667 for flavor parameter task.

hoo added a subscriber: hoo. Mar 3 2016, 1:03 AM

Yeah... you'd have to purge each variant of the format (ttl etc.) and the flavor parameter individually, at least right now.

This comment was removed by Smalyshev.
Lydia_Pintscher triaged this task as High priority. Apr 3 2016, 12:27 PM
Lydia_Pintscher moved this task from incoming to ready to go on the Wikidata board.
daniel added a comment. Dec 5 2016, 5:33 PM

We could rely on xkey for purging; see the discussion of xkey at T114662: RFC: Per-language URLs for multilingual wiki pages.

Adding to the general Wikidata Bridge board, since this means users may see stale data when starting an edit (even though the page content will typically have the fresh data). We just discovered this on Beta:

The Twitter hashtag value on Beta Wikidata was changed from WikidataCon to Wikidata; the infobox has the new value (was automatically updated through change dispatching), but the bridge dialog loaded the entity data via the special page and got a stale value.

(The termbox also loads Special:EntityData, but doesn’t have this problem because it always requests the data for the mw.config.get( 'wgRevisionId' ) revision since T215786.)

Pablo-WMDE added a subscriber: Pablo-WMDE. Edited Oct 16 2019, 2:56 PM

Crazy idea suggested by a pragmatic fellow programmer: why don't we simply use our API if we don't want stale information (at least as a workaround)?

That’s a possible workaround, of course, but it causes additional network traffic and server load.

It would be great to be able to use the cached Special:EntityData page once we want the bridge to work at scale, to avoid increased network and server load.
But for now, in an MVP, I see no reason why we can't use the uncached wbgetentities API.
Alternatively, we could call an API to look up the latest revid first, then request the (possibly cached) Special:EntityData page for that revision (but that's more work).
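The two-step alternative could look roughly like the sketch below. It only builds URLs and reads a revision ID out of an already-parsed response, so it makes no network requests. It assumes that wbgetentities with props=info reports a lastrevid field and that Special:EntityData accepts a revision query parameter; the helper names and the sample response are hypothetical.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"
ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData"


def latest_revision_query(entity_id):
    """Build the (uncached) API URL that reports the entity's latest revision."""
    params = {
        "action": "wbgetentities",
        "ids": entity_id,
        "props": "info",  # assumed to include lastrevid
        "format": "json",
    }
    return API + "?" + urlencode(params)


def extract_revid(api_response, entity_id):
    """Pull the latest revision ID out of a parsed wbgetentities response."""
    return api_response["entities"][entity_id]["lastrevid"]


def entity_data_url(entity_id, revid, fmt="json"):
    """Build a revision-pinned Special:EntityData URL (assumes ?revision= is supported)."""
    return f"{ENTITY_DATA}/{entity_id}.{fmt}?" + urlencode({"revision": revid})


# Hypothetical response, trimmed to the one field this sketch reads.
sample_response = {"entities": {"Q4115189": {"lastrevid": 123456}}}

revid = extract_revid(sample_response, "Q4115189")
print(latest_revision_query("Q4115189"))
print(entity_data_url("Q4115189", revid))
```

Because the revision ID is part of the URL, a cached response for that URL can never be stale for that revision. The cost is one extra small API request per load.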