
Make querycache, querycachetwo and querycache_info tables visible on labs dbs
Open, Medium, Public

Description

It would be cool if those tables could be made available. They essentially contain cached versions of special pages, which mostly isn't useful to people, but on occasion can be.

Entries in the querycache table with qc_type = Unwatchedpages may be considered sensitive. Nothing else in these two tables is sensitive afaik.
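A minimal sketch of how the sensitive rows could be kept out of a public view. It uses an in-memory SQLite database as a stand-in (the real replicas run on a different engine), the view name querycache_public is hypothetical, and the sample rows are made up for illustration:

```python
import sqlite3

# Hypothetical stand-in for a wiki database; not the real replica setup.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE querycache ("
    " qc_type TEXT NOT NULL,"
    " qc_value INTEGER NOT NULL,"
    " qc_namespace INTEGER NOT NULL,"
    " qc_title TEXT NOT NULL)"
)
# Made-up sample rows, one per query page type.
conn.executemany(
    "INSERT INTO querycache VALUES (?, ?, ?, ?)",
    [
        ("Longpages", 120000, 0, "Example_long_page"),
        ("Unwatchedpages", 0, 0, "Example_unwatched_page"),
        ("Ancientpages", 20010115000000, 0, "Example_old_page"),
    ],
)

# Hypothetical view name: expose every row except the sensitive type.
conn.execute(
    "CREATE VIEW querycache_public AS"
    " SELECT qc_type, qc_value, qc_namespace, qc_title"
    " FROM querycache WHERE qc_type <> 'Unwatchedpages'"
)

visible_types = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT qc_type FROM querycache_public")
)
print(visible_types)  # ['Ancientpages', 'Longpages']
```

A deny-list like this only covers the types known to be sensitive today; an allow-list of known-public types would be the more conservative variant.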

These can already be queried by viewing the special pages and by fetching from API:querypage. The advantages of accessing the data from labs would be lower latency, the ability to run batch queries, and the ability to quickly query multiple wikis.
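For comparison, a rough sketch of what the existing API route looks like: one request per wiki and per query page. The helper name querypage_url and the choice of hosts are only for illustration; the block builds the URLs and does not send any requests:

```python
from urllib.parse import urlencode


def querypage_url(wiki_host: str, page: str, limit: int = 50) -> str:
    """Build an API:querypage request URL for one wiki (illustrative helper)."""
    params = {
        "action": "query",
        "list": "querypage",
        "qppage": page,    # internal query page name, e.g. "Ancientpages"
        "qplimit": limit,  # number of cached rows to return
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"


# One URL per wiki: querying several wikis means several separate requests.
hosts = ("en.wikipedia.org", "commons.wikimedia.org")
urls = [querypage_url(host, "Ancientpages", limit=10) for host in hosts]
for url in urls:
    print(url)
```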

Details

Reference
bz63782

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:13 AM
bzimport added a project: Toolforge.
bzimport set Reference to bz63782.

This will require an okay from legal (though I do not anticipate difficulties).

The only concern that I can see is how often these get stale. Is there any risk that a page with sensitive content would be suppressed or reverted but still available through the cache?

(In reply to Luis Villa (personal-for work use lvilla@wikimedia.org) from comment #2)

Only concern that I can see is how often these get stale? is there any risk
that a page with sensitive content would be suppressed or reverted but still
available through the cache?

Yes. Most are updated once every 4 days; some take longer. But in that case I believe it would still be visible on the main site (not 100% sure about that, but pretty sure).
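As a rough sketch of how a tool could check for staleness, assuming the refresh time is read from querycache_info as a 14-digit UTC timestamp (YYYYMMDDHHMMSS). The function name is_stale, the sample timestamps, and the use of 4 days as the threshold are illustrative only:

```python
from datetime import datetime, timedelta, timezone


def is_stale(
    qci_timestamp: str,
    now: datetime,
    max_age: timedelta = timedelta(days=4),
) -> bool:
    """Return True if the cache was refreshed more than max_age before now."""
    refreshed = datetime.strptime(qci_timestamp, "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )
    return now - refreshed > max_age


# Fixed reference time so the example is reproducible.
now = datetime(2014, 11, 22, 3, 13, tzinfo=timezone.utc)
print(is_stale("20141121000000", now))  # False: refreshed about a day earlier
print(is_stale("20141101000000", now))  # True: refreshed three weeks earlier
```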

You mean, we're still showing suppressed stuff on the main site in some cases? Or that more generally we may just have some stale caching around?

(In reply to Luis Villa (WMF Legal) from comment #4)

You mean, we're still showing suppressed stuff on the main site in some
cases? Or that more generally we may just have some stale caching around?

If someone deletes a page (and then suppresses it or whatever), the page name may still appear on some list special pages for a while, e.g. mostlinkedtopages and the like. However, nobody can view the content of such a page, only that it was on the list.

scfc triaged this task as Medium priority. Apr 7 2015, 4:57 AM
scfc updated the task description.
scfc set Security to None.
scfc moved this task from Ready to be worked on to Waiting for information on the Toolforge board.
scfc added a subscriber: WMF-Legal.
scfc added subscribers: Slaporte, coren.

@Slaporte: Feel free to assign to someone else from Legal.

Restricted Application added a subscriber: Aklapper.

@scfc, I'm curious which user groups can see the data in querycache/querycachetwo through MediaWiki. Are any of these special pages restricted to admins? Can you post/link to a list of special pages that are in the table?

@Bawolff, as I would have to make educated guesses, could you please answer @Slaporte's questions if you can?

@scfc, I'm curious which user groups can see the data in querycache/querycachetwo through MediaWiki. Are any of these special pages restricted to admins? Can you post/link to a list of special pages that are in the table?

querycachetwo -> Special:ActiveUsers

querycache:

Extensions can add additional pages. On enwikipedia the list includes (Taken from API - https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bquerypage):

Ancientpages, BrokenRedirects, Deadendpages, DoubleRedirects, ListDuplicatedFiles, Listredirects, Lonelypages, Longpages, MediaStatistics, Mostcategories, Mostimages, Mostinterwikis, Mostlinkedcategories, Mostlinkedtemplates, Mostlinked, Mostrevisions, Fewestrevisions, Shortpages, Uncategorizedcategories, Uncategorizedpages, Uncategorizedimages, Uncategorizedtemplates, Unusedcategories, Unusedimages, Wantedcategories, Wantedfiles, Wantedpages, Wantedtemplates, Unwatchedpages, Unusedtemplates, Withoutinterwiki, DisambiguationPages, DisambiguationPageLinks, UnconnectedPages, Badges

In particular, Unwatchedpages is restricted. The biggest risk is probably if somebody adds a restricted query page at a later date.

These are internal names. The actual special page name might be different.

If an extension adds stuff to $wgAPIUselessQueryPages, it will be excluded from this list; however, things that use the querycache table are not supposed to be added to that variable. I don't think any extensions add stuff to that variable, so I don't think it's relevant, but I'm mentioning it for completeness.

hmm. Special:MostGloballyLinkedFilesPage doesn't seem to be listed there on commons, but I think that's due to https://gerrit.wikimedia.org/r/242790

Krinkle renamed this task from "querycache and querycachetwo tables aren't available on labs sql dbs" to "Make querycache, querycachetwo and querycache_info tables visible on labs dbs". Oct 28 2015, 7:26 AM
Krinkle updated the task description.
Framawiki edited projects, added Data-Services; removed Cloud-Services.
Framawiki edited projects, added Cloud-Services; removed Data-Services.
Framawiki edited projects, added Data-Services; removed Cloud-Services.

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)