
Make querycache, querycachetwo and querycache_info tables visible on labs dbs
Open, Normal, Public

Description

It would be cool if those tables could be made available. They essentially contain cached versions of special pages, which mostly isn't useful to people, but on occasion can be.

Entries in the querycache table with qc_type = Unwatchedpages may be considered sensitive. Nothing else in these two tables is sensitive, afaik.

These can already be queried by viewing the special pages and by fetching from API:querypage. The advantages of accessing them from labs would be lower latency, the ability to run batch queries, and the ability to quickly query multiple wikis.
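For reference, the existing API route mentioned above can be sketched as follows. This is a minimal illustration, not part of the original request: the helper name `querypage_url` and the chosen hosts are assumptions, and the code only builds the request URLs (one per wiki) rather than fetching them.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def querypage_url(wiki_host, page, limit=10):
    """Build an API URL for list=querypage on the given wiki host."""
    params = {
        "action": "query",
        "list": "querypage",
        "qppage": page,    # internal query page name, e.g. "Ancientpages"
        "qplimit": limit,  # number of rows to return
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"


# One HTTP request per wiki is needed on this route; the hosts are illustrative.
urls = [
    querypage_url(host, "Ancientpages")
    for host in ("en.wikipedia.org", "commons.wikimedia.org")
]
```

Each wiki needs its own round trip here, which is the latency and batching cost that direct access from labs would avoid.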

Details

Reference
bz63782

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:13 AM
bzimport added a project: Toolforge.
bzimport set Reference to bz63782.
Bawolff created this task. Apr 10 2014, 6:15 PM
coren added a comment. May 6 2014, 1:36 AM

This will require an okay from legal (though I do not anticipate difficulties).

The only concern that I can see is how often these get stale. Is there any risk that a page with sensitive content would be suppressed or reverted but still available through the cache?

(In reply to Luis Villa (personal-for work use lvilla@wikimedia.org) from comment #2)

Only concern that I can see is how often these get stale? is there any risk
that a page with sensitive content would be suppressed or reverted but still
available through the cache?

Yes. Most are updated once every 4 days; some take longer. But in that case I believe it would still be visible on the main site (not 100% sure about that, but pretty sure).

You mean, we're still showing suppressed stuff on the main site in some cases? Or that more generally we may just have some stale caching around?

(In reply to Luis Villa (WMF Legal) from comment #4)

You mean, we're still showing suppressed stuff on the main site in some
cases? Or that more generally we may just have some stale caching around?

If someone deletes a page (and then suppresses it or whatever), the page name may still appear on some list special pages for a bit of time, e.g. mostlinkedtopages and the like. However, nobody can view the content of such a page, just that it was on the list.
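To make the staleness question concrete, here is a rough sketch of how the age of a cached list could be computed from a querycache_info row. It assumes qci_timestamp holds a 14-digit MediaWiki timestamp (YYYYMMDDHHMMSS, UTC); the function name `cache_age_days` and the sample values are made up for illustration.

```python
from datetime import datetime, timezone


def cache_age_days(qci_timestamp, now=None):
    """Return the age in days of a cache entry.

    Assumes qci_timestamp is a 14-digit MediaWiki timestamp in UTC.
    """
    updated = datetime.strptime(qci_timestamp, "%Y%m%d%H%M%S")
    updated = updated.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - updated).total_seconds() / 86400


# Illustrative values only: a list last refreshed exactly four days earlier.
now = datetime(2014, 5, 10, 12, 0, 0, tzinfo=timezone.utc)
age = cache_age_days("20140506120000", now)
```

A tool reading these tables could use such a check to flag lists that may still name pages deleted since the last refresh.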

coren moved this task from Triage to Backlog on the Toolforge board. Nov 25 2014, 4:14 PM
scfc triaged this task as Normal priority. Apr 7 2015, 4:57 AM
scfc updated the task description.
scfc set Security to None.
scfc moved this task from Backlog to Waiting for information on the Toolforge board.
scfc added a subscriber: WMF-Legal.
scfc reassigned this task from coren to Slaporte. Sep 30 2015, 10:07 AM
scfc added subscribers: Slaporte, coren.

@Slaporte: Feel free to assign to someone else from Legal.

Restricted Application added a project: Cloud-Services. Sep 30 2015, 10:07 AM
Restricted Application added a subscriber: Aklapper.

@scfc, I'm curious which user groups can see the data in querycache/querycachetwo through MediaWiki. Are any of these special pages restricted to admins? Can you post/link to a list of special pages that are in the table?

scfc added a comment. Oct 1 2015, 2:26 AM

@Bawolff, as I would have to make educated guesses, could you please answer @Slaporte's questions if you can?

@scfc, I'm curious which user groups can see the data in querycache/querycachetwo through MediaWiki. Are any of these special pages restricted to admins? Can you post/link to a list of special pages that are in the table?

querycachetwo -> Special:ActiveUsers

querycache:

Extensions can add additional pages. On enwikipedia the list includes (Taken from API - https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bquerypage):

Ancientpages, BrokenRedirects, Deadendpages, DoubleRedirects, ListDuplicatedFiles, Listredirects, Lonelypages, Longpages, MediaStatistics, Mostcategories, Mostimages, Mostinterwikis, Mostlinkedcategories, Mostlinkedtemplates, Mostlinked, Mostrevisions, Fewestrevisions, Shortpages, Uncategorizedcategories, Uncategorizedpages, Uncategorizedimages, Uncategorizedtemplates, Unusedcategories, Unusedimages, Wantedcategories, Wantedfiles, Wantedpages, Wantedtemplates, Unwatchedpages, Unusedtemplates, Withoutinterwiki, DisambiguationPages, DisambiguationPageLinks, UnconnectedPages, Badges

In particular, Unwatchedpages is restricted. The biggest risk is probably that somebody adds a restricted query page at a later date.

These are internal names. The actual special page name might be different.

If an extension adds stuff to $wgAPIUselessQueryPages, it will be excluded from this list; however, things that use the querycache table are not supposed to be added to that variable. I don't think any extensions add stuff to that variable, so I don't think it's relevant, but I'm mentioning it for completeness.

hmm. Special:MostGloballyLinkedFilesPage doesn't seem to be listed there on commons, but I think that's due to https://gerrit.wikimedia.org/r/242790
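One way to guard against the risk noted above (a restricted query page being added later) would be to expose rows by allowlist rather than by excluding known-sensitive types. The sketch below illustrates the idea with an in-memory SQLite table; it is not how the labs replicas are actually configured, and `PUBLIC_TYPES`, `public_rows`, the sample rows, and the type "Newrestrictedpage" are all hypothetical.

```python
import sqlite3

# Hypothetical allowlist: only types reviewed as public are exposed.
PUBLIC_TYPES = ("Ancientpages", "Mostlinked", "Shortpages")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE querycache ("
    "qc_type TEXT, qc_value INTEGER, qc_namespace INTEGER, qc_title TEXT)"
)
conn.executemany(
    "INSERT INTO querycache VALUES (?, ?, ?, ?)",
    [
        ("Ancientpages", 1, 0, "Example_A"),
        ("Unwatchedpages", 0, 0, "Example_B"),
        # Made-up type standing in for a restricted page added at a later date.
        ("Newrestrictedpage", 5, 0, "Example_C"),
    ],
)


def public_rows(connection):
    """Return only rows whose qc_type is on the allowlist."""
    placeholders = ", ".join("?" for _ in PUBLIC_TYPES)
    sql = (
        "SELECT qc_type, qc_title FROM querycache "
        f"WHERE qc_type IN ({placeholders}) ORDER BY qc_type"
    )
    return connection.execute(sql, PUBLIC_TYPES).fetchall()


rows = public_rows(conn)
```

With an allowlist, a newly added query page stays hidden until someone reviews it, whereas a filter that only excludes Unwatchedpages would expose it by default.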

Krinkle renamed this task from "querycache and querycachetwo tables aren't available on labs sql dbs" to "Make querycache, querycachetwo and querycache_info tables visible on labs dbs". Oct 28 2015, 7:26 AM
Krinkle updated the task description.
Danny_B removed a subscriber: WMF-Legal.
Restricted Application added a subscriber: JEumerus. Aug 8 2016, 11:41 AM
ZhouZ moved this task from Backlog to Assigned on the WMF-Legal board. Sep 23 2016, 6:06 PM
Framawiki moved this task from Toolforge to Database on the Cloud-Services board. Nov 13 2018, 8:07 PM
Framawiki edited projects, added Data-Services; removed Cloud-Services.
Framawiki added a subscriber: Framawiki.

Ping @Slaporte :)

Framawiki edited projects, added Data-Services; removed Cloud-Services.
Framawiki edited projects, added Cloud-Services; removed Data-Services.
Framawiki edited projects, added Data-Services; removed Cloud-Services.