Page MenuHomePhabricator

Offer alternate views of the comment and actor tables which only check for supression in a single table in the Wiki Replicas
Open, MediumPublic

Description

Inspired by a question posed by @Pathoschild on the cloud mailing list.

The comments view is not aware of the type of entity that a given query is interested in. It is designed so that you could do a query like select * from comment and it would properly suppress any and all comments that should not be publicly viewable. To accomplish this, the view's WHERE clause includes 9 correlated subqueries against 8 other tables.

Most uses of the comment table would only be interested in the comments that go with rows from a single "entity" table (for example ipblocks). Having additional views that are restricted to a single entity type should provide some performance improvements to queries that know to use a specialized view.

The actor view has this identical issue as well.

Event Timeline

Change 513943 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] wiki replicas: Add specialized views of the "comment" table

https://gerrit.wikimedia.org/r/513943

If we can find a solution that folks like here, the actor table's view could use the same treatment.

This works, and I support this change, althought it may only fix queries in comment -> other table direction; due to temp* tables, other table -> comment direction may not be as efficient as the columns, for the most part do not exist yet (but will eventually).

I'm going to do some testing on this locally quick. It seems like a workable approach, though I also really like the third option mentioned in T215445.

If nothing else, this may shore some things up until a more holistic approach that is less surprising and weird for volunteers can be implemented.

So this seems to work really smoothly. This should get some of our users past the performance hit for now. The cost is some complexity that isn't nearly as bad as the _compat views. I think it is very clear what these do.

I could get a similar patch up for actor.

Change 513943 merged by Bstorm:
[operations/puppet@production] wiki replicas: Add specialized views of the "comment" table

https://gerrit.wikimedia.org/r/513943

Change 514548 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] wiki replicas: Add specialized views of the "actor" table

https://gerrit.wikimedia.org/r/514548

bd808 renamed this task from Offer alternate views of the comment table which only check for supression in a single table in the Wiki Replicas to Offer alternate views of the comment and actor tables which only check for supression in a single table in the Wiki Replicas.Jun 5 2019, 5:31 PM
bd808 updated the task description. (Show Details)

Change 514548 merged by Bstorm:
[operations/puppet@production] wiki replicas: Add specialized views of the "actor" table

https://gerrit.wikimedia.org/r/514548

Ok, now these just need to go out. I'll check in the morning if the maintenance can be paused and these deployed.

Mentioned in SAL (#wikimedia-operations) [2019-06-06T14:57:18Z] <bstorm_> T224850 update views on labsdb1012

Bstorm moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

Change 514744 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: depool labsdb1010 for view updates

https://gerrit.wikimedia.org/r/514744

Change 514744 merged by Bstorm:
[operations/puppet@production] wikireplicas: depool labsdb1010 for view updates

https://gerrit.wikimedia.org/r/514744

Mentioned in SAL (#wikimedia-operations) [2019-06-06T15:56:53Z] <bstorm_> T224850 depooled labsdb1010 for view updates

Mentioned in SAL (#wikimedia-operations) [2019-06-06T18:24:15Z] <bstorm_> T224850 repooled labsdb1010 after completing view run

Change 514882 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: depool labsdb1011 for view updates

https://gerrit.wikimedia.org/r/514882

Change 514882 merged by Bstorm:
[operations/puppet@production] wikireplicas: depool labsdb1011 for view updates

https://gerrit.wikimedia.org/r/514882

Change 514971 had a related patch set uploaded (by Jhedden; owner: Jhedden):
[operations/puppet@production] wikireplicas: depool labsdb1009 for view updates

https://gerrit.wikimedia.org/r/514971

Change 514971 merged by Jhedden:
[operations/puppet@production] wikireplicas: depool labsdb1009 for view updates

https://gerrit.wikimedia.org/r/514971

Mentioned in SAL (#wikimedia-operations) [2019-06-07T00:00:45Z] <bstorm_> T224850 repooled labsdb1009 after completing view updates

The views are up on all replicas. Now we just need to document it and perhaps push the message out.

The wiki replicas docs are kind of empty for user content, actually. Since the replicas are currently a bit out of phase with production, that's a place I should probably add something https://wikitech.wikimedia.org/wiki/Portal:Data_Services#Wiki_Replicas

bd808 lowered the priority of this task from High to Medium.Jun 11 2019, 11:44 PM

Lowering priority. Changes announced on cloud-announce mailing list. Still need to document in more places on wikitech.

@Bstorm : looks like https://gerrit.wikimedia.org/r/513943 did not include a definition of comment_archive, was that intentional? I would expect there to be a comment_archive table to allow for joining with the archive table when querying archived edit comments (similar to how there's an actor_archive table), but that table doesn't exist on the replicated databases on Toolforge.

@Nettrom archive isn't one of the subqueries in the comment view. Therefore, that is intentional. We only gain a benefit here if we can drop all subqueries but one, so the specialized view has to be one that links a table that is subqueried (such as filearchive or ipblocks) to generate the comment view. comment_archive would simply be the same comment view as it currently is.