
Add global_edit_count to wikireplicas
Open, Needs Triage, Public, Feature

Description

The global_edit_count table was added to centralauth in 886ec32e351b4534485a9a74392287b87c85e849 (as part of T130439). This table would be useful for queries, instead of having to sum user.user_editcount across hundreds of wikis.
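For comparison, the per-wiki summation this task wants to avoid can be sketched roughly as follows. This is only an illustration: it uses in-memory SQLite databases as stand-ins for the individual wiki databases, a simplified two-column `user` table, and hypothetical helper names (`make_wiki`, `summed_edit_count`). A real query would run against the replica databases and would normally match accounts through CentralAuth rather than by name alone.

```python
import sqlite3


def make_wiki(rows):
    """Build a throwaway in-memory stand-in for one wiki's `user` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user (user_name TEXT PRIMARY KEY, user_editcount INTEGER)"
    )
    conn.executemany("INSERT INTO user VALUES (?, ?)", rows)
    return conn


def summed_edit_count(wikis, name):
    """One query per wiki, summed client-side (the approach the task describes)."""
    total = 0
    for conn in wikis:
        row = conn.execute(
            "SELECT user_editcount FROM user WHERE user_name = ?", (name,)
        ).fetchone()
        # A user may be missing on a wiki, or have a NULL edit count there.
        if row is not None and row[0] is not None:
            total += row[0]
    return total


# Three toy "wikis"; the real fleet has hundreds.
wikis = [
    make_wiki([("Example", 10)]),
    make_wiki([("Example", 5), ("Other", 7)]),
    make_wiki([("Other", 1)]),
]
```

With global_edit_count exposed, the loop over every wiki would collapse into a single lookup against centralauth.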

Event Timeline

The Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag to this task. Thanks!

Change 990790 had a related patch set uploaded (by AntiCompositeNumber; author: AntiCompositeNumber):

[operations/puppet@production] Add global_edit_count to fullviews

https://gerrit.wikimedia.org/r/990790

taavi subscribed.

AIUI, Data Engineering currently reviews the view changes to ensure the data is OK to publish, and then WMCS (or Data Platform?) SREs deploy them. Let's use this as an example to figure out the review process and document it at https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Runbooks/Deploy_wiki_replicas_view_change#Review_process.

@lbowmaker, @WDoranWMF, @Ahoelzl - Would you be able to help us to define the procedure here please?
We have a new request for a change to the wikireplica views.

@taavi has tagged the ticket with Data-Platform (which I think is correct) and it's been in the backlog for a while.

Either Data-Platform-SRE or cloud-services-team could apply the patch, but I think that we need a review from Data-Engineering in order to be sure that we meet the privacy requirements.

The change seems simple enough (adding global_edit_count), but I wouldn't want to say, personally, whether this is permissible from a privacy standpoint.

I agree that this would be a good opportunity to get the review process defined a little better.

I suggest filtering out every global_edit_count row whose gec_user is itself filtered out of globaluser. Only rows for users with gu_hidden_level=0 should be visible (as a join condition or a NOT EXISTS condition on gec_user=gu_id).
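A minimal sketch of that filter, under several assumptions: it runs on in-memory SQLite rather than the MariaDB replicas, the view name `global_edit_count_public` is hypothetical, and `gec_count` is assumed to be the name of the count column (only gec_user, gu_id and gu_hidden_level are named in this task). It uses an EXISTS condition on visible globaluser rows, which also drops counts with no matching globaluser row at all. The actual view definition would live in the puppet patch, not in code like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-ins for the centralauth tables.
    CREATE TABLE globaluser (
        gu_id INTEGER PRIMARY KEY,
        gu_name TEXT,
        gu_hidden_level INTEGER NOT NULL
    );
    CREATE TABLE global_edit_count (
        gec_user INTEGER PRIMARY KEY,
        gec_count INTEGER NOT NULL  -- assumed column name
    );

    INSERT INTO globaluser VALUES (1, 'Visible', 0), (2, 'Hidden', 1);
    -- User 3 has a count but no globaluser row.
    INSERT INTO global_edit_count VALUES (1, 120), (2, 45), (3, 9);

    -- Hypothetical public view: keep a count only if its user is visible.
    CREATE VIEW global_edit_count_public AS
    SELECT gec_user, gec_count
    FROM global_edit_count
    WHERE EXISTS (
        SELECT 1
        FROM globaluser
        WHERE gu_id = gec_user
          AND gu_hidden_level = 0
    );
    """
)


def visible_rows(connection):
    """Return the rows a replica user would see through the filtered view."""
    return connection.execute(
        "SELECT gec_user, gec_count FROM global_edit_count_public ORDER BY gec_user"
    ).fetchall()
```

In this toy data only user 1 remains visible: user 2 is hidden and user 3 has no globaluser row. An inner join on gec_user=gu_id with the same gu_hidden_level condition would give the same result here.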