
Database replicas: replicate user.user_touched
Closed, Resolved, Public

Description

In order for the new separated-by-age anonymized user_properties view to work, the user.user_touched column needs to be populated.

(It is already forcibly set to NULL in the user views, so populating the underlying column will not disclose it.)
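To illustrate why populating the underlying column is safe, here is a minimal sketch using Python's built-in `sqlite3` in place of the real replica database. The table layout, the view name `user_public`, and the sample row are assumptions made up for the example, not the actual view definitions.

```python
import sqlite3

# In-memory stand-in for a replica; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (
        user_id      INTEGER PRIMARY KEY,
        user_name    TEXT,
        user_touched TEXT
    );
    INSERT INTO user VALUES (1, 'Example', '20151118000000');

    -- The public view replaces the column with a constant NULL,
    -- so whatever is stored underneath never reaches view readers.
    CREATE VIEW user_public AS
        SELECT user_id, user_name, NULL AS user_touched
        FROM user;
""")

underlying = conn.execute("SELECT user_touched FROM user").fetchall()
exposed = conn.execute("SELECT user_touched FROM user_public").fetchall()
print("underlying:", underlying)
print("exposed:   ", exposed)
```

The underlying table holds the real value while the view still returns NULL, which is the property the description relies on.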

Event Timeline

coren raised the priority of this task from to High.
coren updated the task description.
coren added projects: Patch-For-Review, Toolforge.
coren added subscribers: Springle, coren.

(This was assigned directly to Sean and likely fell between the cracks because of it)

Do you want me to populate that column on sanitarium from production (and remove any nullifying process, if there is one)?

@jcrespo: Yes, that's correct - the data in that column (and the views that use it) have been cleared by legal as part of T60196.

This requires some planning: the triggers have to be dropped and updated, and that has to be done while replication is stopped to avoid leaks. I've started on that at: https://gerrit.wikimedia.org/r/253930

The column then has to be reimported in a consistent way, which could be done with blocking dumps on a slave that can tolerate some lag (or by depooling one and syncing its binlog position).
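The two steps above (stop sanitizing the column, then backfill it) can be sketched roughly as follows. This is only a toy model using Python's `sqlite3`: the trigger name `user_sanitize`, the schema, and the `prod` values are hypothetical, and the real setup involves MariaDB triggers and replication control that this example does not reproduce.

```python
import sqlite3

# Hypothetical production values: user_id -> user_touched.
prod = {1: "20151101120000", 2: "20151102130000"}

replica = sqlite3.connect(":memory:")
replica.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_touched TEXT);

    -- Stand-in for the sanitizing trigger: blank the column on insert.
    CREATE TRIGGER user_sanitize AFTER INSERT ON user
    BEGIN
        UPDATE user SET user_touched = NULL WHERE user_id = NEW.user_id;
    END;
""")
replica.executemany("INSERT INTO user VALUES (?, ?)", list(prod.items()))
before = replica.execute(
    "SELECT user_touched FROM user ORDER BY user_id").fetchall()

# Step 1: remove the nullifying trigger. In the real setup this would
# happen while replication is stopped, so no write slips through between
# dropping the old trigger and installing the updated one.
replica.execute("DROP TRIGGER user_sanitize")

# Step 2: backfill the column from the production values in one
# transaction, so readers never see a half-imported column.
with replica:
    replica.executemany(
        "UPDATE user SET user_touched = ? WHERE user_id = ?",
        [(touched, uid) for uid, touched in prod.items()],
    )

after = replica.execute(
    "SELECT user_touched FROM user ORDER BY user_id").fetchall()
print("before:", before)
print("after: ", after)
```

In this toy version the trigger only touches one column, so it is dropped outright; a trigger that also sanitizes other fields would instead need to be recreated without the `user_touched` part.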

The filters have been dropped (changed to allow that field). Now I have to backfill that column.

The import is running now. It will take a while, as we have 5GB of user data per server.

enwiki has been backfilled; it took 5.81GB of data transfer and 1:28:42 to complete.
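For a rough sense of the pace, the enwiki figures can be turned into an average transfer rate. The sketch below assumes the duration is in h:mm:ss and that 1 GB means 1000 MB; neither is stated explicitly above.

```python
def parse_hms(text):
    """Convert an 'h:mm:ss' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds

elapsed_s = parse_hms("1:28:42")   # enwiki backfill duration
transferred_gb = 5.81              # enwiki data transferred

# Average rate in MB/s, assuming 1 GB = 1000 MB.
rate_mb_s = transferred_gb * 1000 / elapsed_s
print(f"{elapsed_s} s elapsed, about {rate_mb_s:.2f} MB/s on average")
```

That works out to roughly 1.1 MB/s for enwiki; other wikis may well behave differently, so this is not a prediction for the remaining shards.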

Will backfill the rest of the wikis later.

All shards have been backfilled with this column.