
Update tools maintained by Community-Tech to use new actor storage
Closed, Resolved · Public

Description

The actor table is live on Toolforge replicas, and it appears the foreign keys (e.g. rev_actor) are populated.

The effort involved here is to update our queries to join on actor when we need to filter by username, or use join decomposition, so that we are no longer referencing rev_user_text, rev_user, log_user_text, log_user, etc.
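As a rough illustration of the two approaches named above, the sketch below runs both against an in-memory SQLite database with a heavily simplified schema. The column names `rev_actor`, `actor_id` and `actor_name` follow the actor migration; the helper names, the sample rows and the use of SQLite in place of the replicas are assumptions made purely for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the revision and actor tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_name TEXT UNIQUE);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_actor INTEGER);
        INSERT INTO actor VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO revision VALUES (10, 1), (11, 2), (12, 1);
        """
    )
    return conn


def rev_ids_via_join(conn, username):
    """Filter by username with a JOIN on actor (instead of rev_user_text)."""
    rows = conn.execute(
        "SELECT rev_id FROM revision "
        "JOIN actor ON actor_id = rev_actor "
        "WHERE actor_name = ? ORDER BY rev_id",
        (username,),
    )
    return [r[0] for r in rows]


def rev_ids_via_decomposition(conn, username):
    """Join decomposition: resolve the actor ID first, then query revision alone."""
    row = conn.execute(
        "SELECT actor_id FROM actor WHERE actor_name = ?", (username,)
    ).fetchone()
    if row is None:
        return []  # unknown username, so there is nothing to look up
    rows = conn.execute(
        "SELECT rev_id FROM revision WHERE rev_actor = ? ORDER BY rev_id",
        (row[0],),
    )
    return [r[0] for r in rows]
```

Both helpers should return the same revision IDs for a given username; which one performs better on the replicas would need to be checked per query.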

The full list of database changes is at T167246.

Affected tools, roughly ordered by difficulty, highest to lowest:

Event Timeline

Restricted Application added a subscriber: Aklapper. · Mar 26 2019, 8:34 PM

@MusikAnimal Do we know when they are going to deprecate the existing schema?

Not a clue, but it surely is months away.

MusikAnimal updated the task description. · Mar 26 2019, 8:43 PM
MusikAnimal triaged this task as High priority. · May 17 2019, 10:23 PM
MusikAnimal updated the task description.
MusikAnimal updated the task description. · May 18 2019, 9:36 AM
Bstorm added a subscriber: Bstorm. · May 20 2019, 1:50 PM

Working on the change to views in cloud/toolforge here: T223406

I've so far announced the shift on those views for Monday the 27th, but I have some concerns that it may need pushing back a bit further. The deprecation of the schema is blocked by changing the views first, so that ticket is the timetable for the replicas.

MaxSem updated the task description. · May 20 2019, 7:59 PM
MaxSem added a subscriber: MaxSem.
MusikAnimal updated the task description. · May 20 2019, 11:08 PM

Database reports PR here: https://github.com/wikimedia/database-reports/pull/26

Couldn't test it though because it kept timing out for me even before the change.

MaxSem updated the task description. · May 21 2019, 10:24 PM

For reference, I use ack 'rev_user|ar_user|ipb_by|img_user|oi_user|fa_user|rc_user|log_user' to search for the parts that need updating.
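For anyone without ack installed, the same search can be approximated in a few lines of Python. This is only a sketch: the function name `find_old_columns` is hypothetical, and it scans a string of source text rather than walking a repository.

```python
import re

# Same alternation as the ack pattern quoted above.
OLD_COLUMN_PATTERN = re.compile(
    r"rev_user|ar_user|ipb_by|img_user|oi_user|fa_user|rc_user|log_user"
)


def find_old_columns(source_text):
    """Return (line_number, line) pairs for lines mentioning an old column prefix."""
    return [
        (number, line)
        for number, line in enumerate(source_text.splitlines(), start=1)
        if OLD_COLUMN_PATTERN.search(line)
    ]
```

Because the pattern matches prefixes, `ipb_by` will also flag lines that already use `ipb_by_actor`, so each hit still needs a manual look.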

MusikAnimal updated the task description. · May 22 2019, 12:03 AM
MusikAnimal updated the task description. · May 24 2019, 2:22 AM
MusikAnimal updated the task description. · May 24 2019, 7:24 PM
MusikAnimal updated the task description. · May 24 2019, 7:34 PM
MusikAnimal updated the task description.
MusikAnimal updated the task description. · May 24 2019, 8:42 PM
MusikAnimal updated the task description. · May 30 2019, 5:11 PM

I searched our Python database reports repo, and only found this line that's using one of the old columns https://github.com/wikimedia/database-reports/blob/53387e78456d025425f3fdf39090734e110622ee/reports.py#L121 and that report is now handled by the Ruby bot. So I think we're done?

Lies! The "Active editors with the longest-established accounts" report is still referencing rc_user_text. @MaxSem did you still want to take that on? I'd be happy to if not. It's all a single query, so no Python skills required.

MaxSem added a comment. · Jun 5 2019, 2:19 AM

Ah, you mean that you noticed https://github.com/wikimedia/database-reports/pull/26 isn't merged? :P

The database-reports PR has finally been merged and deployed.

MaxSem updated the task description. · Jun 10 2019, 6:41 PM
dom_walden added a subscriber: dom_walden.

Had their own tickets. Did not go into QA.

Checked that a couple of revisions the tool highlighted as potential copyvio were attributed to the correct user.

I compared the number of edits and editors as reported by the tool with the history for a couple of pages. They matched.

For one user, checked history of a couple of pages to see that the user did actually create them.

https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Active_editors_with_the_longest-established_accounts has not been updated since 28th May. I thought it should have updated by now. I will keep an eye on it.

That'd be T225774 ([Timebox: 8 hours] Database reports tool experiencing SQL timeouts), which appears to be caused by the schema changes, not our code (I was hitting it locally before making any code changes).

Niharika closed this task as Resolved. · Jun 19 2019, 7:42 PM
Niharika claimed this task.
MaxBioHazard reopened this task as Open. · Jun 19 2019, 7:46 PM
MaxBioHazard added a subscriber: MaxBioHazard.

Global user contribs is still broken because of this refactoring; see T224930.

MusikAnimal closed this task as Resolved. · Jun 19 2019, 8:19 PM

@MaxBioHazard That is not one of the tools we maintain. This task is about tools built and maintained by Community Tech.

Aklapper renamed this task from "Update tools to use new actor storage" to "Update tools maintained by Community-Tech to use new actor storage". · Jun 19 2019, 9:03 PM
Aklapper added a project: Tools.