Update tools maintained by Community-Tech to use new actor storage
Closed, Resolved (Public)

Description

The actor table is live on Toolforge replicas, and it appears the foreign keys (e.g. rev_actor) are populated.

The effort involved here is to update our queries to join on actor when we need to filter by username, or use join decomposition, so that we are no longer referencing rev_user_text, rev_user, log_user_text, log_user, etc.
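The two approaches mentioned above (joining on actor, or join decomposition) can be sketched roughly as follows. This is only an illustration, not code from any of the affected tools: it uses an in-memory sqlite3 database as a stand-in for the MariaDB replicas, the tables are trimmed to the few columns needed, and the helper names (make_demo_db, rev_ids_by_join, rev_ids_by_decomposition) and sample rows are made up for the example.

```python
import sqlite3


def make_demo_db():
    """Build a tiny stand-in database (assumption: sqlite3 instead of the real replicas)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE actor (
            actor_id INTEGER PRIMARY KEY,
            actor_user INTEGER,      -- NULL for anonymous (IP) actors
            actor_name TEXT
        );
        CREATE TABLE revision (
            rev_id INTEGER PRIMARY KEY,
            rev_page INTEGER,
            rev_actor INTEGER        -- replaces rev_user / rev_user_text
        );
        INSERT INTO actor VALUES (1, 100, 'Alice'), (2, NULL, '192.0.2.1');
        INSERT INTO revision VALUES (10, 5, 1), (11, 5, 2), (12, 6, 1);
    """)
    return conn


def rev_ids_by_join(conn, username):
    """Filter by username with a single query that joins revision to actor."""
    rows = conn.execute(
        "SELECT rev_id FROM revision "
        "JOIN actor ON rev_actor = actor_id "
        "WHERE actor_name = ? ORDER BY rev_id",
        (username,),
    )
    return [r[0] for r in rows]


def rev_ids_by_decomposition(conn, username):
    """Join decomposition: look up the actor ID first, then query revision alone."""
    row = conn.execute(
        "SELECT actor_id FROM actor WHERE actor_name = ?", (username,)
    ).fetchone()
    if row is None:
        return []  # no such actor, so no revisions
    rows = conn.execute(
        "SELECT rev_id FROM revision WHERE rev_actor = ? ORDER BY rev_id",
        (row[0],),
    )
    return [r[0] for r in rows]
```

Both helpers return the same revision IDs for a given username. Which form performs better on the replicas would need to be checked per query.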

The full list of database changes is at T167246.

Affected tools, roughly ordered by difficulty, highest to lowest:

Event Timeline

@MusikAnimal Do we know when they are going to deprecate the existing schema?

Not a clue, but it surely is months away.

MusikAnimal updated the task description.

Working on the change to views in cloud/toolforge here: T223406

I've so far announced the shift on those views for Monday the 27th, but I have some concerns that it may need pushing back a bit further. The deprecation of the schema is blocked by changing the views first, so that ticket is the timetable for the replicas.

MaxSem added a subscriber: MaxSem.

Database reports PR here: https://github.com/wikimedia/database-reports/pull/26

Couldn't test it though because it kept timing out for me even before the change.

For reference, I use ack 'rev_user|ar_user|ipb_by|img_user|oi_user|fa_user|rc_user|log_user' to search for parts that need updating.
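For anyone without ack installed, roughly the same search can be sketched in Python. The regex is the one from the ack command above; the function names (find_legacy_columns, scan_tree) and the default list of file suffixes are assumptions made for this example, not part of any existing tool.

```python
import re
from pathlib import Path

# Same alternation as the ack pattern quoted above.
LEGACY = re.compile(
    r"rev_user|ar_user|ipb_by|img_user|oi_user|fa_user|rc_user|log_user"
)


def find_legacy_columns(text):
    """Return (line number, stripped line) for each line mentioning an old column."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if LEGACY.search(line):
            hits.append((lineno, line.strip()))
    return hits


def scan_tree(root, suffixes=(".py", ".php", ".rb", ".sql")):
    """Walk a checkout and collect hits per file (the suffix list is a guess)."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = find_legacy_columns(path.read_text(errors="ignore"))
            if hits:
                results[str(path)] = hits
    return results
```

Like the ack pattern, this is a plain substring match, so it also catches the _text variants (rev_user_text, rc_user_text, and so on) and may flag unrelated identifiers that happen to contain one of the names.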

I searched our Python database reports repo, and only found this line that's using one of the old columns https://github.com/wikimedia/database-reports/blob/53387e78456d025425f3fdf39090734e110622ee/reports.py#L121 and that report is now handled by the Ruby bot. So I think we're done?

Lies! The active editors with longest-established accounts report is still referencing rc_user_text. @MaxSem did you still want to take that on? I'd be happy to if not. It's all a single query so no Python skills required.

The Database Requests PR has finally been merged and deployed.

dom_walden added a subscriber: dom_walden.

Had their own tickets. Did not go into QA.

Checked that a couple of revisions the tool highlighted as potential copyvio were attributed to the correct user.

I compared the number of edits and editors as reported by the tool with the history for a couple of pages. They matched.

For one user, checked history of a couple of pages to see that the user did actually create them.

https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Active_editors_with_the_longest-established_accounts has not been updated since 28th May. I thought it should have updated by now. I will keep an eye on it.

That'd be T225774: [Timebox: 8 hours] Database reports tool experiencing SQL timeouts, which appear to be caused by schema changes, not our code (I had the timeouts locally before coding).

Niharika claimed this task.
MBH added a subscriber: MBH.

Global user contribs is still broken because of this refactoring; see T224930.

@MaxBioHazard That is not one of the tools we maintain. This task is about tools built and maintained by Community Tech.

Aklapper renamed this task from "Update tools to use new actor storage" to "Update tools maintained by Community-Tech to use new actor storage". Jun 19 2019, 9:03 PM
Aklapper added a project: Tools.