Use img_actor when querying user id
Closed, Resolved · Public

Description

No update of Tool-wikiloves − reported by two users independently.

This was traced back to an issue with the DB query itself: some records have no user name for uploader.

This is due to T167246: Refactor "user" & "user_text" fields into "actor" reference table − user_id has been deprecated, and actor_id should be used.
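A minimal sketch of what the switch to the actor reference looks like in a query, run against an in-memory SQLite stand-in rather than the real replicas. The column names (`img_actor`, `actor_id`, `actor_name`, `actor_user`) follow the MediaWiki schema; the miniature tables, sample rows, and helper names are assumptions made for illustration, and this is not the query actually deployed in the tool.

```python
import sqlite3

# Hypothetical miniature of the MediaWiki tables; only the columns needed here.
SCHEMA = """
CREATE TABLE image (
    img_name TEXT,
    img_timestamp TEXT,
    img_user INTEGER,       -- deprecated
    img_user_text TEXT,     -- deprecated
    img_actor INTEGER
);
CREATE TABLE actor (
    actor_id INTEGER PRIMARY KEY,
    actor_user INTEGER,
    actor_name TEXT
);
"""

# Resolve the uploader through img_actor instead of img_user / img_user_text.
ACTOR_QUERY = """
SELECT img_name, actor_name
FROM image
JOIN actor ON actor_id = img_actor
ORDER BY img_name
"""


def uploaders(conn):
    """Return (file name, uploader name) pairs resolved via the actor table."""
    return conn.execute(ACTOR_QUERY).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO actor VALUES (1, 42, 'ExampleUser')")
    # Assumed older row: legacy columns still populated.
    conn.execute(
        "INSERT INTO image VALUES ('Old.jpg', '20190501000000', 42, 'ExampleUser', 1)"
    )
    # Assumed newer row: legacy columns empty, only img_actor set.
    conn.execute(
        "INSERT INTO image VALUES ('New.jpg', '20190530000000', NULL, NULL, 1)"
    )
    return uploaders(conn)
```

In this sketch both rows come back with an uploader name, including the one whose legacy columns are empty.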

Event Timeline

JeanFred created this task. Jun 3 2019, 8:45 AM
Restricted Application added a subscriber: Aklapper. Jun 3 2019, 8:45 AM

Last successful update was 2019-05-29_15:00:34:

2019-05-29_15:00:34 Starting database update.
Fetching configuration...
Found 26 events in the configuration.
Updating only 1 event(s): earth2019.
Fetching data for earth2019...
Saved earth2019: 363sec, 24 countries, 54610 uploads
2019-05-29_15:06:40 Done with the update!

Since then:

2019-05-29_15:15:19 Starting database update.
Fetching configuration...
Found 26 events in the configuration.
Updating only 1 event(s): earth2019.
Fetching data for earth2019...
Traceback (most recent call last):
  File "database.py", line 229, in <module>
    db = update_event_data(event_name, event_configuration, db)
  File "database.py", line 192, in update_event_data
    event_data = getData(event_slug, event_configuration)
  File "database.py", line 109, in getData
    country_data = get_country_data(cat, start_time, end_time)
  File "database.py", line 121, in get_country_data
    dbData = get_data_for_category(category)
  File "database.py", line 181, in get_data_for_category
    for timestamp, usage, user, user_reg in query_data)
  File "database.py", line 181, in <genexpr>
    for timestamp, usage, user, user_reg in query_data)
AttributeError: 'NoneType' object has no attribute 'decode'
2019-05-29_15:17:33 Done with the update!

Interestingly, running python database.py africa2019 works just fine, but running it on earth2019 does not.

@JeanFred: Updates of what? This task completely lacks any context :) Feel free to add a project tag.

This query illustrates the issue: https://quarry.wmflabs.org/query/36585

The query returns files with no user name 🤔
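To make the failure mode concrete, here is a small sketch of how a NULL user name can surface as the AttributeError in the traceback: the database driver hands back None for NULL, and calling .decode on it fails. The helper names and sample rows are hypothetical, and the row layout (timestamp, usage, user, user_reg) is only assumed from the traceback; this is not the code in database.py.

```python
def decode_or_none(value, encoding="utf-8"):
    """Decode bytes from the database driver, passing NULL (None) through."""
    if value is None:
        return None
    return value.decode(encoding)


def rows_missing_user(query_data):
    """Return the rows whose user field (third column) is NULL."""
    return [row for row in query_data if row[2] is None]


# Hypothetical rows shaped like (timestamp, usage, user, user_reg).
sample = [
    (b"20190529150000", 1, b"ExampleUser", b"20100101000000"),
    (b"20190530120000", 0, None, None),
]

for timestamp, usage, user, user_reg in sample:
    # A bare user.decode("utf-8") would raise AttributeError on the second row.
    print(decode_or_none(timestamp), usage, decode_or_none(user))
```

A guard like this only hides the symptom; it helps locate the affected rows, but the missing names still have to come from somewhere else.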

JeanFred updated the task description. (Show Details)

> @JeanFred: Updates of what? This task completely lacks any context :) Feel free to add a project tag.

Done, thanks for the flag :)

JeanFred renamed this task from No update since May 29 to wikiloves database query returns images with no user. Jun 3 2019, 9:25 AM
JeanFred claimed this task.
JeanFred triaged this task as High priority.
JeanFred updated the task description. (Show Details)
JeanFred renamed this task from wikiloves database query returns images with no user to Some images have no `img_user` in the `image` table in Commons Wiki replicas. Jun 3 2019, 11:53 AM
JeanFred lowered the priority of this task from High to Normal.
JeanFred added a project: DBA.

I played with the query and stripped it down until I isolated the issue, and the question boils down to:

Why would img_user (and img_user_text for that matter) not be populated in the image table?

(See query https://quarry.wmflabs.org/query/36585 )

(Tentatively tagging DBA here)

jcrespo edited projects, added Data-Services; removed DBA. Jun 3 2019, 12:02 PM
jcrespo added a subscriber: jcrespo.

@JeanFred Please read the several announcements on the cloud list of production database changes over the last 3 months. Specifically, I don't believe such fields are canonical anymore, and actor id should be used instead. See T167246. DBAs are aware of this but they are not involved in the day-to-day changes; please ask on that ticket or on the cloud-l list.

BTW, I believe there are several *_compat tables that serve for transitioning with the same old fields, but they couldn't be set up transparently as they created performance issues (they should only be used as a stopgap).

> @JeanFred Please read the several announcements on the cloud list of production database changes over the last 3 months. Specifically, I don't believe such fields are canonical anymore, and actor id should be used instead. See T167246. DBAs are aware of this but they are not involved in the day-to-day changes; please ask on that ticket or on the cloud-l list.

Thanks @jcrespo − I did do a cursory search on img_user through my email but missed the relevant announcement, thanks for pointing it out (also, the manual still mentions these fields without a deprecation warning).

Will dig into making the relevant changes − thanks again!

JeanFred renamed this task from Some images have no `img_user` in the `image` table in Commons Wiki replicas to Use img_actor when querying user id. Jun 3 2019, 1:25 PM
JeanFred updated the task description. (Show Details)
JeanFred removed a subscriber: jcrespo.

I think this is the most important announcement (although not the only one): https://lists.wikimedia.org/pipermail/cloud/2019-May/000653.html

Please note that https://www.mediawiki.org/wiki/Manual:Image_table#img_user is the Production/mediawiki head manual, not the cloud wikireplica one. The fields still exist but, I think, recently stopped being updated. Please help us fix the manual (you can edit it too! 0:-)) or file relevant bugs about documentation (which are bugs, too) below T167246. I will put a comment there.

Mentioned in SAL (#wikimedia-cloud) [2019-06-05T15:03:30Z] <wm-bot> <jeanfred> Deploy latest from Git master: 5317bb7, aabd6ec (T224862)

JeanFred closed this task as Resolved. Jun 5 2019, 3:29 PM

2019-06-05_15:15:08 Starting database update.
Fetching configuration...
Found 26 events in the configuration.
Updating only 1 event(s): earth2019.
Fetching data for earth2019...
Saved earth2019: 27sec, 36 countries, 67572 uploads
2019-06-05_15:15:38 Done with the update!

Before:

2019 	24 	54610 	2067 (3%) 	5236 	4468 (85%)

After:

2019 	36 	67572 	2538 (3%) 	6587 	5625 (85%)

> I think this is the most important announcement (although not the only one): https://lists.wikimedia.org/pipermail/cloud/2019-May/000653.html

o_ô I never got that one − there must be something wrong in my email forwarding. Thanks a lot for pointing it out − and for your help in general with this :)

> Please help us fix the manual (you can edit it too! 0:-)) or file relevant bugs about documentation (which are bugs, too) below T167246.

Fair points ;-)

For future reference, T225007 was filed specifically to update the schema documentation. Thanks to you for the report and subsequent fix!