Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26, 2018)
Closed, Resolved, Public

Description

Cron <www-data@mwmaint1002> /usr/local/bin/foreachwikiindblist /srv/mediawiki/dblists/pageassessments.dblist extensions/PageAssessments/maintenance/purgeUnusedProjects.php > /dev/null
[Fri Oct 26 20:42:12 2018] [hphp] [116138:7fcb2bd703c0:0:000001] [] SlowTimer [10555ms] at runtime/ext_mysql: slow query: SELECT /* Wikimedia\Rdbms\Database::select www-data@mwmain... */ DISTINCT( pa_project_id )  FROM `page_assessments

Please let us know what you think. Feel free to remove the "WMF-NDA" tag if you think all information on this task is harmless.

Event Timeline

jijiki triaged this task as Medium priority. Oct 29 2018, 3:18 PM
jijiki created this task.

(meta note)

Just a heads up, this task is currently public and not restricted to WMF-NDA

(top left)

Open, Normal Public

If I edit the task and set the visibility to WMF-NDA, then it will be visible to WMF-NDA only. Projects are confusing, partly because they can be used as objects across multiple functions: CC, projects, ACL objects, etc.

I am going to set this task as visible to WMF-NDA only to demonstrate :)

(now...top left)

Open, Normal WMF-NDA

chasemp changed the visibility from "Public (No Login Required)" to "WMF-NDA (Project)". Oct 29 2018, 4:29 PM

I checked the query

SELECT /* Wikimedia\Rdbms\Database::select www-data@mwmain... */  DISTINCT( pa_project_id )  FROM `page_assessments

Based on the file /srv/mediawiki/dblists/pageassessments.dblist, the script runs against the enwiki, enwikivoyage, and testwiki databases. Here are the results. I don't know which replica the script would actually query, so I assumed vslow.

host         db            section  results
db1106       enwiki        s1       3009 rows in set (3.30 sec)
db1113:3315  enwikivoyage  s5       11 rows in set (0.07 sec)
db1123       testwiki      s3       3 rows in set (0.04 sec)

enwiki wasn't fast, but 3 seconds is way better than 10.

Banyek edited projects, added DBA; removed WMF-NDA.
Banyek changed the visibility from "WMF-NDA (Project)" to "Public (No Login Required)".

I'd like to add the owner of the script as a subscriber, but I don't know how to find out who it is.

git blame can help
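
As a rough illustration of the git blame suggestion, here is a minimal Python sketch that tallies how many lines of a file each author last touched, which is one way to guess at a script's owner. It relies on `git blame --line-porcelain`, which emits an `author <name>` header for every line. The helper names `count_blame_authors` and `blame_authors` are made up for this example, and the most frequent author is only a hint, not a guarantee of ownership.

```python
import subprocess
from collections import Counter


def count_blame_authors(porcelain: str) -> Counter:
    """Count blamed lines per author in `git blame --line-porcelain` output."""
    authors = Counter()
    for line in porcelain.splitlines():
        # Header lines look like "author Jane Doe". File content lines start
        # with a tab, and "author-mail"/"author-time" lack the trailing space.
        if line.startswith("author "):
            authors[line[len("author "):]] += 1
    return authors


def blame_authors(path: str, repo: str = ".") -> Counter:
    """Run git blame on `path` inside `repo` and tally authors (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo, "blame", "--line-porcelain", "--", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return count_blame_authors(result.stdout)
```

Run from a checkout of the repository that contains the script, something like `blame_authors("maintenance/purgeUnusedProjects.php").most_common(3)` would list the three authors with the most blamed lines. The path here is an assumption based on the cron line and depends on where the checkout is rooted.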

@kaldari if you need any help for further debugging this, you can ask me

@Banyek - Thanks for the ping. I don't think anything is unexpected here. This particular clean-up routine is expensive, which is why it was put in a cron job. Is there anything I can add to the script to indicate that? If we need to get it running faster than 3 seconds, let me know.

I think we should adjust the slow timer so that it does not alert when the script runs for up to n seconds.
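
To make the threshold idea concrete, here is a minimal Python sketch of a timer that only reports when a block of work runs longer than a configurable number of seconds. It is not HHVM's SlowTimer or its configuration; `slow_timer`, `threshold_s`, `report`, and `clock` are hypothetical names chosen for illustration, and the message format only loosely imitates the log line quoted in the description.

```python
import time
from contextlib import contextmanager


@contextmanager
def slow_timer(label, threshold_s, report=print, clock=time.monotonic):
    """Report only if the wrapped block takes longer than threshold_s seconds."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        # Stay silent for runs at or under the threshold, so a job that is
        # known to be expensive does not trigger an alert on every run.
        if elapsed > threshold_s:
            report(f"SlowTimer [{elapsed * 1000:.0f}ms] at {label}")
```

With a threshold of, say, 15 seconds, a maintenance query that takes about 10 seconds would no longer be reported, while a genuine regression past that limit still would be.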

This is interesting. T219935 seems to indicate that the query is now poorly formed instead of just expensive. I'm taking a look to see if I can see an obvious regression.

Aklapper renamed this task from Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26) to Issues with purgeUnusedProjects.php cron job on mwmaint1002 (Fri Oct 26, 2018). Sep 18 2020, 1:41 PM
Aklapper removed a project: User-Banyek.
Aklapper removed a subscriber: Banyek.

I am going to close this as fixed, please reopen if needed.