[Investigation] WMDE cronjobs on stats1007
Closed, Resolved | Public | 1 Estimated Story Points

Description

Context:
There are cronjobs on stats1007 written some time ago (likely by Adam S). We need to understand what they are and their context in order to decide how to move forward.

Notes:

A/C

  • a summary of the cronjobs and what they're doing, if it's clear

Event Timeline

From my side, I believe everything that is in a cron there should be in https://github.com/wikimedia/analytics-wmde-scripts
To see specifically what is happening within the crons, see https://github.com/wikimedia/analytics-wmde-scripts/tree/master/cron

The setup and crons are all defined in puppet https://github.com/wikimedia/operations-puppet/blob/da6c3eafd065f5268f71d8e683e58a216da9e936/modules/statistics/manifests/wmde/graphite.pp#L89-L149

Most individual scripts that run should be named in a way that is easy to understand.
Many also link to why they are needed,
e.g. https://github.com/wikimedia/analytics-wmde-scripts/blob/master/src/betafeatures/counts.php links to https://grafana.wikimedia.org/dashboard/db/betafeatures

Michael set the point value for this task to 1. (Oct 19 2023, 1:56 PM)
Michael added a subscriber: hoo.

Taking it over as agreed.

Change 969107 had a related patch set uploaded (by Michael Große; author: Michael Große):

[analytics/wmde/scripts@master] Fix link to Grafana dashboard

https://gerrit.wikimedia.org/r/969107

Change 969107 merged by jenkins-bot:

[analytics/wmde/scripts@master] Fix link to Grafana dashboard

https://gerrit.wikimedia.org/r/969107

Change 969147 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Michael Große):

[analytics/wmde/scripts@production] Fix link to Grafana dashboard

https://gerrit.wikimedia.org/r/969147

Change 969147 merged by jenkins-bot:

[analytics/wmde/scripts@production] Fix link to Grafana dashboard

https://gerrit.wikimedia.org/r/969147

Change 970415 had a related patch set uploaded (by Michael Große; author: Michael Große):

[analytics/wmde/scripts@master] Fix Grafana dashboard links to new format

https://gerrit.wikimedia.org/r/970415

Change 970416 had a related patch set uploaded (by Michael Große; author: Michael Große):

[analytics/wmde/scripts@master] Fix Grafana links to a different dashboard

https://gerrit.wikimedia.org/r/970416

Change 970417 had a related patch set uploaded (by Michael Große; author: Michael Große):

[analytics/wmde/scripts@master] Add missing links to Grafana dashboards using the data

https://gerrit.wikimedia.org/r/970417

Change 970723 had a related patch set uploaded (by Michael Große; author: Michael Große):

[analytics/wmde/scripts@master] Fixed/Added Grafana links for technical wishes scripts

https://gerrit.wikimedia.org/r/970723

Change 970415 merged by jenkins-bot:

[analytics/wmde/scripts@master] Fix Grafana dashboard links to new format

https://gerrit.wikimedia.org/r/970415

Change 970365 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Michael Große):

[analytics/wmde/scripts@production] Fix Grafana dashboard links to new format

https://gerrit.wikimedia.org/r/970365

Change 970365 merged by jenkins-bot:

[analytics/wmde/scripts@production] Fix Grafana dashboard links to new format

https://gerrit.wikimedia.org/r/970365

Change 970416 merged by jenkins-bot:

[analytics/wmde/scripts@master] Fix Grafana links to a different dashboard

https://gerrit.wikimedia.org/r/970416

Change 970746 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Michael Große):

[analytics/wmde/scripts@production] Fix Grafana links to a different dashboard

https://gerrit.wikimedia.org/r/970746

Change 970746 merged by jenkins-bot:

[analytics/wmde/scripts@production] Fix Grafana links to a different dashboard

https://gerrit.wikimedia.org/r/970746

Change 970417 merged by jenkins-bot:

[analytics/wmde/scripts@master] Add missing links to Grafana dashboards using the data

https://gerrit.wikimedia.org/r/970417

Change 970723 merged by jenkins-bot:

[analytics/wmde/scripts@master] Fix/Add Grafana links for technical wishes scripts

https://gerrit.wikimedia.org/r/970723

Change 970747 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Michael Große):

[analytics/wmde/scripts@production] Add missing links to Grafana dashboards using the data

https://gerrit.wikimedia.org/r/970747

Change 970747 merged by jenkins-bot:

[analytics/wmde/scripts@production] Add missing links to Grafana dashboards using the data

https://gerrit.wikimedia.org/r/970747

Change 970748 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Michael Große):

[analytics/wmde/scripts@production] Fix/Add Grafana links for technical wishes scripts

https://gerrit.wikimedia.org/r/970748

Change 970748 merged by jenkins-bot:

[analytics/wmde/scripts@production] Fix/Add Grafana links for technical wishes scripts

https://gerrit.wikimedia.org/r/970748

Peer review mainly for the "analytics scripts" table: https://docs.google.com/spreadsheets/d/1w2f_ndQa6Lo2BBfPJ88sJLSg2RJeTQKFNOPd0zjiB4I/edit#gid=1625715087

In particular, I did not find Grafana boards for all the scripts. Maybe someone remembers them better? But I don't think we should spend too much time on finding them; it could be that the data from some of those scripts is just not displayed anywhere (anymore).

Change 970769 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[analytics/wmde/scripts@master] Fix list of current beta features

https://gerrit.wikimedia.org/r/970769

I left comments on the “???” cells – I was able to find boards for all but one of the scripts (dumpScanProcessing.php is, as far as I can tell, reading from a dead data source).

Thanks!

Based on the metrics you found, I think dumpScanProcessing.php may have provided some of the data for https://grafana.wikimedia.org/d/000000182/wikidata-datamodel-references?orgId=1&refresh=30m&from=now-6y&to=now

I'm not fully sure which data there came from that script and which from the toolkit-analyzer, but from a cursory look, both seem to contribute.

I'm moving this to Product Verification for @Manuel and @AndrewTavis_WMDE to decide whether this part about the scripts and toolkit analyzer answers the relevant questions or whether more research is desired. A deeper look into that strange WDCM repository clone will happen in T350252.

If I understand correctly, the toolkit analyzer writes these metrics.json files and then dumpScanProcessing.php basically just copies the contents over into Grafana. Not sure why the toolkit analyzer doesn’t just write to Grafana directly, but maybe it was easier to set up this way.
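
To illustrate the copy step described above, here is a minimal sketch in Python (not the PHP the actual script uses). It assumes, without having checked, that a metrics.json file is a nested object with numeric leaves, and it assumes the values end up in Graphite's plaintext line format (`path value timestamp`) behind the Grafana boards. The function names, the metric prefix, and the sample data are all hypothetical and not taken from dumpScanProcessing.php.

```python
import json
from typing import Any, Iterator, List, Tuple


def flatten(prefix: str, node: Any) -> Iterator[Tuple[str, Any]]:
    """Yield (dotted.path, value) for every numeric leaf of a nested dict."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(f"{prefix}.{key}", child)
    elif isinstance(node, (int, float)) and not isinstance(node, bool):
        yield prefix, node
    # Non-numeric leaves (strings, lists, null) are skipped in this sketch.


def to_graphite_lines(metrics_json: str, prefix: str, timestamp: int) -> List[str]:
    """Turn a metrics JSON document into Graphite plaintext lines.

    The prefix and the JSON layout are assumptions made for illustration.
    """
    metrics = json.loads(metrics_json)
    return [f"{path} {value} {timestamp}" for path, value in flatten(prefix, metrics)]


if __name__ == "__main__":
    # Hypothetical sample standing in for a toolkit-analyzer metrics.json file.
    sample = '{"references": {"total": 10, "bySource": {"wikipedia": 4}}}'
    for line in to_graphite_lines(sample, "daily.hypothetical.datamodel", 1700000000):
        print(line)
```

The sketch only builds the lines; actually sending them to a metrics backend is left out on purpose, since the real transport and host are not described in this task.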

Change 970769 merged by jenkins-bot:

[analytics/wmde/scripts@master] Fix list of current beta features

https://gerrit.wikimedia.org/r/970769

Change 970753 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: WMDE-Fisch):

[analytics/wmde/scripts@production] Fix list of current beta features

https://gerrit.wikimedia.org/r/970753

Change 970753 merged by jenkins-bot:

[analytics/wmde/scripts@production] Fix list of current beta features

https://gerrit.wikimedia.org/r/970753

Change 970816 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[analytics/wmde/scripts@master] Add or update some more Grafana links

https://gerrit.wikimedia.org/r/970816

Change 970816 merged by jenkins-bot:

[analytics/wmde/scripts@master] Add or update some more Grafana links

https://gerrit.wikimedia.org/r/970816

Change 970763 had a related patch set uploaded (by Michael Große; author: Lucas Werkmeister (WMDE)):

[analytics/wmde/scripts@production] Add or update some more Grafana links

https://gerrit.wikimedia.org/r/970763

Change 970763 merged by jenkins-bot:

[analytics/wmde/scripts@production] Add or update some more Grafana links

https://gerrit.wikimedia.org/r/970763

Manuel claimed this task.

Hi Michael, thank you for the great documentation! Also, thanks to everyone supporting!

I'll resolve this task, and create a meeting for us to discuss what it means for the main task.

Change 973318 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[analytics/wmde/scripts@master] Remove deprecated tech wish scripts

https://gerrit.wikimedia.org/r/973318

Change 973318 merged by jenkins-bot:

[analytics/wmde/scripts@master] Remove deprecated tech wish scripts

https://gerrit.wikimedia.org/r/973318

Change 974243 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: WMDE-Fisch):

[analytics/wmde/scripts@production] Remove deprecated tech wish scripts

https://gerrit.wikimedia.org/r/974243

Change 974243 merged by jenkins-bot:

[analytics/wmde/scripts@production] Remove deprecated tech wish scripts

https://gerrit.wikimedia.org/r/974243