Recurring queries
Open, NormalPublic

Description

For several queries, users are interested in current results rather than the results published when the query was last run.
Example: new users registered in the past week.
http://quarry.wmflabs.org/query/3933

Currently it looks like only the owner can re-run a query. It would be great if queries could be re-run when the URL is accessed, or by any user.

Arjunaraoc updated the task description.
Arjunaraoc raised the priority of this task from to Needs Triage.
Arjunaraoc added a project: Quarry.
Arjunaraoc added a subscriber: Arjunaraoc.
Restricted Application added a subscriber: Aklapper. Jun 9 2015, 1:56 PM
Capt_Swing renamed this task from Dynamic results from quarry to Recurring queries. Jul 17 2015, 7:06 PM
Capt_Swing triaged this task as Normal priority.
Capt_Swing set Security to None.
Capt_Swing added a subscriber: Capt_Swing.

Have recurring queries: to start, specify that a query can be run weekly or monthly, etc.

Capt_Swing moved this task from Backlog to Feature request on the Quarry board. Jul 17 2015, 7:07 PM

An extra table would be required: schedules. It would have the columns (id, query, schedule), where schedule is one of 'daily', 'weekly', or 'monthly'. celerybeat or something similar would then wake up every minute or so, check the schedules, find each query's latest run, and enqueue the queries that are due. This hopefully spreads the load enough to avoid crashes. We could also track the queue size and the number of executing queries, and not schedule anything while either is too high.
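A minimal sketch of the due-check described above, in plain Python rather than actual Quarry or Celery code. The function name `due_queries`, the `INTERVALS` mapping, the `last_runs` dict, and the `max_queue` threshold are all assumptions made for illustration; only the (id, query, schedule) row shape comes from the proposal.

```python
from datetime import datetime, timedelta

# Hypothetical intervals for the three schedule values named above.
INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),  # simplification: a fixed 30-day month
}


def due_queries(schedules, last_runs, now, queue_size=0, max_queue=10):
    """Return the query ids that should be enqueued on this tick.

    schedules: iterable of (id, query, schedule) rows
    last_runs: dict mapping query id -> datetime of its latest run
    """
    # Back off entirely while the queue is already too long.
    if queue_size >= max_queue:
        return []
    due = []
    for _id, query, schedule in schedules:
        last = last_runs.get(query)
        # A query that has never run, or whose interval has elapsed, is due.
        if last is None or now - last >= INTERVALS[schedule]:
            due.append(query)
    return due
```

A periodic task (celerybeat or similar) would call this once per tick and enqueue whatever it returns.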

An alternative is to use crontab-like mechanics, with the crontab entry randomly generated by the code when a schedule is selected. This distributes runs randomly, and also gives people an accurate estimate of when the results will be updated.
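One way the randomly generated crontab entry could look, as a rough sketch. The function name `random_cron` is hypothetical, and limiting the day of month to 1-28 is an assumption so that the entry fires in every month.

```python
import random


def random_cron(schedule, rng=random):
    """Return a five-field crontab expression at a random time for the schedule."""
    minute = rng.randrange(60)
    hour = rng.randrange(24)
    if schedule == "daily":
        return f"{minute} {hour} * * *"
    if schedule == "weekly":
        # Day of week: 0-6, where 0 is Sunday.
        return f"{minute} {hour} * * {rng.randrange(7)}"
    if schedule == "monthly":
        # Assumption: stay within days 1-28 so every month has the chosen day.
        return f"{minute} {hour} {rng.randrange(1, 29)} * *"
    raise ValueError(f"unknown schedule: {schedule!r}")
```

The expression would be generated once, when the schedule is selected, and stored alongside it, so the next update time can be shown to users.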

The tsreports approach to this is regenerating on access, but showing the cached version while the regeneration is in progress (plus an ETA for the new version).
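A sketch of that regenerate-on-access behaviour, not taken from tsreports itself. `CachedResult`, `on_access`, the `max_age` threshold, and estimating the ETA from the previous run's duration are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class CachedResult:
    rows: list
    generated_at: datetime
    refresh_started_at: Optional[datetime] = None
    # Assumption: the ETA is estimated from how long the previous run took.
    last_duration: timedelta = timedelta(minutes=1)


def on_access(entry: CachedResult, now: datetime, max_age: timedelta,
              enqueue: Callable[[], None]):
    """Return (cached rows, ETA of the new version or None)."""
    stale = now - entry.generated_at > max_age
    # Start at most one regeneration per stale entry.
    if stale and entry.refresh_started_at is None:
        enqueue()
        entry.refresh_started_at = now
    eta = None
    if entry.refresh_started_at is not None:
        eta = entry.refresh_started_at + entry.last_duration
    # The cached rows are always served; the caller displays the ETA if set.
    return entry.rows, eta
```

When the regeneration finishes, the worker would replace `rows`, update `generated_at`, and reset `refresh_started_at` to `None`.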

Ricordisamoa added a subscriber: Ricordisamoa.
-jem- added a subscriber: -jem-. Oct 5 2015, 10:25 AM
Framawiki added a comment. Edited Apr 11 2018, 5:19 PM

It would be good to have dedicated runners for scheduled queries, so that they do not interfere with end-user performance.

Framawiki added a comment. Edited Oct 25 2018, 7:12 PM

Can be incorporated into the data that will be shown by T206482: Show query code revisions and runs history

Wurgl added a subscriber: Wurgl. Oct 25 2018, 7:49 PM

I think there is no need to run such a query by cron or similar.

There will surely be forgotten queries that get executed over and over again while no one looks at the results, which wastes CPU time.

Instead, have the cron-like mechanism just mark the query as "shall be re-executed when accessed". Then execute the query when someone opens the page and that flag is set; otherwise behave as you do now.
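A small sketch of that flag-based idea, under one reading of the comment: the scheduler only sets a flag, and the query actually runs the next time someone opens the page. The names `Query`, `rerun_on_access`, `mark_due`, and `open_page` are hypothetical, and clearing the flag after a run is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    id: int
    results: list = field(default_factory=list)
    rerun_on_access: bool = False  # "shall be re-executed when accessed"


def mark_due(query):
    """Called by the cron-like mechanism: costs nothing until someone looks."""
    query.rerun_on_access = True


def open_page(query, run):
    """Called when someone opens the query page; `run` executes the SQL."""
    if query.rerun_on_access:
        query.results = run(query)
        query.rerun_on_access = False  # assumption: wait for the next mark
    # With the flag unset, behave as now and show the stored results.
    return query.results
```

A forgotten query is then marked repeatedly but never executed, so it costs no CPU time.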