
Pages shouldn't time out due to expensive database queries
Closed, Resolved, Public

Description

The Wikipedia Library program page (https://wikilink.wmflabs.org/programs/1) and a number of organisation pages currently don't load because they hit the 30-second nginx timeout. The database queries that load the data run on every page load and take too long to execute. The biggest offenders are the LinkEvents over time graph and the three breakout tables for top organisation/editor/project/page.

Instead of calculating the data for the tables and other figures on every page load, we should add new models to store this data, calculating it asynchronously. Page loads would then pull this data instead of waiting for lots of calculations.

Another option would have been caching: crawling each page in the background so that a cached version is always available. This isn't feasible here because we have date range filtering: it would be impractical to pre-generate a cached page for every combination of dates.
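To put a number on this, here is a small calculation of how many distinct date ranges a single page could be asked for. It assumes the filter works at day granularity, which is an assumption made for illustration; `count_date_ranges` is a hypothetical helper.

```python
from datetime import date


def count_date_ranges(start: date, end: date) -> int:
    """Count (from, to) pairs with start <= from <= to <= end, in whole days."""
    n = (end - start).days + 1  # number of selectable days, inclusive
    return n * (n + 1) // 2


# One year of data already gives tens of thousands of ranges per page.
print(count_date_ranges(date(2019, 1, 1), date(2019, 12, 31)))  # 66795
```

That count applies to every program and organisation page separately, and it grows quadratically as more days of data accumulate.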

Acceptance criteria

  • All program and organisation pages load within 5 seconds.

Event Timeline

Samwalton9-WMF renamed this task from "Wikipedia Library program page doesn't load due to expensive db queries" to "Some pages don't load due to expensive db queries". Apr 13 2020, 2:20 PM
Samwalton9-WMF updated the task description.
Samwalton9-WMF renamed this task from "Some pages don't load due to expensive db queries" to "Pages shouldn't time out due to expensive db queries". Jul 16 2020, 10:16 AM
Samwalton9-WMF renamed this task from "Pages shouldn't time out due to expensive db queries" to "Pages shouldn't time out due to expensive database queries". Aug 6 2020, 8:52 AM

We discussed this task yesterday and decided that the best approach is likely to be calculating the data for the top tables and the graph asynchronously, storing it, and fetching it on page load.

If we do this we could also probably get rid of memcached, replacing it with file-based caching if we decide caching is still useful. This could be a separate task.

Additionally, we noted that Django 3.1 just introduced async views, though the feature isn't expected to be completed until 3.2.

The first step here will be evaluating Django async views in 3.1 (T259866). This would necessitate a Django upgrade before proceeding with this work, but may be worthwhile.

Something which occurred to me today - the tables and graphs change based on the user's filters. How would this play with doing our calculations asynchronously?
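One possible answer, sketched here as an assumption and not as the approach the team settled on: precompute totals per day (or per month) in the background, then have the page load sum only the buckets that fall inside the user's date filter. The function names and sample events below are hypothetical.

```python
from collections import defaultdict
from datetime import date


def build_daily_totals(events):
    """Background step: collapse raw (day, organisation) events into counts."""
    totals = defaultdict(int)
    for day, organisation in events:
        totals[(day, organisation)] += 1
    return dict(totals)


def top_for_range(daily_totals, start, end, limit=5):
    """Page-load step: sum only the precomputed buckets inside the filter."""
    summed = defaultdict(int)
    for (day, organisation), count in daily_totals.items():
        if start <= day <= end:
            summed[organisation] += count
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(summed.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Made-up sample data for illustration only.
events = [
    (date(2020, 1, 1), "Org A"),
    (date(2020, 1, 1), "Org A"),
    (date(2020, 1, 2), "Org B"),
    (date(2020, 2, 1), "Org B"),
    (date(2020, 2, 1), "Org B"),
]
daily = build_daily_totals(events)
print(top_for_range(daily, date(2020, 1, 1), date(2020, 1, 31)))
```

With this layout any date range can be answered from the stored buckets, so the work at page load scales with the number of buckets in the range and not with the number of raw events.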

Taking this off the Kanban now that it's split into component tasks.

This has one remaining feature to reimplement, but it's low priority and we're not going to work on it soon.

All organisation pages load in a few seconds. The Wikipedia Library program page currently takes closer to 20 seconds, but we're considering that good enough for now with caching.