
Fix worker timeout issues on production
Open, High, Public

Description

The tag #flickr2commons is accruing quite a large set of data (230,000 edits in 2.5 months). Queries for it now take a long time to evaluate, and today a worker timed out completely, resulting in a 502 error.

This is a broad issue with many potential steps to take, including:

  • Evaluating which db calls are being made and how expensive they are
  • Taking steps to optimise the database
  • Considering whether to restrict #flickr2commons edits
  • Investigating worker timeouts (could we just increase the time before they time out for now?)
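
As a starting point for the first two items, here is a minimal sketch of how one might check whether a tag lookup scans the whole table and whether an index changes that. It runs against an in-memory SQLite database from Python's standard library, not the production database. The table name `hashtag`, its columns, the index name `idx_hashtag_tag` and the helper `query_plan` are all hypothetical and not taken from the Hashtags codebase.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail strings from EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row has four columns; the last one is the human-readable detail.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE hashtag (id INTEGER PRIMARY KEY, tag TEXT, username TEXT)"
)
conn.executemany(
    "INSERT INTO hashtag (tag, username) VALUES (?, ?)",
    [("flickr2commons", f"user{i % 50}") for i in range(6000)]
    + [("othertag", f"user{i % 50}") for i in range(3000)],
)

sql = "SELECT COUNT(*) FROM hashtag WHERE tag = ?"

# Plan without an index on the tag column.
plan_before = query_plan(conn, sql, ("flickr2commons",))

# Add an index on the column used for filtering, then ask again.
conn.execute("CREATE INDEX idx_hashtag_tag ON hashtag (tag)")
plan_after = query_plan(conn, sql, ("flickr2commons",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, and the production database engine will have its own `EXPLAIN` output, so this only illustrates the kind of check, not the expected result on production.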

Event Timeline

Small update - #flickr2commons now accounts for 600,000 of the 900,000 hashtag records in the database.

@Samwalton9 Shall I go ahead and restrict the search for #flickr2commons for now?

No, I think we need a better solution than arbitrarily restricting any tags that get too large.

I'd at least like to first properly evaluate the current situation that leads to these timeouts and see what improvements can be made, such as to the database setup, to improve things.

Rather than increasing workers, could we paginate the results for #flickr2commons only, 10 per page, so that just the first 10 hashtags are returned initially? When the user goes to the second page, the hashtags ranked 11 to 20 by number of edits would then be fetched for the first time.
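
A rough sketch of that pagination idea, assuming results are ranked by edit count and fetched one page at a time with `LIMIT`/`OFFSET`. It uses an in-memory SQLite database from Python's standard library; the table `hashtag`, the function `top_hashtags_page` and the sample tags are made up for illustration. In Django, slicing a queryset or using `django.core.paginator.Paginator` would be the usual way to get the same effect.

```python
import sqlite3

PAGE_SIZE = 10


def top_hashtags_page(conn, page, page_size=PAGE_SIZE):
    """Return one page of (tag, edit_count) rows ranked by edit count.

    Pages are 1-indexed: page 1 holds ranks 1-10, page 2 holds ranks 11-20.
    """
    if page < 1:
        raise ValueError("page must be >= 1")
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT tag, COUNT(*) AS edits FROM hashtag "
        "GROUP BY tag ORDER BY edits DESC, tag ASC LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute("CREATE TABLE hashtag (id INTEGER PRIMARY KEY, tag TEXT)")
# 25 made-up tags: tag01 has the most edits (25), tag25 the fewest (1).
rows = [(f"tag{n:02d}",) for n in range(1, 26) for _ in range(26 - n)]
conn.executemany("INSERT INTO hashtag (tag) VALUES (?)", rows)

first_page = top_hashtags_page(conn, 1)   # ranks 1-10
second_page = top_hashtags_page(conn, 2)  # ranks 11-20, fetched on demand
```

One caveat: with a `GROUP BY` like this, the database still has to aggregate every row before it can rank them, so pagination mainly limits what is returned and rendered, not what is scanned. Reducing the scan itself would probably need precomputed counts or a dedicated search backend.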

I started working on implementing a proper search engine (https://github.com/Samwalton9/hashtags/pull/22), because I think the core issue here is that we're doing heavy searches directly via Django querysets. I made decent progress, with the primary search functionality up and running - all that's really left to do is reimplement the top-level statistics that are shown for each search. I just haven't had time to do that yet. If you want to progress that branch please feel free to do so.

Oh actually I've just taken a look and I only committed some of my changes, let me just push my other WIP code.

PR updated with notes on current progress.

Hello @Samwalton9 and others and thanks for looking into this issue. Please let me know once the issues are resolved. Thank you.