
Do some load testing with grant metrics to get an idea of what limits we should add
Closed, Resolved · Public · 3 Estimated Story Points

Description

Now that Grant Metrics is up and running, we should get some benchmarks to see what kind of limits we might have to add. Specifically for:

  • Update data (time taken to compute, how many wikis, event stats, length of event)
  • View all data (time taken to generate, how many wikis, number of records, length of event)

We should add this to the logs, perhaps in a separate file used only for these benchmarks.
Based on the results, we could consider capping the number of participants, the length of the event, the number of wikis, and so on; possibly more than one of these.
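
As a rough illustration (not a prescribed implementation), the "Update data" logging could look something like the sketch below, assuming the tool's Symfony/Monolog stack. The channel, service wiring, and the getters on the event object are all hypothetical.

```php
<?php
// Sketch only: times a stats update and writes the benchmark data to a
// dedicated Monolog channel (e.g. var/logs/metrics.log). All names here
// are illustrative, not the actual Grant Metrics code.

use Psr\Log\LoggerInterface;

class EventStatsTimer
{
    /** @var LoggerInterface Logger bound to a separate "metrics" channel. */
    private $metricsLogger;

    public function __construct(LoggerInterface $metricsLogger)
    {
        $this->metricsLogger = $metricsLogger;
    }

    /**
     * Run the given stats-update callable and log how long it took,
     * along with the figures we want to benchmark against.
     */
    public function timeUpdate(callable $updateStats, $event): void
    {
        $start = microtime(true);
        $updateStats();

        $this->metricsLogger->info('Update data', [
            'event' => $event->getTitle(),
            'seconds' => round(microtime(true) - $start, 2),
            'wikis' => count($event->getWikis()),
            'participants' => count($event->getParticipants()),
            'event_length_days' => $event->getStart()->diff($event->getEnd())->days,
        ]);
    }
}
```

The "View all data" case could log the same fields plus the number of records returned and the time taken to generate the page.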

Event Timeline

Niharika moved this task from New & TBD Tickets to Needs Discussion on the Community-Tech board.
Niharika set the point value for this task to 3. · Feb 20 2018, 11:56 PM

For the test "Women in Red" event, I added 500+ actual Art+Feminism participants, along with prolific editors, stewards, and people who edit on a lot of wikis, and set the event duration to just over a year, across 11 wikis. It took around 30 seconds to fetch the statistics. Browsing revisions sometimes took a while, too: around 30 seconds to load a page. Frankly this is not at all surprising given how much data we're working with, but we might look into potential improvements.

So, initial outlook is good :)

However, I noticed we're not saying if a job is currently running. When you open the event in a different browser (see below), the "Update stats" button is active. We should check the job queue and put the page in the loading state if a job is running. I've created T188368.
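
For reference, the check itself could be as simple as the sketch below; the Job entity, its fields, and the flag for in-progress jobs are guesses, not the actual schema.

```php
<?php
// Sketch: before rendering the event page, see whether a stats job for this
// event is still queued or running; if so, render the loading state and
// disable "Update stats". Entity and field names are assumptions.

$pendingJobs = $entityManager->getRepository(Job::class)->findBy([
    'event' => $event,
    'finished' => false, // or however completed jobs are flagged
]);

$showLoadingState = count($pendingJobs) > 0;
```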

Related, I want to make sure it's known that you typically can't load the tool in another tab while queries are still running in one tab. If you want to do this, you have to use a new browser session (a different browser, or incognito). I believe it's Toolforge/VPS that's doing this, because the same thing happens with XTools. We consider it a feature, because it prevents people from going crazy and running a bunch of long-running queries at once. It's a little different for Grant Metrics because we have an actual job queue system, but overall I don't think forcing people to wait is a bad thing, especially since there are real limitations on how many queries can be run at any given time.

> For the test "Women in Red" event, I added 500+ actual Art+Feminism participants, along with prolific editors, stewards, and people who edit on a lot of wikis, and set the event duration to just over a year, across 11 wikis. It took around 30 seconds to fetch the statistics. Browsing revisions sometimes took a while, too: around 30 seconds to load a page. Frankly this is not at all surprising given how much data we're working with, but we might look into potential improvements.

Woah. 30 seconds is really good! How many revisions were there i.e. how many pages improved? We should get these stats written down.

> Related, I want to make sure it's known that you typically can't load the tool in another tab while queries are still running in one tab. If you want to do this, you have to use a new browser session (a different browser, or incognito). I believe it's Toolforge/VPS that's doing this, because the same thing happens with XTools. We consider it a feature, because it prevents people from going crazy and running a bunch of long-running queries at once. It's a little different for Grant Metrics because we have an actual job queue system, but overall I don't think forcing people to wait is a bad thing, especially since there are real limitations on how many queries can be run at any given time.

That sounds fine to me.

So what kind of limits would you say we should add?

Well, for starters, I've made it show friendlier errors if a query times out or we hit the maximum connection limit. The timeout is currently set to 900 seconds, which is ample.
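
For reference, the approach is roughly along these lines. This is a sketch rather than the actual code (assuming Doctrine DBAL 2.x), and the specific MySQL/MariaDB error codes below are assumptions, not confirmed values.

```php
<?php
// Sketch of mapping low-level driver errors to friendly messages.
// Not the actual implementation. Error codes are assumptions:
//   1040 / 1226 - connection limits ("Too many connections" / max_user_connections)
//   1969        - MariaDB "max_statement_time exceeded" (our 900s timeout)

use Doctrine\DBAL\Exception\DriverException;

try {
    $revisions = $eventRepo->getRevisions($event); // hypothetical repository call
} catch (DriverException $e) {
    switch ($e->getErrorCode()) {
        case 1969:
            $message = 'The query timed out. Try a shorter date range or fewer wikis.';
            break;
        case 1040:
        case 1226:
            $message = 'The database is handling too many connections right now. Please try again shortly.';
            break;
        default:
            throw $e;
    }
    $this->addFlash('danger', $message); // Symfony controller flash message
}
```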

For the revision browser (as I'm calling it; aka the "Event data" page), we might consider showing fewer revisions per page, which means a smaller LIMIT on the query and hence faster loading. Taking it a step further, we could offer a "Results per page" option. If the query takes more than 30 seconds (a long time), we could show a message like "Request took N seconds to complete. Consider showing fewer results per page." The REQUEST_TIME_FLOAT server variable gives us the overall time it took to load the page, so we already have that at our disposal.
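
To make that concrete, here's a sketch of both pieces; the parameter names, the row cap, and the 30-second threshold are illustrative.

```php
<?php
// Sketch: a "Results per page" option feeding the query's LIMIT, plus a hint
// shown when the overall page load exceeded 30 seconds.

$limit = min((int)$request->query->get('limit', 50), 500); // hypothetical cap
$offset = ($request->query->getInt('page', 1) - 1) * $limit;
// ... run the revision query with LIMIT $limit OFFSET $offset ...

// REQUEST_TIME_FLOAT is the timestamp at which the request started,
// so this gives the total time spent generating the page.
$loadTime = microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'];
if ($loadTime > 30) {
    $this->addFlash('warning', sprintf(
        'Request took %d seconds to complete. Consider showing fewer results per page.',
        round($loadTime)
    ));
}
```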

General related thought: I'm really loving this "revision browser", where I can enter multiple users and a date range and see all of their contributions together. I think this could be useful to a lot of people. Once we have it perfected I might create a dedicated tool :)

> Well, for starters, I've made it show friendlier errors if a query times out or we hit the maximum connection limit. The timeout is currently set to 900 seconds, which is ample.

> For the revision browser (as I'm calling it; aka the "Event data" page), we might consider showing fewer revisions per page, which means a smaller LIMIT on the query and hence faster loading. Taking it a step further, we could offer a "Results per page" option. If the query takes more than 30 seconds (a long time), we could show a message like "Request took N seconds to complete. Consider showing fewer results per page." The REQUEST_TIME_FLOAT server variable gives us the overall time it took to load the page, so we already have that at our disposal.

Since it's pretty fast for now, let's hold off on this until we actually run into the issue with users. I'm adding this as a nice-to-have for the future.

> General related thought: I'm really loving this "revision browser", where I can enter multiple users and a date range and see all of their contributions together. I think this could be useful to a lot of people. Once we have it perfected I might create a dedicated tool :)

I'm liking it too! But beware that it doesn't turn into this. ;)

I think we're done here, for now? The only thing that went a little too slow was the revision browser, and only when we have a ton of participants across a lot of wikis, etc.

Niharika claimed this task.
Niharika moved this task from Needs Review/Feedback to Q1 2018-19 on the Community-Tech-Sprint board.

Think so too. Let's not create any limits for the time being.