
Speed Change Impact Analysis
Closed, Resolved (Public)

Description

Analyze the impact of speed changes on the mobile website (Baha is taking on this analysis). As learning evolves, document details here. The material will eventually be carried over to the October 7, 2015 quarterly review for Reading.

Some lenses:

  • Pre and post regional impact, filtering out known complicated noise
  • Mobile web versus desktop web
  • Previous year similar timeframe versus current year similar timeframe
  • Analysis isolated for specific sets of pages (e.g., top 1,000 pages)
  • Distribution curves on relative interclick time
  • More refined geographic perspectives
  • Consecutive week delta analysis (e.g., how do deltas play out week-by-week prior to the change versus how do deltas play out week-by-week post the change - which is a problem distinct from comparing one week prior to the change against one week post the change).

Event Timeline

dr0ptp4kt assigned this task to bmansurov.
dr0ptp4kt raised the priority of this task to Needs Triage.
dr0ptp4kt updated the task description. (Show Details)
dr0ptp4kt added a project: Reading-Admin.
dr0ptp4kt moved this task to Doing on the Reading-Admin board.

Probably not much. I'll close this one and carry the intro over there.

dr0ptp4kt set Security to None.

Actually, this card is a short-term thing, not for Q2. It's a one-time analysis.

@dr0ptp4kt, am I correct to assume that the main patch for speed improvements [1] was in production starting 08/11? For weekly analysis, the dates before the change would be 08/04 - 08/10 and dates after the change would be 08/18 - 08/24 so that the days of the week also match. Is that reasonable?

[1] https://gerrit.wikimedia.org/r/#/c/227627/

Previous year similar timeframe versus current year similar timeframe

@Milimetric, will Hadoop have complete pageview data for August 2014? Or should I look somewhere else for that? Thanks!

Also, anything I should be aware of before comparing data from August 2015 to data from August 2014? I suspect the definition of pageview may have changed during this time.

@bmansurov, if by "complete" you mean unsampled, then no, we delete all unsampled logs older than 60 days. For August 2015 we have sampled UDP data, sampled TCP data, and unsampled data; for August 2014 we only have sampled UDP data. You can apply the old pageview definition or the new pageview definition, whichever one you like, to the sampled data from August 2014. Likewise for the 2015 data. If you'd like to talk about how to do that, I can help.

@dr0ptp4kt, am I correct to assume that the main patch for speed improvements [1] was in production starting 08/11? For weekly analysis, the dates before the change would be 08/04 - 08/10 and dates after the change would be 08/18 - 08/24 so that the days of the week also match. Is that reasonable?

[1] https://gerrit.wikimedia.org/r/#/c/227627/

I think it was earlier than that.

https://twitter.com/mediawiki/status/630865572654264320

You probably want to avoid the day of the implementation or the day after. @ori, are there any other gotchas for the async rollout that could muddle the analysis?

pre and post regional impact

Dates before change: from 2015-07-30 till 2015-08-05 (inclusive)
Dates after change: from 2015-08-13 till 2015-08-19 (inclusive)
Query:

SELECT continent, year, month, day, referer_class, sum(view_count) as view_count, access_method
FROM wmf.pageview_hourly
WHERE
    (
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-07-30" AND "2015-08-05")
     OR
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-08-19")
    )
    AND agent_type = "user"
    AND (access_method = "desktop" OR access_method = "mobile web")
GROUP BY continent, year, month, day, referer_class, access_method;

Results:
Details are in [1] (Let me know if anyone else needs access to the file).

High-level snapshot:

  • Global desktop increase is 4.80% (cell L29).
  • Global mobile web increase is 7.16% (cell L38).
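
For reference, a minimal sketch of how these two relative changes could be computed directly in Hive with conditional sums (an assumption about the calculation, not necessarily how the spreadsheet in [1] derives them):

-- Relative change in pageviews, week after the change vs. week before, per access method
SELECT
    access_method,
    SUM(IF(CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-08-19", view_count, 0))
      / SUM(IF(CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-07-30" AND "2015-08-05", view_count, 0))
      - 1 AS relative_change
FROM wmf.pageview_hourly
WHERE
    (
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-07-30" AND "2015-08-05")
     OR
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-08-19")
    )
    AND agent_type = "user"
    AND (access_method = "desktop" OR access_method = "mobile web")
GROUP BY access_method;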

filtering out known complicated noise

will follow later

[1] https://docs.google.com/spreadsheets/d/1dVu72Q4zkRS4ES3tFrRFc1WqFwQOzI7S9qI4yPLqpzc/edit#gid=1320518345

Hi Baha,

the query looks solid, but two things to consider before drawing conclusions from this high-level snapshot:

  1. There was an outage on August 4 (i.e., in the first week used for comparison above) which caused missing pageviews; see e.g. this chart from the weekly readership metrics report, where that day is marked together with the day of the JavaScript change. It's also marked as an annotation in this dashboard, which links to the corresponding Phabricator task. IIRC Dan also sent a fuller explanation to the Analytics-l mailing list; I can try to dig out the archive link in case you're not on it.
  2. More generally, I'm not sure how far we can get by just comparing two weeks and not accounting for e.g. seasonal changes. (E.g. the 4.8% desktop increase looks great, but there was also a 3.4% desktop increase in the subsequent week, Aug 20-26 compared to Aug 13-19, without any particular event that I'm aware of.) I know you may be seeking to look at other metrics that could be less affected by seasonal changes than pageviews. But in any case there are more systematic approaches for determining whether such a change (event) had an impact; see e.g. here.

Hi Tilman/@Tbayer, thanks for the feedback.

  1. I was able to find [1], which I hope is the email you're referring to. According to that email, the data loss relevant to us occurred on August 3. For the analysis to be correct, is the best way forward to exclude this date, as well as August 17 (also a Monday, in the post-change week), from the analysis? (A sketch of such an adjusted query is below.)
  2. I totally agree with you on this. I'm working on generating more conclusions using different dates and accounting for seasonal changes. However, I noticed that the 3.4% increase of the two-week comparison after the performance improvements doesn't match the number (0.84%) I got in my calculations. I'll post my analysis below and invite you to the spreadsheet I'm working on. I'd appreciate it if you could help me find errors in my calculations.

[1] https://lists.wikimedia.org/pipermail/analytics/2015-August/004218.html
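
For illustration, a minimal sketch of how the regional comparison query above could be adjusted to leave out those two Mondays (an assumption about the approach, not necessarily the exact query that will be used):

-- Same regional comparison as before, but excluding 2015-08-03 and 2015-08-17 (the two Mondays)
SELECT continent, year, month, day, referer_class, sum(view_count) as view_count, access_method
FROM wmf.pageview_hourly
WHERE
    (
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-07-30" AND "2015-08-05")
     OR
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-08-19")
    )
    AND CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) NOT IN ("2015-08-03", "2015-08-17")
    AND agent_type = "user"
    AND (access_method = "desktop" OR access_method = "mobile web")
GROUP BY continent, year, month, day, referer_class, access_method;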

Consecutive week delta analysis (e.g., how do deltas play out week-by-week prior to the change versus how do deltas play out week-by-week post the change - which is a problem distinct from comparing one week prior to the change against one week post the change).

Dates:
Two weeks before change: from 2015-07-23 till 2015-07-29 (inclusive)
The week before change: from 2015-07-30 till 2015-08-05 (inclusive)
The week after change: from 2015-08-13 till 2015-08-19 (inclusive)
Two weeks after change: from 2015-08-20 till 2015-08-26 (inclusive)

Query:

SELECT continent, year, month, day, referer_class, sum(view_count) as view_count, access_method
FROM wmf.pageview_hourly
WHERE
    (
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-07-23" AND "2015-08-05")
     OR
      (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-08-26")
    )
    AND agent_type = "user"
    AND (access_method = "desktop" OR access_method = "mobile web")
GROUP BY continent, year, month, day, referer_class, access_method;

Results:
Details are in [1] (Let me know if anyone else needs access to the file).

High-level snapshot:
The change in the number of page views ...

  • two weeks after the performance improvements compared to the week after the improvements is -0.25%.
  • the week before the performance improvements compared to two weeks before the improvements is 1.38%.

[1] https://docs.google.com/spreadsheets/d/1dVu72Q4zkRS4ES3tFrRFc1WqFwQOzI7S9qI4yPLqpzc/edit#gid=1000400365
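
As a rough cross-check, a minimal sketch of how the two week-over-week deltas could be derived directly in Hive, broken down by access method (an assumption about the calculation; the spreadsheet may aggregate the access methods differently):

-- Week-over-week relative change in pageviews before the change (week before vs. two weeks before)
-- and after the change (two weeks after vs. the week after), per access method
SELECT
    access_method,
    SUM(IF(dt BETWEEN "2015-07-30" AND "2015-08-05", view_count, 0))
      / SUM(IF(dt BETWEEN "2015-07-23" AND "2015-07-29", view_count, 0)) - 1 AS pre_change_delta,
    SUM(IF(dt BETWEEN "2015-08-20" AND "2015-08-26", view_count, 0))
      / SUM(IF(dt BETWEEN "2015-08-13" AND "2015-08-19", view_count, 0)) - 1 AS post_change_delta
FROM (
    SELECT access_method, view_count,
           CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) AS dt
    FROM wmf.pageview_hourly
    WHERE agent_type = "user"
        AND (access_method = "desktop" OR access_method = "mobile web")
        AND year = 2015 AND (month = 7 OR month = 8)
) p
WHERE (dt BETWEEN "2015-07-23" AND "2015-08-05") OR (dt BETWEEN "2015-08-13" AND "2015-08-26")
GROUP BY access_method;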

Hi Baha,

re 1.: yes, I meant August 3 and that email. I guess leaving out both Mondays is a suitable workaround to eliminate that artefact.

re 2.: Not sure we are talking about the same comparison? As I mentioned, the 3.4% refers to Aug 20-26 compared to Aug 13-19, just as an illustration of the fact that we can find similar weekly gains elsewhere and need to be cautious about drawing conclusions from isolated data points. Sorry, I usually try to include the query used, but I was in a bit of a rush yesterday. Here is how I arrived at the 3.4%:

SELECT SUM(view_count) FROM wmf.pageview_hourly WHERE (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-08-19") AND agent_type = "user" AND access_method = "desktop";

2041977158

SELECT SUM(view_count) FROM wmf.pageview_hourly WHERE (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-20" AND "2015-08-26") AND agent_type = "user" AND access_method = "desktop";
...
2112336875

2112336875 / 2041977158 = 1.0344...

Thanks for inviting me to the spreadsheet, will try to take a look later.

Here [1] is a script that performs a t-test to check whether the mean page views over the 41 days before and after the performance tweaks were statistically significantly different. Copy-pasted from the README at that URL:

At 5% significance level, statistically significant:

INCREASE in the number of page views:
    Mobile Web: Africa, Antarctica, Europe, North America, South America
    Desktop: Europe, Oceania, South America
DECREASE in the number of page views:
    Mobile Web: Unknown
    Desktop: Unknown
NO DIFFERENCE in the number of page views:
    Mobile Web: Asia, Oceania
    Desktop: Africa, Antarctica, Asia, North America

@Tbayer, I'd appreciate a review. Thanks.

[1] https://github.com/kodchi/perf-impact
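
For anyone who wants a rough in-database sanity check of the direction of these results, here is a minimal sketch (an assumption about the general method, not the actual script in [1]) that computes a Welch-style t statistic per continent and access method from daily pageview totals. The before/after windows below are illustrative 41-day spans; the resulting t values would still need to be compared against the appropriate critical value (roughly 2.0 at the 5% level for about 40 or more degrees of freedom):

-- Welch-style t statistic comparing mean daily pageviews before vs. after the change,
-- per continent and access method (window boundaries below are illustrative)
SELECT
    continent,
    access_method,
    (AVG(IF(dt >= "2015-08-13", daily_views, NULL))
      - AVG(IF(dt <= "2015-08-09", daily_views, NULL)))
    / SQRT(
        VAR_SAMP(IF(dt >= "2015-08-13", daily_views, NULL)) / COUNT(IF(dt >= "2015-08-13", daily_views, NULL))
      + VAR_SAMP(IF(dt <= "2015-08-09", daily_views, NULL)) / COUNT(IF(dt <= "2015-08-09", daily_views, NULL))
      ) AS welch_t
FROM (
    SELECT continent, access_method,
           CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) AS dt,
           SUM(view_count) AS daily_views
    FROM wmf.pageview_hourly
    WHERE agent_type = "user"
        AND (access_method = "desktop" OR access_method = "mobile web")
        AND (
          (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-06-30" AND "2015-08-09")
         OR
          (CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0")) BETWEEN "2015-08-13" AND "2015-09-22")
        )
    GROUP BY continent, access_method,
             CONCAT(year, "-", LPAD(month, 2, "0"), "-", LPAD(day, 2, "0"))
) daily
GROUP BY continent, access_method;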

Nice work, Baha! Let's run this by Research as well; I'm sure there will be feedback on seasonality, etc.

-Toby

Thanks, @Tnegrin. I'm thinking of doing a similar analysis to account for seasonality. The older data is in a different format, so I'll have to bring it into good shape before doing so.

@Halfak, I'd appreciate it if you could review [1]. Thanks.

[1] https://github.com/kodchi/perf-impact