Deploy lazy loaded images to a few more wikis
Closed, Resolved | Public | 1 Estimated Story Points

Description

As a product owner, I want lazy loaded images running in production on a mobile web Wikipedia so that I can observe how the enhancement that saves bandwidth impacts usage in a coarse grained fashion.

  • Deploy lazy loaded images to fa.m.wikipedia.org, uk.m.wikipedia.org

Event Timeline

dr0ptp4kt renamed this task from Deploy lazy loaded images to a specific mobile Wikipedia to Deploy lazy loaded images to bn.m.wikipedia.org. Apr 29 2016, 8:48 PM
dr0ptp4kt updated the task description.
This comment was removed by Jdlrobson.

Querying the most recent table

select count(*) from NavigationTiming_15485142 where wiki = 'bewiki' and timestamp > 20160403000000

yields 353 entries for the past month, so this is definitely a small sample.

@ori increasing the sampling rate for one wiki seems like a bad thing to do, as it would affect the global first paint / full load times, which are supposed to reflect all our traffic. How would you recommend going about this?

@Jdlrobson, good thinking to raise this!

@ori et al:

The task title currently suggests bnwiki, not bewiki, although both of those are on the lower end of the spectrum in terms of number of events. The top traffic sites like enwiki dwarf smaller ones like bnwiki, so my guess is amplification of the number of events would largely be washed out. But it's a good point that artificial amplification in a given wiki (likely) introduces skew synthetically on the macro number (i.e., by virtue of changing what gets counted), as opposed to naturally (i.e., as a result of an actual content loading change).

I think we should consider fawiki or ukwiki either in addition to bnwiki or as an alternative. These get relatively larger numbers of events, and although the natural change of different loading strategies is bound to impact the global trend (again, probably more nominally given the top sites account for most events), at least it's a natural consequence.

For a frame of reference, here's data looking at hits for one week. NULL means desktop, stable means mobile web stable, beta means mobile web beta.

select wiki, event_mobileMode, count(*) ct, avg(event_loadEventEnd) load_time from
NavigationTiming_15485142 where timestamp > '20160427' and timestamp < '20160504'
and wiki in ('enwiki', 'fawiki', 'bnwiki', 'ukwiki') group by wiki, event_mobileMode order by load_time asc;

wiki	event_mobileMode	ct	load_time
enwiki	NULL	725548	2940.3989
ukwiki	NULL	4336	3558.6836
fawiki	beta	7	4186.0000
enwiki	stable	564376	4834.4134
enwiki	beta	201	5382.7438
bnwiki	beta	4	6006.3333
fawiki	NULL	6824	6591.3143
bnwiki	NULL	165	7399.9538
fawiki	stable	9779	9505.1877
ukwiki	stable	2437	10225.1451
bnwiki	stable	295	11228.5442
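
To put rough numbers on the skew concern raised above, here is a minimal sketch that pools the counts and averages from the table and then re-pools them with bnwiki stable oversampled. The 10x boost and the `pooled_mean` helper are assumptions for illustration only. Because avg() skips NULL values while count(*) does not, weighting by `ct` is only approximate.

```python
# (wiki, mobileMode) -> (event count, mean loadEventEnd in ms), copied from the table above.
ROWS = {
    ("enwiki", None): (725548, 2940.3989),
    ("ukwiki", None): (4336, 3558.6836),
    ("fawiki", "beta"): (7, 4186.0000),
    ("enwiki", "stable"): (564376, 4834.4134),
    ("enwiki", "beta"): (201, 5382.7438),
    ("bnwiki", "beta"): (4, 6006.3333),
    ("fawiki", None): (6824, 6591.3143),
    ("bnwiki", None): (165, 7399.9538),
    ("fawiki", "stable"): (9779, 9505.1877),
    ("ukwiki", "stable"): (2437, 10225.1451),
    ("bnwiki", "stable"): (295, 11228.5442),
}


def pooled_mean(rows):
    """Count-weighted mean across all groups (approximate, see note above)."""
    total = sum(ct for ct, _ in rows.values())
    return sum(ct * mean for ct, mean in rows.values()) / total


BOOST = 10  # hypothetical sampling multiplier, not a proposed value

# Assume the per-group average stays the same and only the event count grows.
boosted = dict(ROWS)
ct, mean = ROWS[("bnwiki", "stable")]
boosted[("bnwiki", "stable")] = (ct * BOOST, mean)

baseline = pooled_mean(ROWS)
skewed = pooled_mean(boosted)
print(f"baseline={baseline:.1f} ms, boosted={skewed:.1f} ms, shift={skewed - baseline:.1f} ms")
```

The shift is small next to the enwiki volume, but it comes purely from changing what gets counted, which is the synthetic skew described above.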

Here's some additional data on bnwiki, fawiki, and ukwiki from Hive looking at one day of pageviews.

Desktop vs. mobile

select project, access_method, sum(view_count)
from projectview_hourly
where year = 2016 and month = 4 and day = 28
and project in ('bn.wikipedia', 'fa.wikipedia', 'uk.wikipedia')
and agent_type = 'user'
group by project, access_method;

project	access_method	_c2
bn.wikipedia	desktop	57539
bn.wikipedia	mobile app	2402
bn.wikipedia	mobile web	113066
fa.wikipedia	desktop	1049256
fa.wikipedia	mobile app	32395
fa.wikipedia	mobile web	1594857
uk.wikipedia	desktop	1046392
uk.wikipedia	mobile app	9386
uk.wikipedia	mobile web	662231




Top 3 country level access per wiki for mobile web traffic specifically. When there's a high concentration of users for a given wiki in a particular region, that helps with insights about relative pageview changes and relative speed changes.

select project, country_code, sum(view_count)
from projectview_hourly
where year = 2016 and month = 4 and day = 28
and project in ('bn.wikipedia', 'fa.wikipedia', 'uk.wikipedia')
and agent_type = 'user'
and access_method = 'mobile web'
group by project, country_code;

bn.wikipedia	BD	55823
bn.wikipedia	US	20892
bn.wikipedia	IE	16449


fa.wikipedia	IR	1418817
fa.wikipedia	US	56514
fa.wikipedia	DE	15765


uk.wikipedia	UA	620712
uk.wikipedia	NL	9387
uk.wikipedia	US	6930
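
As a rough check on how concentrated each wiki's mobile web audience is, the sketch below divides the top country's views by the wiki's total mobile web views for the same day, using only figures from the two tables above. The `top_country_share` helper is a hypothetical name for illustration.

```python
# One-day mobile web views per project (April 28, 2016), from the desktop vs. mobile table.
MOBILE_WEB_TOTAL = {
    "bn.wikipedia": 113066,
    "fa.wikipedia": 1594857,
    "uk.wikipedia": 662231,
}

# Top source country per project, from the per-country breakdown above.
TOP_COUNTRY = {
    "bn.wikipedia": ("BD", 55823),
    "fa.wikipedia": ("IR", 1418817),
    "uk.wikipedia": ("UA", 620712),
}


def top_country_share(project):
    """Fraction of a project's mobile web views that come from its top country."""
    _, views = TOP_COUNTRY[project]
    return views / MOBILE_WEB_TOTAL[project]


for project in MOBILE_WEB_TOTAL:
    country, _ = TOP_COUNTRY[project]
    print(f"{project}: {country} accounts for {top_country_share(project):.1%}")
```

On these figures fawiki and ukwiki are far more concentrated in a single country than bnwiki, which should make relative changes easier to attribute.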



Top 3 destination wikis based on top source country above, based on a quick scan. This aids in distinguishing if changes are material based on destination wiki. Granted, there are likely to be some socioeconomic conditions aligned with target langwiki for any given source country, but it's the relative change that's most interesting when comparing what happened in two destination wikis for the same source country of access.

select country_code, project, sum(view_count)
from projectview_hourly
where year = 2016 and month = 4 and day = 28
and country_code in ('BD', 'IR', 'UA')
and agent_type = 'user'
and access_method = 'mobile web'
group by country_code, project;


BD	en.wikipedia	411855
BD	bn.wikipedia	55823
BD	commons.wikimedia	8083

IR	fa.wikipedia	1418817
IR	en.wikipedia	268671
IR	fa.wikisource	6988


UA	ru.wikipedia	1079823
UA	uk.wikipedia	620712
UA	en.wikipedia	104695

Speaking to specific analysis for this work, we can of course always update our queries to include/exclude particular wikis from the resultant output to isolate effects.

For our future selves, here's a query by country for bnwiki, fawiki, and ukwiki with primary country sourcing vs US sourcing. Future queries would need to screen out impacts of non-caching via the available fields in the NavigationTiming schema. The following are just averages; more sophisticated analysis would involve study of the median/quantiles/distribution curves (cf. T125414: Investigate how connection speed varies by country).

One week

select wiki, event_originCountry, count(*) ct, avg(event_loadEventEnd) load_time from
NavigationTiming_15485142 where timestamp > '20160427' and timestamp < '20160504'
and event_mobileMode = 'stable' and wiki in ('fawiki', 'bnwiki', 'ukwiki')
and event_originCountry in ('BD', 'US', 'IE', 'IR', 'DE', 'UA', 'NL') group by wiki, event_originCountry order by wiki, event_originCountry, load_time;


wiki	event_originCountry	ct	load_time
bnwiki	BD	167	10419.8519
bnwiki	US	21	3807.6000

fawiki	IR	8768	9680.8648
fawiki	US	317	6846.1467

ukwiki	UA	2252	10578.8736
ukwiki	US	31	10791.6667


Approximately April 12 - May 5 since latest functioning schema rev

select wiki, event_originCountry, count(*) ct, avg(event_loadEventEnd) load_time from
NavigationTiming_15485142 where timestamp < '20160505'
and event_mobileMode = 'stable' and wiki in ('fawiki', 'bnwiki', 'ukwiki')
and event_originCountry in ('BD', 'US', 'IE', 'IR', 'DE', 'UA', 'NL') group by wiki, event_originCountry order by wiki, event_originCountry, load_time

bnwiki	BD	427	10603.6950
bnwiki	US	58	5786.5556
fawiki	IR	24735	9487.8627
fawiki	US	935	7815.1058
ukwiki	UA	8163	10609.2396
ukwiki	US	84	10913.0615
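
The caveat about averages can be illustrated with a small sketch. The sample values below are made up (they are not NavigationTiming data) and `summarize` is a hypothetical helper; the point is only that one slow outlier drags the mean well above the median.

```python
from statistics import mean, median, quantiles

# Hypothetical loadEventEnd samples in milliseconds; NOT real NavigationTiming data.
samples_ms = [1800, 2100, 2400, 2600, 3000, 3300, 3900, 4500, 6200, 48000]


def summarize(values):
    """Return the mean alongside the quartiles for a list of timings."""
    # quantiles(n=4) returns the three quartile cut points: Q1, the median, and Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"mean": mean(values), "median": median(values), "q1": q1, "q3": q3}


summary = summarize(samples_ms)
print(summary)
```

With per-country counts as low as a few dozen rows, a handful of very slow loads can move the averages above a lot, so quantiles are likely the safer comparison.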
dr0ptp4kt renamed this task from Deploy lazy loaded images to bn.m.wikipedia.org to Deploy lazy loaded images to specific <lang(s)>.m.wikipedia.org. May 5 2016, 1:49 PM
dr0ptp4kt renamed this task from Deploy lazy loaded images to specific <lang(s)>.m.wikipedia.org to Deploy lazy loaded images to fa.m.wikipedia.org and uk.m.wikipedia.org. May 9 2016, 3:04 PM
dr0ptp4kt triaged this task as Medium priority. May 9 2016, 5:02 PM
dr0ptp4kt renamed this task from Deploy lazy loaded images to fa.m.wikipedia.org and uk.m.wikipedia.org to Deploy lazy loaded images and references to fa.m.wikipedia.org and uk.m.wikipedia.org. May 16 2016, 3:49 PM
dr0ptp4kt updated the task description.
dr0ptp4kt renamed this task from Deploy lazy loaded images and references to fa.m.wikipedia.org and uk.m.wikipedia.org to Deploy lazy loaded images to fa.m.wikipedia.org and uk.m.wikipedia.org. May 16 2016, 4:34 PM
dr0ptp4kt updated the task description.

I think it would be useful to deploy both lazy loaded references AND images to https://tl.wikipedia.org/
This wiki is about the same size in readership as Bengali Wikipedia, is in a 2G-prevalent area, and will give us interesting data about how combining the two changes affects things. @dr0ptp4kt do you agree?

I think the benefit of rolling out to the fa and uk wikis is the larger audience, and thus more confidence that our changes work and larger data sets for navigation timing metrics, but I don't envision much benefit aside from those points.

@dr0ptp4kt @Jhernandez it is my suggestion we update this task to launch lazy loading images on all medium wikis with definitions coming from:

Large size wikis (>= 9 digit figures):
{code: "en", size: 7398000000}
{code: "es", size: 1233000000}
{code: "ja", size: 1073000000}
{code: "de", size: 1003000000}
{code: "ru", size: 947000000}
{code: "fr", size: 752000000}
{code: "it", size: 518000000}
{code: "pt", size: 376000000}
{code: "zh", size: 334000000}
{code: "pl", size: 255000000}
{code: "ar", size: 157000000}
{code: "tr", size: 155000000}
{code: "nl", size: 142000000}
{code: "id", size: 130000000}

MEDIUM SIZED WIKIS (8 digit figures)
{code: "sv", size: 90900000}
{code: "ko", size: 84300000}
{code: "fa", size: 82800000} *
{code: "cs", size: 69100000}
{code: "fi", size: 63000000}
{code: "vi", size: 51700000}
{code: "uk", size: 49900000} *
{code: "hu", size: 44000000}
{code: "he", size: 42900000}
{code: "th", size: 37300000}
{code: "no", size: 32900000}
{code: "da", size: 28400000}
{code: "ro", size: 28000000}
{code: "el", size: 22600000}
{code: "bg", size: 20900000}
{code: "sr", size: 19100000}
{code: "hr", size: 17500000}
{code: "kk", size: 14200000}
{code: "ca", size: 14000000}
{code: "sk", size: 12900000}
{code: "simple", size: 12800000}

SMALL WIKIS (<= 7 digit figures)
{code: "hi", size: 9700000}
{code: "lt", size: 9500000}
{code: "ms", size: 8100000}
{code: "az", size: 6800000}
{code: "et", size: 6800000}
{code: "sh", size: 6700000}
{code: "sl", size: 6600000}
{code: "bn", size: 5300000} *
{code: "ka", size: 5000000}
{code: "hy", size: 4100000}
{code: "lv", size: 4000000}
{code: "sq", size: 3900000}
{code: "bs", size: 3500000}
{code: "mk", size: 3200000}
{code: "arz", size: 2800000}
{code: "ta", size: 2800000}
{code: "ml", size: 2700000}
{code: "eu", size: 2500000}
{code: "tl", size: 2500000}
{code: "ur", size: 2400000}
{code: "mr", size: 2300000}
{code: "zh-yue", size: 2300000}
{code: "be", size: 1800000}
{code: "af", size: 1700000}
{code: "gl", size: 1700000}
{code: "eo", size: 1500000}
{code: "nn", size: 1400000}
{code: "kn", size: 1200000}
{code: "is", size: 1100000}
{code: "gu", size: 1100000}
{code: "uz", size: 1000000}
{code: "te", size: 1000000}
{code: "mn", size: 992000}
{code: "la", size: 959000}
{code: "sw", size: 890000}
{code: "wuu", size: 871000}
{code: "pa", size: 853000}
{code: "ce", size: 803000}
{code: "csb", size: 782000}
{code: "ky", size: 771000}
{code: "tt", size: 752000}
{code: "ba", size: 668000}
{code: "my", size: 635000}
{code: "sah", size: 613000}
{code: "cv", size: 609000}
{code: "su", size: 594000}
{code: "an", size: 572000}
{code: "lb", size: 570000}
{code: "cy", size: 553000}
{code: "jv", size: 552000}
{code: "als", size: 550000}
{code: "sco", size: 503000}
{code: "br", size: 502000}
{code: "ckb", size: 501000}
{code: "ig", size: 490000}
{code: "oc", size: 480000}
{code: "war", size: 465000}
{code: "yi", size: 464000}
{code: "udm", size: 453000}
{code: "si", size: 446000}
{code: "ne", size: 442000}
{code: "zh-min-nan", size: 430000}
{code: "ast", size: 417000}
{code: "am", size: 412000}
{code: "bar", size: 408000}
{code: "ga", size: 407000}
{code: "ceb", size: 406000}
{code: "ps", size: 396000}
{code: "so", size: 383000}
{code: "mhr", size: 374000}
{code: "tg", size: 371000}
{code: "km", size: 362000}
{code: "or", size: 349000}
{code: "yo", size: 349000}
{code: "lez", size: 326000}
{code: "fy", size: 318000}
{code: "rue", size: 316000}
{code: "ku", size: 310000}
{code: "vec", size: 295000}
{code: "av", size: 289000}
{code: "io", size: 287000}
{code: "pnb", size: 286000}
{code: "scn", size: 274000}
{code: "as", size: 270000}
{code: "ia", size: 269000}
{code: "nds", size: 268000}
{code: "qu", size: 263000}
{code: "new", size: 243000}
{code: "ang", size: 237000}
{code: "krc", size: 235000}
{code: "lmo", size: 233000}
{code: "hif", size: 228000}
{code: "ilo", size: 223000}
{code: "os", size: 220000}
{code: "fo", size: 212000}
{code: "ht", size: 206000}
{code: "bo", size: 205000}
{code: "sa", size: 202000}
{code: "li", size: 199000}
{code: "gd", size: 194000}
{code: "bh", size: 190000}
{code: "zh-classical", size: 189000}
{code: "nah", size: 185000}
{code: "mg", size: 184000}
{code: "diq", size: 177000}
{code: "vo", size: 177000}
{code: "dsb", size: 174000}
{code: "pms", size: 174000}
{code: "hsb", size: 170000}
{code: "lo", size: 170000}
{code: "bat-smg", size: 169000}
{code: "bxr", size: 169000}
{code: "myv", size: 166000}
{code: "fiu-vro", size: 166000}
{code: "tk", size: 165000}
{code: "gn", size: 159000}
{code: "map-bms", size: 150000}
{code: "nap", size: 150000}
{code: "nds-nl", size: 147000}
{code: "gv", size: 147000}
{code: "crh", size: 143000}
{code: "wa", size: 142000}
{code: "vls", size: 141000}
{code: "hak", size: 139000}
{code: "gan", size: 138000}
{code: "eml", size: 138000}
{code: "ace", size: 135000}
{code: "mzn", size: 134000}
{code: "frp", size: 134000}
{code: "bcl", size: 133000}
{code: "tyv", size: 133000}
{code: "frr", size: 131000}
{code: "ksh", size: 125000}
{code: "pam", size: 125000}
{code: "fur", size: 125000}
{code: "kv", size: 124000}
{code: "bpy", size: 124000}
{code: "ug", size: 124000}
{code: "stq", size: 122000}
{code: "sd", size: 122000}
{code: "mt", size: 122000}
{code: "min", size: 119000}
{code: "nrm", size: 116000}
{code: "lad", size: 111000}
{code: "lij", size: 109000}
{code: "cdo", size: 109000}
{code: "gom", size: 108000}
{code: "co", size: 106000}
{code: "dv", size: 106000}
{code: "bug", size: 104000}
{code: "kw", size: 104000}
{code: "szl", size: 103000}
{code: "jbo", size: 101000}
{code: "cbk-zam", size: 101000}
{code: "ln", size: 98000}
{code: "vep", size: 97000}
{code: "mai", size: 96000}
{code: "ab", size: 95000}
{code: "se", size: 94000}
{code: "sc", size: 94000}
{code: "pcd", size: 92000}
{code: "ext", size: 91000}
{code: "st", size: 91000}
{code: "sn", size: 90000}
{code: "ay", size: 90000}
{code: "kab", size: 89000}
{code: "rw", size: 88000}
{code: "arc", size: 87000}
{code: "bjn", size: 86000}
{code: "xal", size: 85000}
{code: "kaa", size: 84000}
{code: "zu", size: 82000}
{code: "mi", size: 82000}
{code: "lbe", size: 81000}
{code: "ie", size: 81000}
{code: "ha", size: 80000}
{code: "pdc", size: 80000}
{code: "mwl", size: 80000}
{code: "om", size: 78000}
{code: "kbd", size: 76000}
{code: "pap", size: 74000}
{code: "mrj", size: 73000}
{code: "nov", size: 73000}
{code: "nv", size: 73000}
{code: "nso", size: 72000}
{code: "zea", size: 72000}
{code: "koi", size: 72000}
{code: "cu", size: 67000}
{code: "roa-tara", size: 67000}
{code: "kl", size: 62000}
{code: "pi", size: 62000}
{code: "rm", size: 61000}
{code: "iu", size: 61000}
{code: "pih", size: 60000}
{code: "pag", size: 60000}
{code: "bi", size: 60000}
{code: "rmy", size: 59000}
{code: "na", size: 57000}
{code: "chr", size: 57000}
{code: "wo", size: 56000}
{code: "tet", size: 56000}
{code: "mdf", size: 54000}
{code: "sm", size: 53000}
{code: "tpi", size: 52000}
{code: "haw", size: 51000}
{code: "ny", size: 50000}
{code: "roa-rup", size: 49000}
{code: "fj", size: 49000}
{code: "ki", size: 48000}
{code: "za", size: 48000}
{code: "pnt", size: 48000}
{code: "tn", size: 47000}
{code: "kg", size: 47000}
{code: "xh", size: 47000}
{code: "glk", size: 47000}
{code: "to", size: 46000}
{code: "chy", size: 46000}
{code: "ff", size: 46000}
{code: "sg", size: 45000}
{code: "ik", size: 44000}
{code: "ts", size: 44000}
{code: "got", size: 43000}
{code: "bm", size: 43000}
{code: "ss", size: 43000}
{code: "tw", size: 42000}
{code: "ti", size: 42000}
{code: "ak", size: 41000}
{code: "ch", size: 41000}
{code: "tum", size: 41000}
{code: "ks", size: 39000}
{code: "srn", size: 39000}
{code: "ltg", size: 38000}
{code: "lg", size: 38000}
{code: "mo", size: 38000}
{code: "rn", size: 37000}
{code: "ee", size: 37000}
{code: "dz", size: 36000}
{code: "ve", size: 36000}
{code: "ty", size: 35000}
{code: "cr", size: 28000}
{code: "aa", size: 19000}
{code: "ng", size: 15000}
{code: "kr", size: 14000}
{code: "cho", size: 13000}
{code: "mh", size: 11000}
{code: "hz", size: 11000}
{code: "mus", size: 9600}
{code: "ho", size: 8600}
{code: "kj", size: 7700}
{code: "ii", size: 6800}

I would also like to see us roll out both lazy loading images and references for all small wikis, with the exception of Bengali (bnwiki), so as to avoid interfering with the fact we've already enabled lazy loaded images there.

I'm keen for us to get feedback on scale and to get a clearer idea what impact both together have.

To avoid a crazy config we'd enable on 'wikipedia' and then list all the larger wikis as a blacklist.
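
A minimal sketch of that default-plus-blacklist idea, written in Python purely for illustration. The setting name, key names, and the `is_enabled` helper are hypothetical and do not mirror the real wiki configuration files.

```python
# Hypothetical setting: a family-wide default plus explicit per-wiki overrides.
LAZY_LOAD_IMAGES = {
    "wikipedia": True,  # default for every Wikipedia
    "enwiki": False,    # larger wikis opted out explicitly (the "blacklist")
    "eswiki": False,
    "jawiki": False,
}


def is_enabled(setting, dbname, family="wikipedia"):
    """A per-wiki override wins; otherwise fall back to the family default."""
    if dbname in setting:
        return setting[dbname]
    return setting.get(family, False)


for wiki in ("enwiki", "tlwiki", "fawiki"):
    print(wiki, is_enabled(LAZY_LOAD_IMAGES, wiki))
```

The upside of this shape is that newly created small wikis pick up the default without a config change; the cost is that every large wiki has to be listed by hand.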

Any objections?

Jdlrobson renamed this task from Deploy lazy loaded images to fa.m.wikipedia.org and uk.m.wikipedia.org to Deploy lazy loaded images to medium sized wikis (including fa.m.wikipedia.org and uk.m.wikipedia.org). May 31 2016, 11:38 PM

We talked about this and agreed on the current description. Ship it! :)

dr0ptp4kt renamed this task from Deploy lazy loaded images to medium sized wikis (including fa.m.wikipedia.org and uk.m.wikipedia.org) to Deploy lazy loading to a few more wikis. Jun 5 2016, 7:57 PM
dr0ptp4kt updated the task description.

Another query for our future selves. Although we'll want to analyze image bytes per in-scope pageview (my initial look a month or so ago suggested this was possible in Hive), for the purpose of analyzing loading time, the following tries to confine the analysis so that we can look at the quantiles for several measured fields on like data sets. In this case, this was looking at bnwiki, where lazy loading images on mobile web has been enabled for a while.

The short of it is that, after attempting to factor out lag (roughLoadTimeInitialLagExcluded), the new image loading strategy appeared to be slightly faster at the median. This compares two distinct two-week periods in which the events clearly had no overlapping implementations and which cover roughly the same part of the calendar month.

Pre: 20160419-20160502
Post: 20160517-20160530

This seemed to be the case on HTTP/2 and non-HTTP/2, which had different characteristics in the speed distribution curve (more analysis needed).

Note that the data on this smaller wiki was fairly sparse: only several hundred rows qualified for analysis. The medium sized wikis will have a greater volume of pageviews, which should aid data analysis. Holidays may also have some macro pull on the figures, but we'll see.

select
left(timestamp,8) as day,
webHost,
event_originCountry,
event_lazyLoadImages,
event_firstImage,
event_isHttp2,
event_mediaWikiVersion,
event_requestStart,
event_responseStart,
event_responseEnd,
event_firstPaint,
event_domInteractive,
event_domComplete,
event_loadEventStart,
event_loadEventEnd,
event_responseEnd-event_responseStart as roughNetworkTimeInitialLagExcluded,
event_loadEventEnd-event_responseStart as roughLoadTimeInitialLagExcluded
from NavigationTiming_15485142
where
timestamp > '20160413'
and timestamp < '20160602'
and event_action = 'view'
and event_isAnon = true
and event_mobileMode = 'stable'
and event_namespaceId = 0
and event_redirectCount is null
and event_responseStart is not null
and event_loadEventEnd is not null
and wiki = 'bnwiki';
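
For reference, a minimal sketch of the pre/post comparison described above, assuming the query output has been loaded as a list of dicts keyed by the column names. The rows below are made-up placeholders, not results, and `period_median` is a hypothetical helper.

```python
from statistics import median

# Two-week windows from the comment above (inclusive, yyyymmdd strings).
PRE = ("20160419", "20160502")
POST = ("20160517", "20160530")


def rough_load_time(row):
    """Same derivation as roughLoadTimeInitialLagExcluded in the query."""
    return row["event_loadEventEnd"] - row["event_responseStart"]


def period_median(rows, start, end):
    """Median rough load time for rows whose day falls inside [start, end]."""
    values = [rough_load_time(r) for r in rows if start <= r["day"] <= end]
    return median(values) if values else None


# Hypothetical rows for illustration only; NOT real NavigationTiming data.
rows = [
    {"day": "20160420", "event_responseStart": 900, "event_loadEventEnd": 9400},
    {"day": "20160425", "event_responseStart": 1200, "event_loadEventEnd": 11200},
    {"day": "20160501", "event_responseStart": 700, "event_loadEventEnd": 8900},
    {"day": "20160510", "event_responseStart": 800, "event_loadEventEnd": 9000},  # in neither window
    {"day": "20160518", "event_responseStart": 1000, "event_loadEventEnd": 8600},
    {"day": "20160522", "event_responseStart": 1100, "event_loadEventEnd": 10100},
    {"day": "20160529", "event_responseStart": 600, "event_loadEventEnd": 7900},
]

print("pre median:", period_median(rows, *PRE))
print("post median:", period_median(rows, *POST))
```

Splitting the rows on event_isHttp2 before calling `period_median` would give the HTTP/2 and non-HTTP/2 comparison mentioned above.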
dr0ptp4kt renamed this task from Deploy lazy loading to a few more wikis to Deploy lazy loading to a couple more wikis. Jun 6 2016, 4:23 PM
dr0ptp4kt updated the task description.
dr0ptp4kt set the point value for this task to 1. Jun 6 2016, 4:26 PM
dr0ptp4kt renamed this task from Deploy lazy loading to a couple more wikis to Deploy lazy loaded images to a couple more wikis. Jun 6 2016, 9:01 PM

@dr0ptp4kt you have made some changes but not documented the why, which makes it confusing for me to follow the narrative here. Why are we no longer shipping to tl.m.wikipedia.org?

(Also why the hold off to sprint Q?)

Change 294247 had a related patch set uploaded (by Jdlrobson):
Enable lazy loaded images on Ukranian and Farsi Wikipedias

https://gerrit.wikimedia.org/r/294247

Thanks for asking, @Jdlrobson. Sprint 74 ran out of points. But this task is at the top of the queue (#2, although I'll drag it to #1). tl.m.wikipedia.org depends on the reference patch, so I decoupled it from this task, which doesn't have the dependency.

dr0ptp4kt renamed this task from Deploy lazy loaded images to a couple more wikis to Deploy lazy loaded images to a few more wikis. Jun 14 2016, 4:37 PM
dr0ptp4kt updated the task description.

Note the idea was to turn on lazy loaded references and images together, so I have made an update to this card (email to be sent out).

Change 294247 merged by jenkins-bot:
Enable lazy loaded images on Ukranian and Farsi Wikipedias

https://gerrit.wikimedia.org/r/294247

Verified deployment, found T138473 in the process...

Closing this one off.