Today I got everything I need to run Android tests on Bitbar, so we can get a feeling for how much work it is to set up and what kind of metrics we get. I'm gonna use this task to document the setup.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Peter | T274227 Try out running tests on Bitbar
Resolved | | Peter | T277523 Investigate difference in TTFB on Android Chrome
Resolved | | Peter | T278412 Test running WebPageReplay
Event Timeline
I've been testing the simple cloud version. The idea is that you upload a bash script that does your testing. I've tested with a super simple version:
#!/bin/bash
adb --version
node --version
npm install browsertime -g
echo "Start tests ..."
browsertime --android -n 1 https://en.m.wikipedia.org/wiki/Barack_Obama
It works fine, but the Chrome versions do not match ("This version of ChromeDriver only supports Chrome version 88. Current browser version is 83.0.4103.106").
I think for us to be able to really evaluate this we need a dedicated instance + network throttling. I'll ask about that.
One more thing to test is whether we can reach our Graphite/S3 directly from the machine running the tests; that would make the setup much easier. Otherwise we need to use the API, download the result zip file and push the data ourselves.
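If we end up pushing the data ourselves, the Graphite side could look roughly like the sketch below. It uses Graphite's plaintext protocol (one `path value timestamp` line per metric, by default on TCP port 2003). The metric path and the host name are made-up examples, not our real keys, and the Bitbar API download step is left out entirely.

```python
import socket
import time


def format_graphite_line(path, value, timestamp=None):
    """Build one line of Graphite's plaintext protocol: path, value, epoch seconds."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(host, lines, port=2003, timeout=10):
    """Send already formatted lines to a Graphite plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path and an example timestamp, for illustration only.
    line = format_graphite_line("bitbar.test.ttfb.median", 569, 1613035428)
    print(line, end="")
    # send_to_graphite("graphite.example.org", [line])  # assumed host, not called here
```

Keeping the formatting separate from the socket call makes it easy to check the lines before anything is sent.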
I've got help to run on the latest Chrome and a Moto G5 device with throttled 3g. We can also send metrics to Graphite/S3, so I will try that tonight/tomorrow.
For the first run, the wifi doesn't look so good (I think others also use it). Check out the change in TTFB:
[2021-02-11 09:19:49] INFO: Run tests on Moto G (5) [ZY322RX6D6] using Android version 7.0
[2021-02-11 09:19:49] INFO: Running tests using Chrome - 11 iteration(s)
[2021-02-11 09:19:57] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 1
[2021-02-11 09:20:11] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 1.06s DOMContentLoaded: 4.02s firstPaint: 4.19s FCP: 4.19s LCP: 4.19s Load: 5.76s TBT: 710ms CLS:0.0681
[2021-02-11 09:20:18] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 2
[2021-02-11 09:20:33] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 594ms DOMContentLoaded: 4.32s firstPaint: 4.46s FCP: 4.46s LCP: 4.46s Load: 6.64s TBT: 669ms CLS:0.0681
[2021-02-11 09:20:41] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 3
[2021-02-11 09:20:57] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 1.04s DOMContentLoaded: 5.30s firstPaint: 5.17s FCP: 5.17s LCP: 5.17s Load: 8.77s TBT: 669ms CLS:0.0681
[2021-02-11 09:21:05] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 4
[2021-02-11 09:21:18] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 562ms DOMContentLoaded: 4.10s firstPaint: 3.79s FCP: 3.79s LCP: 3.79s Load: 5.74s TBT: 701ms CLS:0.0681
[2021-02-11 09:21:26] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 5
[2021-02-11 09:21:39] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 554ms DOMContentLoaded: 3.96s firstPaint: 3.83s FCP: 3.83s LCP: 3.83s Load: 5.75s TBT: 704ms CLS:0.0681
[2021-02-11 09:21:47] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 6
[2021-02-11 09:22:00] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 569ms DOMContentLoaded: 3.98s firstPaint: 3.86s FCP: 3.86s LCP: 3.86s Load: 5.73s TBT: 695ms CLS:0.0681
[2021-02-11 09:22:08] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 7
[2021-02-11 09:22:21] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 532ms DOMContentLoaded: 4.07s firstPaint: 3.95s FCP: 3.95s LCP: 3.95s Load: 5.57s TBT: 716ms CLS:0.0681
[2021-02-11 09:22:29] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 8
[2021-02-11 09:22:42] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 558ms DOMContentLoaded: 3.90s firstPaint: 3.75s FCP: 3.75s LCP: 3.75s Load: 5.56s TBT: 709ms CLS:0.0681
[2021-02-11 09:22:49] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 9
[2021-02-11 09:23:06] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 741ms DOMContentLoaded: 5.37s firstPaint: 5.20s FCP: 5.20s LCP: 5.20s Load: 8.10s TBT: 681ms CLS:0.0681
[2021-02-11 09:23:13] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 10
[2021-02-11 09:23:27] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 745ms DOMContentLoaded: 3.93s firstPaint: 3.79s FCP: 3.79s LCP: 3.79s Load: 5.71s TBT: 707ms CLS:0.0681
[2021-02-11 09:23:34] INFO: Testing url https://en.m.wikipedia.org/wiki/Barack_Obama iteration 11
[2021-02-11 09:23:48] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 565ms DOMContentLoaded: 4.12s firstPaint: 3.89s FCP: 3.89s LCP: 3.89s Load: 5.76s TBT: 690ms CLS:0.0681
[2021-02-11 09:23:48] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama 28 requests, TTFB: 684ms (±56.46ms), firstPaint: 4.17s (±156.38ms), FCP: 4.17s (±156.38ms), DOMContentLoaded: 4.28s (±153.39ms), LCP: 4.17s (±156.38ms), CLS: 0.0681 (±0.00), TBT: 696ms (±4.72ms), Load: 6.28s (±320.17ms) (11 runs)
[2021-02-11 09:23:48] INFO: Wrote data to browsertime-results/en.m.wikipedia.org-wiki-Barack_Obama/2021-02-11T091948+0000
Done ...
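To make the TTFB spread in that log easier to see, here is a minimal sketch that pulls the per-iteration TTFB values out of the browsertime console output and summarises them. The function names are my own, and the regex assumes the result lines keep the `TTFB: … DOMContentLoaded:` order shown above; it is not something browsertime provides. The sample only contains the first three result lines from the log.

```python
import re
import statistics

# Matches only per-iteration result lines. The final summary line reads
# "TTFB: 684ms (±56.46ms)," and is therefore skipped.
TTFB_PATTERN = re.compile(r"TTFB: ([\d.]+)(ms|s) DOMContentLoaded")


def parse_ttfb_ms(log_text):
    """Return the per-iteration TTFB values found in the log, in milliseconds."""
    values = []
    for number, unit in TTFB_PATTERN.findall(log_text):
        factor = 1000 if unit == "s" else 1
        values.append(round(float(number) * factor))
    return values


def summarise(values):
    """Return min/median/mean/max for a list of timings."""
    return {
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


# First three result lines, copied from the log above.
SAMPLE_LOG = """\
[2021-02-11 09:20:11] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 1.06s DOMContentLoaded: 4.02s firstPaint: 4.19s FCP: 4.19s LCP: 4.19s Load: 5.76s TBT: 710ms CLS:0.0681
[2021-02-11 09:20:33] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 594ms DOMContentLoaded: 4.32s firstPaint: 4.46s FCP: 4.46s LCP: 4.46s Load: 6.64s TBT: 669ms CLS:0.0681
[2021-02-11 09:20:57] INFO: https://en.m.wikipedia.org/wiki/Barack_Obama TTFB: 1.04s DOMContentLoaded: 5.30s firstPaint: 5.17s FCP: 5.17s LCP: 5.17s Load: 8.77s TBT: 669ms CLS:0.0681
"""

if __name__ == "__main__":
    print(summarise(parse_ttfb_ms(SAMPLE_LOG)))
```

Run over all 11 iterations in the log, this gives a minimum of 532 ms, a median of 569 ms, a mean of about 684 ms (the same as the summary line) and a maximum of 1060 ms, so a few slow iterations pull the average well above the median.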
There are also other alternatives for setting up a throttled connection. I will talk with Bitbar about it.
That test was using the 3g wifi, the 4g wifi looks better. But there's no RTT set on those, let me check if they can add that.
The only thing now missing is to trigger runs using their API. I'll dig into it.
I've been running tests over the weekend on the 3g and 4g wifi:
Here's the TTFB on the 3g.
And median First Visual Change during the same period:
And 4g TTFB:
And 4g First Visual Change:
Today I switched to using gnirehtet instead of the wifi, and I've set up a 3g and a 4g run. When I first tried 4g I didn't get the same stability as on my own computer. Let's see, I will keep the test running for a couple of days to see what we get.
I've been running gnirehtet for a while now with the following result:
First Visual Change 3g
First Visual Change 4g
TTFB 4g
TTFB 3g
It looks like we get much higher variance between runs on Bitbar than when running the same setup at home with my Mac. I've set up a test to run the same setup every 10 minutes at home to see what kind of numbers we get. If we look at the spread between 11 runs it looks like this for 4g:
Min/Median/Average/Max
That seems too high, and something isn't working as it should. I'm gonna collect more data.
Okay, I've been running the same setup at home for almost 6 hours, running tests every 12 minutes, and it looks like this (using the 4g setup on my Mac):
Min/Median/Average/Max
The TTFB median looks much better and you can also see the variance between runs. It looks like something isn't right at Bitbar.
I added tests so we can compare Bitbar vs Kobiton: I test the static Banksy page, 11 runs each, looking at min/median/max and stdev:
https://grafana.wikimedia.org/d/Tbeh-peWk/test-kobiton?orgId=1&from=now-7d&to=now
I've also enabled tests using gnirehtet against the static Banksy page, to compare with testing over wifi.