
Measure CI overhead using xvfb/ffmpeg vs --headless
Open, Needs Triage · Public · 5 Estimated Story Points · Spike

Description

The WebdriverIO documentation recommends running headless (meaning using the browser's own headless switch). The drawback is that you can't get a video of the run. We use Xvfb as the screen and ffmpeg to record a video.

For some time now, Chromium's --headless mode has used the same code path as non-headless mode (before that, headless had its own separate implementation). That is good, since the tests then exercise the same code as a regular browser. You can also configure Chrome's trace log to continuously take screenshots by adding the trace category disabled-by-default-devtools.screenshot. The user can then drag and drop the trace log into Chrome and see the screenshots.
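As a rough illustration of the two options mentioned above, here is a minimal sketch of a Chrome capability for WebdriverIO that turns on --headless and, optionally, asks ChromeDriver to include the screenshot trace category. The function name chromeCapabilities and its options are made up for this example. It assumes tracing goes through ChromeDriver's performance log (goog:loggingPrefs plus perfLoggingPrefs.traceCategories), which may not be the mechanism we end up using; the trace events would still need to be collected from that log and written to a file before Chrome can load them.

```javascript
// Sketch only: builds a Chrome capability object for a WebdriverIO config.
// chromeCapabilities and its option names are hypothetical helpers.
function chromeCapabilities({ headless = true, traceScreenshots = false } = {}) {
  const args = [];
  if (headless) {
    // Use the browser's own headless switch instead of Xvfb.
    args.push('--headless');
  }

  const capability = {
    browserName: 'chrome',
    'goog:chromeOptions': { args }
  };

  if (traceScreenshots) {
    // Assumption: trace events are delivered via ChromeDriver's performance log.
    capability['goog:loggingPrefs'] = { performance: 'ALL' };
    capability['goog:chromeOptions'].perfLoggingPrefs = {
      traceCategories: 'devtools.timeline,disabled-by-default-devtools.screenshot'
    };
  }

  return capability;
}

console.log(JSON.stringify(chromeCapabilities({ traceScreenshots: true }), null, 2));
```

In a wdio config the returned object would go into the capabilities array.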

Should we follow best practices and also use --headless? Let's try to measure how much overhead ffmpeg/Xvfb adds to the test runs so we can make a decision.

Today we have these graphs for CI.

Acceptance Criteria:

  • Measure and document the run time (will --headless be faster?).
  • For both setups, run with at least 1 and 10 tests running simultaneously.
  • Create a follow-up task where the decision can be documented and we can discuss whether we should use headless or not.

Event Timeline

Change #1207155 had a related patch set uploaded (by Phedenskog; author: Phedenskog):

[mediawiki/core@master] selenium: Test removing ffmpeg overhead

https://gerrit.wikimedia.org/r/1207155

Change #1207196 had a related patch set uploaded (by Phedenskog; author: Phedenskog):

[mediawiki/core@master] selenium: Test many ffmpegs at the same time

https://gerrit.wikimedia.org/r/1207196

Change #1207449 had a related patch set uploaded (by Phedenskog; author: Phedenskog):

[mediawiki/core@master] selenium: Test to see the xvfb/ffmpeg overhead

https://gerrit.wikimedia.org/r/1207449

Restricted Application changed the subtype of this task from "Task" to "Spike". · Thu, Nov 20, 2:52 PM

T410607 needs to be merged first; that will make this task easier to do.

I'm going to do it like this:
I'll prepare one of the bare metal servers we have for synthetic performance tests and install MediaWiki, quickstart and perf on it. Then I'll make sure everything works on that machine. Once T410607 is merged we can stop the performance tests for a short while, run some tests, collect data using perf and see if I can evaluate the numbers we get.

Dumping the log of the errors I get running the tests on the test machine:

[0-0] RUNNING in chrome - file:///tests/selenium/docs/Create_a_simple_test/specs/specialpages.js
[0-0] PASSED in chrome - file:///tests/selenium/docs/Create_a_simple_test/specs/specialpages.js
[0-1] RUNNING in chrome - file:///tests/selenium/docs/Page_object_pattern/specs/login.js
[0-1] Error in "User.should be able to log in without page object"
Error: expect(received).toBe(expected) // Object.is equality

Expected: "User-0.17611062292731283-Iñtërnâtiônàlizætiøn"
Received: null
    at Context.<anonymous> (file:///test/mediawiki/tests/selenium/docs/Page_object_pattern/specs/login.js:36:28)
[0-1] RETRYING in chrome - file:///tests/selenium/docs/Page_object_pattern/specs/login.js
[0-1] RUNNING in chrome - file:///tests/selenium/docs/Page_object_pattern/specs/login.js
[0-1] Error in "User.should be able to log in without page object"
Error: expect(received).toBe(expected) // Object.is equality

Expected: "User-0.1472929106489793-Iñtërnâtiônàlizætiøn"
Received: null
    at Context.<anonymous> (file:///test/mediawiki/tests/selenium/docs/Page_object_pattern/specs/login.js:36:28)
[0-1] FAILED in chrome - file:///tests/selenium/docs/Page_object_pattern/specs/login.js (1 retries)
[0-2] RUNNING in chrome - file:///tests/selenium/docs/Stack/specs/expect.js
[0-2] PASSED in chrome - file:///tests/selenium/docs/Stack/specs/expect.js
[0-3] RUNNING in chrome - file:///tests/selenium/docs/Stack/specs/mocha.js
[0-3] Edit link visible
[0-3] PASSED in chrome - file:///tests/selenium/docs/Stack/specs/mocha.js
[0-4] RUNNING in chrome - file:///tests/selenium/docs/Stack/specs/pageobject.js
[0-4] PASSED in chrome - file:///tests/selenium/docs/Stack/specs/pageobject.js
[0-5] RUNNING in chrome - file:///tests/selenium/docs/Use_MediaWiki_API/specs/api.js
[0-5] PASSED in chrome - file:///tests/selenium/docs/Use_MediaWiki_API/specs/api.js
[0-6] RUNNING in chrome - file:///tests/selenium/specs/page.js
[0-6] Error in "Page.should be previewable @daily"
Error: Cannot submit login form
    at async LoginPage.login (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:36:3)
    at async LoginPage.loginAdmin (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:50:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:33:3)
[0-6] Error in "Page.should be re-creatable"
Error: Can't call setValue on element with selector "#wpTextbox1" because element wasn't found
    at async EditPage.edit (file:///test/mediawiki/tests/selenium/pageobjects/edit.page.js:57:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:62:3)
[0-6] Error in "Page.should be deletable"
Error: Can't call setValue on element with selector "#wpReason input" because element wasn't found
    at async DeletePage.delete (file:///test/mediawiki/tests/selenium/pageobjects/delete.page.js:26:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:98:3)
[0-6] Error in "Page.should be restorable"
Error: Cannot submit login form
    at async LoginPage.login (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:36:3)
    at async LoginPage.loginAdmin (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:50:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:110:3)
[0-6] Error in "Page.should be protectable"
Error: Can't call setValue on element with selector "#mwProtect-reason input" because element wasn't found
    at async ProtectPage.protect (file:///test/mediawiki/tests/selenium/pageobjects/protect.page.js:30:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:126:3)
[0-6] RETRYING in chrome - file:///tests/selenium/specs/page.js
[0-6] RUNNING in chrome - file:///tests/selenium/specs/page.js
[0-6] Error in "Page.should be previewable @daily"
Error: Cannot submit login form
    at async LoginPage.login (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:36:3)
    at async LoginPage.loginAdmin (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:50:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:33:3)
[0-6] Error in "Page.should be creatable"
Error: Cannot submit login form
    at async LoginPage.login (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:36:3)
    at async LoginPage.loginAdmin (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:50:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:45:3)
[0-6] Error in "Page.should be re-creatable"
Error: Can't call setValue on element with selector "#wpTextbox1" because element wasn't found
    at async EditPage.edit (file:///test/mediawiki/tests/selenium/pageobjects/edit.page.js:57:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:62:3)
[0-6] Error in "Page.should be restorable"
Error: Cannot submit login form
    at async LoginPage.login (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:36:3)
    at async LoginPage.loginAdmin (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:50:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:110:3)
[0-6] 2025-12-03T06:37:48.466Z ERROR webdriver: WebDriverError: element not interactable
[0-6]   (Session info: chrome=143.0.7499.40) when running "element/f.90C4A4E3465E4FF6BA5A0DDDDB81B42D.d.7F2E55B43F49EE943AACA939707640AA.e.66/click" with method "POST"
[0-6] Error in "Page.should be protectable"
Error: Cannot submit login form
    at async LoginPage.login (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:36:3)
    at async LoginPage.loginAdmin (file:///test/mediawiki/tests/selenium/wdio-mediawiki/LoginPage.js:50:3)
    at async Context.<anonymous> (file:///test/mediawiki/tests/selenium/specs/page.js:124:3)

OK, down to two types of errors now:

[0-1] Error in "User.should be able to log in without page object"
Error: expect(received).toBe(expected) // Object.is equality

Expected: "User-0.24708116128821733-Iñtërnâtiônàlizætiøn"
Received: "User-0.5685653779285673-Iñtërnâtiônàlizætiøn"

And

[0-6] Error in "Page.should be previewable @daily"
Error: Can't call setValue on element with selector "#wpTextbox1" because element wasn't found

OK, the problem was that I used 127.0.0.1 and not localhost. I wonder why our tests care, though. Let's check later and move on with the measuring.

There's one error left:

[chrome 143.0.7499.40 linux #0-8]
[chrome 143.0.7499.40 linux #0-8] 1) User temporary user should be able to create account
[chrome 143.0.7499.40 linux #0-8] expect(received).toBe(expected) // Object.is equality

Expected: "User-0.7420192976655529-Iñtërnâtiônàlizætiøn"
Received: "~2025-17"

And sometimes:

[0-6] 2025-12-03T13:35:03.495Z ERROR webdriver: WebDriverError: element click intercepted: Element <span class="oo-ui-labelElement-label" id="ooui-1" role="textbox" aria-readonly="true">...</span> is not clickable at point (361, 9). Other element would receive the click: <div class="vector-sticky-header-start">...</div>
[0-6]   (Session info: chrome=143.0.7499.40) when running "element/f.F0F70A2D71995E454DCDED6B8F168142.d.3990B08834F33D0865C8EBB73EBD6ED5.e.1152/click" with method "POST"

Change #1207155 abandoned by Phedenskog:

[mediawiki/core@master] selenium: Test removing ffmpeg overhead

Reason:

This was only for testing

https://gerrit.wikimedia.org/r/1207155

Change #1207196 abandoned by Phedenskog:

[mediawiki/core@master] selenium: Test many ffmpegs at the same time

Reason:

This was only for testing

https://gerrit.wikimedia.org/r/1207196

Change #1207449 abandoned by Phedenskog:

[mediawiki/core@master] selenium: Test to see the xvfb/ffmpeg overhead

Reason:

This was only for testing

https://gerrit.wikimedia.org/r/1207449

Skipping the following tests makes the rest run without any errors:

it.skip( 'should be able to log in without page object',
it.skip( 'temporary user should not see signup form fields relevant to named users'
it.skip( 'temporary user should be able to create account', 
it.skip( 'should be protectable'

I've been testing whether we can reduce the run time of the WebdriverIO tests on CI by running them truly headless and skipping the video. Let's look at the result. But first let's talk about how the test was conducted.

How the tests were done

I've been using one of the bare metal servers we use for synthetic performance testing. I disabled the performance tests on the machine so nothing other than the OS was running. The server has 8 cores running at a max speed of 4 GHz using the performance governor. On the server I installed perf and used it to measure CPU time spent and wall clock time (= how much CPU is used and how fast the tests run). I installed mediawiki-quickstart to have a local version of MediaWiki to test against, and then mediawiki, to run our core tests directly on the machine. A couple of tests were flaky, so I disabled them. It's important that we run the exact same tests and iterations when we compare.

I started with ten iterations but ended up running 50. The command looked like this: perf stat -r 50 -- npm run selenium-test
I then modified the configuration between the runs.
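To compare runs without copying numbers by hand, the two figures used in the summaries (task-clock and elapsed time) can be pulled out of the perf stat output. The following is a minimal sketch, not something that was used for this task: parsePerfStat and the field names are made up for the example, and it only handles the line formats shown in the perf output in this comment.

```javascript
// Sketch only: extract CPU time and wall clock time from `perf stat -r N` output.
// parsePerfStat and the returned field names are hypothetical.
function parsePerfStat(output) {
  // perf prints thousands separators, e.g. "59,398.42 msec task-clock:u".
  const toNumber = (text) => Number(text.replace(/,/g, ''));

  const taskClock = output.match(/([\d,.]+)\s+msec\s+task-clock/);
  const elapsed = output.match(/([\d.]+)\s+\+-\s+([\d.]+)\s+seconds time elapsed/);
  if (!taskClock || !elapsed) {
    throw new Error('Unrecognized perf stat output');
  }

  return {
    cpuMs: toNumber(taskClock[1]), // CPU time summed over all cores
    wallSeconds: Number(elapsed[1]), // mean elapsed time per run
    wallPlusMinus: Number(elapsed[2]) // the "+-" value perf reports next to it
  };
}

const sample = [
  '         59,398.42 msec task-clock:u              #    1.231 CPUs utilized            ( +-  0.10% )',
  '            48.267 +- 0.158 seconds time elapsed  ( +-  0.33% )'
].join('\n');

console.log(parsePerfStat(sample));
```

Note that perf stat writes its summary to stderr, so the output has to be captured from there.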

I measured XVFB/FFmpeg vs headless with different maxInstances settings (how many tests run at the same time).

XVFB/FFMpeg vs --headless 1 maxInstance

Today in core we run with maxInstances set to 1, since we had problems with multiple instances after the upgrade to wdio 9. However, we want to run more tests at the same time.

XVFB/FFmpeg 1 instance

Performance counter stats for 'npm run selenium-test' (50 runs):

         59,398.42 msec task-clock:u              #    1.231 CPUs utilized            ( +-  0.10% )
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
         1,855,105      page-faults:u             #   31.135 K/sec                    ( +-  0.31% )
   160,981,271,442      cycles:u                  #    2.702 GHz                      ( +-  0.07% )
   176,175,911,664      instructions:u            #    1.09  insn per cycle           ( +-  0.08% )
    32,762,818,402      branches:u                #  549.870 M/sec                    ( +-  0.07% )
       994,288,240      branch-misses:u           #    3.03% of all branches          ( +-  0.07% )

            48.267 +- 0.158 seconds time elapsed  ( +-  0.33% )

--headless 1 instance

Performance counter stats for 'npm run selenium-test' (50 runs):

      53,028.66 msec task-clock:u              #    1.172 CPUs utilized            ( +-  0.06% )
              0      context-switches:u        #    0.000 /sec
              0      cpu-migrations:u          #    0.000 /sec
      1,589,355      page-faults:u             #   29.858 K/sec                    ( +-  0.05% )
145,707,526,357      cycles:u                  #    2.737 GHz                      ( +-  0.07% )
154,862,977,084      instructions:u            #    1.06  insn per cycle           ( +-  0.06% )
 30,445,235,098      branches:u                #  571.942 M/sec                    ( +-  0.06% )
    942,474,642      branch-misses:u           #    3.08% of all branches          ( +-  0.06% )

         45.233 +- 0.213 seconds time elapsed  ( +-  0.47% )

Summary

Running one instance at a time the difference is:

  • Headless is ~6% faster (48.3 → 45.2 s)
  • Headless uses ~11% less CPU time
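For reference, the two percentages above follow directly from the perf numbers (elapsed seconds and task-clock msec). A small sketch of the arithmetic, where relativeReduction is just a helper name chosen for this example:

```javascript
// Relative reduction going from `before` to `after`, as a fraction of `before`.
function relativeReduction(before, after) {
  return (before - after) / before;
}

// Numbers copied from the two perf stat outputs above (1 instance).
const wallReduction = relativeReduction(48.267, 45.233); // seconds elapsed
const cpuReduction = relativeReduction(59398.42, 53028.66); // task-clock msec

console.log(`wall clock: ${(wallReduction * 100).toFixed(1)}% less`); // 6.3% less
console.log(`CPU time: ${(cpuReduction * 100).toFixed(1)}% less`); // 10.7% less
```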

XVFB/FFMpeg vs --headless 4 maxInstance

Before we upgraded to wdio 9 we used maxInstances 4. Let's see what the difference would be.

XVFB/FFmpeg 4 instances

Performance counter stats for 'npm run selenium-test' (50 runs):

         70,805.09 msec task-clock:u              #    2.836 CPUs utilized            ( +-  0.07% )
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
         1,863,666      page-faults:u             #   26.431 K/sec                    ( +-  0.10% )
   199,697,640,484      cycles:u                  #    2.832 GHz                      ( +-  0.07% )
   180,696,157,493      instructions:u            #    0.91  insn per cycle           ( +-  0.07% )
    33,047,194,830      branches:u                #  468.693 M/sec                    ( +-  0.06% )
     1,100,075,988      branch-misses:u           #    3.34% of all branches          ( +-  0.08% )

            24.965 +- 0.486 seconds time elapsed  ( +-  1.95% )

--headless 4 instances

Performance counter stats for 'npm run selenium-test' (50 runs):

         62,360.63 msec task-clock:u              #    2.733 CPUs utilized            ( +-  0.07% )
                 0      context-switches:u        #    0.000 /sec
                 0      cpu-migrations:u          #    0.000 /sec
         1,582,559      page-faults:u             #   25.239 K/sec                    ( +-  0.06% )
   177,877,377,628      cycles:u                  #    2.837 GHz                      ( +-  0.08% )
   154,531,564,275      instructions:u            #    0.86  insn per cycle           ( +-  0.06% )
    30,269,405,014      branches:u                #  482.740 M/sec                    ( +-  0.06% )
     1,035,899,837      branch-misses:u           #    3.41% of all branches          ( +-  0.08% )

           22.8169 +- 0.0886 seconds time elapsed  ( +-  0.39% )

Summary

  • Headless is ~8.6% faster
  • Headless uses ~12% less CPU time

That was the raw data. I'm going to update the epic with all the numbers we have (I also measured other changes we've made).

So if we take how we run tests today, with ffmpeg and Xvfb and maxInstances set to 1, and change that to --headless and maxInstances 4, the core tests will be ~2.1× faster at a cost of ~5% more CPU time.
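The combined figure compares the first and the last perf output above: Xvfb/ffmpeg with 1 instance against --headless with 4 instances. A minimal sketch of that calculation (the variable names are only for illustration):

```javascript
// Numbers copied from the perf stat outputs above.
const today = { wallSeconds: 48.267, cpuMs: 59398.42 }; // Xvfb/ffmpeg, maxInstances 1
const proposed = { wallSeconds: 22.8169, cpuMs: 62360.63 }; // --headless, maxInstances 4

// How many times faster the proposed setup finishes.
const speedup = today.wallSeconds / proposed.wallSeconds;
// Extra CPU time spent by the proposed setup, as a fraction of today's.
const extraCpu = proposed.cpuMs / today.cpuMs - 1;

console.log(`speedup: ${speedup.toFixed(1)}x`); // 2.1x
console.log(`extra CPU time: ${(extraCpu * 100).toFixed(1)}%`); // 5.0%
```

The wall clock time drops because four tests run in parallel, while the total CPU time goes up slightly compared to today.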

I added a summary section to T408361 where we can see the effect of the different changes.