
ECONNREFUSED error when running Selenium tests on M1 Mac
Open, Needs Triage · Public · BUG REPORT

Description

What happens?:

I used what I believe to be the latest fresh:

# fresh: 22.05.1
# image: docker-registry.wikimedia.org/releng/node14-test-browser:0.0.2-s4
# software: Debian GNU/Linux 11 (bullseye)
#           Node.js v14.17.5 (npm 7.21.0)
#           Chromium 97.0.4692.99
#           Mozilla Firefox 91.5.0esr
#           JSDuck 5.3.4 (Ruby 2.7.4) ruby 2.7.4p191
# mount: /mediawiki      ➟ /Users/montehurd/mediawiki-test/mediawiki      (read-write)
#        /mediawiki/.git ➟ /Users/montehurd/mediawiki-test/mediawiki/.git (read-only)

I followed mediawiki setup instructions here:

https://gerrit.wikimedia.org/g/mediawiki/core/+/HEAD/DEVELOPERS.md

Running npm run selenium-test, I see the following:

nobody@docker-desktop:/mediawiki$ npm run selenium-test

> selenium-test
> wdio ./tests/selenium/wdio.conf.js


Execution of 5 workers started at 2022-05-20T22:15:19.540Z

[0-2] RUNNING in chrome - /tests/selenium/specs/user.js
[0-1] RUNNING in chrome - /tests/selenium/specs/recentchanges.js
[0-0] RUNNING in chrome - /tests/selenium/specs/page.js
[0-3] RUNNING in chrome - /tests/selenium/specs/watchlist.js
[0-0] 2022-05-20T22:15:59.605Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:59207
[0-0]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-3] 2022-05-20T22:15:59.625Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:56115
[0-3]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-2] 2022-05-20T22:15:59.624Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:56655
[0-2]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-1] 2022-05-20T22:15:59.639Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:60141
[0-1]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-0] RETRYING in chrome - /tests/selenium/specs/page.js
[0-1] RETRYING in chrome - /tests/selenium/specs/recentchanges.js
[0-2] RETRYING in chrome - /tests/selenium/specs/user.js
[0-3] RETRYING in chrome - /tests/selenium/specs/watchlist.js
[0-0] RUNNING in chrome - /tests/selenium/specs/page.js
[0-2] RUNNING in chrome - /tests/selenium/specs/user.js
[0-3] RUNNING in chrome - /tests/selenium/specs/watchlist.js
[0-1] RUNNING in chrome - /tests/selenium/specs/recentchanges.js
[0-0] 2022-05-20T22:16:38.766Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:58619
[0-0]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-2] 2022-05-20T22:16:38.914Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:56295
[0-2]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-3] 2022-05-20T22:16:38.918Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:55895
[0-3]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-0] FAILED in chrome - /tests/selenium/specs/page.js (1 retries)
[0-1] 2022-05-20T22:16:39.065Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:58169
[0-1]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-2] FAILED in chrome - /tests/selenium/specs/user.js (1 retries)
[0-3] FAILED in chrome - /tests/selenium/specs/watchlist.js (1 retries)
[0-1] FAILED in chrome - /tests/selenium/specs/recentchanges.js (1 retries)
[0-4] RUNNING in chrome - /tests/selenium/wdio-mediawiki/specs/BlankPage.js
[0-4] 2022-05-20T22:17:10.637Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:58063
[0-4]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-4] RETRYING in chrome - /tests/selenium/wdio-mediawiki/specs/BlankPage.js
[0-4] RUNNING in chrome - /tests/selenium/wdio-mediawiki/specs/BlankPage.js
[0-4] 2022-05-20T22:17:42.220Z ERROR @wdio/runner: Error: connect ECONNREFUSED 127.0.0.1:57439
[0-4]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
[0-4] FAILED in chrome - /tests/selenium/wdio-mediawiki/specs/BlankPage.js (1 retries)

Spec Files:	 0 passed, 5 retries, 5 failed, 5 total (100% completed) in 00:02:22

What should have happened instead?:

@zeljkofilipin did the same steps on his Intel Mac and everything worked fine.

I suspect there's some chromium configuration which isn't playing nice with arm64 hosts. If I understand correctly, since it doesn't look like we're building a fresh image for arm64, the amd64 image is used. My inclination was to see about building a fresh image for arm64, but I haven't tracked down specifics for how to go about this with fresh.

I know headless chrome can play nice on M1 from using browserless-chrome, which has an arm64 build:

https://hub.docker.com/r/browserless/chrome/tags

I'm using it elsewhere (though I do have an unrelated bug using it). Here's some info on what they use spinning up headless chrome:

https://docs.browserless.io/blog/2018/06/04/puppeteer-best-practices.html#7-use-docker-to-contain-it-all

Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc.:

  • MacOS Monterey 12.3.1
  • Apple M1
  • Docker Desktop 4.8.2, Engine 20.10.14, Compose 2.5.1

Any help is appreciated. I may be missing something glaringly obvious...

Event Timeline


@Mhurd Are you using the proprietary "Docker Desktop for Mac" app to host your Linux VM in which the containers run, or are you using a different docker-machine server? If the former, can you confirm that you've installed anew (not carried over from an OS upgrade) the Apple Silicon version from https://docs.docker.com/desktop/mac/install/, including the rosetta command which installs the transparent layer for running Intel applications?

From their manual (link): Another thing that might work is to run export DOCKER_DEFAULT_PLATFORM=linux/amd64 on your outer shell before the fresh-node command. If that works, we can add that to Fresh. For reasons I don't understand, upstream Docker is refusing to let macOS engage Rosetta unless you specifically set this.
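A minimal sketch of that suggestion, assuming a Bourne-compatible outer shell on the Mac. Only the variable name and value come from the comment above; the `env | grep` check is an added sanity step, and how Fresh is started afterwards is left as a comment.

```shell
# Ask the Docker CLI to default to amd64 images, even on an arm64 host.
export DOCKER_DEFAULT_PLATFORM=linux/amd64

# Confirm the variable is exported to child processes before starting Fresh.
env | grep '^DOCKER_DEFAULT_PLATFORM='

# Then start Fresh from this same shell (e.g. run fresh-node as usual),
# so that the docker command it invokes inherits the variable.
```

The export only affects the current shell session and its children, so it has to be repeated (or added to a shell profile) for new terminals.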

Hi, I've been using an M1 (arm64) Mac for a while, and the --platform linux/amd64 workaround almost never works (at least for me). I think we need arm64 builds. I've rebuilt my projects that use Chrome and Firefox to produce both amd64 and arm64 containers.

In a nutshell, Docker Desktop for Mac uses Qemu to run the Linux VM, which, either inherently or because of the specific way Docker uses it, makes inotify not work under emulation. Chromium requires inotify during startup and thus refuses to start. This isn't configurable as far as I know, and it affects all uses of Chromium inside Docker on Apple M1, so long as the container is run under emulation. There are dozens of upstream bug reports and other projects affected by this as well. In addition to that, Chromium also makes use of zygote, which causes Qemu to crash.

See also:

In general, Chromium works on Apple M1 when run directly by macOS (transparently via Rosetta), e.g. when using Google Chrome as desktop mac app. It fails only when run with the indirection of Qemu/Linux/Docker.
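One way to check whether a given image would run under emulation is to compare the host architecture with the architecture recorded in the image. The following is a rough sketch, not part of Fresh: `to_docker_arch` is a hypothetical helper, the image name is the one from the task description, and the Docker check is skipped when Docker is not installed or the image has not been pulled.

```shell
# Hypothetical helper: map `uname -m` output to Docker's architecture names.
to_docker_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Image name taken from the Fresh banner in the task description.
IMAGE="docker-registry.wikimedia.org/releng/node14-test-browser:0.0.2-s4"

host_arch="$(to_docker_arch "$(uname -m)")"
echo "host architecture: $host_arch"

# Only query Docker if the CLI is available; an image that is not present
# locally yields an empty result and is silently skipped.
if command -v docker >/dev/null 2>&1; then
  image_arch="$(docker image inspect --format '{{.Architecture}}' "$IMAGE" 2>/dev/null)"
  if [ -n "$image_arch" ] && [ "$image_arch" != "$host_arch" ]; then
    echo "image is $image_arch: it will run under emulation on this host"
  fi
fi
```

Note that a terminal that is itself running under Rosetta may report `x86_64` from `uname -m`, so the result is only a hint.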

I know headless chrome can play nice on M1 from using browserless-chrome, which has an arm64 build:

Chromium could be compiled directly for ARM-type chips (such as Apple M1) and would then run fine within Qemu/Docker. However, as far as I know there are not yet any such official distributions by Google. It is my understanding that browserless/chrome works on M1 by actually using the Microsoft Edge distribution of Chromium (from playwright instead of puppeteer). Unlike Google, Microsoft does support ARM-type processors. Note that Apple M1 is not the first chipset to use ARM-type processors; there are plenty of PC laptops with ARM-type chips instead of x86-type chips like those from Intel and AMD.

My inclination was to see about building a fresh image for arm64 […]

Fresh is mostly just an idea that, concretely, is merely a 10-line shell alias to run docker run --rm --interactive --tty --entrypoint /bin/bash docker-registry.wikimedia.org/releng/node12-test-browser. Its main purpose is to resemble and reflect CI and does so by literally using the same container, and CI in turn intends to generally share the base images and packages with production.
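To make the "shell alias around docker run" idea concrete, here is a rough sketch that only prints the kind of command such a wrapper might run. It is not Fresh's actual source: `fresh_sketch_cmd` is a hypothetical name, the mount layout mirrors the banner in the task description (working directory read-write, `.git` read-only), and the default image is the one reported there.

```shell
# Hypothetical helper (not Fresh's real implementation): print the docker
# command a Fresh-like wrapper might run for the current directory.
fresh_sketch_cmd() {
  image="${1:-docker-registry.wikimedia.org/releng/node14-test-browser:0.0.2-s4}"
  name="$(basename "$PWD")"
  printf 'docker run --rm --interactive --tty'
  # Mount the working directory read-write, and its .git read-only.
  printf ' --volume %s:/%s' "$PWD" "$name"
  printf ' --volume %s/.git:/%s/.git:ro' "$PWD" "$name"
  printf ' --workdir /%s' "$name"
  printf ' --entrypoint /bin/bash %s\n' "$image"
}

# Dry run: show the command instead of executing it.
fresh_sketch_cmd
```

Printing rather than executing keeps the sketch safe to run anywhere; the output is for reading only, since paths containing spaces are not quoted.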

To create a native arm64 Linux base image, we'd need a lot more than Chromium.

See also:

@Krinkle Thanks for the great feedback!

Are you using the proprietary "Docker Desktop for Mac"

Yes

can you confirm that you've installed anew (not carried over from OS upgrade) the Apple Silicon version from https://docs.docker.com/desktop/mac/install/, including the rosetta command

Yes, but just to be extra sure I re-installed Docker Desktop and re-ran the Rosetta command

To create a native arm64 Linux base image, we'd need a lot more than Chromium.

I see, and that makes sense.


It may be helpful to explain what I'm trying to do...

I have a repo here with a makefile which lets you spin up mediawiki from scratch with basically a single command. Keep in mind I'm still a relative novice with Docker, but so far this has been a really fun learning experience.

My next goal was to add make commands for running tests, which was easy for parser and unit tests ( make runparsertests, make runphpunittests... forgive the all lower case, will tweak this in the future... ) They worked as expected.

So next I wanted to tackle selenium tests. Because I couldn't seem to get the tests to work with fresh, I tried using a browserless chrome container, as seen in my WIP selenium branch. Even though the tests begin, and can actually be seen running in a chrome window served up by the browserless chrome container, they get stuck for some reason after a little bit.

My suspicion is the problem is related to how I'm configuring browserless, and it's maybe spawning too many sessions, so when a test causes a new page to be loaded, browserless isn't just re-using the session in the same way it would if you were running the tests outside of a dockerized environment.

While debugging this, we thought it would be instructive to circle back and see whether the tests hang in the *same way* on my machine when using fresh, or whether they actually work and I had just been using fresh incorrectly to run the selenium tests. But they seem to hang even sooner when run via fresh...

That was a lot of text :)

I'm still digesting your second comment...

One thing I did notice in fresh which I wasn't sure how to interpret was...

nobody@docker-desktop:/mediawiki$ chromium --version
Error: Can't open display: 
nobody@docker-desktop:/mediawiki$ /usr/lib/chromium/chromium --version
Chromium 97.0.4692.99 
nobody@docker-desktop:/mediawiki$

Edit: perhaps it's related to this:

And to avoid running into zombie processes (which commonly happen with Chrome), you'll want to use something like dumb-init to properly start-up:

ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64 /usr/local/bin/dumb-init
RUN chmod +x /usr/local/bin/dumb-init

I'm pretty out of my depth here haha.
