Proton fails with Chromium 72.0.3626.96
Closed, Resolved (Public)

Description

Tracking upstream issue: https://github.com/GoogleChrome/puppeteer/issues/4040

Background

I tested the latest Chromium security updates on deployment-chromium01.deployment-prep.eqiad.wmflabs and it fails with 72.0.3626.96-1~deb9u1 (released as https://lists.debian.org/debian-security-announce/2019/msg00036.html).

I've copied the previous version to my home dir (/home/jmm); reverting to that version fixes PDF generation. With the new version, /srv/log/proton/main.log shows:

{"name":"proton","hostname":"deployment-chromium01","pid":14,"level":50,"msg":"Unexpected error: Error: Failed to launch chrome!\n[0219/100722.128782:ERROR:address_tracker_linux.cc(158)] Could not create NETLINK socket: Operation not supported (95)\nFailed to create secure directory (/nonexistent/.config/pulse): No such file or directory\nReceived signal 11 SEGV_MAPERR 000000000080\n#0 0x5631c6f02711 <unknown>\n#1 0x5631c6f02b7b <unknown>\n#2 0x5631c6f031de <unknown>\n#3 0x7f412c3350c0 <unknown>\n#4 0x5631c52ea314 <unknown>\n#5 0x5631c52f51b7 <unknown>\n#6 0x5631ca9dd566 <unknown>\n#7 0x5631ca9dd61a [0219/100722.141180:ERROR:udev_linux.cc(21)] Failed to initialize udev, possibly due to an invalid system configuration. Various device-related browser features may be broken.\n<unknown>\n#8 0x5631c5286fd3 <unknown>\n#9 0x5631c5700c32 <unknown>\n#10 0x5631c5289089 <unknown>\n#11 0x5631c528a039 <unknown>\n#12 0x5631cb4af775 <unknown>\n#13 0x5631c69ad257 <unknown>\n#14 0x5631c69ad501 <unknown>\n#15 0x5631c69ad8b0 <unknown>\n#16 0x5631c69b8b7a <unknown>\n#17 0x5631c69ab6c5 <unknown>\n#18 0x5631ca9e32e3 <unknown>\n#19 0x5631ca9e34d4 <unknown>\n#20 0x5631c69b5d49 <unknown>\n#21 0x5631c4653d6d ChromeMain\n#22 0x7f411e75f2b1 __libc_start_main\n#23 0x5631c4653b8a _start\n  r8: 0000000000000001  r9: 0000000000000030 r10: 00007f40ebfff9d0 r11: 0000000000000202\n r12: 00007ffff50ac1f0 r13: 00005631ce187d30 r14: 00007ffff50ac240 r15: 00005631ce1862c0\n  di: 00007ffff50ac1f0  si: 00005631cba9a770  bp: 00007ffff50ac290  bx: 00005631ce186490\n  dx: 00005631c52ea314  ax: 00007ffff50ac1f0  cx: 0000000000000319  sp: 00007ffff50ac190\n  ip: 00005631c52ea314 efl: 0000000000010202 cgf: 002b000000000033 erf: 0000000000000004\n trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000080\n[end of stack trace]\nCalling _exit(1). 
Core file will not be generated.\n\n\nTROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md\n","trace":"Error: Failed to launch chrome!\n[0219/100722.128782:ERROR:address_tracker_linux.cc(158)] Could not create NETLINK socket: Operation not supported (95)\nFailed to create secure directory (/nonexistent/.config/pulse): No such file or directory\nReceived signal 11 SEGV_MAPERR 000000000080\n#0 0x5631c6f02711 <unknown>\n#1 0x5631c6f02b7b <unknown>\n#2 0x5631c6f031de <unknown>\n#3 0x7f412c3350c0 <unknown>\n#4 0x5631c52ea314 <unknown>\n#5 0x5631c52f51b7 <unknown>\n#6 0x5631ca9dd566 <unknown>\n#7 0x5631ca9dd61a [0219/100722.141180:ERROR:udev_linux.cc(21)] Failed to initialize udev, possibly due to an invalid system configuration. Various device-related browser features may be broken.\n<unknown>\n#8 0x5631c5286fd3 <unknown>\n#9 0x5631c5700c32 <unknown>\n#10 0x5631c5289089 <unknown>\n#11 0x5631c528a039 <unknown>\n#12 0x5631cb4af775 <unknown>\n#13 0x5631c69ad257 <unknown>\n#14 0x5631c69ad501 <unknown>\n#15 0x5631c69ad8b0 <unknown>\n#16 0x5631c69b8b7a <unknown>\n#17 0x5631c69ab6c5 <unknown>\n#18 0x5631ca9e32e3 <unknown>\n#19 0x5631ca9e34d4 <unknown>\n#20 0x5631c69b5d49 <unknown>\n#21 0x5631c4653d6d ChromeMain\n#22 0x7f411e75f2b1 __libc_start_main\n#23 0x5631c4653b8a _start\n  r8: 0000000000000001  r9: 0000000000000030 r10: 00007f40ebfff9d0 r11: 0000000000000202\n r12: 00007ffff50ac1f0 r13: 00005631ce187d30 r14: 00007ffff50ac240 r15: 00005631ce1862c0\n  di: 00007ffff50ac1f0  si: 00005631cba9a770  bp: 00007ffff50ac290  bx: 00005631ce186490\n  dx: 00005631c52ea314  ax: 00007ffff50ac1f0  cx: 0000000000000319  sp: 00007ffff50ac190\n  ip: 00005631c52ea314 efl: 0000000000010202 cgf: 002b000000000033 erf: 0000000000000004\n trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000080\n[end of stack trace]\nCalling _exit(1). 
Core file will not be generated.\n\n\nTROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md\n\n    at onClose (/srv/deployment/proton/deploy-cache/revs/ff7c8a2f8a09ae0893e27f2b25df65edff7a9b32/node_modules/puppeteer-core/node6/lib/Launcher.js:417:14)\n    at Interface.helper.addEventListener (/srv/deployment/proton/deploy-cache/revs/ff7c8a2f8a09ae0893e27f2b25df65edff7a9b32/node_modules/puppeteer-core/node6/lib/Launcher.js:406:50)\n    at emitNone (events.js:91:20)\n    at Interface.emit (events.js:185:7)\n    at Interface.close (readline.js:320:8)\n    at Socket.onend (readline.js:109:10)\n    at emitNone (events.js:91:20)\n    at Socket.emit (events.js:185:7)\n    at endReadableNT (_stream_readable.js:974:12)\n    at _combinedTickCallback (internal/process/next_tick.js:80:11)\n    at process._tickCallback (internal/process/next_tick.js:104:9)","levelPath":"error/request","time":"2019-02-19T10:07:22.169Z","v":0}
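For anyone triaging similar failures, the relevant entries can be pulled out of the log with standard tools. This is a minimal sketch, assuming the log holds one JSON object per line in what appears to be bunyan's format (where level 50 means error). The function names are hypothetical and not part of Proton.

```shell
# Count error-level entries (bunyan level 50) in a JSON-lines log file.
# Assumption: each log entry occupies exactly one line.
count_error_entries() {
    grep -cF '"level":50' "$1"
}

# Print each distinct "Received signal ..." fragment found in the log.
# The message is repeated in the "msg" and "trace" fields, hence sort -u.
crash_signals() {
    grep -o 'Received signal [0-9]* [A-Z_]* [0-9a-f]*' "$1" | sort -u
}

# Example invocation (path taken from the report above):
#   count_error_entries /srv/log/proton/main.log
#   crash_signals /srv/log/proton/main.log
```

Against the entry above, the second function would be expected to surface the SEGV_MAPERR line, which is the part worth quoting in an upstream report.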

Reproduce the error

  1. Clone https://github.com/mateusbs17/mediawiki-services-chromium-render/blob/puppeteer_stretch
  2. Run:
cd path/to/proton
docker build ./ --file testing.Dockerfile -t proton
docker run -it -p 3030:3030 proton
  3. If you can see the service startup log, access http://localhost:3030/en.wikipedia.org/v1/pdf/Barack_Obama/letter
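The last step can also be checked from the command line instead of a browser. This is a rough sketch, assuming curl is available on the host; `is_pdf` and `fetch_pdf` are hypothetical helper names. It relies only on the fact that PDF files begin with the bytes `%PDF`.

```shell
# Succeed if the given file starts with the PDF magic bytes "%PDF".
is_pdf() {
    [ "$(head -c 4 "$1")" = "%PDF" ]
}

# Download a URL to a file and succeed only if the result looks like a PDF.
# --fail makes curl return non-zero on HTTP error responses.
fetch_pdf() {
    curl --silent --show-error --fail --output "$2" "$1" && is_pdf "$2"
}

# Example invocation against the container started above:
#   fetch_pdf http://localhost:3030/en.wikipedia.org/v1/pdf/Barack_Obama/letter /tmp/test.pdf \
#       && echo "PDF generated" || echo "PDF generation failed"
```

When Chromium fails to launch, the service would be expected to return an error response rather than a PDF, so the check fails either at the curl step or at the magic-byte test.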

Open Questions

We are currently using puppeteer v1.9.0 and the closest version to match the current debian chromium version is v1.11.0, which is paired with chromium 72.0.3618.0.

  • Is upgrading puppeteer enough, or is the difference between minor versions a problem?
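To make the mismatch concrete: Chromium versions have the form MAJOR.MINOR.BUILD.PATCH, and the two versions above differ in the BUILD component. The sketch below only compares version strings. Treating "same first three components" as a match is an assumption for illustration, not a compatibility rule stated by Puppeteer, and the function names are hypothetical.

```shell
# Print the MAJOR.MINOR.BUILD prefix of a Chromium version string.
# Anything after the third dot (patch level, Debian revision) is dropped.
chromium_build() {
    printf '%s\n' "$1" | cut -d. -f1-3
}

# Succeed if two version strings share the same MAJOR.MINOR.BUILD prefix.
same_build() {
    [ "$(chromium_build "$1")" = "$(chromium_build "$2")" ]
}

# Example with the versions mentioned in this task:
#   chromium_build 72.0.3626.96-1~deb9u1    # Debian package version
#   chromium_build 72.0.3618.0              # build paired with puppeteer v1.11.0
#   same_build 72.0.3626.96-1~deb9u1 72.0.3618.0 || echo "builds differ"
```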

Event Timeline

To exclude Firejail as a source of error, I disabled puppet on deployment-chromium01, removed Firejail from the service unit and restarted proton.service. Same effect, Proton still fails:

{"name":"proton","hostname":"deployment-chromium01","pid":3873,"level":50,"msg":"Unexpected error: Error: Failed to launch chrome!\nFailed to create secure directory (/nonexistent/.config/pulse): No such file or directory\nReceived signal 11 SEGV_MAPERR 000000000080\n#0 0x560dc46cf711 <unknown>\n#1 0x560dc46cfb7b <unknown>\n#2 0x560dc46d01de <unknown>\n#3 0x7f18d1c7a0c0 <unknown>\n#4 0x560dc2ab7314 <unknown>\n#5 0x560dc2ac21b7 <unknown>\n#6 0x560dc81aa566 <unknown>\n#7 0x560dc81aa61a <unknown>\n#8 0x560dc2a53fd3 <unknown>\n#9 0x560dc2ecdc32 <unknown>\n#10 0x560dc2a56089 <unknown>\n#11 0x560dc2a57039 <unknown>\n#12 0x560dc8c7c775 <unknown>\n#13 0x560dc417a257 <unknown>\n#14 0x560dc417a501 <unknown>\n#15 0x560dc417a8b0 <unknown>\n#16 0x560dc4185b7a <unknown>\n#17 0x560dc41786c5 <unknown>\n#18 0x560dc81b02e3 <unknown>\n#19 0x560dc81b04d4 <unknown>\n#20 0x560dc4182d49 <unknown>\n#21 0x560dc1e20d6d ChromeMain\n#22 0x7f18c40a42b1 __libc_start_main\n#23 0x560dc1e20b8a _start\n  r8: 0000000000000001  r9: 0000000000000030 r10: 00007f18a67fc9d0 r11: 0000000000000202\n r12: 00007ffdeb95c970 r13: 0000560dcc8d80e0 r14: 00007ffdeb95c9c0 r15: 0000560dcc8d6670\n  di: 00007ffdeb95c970  si: 0000560dc9267770  bp: 00007ffdeb95ca10  bx: 0000560dcc8d6840\n  dx: 0000560dc2ab7314  ax: 00007ffdeb95c970  cx: 0000000000000319  sp: 00007ffdeb95c910\n  ip: 0000560dc2ab7314 efl: 0000000000010206 cgf: 002b000000000033 erf: 0000000000000004\n trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000080\n[end of stack trace]\nCalling _exit(1). 
Core file will not be generated.\n\n\nTROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md\n","trace":"Error: Failed to launch chrome!\nFailed to create secure directory (/nonexistent/.config/pulse): No such file or directory\nReceived signal 11 SEGV_MAPERR 000000000080\n#0 0x560dc46cf711 <unknown>\n#1 0x560dc46cfb7b <unknown>\n#2 0x560dc46d01de <unknown>\n#3 0x7f18d1c7a0c0 <unknown>\n#4 0x560dc2ab7314 <unknown>\n#5 0x560dc2ac21b7 <unknown>\n#6 0x560dc81aa566 <unknown>\n#7 0x560dc81aa61a <unknown>\n#8 0x560dc2a53fd3 <unknown>\n#9 0x560dc2ecdc32 <unknown>\n#10 0x560dc2a56089 <unknown>\n#11 0x560dc2a57039 <unknown>\n#12
0x560dc8c7c775 <unknown>\n#13 0x560dc417a257 <unknown>\n#14 0x560dc417a501 <unknown>\n#15 0x560dc417a8b0 <unknown>\n#16 0x560dc4185b7a <unknown>\n#17 0x560dc41786c5 <unknown>\n#18 0x560dc81b02e3 <unknown>\n#19 0x560dc81b04d4 <unknown>\n#20 0x560dc4182d49 <unknown>\n#21 0x560dc1e20d6d ChromeMain\n#22 0x7f18c40a42b1 __libc_start_main\n#23 0x560dc1e20b8a _start\n  r8: 0000000000000001  r9: 0000000000000030 r10: 00007f18a67fc9d0 r11: 0000000000000202\n r12: 00007ffdeb95c970 r13: 0000560dcc8d80e0 r14: 00007ffdeb95c9c0 r15: 0000560dcc8d6670\n  di: 00007ffdeb95c970  si: 0000560dc9267770  bp: 00007ffdeb95ca10  bx: 0000560dcc8d6840\n  dx: 0000560dc2ab7314  ax: 00007ffdeb95c970  cx: 0000000000000319  sp: 00007ffdeb95c910\n  ip: 0000560dc2ab7314 efl: 0000000000010206 cgf: 002b000000000033 erf: 0000000000000004\n trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000080\n[end of stack trace]\nCalling _exit(1). Core file will not be generated.\n\n\nTROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md\n\n    at onClose (/srv/deployment/proton/deploy-cache/revs/ff7c8a2f8a09ae0893e27f2b25df65edff7a9b32/node_modules/puppeteer-core/node6/lib/Launcher.js:417:14)\n    at Interface.helper.addEventListener (/srv/deployment/proton/deploy-cache/revs/ff7c8a2f8a09ae0893e27f2b25df65edff7a9b32/node_modules/puppeteer-core/node6/lib/Launcher.js:406:50)\n    at emitNone (events.js:91:20)\n    at Interface.emit (events.js:185:7)\n    at Interface.close (readline.js:320:8)\n    at Socket.onend (readline.js:109:10)\n    at emitNone (events.js:91:20)\n    at Socket.emit (events.js:185:7)\n    at endReadableNT (_stream_readable.js:974:12)\n    at _combinedTickCallback (internal/process/next_tick.js:80:11)\n    at process._tickCallback (internal/process/next_tick.js:104:9)","levelPath":"error/request","time":"2019-02-19T11:04:54.707Z","v":0}
MSantos renamed this task from "Proton fails with Chromium 72" to "Proton fails with Chromium 72.0.3626.96". (Feb 19 2019, 7:04 PM)
MSantos updated the task description.

It looks like Chromium is trying to write some PulseAudio config. @MSantos I'd say to first try to upgrade to the latest puppeteer and deploy that in deployment-prep to see if it helps. If not, we'll have to investigate more.

We still haven't created the herald rule to tag all proton tasks with our backlog, so for now we need the important tasks to be tagged manually with #reading-infrastructure-team-backlog. I'm moving this one to the kanban board directly for someone to pick up.

I tried to run proton (which I could before this update) and couldn't make it work locally with the new Debian version because it doesn't match any puppeteer version. I set up some environments to test puppeteer against the suggested chromium versions and the debian one:

To replicate my tests just clone the repo and run:

cd path/to/proton
docker build ./ --file testing.Dockerfile -t proton
docker run -it -p 3030:3030 proton

If you can see the service startup log, access http://localhost:3030/en.wikipedia.org/v1/pdf/Barack_Obama/letter

Where did you get the 72.0.3618.0 Chromium build from? There's no Debian release of it for stretch, or did you test this with the version from Debian unstable?

Great, this means that we can rule out some Debian-specific build change. I think the next step is to report this to Puppeteer upstream?

[...] I think the next step is to report this to Puppeteer upstream?

Done.

I think we should also discuss the viability of backporting specific versions in case Puppeteer doesn't keep up with Debian updates. That's far from ideal, and it's a nightmare with mapnik in the maps infrastructure, but I don't know yet how it could be done better.

What kind of backports are you referring to? Puppeteer or Chromium? We can't stick with an old version of Chromium, Google's handling of security bugs and the sheer rate of security fixes makes backporting security fixes impossible. That's why all distros only ship the latest version (similar to Chrome itself).

@MoritzMuehlenhoff I understand. The thing is that Puppeteer is unpredictable with different Chromium versions, as the official documentation explains.

We have documented the process for that but it seems that we need to watch chromium versioning closely and maybe pin specific versions.

Puppeteer acts as an indivisible entity with Chromium. Each version of Puppeteer bundles a specific version of Chromium – the only version it is guaranteed to work with. This is not an artificial constraint: A lot of work on Puppeteer is actually taking place in the Chromium repository.

Perhaps we should consider packaging the fixed Chromium version that comes with Puppeteer and having it in our APT repo? Would that be an option?

There was some discussion on this in T213366: [2 hrs] Decide on handling system updates for Proton.

Yeah we could pin a specific Chromium version. That would mean lagging behind in security updates; not sure how bad that is (the process is firejailed and most security updates are probably about things like holes in same-domain checks, not OS-level access).
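For reference, pinning on Debian is typically done with an APT preferences stanza. The sketch below only generates such a stanza as text. The helper name, the file name and the version placeholder are hypothetical, and a priority above 1000 is used because apt_preferences(5) documents that as allowing downgrades.

```shell
# Print an APT preferences stanza pinning a package to a version.
# Arguments: package name, version (or glob), optional priority (default 1001).
apt_pin_stanza() {
    printf 'Package: %s\nPin: version %s\nPin-Priority: %s\n' \
        "$1" "$2" "${3:-1001}"
}

# Example (PINNED_VERSION is a placeholder, not a recommendation):
#   apt_pin_stanza chromium "$PINNED_VERSION" \
#       | sudo tee /etc/apt/preferences.d/chromium-pin
```

A pin like this stops security updates for the package until it is removed, which is the trade-off under discussion here.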

Firejail is a band-aid to limit the scope of exploitation, but ultimately we still want to prevent exploitation in the first place. There's still a fair number of security issues in Chrome/Chromium allowing code execution by means of memory corruption, use-after-frees or similar, so while a transition period to a new Puppeteer release is fine, we can't remain on an older Chromium release for extended periods of time (what exactly constitutes a "transition period" or an "extended period of time" is currently not precisely defined :-) ).

There was a regression in the upstream release which also broke headless mode, as reported by Antoine Musso in https://phabricator.wikimedia.org/T216702

This fix got backported in Debian and was released as https://lists.debian.org/debian-security-announce/2019/msg00039.html. I've retested Proton in deployment-prep to check whether that regression could have been the cause of the Proton error, but it still fails as before.

We should probably ask upstream if they have a policy about this kind of situation. Does Puppeteer strive to always have a release compatible with the latest Chromium, so that we only have to wait for it or report when it's not compatible? Or do they not particularly care, in which case we might want to think about what we would need in order to run old Chromium versions securely?

It seems this was caused by an upstream regression in Chromium: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=922794. I tested the new 72.0.3626.122 release in deployment-prep and it appears to be fixed. I'll slowly roll this out to prod.
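During a staged rollout it can help to check whether a host already has a version at or above the fixed one. This is a minimal sketch using GNU sort's version ordering on upstream version numbers only. The function name is hypothetical, and on Debian hosts `dpkg --compare-versions` is the more appropriate tool for full package versions such as those with a `~deb9u1` suffix.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Uses GNU sort -V; intended for plain dotted upstream versions only.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example with the versions from this task:
#   version_ge 72.0.3626.122 72.0.3626.122 && echo "has the fix"
#   version_ge 72.0.3626.96  72.0.3626.122 || echo "still affected"
```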

The Chromium update has been rolled out, so I'm closing the task. I've also noted on https://github.com/GoogleChrome/puppeteer/issues/4040 that this seems to have been caused by a Chromium regression.