OCG uses node.js, and the version we have installed on trusty includes its own set of root CAs, which doesn't include GlobalSign's R3 from 2009....
Resolved for now. To recap:
Initial symptom was lots of errors in the ocg logs after we deployed the new intermediate, and the errors contained: [Error: CERT_UNTRUSTED]
Alex figured out that the old version of node.js we use on trusty (which we think is just for OCG at this point) has a compiled-in set of root CAs which lacks the GlobalSign R3 root from 2009 that our new intermediate chains to.
I downloaded the Ubuntu upstream sources (our existing package was vanilla trusty), created a local build tree on copper with a local git repo (in ~bblack/njs/ if you want to poke), added a new debian patch on top to add the R3 root, rebuilt it with +wmf1 tacked onto the version, and uploaded it to carbon's trusty-wikimedia repo.
After that, basically the fixup on ocg100 was:
    apt-get update
    apt-get install nodejs=0.10.25~dfsg2-2ubuntu1+wmf1
    puppet agent -t # so it fixes the /usr/bin/nodejs-ocg hardlink
    service ocg restart
Seems to be working ok based on log outputs not spamming CERT_UNTRUSTED anymore.
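For anyone wanting to make the "not spamming anymore" check a bit more concrete, here is a minimal sketch that counts log lines containing the error string. The function name count_cert_untrusted is hypothetical, and the log file to pass in is whichever one OCG writes to on the host (not named in this thread).

```shell
#!/bin/bash
# count_cert_untrusted LOGFILE
# Prints the number of lines in LOGFILE that mention CERT_UNTRUSTED.
# (Hypothetical helper; the log path is supplied by the caller.)
count_cert_untrusted() {
    local logfile="$1"
    if [[ ! -r "$logfile" ]]; then
        echo "cannot read: $logfile" >&2
        return 2
    fi
    # grep -c prints 0 but exits 1 when nothing matches;
    # "|| true" keeps a clean log from looking like a failure.
    grep -c 'CERT_UNTRUSTED' -- "$logfile" || true
}
```

Running it once before and once a while after the service restart should show whether the count has stopped growing.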
FYI, it's worth noting that upgrading NodeJS for this service looks a bit broken by design to me: apt-get will replace nodejs but not the nodejs-ocg hardlink, which will still refer to the old binary.
The hardlink will be fixed at the next Puppet run, but that will not restart the service, so if a security upgrade is done with just apt-get upgrade/install + service restart, the running NodeJS will not actually be upgraded.
From what I can see the nodejs-ocg is used to apply specific apparmor rules.
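A rough sketch of how the stale-hardlink situation could be detected before restarting the service. It assumes the package upgrade puts a new file in place of /usr/bin/nodejs rather than rewriting it in place, so the old hardlink keeps the previous inode. The function name check_hardlink is hypothetical, and the default paths are the ones mentioned in this thread.

```shell
#!/bin/bash
# check_hardlink [ORIGINAL] [LINK]
# Prints "in-sync" if both paths refer to the same file (same device and inode),
# "stale" if they differ, and "missing" if either path does not exist.
# (Hypothetical helper; defaults are the paths mentioned in this thread.)
check_hardlink() {
    local original="${1:-/usr/bin/nodejs}"
    local link="${2:-/usr/bin/nodejs-ocg}"
    if [[ ! -e "$original" || ! -e "$link" ]]; then
        echo "missing"
        return 2
    fi
    # -ef is true when both names resolve to the same device and inode.
    if [[ "$original" -ef "$link" ]]; then
        echo "in-sync"
        return 0
    fi
    echo "stale"
    return 1
}
```

If this reports "stale" after an apt-get upgrade, restarting ocg would start the old binary again; running puppet first (as in the fixup steps) should bring it back to "in-sync".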
Besides ocg, we have other precise/trusty systems not using nodejs 4:
- sca1* still has it installed, but the only remaining service (zotero) isn't node-based and the hosts will be repurposed anyway (as zotero is moved to an elephants' graveyard)
- helium and gallium use wmf-specific precise builds on node 0.8.2, but these appear to be totally unused (and gallium is going away soon). Will check wrt the deinstallation on helium
- francium has it installed, but unused, will check wrt deinstallation
- stat1002/1003 have nodejs installed, but there are no long-running services; it's just available for researchers. Will update it on those hosts to the new version
So true, but the entire service is practically unmaintained. However, wheels are in motion towards replacing it with Electron, with Services owning it, so I hope we will get rid of ocg soon and not have the same problem in the future.