
Class not found transient errors after Parsoid/PHP scap3 deploys
Closed, Resolved · Public

Description

See the commit message of https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/deploy/+/549245. https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/deploy/+/547311 was the earlier attempt, which we thought would solve this.

But we need a mechanism that atomically depools the server, updates the symlink, and restarts php-fpm before the server is repooled.

Right now, while we don't yet have live traffic, this isn't an issue, but once live traffic starts flowing to the cluster, this will cause user failures.
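
As a rough illustration of the ordering described above, here is a minimal sketch in Python. It is not the actual scap3 configuration or the merged patches: the command names (`depool-node`, `swap-symlink`, `restart-php-fpm`, `pool-node`) are hypothetical placeholders, and `deploy_node` is an invented helper. The only point is that the steps run strictly in order and that a failure in any step leaves the node depooled.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Hypothetical placeholder commands (assumptions for illustration only);
# the real commands would come from the deploy repo's scap3 configuration.
STEPS: Tuple[Tuple[str, List[str]], ...] = (
    ("depool", ["depool-node"]),       # take the node out of rotation
    ("promote", ["swap-symlink"]),     # point the symlink at the new code
    ("restart", ["restart-php-fpm"]),  # restart php-fpm so it loads the new code
    ("repool", ["pool-node"]),         # put the node back in rotation
)


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def deploy_node(
    run: Callable[[Sequence[str]], None] = run_checked,
    steps: Sequence[Tuple[str, List[str]]] = STEPS,
) -> List[str]:
    """Run the steps in order and return the names of the completed steps.

    Any exception raised by `run` propagates immediately, so a failed
    promote or restart means the repool step never runs.
    """
    completed: List[str] = []
    for name, cmd in steps:
        run(cmd)
        completed.append(name)
    return completed
```

Passing a fake `run` callable (for example, one that only records the commands) makes it possible to check the ordering without touching a real server.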

Details

Related Gerrit Patches:
mediawiki/services/parsoid/deploy : master : Depool the node before the promote stage
mediawiki/services/parsoid/deploy : master : Ensure FPM is depooled throughout the deployment process on a node

Event Timeline

ssastry triaged this task as High priority. Wed, Nov 20, 3:11 PM
ssastry created this task.
Restricted Application added a subscriber: Aklapper. Wed, Nov 20, 3:11 PM
ssastry moved this task from Backlog to Deployment on the Parsoid-PHP board. Wed, Nov 20, 3:12 PM
Joe added a subscriber: Joe. Wed, Nov 20, 5:29 PM

The solution would be, IMHO, to deploy parsoid-php as part of the main mediawiki code deployment (so, via scap and not scap3).

We have a well-established, working way of deploying code with scap2. We should stop patching a scap3 deployment process to make it work in an environment tuned for scap2.

> The solution would be, IMHO, to deploy parsoid-php as part of the main mediawiki code deployment (so, via scap and not scap3).

This will be the case once we integrate Parsoid/PHP into mw-core. Until then, I think the best course of action is to (de)pool FPM at the same time as Parsoid/JS. That way we know that a given server will not be used at any point during its deployment.

Change 552120 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/parsoid/deploy@master] Ensure FPM is depooled throughout the deployment process on a node

https://gerrit.wikimedia.org/r/552120

Change 552120 merged by jenkins-bot:
[mediawiki/services/parsoid/deploy@master] Ensure FPM is depooled throughout the deployment process on a node

https://gerrit.wikimedia.org/r/552120

Change 552130 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/parsoid/deploy@master] Depool the node before the promote stage

https://gerrit.wikimedia.org/r/552130

Change 552130 merged by jenkins-bot:
[mediawiki/services/parsoid/deploy@master] Depool the node before the promote stage

https://gerrit.wikimedia.org/r/552130

Mentioned in SAL (#wikimedia-operations) [2019-11-20T20:53:50Z] <ssastry@deploy1001> Started deploy [parsoid/deploy@7665624]: Dummy Parsoid deploy to test T238748 fix

Mentioned in SAL (#wikimedia-operations) [2019-11-20T21:01:17Z] <ssastry@deploy1001> Finished deploy [parsoid/deploy@7665624]: Dummy Parsoid deploy to test T238748 fix (duration: 07m 20s)

ssastry closed this task as Resolved. Wed, Nov 20, 9:03 PM