Make the parsoid cluster support parsoid/PHP
Closed, Resolved, Public

Description

The parsoid cluster (the wtp* hosts) also needs to be able to run parsoid-php.

Start with puppet work to make it easy to switch an existing wtp server so that it also becomes a MediaWiki appserver with PHP 7.2 and without HHVM, just as the existing parsoid-testing server scandium.eqiad.wmnet has been converted.

After installing one server ask the Parsoid team to do some benchmarking.

Later convert the entire wtp cluster.

Use newly ordered hardware in addition to wtp servers.

  • add wtp servers to conftool data - https://gerrit.wikimedia.org/r/c/operations/puppet/+/541377
  • set servers where we have not yet installed MediaWiki to pooled=inactive so they are not visible anywhere (default)
  • set wtp1025 and wtp2001, where we did install MediaWiki, to pooled=yes
  • add parsoid/parsoid-php to the mediawiki-installation group so that scap syncs to the servers where we installed MediaWiki
  • convert all other wtp servers into MediaWiki appserver style servers via the "use_php" Hiera switch and pool them in confctl
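
As a rough illustration of the pooling rule in the checklist above, the sketch below maps each host to the conftool `pooled` value it should receive. The function name and the sample host list are hypothetical; only the states `yes` and `inactive` and the two converted hosts come from this task.

```python
from typing import Dict, Iterable, Set


def desired_pooled_states(
    hosts: Iterable[str], mediawiki_installed: Set[str]
) -> Dict[str, str]:
    """Map each host to the 'pooled' value described in the checklist.

    Hosts with MediaWiki installed get "yes"; all others stay "inactive"
    (the default), so they are not visible anywhere.
    """
    return {
        host: "yes" if host in mediawiki_installed else "inactive"
        for host in hosts
    }


# Illustrative subset of the wtp hosts (assumption, not the full inventory).
sample_hosts = ["wtp1025", "wtp1026", "wtp2001", "wtp2002"]
plan = desired_pooled_states(sample_hosts, {"wtp1025", "wtp2001"})
for host, state in plan.items():
    print(f"{host}: pooled={state}")
```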

-> continue with LVS config in T233722

all related Gerrit changes

Icinga monitoring - wtp servers

Details

Related Gerrit Patches:
[operations/puppet@production] ssl: update certificates for parsoid/parsoid-php
[operations/puppet@production] discovery.yaml: add parsoid-php microservice
[operations/puppet@production] LVS: add config for parsoid-php service
[operations/puppet@production] parsoid: turn all wtp servers into Parsoid/PHP-MW-appservers
[operations/puppet@production] logstash: add wtp1025/wtp2001 to filter-mediawiki with parsoid-php channel
[operations/puppet@production] scap/dsh: add parsoid-php servers to mediawiki-installation group
[operations/puppet@production] conftool: add parsoid-php service to wtp servers
[operations/dns@master] add service records for new parsoid-php service
[mediawiki/services/parsoid/deploy@master] Deployment: Create the vendor symlink on all deploys
[operations/mediawiki-config@master] Add wtp1025/wtp2001 to the list of servers using Parsoid/PHP
[operations/puppet@production] conftool: turn wtp1025 and wtp2001 into test servers
[operations/puppet@production] parsoid: turn wtp1025 into eqiad parsoid/php appserver
[labs/private@master] add fake mcrouter certs for ALL parsoid wtp hosts
[operations/puppet@production] parsoid: introduce parameter to use parsoid/PHP
[operations/puppet@production] ssl: add new parsoid.svc(eqiad|codfw) certs
[labs/private@master] add fake keys for new parsoid certs
[operations/puppet@production] wtp2001: explicitly set has_lvs to true
[operations/puppet@production] add certificate for parsoid.discovery/parsoid.svc
[labs/private@master] add fake key for parsoid.svc, delete fake key for wtp2001
[labs/private@master] add fake SSL key for wtp2001.codfw.wmnet

Event Timeline

Restricted Application added a subscriber: Aklapper. Sep 23 2019, 7:19 PM
Dzahn added a subscriber: Joe. Sep 23 2019, 8:02 PM
Joe renamed this task from "convert parsoid cluster from parsoid/JS to parsoid/PHP" to "Make the parsoid cluster to support parsoid/PHP". Sep 24 2019, 2:14 PM
Joe updated the task description.
herron triaged this task as Medium priority. Sep 25 2019, 3:44 PM

Mentioned in SAL (#wikimedia-operations) [2019-09-26T18:04:33Z] <mutante> running mcrouter_generate_certs to add a cert for wtp2001.codfw.wmnet for T233654

Change 539181 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] parsoid: introduce parameter to use parsoid/PHP

https://gerrit.wikimedia.org/r/539181

Dzahn renamed this task from "Make the parsoid cluster to support parsoid/PHP" to "Make the parsoid cluster support parsoid/PHP". Sep 26 2019, 7:05 PM
ssastry updated the task description. Sep 26 2019, 8:07 PM

Change 539410 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] add fake SSL key for wtp2001.codfw.wmnet

https://gerrit.wikimedia.org/r/539410

Change 539410 merged by Dzahn:
[labs/private@master] add fake SSL key for wtp2001.codfw.wmnet

https://gerrit.wikimedia.org/r/539410

Dzahn added a subscriber: jijiki. Edited Sep 26 2019, 10:49 PM

@Joe @jijiki This change compiles now.

https://gerrit.wikimedia.org/r/c/operations/puppet/+/539181

On wtp2001 it adds all these resources:

https://puppet-compiler.wmflabs.org/compiler1002/18613/wtp2001.codfw.wmnet/

and it also removes some resources because "has_lvs" is switched from true to false.

If we set "profile::parsoid::use_php" to true we require all these:

require ::profile::mediawiki::scap_proxy
require ::profile::mediawiki::common
require ::profile::mediawiki::nutcracker
require ::profile::mediawiki::mcrouter_wancache
require ::profile::prometheus::mcrouter_exporter
require ::profile::mediawiki::php
require ::profile::mediawiki::php::monitoring
require ::profile::mediawiki::webserver

I am not including role::mediawiki::common but instead all the profiles used by it, because including a role inside a profile is bad per the puppet style check (remember, that came up with scandium as well, and I have a pending change about it: https://gerrit.wikimedia.org/r/c/operations/puppet/+/526290). But there it was just a role in another role, and here it would be a role in a profile.

I have added:

  • fake mcrouter certs for the compiler
  • real mcrouter certs in the private repo
  • fake ssl key for the webserver, since "has_tls" is enabled (https://gerrit.wikimedia.org/r/c/labs/private/+/539410), but no real key to match it. What about that part? There are no other keys named after individual hosts in modules/secret/secrets/ssl/
  • Do we need to enable the "services_proxy" ? That is commented out right now.

On another, unchanged wtp server this does nothing except add the parameter with the value false. https://puppet-compiler.wmflabs.org/compiler1002/18613/wtp1025.eqiad.wmnet/

Change 540251 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] add fake key for parsoid.svc, delete fake key for wtp2001

https://gerrit.wikimedia.org/r/540251

Change 540251 merged by Dzahn:
[labs/private@master] add fake key for parsoid.svc, delete fake key for wtp2001

https://gerrit.wikimedia.org/r/540251

Change 540252 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] add certificate for parsoid.discovery/parsoid.svc

https://gerrit.wikimedia.org/r/540252

Mentioned in SAL (#wikimedia-operations) [2019-10-02T19:58:02Z] <mutante> puppetmaster1001 - sudo puppet cert clean parsoid.discovery.wmnet (only created yesterday but does not have all the SANs it needs, updating with more SANs) (T233654)

Change 540252 merged by Dzahn:
[operations/puppet@production] add certificate for parsoid.discovery/parsoid.svc

https://gerrit.wikimedia.org/r/540252

Change 539181 merged by Dzahn:
[operations/puppet@production] parsoid: introduce parameter to use parsoid/PHP

https://gerrit.wikimedia.org/r/539181

Change 540615 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] wtp2001: explicitly set has_lvs to true

https://gerrit.wikimedia.org/r/540615

Change 540615 merged by Dzahn:
[operations/puppet@production] wtp2001: explicitly set has_lvs to true

https://gerrit.wikimedia.org/r/540615

Mentioned in SAL (#wikimedia-operations) [2019-10-03T19:19:02Z] <mutante> puppetmaster1001 - revoke cert for parsoid.discovery.wmnet - creating new ones for each DC and a unified one with both (T233654)

Change 540672 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] ssl: add new parsoid.svc(eqiad|codfw) certs

https://gerrit.wikimedia.org/r/540672

Change 540675 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] add fake keys for new parsoid certs

https://gerrit.wikimedia.org/r/540675

Change 540675 merged by Dzahn:
[labs/private@master] add fake keys for new parsoid certs

https://gerrit.wikimedia.org/r/540675

Change 540672 merged by Dzahn:
[operations/puppet@production] ssl: add new parsoid.svc(eqiad|codfw) certs

https://gerrit.wikimedia.org/r/540672

Change 540680 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] parsoid: turn wtp1025 into the first eqiad parsoid appserver

https://gerrit.wikimedia.org/r/540680

Dzahn added a comment. Oct 3 2019, 8:40 PM

Certificate issues fixed.

wtp2001 after the mediawiki roles have been applied now:

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=wtp2001&scroll=286

all appserver checks green after 2 or 3 puppet runs.

except "not in dsh group". need to add in confctl.

next we can do this with wtp1025 for eqiad. we said one in each DC.

Change 540684 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] conftool: turn wtp2001 into a test server

https://gerrit.wikimedia.org/r/540684

Change 540947 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] add fake mcrouter certs for ALL parsoid wtp hosts

https://gerrit.wikimedia.org/r/540947

Change 540947 merged by Dzahn:
[labs/private@master] add fake mcrouter certs for ALL parsoid wtp hosts

https://gerrit.wikimedia.org/r/540947

Change 540680 merged by Dzahn:
[operations/puppet@production] parsoid: turn wtp1025 into eqiad parsoid/php appserver

https://gerrit.wikimedia.org/r/540680

Dzahn added a comment. Oct 4 2019, 8:21 PM

created mcrouter certs for ALL wtp eqiad and codfw hosts

Dzahn added a subscriber: ssastry. Oct 4 2019, 10:43 PM

@ssastry @Joe wtp1025.eqiad.wmnet and wtp2001.codfw.wmnet are now the 2 hosts selected as the test/benchmarking hosts for parsoid/PHP. They are simply the first in each DC when sorting numerically.
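
The selection rule ("first in each DC when sorting numerically") can be sketched as follows. This is only an illustration: the function name is made up, the host list is a small sample, and it assumes each FQDN has the form `<name><number>.<dc>.wmnet`, as the two hosts named here do.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def first_host_per_dc(fqdns: Iterable[str]) -> Dict[str, str]:
    """Return the numerically lowest host for each datacenter."""
    by_dc: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for fqdn in fqdns:
        # "wtp1025.eqiad.wmnet" -> name "wtp1025", dc "eqiad"
        name, dc = fqdn.split(".")[:2]
        match = re.search(r"(\d+)$", name)
        if match is None:
            raise ValueError(f"no trailing number in host name: {fqdn}")
        by_dc[dc].append((int(match.group(1)), fqdn))
    # min() compares the numeric part first, so the sort is numeric.
    return {dc: min(entries)[1] for dc, entries in by_dc.items()}


# Sample input (assumption: not the full wtp inventory).
sample = [
    "wtp1030.eqiad.wmnet",
    "wtp2001.codfw.wmnet",
    "wtp1025.eqiad.wmnet",
    "wtp2010.codfw.wmnet",
]
selected = first_host_per_dc(sample)
print(selected)
```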

After the merges above they both now have the mediawiki appserver PHP classes.

That includes:

  • mediawiki::common
  • mediawiki::php (and monitoring for it)
  • mediawiki::webserver
  • scap_proxy
  • nutcracker
  • mcrouter
  • services_proxy

in addition to the existing parsoid class.

Icinga shows:

PHP7 rendering - check
PHP opcache health - check

mcrouter certs / process - check

(I have also created the mcrouter certs for ALL wtp hosts already.)

nutcracker process / socket - check

php7.2-fpm service - check

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?host=wtp1025

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?host=wtp2001

Currently the only red check is the "not in mediawiki dsh group" which means "not in conftool". Maybe they should be added as "test_servers" like here? -> https://gerrit.wikimedia.org/r/c/operations/puppet/+/540684

Change 541377 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] parsoid/conftool: add wtp servers as apache appservers

https://gerrit.wikimedia.org/r/541377

Change 540684 abandoned by Dzahn:
conftool: turn wtp1025 and wtp2001 into test servers

Reason:
in favor of https://gerrit.wikimedia.org/r/c/operations/puppet/+/541377

https://gerrit.wikimedia.org/r/540684

Change 541611 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/mediawiki-config@master] add wtp1025/wtp2001 to list of servers using Parsoid/PHP

https://gerrit.wikimedia.org/r/541611

Change 541645 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] logstash: add wtp1025/wtp2001 to filter-mediawiki with parsoid-php channel

https://gerrit.wikimedia.org/r/541645

Change 541611 merged by jenkins-bot:
[operations/mediawiki-config@master] Add wtp1025/wtp2001 to the list of servers using Parsoid/PHP

https://gerrit.wikimedia.org/r/541611

Mentioned in SAL (#wikimedia-operations) [2019-10-10T08:12:15Z] <mobrovac@deploy1001> Synchronized wmf-config/CommonSettings.php: Add wtp1025/wtp2001 to the list of servers using Parsoid/PHP - T233654 (duration: 01m 01s)

Change 542566 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] add service records for new parsoid-php service

https://gerrit.wikimedia.org/r/542566

Change 542572 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] discovery.yaml: add parsoid-php microservice

https://gerrit.wikimedia.org/r/542572

Change 542671 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/parsoid/deploy@master] Deployment: Create the vendor symlink on all deploys

https://gerrit.wikimedia.org/r/542671

We have to resolve the same problem here as the one we encountered in Beta. Namely, both the php-fpm and parsoid services use port 8000 to listen for incoming requests. This creates a rather problematic situation, since all the appservers have it configured on that port, and Parsoid/JS has an LVS set up on that port that is used by RESTBase and possibly others.

Since changing the port of php-fpm (and hence the config of all appservers) is very risky, the only viable option I see here is to work around it in the following way. We temporarily set up Parsoid to listen on both 8000 and another port (much like we've done for RESTBase). We then set up a second LVS (and discovery) DNS - parsoid-js.svc.{site}.wmnet - and make it use the alternative port [1], and make RESTBase (and any other entity that uses it, like MW for private wikis) use the alternative LVS. Once we complete the transition, we can abandon the old LVS and reconfigure Parsoid to use only the alternative port.

@Joe @akosiaris @Dzahn your thoughts are greatly appreciated here. This should be resolved before we move forward given the port clash.

[1] Perhaps we could make the currently-set-up LVS listening on 8000 to direct requests to the alternative port, but I don't know if that's possible in our set-up and whether it would actually require more work than just setting up a new, temporary LVS.

Dzahn added a comment. Oct 12 2019, 2:18 PM

@mobrovac Yes, I agree. Making 2 new LVS and DNS services, one parsoid-php and one parsoid-js, and then switching first from old parsoid to parsoid-js seems like the best plan to solve the conflict. My latest patch is an attempt to add that config for a new parsoid-php service, so I could more or less copy that to make parsoid-js first. ACK.

Change 542671 merged by jenkins-bot:
[mediawiki/services/parsoid/deploy@master] Deployment: Create the vendor symlink on all deploys

https://gerrit.wikimedia.org/r/542671

Joe added a comment. Edited Oct 14 2019, 1:00 PM

We have to resolve the same problem here as the one we encountered in Beta. Namely, both the php-fpm and parsoid services use port 8000 to listen for incoming requests. This creates a rather problematic situation, since all the appservers have it configured on that port, and Parsoid/JS has an LVS set up on that port that is used by RESTBase and possibly others.

No it's not. php-fpm listens on a unix socket in production.

Aaah indeed, you are right @Joe. Not sure where I was looking :/ Must have been logged into a Beta instance. Ok, we don't need to do this whole port-switching dance then, yay.
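
The point of the exchange above is that a php-fpm pool bound to a unix socket cannot clash with a service on TCP port 8000. A minimal sketch of that check is below. It relies only on the documented forms of the php-fpm `listen` directive (a socket path, a bare port, or `address:port`); the function names and the socket path in the example are hypothetical.

```python
def listen_kind(listen_value: str) -> str:
    """Classify a php-fpm 'listen' value as a unix socket or a TCP endpoint."""
    return "unix" if listen_value.strip().startswith("/") else "tcp"


def conflicts_with_port(listen_value: str, port: int) -> bool:
    """Return True if the 'listen' value binds the given TCP port."""
    value = listen_value.strip()
    if listen_kind(value) == "unix":
        # A unix socket never occupies a TCP port.
        return False
    # TCP forms: "8000", "127.0.0.1:8000", "[::1]:8000"
    return int(value.rsplit(":", 1)[-1]) == port


# Hypothetical socket path standing in for the production setup.
print(conflicts_with_port("/run/php/fpm-www.sock", 8000))
# A TCP listener on 8000, as mistakenly assumed from the Beta instance.
print(conflicts_with_port("127.0.0.1:8000", 8000))
```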

Change 542566 abandoned by Dzahn:
add service records for new parsoid-php service

Reason:
we are using the same IP as parsoid but on a different port

https://gerrit.wikimedia.org/r/542566

Change 543243 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] LVS: add config for parsoid-php service

https://gerrit.wikimedia.org/r/543243

Dzahn updated the task description. Oct 15 2019, 11:35 PM
Dzahn updated the task description. Oct 16 2019, 2:48 PM

Change 541377 merged by Dzahn:
[operations/puppet@production] conftool: add parsoid-php service to wtp servers

https://gerrit.wikimedia.org/r/541377

Dzahn updated the task description. Oct 16 2019, 3:02 PM

Mentioned in SAL (#wikimedia-operations) [2019-10-16T15:04:07Z] <mutante> wtp parsoid servers added to conftool - wtp1025 and wtp2001 pooled in new service parsoid-php (T233654)

Mentioned in SAL (#wikimedia-operations) [2019-10-16T15:04:51Z] <mutante> wtp1025 wtp2001 - scap pull (T233654)

Change 543479 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] scap/dsh: add parsoid-php servers to mediawiki-installation group

https://gerrit.wikimedia.org/r/543479

Change 543479 merged by Dzahn:
[operations/puppet@production] scap/dsh: add parsoid-php servers to mediawiki-installation group

https://gerrit.wikimedia.org/r/543479

Change 541645 merged by Giuseppe Lavagetto:
[operations/puppet@production] logstash: add wtp1025/wtp2001 to filter-mediawiki with parsoid-php channel

https://gerrit.wikimedia.org/r/541645

Dzahn updated the task description. Oct 17 2019, 2:32 PM
Dzahn updated the task description. Oct 18 2019, 5:33 PM

Change 544232 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] parsoid: turn all wtp servers into Parsoid/PHP-MW-appservers

https://gerrit.wikimedia.org/r/544232

Change 544232 merged by Dzahn:
[operations/puppet@production] parsoid: turn all wtp servers into Parsoid/PHP-MW-appservers

https://gerrit.wikimedia.org/r/544232

Mentioned in SAL (#wikimedia-operations) [2019-10-18T18:27:34Z] <mutante> temp. disabled puppet on all wtp* servers, adding mediawiki appserver roles on them incrementally by re-enabling puppet, starting with wtp1026, scheduled icinga downtime for wtp* all services (T233654)

With this change the parameters profile::parsoid::use_php and has_lvs are now the default for all parsoid servers. wtp1025/wtp2001 are no longer special cases.

I temp. disabled puppet on all of wtp* and then gradually re-enabled it which added the mediawiki / php puppet roles and everything that comes with it.

That added all the appserver monitoring checks on the Icinga wtp overview. Once they turned fully green, I pooled the servers into the parsoid-php service one after another.
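
The incremental procedure described above (re-enable puppet on one host, wait for its checks to turn green, then pool it) can be sketched as a simple loop. Everything here is illustrative: the three callables are stand-ins for the manual steps, not real tooling, and stopping at the first host that is not green is a design assumption of this sketch.

```python
from typing import Callable, Iterable, List


def rolling_convert(
    hosts: Iterable[str],
    enable_puppet: Callable[[str], None],
    checks_green: Callable[[str], bool],
    pool: Callable[[str], None],
) -> List[str]:
    """Convert hosts one at a time and return those that were pooled."""
    pooled: List[str] = []
    for host in hosts:
        enable_puppet(host)
        if not checks_green(host):
            # Stop here; the remaining hosts keep puppet disabled.
            break
        pool(host)
        pooled.append(host)
    return pooled


# Dry run with stand-in callables; wtp1028 is made to fail its checks.
log: List[str] = []
result = rolling_convert(
    ["wtp1026", "wtp1027", "wtp1028"],
    enable_puppet=lambda h: log.append(f"enable puppet on {h}"),
    checks_green=lambda h: h != "wtp1028",
    pool=lambda h: log.append(f"pool {h}"),
)
print(result)
```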

Both

sudo -i confctl --tags dc=codfw,cluster=parsoid,service=parsoid-php --action get all | jq

and

sudo -i confctl --tags dc=eqiad,cluster=parsoid,service=parsoid-php --action get all | jq .

show everything is pooled, from wtp1025 through wtp1048 and wtp2001 through wtp2020.
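
Instead of reading the JSON by eye, the same check could be scripted. The sketch below assumes that confctl prints one JSON object per line, with the host name as one key and a `tags` string as another; that layout and the sample lines are assumptions for illustration, not captured output.

```python
import json
from typing import List


def unpooled_hosts(confctl_output: str) -> List[str]:
    """Return hosts whose 'pooled' value is anything other than "yes"."""
    not_pooled: List[str] = []
    for line in confctl_output.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        for key, value in record.items():
            if key == "tags":
                continue  # metadata, not a host entry
            if value.get("pooled") != "yes":
                not_pooled.append(key)
    return sorted(not_pooled)


# Made-up sample in the assumed output format.
sample_output = "\n".join([
    '{"wtp1025.eqiad.wmnet": {"weight": 10, "pooled": "yes"}, '
    '"tags": "dc=eqiad,cluster=parsoid,service=parsoid-php"}',
    '{"wtp1026.eqiad.wmnet": {"weight": 10, "pooled": "inactive"}, '
    '"tags": "dc=eqiad,cluster=parsoid,service=parsoid-php"}',
])
print(unpooled_hosts(sample_output))
```

An empty list would correspond to the "everything is pooled" result reported above.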

Dzahn raised the priority of this task from Medium to High. Oct 18 2019, 11:33 PM
Dzahn updated the task description.

@Joe If the remaining steps from here on are T233722, then this ticket should be resolved, I think.

Change 543243 merged by Giuseppe Lavagetto:
[operations/puppet@production] LVS: add config for parsoid-php service

https://gerrit.wikimedia.org/r/543243

Change 542572 merged by Giuseppe Lavagetto:
[operations/puppet@production] discovery.yaml: add parsoid-php microservice

https://gerrit.wikimedia.org/r/542572

Mentioned in SAL (#wikimedia-operations) [2019-10-25T01:13:19Z] <mutante> puppetmaster1001 - revoking parsoid.svc.eqiad / parsoid.svc.codfw / parsoid.discovery.wmnet certificates and creating new ones including parsoid-php.discovery.wmnet (T233654)
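
The certificates in this task were reissued more than once because they lacked names they needed. A small check like the one below can show which subject alternative names (SANs) a certificate is still missing before it is revoked and recreated. The function name is hypothetical, and the fully qualified names in the example are assumptions derived from the abbreviated names in the log entry above.

```python
from typing import Iterable, List


def missing_sans(required: Iterable[str], present: Iterable[str]) -> List[str]:
    """Return the required SANs that the certificate does not contain."""
    return sorted(set(required) - set(present))


# Assumed full names; the log entry only gives abbreviated forms.
required_names = [
    "parsoid.discovery.wmnet",
    "parsoid-php.discovery.wmnet",
    "parsoid.svc.eqiad.wmnet",
    "parsoid.svc.codfw.wmnet",
]
old_cert_names = [
    "parsoid.discovery.wmnet",
    "parsoid.svc.eqiad.wmnet",
    "parsoid.svc.codfw.wmnet",
]
print(missing_sans(required_names, old_cert_names))
```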

Change 545989 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] ssl: update certificates for parsoid/parsoid-php

https://gerrit.wikimedia.org/r/545989

Change 545989 merged by Dzahn:
[operations/puppet@production] ssl: update certificates for parsoid/parsoid-php

https://gerrit.wikimedia.org/r/545989

Ottomata assigned this task to Dzahn. Oct 29 2019, 3:41 PM
Ottomata added a subscriber: Ottomata.

@Dzahn, assigning to you, feel free to undo or reassign if this is not correct.

Dzahn closed this task as Resolved. Oct 29 2019, 5:45 PM