
Create the base container images for running MediaWiki in a production environment
Closed, ResolvedPublic

Description

We will need at the very least:

  • apache httpd image. It will need to support adding a complex configuration. This image can be based on debian buster. It is unclear how we will want to manage logging.
  • php-fpm image. This should be relatively straightforward once we've figured out how to tackle T263545 correctly
  • mcrouter image.
  • nutcracker image - Hopefully the need for this will go away soon.
  • php/apache exporter images - this will need to be expanded and evaluated, but we surely need:
      • one php-fpm exporter. We have multiple right now and we should unify them somehow.
      • one apache status exporter
      • mtail to read the apache logs
      • mcrouter exporter image

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript · Oct 13 2020, 7:40 AM
JMeybohm triaged this task as Medium priority. · Oct 13 2020, 9:53 AM
Joe moved this task from Backlog to In Progress on the MW-on-K8s board.

Change 634924 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/docker-images/production-images@master] Add apache httpd base image

https://gerrit.wikimedia.org/r/634924

Regarding the apache httpd container, I am approaching layering as follows:

  • one base image, which uses the apache2-bin debian package and just modifies the vanilla configuration to listen on port 8080 (so that the container can run as user www-data).
  • one image configured to manage a php-fpm application. This image will be used as a base for both MediaWiki and the shellout service. It will include all modules and base configurations we need, and have a single virtualhost sending all "*.php" files to the fastcgi daemon
  • the Wikimedia MediaWiki apache configuration will be injected on top of this base image using helm templating. To this end, we will first refactor our puppet code so that we can digest a yaml data structure to generate all the virtualhosts, and re-use the same data structure to configure things in helm, generating it from puppet. This way, we can keep managing changes during the transition mostly sane.
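To make the refactoring concrete, here is a rough sketch (in Python, with invented key names) of what digesting such a yaml-derived data structure into virtualhost stanzas could look like. The real implementation would live in our puppet and helm templating, so treat this purely as an illustration of the shape of the data:

```python
# Hypothetical sketch: render Apache virtualhost stanzas from a shared
# data structure. A plain dict stands in for the yaml that puppet would
# generate and helm would consume; all key names here are invented.

VHOST_TEMPLATE = """<VirtualHost *:8080>
    ServerName {name}
    ServerAlias {aliases}
    DocumentRoot {docroot}
</VirtualHost>
"""

def render_vhosts(sites):
    """Render one <VirtualHost> block per site entry."""
    return "\n".join(
        VHOST_TEMPLATE.format(
            name=site["name"],
            aliases=" ".join(site.get("aliases", [])),
            docroot=site["docroot"],
        )
        for site in sites
    )

sites = [
    {"name": "en.wikipedia.org",
     "aliases": ["en.m.wikipedia.org"],
     "docroot": "/srv/mediawiki/docroot/wikipedia.org"},
]
print(render_vhosts(sites))
```

The point being that puppet and helm would both consume the same structure, so a change to one site entry propagates to both during the transition.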

Does anyone see better approaches?

If I got this right, you are proposing to put apache and php-fpm in the same container, correct (talking about vhosts in fpm context)?
I can think of reasons why that might make sense to do (sharing "static" assets for example, ease of MVP), but maybe you could outline them here?

  • one base image, which uses the apache2-bin debian package and just modifies the vanilla configuration to listen on port 8080 (so that the container can run as user www-data).

In case it makes things easier/cleaner, instead of modifying the configuration you could set the capability CAP_NET_BIND_SERVICE.

If I got this right, you are proposing to put apache and php-fpm in the same container, correct (talking about vhosts in fpm context)?
I can think of reasons why that might make sense to do (sharing "static" assets for example, ease of MVP), but maybe you could outline them here?

No, the idea is to have separate containers, one for php-fpm, and one for apache httpd, just running in the same pod in production.

I was purely talking about the httpd configuration here. The idea is that e.g. the shellout service will be able to use this image as a base with little to no modifications.

The idea is that in production the two containers will have a shared volume that will be used to host the unix socket.
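As a toy illustration of that shared-volume arrangement (a sketch, not the production setup): two processes standing in for the httpd and php-fpm containers, exchanging data over a unix socket at a shared path, much as they would over an emptyDir volume mounted into both containers.

```python
# Sketch only: the "php-fpm" side listens on a unix socket at a shared
# path; the "httpd" side connects to the same path, as it would over a
# shared pod volume. In reality httpd speaks FastCGI, not this echo.
import os
import socket
import tempfile
import threading

sock_path = os.path.join(tempfile.mkdtemp(), "fpm.sock")

# "php-fpm" container: bind the socket on the shared volume.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(sock_path)
srv.listen(1)

def fpm_stand_in():
    conn, _ = srv.accept()
    conn.sendall(b"response: " + conn.recv(1024))
    conn.close()

t = threading.Thread(target=fpm_stand_in)
t.start()

# "httpd" container: connect via the same shared-volume path.
cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(sock_path)
cli.sendall(b"GET /index.php")
reply = cli.recv(1024)
cli.close()
t.join()
srv.close()
print(reply.decode())
```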

Oh, well. Sorry then. I guess I just misread the "one image configured to manage a php-fpm application" part as meaning the image already contains php-fpm.

  • one base image, which uses the apache2-bin debian package and just modifies the vanilla configuration to listen on port 8080 (so that the container can run as user www-data).

In case it makes things easier/cleaner, instead of modifying the configuration you could set the capability CAP_NET_BIND_SERVICE.

You can't set that in your Dockerfile, which means people would need to grant the capability when running the container, one way or another. It's a useless burden on the user IMHO.

Regarding the apache httpd container, I am approaching layering as follows:

  • one base image, which uses the apache2-bin debian package and just modifies the vanilla configuration to listen on port 8080 (so that the container can run as user www-data).
  • one image configured to manage a php-fpm application. This image will be used as a base for both MediaWiki and the shellout service. It will include all modules and base configurations we need, and have a single virtualhost sending all "*.php" files to the fastcgi daemon

How are we planning to handle the minor differences our clusters have in their php configurations? The ones I know of for sure are max_execution_time (jobrunners) and apc.ttl = 10 (parsoid). Would using ENV work? Moreover, I think we would like the ability to easily tweak those settings.
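For example (all variable names here are invented, purely to sketch how ENV might work): an entrypoint could translate prefixed environment variables into a php.ini overrides file that php-fpm picks up from its conf.d directory, with "__" standing in for "." since environment variable names cannot contain dots.

```python
# Sketch of the ENV idea; the PHPINI_ prefix and the "__" -> "."
# convention are invented for illustration. An entrypoint could write
# the rendered lines into php-fpm's conf.d before starting the daemon.
import os

def render_php_overrides(environ, prefix="PHPINI_"):
    """Turn PHPINI_-prefixed env vars into php.ini directive lines."""
    lines = []
    for key, value in sorted(environ.items()):
        if key.startswith(prefix):
            directive = key[len(prefix):].replace("__", ".")
            lines.append(f"{directive} = {value}")
    return "\n".join(lines)

# e.g. parsoid would set PHPINI_apc__ttl=10, the jobrunners a longer
# PHPINI_max_execution_time, via helm values per deployment.
env = {
    "PHPINI_apc__ttl": "10",
    "PHPINI_max_execution_time": "1200",
    "PATH": "/usr/bin",  # unrelated vars are ignored
}
print(render_php_overrides(env))
```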

  • the Wikimedia MediaWiki apache configuration will be injected on top of this base image using helm templating. To this end, we will first refactor our puppet code so that we can digest a yaml data structure to generate all the virtualhosts, and re-use the same data structure to configure things in helm, generating it from puppet. This way, we can keep managing changes during the transition mostly sane.

To my understanding, in order to make apache changes, that would require a puppet commit, a helm commit, a puppet run and a helm apply, is that right?

Regarding the apache httpd container, I am approaching layering as follows:

  • one base image, which uses the apache2-bin debian package and just modifies the vanilla configuration to listen on port 8080 (so that the container can run as user www-data).
  • one image configured to manage a php-fpm application. This image will be used as a base for both MediaWiki and the shellout service. It will include all modules and base configurations we need, and have a single virtualhost sending all "*.php" files to the fastcgi daemon

How are we planning to handle the minor differences our clusters have in their php configurations? The ones I know of for sure are max_execution_time (jobrunners) and apc.ttl = 10 (parsoid). Would using ENV work? Moreover, I think we would like the ability to easily tweak those settings.

That's not going to be in the httpd configuration, and I guess we will just provide the php keys via helm to each deployment. But it's not relevant to the apache httpd layering.

To my understanding, in order to make apache changes, that would require a puppet commit, a helm commit, a puppet run and a helm apply, is that right?

If the change is just to rewrite rules or other general configurations (99.9% of our changes), this will mean we'll need to:

  • In the transition phase: make a puppet commit, a puppet run and a helm deployment
  • Afterwards: a helm commit and a helm deployment.

For more structural changes, which haven't happened in about a year btw, we will need both a puppet and a helm commit for the duration of the transition.

How are we planning to handle the minor differences our clusters have in their php configurations? The ones I know of for sure are max_execution_time (jobrunners) and apc.ttl = 10 (parsoid). Would using ENV work? Moreover, I think we would like the ability to easily tweak those settings.

That's not going to be in the httpd configuration, and I guess we will just provide the php keys via helm to each deployment. But it's not relevant to the apache httpd layering.

I was referring to the php-fpm image in general :)

To my understanding, in order to make apache changes, that would require a puppet commit, a helm commit, a puppet run and a helm apply, is that right?

If the change is just to rewrite rules or other general configurations (99.9% of our changes), this will mean we'll need to:

  • In the transition phase: make a puppet commit, a puppet run and a helm deployment
  • Afterwards: a helm commit and a helm deployment.

I understand that it is not going to be easy and I do not have a better answer right now, but I believe going through this 5-6 step process will be nerve-wracking, lengthy, and hard to roll back.

Right now it is: a cumin run, a puppet commit, a puppet run, and a couple of cumin commands, and it can take up to 15-30 minutes. We also have a safety net: if our change is about to break production, most of the time we avoid that, since we simply do not re-enable puppet, but rather revert and merge. We will need this safety net too.

For more structural changes, which haven't happened in about a year btw, we will need both a puppet and a helm commit for the duration of the transition.

Overall, I think we may need to take a step back and consider whether an apache container in the pod is something that might cause more problems than it solves.

Overall, I think we may need to take a step back and consider whether an apache container in the pod is something that might cause more problems than it solves.

I'm not sure I understand what alternative approach you are proposing.

Also, I want to clarify: we can reduce the pain as much as possible, but for the duration of the transition phase, it will be somewhat more work than we're used to for this kind of change. There is no way around that, as far as I can tell.

Hopefully we'll be able to reduce the amount of time needed as much as possible.

I don't think "you also need to do a deploy to k8s" is a bad compromise, but at this point I'd like to hear from others.

Also, I want to clarify: we can reduce the pain as much as possible, but for the duration of the transition phase, it will be somewhat more work than we're used to for this kind of change. There is no way around that, as far as I can tell.

Hopefully we'll be able to reduce the amount of time needed as much as possible.

I don't think "you also need to do a deploy to k8s" is a bad compromise, but at this point I'd like to hear from others.

I had not completely understood the process, but I do now. The amount of time to deploy a change could be concerning, but it does not appear we have other options.

Change 638095 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/docker-images/production-images@master] Add base php cli image

https://gerrit.wikimedia.org/r/638095

Change 640386 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/docker-images/production-images@master] Add a php-fpm image for php 7.2

https://gerrit.wikimedia.org/r/640386

Change 634924 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] Add apache httpd base image

https://gerrit.wikimedia.org/r/634924

Change 638095 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] Add base php cli image

https://gerrit.wikimedia.org/r/638095

Change 640386 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] Add a php-fpm image for php 7.2

https://gerrit.wikimedia.org/r/640386

Change 641331 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/docker-images/production-images@master] Add a mcrouter image

https://gerrit.wikimedia.org/r/641331

Change 643019 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/docker-images/production-images@master] Add LAMP prometheus exporters.

https://gerrit.wikimedia.org/r/643019

Change 641331 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] Add a mcrouter image

https://gerrit.wikimedia.org/r/641331

Change 643019 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] Add LAMP prometheus exporters.

https://gerrit.wikimedia.org/r/643019

Mentioned in SAL (#wikimedia-operations) [2020-11-24T08:49:26Z] <_joe_> uploading the base production docker images for MediaWiki, T265324

Joe updated the task description.