Wikimedia Technical Conference 2019 Session: Local development environment - MediaWiki core
Closed, ResolvedPublic

Authored By
debt
Oct 4 2019, 3:16 PM

Description

Session

  • Track: Local Development and onboarding
  • Topic: Local development environment - MediaWiki core

Description

Various local development environments for MediaWiki core have different benefits and drawbacks - what are all of the necessary features (eg: stability, easy setup, production 'parity') that would allow us to standardize on one? How do we get to that place from here?

Questions to answer and discuss

Question: Should there be a single official MediaWiki-core local development environment?
Significance: Deciding how to address the problem of lack of standards

If YES,

Question: To whom should the environment be tailored?
Significance: We can't solve all problems in one environment. We should choose some core things to focus on, and identifying who we are serving will facilitate that.

Question: Which/whose problems should it attempt to solve?
Significance: Decide which issues are most meaningful to the target audience identified above.

If NO,

Question: What should be done instead?
Significance: Find out what the opposition to an official, supported MW-core development environment is, and how we can improve the development experience in other ways.

Regardless,

Question: Which one, if any, of local k8s, mediawiki-docker-dev, and vagrant should we maintain?
Significance: We need to decide whether we should make a commitment to maintaining any of the existing environments in order to move forward with efforts to improve developer productivity.

Related Issues

  • T235372 - Wikimedia Technical Conference 2019 Session: Local development environment - complex multi-service Mediawiki development

Pre-reading for all Participants


Session Leader(s)

Session Scribes

  • Nick

Session Facilitator

  • Aubrey

Notes

Slides:

Post-event summary:

  • Most people think there should be a single, official MW core dev environment.
  • It was agreed that the MediaWiki core local environment should be tailored to basic patch handling or beginner MediaWiki developers.
  • People really want to talk about complex development.
  • No consensus on what to maintain.

Post-event action items:

  • The definition and code for the basic MediaWiki environment should live in the core repo.
  • Evaluate the existing development environments to come up with a list of their pros and cons
  • Make a decision of whether or not to support SRE use cases (ex: testing different webserver variants)

Detailed notes from Etherpad

Going through slides:
Worked on trying to make "Local-charts" more production-like. Did a proof-of-concept using Minikube.
Once things are merged in master, an image is built (using the pipeline) and pushed to the image repo.
Problems!
Plagued by complexity, because we tried to make things very configurable.
We think it's not feasible to solve all our use cases with a *single* developer environment, or even to solve these problems in the developer-facing environment at all. Some of these use cases would be better solved elsewhere.

Main topic:
QUESTION: Should there be a *single official* mediawiki core + local developer environment?
Split in two groups! Yes or No. [General feeling in room: 75% yes, 20% No, 5% Undecided/Other]
"Official" being: Release Engineering - standard dev env for mediawiki that we will promote.
Is the "single official" constraint coming from RelEng, and is it because they don't want to maintain multiple dev envs?
We'd prefer to focus our efforts to make one a great success.
It's about whether we should maintain one, and which.
By local dev environment, what do we mean? Is it like Vagrant, with all the things? Or core + extensions and simple deps? Or is that e.g. the whole WMF cluster with all microservices and data backends?
QUESTION: How do we define the set of extensions/skins in the bundle?
So if I am a new developer on MediaWiki, where do I start? What do I do? MediaWiki core, some basic extensions? If I want to work on AbuseFilter, all I need is MediaWiki core and basic extension support.
QUESTION: Which *one*, if any, of local k8s, mediawiki-docker-dev, and vagrant should we maintain?

"Yes group" discussion:
Florian: What do I do now? I have this task here, it says just fix this translation or something, how do I test that? It just needs MediaWiki, PHP, a webserver and that's about it.
Z: I talked to an Outreachy student, and they said that the hardest thing about the whole project was installing MediaWiki to get going.
Z: Who has tried local-charts? (Some people put hands up.) It's just 5 or 6 commands to get everything set up.
F: How long does it take? Vagrant takes a long time.
J: Probably less than Vagrant, still need to download a lot of code/images.
Z: Probably depends on your OS, on whether a VM is needed, etc., and on how much is needed.
Ricardo: What do we do right now for onboarding developers in different teams?
All: Not much.
Andre: we do have a wiki page for that
BD: It did say use MediaWiki-Vagrant, then the community decided that Vagrant wasn't working for them and added different things.
J: They found that Vagrant wasn't good for their use cases.
F: Which ones?
Kosta: It didn't work for me, something didn't work with NFS. For whom it should be tailored: new developers; it should be fast and basic. MW-V's strength was the complexity it could handle.
Florian: That's probably an advanced use case.
A: If we target just new developers, will we end up using it?
James F: No, that is tomorrow's session (complex environments).
A: If we don't use it we won't be dogfooding.
Moritz: That was the question with Vagrant, but no one used it? Bugs were not fixed?
James: No one used it and tried to make it mimic production.
BD: search did, really hard to make it mimic production entirely.
Gergo: .... the flexibility of vagrant could be captured in docker if you had a system of creating envs based on a config like vagrant
Kosta: About dogfooding, I would use the simple environment. Sometimes I use the Homebrew system, sometimes I use the docker-compose thing; generally the simple environments work for 99% of situations. Same system locally and for CI.
Florian: Have we answered "To whom should it be tailored"? Beginners. Core + extensions plus a bit.
James F: You can't test CentralAuth unless you have a multi-wiki setup, you can't test X unless you have Y.
Gergo: New developer use case, simple. The other is a new sysadmin setting up a production wiki for a simple use. For both of these, local and production are close to the same.
Florian: Does it need to? In production there are far more requirements than locally.
Gergo: config layers for different services. Do we want to support both use cases with the same system?
Z: I wanted to answer the question of whether I would use it, and I have used them all. I would use it (the simple one), definitely. At least for my use case it would work.
Lars: There are about 15 different use cases in this group. Trying to solve all at once is difficult. Concentrate on a solvable problem.
Z: So you vote for "let's do the simple thing"?
Lars: What is the simplest thing we can do and how can we make it more simple?
F + JamesF +Z: the simplest would be 1 wiki with some extensions if you want to, maybe a skin
Daniel: why can't I just go to WMCS, click an instance and it built ready for me
Lars: From my understanding of this, is that it is not local, it's local to "you"
Adam: with mediawiki docker-dev there's no reason it can't run on EC2 or whatever
J: vagrant works on toollabs as well
Lars: We could let everyone hack in production??
Daniel: Installing MediaWiki and a few extensions doesn't seem very complex in Puppet: package + role + an instance. What's the advantage of running it on a local machine?
J: using puppet is like running a 100m race in a tank
A: who should it be tailored to? include people who don't have the internet reliably
K: local dev config in the mwcore repo. Start with baseline: 1) it should run MW. 2) extension 3) parsoid. Weakness of existing solutions is the separateness of mwcore repo. 
Z: there is no reason that they have to be

"Yes group" summary:
    - Target beginners who are working on core and a few, simple extensions plus a bit?
    - Environment config should live in the core MW repo so that it gets updated in lock-step with MediaWiki itself.
    - Shouldn't be exclusively focussed on running in the cloud, also include low Internet use cases.

"No group" discussion:
- A single solution quickly becomes unmanageably complex. It depends on what you're working on: extensions, core, gadgets, etc.
- INFO: Singular solution will inevitably become too complex.
- Should there be a single env? no. But should there be a single tool? yes. With different configurations and variables.
Perhaps implementation detail, e.g. vagrant has it, but problematic because all contained, still have to install 
e.g. one env for standard git core. one with no runtime sharing, -- difficulties with 
If we only have a single configuration, you're only going to find bugs that exist in that env.
INFO: Should be a recommended single solution, but alternatives should be available.
QUESTION: k8s future. Current situation seems to be that multiple solutions are supported.
T: Different installer runtimes vs dev envs. Apache and Vagrant.
S: if we make a decision about the software
T: If we make a breaking change, and doesn't affect you if not using k8s
S: If there is a single dev env, there is a separate thing.
P: Some people simply refuse to use certain envs.
P: If we make one dev env, we'll make it in-house, and that might be very difficult to change in the future. Do we want to tie ourselves to k8s/docker?
INFO: Singular tools are often built inhouse and often black boxes.
QUESTION: use-case that keeps coming up - for a dev, installing it yourself is the preferred outcome. With the docker it's a bit lower level, adding config myself and thus i learn how to debug it which is good, however, then it potentially requires QA and PM to also use same/similar config.
S: beta-cluster is where QA happens
A: releng and some staff devs need the same env as beta staging test production, same database/schema/etc. 
INFO: 2 different solutions needed. 1 for staff devs, 1 for newcomer/volunteer. How do we get there, minikube currently broken
INFO: GitLab has a good automated workflow: automatically create temporary clones of beta cluster. Compromise/con: can't run offline.
E: We should narrow down 1 by 1 from the existing multitude, rather than deciding on a single one now/today. Incremental approach.

"No group" summary:
- one custom solution, usually custom built on top of existing technologies
- most non-dev use-cases (PM/QA) point to a single tool for all. vs dev who will often have multiple tools interconnected
- Need a more singular solution for non-dev use-case, and configurable sol
- devs working on different environments helps surface issues
- Singular tools are often built in house, and are often black boxes

Final question, which one (if any) of X Y Z should we maintain?
BD: vagrant is better documented, and more widely installed. But it does need care and feeding. Problem: primary maintainer (me) hasn't been a highly active developer of mediawiki itself for a few years.
BB: should we just invest more here? and stop looking for alternatives?
BD: This is the case that I have been making for the last years, yes.
GG: Parity. "Is it close enough to production". Can vagrant be?
JF: Should we split mw vagrant into a simple thing and a complex thing. Lean and simple, vs production like thing
BD: In theory the roles system does that, but in practice maybe it doesn't.
GT: the model of docker is make a system, take snapshot, [???]. Model of vagrant is [???] - technically inferior to docker. However I like the role system and want that in docker.
Z: my question was, is vagrant itself supported upstream? if vagrant might just go away, should we invest effort there?
GT: Is upstream a problem? In breakages the problem is rarely vagrant itself.
SR: They have done things like break on point releases numerous times.
Z: They did change the config on a point release and break things before
BD: Do you expect minikube to keep working and not break down the line?
BB: We've had frustrations trying to work with minikube.
TT: I'm in the middle. Support how vagrant works, ease of use, model, good for QA. Other tools that emerge have been workaround for issues with vagrant. Are those issues fixable, or inherent?  For example: setup time (took me 45-60 mins to get up and running on a fast machine). If something goes wrong, and needs updates, my transition to the new state might not have been tested [?].  But with docker, I can teardown and rebuild in a few minutes. 
R: In theory you could do that with vagrant images, and package the stuff in it, rather than building it when people download and install etc
KH: insights being made into the pros/cons of these 3, and we should write down a specific list, for the 2 audiences (daily professional versus occasional contributor?). Start with a minimal core.
Moritz: Should discuss exactly what we want to achieve. Docker for me was easier to setup on my own and change on my own.
PM: I have 3 environments i use: 1 docker, 1 vagrant-lite, and 1 vagrant-max (restbase parser drains battery fast). 
R: That's not vagrant's fault though.
BD: But it is a use case thing. That's probably the most important thing for us to be concentrating on.
F: I think we all agreed that we should put whatever we are using in the repo of MW core. Easier to maintain, more visible.
AM: Is there any alternative to minikube?
AS: minikube doesn't work for me, but plain k8s does
AM: do we have contacts with upstream? Is that google? We need to be able to spinup a dev env that is close to production cluster, can you help us do that?
BB: I don't have contact with upstream.
BD: the WMF is a member of the CNCF which is the owner of k8s.
AS: Why are we doing k8s? Theoretically because we're going to that in production, but that's a long-term project, so currently it's added complexity.
GG: Action items, usecase creation and analysis evaluation would make sense. K8s? One thing we did learn and want to avoid, is just doing it in production. This is why beta sucks, is because it never became a real part of the pipeline to getting code to production. So we want to have the whole pipeline to be unified, not skipping them on purpose.
Moritz: Hire a software maintainer that supports it for 4-5 years?
AM: There is one thing i love in vagrant, that you can enable roles. If you want to fix search, just enable the search role. As a dev, i just want to work on a single extension, and not worry about all the peripheral services etc.
PM: People mention parity between prod and QA, but I don't think we need that. Xdebug really slows things down a lot, but QA and product managers don't need that, so already there is no parity.
Akoris: What happens when you have to debug stuff?
PM: Usually QA person can verify that on a different type of environment
TG: need to make sure you don't break things related to cluster. Setup a repo, and ability of vagrant to pick the things you want to run, and not the others. I'd like a system that is known to work for testing in production.
Akoriss: The roles thing is our creation, not built into vagrant. Let's have an action item on maintaining that roles thing no matter what our environment is.
JF: Action item: for people: this test stack is meant to test things we're deploying to production or 3rd parties. Note: PHP versions. It should meet the needs of DBAs
GG: Not only use the local dev environment for new PHP app-level code, but also use it for system-level changes as well.
JF: Or... don't care, and don't make so much effort.
Tyler: Make the decision of whether or not to support SRE use cases as well as developer use cases. (ACTION)
....
TT: I think the CI use case is close enough to the .... One of the things that we do in CI is test against multiple PHP versions. This implies that the webserver and PHP version are pluggable at some level. One thing I like is that if someone reports a bug that only affects PHP 7.4, I can do that in Docker. It's still plain MediaWiki, it's just different flavours of plain. E.g. testing nginx, which is something we currently support, and need to be able to test.
(It's TIME!)
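
[Illustration, not part of the session notes] TT's point about a pluggable PHP version and webserver can be pictured as a small matrix of "flavours of plain". The sketch below is only a rough illustration in Python: the function names, the `mediawiki-dev` image name and the PHP 7.2 entry are hypothetical assumptions, and only PHP 7.4, Apache and nginx are mentioned in the notes.

```python
from itertools import product
from typing import Dict, List


def build_matrix(php_versions: List[str], webservers: List[str]) -> List[Dict[str, str]]:
    """Return one environment description per PHP version / webserver pair."""
    return [
        {"php": php, "webserver": server}
        for php, server in product(php_versions, webservers)
    ]


def image_tag(env: Dict[str, str]) -> str:
    """Build a hypothetical container image tag for one environment.

    The "mediawiki-dev" name is an assumption made for this sketch only.
    """
    return f"mediawiki-dev:php{env['php']}-{env['webserver']}"


if __name__ == "__main__":
    # PHP 7.2 is an assumed second version; 7.4, apache and nginx come from the notes.
    for env in build_matrix(["7.2", "7.4"], ["apache", "nginx"]):
        print(image_tag(env))
```

Reproducing a bug that only affects one PHP version would then mean picking a single entry from the matrix rather than rebuilding the whole environment.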

Action items:
- Whatever it is, put it as close to mediawiki core as possible (in the same git repo).
- Dogfooding is important
- related to promotion, and whatever is well-documented will be likely to be used.
- Usecase creation and analysis evaluation.
    - Avoid only in production. [Unclear what this means]
- Keep the idea of Roles, no matter what our dev environment ends up being.
- Make the decision of whether or not to support SRE use-cases as well as developer use-cases.
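
[Illustration, not part of the session notes] The "roles" idea (enable the role you want to work on and get its supporting pieces with it) can be sketched independently of Vagrant or Docker. The Python below is a minimal sketch under stated assumptions: the role table and the `resolve_roles` helper are hypothetical, and it does not reproduce how MediaWiki-Vagrant actually implements roles.

```python
from typing import Dict, List, Set

# Hypothetical role table for illustration: each role lists the roles it depends on.
ROLES: Dict[str, List[str]] = {
    "mediawiki": [],
    "centralauth": ["mediawiki"],
    "echo": ["mediawiki", "centralauth"],
    "parsoid": [],
    "visualeditor": ["mediawiki", "parsoid"],
}


def resolve_roles(requested: List[str], roles: Dict[str, List[str]] = ROLES) -> List[str]:
    """Return the requested roles plus their dependencies, dependencies first."""
    ordered: List[str] = []
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in ordered:
            return  # already scheduled
        if name not in roles:
            raise KeyError(f"unknown role: {name}")
        if name in visiting:
            raise ValueError(f"dependency cycle at role: {name}")
        visiting.add(name)
        for dep in roles[name]:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for name in requested:
        visit(name)
    return ordered


if __name__ == "__main__":
    print(resolve_roles(["visualeditor"]))
```

A developer who only cares about one extension would name that one role, and the resolver works out which peripheral services have to come along.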

Originally from https://etherpad.wikimedia.org/p/WMTC19-T234632

Event Timeline

Actually I really like this idea for a session. It's kind of hard for a newcomer to actually be able to "see the wiki running", even before they start with all the possible services we provide on top of MediaWiki core. I would, however, propose to keep the focus of this on a specific area in the beginning, probably MediaWiki core, as this is something most people will most likely run into, otherwise it could be an endless discussion?

One question or suggestion (maybe?) as well: Is this only focussed on looking at the various ways to set up a local environment we already have? Or does it also ask for new ways of running MediaWiki (and/or services) locally, which we do not consider/use at the moment? I would kind of prefer the first one, otherwise we _could_ end up in a discussion about stuff we actually do not have and probably only add _yet another way_ of developing locally :]

Vagrant was pushed as a standard when it had resources behind it. It has since been, to my knowledge, completely abandoned in terms of resourcing. I'm one of the few that still contributes to it, but nobody officially maintains it, that I know of.

Whichever newer and shinier option might be proclaimed a new standard will meet the same fate if it doesn't have long-term support. This is the main issue to fix, which technology to pick is the easy part.

I'll add that based on my experience there are at least 2 very different needs for local development:

  • rapid MediaWiki development where all you need is MediaWiki + a very small amount of extensions
  • complex multi-service MediaWiki development where you need several optional services mimicking the more complex setup found in Wikimedia production

It can be challenging to find a solution that excels at both.

An additional consideration is whether it's possible to use the same tools used for local development in CI and for QA in beta labs. AFAICT local-charts has the promise to do this.

Whichever newer and shinier option might be proclaimed a new standard will meet the same fate if it doesn't have long-term support. This is the main issue to fix, which technology to pick is the easy part.

Yeah exactly. That said, the technology does matter in that if we're using the same sets of tools for local dev, QA in beta labs, continuous integration and production, then the local dev environment will inherently get more support and resources.

This session, "Our local development environment", sounds like the most on-topic one. It would surely need to be an extended session, or a couple of sessions on two days; it could even be a very focused theme on its own.

As more and more services and technologies enter the orbit of MediaWiki (at least MediaWiki as deployed by Wikimedia), I'm wondering if it is still feasible to try to create and maintain an environment on developers' computers for

  • complex multi-service MediaWiki development where you need several optional services mimicking the more complex setup found in Wikimedia production

Already, if I'd like to get MediaWiki up and running locally, I'll need MySQL/Postgres and a PHP runtime (plus preferably a web server like Apache). Now, if I wanted to start working on the kask session storage service in concert with MediaWiki, I'll also need a Cassandra install and a golang development environment. If I'd like to work with RESTBase, I'll similarly need Cassandra plus a node.js install with NPM. If I'd like to work with graphoid, I'll need various library dependencies[1] together with a node.js install with NPM.

The situation reminded me of an article[2] I had read about testing microservices in a local context, specifically the extract:

… asking to boot a cloud on a dev machine is equivalent to becoming multi-substrate, supporting more than one cloud provider, but one of them is the worst you’ve ever seen (a single laptop)

Like the Vagrant setup mentioned in the referred article, MediaWiki-Vagrant has also been plagued by stability issues and lack of active maintainers.

At the same time, I haven't yet been able to think of a satisfactory solution to this dilemma that could work both for WMF and third-parties as well as volunteers.


[1] https://github.com/wikimedia/mediawiki-services-graphoid#quick-start
[2] https://medium.com/@copyconstruct/testing-microservices-the-sane-way-9bb31d158c16

One thing I think we should improve is the local and CI environments, making it unified in order to have the same experience both developing locally and running tests on the CI.
That's one thing I'm struggling with currently, I have 3 environments locally (vagrant, docker and LAMP) all of them with different configurations and extensions.
When I try to write UI tests I always need to change my code in order to run against a specific environment and then I'm not sure if it's going to be running the same way in the CI.
From my experience, we should stick to one tech (docker seems to be the one used in the CI env) and promote its use for both developers and testers.

Hi all, thanks so much for the feedback! Based on the comments, it sounds like it might be a good idea to:

  • Focus this particular session on the MediaWiki core (@Florian , yes, I think it should focus on looking at the existing methods and not new ones, since there doesn't seem to be a lack of the former :)), giving strong consideration to whether our solution is usable for CI as well.
  • Add a couple more sessions for the other suggestions. My current thoughts are:
    • Challenges surrounding adoption and long-term maintenance of the "standard" decided upon, and how to overcome them
    • Complex multi-service MediaWiki development - feasibility and options

Please let me know what you think before this Friday, and I'll make the changes then.

yes, I think it should focus on looking at the existing methods and not new ones, since there doesn't seem to be a lack of the former :)

Considering this comment as well:

Whichever newer and shinier option might be proclaimed a new standard will meet the same fate if it doesn't have long-term support. This is the main issue to fix, which technology to pick is the easy part.

I would probably not be that strict anymore about adding new(er) technologies apart from what we already have. My only fear would be that a lot of people throw "their" favourite "way of running locally" into the discussion (which is not bad on its own), but it's most likely pretty difficult to moderate something like that into a valuable and effective direction, so that we actually have something at the end where everyone can agree that it fulfills our requirements (-> requirements need to be decided on as well, in order to actually understand what we're looking for, see more in the next section :P).

Complex multi-service MediaWiki development - feasibility and options

I kind of know what you're talking about here, but together with the things pointed out by others, I would argue that it's actually kind of difficult to have a "production-like" environment ready on your local computer. I would also try to argue about whether that is really what one would like to have. Actually, and that is my impression, most people (and I intentionally do not say all of them) will not work with _all_ services running in production, so they, in most cases, will not rely on the full set of services, infrastructure, applications and stuff running in production, but only on a subset.

And also here: Most developers will probably focus on one or maybe two different areas in the Wikimedia technical technology-landscape (maybe tools, maybe MediaWiki core/extensions and maybe a separate service). So, looking for a solution which fulfills the full set of production services will most likely end up being complex in itself as well (as it tries to imitate a complex system). Maybe it's a better way to have a different set of tooling and/or local environment for different technology areas (maybe or maybe not relying on the same local infrastructure basis, like docker or whatever) and make it easier for developers to actually switch between them.

However, this is actually going too deep into the "possible solutions area", which I should try to keep for the discussion in Atlanta :P My vote here would be to focus on the "80% solution" to try to fit most WMF/WMDE staff, volunteer developers and third parties, on which we can then try to build in order to fulfill more use cases as well (without making the bare minimum a complex and difficult-to-understand tooling solution).

P.S.: I would also say that a person who actually already needs to understand the complexity of the production-like environment (which also may change invisibly to most of us) is also able to adapt this complexity to find a solution to reflect it locally for themselves as well, or at least would be OK with having a completely different solution to replicate the systems locally than if you "only" need to be able to run MediaWiki and some nodejs services?!

Here is some existing work we (Releng) have done on the local development environment recently:

There was a survey on developer satisfaction, and I also did some interviews in order to better understand what challenges we needed to address for the local development environment. You can see the results and interpretation of requirements on the Developer Satisfaction Page.

In order to address some of the issues that came up and to try and align the environment with the future plans for CI and production we worked on a proof of concept using minikube called local-charts. I think this work will be useful as we continue with our k8s based CI implementation and for anyone who wants to run services in k8s or share a particular configuration of services, but we are still discussing whether this implementation is ideal for doing development on their own machine. Some of the reasons are that

  1. It introduces a lot of tools that may be unfamiliar to people
  2. It doesn't reduce resource usage (unless run somewhere else)
  3. The abstraction of the configuration for the services can be confusing

I think @TK-999 has a good point about trying to maintain an environment on developers' computers.

Lately we've been discussing the idea of keeping things simpler by providing vm images or snapshots for different development purposes with the projects/tools they might need to develop on pre-configured and run-able, and letting them customize from there. That would allow people to get up and running quickly and be flexible without adding unnecessary complexity. Of course, we still need to solve the issue of being able to run tests with the expectation that they match what happens in CI.

Jeena's comment above covers it (edit: "it" being recent RelEng work) pretty well. We have a local k8s proof-of-concept with Minikube that I think could be used for basic MediaWiki development right now, but the work has led to questions about complexity both for users of the development environment and for maintainers of the environment itself.

We've also been talking about whether we should provide the option to use cloud resources for hosting dev environments (i.e. a developer's DigitalOcean or AWS account, potentially Foundation-provided resources, etc.). I think there's a lot of potential in that both for hosting setups that are too heavy for laptops and for sharing development resources, but it's also something of a can of worms.

@brennen @jeena would @josephine_l's suggestion work for you?

For newcomers, getting started in a situation that is easy to set up and has proper defaults (config and extensions) is most important. To me that means as much Composer-driven installation and config as possible (because it's the PHP ecosystem default). Where Composer is not possible (due to services in a different ecosystem etc.), Composer should inform users what to do next.

For advanced users, in my personal experience, the biggest issue is the gap between a local system and WMF systems. I always struggle with the balance between the complexity of my dev environment (aka breaks often and quickly) and getting the stuff that I want to work with installed. Vagrant just introduced a 3rd environment in my experience.

I think the suggestion by @josephine_l is a good one

@greg @josephine_l @jeena: I think multiple sessions is probably a fine idea, and those seem like reasonable starting points for discussion.

josephine_l renamed this task from Wikimedia Technical Conference 2019 Session: Our local development environment to Wikimedia Technical Conference 2019 Session: Local development environment - MediaWiki core. Oct 13 2019, 10:41 AM
josephine_l updated the task description.

Thanks for the feedback! As proposed above, I have edited the title of this session to focus on MediaWiki core, and added two sessions: T235371 and T235372 for discussing adoption/long-term maintenance and complex multi-service development.

Is anyone interested in leading or co-leading any of these sessions? :)

I'd argue there are three use cases here:

  • Development environments specific to some project. (If you want to develop MediaWiki core, that's usually just MediaWiki core. If you want to work on Echo, that's MediaWiki + Echo + CentralAuth + multiple wikis. If you want to work on VisualEditor, that's MediaWiki + VisualEditor + Parsoid + maybe Restbase. Etc.) MediaWiki core is not really special among those, other than being simpler.
  • An environment that reflects Wikimedia production to the extent feasible, for integration testing, reproduction of complex bugs etc.
  • The ability to set up a complex system specific to some situation (e.g. when having to test the effect of a change on some production setup that's not Wikimedia production) and/or add/remove single extensions with little resource usage overhead (no separate VM for every single extension - this is important for people code reviewing random volunteer patches). Vagrant currently supports this and AIUI for local-charts this will be impossible (in exchange it will be a lot more robust).

So I'm not sure core vs. multi-service is the best way to split up the topic.

… asking to boot a cloud on a dev machine is equivalent to becoming multi-substrate, supporting more than one cloud provider, but one of them is the worst you’ve ever seen (a single laptop)

I would argue that's a good thing if you do actually want to be multi-substrate (which we do: we want to support Wikimedia production, Wikimedia Cloud Services, we want it to be easy for third parties to set up MediaWiki with a single click on AWS etc.): it means that you don't have to maintain a separate multi-substrate framework and local development setup, the local setup is just one of the substrates.

So I think there is a lot of potential synergy in having a development setup that is also a low-key production setup (an alternative to or replacement for the current shared hosting setup, which prevents the use of most mainstream MediaWiki features). It's unfortunate that that topic is not currently covered by any of the sessions (maybe T234644: Wikimedia Technical Conference 2019 Session: Release "strategies" for MediaWiki and other elements of Wikimedia platform, for safe and efficient deployment and hosting touches on that topic a bit).

I'd also add that Vagrant was a local development environment for both MediaWiki (and related repos) and Wikimedia operations Puppet. It's not clear what's supposed to replace it for the latter (admittedly niche) use case.

Vagrant was pushed as a standard when it had resources behind it. It has since been, to my knowledge, completely abandoned in terms of resourcing. I'm one of the few that still contributes to it, but nobody officially maintains it, that I know of.

RobLa gave me pretty wide discretion to work on MediaWiki-Vagrant, but it was never a named goal for a quarter or managed as a "real" product. Something good could still be salvaged from mw-v in my opinion, but it would take investment of at least 1 FTE and a product vision.

We're still looking for a session owner for this session, as well as T235371 and T235372, in order for the proposed sessions to go ahead. @kostajh @Florian @hashar @jeena @Tgr @bd808 would you be interested in leading any of these sessions?

So I'm not sure core vs. multi-service is the best way to split up the topic.

Thanks for the feedback Tgr. I wish I could follow up on this, but I think we can't make major changes to session titles after last Sunday, sorry. :( The timeline does feel rather tight to me, too. Perhaps we could address these issues as subtopics in T235372?

@kostajh @Florian @hashar @jeena @Tgr @bd808 would you be interested in leading any of these sessions?

The only local environment I'm truly familiar with is Vagrant, and it has been made clear that it is not going to be the future, so I wouldn't be a good leader for this.

@josephine_l I'll volunteer to lead or co-lead this session and T235372.

@jeena Awesome, thank you! :) I will add you as a session leader for both, please feel free to fill out the rest of the template as you see fit.

I'm not attending TechConf this year, but the lack of a common, well-supported local development environment has been a huge pain point during my time here.

I tried to use MW-Vagrant when I first joined. The "roles" system is a nice feature to have, but in practice getting a lot of complex dependencies to work together (Wikibase, Elasticsearch, etc.) was not possible with my limited understanding when I was starting out. I quickly gave up on Vagrant and made a few abortive attempts to run everything natively before I settled on using the unofficial mediawiki-docker-dev environment.

This is a great project, but I think that officially adopting it (and improving documentation, troubleshooting, etc.) would make it better. Docker is familiar to most developers these days, and it's pretty easy to adapt to various use-cases. Adding an Elasticsearch service to the existing mw-docker-dev framework was very simple, for example. Adding more documentation, perhaps a narrative tutorial for beginners, and a series of "recipes" for how to support the requirements of various extensions would be very helpful in my opinion. The performance on Mac machines is pretty terrible, but I'm willing to live with it to have a stable environment that is easy to spin up or trash as needed.

I'm curious about the Kubernetes-based work that Release Engineering is doing, but I'd be concerned that Kubernetes might be overkill for local development.

One thing that I have noticed while working on a project recently is that it's all set up to be deployed via Helm to Kubernetes, but I still do not use Kubernetes in any variety for development; I just use docker-compose.

I have been away for a while and am very keen to try out the new kubernetes / charts stuff that has been worked on and I'm sure I'll have some more thoughts after that.

I think the mediawiki-docker-dev environment has potential and there are already many tasks filed and discussions around what could be done to improve it greatly.

As for performance on Mac, I believe this is something that can be worked on. From what I have seen, it generally needs some fine-tuning of which files are mounted, with what settings, etc., all because everything is being run in a VM (unless I'm mistaken).

Thanks @brennen ! I have added you as co-lead.

debt triaged this task as Medium priority. Oct 22 2019, 6:54 PM

(Programming note)

This session was accepted and will be scheduled.

Notes to the session leader

  • Please continue to scope this session and post the session's goals and main questions into the task description.
    • If your topic is too big for one session, work with your Program Committee contact to break it down even further.
    • Session descriptions need to be completely finalized by November 1, 2019.
  • Please build your session collaboratively!
    • You should consider breakout groups with report-backs, using posters / post-its to visualize thoughts and themes, or any other collaborative meeting method you like.
    • If you need to have any large group discussions they must be planned out, specific, and focused.
    • A brief summary of your session format will need to go in the associated Phabricator task.
    • Some ideas from the old WMF Team Practices Group.
  • If you have any pre-session suggested reading or any specific ideas that you would like your attendees to think about in advance of your session, please state that explicitly in your session’s task.
    • Please put this at the top of your Phabricator task under the label “Pre-reading for all Participants.”

Notes to those interested in attending this session

(or those wanting to engage before the event because they are not attending)

  • If the session leader is asking for feedback, please engage!
  • Please do any pre-session reading that the leader would like you to do.

During planning discussions, it came up that we'd like to have this session happen prior to T235372, on the theory that there's a natural progression from one to the other.

@Petar.petkovic maybe your article with the installation instructions will be useful here?

During planning discussions, it came up that we'd like to have this session happen prior to T235372, on the theory that there's a natural progression from one to the other.

I agree that this would be best, could this be arranged @greg ?

@Petar.petkovic maybe your article with the installation instructions will be useful here?

I don't think this ticket is about creating tutorials. We already have that for MediaWiki core, which seems to be the focus of this ticket.

During planning discussions, it came up that we'd like to have this session happen prior to T235372, on the theory that there's a natural progression from one to the other.

Currently they are, yes, even separate days: https://www.mediawiki.org/wiki/Wikimedia_Technical_Conference/2019/Program

Krinkle updated the task description.

I see "The definition and code for the basic MediaWiki environment should live in the core repo." is listed as an action item, but I see little in the notes to justify that.

This would apparently pull in extensions, services, and probably a whole LAMP stack inside a VM too. Why should the configuration for it be in mediawiki/core rather than having its own repo that pulls in mediawiki/core in the same way it does everything else? That would make a lot more sense to me, plus it would avoid adding more clutter to mediawiki/core.

so that it gets updated in lock-step with MediaWiki itself

What is going to be in this that it's going to need many updates in lockstep with MediaWiki itself?

We have a similar situation with the mediawiki/vendor repo, but that is handled without dumping it all into mediawiki/core.

Easier to maintain, more visible,

I guess the "maintain" part is the same as the first quote?

As for "more visible", I'd guess that clear documentation on mediawiki.org would probably do far more for visibility than having more cryptic configuration files scattered in mediawiki/core's root directory alongside composer.json, .editorconfig, .eslintrc.json, .fresnel.yml, Gruntfile.js, jsduck.json, .mailmap, package.json, .phpcs.xml, .stylelintrc.json, .travis.yml, and a bunch of others.


I don't have much to add to the rest of the discussion, as I'm unlikely to use it for the most part. My (complicated) local installation already does what I want for the most part. I only occasionally use mediawiki-vagrant for things like Wikibase, VE, and Flow where I've never had enough need to figure out their complex setup requirements.

I see "The definition and code for the basic MediaWiki environment should live in the core repo." is listed as an action item, but I see little in the notes to justify that.

Sorry, the notes record the outline of the discussion, but don't explain much of the nuance, you're right.

This would apparently pull in extensions, services, and probably a whole LAMP stack inside a VM too. Why should the configuration for it be in mediawiki/core rather than having its own repo that pulls in mediawiki/core in the same way it does everything else? That would make a lot more sense to me, plus it would avoid adding more clutter to mediawiki/core.

I'm sure others can justify this better than I can, but the main consensus view was around making there be "just one thing" to download, rather than the split we currently have with MWVagrant, with the various docker-based attempts, and others.

so that it gets updated in lock-step with MediaWiki itself

What is going to be in this that it's going to need many updates in lockstep with MediaWiki itself?

Every change to config risks (and occasionally does) breaking MW-Vagrant. A cause of much pain and misery for "new" devs, in the experience of several participants (including me).

We have a similar situation with the mediawiki/vendor repo, but that is handled without dumping it all into mediawiki/core.

That's a quite different use case, at least as far as we saw it at TechConf. This is talking about development images for MW itself and maybe a handful of extensions, not for Wikimedia production hosting/development.

The future of prod(-like) images, to be discussed later this week, is that vendor is very much going to be bundled into the image (and that hacky repo itself likely deleted).

I'm sure others can justify this better than I can, but the main consensus view was around making there be "just one thing" to download,

Wouldn't the "just one thing" be the repo that has the docker-or-whatever configuration? Then when you run docker-or-whatever, that pulls in MediaWiki along with the extensions, services, LAMP stack, and so on.

That's how mediawiki-vagrant works, for example: You do one git clone to check out the mediawiki/vagrant repo, then run ./setup.sh (to configure vagrant) and vagrant up. You never have to manually download mediawiki/core.
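For readers who have not used it, the flow just described can be sketched roughly as follows. This is only an illustration: the clone URL and the `run` / `bootstrap_mw_vagrant` helper names are assumptions added here, not something taken from the thread. The steps are printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the MediaWiki-Vagrant bootstrap flow described above.
# Steps are only printed by default; run with DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it in the current shell.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: directory to clone into (hypothetical default: ./vagrant)
bootstrap_mw_vagrant() {
    target="${1:-vagrant}"
    # Assumed clone URL for the mediawiki/vagrant repo.
    run git clone https://gerrit.wikimedia.org/r/mediawiki/vagrant "$target" || return 1
    run cd "$target" || return 1
    run ./setup.sh || return 1   # configures Vagrant
    run vagrant up               # brings up the VM; mediawiki/core is never cloned by hand
}

bootstrap_mw_vagrant "$@"
```

The point of the sketch is that the only repository the developer clones is the environment repo; everything else is pulled in by the environment itself.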

rather than the split we currently have with MWVagrant, with the various docker-based attempts, and others.

The "problem" there is that there are several competing solutions. You don't have to download all of them, just pick one. Or install MW manually without downloading any of them.

We could very easily choose just one container-based dev environment solution to support without dumping that solution into mediawiki/core. And we could still have the "several competing solutions" problem even if some or all of them did live in mediawiki/core.

Every change to config risks (and occasionally does) breaking MW-Vagrant. A cause of much pain and misery for "new" devs, in the experience of several participants (including me).

Every change to config risks breaking my manual local setup, Beta Cluster, and Wikimedia Production too. Very often it doesn't, though, because we're typically careful to choose good defaults when changing the config.

I don't use MediaWiki-Vagrant often enough to speak to how often that breaks, although I wouldn't be surprised if it would break about as often even if it were part of mediawiki/core.

That's a quite different use case, at least as far as we saw it at TechConf. This is talking about development images for MW itself and maybe a handful of extensions, not for Wikimedia production hosting/development.

Not so different, I think. The problem is "I need to use compatible versions of MediaWiki and X together". What makes X="docker-or-whatever config" that much different from X="composer libraries", or X="operations/mediawiki-config", or X="any random extension or service"?

The future of prod(-like) images, to be discussed later this week, is that vendor is very much going to be bundled into the image (and that hacky repo itself likely deleted).

But I'd guess we still won't be merging everything into the mediawiki/core repo. I'd guess the "prod(-like) images" will be built by something checking out operations/mediawiki-config, the right versions of core and extensions, and whatever else, running something like composer update --no-dev, and bundling the result. I'd further guess that configuration needed for the "something" to work will be in operations/mediawiki-config, not mediawiki/core.

Wouldn't the "just one thing" be the repo that has the docker-or-whatever configuration? Then when you run docker-or-whatever, that pulls in MediaWiki along with the extensions, services, LAMP stack, and so on.

One advantage of having the docker-compose configuration live in the repo root is that it's a lot more intuitive to use tab completion for executing commands (e.g. docker-compose exec {service} ... will allow for tab completion of paths in the MediaWiki core repo, and then the command is executed at the proper directory level in the container). Yes, there are various workarounds for when you need to execute at a different directory level, but it's not great.
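A minimal sketch of what that looks like in practice, under some assumptions: the compose file sits in the root of the core checkout, the service is named `mediawiki`, and the checkout is mounted as the container's working directory. None of these are confirmed by the thread, and `mw_exec` is a hypothetical helper. As written it only prints the command; set `DRY_RUN=0` to run it.

```shell
#!/bin/sh
# Sketch: running a core maintenance script via docker-compose from the repo root.
DRY_RUN="${DRY_RUN:-1}"
SERVICE="${SERVICE:-mediawiki}"   # assumed service name in docker-compose.yml

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run a command inside the service container. Repo-relative paths such as
# maintenance/update.php can be tab-completed on the host and, given the
# assumptions above, resolve to the same file inside the container.
mw_exec() {
    run docker-compose exec "$SERVICE" "$@"
}

mw_exec php maintenance/update.php --quick
```

If the compose file lived in a separate repo instead, the host shell's working directory would no longer line up with the container's, which is the inconvenience described above.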

A potential compromise might be to pull in the docker-compose.yml file and any supporting files via a composer library, so we'd instruct developers to add mediawiki/dev-environment to their composer.local.json, or just include that line directly in the dev requirements of MediaWiki's composer.json. The downsides are that 1) we would require users to have PHP installed, which is maybe not the worst problem, but it would be nice not to make that a requirement, and 2) discoverability is again an issue, unless it's part of core's dev requirements.

Thanks for making this a good session at TechConf this year. Follow-up actions are recorded in a central planning spreadsheet (owned by me) and I'll begin farming them out to responsible parties in January 2020.