
How should Wikimedia software support non-Wikimedia deployments of its software?
Closed, ResolvedPublic

Assigned To
Authored By
Gilles
Sep 21 2015, 7:52 AM

Description

Session overview

Currently scheduled for Monday, 3:40pm PST in Robertson 1. Check the official schedule to confirm.

Wikimedia software like MediaWiki is widely used for non-Wikimedia content (e.g. intranet wikis, hobbyist wikis) often using infrastructure quite different from the Wikimedia production environment (e.g. different database software, shared hosting). To what extent should Wikimedia software development serve the needs of running Wikimedia software in non-Wikimedia environments?

Details

Definition of the problem

"Non-technical installs" includes shared hosting and one-click installs. Support for these ways of installing mediawiki has been on a volunteer basis and not a clear WMF objective. We need clarity on this topic, as it is often brought up - possibly incorrectly - as a potential blocker for mediawiki architectural changes.

Users currently using shared hosting are looking for a simple, cheap solution to easily host a wiki. MediaWiki in its current form seems to be the favored wiki to install for that use case.

VPS and container hosting can be cheap, but they require a technical literacy that this audience doesn't have. They're not a viable alternative on their own. They would need an ecosystem of simple installation and maintenance to go with them, which doesn't seem to exist yet.

Expected outcome at the summit

Form consensus on the following questions:

  • What subset of mediawiki functionality should get "Grade A" support for installs on lightweight shared hosting environments? Which extensions should that include?

The same way that we've defined grade support for browsers, we could do the same for limited hosting environments. Right now the picture is very unclear and MediaWiki developers aren't told explicitly to what extent those use cases are expected to be supported.

  • Should the WMF, as part of its mission to promote such content, offer one-click self-serve hosting for wikis focused on open-licensed educational content?

These users are currently hosting their projects themselves. Such a platform might serve as a better incubator than the one we currently have, which seems to be heavily focused on new languages for Wikipedia. It would also make us build the technology that commercial hosting environments could reuse to provide one-click wiki installs to their customers.

  • Beyond open-licensed educational content, what could we do to reconnect with the large audience of non-technical mediawiki installs on commercial hosting?

We find them hard to reach, but maybe they feel the same way. The main value for the mediawiki project to have these thousands of installs in the wild is to create a feedback loop for the project. Not only automated (which versions are running, which extensions, maybe some error reporting, etc.), but also maintaining the human connection of these users reporting bugs, engaging with the mediawiki community, etc. The fact that we find that audience very hard to reach at the moment is probably a sign that we can improve the situation. We're not getting the value we should for the mediawiki project with the current disconnect.

  • What technical projects could be done to propose a viable alternative to shared hosting on richer platforms like container hosting?

Simplicity of installation, maintenance and upgrade would be key. The alternative should be as simple and cheap for a non-technical person to use as shared hosting is today. This could very well happen on the same commercial hosting services they are using now, but there is clearly a long way to go before it's made easy for the hosts to offer such a service. Access to newer functionalities like VE should be a big selling point for all the stakeholders.

Current status of the discussion

We've gathered various stakeholders from the current shared hosting landscape to participate in this discussion. We could probably reach more, but this is a good start. Shared hosting constraints, and why people use shared hosting, seem to be clearer subjects now. The next step will likely be to hold office hours to explore individual questions listed above.

Links to background information

Related Objects

Mentioned In
T154535: 2017 dev summit unconference proposal: VisualEditor included: Can we have simpler installs, but all the features?
T114542: Next Generation Content Loading and Routing, in Practice
T119403: Meeting with MW Stakeholders and WMF
E121: WikiDev '16 Agenda Bashing Session (2015-12-16, 22:00 UTC on #wikimedia-office)
T116024: WikiDev16 program
T119593: Define the list of "must have" sessions for WikiDev '16
T119032: WikiDev 16 working area: Software engineering
T119029: WikiDev 16 working area: Content access and APIs
T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services
T114803: Service-Oriented Architecture, quo vadis?
Mentioned Here
T119032: WikiDev 16 working area: Software engineering
T118932: RfC: Raise MediaWiki's PHP version requirement to 5.5 and update coding standards
T12668: Detect and notify user of extremely low (probably accidental) POST limits
T37001: features not reported due to unnescessary violation of the open_basedir restriction
T42966: 0 for the SQLite database name (as in 0.sqlite) breaks installation without warnings
T43896: help installer deal with memory_limit being too low to load installer i18n
T50149: Persistent "Unable to write to directory" Installer error with SQLite backend when SELinux is enabled
T51472: Installation fails if passthru is blocked
T51531: Query execution timeout in restricted environment may go undetected
T100768: Provide English-only MediaWiki installer
T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services
T92826: Ready-to-use Docker package for MediaWiki
T101046: [EPIC] Use MCS as parser for main content in mobile web
T65699: Parsoid and Tidy differ in how they deal with misnested tags
T55784: [EPIC] Use Parsoid HTML for all page views
T104789: Install composer on tools-login
T91104: PHP thumbnailer as a service
T56425: Provide an opt-in ability to register the user's MediaWiki installation
T87774: Evaluate and decide on a MediaWiki distribution strategy targeted at VMs
T92971: Examine ways to make using MediaWiki-Vagrant secure (or at least not wildly insecure) on a host exposed to the Internet
T96903: Identify and prioritize architectural challenges

Event Timeline


In addition, if a wiki installs Parsoid, and you are okay with serving read views from Parsoid HTML (which is one of the goals for Wikimedia wikis as well), there is no use for Tidy or any of its replacements either.

This is the first time I've heard this goal -- serving read views from Parsoid -- stated. Is this an explicit goal of WMF engineering? Could you point to where that goal is stated?

If that is a WMF Engineering goal, then it would clarify a lot of what those of us outside have been hearing.

To be clearer, I should have said it is one of the parsing team's goals (T55784: [EPIC] Use Parsoid HTML for all page views), and it seems to be something that reading-web is interested in (T101046: [EPIC] Use MCS as parser for main content in mobile web). Also see Parsing team's 2015/16 Q2 goals.

In addition, if a wiki installs Parsoid, and you are okay with serving read views from Parsoid HTML (which is one of the goals for Wikimedia wikis as well), there is no use for Tidy or any of its replacements either.

Does Parsoid do template expansion now without calling out to MW? My understanding was that (maybe only in the past) it relied on the PHP for template expansion. When people set up a wiki using MediaWiki and not DokuWiki or some other -- more simple -- tool, they typically want it to act like Wikipedia. Right now, that means using a number of PHP extensions during the initial set up.

As time goes on (and looking at mature 3rd party MW installations shows this), they'll install other extensions like Semantic MediaWiki or write their own -- usually in PHP.

The ecosystem right now means that most people who want to use MediaWiki will need to use PHP.

So, in short, you are not looking at mixing Java, node.js and PHP in the same mediawiki installation.

You're right, if someone is looking for basic wiki functionality, they can get away with node.js. But why would someone do that when all the extensions they want are written in PHP?

I don't know of anyone except those people working on Parsoid who would consider setting up a Parsoid-only wiki. Maybe you'll eventually have a compelling use case, but, right now, people who are interested in MediaWiki are more interested in eschewing node.js than PHP.

Congratulations! This is one of the 52 proposals that made it through the first deadline of the Wikimedia-Developer-Summit-2016 selection process. Please pay attention to the next one: > By 6 Nov 2015, all Summit proposals must have active discussions and a Summit plan documented in the description. Proposals not reaching this critical mass can continue at their own path out of the Summit.

In addition, if a wiki installs Parsoid, and you are okay with serving read views from Parsoid HTML (which is one of the goals for Wikimedia wikis as well), there is no use for Tidy or any of its replacements either.

Does Parsoid do template expansion now without calling out to MW? My understanding was that (maybe only in the past) it relied on the PHP for template expansion.

Yes, Parsoid queries the Mediawiki API to talk to the preprocessor and to handle extensions installed on the wiki.
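As an illustration of the kind of call an external service can make to have MediaWiki expand templates, here is a minimal sketch against the action API's `expandtemplates` module. The wiki URL and the helper names are made up for the example, and whether Parsoid issues exactly this request is an assumption; the sketch only builds the URL and reads a response of the documented shape.

```python
from urllib.parse import urlencode


def build_expandtemplates_url(api_url, wikitext, title="Main Page"):
    """Build an api.php URL asking MediaWiki to expand templates in `wikitext`.

    `api_url` is a placeholder for a wiki's api.php endpoint (hypothetical here).
    """
    params = {
        "action": "expandtemplates",
        "text": wikitext,
        "title": title,       # page context used when expanding
        "prop": "wikitext",   # ask for the expanded wikitext only
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def extract_expanded_wikitext(payload):
    """Pull the expanded wikitext out of a decoded JSON response."""
    return payload["expandtemplates"]["wikitext"]


# Hypothetical endpoint and a hand-written response, for illustration only.
url = build_expandtemplates_url("https://wiki.example.org/w/api.php", "{{foo}}")
sample_response = {"expandtemplates": {"wikitext": "expanded text"}}
expanded = extract_expanded_wikitext(sample_response)
```

The point for this discussion is that the expansion itself still happens in PHP on the wiki, so a node.js service that relies on such calls cannot replace the PHP side.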

When people set up a wiki using MediaWiki and not DokuWiki or some other -- more simple -- tool, they typically want it to act like Wikipedia. Right now, that means using a number of PHP extensions during the initial set up.

As time goes on (and looking at mature 3rd party MW installations shows this), they'll install other extensions like Semantic MediaWiki or write their own -- usually in PHP.

The ecosystem right now means that most people who want to use MediaWiki will need to use PHP.

So, in short, you are not looking at mixing Java, node.js and PHP in the same mediawiki installation.

You're right, if someone is looking for basic wiki functionality, they can get away with node.js. But why would someone do that when all the extensions they want are written in PHP?

I don't know of anyone except those people working on Parsoid who would consider setting up a Parsoid-only wiki. Maybe you'll eventually have a compelling use case, but, right now, people who are interested in MediaWiki are more interested in eschewing node.js than PHP.

I made no claim about dropping PHP for node.js. There is clearly some misunderstanding. So, let me make another attempt:

As I tried to explain in T65699#1695613 (and referred you to it), a Java service is not a requirement. You can get away without it by leaving old Tidy in place, or replacing it with tidy-html5 (if that serves the purpose), or a PHP HTML5 library (when -- I suppose it is a matter of when, not if -- one such becomes available in the future). Clearly, continuing to use old Tidy is not the best option, but that option exists for wikis that want it.

But, if you install VE, you are going to install Parsoid. In that case, if you serve read views from Parsoid, you don't need the Java HTML5 parsing service or tidy-html5 or old tidy either.

So, all I was trying to argue was: you are not compelled to add Java to the mix. In VE-installation scenarios, node.js + PHP should suffice.

ssastry writes:

I made no claim about dropping PHP for node.js. There is clearly some misunderstanding.
...
So, all I was trying to argue was: you are not compelled to add Java
to the mix. In VE-installation scenarios, node.js + PHP should
suffice.

My apologies for misunderstanding. I thought you were saying that you
could have a node.js-only wiki.

I'm aware that efforts are being made to provide a pure-php alternative
to HTML5 parsing. My confusion was caused by mis-interpreting this statement:

In addition, if a wiki installs Parsoid, and you are okay with serving
read views from Parsoid HTML (which is one of the goals for Wikimedia
wikis as well), there is no use for Tidy or any of its replacements
either.

Since I was talking about mixing platforms, I thought you were saying
"Parsoid is all you need! PHP isn't needed, so you can just run
pure-node.js".

Again, I apologize for the misunderstanding.

I was reading an article about Wikipedia this morning and came across this quote.

"As Wikipedia grew in stature, its community began to look beyond the limits of mere words and links to images and infoboxes. Culture keeps creeping in. People keep hacking. And they keep pulling in ideas from elsewhere. “We need chessboards,” someone thought, and then after some experimentation, there were chessboards, only now it wasn’t knowledge anyone could have: You needed to know how to code using the right template."

-Paul Ford for The New Republic.

One phrase struck a chord with me. "only now it wasn’t knowledge anyone could have".

If MediaWiki continues to be incomplete out of the box (It's hard to consider that a non-VE wiki would be viable now and in the future) and requires more and more services to be an actually usable product, then we are making something where it isn't "knowledge anyone could have". It's something where only a few can understand. Maintain. Heck, just install! As a product of the Wikimedia movement, MediaWiki should continue to embody this tenet. Removing shared hosting support without a simple, supportable solution for third-party users is a bad idea.

A VM might be just as cost effective - maybe even less expensive - but adds much more complexity - knowledge of command-line, server configuration, security, etc. Are we willing to drop folks who could handle a managed shared host into that abyss? :)

OVH has their entry-level VPS at $3.49/month: https://www.ovh.com/us/vps/ I've seen even cheaper, but from small operations that didn't have good reliability.

I ran my IRC bouncer on OVH for 9 months and recently switched off to another provider because their VM uptime and network connectivity were honestly not worth $3.49/mo.

Another drawback with most very very cheap VPS providers is that they tend to run under shared kernel virtualization services (e.g. OpenVZ) which prevent running additional "containers" (e.g. LXC, Docker) inside the rented VM. Many of the packaging scenarios that I have heard discussed center around providing some sort of federated container system to sidestep the cross-distro compatibility issues inherent in the deb/rpm/whatever native packaging debate.

I don't disagree that there are nice VPS solutions available for the cost of shared hosting 5-10 years ago, but the VPS solutions that are on par with the base cost of a modern shared hosting service typically are not suited to running anything more complex than a single application. These types of providers are also much less likely to provide shared services such as database servers and caching systems as a part of their offered solutions.

I'm not personally a huge fan of shared hosting for non-trivial websites and am not taking a strong position for the deliberate retention of shared hosting support in the MediaWiki product family, but I do find the "VPS hosting is really cheap" argument to be a weak one.

Hi there, I was led to this thread from a friend. I work at Bluehost. I did a quick amount of digging to find we have thousands of installs on our shared platform and we are installing 100s more per month (our most popular wiki install). This is just people using the one-click installer BH offers, others installing manually are not included in the installs per month number. This leads me to believe that users do want to use this project on a shared platform. I think @Ckoerner nailed it when he said that VMs come with an added layer of complexity. Many users do not want to deal with that and if they do find a tutorial or something online to help them get it going, they will likely not keep things updated and secure. IMO this would lead people to believe that the project itself may not be secure when in fact it is their environment (because they don't know how to manage it).

I guess my intent here is just to say that shared users are still using this project and that if a VM becomes a requirement, there should at least be an established upgrade path to help those currently in shared environments or an added incentive to continue supporting shared hosting environments.

Hi there, I was led to this thread from a friend. I work at Bluehost. I did a quick amount of digging to find we have thousands of installs on our shared platform and we are installing 100s more per month (our most popular wiki install).

Thank you Mike, your input is *so* valuable.

Vito

Hi there, I was led to this thread from a friend. I work at Bluehost. I did a quick amount of digging to find we have thousands of installs on our shared platform and we are installing 100s more per month (our most popular wiki install).

Thank you Mike, your input is *so* valuable.

+1. It would be great to hear from folks at GoDaddy, 1&1, and other major shared hosting providers who have 1-click MediaWiki installs on this topic (and in general) as well.

@MarkAHershberger would it be reasonable for the MediaWiki-Stakeholders-Group to make a dedicated effort to find contacts at such companies? I have a vague feeling that WordPress does outreach and evangelism with major hosting providers but I've never seen anyone in the MediaWiki community talk much about that.

@bd808 I'm going to out myself here.

I am the one who asked Mike (@MikeHansenMe) and Mika (@Ipstenu) from Bluehost and Dreamhost respectively to add their comments to this task. I have met both of them through my work with WordPress and thought their input would be valuable in this conversation.

Not to step on @MarkAHershberger's toes, but my interest in this topic is motivated by my participation in the MediaWiki-Stakeholders-Group as well. Small world :)

I like your idea. You also hit on an interesting topic, MediaWiki evangelism. That is something that other open-source projects have dedicated resources focused on. Maybe something that would benefit MediaWiki?

+1. It would be great to hear from folks at GoDaddy, 1&1, and other major shared hosting providers who have 1-click MediaWiki installs on this topic (and in general) as well.

We would be happy to invite these stakeholders to the Summit (all the better if they are based in SF Bay Area or can cover their travel).

The deployment choice isn't limited to shared hosting or a VPS; there's also T87774: Evaluate and decide on a MediaWiki distribution strategy targeted at VMs. As I understand it, with containerization (a blocker of T87774) you would set up container(s) for MediaWiki-on-PHP, a SQL database, and (currently optional) a nodejs server. If T92826: Ready-to-use Docker package for MediaWiki is the approach we choose, Docker container hosting is cheap; Google Container Engine will run five containers for free. People are already offering various containers for MediaWiki and Parsoid.

I have no idea how admins would update containers with security fixes, or really any of this :-)

Over at DreamHost, we also have tens of thousands of MW users and it's our most popular Wiki software. I wouldn't be shocked to find out most people installed it on Shared hosting either and upgraded as needed.

I work at Bluehost. I did a quick amount of digging to find we have thousands of installs on our shared platform and we are installing 100s more per month (our most popular wiki install).

Over at DreamHost, we also have tens of thousands of MW users and it's our most popular Wiki software.

Thanks a lot for your contribution to this discussion, it's the most important one I've seen in many years of MediaWiki bugs and mailing lists.

Could you provide us with a list of api.php URLs for said wikis (at least the public ones)? Or if not, comment on https://www.mediawiki.org/wiki/Requests_for_comment/Opt-in_site_registration_during_installation ? If we have URLs, we will consider statistics about the configurations they use etc. Currently we have a very hard time keeping in touch with your users; we typically just meet a few of them briefly in random places (like StackExchange) when they have an issue.
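For context on what such statistics could be built from: a public wiki's api.php answers `meta=siteinfo` queries with its software version and installed extensions. The sketch below is only an illustration under stated assumptions. The endpoint URL, the helper names and the sample response are invented for the example, and no network request is made.

```python
import json
from urllib.parse import urlencode


def build_siteinfo_url(api_url):
    """Build an api.php URL asking for general site info and the extension list."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general|extensions",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def summarize_siteinfo(payload):
    """Reduce a decoded siteinfo response to the few fields of interest here."""
    query = payload.get("query", {})
    general = query.get("general", {})
    extensions = query.get("extensions", [])
    return {
        "generator": general.get("generator"),   # e.g. "MediaWiki 1.26.0"
        "php": general.get("phpversion"),
        "dbtype": general.get("dbtype"),
        "extensions": sorted(ext.get("name", "") for ext in extensions),
    }


# Hand-written sample response (not data from any real wiki).
SAMPLE = json.loads("""
{"query": {
  "general": {"generator": "MediaWiki 1.26.0", "phpversion": "5.5.9", "dbtype": "mysql"},
  "extensions": [{"name": "ParserFunctions"}, {"name": "Cite"}]
}}
""")

siteinfo_url = build_siteinfo_url("https://wiki.example.org/w/api.php")
summary = summarize_siteinfo(SAMPLE)
```

Aggregating such summaries over a list of URLs would give the version and extension picture described above, though only for wikis whose owners agreed to be listed.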

@Gilles, can this task be assigned to you? Wikimedia-Developer-Summit-2016 proposals need an owner.

Could you provide us with a list of api.php URLs for said wikis (at least the public ones)?

I'm pretty sure security would read me the riot act for that one :) Yes, it's public, but the customers didn't agree to or ask for that. I'll check with them, though, and ask.

FWIW, I'm pro Opt In registration and/or tracking. WordPress does it via their API in order to perform plugin updates, which is a feature I've always wished MediaWiki had. Upgrading extensions has always been annoying unless you use git, and even then there's no way to 'know' if you need an update. I'll leave a comment.

Sorry for being unresponsive @Qgil, I was at a conference all week. My plan is to try to reframe this task based on all the feedback we've received, and to make it conform to the expected format for the summit. The semi-provocative initial title brought a lot of attention and that's awesome, but now that we have more information I hope to synthesize these topics into something clearer and more actionable.

Gilles renamed this task from "The end of shared hosting support?" to "How should the WMF support non-technical mediawiki installs?". Oct 15 2015, 12:51 PM
Gilles updated the task description.

I encourage everyone to read the new description of this task, which is a drastic refocus based on the discussion so far. Criticism on the new content is welcome :)

'Non-technical' seems a bit disingenuous, as there is quite a lot in between a simple expand and go installation and a full server setup, and then there are, beyond that, fully scaling server setups with services and security and distributedness and all the things most people never even think of. Even when you have the required access things like parsoid and elasticsearch can be incredibly difficult to set up, let alone set up correctly (hurr durr let's leave external ports open by default; you can pay money if you actually want security), and in a lot of cases this just plain isn't worth it.

We need some balance of 'worth it' for all the stakeholders.

'Non-technical' seems a bit disingenuous, as there is quite a lot in between a simple expand and go installation and a full server setup, and then there are, beyond that, fully scaling server setups with services and security and distributedness and all the things most people never even think of.

+1, what about "How should the WMF support shared hosting mediawiki installs?", words are important

Support for these ways of installing mediawiki has been on a volunteer basis and not a clear WMF objective

I think this choice of words is unclear. It's not the act of installing we are talking about, but the feature set supported on a platform. When I first read this I thought you were referring to the web installer vs. Debian packages, etc.

I, too, find the new focus unclear. The "questions" are rather answers in themselves and partly exclude each other.
They respectively mean/imply: 1) find a developer solution: [explicitly] drop shared hosting support on some subset of MediaWiki, 2) find an organisational solution: replace [current form of] shared hosts with a competing service, 3) business as usual + more marketing, 4) mix of the preceding.

'Non-technical' seems a bit disingenuous, as there is quite a lot in between a simple expand and go installation and a full server setup, and then there are, beyond that, fully scaling server setups with services and security and distributedness and all the things most people never even think of.

+1, what about "How should the WMF support shared hosting mediawiki installs?", words are important

I think that the focus on shared hosting in the long term is a red herring. We need to address shared hosting issues right now, but that doesn't mean that shared hosting is the only answer to these users' needs. In the current landscape of available hosting solutions, they've picked shared hosting for reasons we've established earlier in this thread. But should a hosting solution better suited to their needs appear, they are likely to migrate from shared hosting and never look back. Let's not forget that people are trying to run customized wikis; the fact that they currently need to learn how to run shared hosting isn't the goal itself, it's a consequence of what they're trying to do.

I think that it's important to start the thinking from the user's needs. Shared hosting isn't a user need, it's a side-effect of economic conditions that make it a tool that many people choose in order to achieve their goals.

  1. find a developer solution: [explicitly] drop shared hosting support on some subset of MediaWiki

Assume good faith :) I see this goal rather as a way to whitelist things that should be properly supported and currently aren't. The status quo is that it's at the developers' discretion to care about shared hosting at all. It leads to very inconsistent practices and poor support. Of course some extensions are likely to end up in the "won't be supported" list, but they're probably not working in shared hosting environments right now. I hope to have a conversation with a wide enough audience to reach a decent consensus on this matter. The current situation consists of people writing ideas about how we could make extension X work in shared hosting and then... nobody works on implementing those ideas. Setting clear expectations is always helpful, and it's never going to prevent anyone from volunteering to make an extension work even if it's not on the official support list at a given time. It might also make some teams realize the demand there is and rethink their position on supporting it.

In T113210#1732685, @Nemo_bis wrote: 3) business as usual + more marketing

I personally don't think that the solution lies in marketing. I have specific ideas to bounce off other people and I think that this is a topic that hasn't been explored much. Being able to bring the disconnected shared hosting community back into the fold to things like phabricator isn't business as usual, it's making sure that they get a better representation in the mediawiki community, which is currently close to zero, with most people talking about that issue on their behalf. It's a tricky problem and I think that having a collective discussion about it could bring specific pain points to light and result in actionable tasks to improve the situation.

I've scheduled an IRC office hour on Nov 19th to explore the third bullet point in the task description, "Reconnecting with the shared hosting community".

I hope that this meeting will help us find ideas in areas as wide-ranging as outreach, documentation or technical solutions to improve the current situation of complete disconnect with that user base.

What subset of mediawiki functionality should get "Grade A" support for installs on lightweight shared hosting environments? Which extensions should that include?

I'd like to hear more on how this might work. How would the community make these decisions?

We find them hard to reach, but maybe they feel the same way. The main value for the mediawiki project to have these thousands of installs in the wild is to create a feedback loop for the project. Not only automated (which versions are running, which extensions, maybe some error reporting, etc.), but also maintaining the human connection of these users reporting bugs, engaging with the mediawiki community, etc. The fact that we find that audience very hard to reach at the moment is probably a sign that we can improve the situation. We're not getting the value we should for the mediawiki project with the current disconnect.

Hello. We're here (and this week, here). :)

What resources are available to further enable this? A community liaison role dedicated to this MediaWiki community?

The current situation consists of people writing ideas about how we could make extension X work in shared hosting and then... nobody works on implementing those ideas. Setting clear expectations is always helpful, and it's never going to prevent anyone from volunteering to make an extension work even if it's not on the official support list at a given time. It might also make some teams realize the demand there is and rethink their position on supporting it.

Would the work before us be clearer with some sort of roadmap of where we (WMF, Wikipedians, MediaWiki users, Extension developers) want to go with MediaWiki? "We want to get to X in the next 6 months and Y in the next 3 years. Here's how we're going to do it. Along the way, we have to make decisions on A, B, and C."

We don't have that now and making a decision like the one we're talking around here is, obviously, going to have impact in the future. How do we know what those impacts will be?

I'm afraid my thoughts are diverging from the initial conversation.

It may be worth merging T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services into this task, or at least considering them together. That gives another viable option for simple single-process installs, even if the core architecture unbundles into a collection of separate services. And it provides shared hosting support via Heroku and friends as well.

It may be worth merging T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services into this task, or at least considering them together.

T87774: Evaluate and decide on a MediaWiki distribution strategy targeted at VMs is already linked by the task description; the task you added is a sensible blocker for it.

I think that behind the issue of installation, we're forgetting the issue of maintenance. While it's possible to resolve the former with packages/containers/specialized installers, the latter is much harder.

It's way harder to investigate issues in an environment where, in addition to PHP files in the webserver root, there are services written in different languages. Keeping software written in multiple languages running inherently requires more knowledge. And Docker can't help you with that. Considering all of this, I'd say that making services mandatory would be the end of MediaWiki as a popular wiki software that people other than the org developing it are using.

I'd say this task is blocked on T118932: RfC: Raise MediaWiki's PHP version requirement to 5.5 and update coding standards, specifically on having some basic information on MediaWiki requirements as suggested in T118932#1834602

I'd say this task is blocked on T118932: RfC: Raise MediaWiki's PHP version requirement to 5.5 and update coding standards, specifically on having some basic information on MediaWiki requirements as suggested in T118932#1834602

@Nemo_bis: that strongly suggests that T118932#1834602 belongs in a task of its own, doesn't it?

Unlikely. It's impossible to document a rationale without knowing one. (Better discuss on that task whether to split it, though.)

There's a conversation on this topic that is starting up at T119032#1864586 now. People who believe we need this session should weigh in on T119032: WikiDev 16 working area: Software engineering.

@ori asked in an IRC meeting a few weeks back whether we could look at WikiApiary to determine if folks have left MediaWiki for other tools. @Krabina posted this on the Semediawiki-user mailing list (via @Nemo_bis on the wikiapiary-l list - it's mailing lists all the way down!). I thought it might be worth sharing here.

http://trends.builtwith.com/cms/MediaWiki/Market-Share

Under "MediaWiki are Losing Customers to"

I'm scheduling two IRC office hours back to back (shouldn't be hard to use the time, we ran over last time) a week from now to continue exploring this topic. Starting with the topic that I had to hold people back on last time and then "eating our vegetables" for the second part: https://meta.wikimedia.org/wiki/IRC_office_hours#Upcoming_office_hours

Monday 2015-12-21 20:00 UTC until 21:00 UTC #wikimedia-office on Freenode

IRC office hour: Shared hosting technical alternatives

During the last office hour on the topic of non-technical mediawiki installs, people seemed very eager to discuss new technical solutions that could offer a viable alternative to shared hosting.

Could new technologies like containers allow for performance/cost ratios comparable to shared hosting? If not, how big would the penalty be? How much maintenance would we have to do to keep deployment on such platforms up to date?

Shared hosting has always suffered from the fact that it's not used at the WMF and therefore only maintained on a volunteer basis. How would things be different with new tech?

Monday 2015-12-21 21:00 UTC until 22:00 UTC #wikimedia-office on Freenode

IRC office hour: Shared hosting support definition

Shared hosting usage is already a reality and we should do a better job accounting for it. Currently mediawiki contributors have no visibility into what should be supported and to what degree. Our browser support is graded and very clear, whereas our server-side support is not: https://www.mediawiki.org/wiki/Compatibility

Should we model our server-side compatibility guidelines on the graded system we have for browsers? If so, what would that look like? How could we break down "shared hosting support" into more discrete server-side capabilities?
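As a starting point for that question, here is a minimal sketch of what a graded server-side support definition could look like if it were written down as data. Everything in it is an assumption made for illustration: the grade names, the capability names (`php`, `sql_database`, `shell_access`, `long_running_services`) and the mapping between them are hypothetical, not an agreed or existing MediaWiki policy.

```python
from enum import IntEnum


class Grade(IntEnum):
    """Hypothetical support grades, loosely mirroring the browser grades."""
    UNSUPPORTED = 0
    BASIC = 1  # core reading and editing expected to work
    FULL = 2   # everything, including service-backed features


# Hypothetical capability names per grade; not an agreed list.
REQUIRED_FOR = {
    Grade.BASIC: {"php", "sql_database"},
    Grade.FULL: {"php", "sql_database", "shell_access", "long_running_services"},
}


def grade_for(capabilities):
    """Return the highest grade whose required capabilities are all present."""
    caps = set(capabilities)
    best = Grade.UNSUPPORTED
    for grade in sorted(REQUIRED_FOR):
        if REQUIRED_FOR[grade] <= caps:
            best = grade
    return best


# Two illustrative environments (again, assumptions for the example).
shared_host = {"php", "sql_database"}
vps = {"php", "sql_database", "shell_access", "long_running_services"}

print(grade_for(shared_host).name)
print(grade_for(vps).name)
```

Writing the definition as a set of capabilities rather than as a list of hosting products would let "shared hosting" be described by what it offers, which is the breakdown the question above asks for.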

The meetings on #wikimedia-office from a few hours ago:

Shared hosting technical alternatives

Meeting started by @Gilles at 20:00:26 UTC.

Meeting summary

Meeting ended at 21:11:40 UTC.

Action items (none)

People present (lines said)

Shared hosting support definition

Meeting started on #wikimedia-office by @Gilles at 21:11:48 UTC.

Meeting summary

Meeting ended at 22:09:09 UTC.

Action items: (none)

People present (lines said)

I read the logs of those two meetings and the summary seems to be something like:

  • Containers are not a solution for shared hosting (different audience)
  • We do not know whether anyone is maintaining packages for debian/npm/composer as work duty
  • Installation needs to be made easier, maybe also automatic or one-click upgrades.
RobLa-WMF renamed this task from How should the WMF support non-technical mediawiki installs? to How should Wikimedia software support non-Wikimedia deployments of its software?.Dec 24 2015, 10:57 PM
RobLa-WMF updated the task description. (Show Details)

Thanks for boiling this down @Nikerabbit! One simple outcome I would suggest is a follow up on @GWicke's suggestion. He proposed that instead of trying to support every possible option, we narrow our focus to a small subset (e.g. three options). What those three options should be was not entirely clear from the conversation.

It might be a fruitful conversation to just build a list of possible audiences, and agree that we've got a reasonably complete list. After the summit, the royal "we" could then iterate on the list, progressing through these interim milestones:

  1. A reasonably complete list of all reasonable audiences to serve (not exhaustive, but not missing anything obvious and important)
  2. A prioritized list of the most important audiences to serve
  3. A decision about how many audiences we plan to serve (i.e. where to draw the "cut line" on the list)

If we could get step 1 of 3 above completed in this session, that would be big progress.

I will definitely be there and try to have useful stats ready from the survey, WikiApiary, etc.

I've prepared some slides for that session, trying to keep it as short as possible: https://docs.google.com/presentation/d/1_UaZXN3udsRq0tvpm3G4AHE_e0WisBWCoQsK3q4uahU/edit?usp=sharing

My goal for the slides was to summarize the enlightening birds-eye view "truths" that have emerged from the discussions here and on IRC, to give everyone some clarity on the matter, and to describe the two big issues where we need to make progress in terms of decisions and commitments.

There's definitely a lot less depth in those slides than some of the more specific points we've explored. But I felt that including those sub-topics in the presentation would eat into the discussion time for the session as well as increase the likelihood of getting rabbit-holed into a specific issue that might not matter that much in the bigger picture.

The current scheduling seems to indicate that I'm going to be off in R2-land while this topic is discussed in R1. So I'm going to write a bit about T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services here and hope that someone will represent for me in the R1 discussion.

Thanks, @Gilles, for adding slides to set the agenda. I hacked together mediawiki-express (in part) to elucidate the role of apache in our current requirements stack. mediawiki-express replaces apache with node as the web server component. This is convenient mainly because it allows the "apache configuration" part of installing mediawiki to be done in a consistent manner with the rest of the packaging (and expressible in a "real programming language", for what it's worth). This also allows multiple independent "servers" to be bundled together for shared hosts, with all of the inter-service routing performed automatically. This enables a new class of "shared hosting" providers -- the group of node/npm-based providers such as Heroku -- while enabling further decoupling of our software into independent services.

But it also begins to address another issue, which is touched upon by @Gilles' last slide. We are factoring more functionality out of core. I see this as largely positive, although our mechanism has been mostly implicit: by restricting the dependencies of core, we ensure that large features (MathJax, Visual Editor, Content Translation, etc) are implemented as extensions. Again, I have no problem with moving away from a single monolithic piece of software.

But we will increasingly have to deal with complicated interdependencies between extensions. Arguably, we already have this, and composer is part of the solution in PHP land. It currently doesn't scale well when multiple languages or web services are involved (i.e., MathJax, VisualEditor, Content Translation).

I started mediawiki-express to explore whether we might have an orthogonal solution in JavaScript-land. I would like to construct, as a proof-of-concept, an npm module for (say) Extension:WikiHiero which would bundle (a) the PHP extension for rendering hieroglyphs, (b) the required client-side JavaScript to edit wikihiero markup with Visual Editor, and (c) the required server-side JavaScript to teach Parsoid how to parse WikiHiero markup.

So I see mediawiki-express as a useful experiment in packaging multiple services into a single "server" for small deployments, as well as an experiment in packaging multilingual codebases (currently PHP, client-side JavaScript, server-side JavaScript) together. mediawiki-express isn't the only possible solution: I think these are useful evaluation axes to consider for all the other packaging/deployment/support options we will discuss in this session, which can solve some of the same problems in different ways.
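To make the "multiple services behind a single server, with inter-service routing handled automatically" idea concrete, here is a minimal sketch of the routing decision on its own. It is not taken from mediawiki-express, which runs on node; it is written in Python purely for illustration, and the path prefixes and service names in the route table are hypothetical.

```python
# Hypothetical route table: URL path prefix -> name of a bundled service.
# The prefixes and names are assumptions for this example only.
ROUTES = {
    "/w/": "mediawiki-php",
    "/parsoid/": "parsoid",
    "/api/rest_v1/": "restbase",
}


def pick_service(path, routes=ROUTES, default="mediawiki-php"):
    """Choose a backend by longest-prefix match on the request path."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        # Anything unrecognised falls through to the default backend.
        return default
    return routes[max(matches, key=len)]


print(pick_service("/w/index.php"))
print(pick_service("/parsoid/localhost/v3/page/html/Main_Page"))
print(pick_service("/favicon.ico"))
```

The point of the sketch is that the route table can be generated by the packaging itself, so whoever installs the bundle never has to write web server configuration by hand.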

Summary notes:

  • Gilles gave the intro presentation
  • Main question: How should Wikimedia software support non-Wikimedia deployments?
  • Questions explored
    • Which hosting platforms are/should be supported?
    • Do we need to stick with a LAMP stack? Could we decide that some version in the not distant future will be the last "pure PHP" implementation? Can we drop support for shared hosting?
    • How do we make sure non-Wikimedia users benefit? How can we make sure that non-Wikimedia users stay up to date easily?
    • Should Wikimedia fork MediaWiki?
    • If we move to requiring SOA, should we invest in supporting non-Wikimedia installs in migrating? Is it worth maintaining PHP versions of non-PHP software which Wikimedia deploys?
    • Does MediaWiki need a governance structure outside of Wikimedia? How can people be part of this?
    • Does Wikimedia need to support non-Wikipedia use cases better?
    • Does there need to be a "MediaWiki Foundation"?
    • What problem are we trying to solve by dropping pure PHP support? What is wrong with the status quo?
    • What constitutes the "full MediaWiki experience"? How do we make the experience of installing MediaWiki result in something like editing Wikipedia?
    • How should non-Wikimedia use cases be funded?
    • Is VisualEditor/Parsoid part of the core MediaWiki experience? If so, should the MW core requirements be changed to reflect that? Or could Parsoid be ported to PHP?
    • Can VM-based hosts be the "low budget" solution that replaces shared hosting? Consider that "low budget" might include aspects other than money, e.g. time/effort and ability to maintain the rest of the OS on the VM.
    • Can we remove the delta between the setup for new Wikimedia developers and the setup for low-budget hosting?
    • Should we put greater support behind Gabriel's https://github.com/wikimedia/mediawiki-containers project?
    • Should we put more effort behind other non-Wikimedia open content efforts, and improve outreach to other open content efforts?
    • Should Wikimedia operate a wikifarm to improve Wikimedia's support for wikifarms?
    • What is the urgency to move the line?
    • What is our plan for a plan?
    • Who gets to make the decision?
    • Are use cases off of the Wikimedia Foundation production cluster important?
    • How can innovative developments in this area also be planned for? What does that mean? New, related innovative developments will emerge. How do we plan for these?
    • How can we make MediaWiki core as clean and modular as possible? (i.e. make it possible to replace parts easily)
    • How can we make extension compatibility work between versions?

Full notes:
https://etherpad.wikimedia.org/p/NonWikimediaDeployments

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!