
Wikimedia Developer Summit 2018 Topic: Evolving the MediaWiki Architecture
Closed, Resolved · Public

Description

This session is about evolving the system that allows us to store and deliver our content. It's about the architecture of not just MediaWiki core, but also extensions, APIs, underlying services (like databases and job queues), as well as public services such as RESTBase and the Wikidata Query Service.

This is one of the 8 Wikimedia Developer Summit 2018 topics.

Position papers: https://wikifarm.wmflabs.org/devsummit/index.php/Session:2
Note: https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018/Evolving_the_MediaWiki_Architecture (copied from https://etherpad.wikimedia.org/p/devsummit18-evolvingmediawikiarchitecture )

Keep in mind:

  • Architecture impacts 3rd parties, supporting 3rd parties impacts architecture
  • Top-down (vision) vs. bottom-up (issues)
  • Scale and sustain vs. evolve and explore
  • Session outcomes should inform annual plan as well as movement strategy
  • How do we develop a tech strategy if there is no product strategy yet?

Desired Outcomes:

  • Urgent Focus for Scaling and Sustaining
    • Identify risks to and needs for sustaining and scaling our current capabilities (aka services)
  • Strategic Direction for Improvement
    • Key Questions blocking Development Decisions (decision tree)
    • Strategic Direction and Technological Visions (with justification and dependency on key questions)
  • Strategic Planning Process
    • Define further process and responsibility for decision making for each focus area / key question
    • Define convergence points that ensure product and tech strategy match

Session Structure

  • Define session scope, clarify desired outcomes, present agenda
  • Discuss Focus Areas
    • Discuss and Adjust. Note that we are not trying to come to a final agreement; we are just prioritizing and assigning responsibilities!
    • For each proposition https://etherpad.wikimedia.org/p/devsummit18-evolvingmediawikiarchitecture
      • Decide whether there is (mostly) agreement or disagreement on the proposition(s).
      • Decide whether there is more need for discussion on the topic, and how urgent or important that is.
      • Identify any open questions that need answering by others, and by whom (product, ops, etc.).
      • Decide who will drive the further discussion/decision process. A four-month deadline is suggested.
  • Discuss additional strategy questions https://etherpad.wikimedia.org/p/devsummit18-evolvingmediawikiarchitecture. For each question:
    • Decide whether it is considered important, blocking tech strategy.
    • Discuss who should answer it.
    • Decide who will follow up on it.
  • Wrap up

Resources:


Topic Leaders (@daniel), please

  • Add more details to this task description,
  • Coordinate any pre-event discussions (here, IRC, email, hangout, etc),
  • Outline the plan for discussing this topic at the Developer Summit.
  • Optionally, include what it will not try to solve.
  • Update this task with summaries of any pre-event discussions.
  • Include ways for people not attending to be involved in discussions before the summit and afterwards.

Post-event Summary:

  • ...

Action items:

  • ...

Event Timeline


I won't be able to attend this session because it is running parallel to the language session, so I'll share my thoughts here:

  1. We should attract more developers. From an architecture point of view, I think reducing tech debt and increasing test coverage is a good way to do it. Doing this well requires answering many of the other questions you listed: monolithic or services (or both?), what technologies to use (and how to deploy them), should we develop OOUI or use something that other people use too, etc. Of course we also need to support those new developers, but I think that topic is better discussed in T183318: Wikimedia Developer Summit 2018 Topic: Growing the MediaWiki Technical Community.
  2. One of MediaWiki's strengths is its support for languages and multilingual content. I don't think we highlight this enough. Having said that, the language support is not perfect, and that should be taken into account in the architecture and strategy. To make it more concrete, here are a few issues that identify what kinds of solutions are needed if we want to better reach every person in their preferred language:
    1. Everyone finds <translate> tags annoying – it looks like MCR or parsoid or WikiText 2.0 will provide a better way to do it, but more work is needed, work that goes across teams. Some content is not made translatable because of this. T131516
    2. We have the methods to deliver the interface in the user's language, but they are not suitable for the logged-out users of Wikimedia wikis (where the reach is the biggest!). More work is needed here, again work that goes across teams. Perhaps "Content assembly at edge" is one possible solution to it? Are there other solutions? T58464
    3. We started using LocalisationUpdate (LU) long before the industry created terms like continuous translation. However, LU no longer runs every day on Wikimedia wikis, and it is so complicated to set up that I am not aware of more than a handful of third-party wikis using it. This is problematic because we also stopped pushing translation updates to maintenance releases for stable branches. What kind of architecture is needed to provide fast, secure and simple translation updates to both Wikimedia wikis and 3rd-party wikis? It is a big demotivation for translators and a poor user experience if we have translations that exist but are not reaching the users. See also https://www.mediawiki.org/wiki/Internationalisation_wishlist_2017#Integrate_continuous_translation_updates_to_MediaWiki_core
daniel updated the task description.

Improve Gadget Infrastructure: Gadgets could be

  • a great way to onboard newcomers to the dev community
  • a way to keep specialized features and settings out of our core code.

However, the current infrastructure…

  • Allows too much (monkey-patching, so MediaWiki and gadgets break each other)
  • Makes enabling/disabling too hard.

Looking at Mozilla here is interesting, since they, too, have had a you-can-do-it-all "Extension" infrastructure which broke the software and was problematic for security. Their newer "WebExtensions" alleviate these problems. Mozilla also uses this strategically to try new features or fade others out. (@Qgil, @Bmueller)
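To make the WebExtensions analogy a bit more concrete, here is a rough sketch (TypeScript) of what a declarative, capability-limited gadget manifest could look like. None of these field names exist in MediaWiki or the Gadgets extension today; they only illustrate the idea of a gadget declaring what it may touch, instead of monkey-patching anything it likes.

```
// Hypothetical manifest for a sandboxed gadget, modeled loosely on
// WebExtensions. Purely illustrative; not an existing MediaWiki API.
interface GadgetManifest {
  name: string;
  version: string;
  // Explicitly declared capabilities instead of arbitrary monkey-patching:
  capabilities: Array<'read-api' | 'write-api' | 'add-toolbar-button' | 'add-portlet-link'>;
  // Where the gadget is allowed to run:
  matches: { namespaces?: number[]; actions?: Array<'view' | 'edit'> };
  // Script/style entry points, loaded in an isolated context:
  scripts: string[];
  styles?: string[];
}

const exampleGadget: GadgetManifest = {
  name: 'reference-tooltips',
  version: '1.2.0',
  capabilities: ['read-api', 'add-portlet-link'],
  matches: { namespaces: [0], actions: ['view'] },
  scripts: ['reference-tooltips.js'],
  styles: ['reference-tooltips.css'],
};
```

A manifest like this would also make enabling/disabling (and auditing) a gadget a purely declarative operation, which addresses the second pain point above.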

Some of my own thoughts and some cribbing from the ATWG sessions:

  • Strategies for making our content machine readable (structured/semantic markup) - and consistent markup for all projects
  • 3rd party API consumers are not well represented. Most 3rd party support talk is centered around 3rd party MW installations - but a large number of 3rd parties just want to access our data using APIs. Our lack of coherent documentation makes this difficult (I have personally had to handhold partners through understanding our APIs). This is true of both MW and RB APIs, and while RESTful routing helps, as does Swagger, we do not have a portal like developer.wikimedia.org that explains how to interact with our services.
  • Installation is not a good experience, not only for 3rd parties but also for developers. We need to make it easy to work on our stack.
  • Do we want to prioritize the sharing of components in a way that doesn't require MW? (IOW should we try to create projects like VE that don't require MW to use?)
  • Related to the change prop discussion and storage discussion: Do we want to prioritize features built on inter-project communication and presentation of data (for example, adding support for reviewing Wikidata descriptions in Wikipedia has been problematic)
  • I'll second the DB discussion… that seems to be a recurring problem that has had little movement and I am not sure everyone understands all the constraints and issues.
  • Skin system - some have suggested that the system is too powerful and we should restrict styling to just CSS to make this easier
  1. Everyone finds <translate> tags annoying – it looks like MCR or parsoid or WikiText 2.0 will provide a better way to do it, but more work is needed, work that goes across teams. Some content is not made translatable because of this. T131516

https://www.mediawiki.org/wiki/User:Nikerabbit/Translate_v2 is an early draft proposal to address this.

As usual, I couldn't make the IRC meeting.

22:10:14 <coreyfloyd> On architecture as it relates to 3rd party: I think that those 2 are very intertwined. The implication is that 3rd parties can only be supported by shipping a installable PHP app in a LAMP stack. This limits our architecture decisions

Think of it like progressive enhancement in the frontend JS world.

  • I think it's a strength that people can run low-traffic wikis with much of the same functionality as Wikipedia on a LAMP stack. And that developers can get started developing on a LAMP stack. And really the "L" and "A" aren't strictly required, and the "M" could theoretically be something else too, although it's not very well supported.
  • Something like Scribunto with LuaStandalone is the next step: augment the LAMP stack by having P shell out to a standalone binary. This cuts off some users and adds some complexity, but it's usually not too bad for someone to figure out.
  • The step beyond that is to install other well-supported software beyond the LAMP stack. Memcached, redis, ElasticSearch, these all have well-established products with well-established communities. Again, some users can't do this and it adds even more complexity, but the installation and running of the software is usually managed and well-documented by people who aren't us and all we have to handle is integration.
  • Then we get to the bespoke services, that usually require installing an extra runtime (see previous paragraph) and then figuring out how to make sure it's started reliably and configured correctly before you can get to the point of integration with MediaWiki. And usually you have to do that with limited or outdated documentation and only a handful of developers you might ask for support.
  • And worse, these bespoke services too often seem to require other bespoke services, compounding the problem.

Requiring the nest of bespoke services is like abandoning the idea of progressive enhancement in favor of your site only functioning with the latest version of Chrome.
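As a rough illustration of that layering on the backend, here is a minimal TypeScript sketch: prefer an optional external service when one is configured, and silently fall back to an in-process implementation on a bare LAMP-style install. All names here are hypothetical; this is not MediaWiki code, just the pattern.

```
// Backend "progressive enhancement": use an optional external service
// when configured, otherwise fall back to a simple in-process version.
interface Cache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

class InProcessCache implements Cache {
  private store = new Map<string, { value: string; expires: number }>();
  async get(key: string): Promise<string | null> {
    const hit = this.store.get(key);
    return hit && hit.expires > Date.now() ? hit.value : null;
  }
  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.store.set(key, { value, expires: Date.now() + ttlSeconds * 1000 });
  }
}

async function connectToMemcached(url: string): Promise<Cache> {
  // Placeholder: a real deployment would wrap an actual memcached client here.
  throw new Error(`no memcached client wired up in this sketch (${url})`);
}

// The "bigger" install opts in via configuration; the bare install still works.
async function makeCache(config: { memcachedUrl?: string }): Promise<Cache> {
  if (config.memcachedUrl) {
    try {
      return await connectToMemcached(config.memcachedUrl);
    } catch {
      // Degrade gracefully instead of refusing to run at all.
    }
  }
  return new InProcessCache();
}
```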

22:19:57 <_joe_> So I think some big questions about our architecture are: do we want a REST API? Should it be part of MediaWiki itself or stay external? How do we move away from the cache-but-also-storage-but-also-router-and-almost-ESB that restbase currently is

Before we can meaningfully answer "do we want a REST API?", I think we need to define the term.

  • If we mean a mostly stateless API (i.e. except for authentication), the action API is pretty much already there.
  • If we mean URLs that each reference a single resource or a predefined collection of resources and that can be heavily cached in something like varnish, sure, that would be useful for some use cases.
  • If we mean the previous bullet plus using URL path components as positional parameters rather than using the query string, meh.
    • A query string based implementation can be cacheable if query parameters are required to be well-ordered via errors or redirects, or if the cache layer can be told that parameter order doesn't matter. (A minimal sketch of the redirect approach follows at the end of this comment.)
    • I suppose some of the advantage of positional parameters is that it makes it really hard to sanely have more than a few of them, forcing you to keep the options limited.
  • If we mean full-on HATEOAS, I don't think restbase is even there and I don't see the point of going that far.

Whatever, if it's going to be something that is supposed to be useful for third parties, I'd say it should be in MediaWiki core. That's not to say that the implementation in core can't route to standalone bespoke services, or that a huge site like WMF can't do some of that routing at a varnish or apache level rather than in PHP as an optimization, but the basic functionality should be in MediaWiki.
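To illustrate the "well-ordered query string" point from the list above, here is a small TypeScript sketch of canonicalizing parameter order so a cache like Varnish only ever sees one URL per logical request. The helper names and base URL are hypothetical; this is the technique, not an existing API.

```
// Canonicalize query parameter order so a cache sees one key per request.
const BASE = 'https://example.org';

function canonicalizeQuery(rawUrl: string): string {
  const url = new URL(rawUrl, BASE);
  const sorted = new URLSearchParams(
    [...url.searchParams.entries()].sort(([a], [b]) => a.localeCompare(b))
  );
  url.search = sorted.toString();
  return url.toString();
}

// A front controller (or the cache layer itself) could redirect any
// non-canonical request, so only one cache entry exists per query.
function maybeRedirect(requestUrl: string): string | null {
  const original = new URL(requestUrl, BASE).toString();
  const canonical = canonicalizeQuery(requestUrl);
  return canonical === original ? null : canonical;
}

console.log(maybeRedirect('/w/api.php?titles=Foo&action=query&format=json'));
// -> https://example.org/w/api.php?action=query&format=json&titles=Foo
```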

22:30:57 <TimStarling> if it is revolutionary in nature, e.g. rewrite everything in typescript, then it helps to re-evaluate after you do prototypes
22:33:38 <cscott> TimStarling: the danger being, of course, that we product-ize the prototypes w/o deprecating the things they were supposed to replace

Indeed. And, for that matter, that we product-ize the prototypes rather than re-evaluating foundational issues that make sense in a prototype but not so much in a final product.

22:49:30 <tgr> cscott: for example, one possible tech vision is a distributed-ownership microservice architecture where people can take over the fringe things that they love, and the "official" Wikimedia site interfaces them and recovers gracefully if they do a poor job and the service is not keeping up

The "and recovers gracefully if they do a poor job and the service is not keeping up" is the extremely hard problem there, particularly when you realize that even if the site as a whole recovers gracefully it will probably be lacking the fringe functionality meaning that functionality is still flaky or broken.

22:49:48 <cscott> brion: google docs doesn't work well on phones. i think folks still use big machines & big screens when they want to actually tackle a big piece of editing.

That has always been my feeling too, I find that typing anything significant on my phone is a major pain. So when people keep saying that the majority of contributions in the future are going to be done on phones, it never really resonates with me.


I've been a long time disbeliever of "mobile is the future" hype and honestly I think I was right in 2010 to think it was quite a way out. I do know human beings today however who are very active consumers of internet content who have what I would consider to be large'ish phones as their only device. I don't know if they engage in what any Wikimedian would consider long form editing, but they do interact with Q&A sites, social sites, and consume a lot of content.

Folks found a way to turn wikitext into a Turing complete language which should give all of us pause when we assume that just because we can't personally see doing something that it won't be done by motivated individuals. If you examine the sorts of things that are being done with Echo, Alexa, and Siri today and reflect on the short rise and fall of products like Dragon Naturally Speaking in the late 1990's it's not too hard to imagine a phone based editing experience that is mostly dictation and some manual correction. "open curly brace. open curly brace. C. N. close curly brace. close curly brace."


10% of people in the United States only access the internet on a smartphone, and that number exceeded the number of people who access the web only on a non-mobile device 3 years ago.

I think it's safe to say that mobile is a priority for the mission, since there is a large number of users who only have a mobile device. It's not a matter of preferring mobile over non-mobile; they don't even have a non-mobile device to use.

And this doesn't include people who are in less developed countries than the United States, which I imagine will have an even greater percentage of smartphone-only users.

"Do we want a REST API?" is an interesting question. Who's "we"?

In my mind this is at least as much a product question as a technology question, and so we ought to be including third-party API consumers in the "we," if not giving their interests top priority. Do third parties want to obtain Wikimedia content via a REST API, for some definition of "REST API"? Have we heard much from them on the subject one way or the other?

Anecdotally, when I was working primarily on the Wikipedia Android app, we (weakly) preferred interacting with MCS via the current RESTBase-backed REST API, partly because it was just a little less to grok than the Action API, which freed up headspace for other things. We also liked that it had slightly lower response latency on average, probably mostly due to RESTBase's content pre-caching.

It was also attractive in part because it was pretty much custom-built for our use cases. That said, the goal for MCS, as I understand it, has always been in principle for it to move toward providing a select, well-cached set of content generally useful to mobile clients in a client-friendly way, rather than to exist permanently as an essentially bespoke service to support the Wikipedia mobile apps and the Wikimedia mobile websites. On the other hand, in practice the product needs of WMF clients seem to take precedence. Maybe the AQS API is a better current example of where we've been aiming.

I don't think client devs have strong opinions one way or another about the backend behind the current or future REST API, and I tend to think that a REST API native to MediaWiki (along the lines of the third bullet point in @Anomie's taxonomy above) would be a useful thing to have, but again I think if our aspiration is to provide open knowledge as a service, we ultimately ought to prioritize what third-party users need and want when we decide what to build, and find out what they need and want if we don't know.
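For concreteness, here is roughly what the two styles look like from a client's point of view (a TypeScript sketch using the public Wikimedia endpoints as they exist today; error handling omitted and response shapes shown only partially).

```
// Fetching a page summary the REST way vs. the action API way.
async function summaryViaRest(title: string): Promise<string> {
  // RESTBase-backed endpoint: one resource per URL, heavily cacheable.
  const res = await fetch(
    `https://en.wikipedia.org/api/rest_v1/page/summary/${encodeURIComponent(title)}`
  );
  const body = await res.json();
  return body.extract; // plain-text summary
}

async function summaryViaActionApi(title: string): Promise<string> {
  // Action API: one generic entry point, parameters in the query string.
  const params = new URLSearchParams({
    action: 'query',
    prop: 'extracts',
    exintro: '1',
    explaintext: '1',
    titles: title,
    format: 'json',
    formatversion: '2',
    origin: '*',
  });
  const res = await fetch(`https://en.wikipedia.org/w/api.php?${params}`);
  const body = await res.json();
  return body.query.pages[0].extract;
}
```

The "less to grok" point above is visible even in this toy case: the REST call is a single self-describing URL, while the action API call requires knowing which module, prop and formatting parameters to combine.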

Condensed log of Wednesday's discussion on IRC:

22:10:14 <coreyfloyd> On architecture as it relates to 3rd party: I think that those 2 are very intertwined. The implication is that 3rd parties can only be supported by shipping a installable PHP app in a LAMP stack. This limits our architecture decisions

22:10:56 <tgr> DanielK_WMDE: the proposed session structure seems very bottom-up (identify current / short-term problems, try to compose them into a larger strategy)
22:11:34 <tgr> I wonder if having top-down discussions can fit into the structure, or best not to expect those to happen

22:11:42 <mdholloway> +1 to coreyfloyd; the future of a lot of the node.js work that we're doing over in Reading Infrastructure now seems very uncertain at the moment
22:12:11 <_joe_> DanielK_WMDE: it's not clear to me how we link the discussion about the architecture relates to the strategy process - that would seem to make sense as a top-down approach
22:12:53 <_joe_> start from the strategy direction, which is not very deeply defined though, can be challenging

22:13:00 <Niharika> brion: DanielK_WMDE: Okay. So is there a possibility that the session outcomes might affect the annual plan?
22:13:11 <tgr> DanielK_WMDE: I mean driven by long-term strategy, not specific features

22:13:34 <mdholloway> and that has to do with the idea that we may not want to impose installing node.js on third parties, among other things
22:13:39 <TimStarling> marvin is not even aiming to support wiktionary let alone 3rd parties

22:13:59 <DanielK_WMDE> coreyfloyd, mdholloway: yes, i agree that "how long do we need to support LAMP" is an essential question. and one that cannot be asnered by engineers. it's a product decision.

22:14:08 <tgr> e.g. dbarratt's proposal of switching to a model where most code is maintained by third parties is top down

22:14:54 <tgr> I wrote a proposal https://www.mediawiki.org/wiki/User:Tgr_(WMF)/Turn_MediaWiki_core_into_an_embeddable_library which is also top-down and wondering if the devsummit is the right place to try to discuss it
22:18:32 <DanielK_WMDE> tgr: the dev summit would be a good place to pitch this as a possible strategy, and see if other find it worth exploring. there will unfortunately not be enough time to explore the idea then and there.

22:15:55 <_joe_> so with no product strategy, nt even a high level one, how do we link to it at the tech level?
22:16:53 <coreyfloyd> _joe_ good question - I have given this feedback to Toby/Victoria to let them know it is a blocker for long term technical planning
22:17:58 <_joe_> to be more specific - I think some of the strategy directions can affect our architecture - if we want to store efficiently things other than text, we are in for some necessary evolutions on multiple levels
22:21:51 <DanielK_WMDE> _joe_: re efficiently storing things other than text: one outcome could be to say just that: "*if* the product strategy is to go for multimedia aggressively, we'll need to look into X, Y, and Z".

22:16:04 * brion would eventually love to raise things like "storage integrity" and "security in depth". we still have a "monolithic kernel" design in terms of database access, and it worries me. but is this too big-scope?
22:22:46 <brion> i think we should seriously consider long-term things like "should password hashes be in a big mysql database" and "should someone who gets a privilege escalation on an app server have full access to private data?"

22:16:48 <TimStarling> "timeless vs marvin" is the main question of frontend architecture for me at this
22:18:18 <TimStarling> marvin is a frontend written in typescript, a RESTBase consumer
22:18:33 <TimStarling> imagined as convergence between mobile apps and web
22:20:07 <TimStarling> https://www.mediawiki.org/wiki/Marvin

22:17:03 <DanielK_WMDE> _joe_: I think the strategy process should be informed from the technical side. In particular, needs for scaling and sustaining should be communicated, as well as constraints on feature development

22:19:57 <_joe_> So I think some big questions about our architecture are: do we want a REST API? Should it be part of MediaWiki itself or stay external? How do we move away from the cache-but-also-storage-but-also-router-and-almost-ESB that restbase currently is

22:20:47 <tgr> DanielK_WMDE: to put it more generally, I think we should have a tech vision driving longterm planning, not just a product vision (not that we have a product vision right now...) and it would be cool if the devsummit (the MW architecture session more specifically) could be used to come up with it
22:21:57 <tgr> this year's dev summit is intended to inform strategic planning and so far I don't see that happening, it's more like the usual "let's make a tech roadmap for the next 3 years"

22:20:54 <apergos> I hope we talk about the future of microservices; what types of services are a good fit? what layers should they talk to? how do we enforce that?

22:22:28 <coreyfloyd> _joe_: FWIW on the router question, one of the things that came out of last weeks ATWG meeting is that we need routing in MediaWiki, but also need to route non-MW traffic (like analytics)

22:23:54 <tgr> _joe_: strategic planning is, IMO, deciding where we want to arrive first and on the route to get there only after that
22:24:37 <_joe_> tgr: where depends a lot on what we want to do; I can name you 3 or 4 issues I see as critical for being able to grow in some directions
22:24:45 <tgr> _joe_: if we could come to consensus on those issues, that would already be more productive than all past dev summits, sure
22:24:52 <_joe_> like - how to be able to scale inter-wiki relationships
22:27:22 <DanielK_WMDE> _joe_: i had a long conversation about inter-wiki change propagation with Marko the other day.
22:28:06 <_joe_> DanielK_WMDE: I think that's one of the challenges we have to tackle in order to sustain our *current* feature set for the next few years

22:26:29 <tgr> DanielK_WMDE: what I am trying to say is that the current structure proposed in the task seems to be focused on issues but not ideas (I might be misunderstanding it of course)
22:29:05 <DanielK_WMDE> tgr: i suppose you are right - and you are welcome to change that. My mind is pretty split between "issues we need to address to scale and sustain" and "what'S our technical vision" - but the later depends a lot of where we want to go with products. And that's still quite unclear.
22:29:16 <coreyfloyd> _joe_: DanielK_WMDE I also think we need to split our concerns into short term (scaling current ops) and long term (strategy)
22:29:28 <_joe_> coreyfloyd: I think 5-year plans tend to fall on their back pretty fast, but sure, we could identify some directions that could influence our decisions on such a timeline
22:30:14 <DanielK_WMDE> _joe_: we are not trying to make a 5 year plan. the idea is to identify the questions we need to think about to get a good idea of what we need to do to get to the place we want to be in in 5 years..
22:30:33 <cscott> fwiw, i think our original "1 year plan" in parsing has ended up taking 5 yrs or so
22:30:38 <brion> strategy -> keep at 5 years. actual work plans will change every year. :D
22:32:03 <cscott> making big changes on the wikis can be hard, there's a lot of momentum going in one direction

22:25:19 <DanielK_WMDE> tgr: that (a tech vision) is indeed the idea. the summit is intended to be the start of a process that will deliver that. right now, we have a ton of ideqas and issues. we need to structure them somehow so we can find a vision, and a strategy.

22:25:52 <_joe_> for the future, I'd see things like "how can we ingest more than 50 multimedia files at once"? for instance. Or what brion said, a security approach for the 2020s

22:26:40 <coreyfloyd> I wonder because how far ahead should we plan? What is long term
22:27:16 <_joe_> coreyfloyd: long-term is anything that will develop over more than a year, or will not be needed before a year, I'd say

22:28:19 <coreyfloyd> _joe_ I feel like especially from the product side of the house has a 1 year horizon on features planning… and technology changes fast. So if we could create a tech strategy that lasted 5 years would that be a win?
22:28:39 <apergos> 5 years is a long dang time, 15 years is unimaginable

22:29:46 <apergos> we could talk about what would be needed to support a massive increase in editors (example: 10-fold growth in wikidata)

22:07:38 <Dysklyver> Over the last few years object storage systems such as Amazons S3 service have become very popular, also many people like static storage systems like git (github), currently there appears to be no way to export mediwiki databases to these services, despite the fact that doing so could allow easy backup and/or static html sites to use data stored in this way.
22:30:54 <Dysklyver> Well I think there should be a move away from tons of serverside processes and complex storage, towards cheaper static storage and client side processes, perhaps using the tech used by other major websites as an example, but certainly less reliance on so much hard-to-setup, costly to run server architecture. However my view is based mainly on the view of MW as a general software for website, less what wikipedia itself is needing,
22:30:54 <Dysklyver> although I would imagine that cutting costs by cutting complexity would be a favorable outcome, especially when scaling.
22:33:48 <Dysklyver> google has done a significant amount of work towards that aim, I believe it is feasible to cover all the main customization of MW within a lightweight framework and what is effectively a web-app shell
22:32:12 <DanielK_WMDE> Dysklyver: the question is how you can suppoort the gazillion complex use cases that Wikimedia needs with such an architecture.
22:32:35 <coreyfloyd> DanielK_WMDE: I think that begs the question “do we want to reduce our use cases?"
22:33:38 <coreyfloyd> I think that is something we can push back on. We can tell product that we need to scale back the features of the software in order to move forward

22:30:57 <TimStarling> if it is revolutionary in nature, e.g. rewrite everything in typescript, then it helps to re-evaluate after you do prototypes
22:31:16 <TimStarling> if the idea is to keep doing what we're doing, then we already have the prototype
22:32:41 <cscott> TimStarling: i could get behind a 1-year plan of 5 or so prototypes
22:32:55 <cscott> each focusd on a different "key need" (or approach)
22:33:38 <cscott> TimStarling: the danger being, of course, that we product-ize the prototypes w/o deprecating the things they were supposed to replace
22:33:42 <brion> but that does give the opportunity to think about commonalities in those cases (for instance, do things need additional types of data storage -> leading to MCR etc)
22:33:52 <TimStarling> we've done prototypes of node.js backend services and I'm not very impressed by them
22:34:15 <TimStarling> so if the plan is "let's do more things like RESTBase" then I would have concerns about that

22:34:39 <brion> i'd love for mediawiki-as-a-big-php-app to reduce itself into a core-with-an-api and do the ui separately, but doing it well is a tricky thing
22:34:55 <brion> and i love the idea of storage becoming pluggable, but that's ........... hard
22:35:25 <coreyfloyd> brion: I think thats the feeling of many in audiences - to have a core and an API, then let clients build UIs

22:34:43 <DanielK_WMDE> tgr: do you think it is useful/sensible to try and develop a "top dow" tech vision for the platform before we know what products we want to build?
22:34:55 <DanielK_WMDE> tgr: if yes, on what parameters would that be based?
22:34:58 <tgr> DanielK_WMDE: so here's an example (cscott's, I think): the whole Wikimedia universe should be a single system internally. That's well beyond 5 years, doesn't depend that heavily on product plans, it can be broken down into meaningful short-term steps which won't be a complete waste of time even if the goal is abandoned after 2 years
22:35:20 <cscott> it's cross-cutting because it affects how we do cross-wiki and cross-language collaboration
22:37:05 <DanielK_WMDE> cscott: ah, ok, this is about dissolving the boundaries between sites/projects/communities, not so much about system architecture.
22:37:54 <DanielK_WMDE> cscott: but that only makes sense if we anticipate that on the product level, we want/need that integration.
22:38:19 <cscott> DanielK_WMDE: maybe. or maybe it makes sense in terms of simplifying our db config & etc purely from the tech/ops side as well.
22:39:01 <cscott> coreyfloyd, tgr https://en.wikipedia.org/wiki/User:Cscott/2030_Vision

22:36:28 <apergos> we could talk about the eternal parser issue (mw is the one true parser, should we put more resources again into changing that, or give up for good?)...
22:37:13 <cscott> mw isn't really the one true parser, the one true parser is a platonic ideal no one has yet implemented. mw has known bugs, which are different from parsoid's known bugs, and the One True Parser is in between them.
22:39:02 <apergos> cscott: mw has some edge cases that are exploited; that's why it gets to keep its status as authoritative, as far as I understand it

22:38:01 <Dysklyver> A simple way of looking at it is that all MW has to do is let people read and write files, and browse what looks like a website, everything else is UI customization and access control.
22:38:49 <DanielK_WMDE> Dysklyver: you are missing THE key feature that makesy wikipedia work: crowd-sourced curation and quality assurance
22:41:56 <Dysklyver> from a technical view its just read/write on files, and logging, viewing of such. it's grown very complex, thats true, but fundamentally there are not that many needed functions to the software, it would be an idea to create some kind of roadmap as to what function are really needed before deciding exactly how to make the framework more lightweight.
22:42:41 <DanielK_WMDE> Dysklyver: yes, that'S a pretty good point: we not only need a product strategy for new stuff.
22:42:48 <DanielK_WMDE> we need a product strategy for getting rid of old stuff

22:38:22 <tgr> DanielK_WMDE: I think the summit is not a good place for coming up with one true vision (not inclusive + not enough time) but maybe as a place to figure out which vision proposals are interesting, and start some structured conversation about them?

22:40:39 <tgr> DanielK_WMDE: I agree it's not going to be completely independent from product strategy. I worry that strategies put together by product people will have blind spots (just like those put together by tech people, of course) and having both kinds could be complimentary
22:40:51 <DanielK_WMDE> tgr: yes, and asking pointed questions to product people. "so, *do* we need to support LAMP?", "do we need to see edits on wikidata on the local watchlist?" "Is supporting multi-lingual content for anonymous users important?"
22:41:48 <DanielK_WMDE> tgr: yes, absolutely. we need iterations between product and tech. the more the better. and the summit should at least ask for, if not establish, a process for this. should surface needs for sustaining, and questions for exploration/development. shoudl offer possibilities and opportunitites
22:42:24 <coreyfloyd> tgr: yeah… they need help to understand the gaps in product planning. They don’t know about the constraints of the stack and what affects are
22:43:09 <coreyfloyd> DanielK_WMDE: tgr yes, the only way forward will be iteration with audiences
22:45:29 <DanielK_WMDE> cscott: i think it is also useful to be able to say "hey, this is really expensive, maybe doing 90% of that for 10% the cost would be enough"? I think we should ask that question. Not make the decision.
22:45:12 <tgr> DanielK_WMDE: having the "cost" side for some kind of cost/benefit evaluation would be very useful, although for the most important issues it's not achievable in a week IMO
22:46:07 <DanielK_WMDE> tgr: yea, all we can do is to ask the right questions, and try to establish procecsses.
22:48:09 <DanielK_WMDE> coreyfloyd: sure, that's how it should be: requirements/wishes from the one side, cost/constraints from the other. iterate, agree, rinse, repeat.
22:48:39 <DanielK_WMDE> coreyfloyd: i think we don't do that enough on the large scale. I hope we can start doing that for the strategy process.
22:49:10 <coreyfloyd> DanielK_WMDE: yeah I think this is a large gap we are identifying… engineers who operate close to product seem to have this feedback loop, but it is non-existent at a high level

22:48:39 <apergos> suppose in 10 years most of our editors come from phones/tablets, or most of our edits (number, not necessarily amount of content), what would that mean? is that likely? do we want to support that? what would we have to deprioritize?
22:49:07 <brion> i think it's 100% for sure that most of our edits will come from phones in 10 years
22:49:48 <cscott> brion: google docs doesn't work well on phones. i think folks still use big machines & big screens when they want to actually tackle a big piece of editing.
22:50:20 <DanielK_WMDE> brion: ...but not by typing.
22:52:13 <cscott> it's just as likely we'll have this mobile vision where your phone is actually your primary processor, and it takes over whatever TV screen and/or keyboard device you happen to be near
22:52:25 <cscott> so there's not a sharp division between 'phone editors' and 'desktop editors'
22:52:27 <_joe_> I think we should not try to forecast tech trends ourselves, that's pretty much not what tech should think about. Flexibility, on the other hand, is something we should be concerned about

22:49:30 <tgr> cscott: for example, one possible tech vision is a distributed-ownership microservice architecture where people can take over the fringe things that they love, and the "official" Wikimedia site interfaces them and recovers gracefully if they do a poor job and the service is not keeping up
22:50:11 <tgr> toolforge is sort if edging towards that directions, but it could be taken much farther
22:50:36 <cscott> tgr: in my ideal world we'd have gradual on-ramps and off-ramps toward and away from "WMF maintanence of <feature X>".
22:50:50 <cscott> tgr: with intermediate steps between "we take care of everything" and "we turn the service off"
22:54:49 <tgr> cscott: "I can write an experimental feature and release it to beta testers without risking site security and reliability" is a nice long-term tech vision too

22:54:10 <coreyfloyd> DanielK_WMDE _joe_ So covering mobile plus flexibility: is it good to focus on APIs? Maybe focus on abstracting the front of MW from the core?
22:54:45 <cscott> making core smaller is the way i think about it
22:54:46 <coreyfloyd> _joe_: also that forces us to figure out “what is core to MW"
22:55:07 <bd808> coreyfloyd: bring this team back to life and staff it -- https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_API_Team
22:55:24 <bd808> "Accelerate user interface innovation/evolution and increase efficiency of editor automation by making all business logic for MediaWiki available via remotely callable APIs."
22:55:49 <coreyfloyd> DanielK_WMDE: yeah, and if we go in with a goal of just separating the front and back end with APIs and supporting the same functionality that would be a huge step forward… but yes the layers need designed
22:59:37 <cscott> tgr: anyone who wants to make a *fully-functional* new UX for mediawiki needs to grapple with Special:
22:59:53 <brion> yeah special: is the special hell of mediawiki :D
23:00:01 <brion> basically everything in there needs to be abstracted sanely
23:00:19 <DanielK_WMDE> yea, refactoring special pages to share logic with APIs isn't *that* hard. it's quite doable, but it needs time.

I'll try to recapture a point that I tried to make in the IRC discussion: there are two ways of strategizing, bottom-up and top-down. Bottom-up is to look at what problems and immediate needs we have today, what works well and what doesn't, and see if a larger strategy emerges from that. (E.g. have a discussion about microservices, another about handling frontend libraries, a third about improving performance for logged-in users, and maybe those all point into the direction of Javascript being a more useful technology for us, and congeal into a technological vision of gradually deprecating PHP components and replacing them with Javascript components.) Top-down is to start from a longer-term product vision (e.g. we want multimedia and interactive content to be first-class citizens) or technological vision (e.g. we want a codebase that's significantly cheaper to maintain) and try to break it down into small steps.

Those approaches have different strengths and weaknesses (bottom-up is probably more focused and time-effective as we mostly talk about things we already have a decent shared understanding of, it is less dependent on the - so far unknown - product vision, it risks repeating stalled discussions for which getting people in the same room is probably not much added value as it already failed to unstall them in the past; top-down allows more innovation but also risks wasting time on discussing things no one understands in depth). The current structure (as proposed in the task description) suggests having a bottom-up discussion of current problems; is that where we want to be? The wiki page on purpose of the summit seems to expect more of an exploration of the vision ("How does the session’s topic relate to the broad strategic goals, namely knowledge equity and knowledge as a service?"); OTOH arguably the conditions for doing that effectively haven't been provided.

The Problem Grouping document is a bit too daunting to be a required read IMO. Maybe the blocking questions / technology goals could be copied out of it?


well said!

I'll add an issue that I haven't seen mentioned elsewhere. Per the recent blog post, Android is experimenting with offline content sharing (i.e. copying downloaded Wikipedia articles from one phone to another); presumably the feature will reach other platforms as well eventually. That suggests we should be considering some kind of verification system (e.g. signed downloads) to prevent fake news operations from seeding some offline community with made-up Wikipedia articles to subvert Wikipedia's reputation for their own aims.
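To make the "signed downloads" idea slightly more concrete: verification on the receiving device could be as small as checking a detached signature against a pinned publisher key. A minimal TypeScript sketch using Node's built-in crypto module follows; the bundle format and the key distribution are the actual hard parts and are left open here.

```
import { verify, createPublicKey } from 'node:crypto';

// Minimal sketch of verifying a detached Ed25519 signature over an
// offline content bundle received from another device.
function isBundleAuthentic(
  bundle: Buffer,                // e.g. a ZIM file or article dump
  signature: Buffer,             // detached signature shipped alongside it
  publisherPublicKeyPem: string  // pinned Wikimedia publishing key
): boolean {
  const key = createPublicKey(publisherPublicKeyPem);
  // For Ed25519 the algorithm argument is null (the key type implies it).
  return verify(null, bundle, key, signature);
}
```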

The Problem Grouping document is a bit too daunting to be a required read IMO. Maybe the blocking questions / technology goals could be copied out of it?

I have changed the description to mention the two relevant tabs.

@Tgr regarding the top-down approach: I think when proposing a vision, it's important to make the motivation behind it explicit. Also, we (as engineers) should be careful not to venture too far into "product territory". To me, "make multimedia and interactive content first class citizens" is not a technology vision. It's a question we should ask the people defining the product strategy. And we could already inform the product strategy process with the costs, risks, difficulties and opportunities we see if we decide to go that way.

My point is: thinking about something like "multimedia and interactive content as first class citizens" is a good topic for the summit, but the framing should be providing input for the product strategy. Not making product strategy, nor assuming product strategy.

On the other hand, "fully separate the presentation layer from application logic" as a tech vision has merit in and of itself, especially for maintainability and flexibility of the code base. It also offers opportunities that align well with the movement goals. All this should be made explicit, and be used as input for the product strategy.

I think it's important to answer some bigger picture questions before we can answer smaller implementation questions.

The foundation created and maintains what is effectively a custom CMS. We provide the source code to everyone and accept contributions, but the overwhelming majority of development (as well as product direction) is done by the foundation.

I completely understand why we did this in the first place, it was out of necessity at the time. Nothing even remotely close existed.

However, today, I do not believe this is true. I think we should really stop and ask ourselves:
Would the mission be further along in ~15 years if we continue what we are doing or if we abandon MediaWiki completely and adopt someone else's software platform?

I think we should always be asking ourselves this question. I worked for a large corporation that abandoned their custom CMS (that they spent millions of dollars developing) for an open source solution because, in the long term, doing so would allow them to implement new features faster and return a greater profit.

I think a lot of us (myself included) get stuck in the trap of thinking about how much time/effort/money we have spent on a piece of software, and the thought of abandoning that software seems crazy, but I think it's really important to consider whether we are falling into the sunk-cost fallacy and ask ourselves if the solution that we have really is the best for the mission, or if there is something else that might be better (and whether the cost of changing is worth it).

If MediaWiki is the best thing to propel the mission forward, then fantastic! But if it's not, then we should be willing to give it up.

I think a lot of us (myself included) get stuck in the trap of thinking about how much time/effort/money we have spent on a piece of software and the thought of abandoning that software seems crazy,

I actually think we often get stuck in the opposite fallacy - seeing that MediaWiki was invented here, and therefore thinking we should not use it, to avoid not-invented-here syndrome. MediaWiki seems to be much more popular outside the foundation than it is inside it.

I worked for a large corporation that abandoned their custom CMS (that they spent millions of dollars developing) for an open source solution because, in the long term, doing so would allow them to implement new features faster and return a greater profit.

On the other hand, MediaWiki already is an open source solution and as a nonprofit the Wikimedia Foundation isn't trying to return a greater profit.

I also suspect that migrating whatever content was in your corporation's custom CMS was by orders of magnitude a smaller job than it would be to migrate all our wikis to some other software, and wasn't complicated by the fact that thousands of people are making hundreds of edits per minute, 24 hours a day, every day of the year.

A serious plan for full-on replacing MediaWiki (or rewriting in $language_of_the_day, another proposal that was popular some time ago) would probably start with "stop all development for three years and focus on the migration" (and then it would actually take five). I doubt there is any appetite for that.

Also, Drupal (or Symfony etc.) being on par with MediaWiki in terms of scaling is very much [citation needed]. nih.org is said to be the top-ranking Drupal site, and a slightly outdated comScore report claims they get 10M uniques a month. That's about 1% of what Wikimedia sites get, and then I'm sure the ratio of writes is a tiny fraction of even that (outside of social media, very few sites are seriously write-heavy).

On the other hand, MediaWiki already is an open source solution and as a nonprofit the Wikimedia Foundation isn't trying to return a greater profit.

It's an analogy. We are not slaves to profit, but we are slaves to the mission.

I actually think we often get stuck in the opposite fallacy - seeing that MediaWiki is invented here, and therefore thinking we should not use it to avoid not invented here syndrome. MediaWiki seems to be much more popular outside the foundation then it is inside it.

Ha. That's probably true. And there might be other opportunities to use other libraries rather than ditching MediaWiki completely. For instance, if we need Dependency Injection, perhaps we should consider using Symfony's Dependency Injection component.

Also, Drupal (or Symfony etc) being on par with MediaWiki in terms of scaling is very much [citation needed].

I'm asking you to trust my 9 years of Drupal experience. But you're right, Wikipedia would be the most trafficked Drupal site... but honestly, it would be the most trafficked site of any platform/library. I think the next highest might be nbcolympics.com (while the olympics are going on). I'm not sure what kind of citation you'd be looking for... a benchmark? I suppose for a benchmark we'd need effectively an identical page/site?

A serious plan for full-on replacing MediaWiki (or rewriting in $language_of_the_day, another proposal that was popular some time ago) would probably start with "stop all development for three years and focus on the migration" (and then it would actually take five). I doubt there is any appetite for that.

And that doesn't seem very agile to me... I think it would happen concurrently. As if we were maintaining multiple major versions of a piece of software.

Also, Drupal (or Symfony etc) being on par with MediaWiki in terms of scaling is very much [citation needed].

Regardless, even if Drupal isn't as scalable as MediaWiki (which I doubt), since their mission is to "build the best open source content management framework", then making Drupal more scalable than MediaWiki would become a priority for them in order to satisfy their mission.

I'm asking you to trust my 9 years of Drupal experience.

With all due respect, that's pretty unconvincing. If the best thing that can be said about Drupal's scalability is that someone involved in the project feels that it's probably "scalable", that's just not compelling. As far as citations go, I can't speak for tgr, but I would expect technical descriptions of design decisions related to scalability, so it can be analysed how it should scale "in theory", as well as descriptions of other high-traffic sites - what their workload is and how much computing resource it takes to serve it.

But that's all beside the point, as the technical scalability point is only one of the reasons why this is totally infeasible. The bigger hurdle would be political - convincing Wikipedians to change - and the sheer scale of moving all the infrastructure over to a new system. I don't think there's any world where that is remotely practical.

In sum - I suspect other systems such as Drupal wouldn't really meet our needs or our users' needs [citation needed on the second part; I must admit I am not super familiar with Drupal], and the transition cost would massively outweigh any benefits.

But you're right, Wikipedia would be the most trafficked Drupal site... but honestly, it would be the most trafficked site of any platform/library. I think the next highest might be nbcolympics.com (while the olympics are going on). I'm not sure what kind of citation you'd be looking for... a benchmark? I suppose for a benchmark we'd need effectively an identical page/site?

I'm doubtful nbcolympics is really comparable. It would have a very small number of people changing/creating content. Wikipedia's scaling is largely about having a large number of users changing the content at once, while having an even larger group giving it bursts of traffic. Having a high number of users is meaningless without a high amount of content changes.

@Bawolff and you might very well be 100% correct. And that's totally fine. I'm not saying we should absolutely do this, I'm only saying that it's worth considering. I have no idea if Drupal (or something like it) would scale to what we need, but honestly, I don't think anyone else knows either (I'm confident it would, in my experience, but as you said, that's not a great argument). It would require some testing/research to find out.


Sure. I'm just saying until someone invests the time to do that research (Which would be a rather large undertaking), or at least has a start on that research, there's not much to talk about beyond saying I have extreme doubts.


I will match your extreme doubts with my extreme confidence. :)

I think we do need a top down vision driven by WMF's top level strategy, against which we can ask specific questions and make specific decisions. A few notes:

  • Being a good FLOSS citizen means both sufficiently funding our own development, and not making it unnecessarily hard on other contributor-users. To me that implies we do have an interest in having a good system of layers that can expand from core to use additional services where needed. It also opens the question of to what degree we want to help drive external development work, and helping to either fund it or find orgs who will.
  • restbase, multimedia, etc etc are mostly details and we need to find the upper level issue. With restbase, it's a tool that other tools rely on, which to me feels like a core service, but it could stay layered too. We do though need to make sure it's known how to build on those layers.
  • I think building a good api layer that UI can implement on top of is a really good idea. The current form-submit CRUD behavior is awkward to work with, and special pages are impossible to generalize well to mobile etc. This is all stuff that we can progressively enhance on the web, there's no need to leave old browsers completely behind. But we need to put in the work and most importantly we need to make the decision to do it.
  • I think we need a good installer for dev and tiny installs. Vagrant partially fills this role but it's not easy to deploy and has some maintenance difficulties... and few/no resources assigned to keeping it going, it's mostly a labor of love.
  • database access is like a monolithic kernel, which means any security hole can reach a lot of internal data. I think long term we should radically change how we store private data like user password hashes, suppressed pages, and IP addresses. This would potentially be far reaching, but it could be done in baby steps, starting with separating password hashes out to a service (a rough sketch of that seam follows below), etc.

In short: I support a cleaner lighter core that supports additional layers needed for custom functionality, and think we should build in that direction.
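On the point about separating password hashes out into a service, a hedged sketch of what that seam could look like: the application only ever asks a narrowly scoped service for a yes/no answer and never touches the hashes itself. The interface and endpoint names below are hypothetical, not an existing MediaWiki API.

```
// Hypothetical seam for isolating credential checks behind a service.
interface CredentialService {
  verifyPassword(userId: number, password: string): Promise<boolean>;
  setPassword(userId: number, newPassword: string): Promise<void>;
}

class RemoteCredentialService implements CredentialService {
  constructor(private baseUrl: string) {}

  async verifyPassword(userId: number, password: string): Promise<boolean> {
    // The service holds the hashes; the caller only sees the verdict.
    const res = await fetch(`${this.baseUrl}/verify`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ userId, password }),
    });
    return (await res.json()).ok === true;
  }

  async setPassword(userId: number, newPassword: string): Promise<void> {
    await fetch(`${this.baseUrl}/set`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ userId, newPassword }),
    });
  }
}
```

With a seam like this, a privilege escalation on an app server yields at most the ability to ask yes/no questions, not a dump of the hash table.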

Sure. I'm just saying until someone invests the time to do that research (Which would be a rather large undertaking), or at least has a start on that research, there's not much to talk about beyond saying I have extreme doubts.

I will match your extreme doubts with my extreme confidence. :)

Even if some other system would be better than MediaWiki at running Wikipedia, and it would be feasible to provide feature parity, the migration cost seems absolutely forbidding. Migrating the infrastructure and training developers would already be a big problem, but migrating content and the user base would not happen. We would lose the most valuable asset we have: power users of MediaWiki.


I agree. Let's look at PHP 5 -> HHVM. That was the same language, and the same code base, and it took over a year. And now we need to switch back and it will likely take about as much time.

We spent 8 years trying to get rid of global javascript... 7 building a visual editor...

[edit] Note that this was mostly for reasons of technical and community stability; actually doing it was often the easier part.

I'm asking you to trust my 9 years of Drupal experience. But you're right, Wikipedia would be the most trafficked Drupal site... but honestly, it would be the most trafficked site of any platform/library. I think the next highest might be nbcolympics.com (while the olympics are going on). I'm not sure what kind of citation you'd be looking for... a benchmark? I suppose for a benchmark we'd need effectively an identical page/site?

Reads/writes per second (in the framework, not a CDN or other cache) would be a good start. Sure, it's never going to be an apples-to-apples comparison but that would at least suggest that Drupal can scale to top-ten-website levels despite not really being built for that. It has been some years since I last did Drupal development but back then they didn't really have any internals to handle the practices that are typical for sites under heavy load (like multi-datacenter logic, or DB replication). Also the whole theming system seemed very inefficient although maybe that part caches well. Anyway if you want to make a serious proposal, it would probably be better to discuss that in its own task.

daniel updated the task description.

Regardless of whether using Drupal, or any other system, instead of some parts of the existing MediaWiki would be a sane move with regard to scalability etc. (thanks @Tgr for the suggestions on where to start measuring this, BTW), I think @dbarratt has touched on some important points. At least reading what he said got me thinking of some things I mention below, thanks @dbarratt!
I am not sure how those fit the general strategy discussions, but as we're not clear whether we go top-down or bottom-up, I am going to ramble a bit:

  • While I understand concerns about drastic changes, it would be good to have a framework that allows us to actually evaluate how "drastic" the possible options are and to estimate their cost, whatever they turn out to be. In other words, I am pretty sure Daniel is absolutely right when saying above that "the migration cost seems absolutely forbidding", but then, do we (here "we" being the people who are supposed to know and decide) actually know what migration cost would be acceptable? And things like that.
  • It seems to me that we should not think of possible changes as "we would rewrite parts A, B, C of MediaWiki, but better", but rather ask what users want to do and what tools they need to get it done. I strongly believe there are ways to improve/fix/etc. some things without needing to rewrite half of MediaWiki and falling into a stop-other-development-for-three-but-in-practice-five-years refactoring, etc. And I also believe that the tool that might allow people to do what they want is not necessarily always MediaWiki.

I'd like to see a focus on at least the following.

-Docker + Kubernetes with Parsoid included, with the legacy parser set to be deprecated within X years. We *will* support VE in the Wikimedia cluster (and if product management agrees to it I think it's sensible to bundle it with MediaWiki, too; but that's a secondary matter in a sense) so let's set a clear direction for the ecosystem, and let's do it in a way that acknowledges the need for developer productivity, the state of the industry, and resource elasticity. As to installation / builds for shared hosting providers (who have sufficient sophistication) and gold build virtual machines, I'm pretty sure that's compatible with this approach.

-Commitment to OpenAPI (Swagger) coverage of the Action API, with strict versioning. As a developer I also want "more RESTful" slash separated request URLs and responses, but if that has to be done in RESTBase alone for Wikimedia sites, that's fine, too.

-Addressing the need for high speed client (end user) access to full Parsoid / Page Content Service (PCS), PCS/Parsoid-derived, and general microservice endpoints. The number of objects in the edge cache will need to grow to address multi-device access (one HTML page for all modalities will not work), so we need to formalize the cache/purge strategy and probably consider more edge cache resources.

-Building out event orchestration. Both notifications to clients and increasing fusion of the project content and collaboration demand it.

I'd like to see a focus on at least the following.

-Docker + Kubernetes with Parsoid included, with the legacy parser set to be deprecated within X years. We *will* support VE in the Wikimedia cluster (and if product management agrees to it I think it's sensible to bundle it with MediaWiki, too; but that's a secondary matter in a sense) so let's set a clear direction for the ecosystem, and let's do it in a way that acknowledges the need for developer productivity, the state of the industry, and resource elasticity. As to installation / builds for shared hosting providers (who have sufficient sophistication) and gold build virtual machines, I'm pretty sure that's compatible with this approach.

-Commitment to OpenAPI (Swagger) coverage of the Action API, with strict versioning. As a developer I also want "more RESTful" slash separated request URLs and responses, but if that has to be done in RESTBase alone for Wikimedia sites, that's fine, too.

-Addressing the need for high speed client (end user) access to full Parsoid / Page Content Service (PCS), PCS/Parsoid-derived, and general microservice endpoints. The number of objects in the edge cache will need to grow to address multi-device access (one HTML page for all modalities will not work), so we need to formalize the cache/purge strategy and probably consider more edge cache resources.

-Building out event orchestration. Both notifications to clients and increasing fusion of the project content and collaboration demand it.

Why?

(I'm going to slightly pick on you, but it's not really just you on this thread) - it seems like we are focusing on specific things but not giving any reason why we want these things or what the benefit is.

I don't think this makes sense. We should start from a place of goals and then decide how we change tech to get there.

-Docker + Kubernetes with Parsoid included, with the legacy parser set to be deprecated within X years.

If by "parsoid" you mean the existing nodejs service, I'd be very sad to see that since it would be completely giving up on the "progressive enhancement" ideals I laid out in T183313#3910304. I also wonder whether the Parsoid nodejs service is anywhere near being able to replace the PHP parser's generation of links table metadata and other metadata, or to support extensions such as Scribunto.

If by "parsoid" you mean something that generates a parsoid-style DOM but is written in PHP (or otherwise is usable in a basic third-party install without having to run bespoke services and has somehow solved the extension and metadata problems), sure. Although at that point you don't need Docker and Kubernetes, unless parsoid the nodejs service still exists as an enhancement and you want to make that easier for someone to use.

-Commitment to OpenAPI (Swagger) coverage of the Action API, with strict versioning.

See T136839#2441715 for an analysis of Swagger with respect to the Action API. The TL;DR is that Swagger cannot sanely express the Action API, since it's designed for pathinfo-as-positional-parameters style APIs that accept only JSON blobs when POSTing is necessary.

I'm quite new to the MW community, but one thing that has seemed to be the case is that there is a mismatch between the technical architecture (pretty monolithic, very tightly interdependent) and the WMF/community organizational structure (very distributed, fairly siloed).

I point this out only to suggest that decisions about architectural direction should hopefully keep in mind the structure of the organization that's going to be responsible for implementing that direction, and align to it.

(I imagine someone else has made this point already, but I couldn't find it stated anywhere -- sorry if I'm repeating.)

Krinkle updated the task description.

(I'm going to slightly pick on you, but it's not really just you on this thread) - it seems like we are focusing on specific things but not giving any reason why we want these things or what the benefit is.

I don't think this makes sense. We should start from a place of goals and then decide how we change tech to get there.

Let me try to rephrase. I should note some of this might be germane to other sessions.

  1. The environment for development and deployment of code should be more predictable.
  2. The platform should be able to handle resource consumption surges.
  3. The platform's APIs should make it easy to build for multiple devices / modalities.
  4. The platform should quickly return API responses for a broad array of access patterns.
  5. The platform should steer toward industry practice.

Is that the sort of level of abstraction you were looking for in goals?

If by "parsoid" you mean the existing nodejs service,

Maybe. I think it would be good to give people something they can start developing against today that more closely models the future parser embodying the Parsoid spec and design.

I'd be very sad to see that since it would be completely giving up on the "progressive enhancement" ideals I laid out in T183313#3910304.

I'm working from the assumption of dramatically simplified installation. That said, yes, as things stand today it definitely would introduce an additional hard dependency.

I also wonder whether the Parsoid nodejs service is anywhere near being able to replace the PHP parser's generation of links table metadata and other metadata, or to support extensions such as Scribunto.

Yeah, I'm not sure about that part. @ssastry do you have a read on this?

With regard to moving in the direction of the future state, though, I don't know if in the short run the current Node.js Parsoid would have to replace the PHP parser's metadata generation (apparently some parser hooks would need treatment, though).

If by "parsoid" you mean something that generates a parsoid-style DOM but is written in PHP (or otherwise is usable in a basic third-party install without having to run bespoke services and has somehow solved the extension and metadata problems), sure. Although at that point you don't need Docker and Kubernetes, unless parsoid the nodejs service still exists as an enhancement and you want to make that easier for someone to use.

Right, the extra-basic installation not requiring Node.js (although by definition requiring other software to be functional) would call for less of a multi-container sort of setup. That said, I don't think Docker precludes the extra-basic installation (going in the other direction, the simpler LAMP version of course is fine in Docker as a convenience). I'm interested here in supporting this approach while preparing for elasticity in our environment (Docker/Kubernetes). Now, a dramatically simplified installation involving the Node.js Parsoid that doesn't use Docker still helps solve for simplicity for developers, but probably doesn't help for elasticity much more than the current state. I keep wondering whether it's possible to really start supporting Node.js Parsoid sooner rather than later so that people have something to target for the eventual future, irrespective of whether that eventual future is a PHP-bindable parser. Today @Joe indicated a preference for the parser being hardwired into the request processing lifecycle rather than being a separate component, but I'm still wondering, practically speaking, what do we need to do to ready people for the future with Parsoid?

-Commitment to OpenAPI (Swagger) coverage of the Action API, with strict versioning.

See T136839#2441715 for an analysis of Swagger with respect to the Action API. The TL;DR is that Swagger cannot sanely express the Action API, since it's designed for pathinfo-as-positional-parameters style APIs that accept only JSON blobs when POSTing is necessary.

Is that still the case in OpenAPI 3? If not, I wonder if submitting a proposal / pull request to extend it could help pave the way.

Is that still the case in OpenAPI 3? If not, I wonder if submitting a proposal / pull request to extend it could help pave the way.

Can't say, since I've never seen nor heard of "OpenAPI 3" before.

I'm reminded of https://xkcd.com/927/, and also https://www.commitstrip.com/en/2018/01/08/new-year-new-frameworks/?.

*[ Subbu responding after the fact: This was before my time, but here is what I understand. It was NOT possible to write this in PHP in 2012 when this started. There was no HTML5 parser, and performance was a concern because of all the additional work that needed to happen. There was PEG.js available. So, the separation wasn't necessarily on a whim but because writing it in PHP wasn't feasible at that time. I don't know if I am speculating here, but there was some passing idea / thought of possibly running Parsoid in the browser, and node.js was what enabled it. So, if we had to do it all in PHP, Parsoid and the VE launch might have taken much longer. The original Parsoid design had (and still has remnants of) an integrated preprocessor and wasn't meant to be a separate call into the M/W api. The call to the M/W api happened in the rush to have things ready for the Dec 2012 VE launch, and since that worked, it stuck and we didn't pursue the preprocessor integration beyond that. Also see https://www.mediawiki.org/wiki/Parsing#E:_Evaluate_feasibility_of_porting_Parsoid_to_PHP. Note that RemexHTML, the HTML5 parser in PHP, is based off the node.js HTML5 parser, domino, which is something we switched to around 2014 from another HTML5 parser. ]

I recall that when Parsoid was started, the nodejs service was supposed to be a prototype before rewriting it in C++, perhaps with the ability to compile it as a PHP extension (cf. php-luasandbox). The nodejs service wasn't originally supposed to be the final product of the project. Of course, the original idea of rewriting in C++ was "deprioritized" in 2013.<ref>https://www.mediawiki.org/wiki/Parsoid/status</ref>

I think I did hear talk at some point that nodejs was good because it might allow running the same JS in the browser too, although I don't recall whether that was specifically about Parsoid. And I've yet to see any example of that in any of our nodejs-using projects.

*David Barratt: Since technology is a byproduct of our mission, not the mission itself, it is a byproduct, not a product, according to our mission.

The mission says we want to empower people to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally. Sure, we could do that simply by providing a platform (Wikipedia, etc) for people to collect and develop content. But isn't it even more useful to also give them the software itself so they can set up their own platforms? And it also helps protect that collected content from some disaster taking out the WMF, since we publicly provide not just dumps of the data but also the software and configuration to make direct use of those dumps.

'''Proposition:''' All presentation layer code should be written in a way that allows it to run on the client or the server (''Isomorphic JavaScript''). [basically: Frontend in JS + Server side rendering in Node.js]

My take is that if we're going to require server-side JS, it should be as easy to set up as it is to install Apache, PHP, and MariaDB. In other words, people should be able to buy generic "LAMPJ" hosting or pick a generic "LAMPJ" profile in a distro's installer, put MediaWiki in the right place(s), and things should work about as well as they do now.

  • ...or we make a hybrid approach possible: require either node.js on the server (for supporting low-tech clients), ''or'' require “modern” JS on the client, but no node.js on the server.

I don't much like that idea, any more than I like the current situation where MediaWiki core doesn't support mobile web and you have to install a hacky extension to get it.

I'm reminded of https://xkcd.com/927/, and also https://www.commitstrip.com/en/2018/01/08/new-year-new-frameworks/?.

LOL. Of course the REST & JSON stuff was always going to trend toward qualities of WS-* standards - people need parts of those standards.

But seriously, what's your read of https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md#parameterObject (including parameter style and the extension mechanism) and https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md#operationObject ? Is the spec still too constrained and would it need to be enhanced to properly support "normal" things developers do with the Action API?

Separately, all around it sounds like everyone wants simpler installation if there's a dependency on Node.js Parsoid.

I also wonder whether the Parsoid nodejs service is anywhere near being able to replace the PHP parser's generation of links table metadata and other metadata, or to support extensions such as Scribunto.

Yeah, I'm not sure about that part. @ssastry do you have a read on this?

We haven't worked on these aspects, but part of the reason is that it depends on some other decisions about whether we will port the PHP preprocessor to node.js, OR port the Parsoid code base to PHP, OR keep the current setup as is (which would require some Action API updates). But https://www.mediawiki.org/wiki/Parsoid/Known_differences_with_PHP_parser_output should be generalized to add additional rows for some of this additional functionality that needs to be supported. I am aware of these two use cases, and of the need to generate the TOC and ensure skinning continues to work, but I am not yet sure if there are other things that we should account for. I'll update that page and would appreciate additional input there -- best to directly edit the page.

The mission says we want to empower people to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally. Sure, we could do that simply by providing a platform (Wikipedia, etc) for people to collect and develop content. But isn't it even more useful to also give them the software itself so they can set up their own platforms? And it also helps protect that collected content from some disaster taking out the WMF, since we publicly provide not just dumps of the data but also the software and configuration to make direct use of those dumps.

I'm not saying that we should not release our software under an open source license and make it publicly available. I think all companies (and individuals) should do this.

My statement (and position statement) was based on what I was taught in the Wikimedia Foundation's cultural orientation: 'Technology is not a part of our mission, technology is a byproduct of our mission', and an example given was: 'If we discovered that it was more effective to use pigeons to disseminate educational content, we would start using pigeons tomorrow.' While this is obviously hyperbole, it illustrates the point that we should always self-evaluate our own solutions and be willing to change them (even if that means completely abandoning our current solutions).

My question is simply this:
If there was a solution that was more effective at disseminating educational content for the next ~15 years than MediaWiki, shouldn't we consider that solution?

One of the major arguments against this is: We are already developing MediaWiki, so we should continue developing MediaWiki. But as I've mentioned before (T183313#3914993), this is the sunk-cost fallacy and is not a valid argument.

The other argument is that the cost of migrating to another system (whatever that might be) is so prohibitive that it is not even worth considering. However, given a long enough timeline, the benefits of migrating may outweigh the cost. I am making an argument against short-term gains in favor of potential long-term gains.

Again, I have no idea if it would actually be more effective to move to something other than MediaWiki. I am, however, really discouraged at the overwhelming majority view, which is completely dismissive. I think it takes a great deal of intellectual dishonesty to completely dismiss an idea based on the premise, and that makes me sad. I can be completely wrong, and that's fine, but I think it's worth considering. I don't want to sacrifice the mission on the altar of MediaWiki.

*[ Subbu responding after the fact: This was before my time, but here is what I understand. It was NOT possible to write this in PHP in 2012 when this started. There was no HTML5 parser, and performance was a concern because of all the additional work that needed to happen. There was PEG.js available. So, the separation wasn't necessarily on a whim but because writing it in PHP wasn't feasible at that time. I don't know if I am speculating here, but there was some passing idea / thought of possibly running Parsoid in the browser, and node.js was what enabled it. So, if we had to do it all in PHP, Parsoid and the VE launch might have taken much longer. The original Parsoid design had (and still has remnants of) an integrated preprocessor and wasn't meant to be a separate call into the M/W api. The call to the M/W api happened in the rush to have things ready for the Dec 2012 VE launch, and since that worked, it stuck and we didn't pursue the preprocessor integration beyond that. Also see https://www.mediawiki.org/wiki/Parsing#E:_Evaluate_feasibility_of_porting_Parsoid_to_PHP. Note that RemexHTML, the HTML5 parser in PHP, is based off the node.js HTML5 parser, domino, which is something we switched to around 2014 from another HTML5 parser. ]

I recall that when Parsoid was started, the nodejs service was supposed to be a prototype before rewriting it in C++, perhaps with the ability to compile it as a PHP extension (cf. php-luasandbox). The nodejs service wasn't originally supposed to be the final product of the project. Of course, the original idea of rewriting in C++ was "deprioritized" in 2013.<ref>https://www.mediawiki.org/wiki/Parsoid/status</ref>

For a whole bunch of reasons, the C++ port idea was just deprioritized and then dropped after Gabriel and Adam (Wight) spent about a month on it, because the Dec 2012 and July 2013 VE launches had to happen ... and then they happened, and it was all hands on deck to make sure VE was fully supported. Perf. wasn't really an issue at that point, and given how many people were working on Parsoid, a port to C++ would have seriously compromised our ability to continue to support VE. So, it never bubbled back up as something we would undertake. I was not directly involved in that decision making, but that is my understanding of some of those decisions that happened in the 2012-2013 timeframe.

I think I did hear talk at some point that nodejs was good because it might allow running the same JS in the browser too, although I don't recall whether that was specifically about Parsoid. And I've yet to see any example of that in any of our nodejs-using projects.

Indeed. That is why I qualified my comment with whether I am speculating or if that was something that I had actually heard as a reason at one point.

My statement (and position statement) was based on what I was taught in the Wikimedia Foundation's cultural orientation: 'Technology is not a part of our mission, technology is a byproduct of our mission'

Perhaps we as a whole should have a discussion about that, since it seems at odds with what some of us believe. But not here in this Phab task.

If there was a solution that was more effective at disseminating educational content for the next ~15 years than MediaWiki, shouldn't we consider that solution?

That question is not at all the statement I responded to. Let's not move the goalposts.

Again, I have no idea if it would actually be more effective to move to something other than MediaWiki. I am, however, really discouraged at the overwhelming majority view, which is completely dismissive. I think it takes a great deal of intellectual dishonesty to completely dismiss an idea based on the premise, and that makes me sad. I can be completely wrong, and that's fine, but I think it's worth considering. I don't want to sacrifice the mission on the altar of MediaWiki.

You should refrain from accusing your coworkers of "intellectual dishonesty". It's a very poor way to try to support your arguments.

Again, I have no idea if it would actually be more effective to move to something other than MediaWiki. I am, however, really discouraged at the overwhelming majority view, which is completely dismissive. I think it takes a great deal of intellectual dishonesty to completely dismiss an idea based on the premise, and that makes me sad. I can be completely wrong, and that's fine, but I think it's worth considering. I don't want to sacrifice the mission on the altar of MediaWiki.

I suspect part of the reason you are encountering the reaction is that you are seriously underestimating the complexity involved. A radical proposal like this has to factor in the challenges in undertaking such a project to be taken seriously.

To give you an example, even a "simple" project like replacing Tidy (T89331) with an HTML5 parser has ballooned from a ~1 year project to a 3 year project. The task of writing a replacement wikitext parser that supports other functionality has taken 6 years and counting. So, a proposal to completely replace MediaWiki (not just an HTML5 parser or the wikitext parser) with all the functionality it supports and is being asked to support needs to engage with the challenges that it entails.

Again, I have no idea if it would actually be more effective to move to something other than MediaWiki. I am, however, really discouraged at the overwhelming majority view, which is completely dismissive. I think it takes a great deal of intellectual dishonesty to completely dismiss an idea based on the premise, and that makes me sad. I can be completely wrong, and that's fine, but I think it's worth considering. I don't want to sacrifice the mission on the altar of MediaWiki.

Can we avoid this kind of comment about others' intellectual honesty, please?

I would like all of us to assume that whatever we write is said in good faith, and with the best intentions.

This comment was removed by dbarratt.

Also, Drupal (or Symfony etc.) being on par with MediaWiki in terms of scaling is very much [citation needed]. nih.org is said to be the top-ranking Drupal site, and a slightly outdated comScore report claims they get 10M uniques a month. That's about 1% of what Wikimedia sites get, and then I'm sure the ratio of writes is a tiny fraction of even that (outside of social media, very few sites are seriously write-heavy).

My personal experience with scaling Drupal 7 to maybe 1/1000th of the traffic of Wikimedia has been below abysmal. There are several reasons for that which might have been fixed in the meantime, like the way database connections are managed, the way caching is done, the way data storage works, and the little-to-very-broken support for CDN caching at the time.

Again, things might have changed radically in the meantime, but my professional experience with it has not been exciting as far as scalability is concerned.

In general, I'd be very surprised if any general-purpose CMS had the scalability features MediaWiki has, because, well, none has been shaped by running a top-10 website. Not surprisingly, WordPress with some peculiar extensions is the one that comes closest.

But to be honest, I think this discussion is hijacking what this ticket was about and is mostly tangential and off-topic.

Here are some distilled notes from the session.

TL;DR: the notes from the 3 primary topics (API, MW Refactor, and Single Parser) are being used by the ATWG to create proposals for the Annual Plan.

Enable multi-client support by exposing all functionality through APIs (was: API First)

Proposal:

  • Design and create a fully featured REST API in MediaWiki Core
    • with consistent responses
    • and easy-to-implement caching and purging semantics for developers.
  • Create an Internal Service Interface for PHP to be used by the PHP UI.
  • Require the PHP UI to use this interface (incrementally).
  • Create a REST HTTP API on top of the Internal Service Interface for other clients (MF and 3rd party); see the sketch after this list.
  • Use best practices and code reviews to ensure developers expose APIs and implement caching semantics for all new functionality.
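As a rough illustration of the layering proposed above, here is a hedged sketch with hypothetical names and an illustrative route (not an existing MediaWiki API): the internal service interface owns the business logic, the PHP UI would call it directly, and the REST layer is only a thin HTTP mapping that adds explicit caching semantics.

```php
<?php
// Hypothetical internal service interface: both the PHP UI and the REST
// layer call this, so there is a single home for the business logic.
interface PageContentService {
	/**
	 * @param string $title
	 * @return array [ 'html' => string, 'lastModified' => string (HTTP date) ]
	 */
	public function getRenderedPage( $title );
}

// Thin REST handler on top of the internal interface: it only maps the
// request to a service call and adds explicit caching semantics.
class PageRestHandler {
	/** @var PageContentService */
	private $service;

	public function __construct( PageContentService $service ) {
		$this->service = $service;
	}

	// e.g. GET /api/v1/page/{title}/html  (illustrative route only)
	public function handleGet( $title ) {
		$page = $this->service->getRenderedPage( $title );

		header( 'Content-Type: text/html; charset=utf-8' );
		header( 'Last-Modified: ' . $page['lastModified'] );
		// Explicit caching headers so CDNs and clients can cache, and
		// purge-on-edit has a well-defined target.
		header( 'Cache-Control: public, s-maxage=86400' );
		echo $page['html'];
	}
}
```

A mobile client, a 3rd-party tool, and the PHP UI would then all exercise the same PageContentService, which is what makes the "use this interface" bullets above enforceable in code review.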

Questions:

  1. Benefits of this strategy are not clear to everyone. Can we better articulate the benefits for others?
  2. Is the engineering culture willing to enforce an API-first mentality in code reviews?
  3. Do we want the ability for anyone to create an editor, publish and use Wiki content? Do we want this flexibility for ourselves? If yes, we want to embrace this strategy.
  4. How do we handle APIs for data that does not come from MediaWiki? (Analytics/Scores)
  5. Scalable purging and caching probably requires an event propagation system. How do we implement this in a way that preserves the LAMP stack for 3rd parties (extract into a service?) See the sketch after this list.
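For question 5, one possible shape (names are hypothetical, not an existing API) is to hide purging behind a small event interface: a plain LAMP install gets a synchronous default with no extra services, while a large wiki farm swaps in an implementation that publishes the same event to a queue.

```php
<?php
// Hypothetical abstraction: code that modifies a page emits a purge event
// through this interface without knowing how the event is delivered.
interface PurgeEventSink {
	public function pageChanged( $title, array $urls );
}

// Default for a plain LAMP install: purge synchronously, no extra services.
class SynchronousPurger implements PurgeEventSink {
	public function pageChanged( $title, array $urls ) {
		foreach ( $urls as $url ) {
			// e.g. send an HTTP PURGE to a local cache, or clear APCu keys.
			error_log( "Purging $url after a change to $title" );
		}
	}
}

// A large farm could instead publish the same event to a queue
// (Kafka, EventBus, ...) and let consumers fan out the purges.
class QueuePurger implements PurgeEventSink {
	/** @var callable */
	private $publish;

	public function __construct( callable $publish ) {
		$this->publish = $publish;
	}

	public function pageChanged( $title, array $urls ) {
		call_user_func( $this->publish, [ 'title' => $title, 'urls' => $urls ] );
	}
}
```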

Follow Up:
This proposal is being integrated into the Program for Platform Evolution in the Annual Plan. The notes and questions here will be used for that proposal and will be discussed in upcoming ATWG meetings. The ATWG has since discussed this topic; there is wide consensus on the principle, and it has proposed that we prototype an API and then use that prototype to plan the work.

Improve modularization, re-use, maintainability, and testability of MediaWiki Core (Was: Refactor MediaWiki Core)

(Note: This work is a dependency for the API proposal)
Proposal:

  • Use the service container
  • No global state
  • Favor dependency injection (DI); see the sketch after this list
  • Separate UI and business logic
  • Encapsulate functionality into libraries
  • Improve security through narrow interfaces
  • Better allow selective SOA (to simultaneously support WMF and 3rd party needs like event propagation)
  • Refactor (not rewrite) has consensus - must be done incrementally
  • Use best practices and code reviews to ensure developers limit access to global state
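A minimal sketch of the DI bullets above, using hypothetical stand-in interfaces rather than real MediaWiki classes: the class declares its collaborators in the constructor instead of reaching for global state, so it can be unit-tested with mocks, and only wiring code ever talks to the service container (MediaWikiServices in today's core).

```php
<?php
// Hypothetical narrow interfaces, standing in for real MediaWiki services.
interface PermissionChecker {
	public function userCan( $user, $action, $title );
}
interface PageStore {
	public function rename( $from, $to );
}

// "No global state / favor DI": the class receives its collaborators through
// the constructor instead of calling global singletons or wfGetDB() inside,
// which makes it straightforward to unit-test with mocks.
class PageMover {
	/** @var PermissionChecker */
	private $permissionChecker;
	/** @var PageStore */
	private $pageStore;

	public function __construct( PermissionChecker $permissionChecker, PageStore $pageStore ) {
		$this->permissionChecker = $permissionChecker;
		$this->pageStore = $pageStore;
	}

	public function move( $from, $to, $user ) {
		if ( !$this->permissionChecker->userCan( $user, 'move', $from ) ) {
			return false;
		}
		$this->pageStore->rename( $from, $to );
		return true;
	}
}

// Only the wiring layer would ask the service container to build a PageMover;
// business-logic classes like this one never touch the container themselves.
```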

Questions:

  1. What concrete steps can we take now?
  2. The outcomes mostly have consensus, but some implementation details do not (like DI); how do we come to an agreement here?
  3. Some say we need global state, others say we do not. How do we resolve this?
  4. Are there any extension features that should be refactored into core to simplify this?
  5. Enforcement for new code seems to be required so that we do not incur new technical debt, even though the exact implementation is TBD. But how do we enforce this? (Automation techniques, code review standards, etc…)
  6. There are concerns about the size of this task.

Follow Up:
This proposal is being integrated into the Program for Platform Evolution in the Annual Plan. The notes and questions here will be used for that proposal and will be discussed in upcoming ATWG meetings. The ATWG has since discussed this topic, and so far there are still open questions around DI and global state. There is some email follow-up currently in progress - the hope is to present some paths forward to a wider group soon for open discussion.

Unified Parser

Proposal:

  • A single parser (broad consensus)
  • Use Parsoid (some consensus)
  • Port to PHP or fix the NodeJS implementation (leaning towards PHP, but we need to perform tests to understand the implications of such a move)

Questions:

  1. What are the performance implications of porting Parsoid to PHP? We need a prototype. (Moving async to sync code) https://www.mediawiki.org/wiki/Parsing#E:_Evaluate_feasibility_of_porting_Parsoid_to_PHP
  2. How much work will it be to complete the port?
  3. If unable to port to PHP due to performance issues, what needs to be done to clean up the interface and remove extra round trips to MediaWiki?

Follow Up:
This proposal is being integrated into the Program for Platform Evolution in the Annual Plan. The notes and questions here will be used for that proposal and will be discussed in upcoming ATWG meetings. Additionally, the Parsing team is planning a prototype to answer question 1.

@daniel Thank you for organizing your session at the 2018 Dev Summit. Please make sure to document your outcomes and next steps, and link to any tasks that were created as outcomes of this session. Otherwise please resolve this and the corresponding Phabricator tasks when there is nothing left to do in those tasks.

Rfarrand claimed this task.