WikiDev 16 working area: Software engineering
Closed, Resolved (Public)

Description

This is a potential area for work at Wikimedia-Developer-Summit-2016. "Software engineering" is about building and delivering high quality code. Central problem: "how do we simultaneously optimize the following conditions? 1) make software development more logical and obvious for all Wikimedia contributors, 2) make Wikimedia software more useful and reliable for the Wikimedia sites"

Below are some session proposals for this area.

Please use comments to propose sessions or discuss the ranking and categorization.

= if there is time
= would be helpful
= must have

Architecture

T96903: Identify and prioritize architectural challenges (as a brain storming session)
T384: RfC: Dependency Injection for MediaWiki core
T114542: Next Generation Content Loading and Routing, in Practice (overlap with T114803? T99088? T111588?)
T114803: Service-Oriented Architecture, quo vadis? (overlap with T113210? T114542? T99088? T111588?)
T99088: [RFC] Evolving our content platform: Content adaptability, structured data and caching (overlap with T114803? T114542? T111588?)

Framework & Technical Debt

T107595: [RFC] Multi-Content Revisions (important for content platform)
T114394: RFC: PageLookup service and PageRecord object
T113034: RFC: Overhaul Interwiki map, unify with Sites and WikiMap
T114071: Let's discuss the skin creation process
T113002: Let's discuss LanguageConverter
T114065: The future of MobileFrontend
T114474: RFC: More flexible and modernized ChangesList formatting for Recent Changes

Deployment and Distribution

T114045: Scap3: updates, upgrades, and challenges
T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services
T105638: RFC: Streamlining Composer usage
T113210: How should Wikimedia software support non-Wikimedia deployments of its software? (overlap with T114803?)

Procedures

T114419: Event on "Make code review not suck"
T114384: Standardise procedures for deprecating public-facing code
T114320: Code-review migration to Differential status/discussion

Other working areas (and the meta conversations about the idea of working areas) can/should be found here: T119018: Working groups/areas for macro-organization of RfCs for the summit

Event Timeline

Note that T114320: Code-review migration to Differential status/discussion is also proposed for Collaboration (T119030). Collaboration focuses on processes, so maybe it is better to place this session there.

I wonder whether T114384: Standardise procedures for deprecating public-facing code should also go under Collaboration.

... and I'm still confused about T96903: Identify and prioritize architectural challenges, but if it is about "How should we identify and prioritize architectural challenges?", then it might be another candidate for Collaboration.

This working area seems to be an amalgam of Architecture, Sustainability, and Deployments. I think we should consider the central question here to be "how do we build high-quality software that we can dramatically increase the number of people that can understand it while increasing the reliability and maintainability of Wikimedia sites?" Collaboration goals should be in harmony with the goals of this working area, but it's entirely possible to increase our reliability and maintainability without scaling up the number of people collaborating on the software.

As I suggested in another comment, I think I made a mistake by suggesting that we shoot for one-and-only-one category. @cscott's idea of designating a primary working area with the opportunity for multiple secondaries seems sensible to me.

@RobLa-WMF It's unclear to me how the deployment process sessions fit into this topic area. There is some overlap in the area of package management (composer & co), but otherwise, the deployment process (scap & co) seems unrelated to software engineering as such.

@daniel: I believe this area is here to answer the following question: "how do we build high-quality software that we can dramatically increase the number of people that can understand it while increasing the reliability and maintainability of Wikimedia sites?" I believe that @thcipriani will be happy to tell you why and how T114045: Scap3: updates, upgrades, and challenges will help answer that question. If he's not, there might be some other question that T114045 is a better answer for, but I'm not sure what it is. So, using that as an example, you may be arguing as to why that session should be cut, rather than arguing that it's miscategorized.

The general reason for putting the design patterns work in with the test infrastructure piece is that their goals seem similar to me. T384: RfC: Dependency Injection for MediaWiki core seems to rely heavily on emphasizing the importance of testing; thus it seems logical to me that we talk more generally about the obstacles to testing/reliability, and more importantly, use the quality of answers to the central question posed as a means of determining what survives the cut list as we narrow down the session count to a manageable schedule.

@RobLa-WMF My concern is about the target audience. Do we group the sessions by the implementation dependencies (blockers), or by audience?

You are right that CI infrastructure and DI framework have a common goal: better test coverage, and by that, higher code quality. From that perspective, they belong together. What worries me is that the people and technologies involved are largely distinct. CI infrastructure is completely independent of the question of whether and how we do DI, and people who would be impacted by DI are typically not involved with maintaining the CI infrastructure. From the perspective of event planning, I would even be inclined to schedule such sessions in parallel, because of the low overlap in target audience.

Similarly, the scap3 proposal identifies two potential audiences: "Opsen who can help move MediaWiki and other projects to a more automated deployment" and "Repo deployers whose repositories haven't yet (by that point) moved to Scap3". These are generally not the people who are interested in how page translation code works or how we should refactor the code that is now in MobileFrontend.

There is of course some overlap: SOA has implications for the software architecture as well as for deployment and networking. The same is true for package management, etc.

Anyway - I suppose my question is what should govern the planning process: logical dependencies (blockers of implementation), or target audience? For managing software development, we use the former, but for setting up an event schedule, we should perhaps focus on the latter. In terms of target audience, deployment would perhaps fit better with (some of) what's in the API category.

What worries me is that the people and technologies involved are largely distinct.

Yes, that worries me too. Let's fix that.

You are right that CI infrastructure and DI framework have a common goal: better test coverage, and by that, higher code quality.

I don't think DI is limited to, or even mainly about, testing (although easier testing is certainly one of the major benefits). DI decouples functionally unrelated parts of the code, and by that increases the velocity of change (you don't have to go around and update all of MediaWiki when you change some central component). It can also be used to give much larger control over the architecture to reusers, and by that enable a paradigm shift from content management system to CMS development framework (personally I think that could be the key for making MediaWiki financially independent from the Wikimedia movement).

enable a paradigm shift from content management system to CMS development framework

@Tgr I like that :)

-1 to putting scap3 and things that will involve services on parallel tracks.

Dependency injection is fundamentally the process of encouraging objects to collaborate via composition (has-a) instead of inheritance (is-a) and to externalize the realization of those dependencies via setter or constructor injection. There are two strong arguments for this design pattern in my opinion that are relevant to MediaWiki:

  1. Loosely coupled dependencies can be replaced by mocks and stubs during testing to allow more fine grained verification of business logic.
  2. Loosely coupled dependencies can be replaced with alternate implementations at runtime to optimize for different use cases related to performance, security and complexity concerns.

Both of these concerns are of interest to the Release-Engineering-Team. The second concern is of interest to the Services team. These two teams will be major stakeholders in the scap3 discussions.
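The two arguments above can be sketched in a few lines. This is only an illustration in Python, not MediaWiki code (which is PHP), and every name in it (`PageStore`, `DatabasePageStore`, `StubPageStore`, `WordCounter`) is hypothetical. It shows a class that has-a dependency received through its constructor, wired once with a "real" implementation and once with a stub for testing.

```python
from typing import Protocol


class PageStore(Protocol):
    """Hypothetical dependency: anything that can load page text by title."""

    def load(self, title: str) -> str: ...


class DatabasePageStore:
    """Stand-in for a production implementation (here just a dict)."""

    def __init__(self, rows: dict[str, str]) -> None:
        self._rows = rows

    def load(self, title: str) -> str:
        return self._rows[title]


class StubPageStore:
    """Test double that returns canned text and records what was requested."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.requested: list[str] = []

    def load(self, title: str) -> str:
        self.requested.append(title)
        return self.text


class WordCounter:
    """Business logic that has-a PageStore instead of extending one."""

    def __init__(self, store: PageStore) -> None:
        # Constructor injection: the caller decides which store is used.
        self._store = store

    def count(self, title: str) -> int:
        return len(self._store.load(title).split())


# Runtime wiring with the stand-in "real" implementation.
production = WordCounter(DatabasePageStore({"Main_Page": "Welcome to the wiki"}))

# Test wiring with a stub; WordCounter itself is unchanged.
stub = StubPageStore("one two three")
under_test = WordCounter(stub)
```

Because `WordCounter` never constructs its own store, the same class serves both wirings; swapping in a cached or remote store would follow the same shape.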

I do however tend to agree that we will have people interested in talking about T107595: [RFC] Multi-Content Revisions who have no interest at all in T114045: Scap3: updates, upgrades, and challenges.

I personally think that T114320 (Differential as code-review) is a really important topic for WikiDev16. It's also going through the RFC process, but having an open time to discuss this with people in the same room would be, I believe, very beneficial (along the lines of making sure we do it right).

Maybe one thing that would be good to do here is to clarify the central question. The version I wrote initially is this: "how do we build high-quality software that we can dramatically increase the number of people that can understand it while increasing the reliability and maintainability of Wikimedia sites?" I understand the idea I was trying to communicate, but I don't think I communicated it well.

Basically, what I'm trying to say is we want to solve these problems simultaneously:

  • Create high-quality software
  • Improve the quality of our existing software
  • Increase the number of people that can simultaneously participate
  • Decrease the skill level required to participate
  • Ensure that specialists can bring their skill to our software without needing to jump through a lot of hoops or become experts in areas they don't care to learn
  • Minimize the time between idea and sitewide deployment
  • Ensure that deployments happen efficiently
  • Ensure that mistakes are corrected well before they are deployed
  • Minimize the amount of paid staff we need to assign to minutiae/crapwork
  • Ensure that improving our site software is rewarding

My original statement of the problem was an attempt to state that concisely, but it could use some wordsmithing. I'd like to make sure that we define this concisely, as well as to make the problem distinct from T119030: WikiDev 16 working area: Collaboration. I see T119030 as more focused on the social side of the equation, whereas this area (T119032) is more about the technical side. To use a playground metaphor, I see T119030 as about making sure the kids on the playground are able to have fun without worrying about fights, predators, criminals, unfair playmates, unfair/negligent parents, etc, whereas T119032 is about whether the playground equipment is well maintained, works well, is the best equipment for the space, and is safe and fun.

Ideas?

Back in September, @cscott commented on T96903: Identify and prioritize architectural challenges

[We should strive for] programming language agnosticism in core. We can embed PHP in node, and vice-versa. Let's invest in the infrastructure necessary for mediawiki-core to play well in a multi-language environment. Perhaps a service-oriented architecture is part of this, so that more parts of core can be split into separate services and accessed via language-agnostic APIs. Perhaps it's investing in a PHP-node bridge so that extensions can be written in JavaScript and play nicely with code's PHP engine. It's too early (and unwise) to consider rewriting the PHP core of mediawiki -- but we can start the process of decoupling PHP from our identity, so that PHP isn't a wall for new contributors to climb.

This comment seems to be relevant to this area, and seems to tie into T113210: How should Wikimedia software support non-Wikimedia deployments of its software? and T114803: Service-Oriented Architecture, quo vadis? Is this something that people in this area are prepared to speak about? Are we prepared to have a productive conversation on this topic, or are we doomed to continue talking past one another on it?

@RobLa-WMF The multi-lingual topic seems to fit better into T119029: WikiDev 16 working area: Content access and APIs I think, though it certainly has implications on architecture.

I personally think that factoring services out of core using a SOA approach is the way to go. In-process bridges between languages as different as JS and PHP seem like asking for trouble. But I have not investigated this; it's just a first take.

@cscott: do you still believe this is a good solution? Is this the right place to have this conversation? If not, where?

Qgil triaged this task as Medium priority. Dec 11 2015, 8:11 AM

Here is my current take on what sessions we should have to cover the Software Engineering (aka Code Quality) topic area:

I think making our codebase more modular and testable is essential for the future of MediaWiki, so we should talk about T384: RfC: Dependency Injection for MediaWiki core and perhaps other architecture guidelines. In this context, it would be useful to discuss concrete parts of the code base that are in need of refactoring, as suggested by T114071: Let's discuss the skin creation process and T113034: RFC: Overhaul Interwiki map, unify with Sites and WikiMap.

Another important goal is making our content model more flexible. One very broad RFC about that is T99088: [RFC] Evolving our content platform: Content adaptability, structured data and caching, which also touches the content format (T119022) and access APIs (T119029) topics. In this context, we should definitely talk about T107595: [RFC] Multi-Content Revisions (which may also fit into the content format track) and about T114065: The future of MobileFrontend.

The topic of multi-lingual content is becoming increasingly important, and progress is held back by the monolithic design of MediaWiki core. To fix this, we should have T113002: Let's discuss LanguageConverter and T113034: RFC: Overhaul Interwiki map, unify with Sites and WikiMap on the program, and perhaps include T114640: make Parser::getTargetLanguage aware of multilingual wikis, too.

Finally, testing and deployment (CI) are essential for maintaining the quality of our code and services. The hottest topic there is probably T114045: Scap3: updates, upgrades, and challenges, with T114457: [RFC] Use `npm install mediawiki-express` as basis for all-in-one install of MediaWiki+services and T105638: RFC: Streamlining Composer usage tying into the discussion.

It seems to me that four sessions would be enough to cover the above. Maybe it can be folded into three or even two, but we'll have to trim them down some more then.

One quick topic: the name "Software Engineering (aka Code Quality)" seems biased toward reducing the scope of this area (in particular, the "Code Quality" part). I'm not comfortable renaming this area, given how I read this. Is your intention to reduce the scope of this area to focus more on our PHP code and less about Puppet manifests, test infrastructure, etc?

I'm going to attempt to summarize the must-haves captured in @daniel's session description:

Did I get this right? T114320: Code-review migration to Differential status/discussion isn't even considered in the list, despite @greg's comment T119032#1843116 suggesting it "is a really important topic for WikiDev16". Is that an oversight?

I find the name "Software Engineering" problematic, because at least two other topic areas (content format and access APIs) are also software engineering.

My intention is not to narrow this topic area to PHP code quality / refactoring, though I'm thinking that the 80 minute slot currently reserved for "software engineering" should focus on technical debt and architecture in core. But we shouldn't just drop the other things. Code quality also matters in JS, and CI infrastructure is essential to maintaining code quality. Deployment is more about the quality of our services, but since it's closely related to CI management, it makes sense to bundle these things.

As to your summary: this does not seem to incorporate the comment I wrote yesterday in T119032#1885039; for instance, it misses the multilingual aspect. I'll update the task description to reflect my current thinking.

I agree T114320: Code-review migration to Differential status/discussion and T114419: Event on "Make code review not suck" should definitely be on the schedule; these are very important topics. But I think they fit better into the Collaboration track (T119030). Quim discussed the overlap a bit in his comment T119032#1824587.

As to frameworks, we should look into T99088: [RFC] Evolving our content platform: Content adaptability, structured data and caching as I suggested yesterday. But I wouldn't group frameworks and technical debt. Technical debt fits much better with the architecture guidelines. The "framework" aspect is more about evolving our domain model.

I've been thinking about the central question I think this area should be addressing, and I think I've got it: "how do we simultaneously optimize the following conditions? 1) make software development more logical and obvious for all Wikimedia contributors, 2) make Wikimedia software more useful and reliable for the Wikimedia sites" It's a bit wordy, but I think it clarifies the problem we should be trying to solve in this area. Thoughts?

how do we simultaneously optimize the following conditions?

The question seems to assume a false dichotomy. However, there have been multiple hints in the past that some people in WMF believe in such a dichotomy, so it might be worth addressing whether it is real or invented (no opinion on the matter).

@RobLa-WMF I think your "lead question" sums it up pretty well. My question is: how many sessions can we use to cover all that ground?

@Nemo_bis In my mind, these two things absolutely go together. I see no contradiction, just a different focus, with some overlap. I share your impression that some people seem to see these as competing goals, though.

Some thoughts after a conversation with @RobLa:

For the main "Software Engineering" slot at the summit, we don't necessarily have to go through a list of RFCs. That can potentially be done in breakout sessions. The goal for the main slot is to give a broad overview of, and gather input on, the things we want to do to make MediaWiki easier to develop and deploy.

Here are some of the key points, off the top of my head:

  • Improve Testing. Integrate browser tests, cross-test with different extensions, use the same deployment process for CI and production, etc.
  • Improve Deployment. MediaWiki should be easy to deploy, manage, and update for 3rd parties as well as Wikimedia, whether for production, development, or testing. Package management is one important aspect of this, virtualization is another.
  • Modularization. Make software components easier to exchange, re-use, and re-combine. Dependency injection is one aspect of this, SOA is another.
  • Content Abstraction. Evolve our model from "a wiki page is a block of markup" to "a wiki page is composed of arbitrary content objects". This allows more use cases to make use of the wiki storage/versioning infrastructure, instead of brewing their own.
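As a rough illustration of the content abstraction point, the shift from "one block of markup" to "a page composed of content objects" might look like the following data structure. This is a sketch under assumptions, not the design from any of the RFCs: the class names (`Content`, `Revision`), the slot roles (`main`, `metadata`), and the model labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Content:
    """Hypothetical content object: a content model name plus its payload."""

    model: str  # e.g. "wikitext" or "json"; labels are illustrative only
    data: str


@dataclass
class Revision:
    """A page revision composed of named slots rather than one markup blob."""

    slots: dict[str, Content] = field(default_factory=dict)

    def set_slot(self, role: str, content: Content) -> None:
        self.slots[role] = content

    def get_slot(self, role: str) -> Content:
        return self.slots[role]


# One revision carrying two independent pieces of content.
rev = Revision()
rev.set_slot("main", Content("wikitext", "'''Hello''' world"))
rev.set_slot("metadata", Content("json", '{"license": "CC-BY-SA"}'))
```

The point of the shape is that a new use case adds a slot and a content model instead of inventing its own storage and versioning.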

All these things tie into each other, and it makes sense to me to have a high level conversation about these in a single session. I'm not sure yet when and where we will have time to actually make decisions about concrete implementations. If all else fails, we can still do this on day three. That may be better than scheduling anything in parallel to the high level plenary sessions.

@bd808 Rob mentioned you might be able to help with structuring the session. What are your thoughts on the above?

If I'm understanding correctly, the point would be to make the best use of the 80 minute main room discussion on day 2 of the Summit. If so, I agree with @RobLa that trying to adjudicate one or more RFCs would be a poor use of the collective attention.

What might be a good use of the time is trying to find some consensus on the general themes that WMF teams and individual contributors should focus on in the 2016 calendar year. It would be really awesome if we could set some goals for MediaWiki in general based on those themes and find people willing to keep the ideas active over a longer time than just the Summit and a short time immediately following. High level themes like those that you have suggested are good, but having more measurable and actionable goals would be better.

Do we have natural proponents for themes who could work up one or two potential goals to put forth to kick off discussion? I'll try to come up with some ideas myself as well.

My must attend sessions related to this area

Why I picked these sessions

Under the banner of "software engineering" I'm looking for topics that are relevant to all technical contributors to the MediaWiki code bases. I want to see informed discussion and adoption of shared points of view that can be reinforced with actions across all WMF development teams as well as the larger MediaWiki FLOSS developer community. As I mentioned in T119032#1891985 these shared points of view will come with champions willing to keep the actions decided on front of mind for all developers in the coming year.

For me the Services T114803 and DI T384 conversations go hand in hand. I believe that good integration of external services with MediaWiki will depend on introducing internal service facades that can be realized by either PHP code or external services. This to me implies a need for the easier composition and configuration of the service facades that DI hopes to achieve.
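To make the "service facade" idea concrete, here is a minimal sketch in Python (MediaWiki itself is PHP, so this is only an analogy). All names are hypothetical, including `HtmlRenderer`, `build_renderer`, the `renderer_url` config key, and the localhost URL. The remote transport is passed in as a plain callable so that the example does no real network I/O.

```python
from typing import Callable, Protocol


class HtmlRenderer(Protocol):
    """Hypothetical service facade: turns markup into HTML."""

    def render(self, markup: str) -> str: ...


class InProcessRenderer:
    """Facade realized by local code."""

    def render(self, markup: str) -> str:
        return f"<p>{markup}</p>"


class RemoteRenderer:
    """Facade realized by an external service.

    The transport is injected as a callable; in practice it would wrap an
    HTTP client, which this sketch deliberately leaves out.
    """

    def __init__(self, url: str, post: Callable[[str, str], str]) -> None:
        self._url = url
        self._post = post

    def render(self, markup: str) -> str:
        return self._post(self._url, markup)


def build_renderer(config: dict, post: Callable[[str, str], str]) -> HtmlRenderer:
    """Pick the realization from configuration; callers see only the facade."""
    url = config.get("renderer_url")
    if url:
        return RemoteRenderer(url, post)
    return InProcessRenderer()


def fake_post(url: str, body: str) -> str:
    # Stand-in transport used only for this example.
    return f"<div data-from='{url}'>{body}</div>"


local = build_renderer({}, fake_post)
remote = build_renderer({"renderer_url": "http://localhost:8000/render"}, fake_post)
```

Code that calls `render()` does not change when an installation moves from the in-process realization to the external one; only the configuration does, which is the composition and configuration concern mentioned above.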

Similarly I think that the Services T114803 and non-technical installs T113210 discussions are closely related. Supporting the use of Visual Editor outside of the Wikimedia production cluster is a must have in my opinion for the long term viability of MediaWiki as a general purpose wiki platform. Parsoid and any other future services that are necessary for VE need to be easy to install and maintain.

The last two topics (T114419 & T114320) are related in more than just that Differential is a tool for code review. I believe that the Release-Engineering-Team will actually implement the conversion from Gerrit to Differential in the 2016 calendar year. This conversion will bring with it new workflows for all contributors and seems like a perfect time to introduce changes in the way that we think and act when proposing code changes and reviewing changes proposed by others.

If there are no conflicts

Why other things didn't make the cut

Thanks a lot for this analysis @bd808! I think I like it!

The problem now is how to fit it into the very tight schedule. Do you think it's possible to have all those must-have topics in a single session? If I understand correctly, that's currently the idea. It may be possible to spin off a few other sessions, but they will then likely conflict with the must-have sessions of other areas.

In the draft schedule at https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016, T113210: How should Wikimedia software support non-Wikimedia deployments of its software? and T114419: Event on "Make code review not suck" already have their own 80 minute slots. Sadly, T114419 is in the last slot of day two, so to follow it with a discussion of T114320: Code-review migration to Differential status/discussion we would need to plan on unconference time during day 3.

With the 80 minutes that this task is going to own I think I'd split the time between T114803: Service-Oriented Architecture, quo vadis? and an attempt at defining shared goals for advancing particular engineering practices.

Discussion of T384: RfC: Dependency Injection for MediaWiki core can happen on the patches and in other normal RfC channels and/or be incorporated as part of a practice focus of introducing service facades and other objects that will benefit from composition orchestration.

I hope that people interested in this area could help with T129651: Align use of #architecture and #technical-debt tags; in particular, this part of the conversation:

I've never thought of Technical-Debt as mostly "design" issues but rather missing unit tests, duplicated code, using deprecated API or incomplete API.

So can something "missing unit tests, duplicated code, using deprecated API or incomplete API" still be considered "well designed"?

Yes, for instance if the code duplication, deprecations and/or incompleteness were generated later, by changes somewhere else.

I believe a "software engineering" ArchCom working group would be a good thing to have. Perhaps we can get this going at the hackathon in Jerusalem. If you are interested in joining such a group, please let me know.

The recent wikitech-l discussion about deprecation policies (wikitech-l: Legacy wikibits will no longer loaded by default on Wikimedia wikis and MediaWiki 1.27) seems appropriate for a future Software Engineering group to start finding some consensus about best practices. @MrStradivarius brought up T114384: Standardise procedures for deprecating public-facing code in particular, which does a pretty good job of highlighting the tension we're managing.

T114384 isn't a TechCom-RFC yet, but @MrStradivarius, adding that tag would help raise the issue on TechCom's workflow (if you'd like that).

daniel closed subtask T124504: Transition WikiDev '16 working areas into working groups as Declined.

And the other blocker, making working groups modeled after the Rust community, is probably not moving any time soon.

Closing as done.