
Investigate moving the www.wikipedia.org portal to Git/Gerrit
Closed, ResolvedPublic

Description

We are planning to run some small UX tests on the Wikipedia.org page, and before we do that we need to take a few steps to get ready for them.

The scope of this task is to create a gerrit repository that one could build the portal from if so desired. We can then show this to prominent contributors to the portal such as @mxn and get feedback, announce the changes publicly, then switch the portals over to the more sane system.

We've compiled documentation around Wikipedia.org Portal Improvements project here: https://www.mediawiki.org/wiki/Wikipedia.org_Portal_Improvements

Also there is some information about our research plans for the project here: https://meta.wikimedia.org/wiki/Research:Portal_experiments

Related Objects

Event Timeline


Closing a task as Invalid is no way to resolve an ongoing discussion. The main maintainer of this page and a team responsible for 'search and discover' think that the proposal is sensible. If there are open questions, let's find the answers.

As an observer of the discussion with no opinion either way, I find the reaction surprising. What is wrong with "investigating moving" a regular HTML page that presumably gets many visits to an infrastructure designed for code review and software deployment? Especially when the maintainer says that contributors can keep proposing changes via the wiki at /temp, and when all the people who have touched the template directly in the past few years are fluent Git/Gerrit users.

I also still don't understand why it needs to be moved to Git. While wiki pages have their own drawbacks, we have had this implementation for years now without huge issues. From what I can see, the main reason this move has been proposed is to avoid running into merge conflicts? How does moving it to Git solve that problem? Even in Git, as long as new conflicting changes are being made to master, you will have to fix those before merging.
If you don't want others to override your edits on a wiki page, you can have your own dedicated page for it; when it needs to be merged with the portal, new changes made to the live portal since your last edit will have to be taken into account. The same applies to Git as well.

Yikes, I step away for a couple days and this issue suddenly gets a lot of traction. :-)

I'm afraid I have to agree with @Krenair and the others here. Splitting the portals only makes things worse. They aren't maintained by different teams; they're maintained by the same people using identical infrastructure. If we start moving portals piecemeal, Meta admins like me then have to keep track of two systems which will in all likelihood diverge as the UX experiments are conducted on only one of them.

To make matters worse, we'd be forking not only the HTML but also the JS and CSS. Before Krenair (?) unified all the portals' JS and CSS into the MediaWiki namespace, all the portals had separate resources that fell wildly out of sync with each other. This would certainly happen again.

In fact, I don't care so much about version control – I just view this as a necessary step towards building portals that keep their statistics up to date without constant human intervention. (I suppose powering the portals with Lua modules would be a nonstarter.) But if those benefits only come to Wikipedia, then that would be very sad for the other projects.

I understand that the Discovery team doesn't want scope creep, but if supporting all the portals is scope creep, then is there a more appropriate team to put this under? (I'm just a volunteer, so I know nothing about the WMF engineering org structure.)

Before Krenair (?) unified all the portals' JS and CSS into the MediaWiki namespace, all the portals had separate resources that fell wildly out of sync with each other.

Um... Maybe @Krinkle?

Yikes, I step away for a couple days and this issue suddenly gets a lot of traction. :-)

I'm afraid I have to agree with @Krenair and the others here. Splitting the portals only makes things worse. They aren't maintained by different teams; they're maintained by the same people using identical infrastructure. If we start moving portals piecemeal, Meta admins like me then have to keep track of two systems which will in all likelihood diverge as the UX experiments are conducted on only one of them.

What I'm trying to avoid here is the situation where people see that the Discovery Department touched the setup of those portals last and therefore think that we are responsible for their maintenance and barrage us with requests. This has happened to me before. My department is not very big, and maintaining all of those portals is not something we have the manpower to do. We only have the manpower to focus on the biggest one, www.wikipedia.org. I want to be clear, upfront, to avoid setting unrealistic expectations.

Blocking improvements to the biggest portal on having the manpower to maintain all of them is poor resource allocation. The Wikipedia portal represents 5% of our total traffic. That's an incredible opportunity for us to help users find their way around all of our sites. The others are nowhere near this traffic level. When you're operating with constrained resources, you have to think about how you can get the biggest win for the effort you spend, and this is it. I can't let the perfect be the enemy of the good, so I have to be very clear: the Discovery Department will not be doing any maintenance or improvements to any portal other than www.wikipedia.org for the foreseeable future.

If, given that, @mxn still wants all of the portals migrated to a Git/Gerrit system, then we can do that. He is and has been the maintainer of these portals, pretty much on his own, for a long time, so I think he's qualified to know what's best for them. That said, it is on record here that the Discovery Department will not be responsible for the continued maintenance or improvement of any portals except www.wikipedia.org.

So how about we create a generic wikimedia/portals or equivalent repository, with a wikipedia directory containing wikipedia's portal files (or if it's just a single file, wikipedia.html), and a README saying something like this:

Currently only Wikipedia's portal lives here, as WMF Discovery is not committing to maintain the others.

And then if someone else decides to take responsibility and port the rest, it should be straightforward and not require several extra instances of the repository creation process and puppet patches, scripts, and separate commits for any changes applying to all portals.
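The proposed layout is small enough to sketch. This is only an illustration: the directory form from the comment above is used, and the portal filename inside it is hypothetical, since the task leaves the exact file layout open.

```shell
set -eu
# Sketch of the proposed wikimedia/portals repository layout.
# The portal filename (index.html) is illustrative, not decided in the task.
repo=$(mktemp -d)
mkdir -p "$repo/wikipedia"
printf '<!DOCTYPE html>\n' > "$repo/wikipedia/index.html"
cat > "$repo/README" <<'EOF'
Currently only Wikipedia's portal lives here, as WMF Discovery is not
committing to maintain the others.
EOF
ls "$repo"
```

With this shape, porting another portal later is a matter of adding a sibling directory, with no new repository requests needed.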

OK, changed the repo request to wikimedia/portals.

Now that 239987 has been merged (based on [[m:Special:PermanentLink/13619906]]), does that mean changes need to be pushed there from now on? Or when does the cutover happen? We’re nearing a major milestone (5M articles at en:), so we need to make sure both the portals and the Meta documentation get updated.

Now that 239987 has been merged (based on [[m:Special:PermanentLink/13619906]]), does that mean changes need to be pushed there from now on? Or when does the cutover happen? We’re nearing a major milestone (5M articles at en:), so we need to make sure both the portals and the Meta documentation get updated.

I wouldn't worry about the repository contents yet, in theory whoever presses the button to finally make the move should be checking that it's fully in sync with the one on meta.

@mxn no, not yet. It's just so that we have something to work with. When it'd be ready we'll re-sync it with the original template and make an explicit announcement. Do not worry about it till then :)

Now that 239987 has been merged (based on [[m:Special:PermanentLink/13619906]]), does that mean changes need to be pushed there from now on? Or when does the cutover happen? We’re nearing a major milestone (5M articles at en:), so we need to make sure both the portals and the Meta documentation get updated.

We're just playing around with it right now. We've got a few tests to run to make sure it's really working properly, and that it does everything it needs to. I'd like to see us build an example site (in Labs, or somewhere else) that pulls HTML from the repo, and assembles it all correctly, before we consider migrating; that's another task that I should file. And then there's an announcement to do, as Stas says. So, there's some time yet.

What would we do with beta? Right now it uses separate meta wiki (meta.wikimedia.beta.wmflabs.org) but if we move portals to git how would we serve separate parts to beta and to production?

You don't? :) Beta Cluster should be whatever will hit production in the next deploy.

If you need a place to run things that you aren't going to deploy every week right away, then setting up a one-off instance (not in Beta Cluster) would be wise. When you're ready to jump on the train, then you can merge to master, make sure the relevant configs are correct, and go.

(btw, I think I'm going to remove the Gerrit project from this task as A) new repo requests don't go in Phab and B) this isn't an issue with the software known as Gerrit. If you need any specific-to-Gerrit things done, please file new tasks for them :) )

OK, not having special case for beta sounds good. Also easier.

Change 240888 had a related patch set uploaded (by Smalyshev):
switch to git-based portal

https://gerrit.wikimedia.org/r/240888

What we've got working so far:

  • On deployment-bastion, we have the repo on /srv/mediawiki-staging
  • We (anybody with wikidev) can manually update it and sync it with sync-dir
  • https://gerrit.wikimedia.org/r/#/c/240888/ makes it work on beta, and we can make a similar patch for the main site once we're OK with it

What doesn't work yet:

  • Automatically updating the /srv/mediawiki-staging copy on commit
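The working part of the flow above amounts to a stage-then-sync model. A minimal local simulation, under stated assumptions: in production the staging copy lives at /srv/mediawiki-staging (updated via git pull) and the sync step is the sync-dir helper; here temp dirs and cp stand in for both.

```shell
set -eu
# Local simulation of the stage-then-sync flow described above.
staging=$(mktemp -d)   # stands in for the /srv/mediawiki-staging checkout
live=$(mktemp -d)      # stands in for the web-served document root
echo '<html>portal v1</html>' > "$staging/wikipedia.html"  # a merged commit lands in staging
cp "$staging/wikipedia.html" "$live/"                      # "sync-dir": push staging to live
grep -q 'portal v1' "$live/wikipedia.html" && echo synced
```

The missing piece in the list above is just a trigger that runs the pull-and-sync steps automatically after each merge instead of by hand.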

I also still don't understand why it needs to be moved to Git. While wiki pages have their own drawbacks, we have had this implementation for years now without huge issues. From what I can see, the main reason this move has been proposed is to avoid running into merge conflicts? How does moving it to Git solve that problem? Even in Git, as long as new conflicting changes are being made to master, you will have to fix those before merging.

Over at https://gerrit.wikimedia.org/r/240888, I've asked @Smalyshev repeatedly for an explanation of why the wiki is not suitable here. Specifically, I'd like to know how and why the wiki failed to perform as a collaborative editing platform.

Smalyshev keeps pointing back to this task as the appropriate forum for discussion, yet on this task, I still can't find any clear rationale for proposing this change.

I also still don't understand why it needs to be moved to Git. While wiki pages have their own drawbacks, we have had this implementation for years now without huge issues. From what I can see, the main reason this move has been proposed is to avoid running into merge conflicts? How does moving it to Git solve that problem? Even in Git, as long as new conflicting changes are being made to master, you will have to fix those before merging.

It was explained by @mxn in T110070#1570330. I presume you disagree with that rationale. If so, please do state specific concerns. In the mean time, we will be proceeding.

It was explained by @mxn in T110070#1570330. I presume you disagree with that rationale. If so, please do state specific concerns. In the mean time, we will be proceeding.

The commit message for https://gerrit.wikimedia.org/r/240888 has been updated:

As discussed with mxn in https://phabricator.wikimedia.org/T110070, git-based portal would allow better cooperating for multiple change sets, automating count updates and further work for automating maintenance tasks..

What does any of this mean? None of it makes any sense to me. How will count updates be automated going forward? You're talking about a Git repo with a static HTML file, right? If we look at a concrete example of an update, such as https://gerrit.wikimedia.org/r/246790, how is this any different from a wiki page?

What maintenance tasks are being referred to? And "cooperating for multiple change sets"? Huh? Wiki pages can have more than one revision. You can also have as many wiki pages as necessary.

This entire task looks like a solution in search of a problem. If there are clear reasons for switching from the current consistent wiki page implementation to a one-off for www.wikipedia.org using a static HTML file, those reasons are still completely unclear. What specifically is being gained here?

My understanding is that the discovery team is open to making the portal a dynamic page, as I have suggested, for enabling automatic article counts. No code has been committed toward that goal, but putting the portal under Git is seen as a necessary first step. I suppose you're saying that we shouldn't cut over until the page is dynamic and fully automated, right?

The primary motivation for the discovery team seems to be that Git would let them iterate on proposed changes on a branch while routine changes occur on the live version and other branches. Obviously that's possible with wiki pages, but it's a pain to merge revisions from disparate pages on the wiki. Or is there a MediaWiki extension that would streamline that task?

I suspect the discovery team would also like to be able to run A/B tests in the future. In theory, that would be possible even with a set of wiki pages, but managing just two wiki pages (live and /temp) is already enough trouble that not many sysops bother to touch them.

I'm still open to the idea that, if necessary, the wiki could still be used as a sandbox for new ideas from people who aren't comfortable with Gerrit. I'm not especially fluent with Gerrit myself (though I expect to be with a little more practice), since it isn't purely Web-based like GitHub.

My understanding is that the discovery team is open to making the portal a dynamic page, as I have suggested, for enabling automatic article counts. No code has been committed toward that goal, but putting the portal under Git is seen as a necessary first step. I suppose you're saying that we shouldn't cut over until the page is dynamic and fully automated, right?

If automated/automatic updates are a goal, we should pursue that independently of where the content lives. I think anyone who suggests that we must have these pages in Git in order to be automatically updated is being disingenuous/dishonest. If the implementation is changed from a flat HTML file into a PHP script, then Git is probably necessary. But as it is, I don't see what benefit we're gaining by moving one portal to a Git repo. The current system is almost identical, except it has watchlist and RecentChanges support, it has a working access list and process for updates (a local admin can sync the /temp page), it now has a better HTML editor... why would we disrupt this? What are we gaining here?

That's what I was trying to extract from the Git change as well: a clear rationale for why making a change here is a good idea and how/why the current implementation failed.

The primary motivation for the discovery team seems to be that Git would let them iterate on proposed changes on a branch while routine changes occur on the live version and other branches. Obviously that's possible with wiki pages, but it's a pain to merge revisions from disparate pages on the wiki. Or is there a MediaWiki extension that would streamline that task?

It's honestly not much better in Git. I don't understand what changes people are intending to make on the portal. It sounds like those types of changes need to be discussed. If the Discovery Department is hijacking this portal page and claiming it as its own, I doubt there will be reasonable discussion.

I suspect the discovery team would also like to be able to run A/B tests in the future. In theory, that would be possible even with a set of wiki pages, but managing just two wiki pages (live and /temp) is already enough trouble that not many sysops bother to touch them.

I'm even more wary of treating Wikimedians like lab rats.

I'm still open to the idea that, if necessary, the wiki could still be used as a sandbox for new ideas from people who aren't comfortable with Gerrit. I'm not especially fluent with Gerrit myself (though I expect to be with a little more practice), since it isn't purely Web-based like GitHub.

Moving to Git/Gerrit is unquestionably a higher barrier to entry over editing a wiki page. As discussed above, there may be valid reasons to make this change, but probably not piecemeal and not if it's going to be essentially the same implementation (a static HTML file) that doesn't integrate at all with MediaWiki.

If automated/automatic updates are a goal, we should pursue that independently of where the content lives.

If that's a goal, we can set up a server-side bot to update the counts as quickly as desired. It's much easier on wiki, we can copy from operations/puppet/files/misc/scripts/characterEditStatsTranslate which has been working for a while.
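For illustration, such a bot could pull the article count from the Action API's siteinfo statistics and splice it into the page. This sketch uses a canned API response so it runs self-contained, and the page markup is hypothetical; a real bot would fetch live JSON and edit the wiki page (or commit to the repo) instead.

```shell
set -eu
# Sketch of a server-side count-update bot. A real bot would fetch, e.g.:
#   https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=statistics&format=json
# A canned response is used here so the sketch is self-contained.
json='{"query":{"statistics":{"articles":5000000}}}'
count=$(printf '%s' "$json" | sed -n 's/.*"articles":\([0-9]*\).*/\1/p')
page='<a href="//en.wikipedia.org/">English</a> <small>COUNT articles</small>'  # hypothetical markup
updated=$(printf '%s' "$page" | sed "s/COUNT/$count/")
echo "$updated"
```

The same loop works wherever the content lives, which is the point being made: automated count updates don't depend on the move to Git.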

I'm afraid I have to agree with @Krenair and the others here. Splitting the portals only makes things worse.

Sounds like it indeed. +1

If the Discovery Department is hijacking this portal page and claiming it as its own, I doubt there will be reasonable discussion.

I think that the grandparent task T112172: EPIC: of epics Wikipedia.org Portal UX tests to run is what @MZMcBride and other community members may be missing/misunderstanding. My personal understanding from things I have heard discussed in various places (irc, emails, meetings) is that the Discovery team is taking ownership of www.wikipedia.org with the intent of using it to create a more attractive and engaging "front door" to the Wikipedia project family than the current list-of-languages version. Perhaps @Deskana can shed more light on this and point people to the relevant discussions/decisions in his role as Product Manager for the Discovery team.

Thanks for pointing that out, but I confirm I wasn't missing the general picture and I stand by what was said above. :)

Change 240888 merged by Andrew Bogott:
Switch to git-based portal

https://gerrit.wikimedia.org/r/240888

If the Discovery Department is hijacking this portal page and claiming it as its own, I doubt there will be reasonable discussion.

I think that the grandparent task T112172: EPIC: of epics Wikipedia.org Portal UX tests to run is what @MZMcBride and other community members may be missing/misunderstanding. My personal understanding from things I have heard discussed in various places (irc, emails, meetings) is that the Discovery team is taking ownership of www.wikipedia.org with the intent of using it to create a more attractive and engaging "front door" to the Wikipedia project family than the current list-of-languages version. Perhaps @Deskana can shed more light on this and point people to the relevant discussions/decisions in his role as Product Manager for the Discovery team.

This is mostly right. It's the long-term plan to get towards a more engaging front door, but we're a long way away from that. We want to do A/B tests to improve the page incrementally.

One key point though is that Discovery is not looking to "take ownership of the portal". It's a shared interest. As noted in T113288: Fix permissions for wikimedia/portals, we've given merge rights for the repository to most of the people who have been responsible for maintenance, and anyone who expressed interest.

I think that the grandparent task T112172: EPIC: of epics Wikipedia.org Portal UX tests to run is what @MZMcBride and other community members may be missing/misunderstanding. My personal understanding from things I have heard discussed in various places (irc, emails, meetings) is that the Discovery team is taking ownership of www.wikipedia.org with the intent of using it to create a more attractive and engaging "front door" to the Wikipedia project family than the current list-of-languages version. Perhaps @Deskana can shed more light on this and point people to the relevant discussions/decisions in his role as Product Manager for the Discovery team.

If there's a better implementation for www.wikipedia.org and its sister portals, that seems like something worth discussing. As it is, the page is just being moved into a place that's more difficult for Wikimedians to edit. It also looks like changing to a Gerrit repo will slow updates, when they're currently nearly instant.

What @MZMcBride and others are trying to say is: why not build something better and then try to sell "the community" on that? As it is, this is doing a one-off (only for www.wikipedia.org) with no new features. It's moving from a wiki page, with actual benefits including already working for about a decade, to a Gerrit repo that has actual detriments (less monitoring and integration with Meta-Wiki, you have to use Gerrit, separate access list from local Meta-Wiki admins, higher barrier to entry to use Git instead of a wiki page).

And, again, all of this comes at the cost of not getting any new features, though people are apparently happy to talk about potential new features that might one day be built and exist. Meanwhile, the Discovery Department (which will honestly probably not last through 2016) is attempting to "take ownership" of a page that isn't theirs and never has been.

We've compiled documentation around Wikipedia.org Portal Improvements project here: https://www.mediawiki.org/wiki/Wikipedia.org_Portal_Improvements

Also there is some information about our research plans for the project here: https://meta.wikimedia.org/wiki/Research:Portal_experiments

These two documents should answer most of the questions about what the broader goals and rationale are for the Portal Improvements project, and why we need to move Wikipedia.org code to gerrit. We realize that we should have provided this documentation before this discussion started, but we're trying to make up for it now.

Since this is not really about MediaWiki or associated software, but actually about Wikimedia's specific setup, such pages should be posted at meta.wikimedia.org or wikitech.wikimedia.org instead.

@Krenair, the location of this wiki page is not obvious, but in any case it is an easy problem to solve. Developer documentation goes to mediawiki.org by default, and we have plenty of Wikimedia-specific information there, so it is not so clear-cut. If you think Meta is better, let's have the page on Meta.

why we need to move Wikipedia.org code to gerrit

Not really. These are just standard "vcs vs. wiki" arguments, nothing specific to the portals. Some parts are also plain wrong, see T110070#1736642

Developer documentation goes by default to mediawiki.org, and we have plenty of Wikimedia specific information there, so it is not so clear.

If "developer" means "MediaWiki developer", Krenair is right. Docs about MediaWiki belong on mediawiki.org; docs about the technical configuration of Wikimedia sites belong on wikitech.wikimedia.org, or arguably on Meta. The discussion about changing the portals belongs on Meta, because it affects all language versions, not just en.

If you think Meta is better let's have the page in Meta.

+1

In discussions about this task, the consensus I've seen is that any change to production would require that people retain an equivalent ability to deploy changes to the portals. That is, Gerrit merge rights were handled as part of T113288, but the ability to deploy updates to www.wikipedia.org remains unresolved. I believe the inability to deploy changes immediately is a blocker to switching to Git for production.

From https://meta.wikimedia.org/w/index.php?title=Wikipedia.org_Portal_Improvements&oldid=14438598:

The current mechanism allows edits to the page to be viewed in production within about an hour. Initially, with the new mechanism, commits will not appear in production unless/until they are manually deployed, which typically would not be more than once per day. However, a new deployment tool (scap3) is being developed, which might eventually allow changes to be deployed automatically. In that case, it might deploy new portal code hourly, or at whatever interval makes sense.

This same draft document goes on to say that this project has two developers devoted to it. Is someone suggesting that it's difficult to have live updates from Git? You know we already have live updates, right? (Updates are often much faster than an hour in my experience; that figure needs a citation.) You really think having a PHP script echo the contents of a static HTML file requires "scap3"? And you "might eventually" be able to address this regression?

Putting aside how little confidence this gives me in the technical changes to come, scap (short for synchronize common all PHP) existed concurrently and entirely independently of the current extract2.php implementation for many years. Conflating the two is really misleading and inappropriate, in my opinion.


As discussed in T112172: EPIC: of epics Wikipedia.org Portal UX tests to run and in T112213: EPIC: [Portal A/B test X]: Collapse language wiki links, there's also:

  • no clear plan to retain the existing brand unity between the www portals (the "Discovery" team is only interested in www.wikipedia.org, but that does not mean that the sister portals such as www.wiktionary.org and www.wikinews.org no longer exist); and
  • no clear plan to address the issues surrounding using Wikimedians as lab rats.

In the last couple days, Mxn has revamped the portal update mechanism (for all projects) with quite a bit of Lua code that automates most of the work: https://meta.wikimedia.org/wiki/Module:Project_portal
It seems this task is superseded?

I was hoping to clean up the Lua code a bit and document the workflow a bit better before making any noise about these changes, but I should be clear about my intentions here.

The biggest pain point with the status quo was that Meta administrators (not all of whom are fluent in HTML) needed to manually tweak the raw source code based on recommendations provided by the original Scribunto module. Non-administrators and indeed anonymous users have been incredibly helpful in this regard, making the suggested changes in /temp. But even then, administrators have to make use of cssminifier.com or a similar service to minify the <style> tag before publishing the update.

What I've done is to automate every step up to the actual publishing step and fully separate the data from the markup, making the system much more like a static publishing CMS such as Movable Type. The Scribunto module now uses logic that once generated change suggestions to instead generate HTML code based on /temp (which now functions as an actual template in the CMS sense). I also ported a popular CSS minification library to Lua to eliminate the dependency on cssminifier.com. The only thing an administrator has to do is review any changes to the source pages and replace the portal's code with a subst: call that pulls in a fully formed page.

As before, the module draws statistics from wiki tables updated nightly by EmausBot, but now it also uses an additional table that defines each language's name, romanization, etc. Based on years of edits to /temp, I expect that this new table will be where all routine edits take place. /temp and its sandbox can now be a place to modernize and experiment with the actual design and structure of all the portals.

Despite all the work I put into this module last weekend, I still believe we eventually need to take at least part of this system off-wiki. We need software that can autonomously take all this logic and publish the results live – automating the last mile. I could write a Tools Lab bot to do it on Meta, but that seems like such a circuitous route to updating a static webpage.

I embarked on the Lua project this weekend to test my own assumptions about the wiki's limitations and also codify a lot of the unwritten institutional knowledge that has built up around the portals over the years. (If it looks like spaghetti code, that's partly because multilingualism is hard.) I wanted to do this two years ago when I wrote the original Scribunto module but the effort to bring the portals under source control gave it added importance.

These changes at Meta only incrementally streamline the current design and workflow. I don't honestly think the current design will age well another 10 years out. As timeless as the current design may seem now, we simply don't know whether it's the best for the project because no other format has had its chance. Ironically, a more conventional publishing system may give us more flexibility to try out radical new ideas than the wiki currently does.

I do expect the new portal-generating logic to be retained as part of any off-wiki system. As @MZMcBride has pointed out, we don't want to regress major functionality as part of the migration. That likely means porting the modules to PHP (although I personally would be more comfortable writing Node.js or Python), which makes the security review process a bit more involved than before. Hopefully the Discovery team can forgive me for the curveball I've just thrown them, but I remain committed to helping to push the portals forward both on-wiki and off.

  • no clear plan to address the issues surrounding using Wikimedians as lab rats.

This rhetoric is very unhelpful. A/B testing and UX testing on live websites is not nearly the same as the image you are trying to invoke in people's minds.

If you'd like to split hairs, we're talking about www.wikipedia.org. How many Wikimedians do you know that go to that portal to search for anything? If it's more than five, I'd be quite surprised. This is about bringing the outside audience in, the purpose of the portal. It's not like you can edit the static webpage*, can you?

  • EDIT for clarity: not the page on meta, I mean the actual www.wikipedia.org page itself. There's no edit tab, there's nothing wiki about it, it's just hyperlinks. Doesn't even tell you you can edit it on meta (or that anything can be edited anywhere) or report problems on the page there in case of an error.

Two bits I want to address that have come up on this ticket:

  1. A/B tests: These are a standard practice, widely used at the foundation. So any objections to them should be raised at a higher level, not on this ticket.
  2. Brand unity across portals: This ticket is about a low-level technical change. Brand unity is an issue to be considered for each actual change to the UI or UX of each portal in the future. So this concern isn't relevant for this task.

And responding to the more technical issues that have been raised:

  1. Automatic deployment: We do not consider this a blocker, and especially not for the investigation phase. There is a ticket to track work on it: T114694: Deploy wikimedia/portals with scap3.
  2. Lua code: Automating code on the page was never a major part of this task (it was a potential side benefit). So @mxn's lua work doesn't affect this task one way or the other.

Also, mxn's recent work does not affect the investigation phase of this work. The developers will work with him to make sure everything continues to work properly, but that work would be outside the scope of this task.

  • no clear plan to address the issues surrounding using Wikimedians as lab rats.

This rhetoric is very unhelpful. A/B testing and UX testing on live websites is not nearly the same as the image you are trying to invoke in people's minds.

Maybe you can expound on what images are invoked in your mind when you read "A/B test". Manipulating the user experience on involuntary participants... do you have a better analogy?

  2. Brand unity across portals: This ticket is about a low-level technical change. Brand unity is an issue to be considered for each actual change to the UI or UX of each portal in the future. So this concern isn't relevant for this task.

Sure, I linked to the two relevant tickets. You haven't replied on either yet.

  1. Automatic deployment: We do not consider this a blocker, and especially not for the investigation phase. There is a ticket to track work on it: T114694: Deploy wikimedia/portals with scap3.

Also, mxn's recent work does not affect the investigation phase of this work. The developers will work with him to make sure everything continues to work properly, but that work would be outside the scope of this task.

I don't know what "investigation phase" means in this context. It's actively harmful and disruptive to (high-level) discussion to split it out across so many tickets, so this ticket has remained open for the purpose of discussion of this entire initiative, as I understand it.

What does "investigation phase" mean to you? Does that only entail deployment to Beta Labs? (And if so, isn't this task now resolved?)

Regarding automatic deployment, we currently have this. I think introducing a functionality regression here is unacceptable. In discussion with others, I don't seem to be alone in feeling this way.

Maybe you can expound on what images are invoked in your mind when you read "A/B test". Manipulating the user experience on involuntary participants... do you have a better analogy?

Just a side comment. A/B testing is a methodology widely used in software development. It provides a good complement to other ways of gathering user data and planning for better user experiences. Anything can be discussed, of course, but there must be a better venue to discuss A/B testing practices, such as mediawiki.org or wikitech-l.

What does "investigation phase" mean to you?

The description of this task explains it?

Also, mxn's recent work does not affect the investigation phase of this work. The developers will work with him to make sure everything continues to work properly, but that work would be outside the scope of this task.

To state the obvious, the portals do get a lot of attention from Wikipedians. Even if individual editors have more specific pages set as their home pages, everyone notices whenever the portals list their wiki with the wrong name, with the wrong article count, or not in the top 10 as they'd expect. I had to respond to quite a few inquiries about the Wikipedia portal the day that the English Wikipedia reached the five million milestone, because it missed EmausBot's midnight UTC cutoff by a few hours. No one – neither me nor anyone on the Discovery team – will want to be responsible for making these changes every other day by hand. That's why even the rudimentary automation I've set up is important.

At the moment, I'm picturing the scripts being ported to work on a local machine as part of a manually run "build process" for the portals. Anyone who has merge rights would be able to run the script, which can still draw from Meta pages via the MediaWiki API if desired. It's more roundabout than what we have now, but it should be workable as long as the Discovery team doesn't move the Wikipedia portal too far away from the static page it is now.
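To illustrate, the "draw from Meta pages via the MediaWiki API" step could look roughly like the sketch below. This is only an assumption of how such a build script might fetch source content; the page title `Www.wikipedia.org template` and the function names are illustrative, not part of any agreed design.

```python
# Hypothetical sketch of one step in a locally run portal build:
# fetching the latest wikitext of a source page from Meta via the
# MediaWiki Action API. Page title and function names are assumptions.
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://meta.wikimedia.org/w/api.php"


def build_query_url(page_title: str) -> str:
    """Build an API URL that returns the latest wikitext of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": page_title,
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_page_wikitext(page_title: str) -> str:
    """Fetch the current wikitext of a Meta page (requires network access)."""
    with urllib.request.urlopen(build_query_url(page_title)) as resp:
        data = json.load(resp)
    return data["query"]["pages"][0]["revisions"][0]["slots"]["main"]["content"]
```

A build script along these lines could be run by anyone with merge rights, with the fetched wikitext then fed into whatever templating step generates the static HTML for review in Gerrit.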

I'm basically signing up to do this work if the Discovery team is uninterested, but I'll need their help navigating any security discussions that may arise.

Keegan claimed this task.

The "investigation" is complete for now. Automation investigation might be a specific task that Discovery wants to look into further, but that's for Dan to decide and file a task for if needed.

The "investigation" is complete for now.

What were the results of that investigation? Is there a report somewhere?

What were the results of that investigation? Is there a report somewhere?

To summarise, the investigation concluded that there are no showstoppers preventing moving forward with the proposed plan. The details on https://meta.wikimedia.org/wiki/Wikipedia.org_Portal_Improvements reflect our latest understanding.

As this task is resolved, and has been for some time, it seems unlikely to me that T114694 actually blocks this. I've removed the link.