
[Epic] Undeploying StructuredDiscussions (Flow)
Open, Needs Triage, Public

Assigned To: None
Authored By: Trizek-WMF, Mar 14 2023, 2:56 PM

Description

This work is in discussion and not currently prioritized by any WMF team. A community consultation will be started if plans are made to move this work forward.

Tl;dr

Flow is a complex piece of software that was never quite finished, fits poorly into the MediaWiki architecture due to its many (ultimately unused) layers of abstraction, lacks key features (such as search and solid moderation), has many disruptive bugs, and is rejected by most Wikimedia communities. While there are communities that are happy with it, the value it provides isn't commensurate with its maintenance cost, much less with the expected future maintenance cost once there are major changes to Parsoid or VisualEditor. DiscussionTools now provides a similarly user-friendly UI, without all the problems. We should invest in undeploying Flow: it will be a significant effort, but keeping it in production would eventually take more.

History

Talk pages were long seen as a usability pain point (see the talk pages consultation's historical overview). The LiquidThreads extension was created back when the movement had very limited technical resources; plans to improve it eventually evolved into replacing it with an ambitious workflow system, Flow (later renamed StructuredDiscussions), which would have different modules for different types of talk page activities, such as threaded discussion, voting or requests.

The eventual release only supported threaded discussion, and while it was a huge usability improvement for users unfamiliar with wikitext, it broke a number of workflows that power users considered very important, such as moderation or talk page refactoring. As a result, the rollout of Flow met fierce opposition from many wiki communities, and deployments were halted (and in some cases reversed). The WMF paused development in 2015 and instituted a freeze in new deployments of Flow a few years later. A few communities still use Flow.

In 2019, the WMF organized the talk pages consultation to look at the future of talk pages. The consultation yielded the Talk pages project. That project produced the DiscussionTools extension, which gained wide community adoption and is now the default on all wikis.

Community members have initiated discussions to replace Flow pages with DiscussionTools on https://www.mediawiki.org/ and at translatewiki.net.

Problems

Maintenance burden

Maintaining Flow is a drain on WMF resources:

  • It is a large and complex codebase, with approximately 36,000 lines of code.
  • The code is complicated to reason about and difficult to work with. Some of the original authors are no longer at WMF.
  • As it provides an alternative talk page implementation, all features that need to interact with talk pages need separate Flow and wikitext implementations. On the frontend side this is somewhat managed via mw.messagePoster; on the server side there isn't an off-the-shelf way to do it.
  • Subtle bugs relating to database replication issues periodically cause user talk pages to become broken (see e.g. T308907). Someone from Growth then has to run a maintenance script to fix the problem. Fixing the underlying issue is not trivial.
  • Flow accounts for 25 open issues on the production error workboard. There have been 111 Flow production error tasks in total.
  • There are 16 open Flow tasks tagged with Security, including some crippling limitations to moderation functionality. The non-trivial data flow makes it nearly impossible to make sure proper escaping is done before building SQL queries, making it prone to SQL injection.
  • There are ~1200 open tasks on the Flow workboard, with about a hundred of them having high priority.
  • Other teams are blocked on overdue maintenance needed in Flow: Content Transformers needs stored Flow HTML to be updated to the newest Parsoid version (T209120, T124837). Given its shaky change list integration (see below), it will require a lot of extra work for patrolling or moderation changes (a priority area for the next year).
  • Flow uses patterns and libraries not found elsewhere in MediaWiki, increasing maintenance cost. For example, it uses Pimple rather than MediaWikiServices for dependency injection (T150350), and it uses LightnCandy for its templates, unlike any other extension.

Architecture burden

Because Flow was intended to be very generic and support all kinds of different workflows, the codebase ended up with many layers of abstraction (which eventually weren't used, since only one workflow was implemented), making it hard to understand and maintain. Because it envisioned sharing talk pages between multiple wikis (this mostly got implemented but wasn't deployed), it had to invent its own alternative revision system. In addition, its authors tried to innovate in directions that the wider MediaWiki community didn't pick up on (e.g. using MySQL as a NoSQL database), so Flow ended up being something of an alien object wedged into MediaWiki. Not using the revision table means Flow has to reimplement every single feature related to change lists (page history / contributions / RC / watchlist) and moderation that other extensions get for free.

UX burden

Having two competing interfaces for the same use case, where the user cannot choose which one to use and eventually has to learn both (wikis that use Flow don't use it on all talk pages), adds to the mental burden of using the site. While at a high level Flow is in many ways closer to user expectations of how a discussion system should look, the implementation is unfinished, lacks key features (such as searching or legible links), has many severe bugs and usability issues (e.g. T324416) and integrates poorly with moderation tools.

Risk of emergency undeployment

Undeploying Flow would be a significant effort and a lengthy project. However, if we don't expend that effort and then find ourselves in a situation where we need to undeploy Flow anyway and can't take several weeks to do it (e.g. a hard-to-fix security issue is found), the effects would be quite catastrophic. All Flow content and all Flow-related log entries and page histories would become inaccessible, and Flow-related entries would disappear from user contributions. If Flow gets reenabled later, some of the content would be corrupted or lost (as Flow needs to keep its own database in sync with e.g. page moves).

Technical plan

  • Make a security dump of the Flow databases, just in case.
  • The extension defines various things and the related translations - e.g. namespace name, action log messages. These should be moved to the dedicated part of WikimediaMessages for this situation.
  • There is a conversion script for creating wikitext archives of Flow pages, which needs to be tested and maybe improved. We'd then archive all content as subpages. The script doesn't try to replicate the edit history, but for threaded discussion there is not much extra information there anyway. The ability to find users' Flow comments via user contributions will be lost though. We also wouldn't retain deleted comments.
  • Since Flow doesn't use the revision table, all the Flow content is gone once it is turned off, so it doesn't need much cleanup. Maybe update links and such.

(This doesn't cover the governance and communication aspects but they are of course important.)

Event Timeline

KStoller-WMF renamed this task from "[Epic] Prepare StructuredDiscussions undeployment" to "[Epic] Undeploying StructuredDiscussions (Flow)". May 3 2023, 3:35 AM
KStoller-WMF updated the task description.
Tgr updated the task description.

Separately, I want to note that one of the cool features of StructuredDiscussions is that it allows you to add new topics to the top, as in all standard forums. This has not yet been implemented in MediaWiki: T33919: Allow posting new sections to top of page on a per-page basis.

Flow is also currently used on multiple third-party wikis. If we choose to sunset Flow: (1) any wiki with IP masking will have Flow edits unmasked unless we do T342831: Temporary Accounts: Update StructuredDiscussions (Flow); (2) if Flow is abandoned completely (archived), we need to document a way for third-party wikis to migrate their existing data (i.e. it is not trivially archivable).

Personally I like Flow but don't like it in the current (not well maintained) state.

For wikis where Flow is not actively in use, we can set wgFlowReadOnly to true so that no new messages can be posted via Flow.

For any wikis ready/willing to switch away from Flow, perhaps a maintenance script could be written to convert all Flow posts to wikicode that resembles a normal talk page, while turning off Flow boards for those pages. The conversion would be made in one big edit per converted page.

I think we should consider dropping log entries on wikis that wish to get rid of Flow. Losing the revision information would not be ideal, but might be worth the tradeoff here (the tradeoff being to be able to uninstall Flow, which is a huge technical debt win). Talk page posts could get signed by the conversion script, preserving username and datetime.
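As a rough illustration of the "signed by the conversion script" idea above, here is a minimal Python sketch. It is not how extensions/Flow/maintenance/convertToText.php works; the `Post` data model, the function names and the exact signature layout are all assumptions made up for this example, and post text is assumed to be a single line.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone

# English month names spelled out so the output does not depend on the locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


@dataclass
class Post:
    """Hypothetical stand-in for one Flow post (not Flow's real schema)."""
    author: str
    posted: datetime  # timezone-aware
    text: str         # assumed to be a single line of wikitext
    replies: list[Post] = field(default_factory=list)


def signature(author: str, posted: datetime) -> str:
    """Build a talk-page style signature with a UTC timestamp."""
    ts = posted.astimezone(timezone.utc)
    stamp = f"{ts:%H:%M}, {ts.day} {MONTHS[ts.month - 1]} {ts.year} (UTC)"
    return f"[[User:{author}|{author}]] {stamp}"


def render_post(post: Post, depth: int = 0) -> list[str]:
    """Render a post and its replies, one extra ':' of indentation per level."""
    lines = [f"{':' * depth}{post.text} {signature(post.author, post.posted)}"]
    for reply in sorted(post.replies, key=lambda p: p.posted):
        lines.extend(render_post(reply, depth + 1))
    return lines


def render_topic(title: str, posts: list[Post]) -> str:
    """Render one topic as a wikitext section with signed, indented comments."""
    lines = [f"== {title} =="]
    for post in sorted(posts, key=lambda p: p.posted):
        lines.extend(render_post(post))
    return "\n".join(lines)
```

Each post becomes one signed line, so the username and posting time survive even though the revision history does not.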

I think I remember from another discussion that Flow can never be completely uninstalled for some reason, even if Flow was only used for a couple of test posts. What were the details of that again? Does Flow do some kind of modification to non-Flow database tables?

> For any wikis ready/willing to switch away from Flow, perhaps a maintenance script could be written to convert all Flow posts to wikicode that resembles a normal talk page, while turning off Flow boards for those pages. The conversion would be made in one big edit per converted page.

FWIW, such a thing already exists (extensions/Flow/maintenance/convertToText.php), and you can even run it locally against a remote wiki. The only issue is that it doesn't preserve the edit history, since that is much more complicated than it looks, but generally speaking it exists. (h/t @Krinkle )

> I think I remember from another discussion that Flow can never be completely uninstalled for some reason, even if Flow was only used for a couple of test posts. What were the details of that again? Does Flow do some kind of modification to non-Flow database tables?

It's OK with disabling Flow: T282132#7107443.

Tgr updated the task description.

I added a Technical plan section with what I understand is the current thinking of how removing Flow would look. (I won't be the one to actually do it so take it with a pinch of salt.)

> I think I remember from another discussion that Flow can never be completely uninstalled for some reason, even if Flow was only used for a couple of test posts. What were the details of that again? Does Flow do some kind of modification to non-Flow database tables?

That's true of almost any extension which creates logs or content: you can uninstall it, but you are left with Special:Log entries with missing i18n messages and such. We might be able to just move the few lines of code necessary for that to a placeholder extension. (We'd probably have to simplify some log messages, because Flow might look up information in its own revision database to show in logs, and we definitely don't want to keep that.)

> These should be moved to the dedicated part of WikimediaMessages for this situation.

Note: we should make sure things will not break even without WikimediaMessages, as is the case for third-party installations.

> The script doesn't try to replicate the edit history, but for threaded discussion there is not much extra information there anyway. The ability to find users' Flow comments via user contributions will be lost though.

The edit history (i.e. modifications of already-posted messages) is probably not that important, but couldn’t the maintenance script create one revision for each comment in the name of the comment author and dated to when the comment was posted, so that Special:Contributions mostly works? Being a maintenance script, it has the technical ability to insert arbitrary entries in the revision table.
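To make the one-revision-per-comment idea concrete, here is a minimal Python sketch of how the sequence of page snapshots could be derived. It only computes the author, timestamp and content of each would-be revision; it does not touch MediaWiki's revision table, and the `Comment` and `Revision` types and the `build_revisions` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Comment:
    """Hypothetical converted comment, already rendered as one wikitext line."""
    author: str
    posted: datetime
    wikitext: str


@dataclass(frozen=True)
class Revision:
    """Hypothetical description of one revision the script would insert."""
    author: str
    timestamp: datetime
    content: str


def build_revisions(comments_in_page_order: list[Comment]) -> list[Revision]:
    """Return one revision per comment, oldest first.

    Each snapshot contains every comment posted up to that moment, kept in
    the position it has on the final page, so a late reply shows up in the
    middle of the page rather than at the end. Comments sharing a timestamp
    all appear together in the snapshots for that timestamp.
    """
    revisions = []
    for current in sorted(comments_in_page_order, key=lambda c: c.posted):
        visible = [c.wikitext for c in comments_in_page_order
                   if c.posted <= current.posted]
        revisions.append(
            Revision(current.author, current.posted, "\n".join(visible)))
    return revisions
```

With revisions attributed this way, each comment would be listed under its author and posting date, which is what Special:Contributions needs; later edits to already-posted comments are still not represented.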

Can I somehow help push this forward? The Flow boards on wikis like mediawiki.org are currently heavily hit by spam bots. This is especially frustrating when the bots perform Flow-exclusive actions like editing the same topic summary over and over again. While this can be undone, it's clumsy and frustrating as it is – by the nature of the separated Flow databases – not fully integrated into the normal MediaWiki workflows. And then there is this: T309941: IPInfo refuses to show IP informations if IP made only Flow modifications.

> Can I somehow help push this forward? The Flow boards on wikis like mediawiki.org are currently heavily hit by spam bots. This is especially frustrating when the bots perform Flow-exclusive actions like editing the same topic summary over and over again. While this can be undone, it's clumsy and frustrating as it is – by the nature of the separated Flow databases – not fully integrated into the normal MediaWiki workflows. And then there is this: T309941: IPInfo refuses to show IP informations if IP made only Flow modifications.

@thiemowmde there's also some discussion in T342831: Temporary Accounts: Update StructuredDiscussions (Flow).

For mediawiki.org specifically, perhaps we should create a dedicated task to organize the undeployment discussion for that particular wiki. (I am not sure if there needs to be an on-wiki consultation on mediawiki.org about it.)

Undeployment is hard because you need to convert the existing content, but switching Flow to read-only on a given wiki should be trivial if there is community support for it (except that someone would have to move all the Flow pages afterwards to make room for wikitext comments).

Ideally, switching Flow to read-only would happen a few weeks or months after moving all Flow pages away, so that active discussions started with Flow could be finished with Flow (but, because those pages have been moved away, new discussions would naturally be started using DiscussionTools/wikitext).