
Drop Postgres support from core
Open, Lowest, Public

Description

In order to lower the maintenance burden for rarely used code/features, the code for Postgres could be made into an extension, similar to Oracle.

Event Timeline


I'm not sure it'd be a good idea. For multiple reasons:

  • The pingback data says we have at least 500 MediaWiki installations in the wild with PG (https://pingback.wmflabs.org/#database-type). Sure, that's not much compared to the 20K on MySQL, but compare it to Oracle and MSSQL, which had only 10 in total when I removed their code.
  • Oracle didn't move to an extension. We removed the code and said whoever wants it, they can bring it back in an extension. That extension never happened.
    • That required core to change to support extra RDBMS types via extension. That happened but got reverted due to massive performance regressions in Wikimedia. I don't remember it being re-implemented. In other words, we can't move RDBMS types to extensions. Maybe @Legoktm remembers more?
  • I don't know of any custodian we could hand this off to, so it practically means we would be killing PG support.
  • I personally know really large organizations and institutions that are PG shops and would love to switch to PG if we can add support for PG in popular extensions (which we are doing in the 1.39 release). My assumption is that the number will go up slowly after the 1.39 release.
  • I don't know of any major hassle caused by PG support recently. We even worked heavily to make support easier by building the abstract schema.
  • With dropping support for 9.4 and 9.5, which we did a couple of weeks ago, PG is even easier to maintain and more similar to the MySQL code. An example: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/818473
  • Oracle didn't move to an extension. We removed the code and said whoever wants it, they can bring it back in an extension. That extension never happened.
    • That required core to change to support extra RDBMS types via extension. That happened but got reverted due to massive performance regressions in Wikimedia. I don't remember it being re-implemented. In other words, we can't move RDBMS types to extensions. Maybe @Legoktm remembers more?

I'm still kind of upset about how that all went down. The code to add "database extensions" was rushed through, had issues, and was reverted. I then spent a few days urgently redoing the patches (unpaid at the time) because it was supposedly a release blocker and then there was no one to review them. I'm pretty sure it's all bitrotted now. See T226857#6339733 (and the lack of reply) and https://gerrit.wikimedia.org/r/c/mediawiki/core/+/615686/.

  • I don't know of any custodian we could hand this off to, so it practically means we would be killing PG support.
  • I personally know really large organizations and institutions that are PG shops and would love to switch to PG if we can add support for PG in popular extensions (which we are doing in the 1.39 release). My assumption is that the number will go up slowly after the 1.39 release.

My current work just adopted MediaWiki on PG (though we're not "large" by any definition!) so I have some vested interest in keeping this alive in core.

  • I don't know of any major hassle caused by PG support recently. We even worked heavily to make support easier by building the abstract schema.
  • With dropping support for 9.4 and 9.5, which we did a couple of weeks ago, PG is even easier to maintain and more similar to the MySQL code. An example: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/818473

And these two points are why I'm confused as to why this task was filed. I thought the plan was to dumb down the PG schema as much as possible so developers can just forget it exists and things will just work. And I set up Postgres in CI to validate that.

I think we should discourage new PostgreSQL installations by updating the documentation in the web installer and on mediawiki.org to indicate that PostgreSQL is not a preferred storage engine. Specifically, mw:Download and mw:Manual:Installation requirements should have or link to the relevant caveats. mw:Manual:PostgreSQL and mw:Compatibility have appropriate language already. In the web installer, the config-dbsupport-postgres message should be rewritten with a more negative tone, and in the DB type selection list, the "MariaDB, MySQL, or compatible" option should be marked as preferred.

I don't think moving database types to extensions makes sense. Database types need to be supported by core and every extension, so they belong in core. The cross-cutting nature of database types and the need for support everywhere in the MediaWiki ecosystem is also why I don't think they're worth having.

I personally know really large organizations and institutions that are PG shops and would love to switch to PG if we can add support for PG in popular extensions (which we are doing in the 1.39 release). My assumption is that the number will go up slowly after the 1.39 release.

As a small team, we can't be all things to all people. We have to think about what our goals are and how we can best achieve them. I think non-WMF users of MediaWiki would be best served by having a narrowly defined preferred platform which should be used for optimal support.

I don't know of any major hassle caused by PG support recently. We even worked heavily to make support easier by building the abstract schema.

It's not really a major hassle, it's just a constant drain. Aaron is always disappearing into rabbit holes like https://gerrit.wikimedia.org/r/c/mediawiki/core/+/574101. It's not worth spending 2% of our time to support 2% of our users, especially given that most of those users would continue to use MediaWiki if PostgreSQL support were dropped, with an equivalent or better user experience.

Starting this discussion after the abstract schema work from T261912: Convert WMF Deployed Extensions to Abstract Schema was finished is a bit ... bad timing.

The abstract schema work also includes the Postgres schema. Often this was just added by generating the SQL file from the JSON file, but there were also extensions where the Postgres and MySQL schemas had to be checked for schema drift and the necessary update steps were added. Other compatibility changes, such as for timestamps, were also done before or as part of that, in the hope of getting Postgres voting in CI instead of having it break again.

The benefit of Postgres is that it is stricter about some SQL standards, which often makes queries more robust, for example around casting formats or correct escaping.
The different timestamp handling is often a pain, because the developer has to know which function to call.
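
To illustrate the timestamp point, here is a minimal sketch in Python (not MediaWiki code). It assumes the 14-digit `YYYYMMDDHHMMSS` string form commonly used with MySQL and a `YYYY-MM-DD HH:MM:SS+00` form on the Postgres side; the function names `mw_to_pg` and `pg_to_mw` are hypothetical and only show why a conversion step is needed at all.

```python
from datetime import datetime, timezone

# Assumed formats, for illustration only.
MW_FORMAT = "%Y%m%d%H%M%S"        # 14-digit form, e.g. 20220804123000
PG_FORMAT = "%Y-%m-%d %H:%M:%S"   # Postgres-style form; "+00" is appended separately


def mw_to_pg(ts: str) -> str:
    """Convert a 14-digit timestamp string to the Postgres-style form (UTC)."""
    parsed = datetime.strptime(ts, MW_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.strftime(PG_FORMAT) + "+00"


def pg_to_mw(ts: str) -> str:
    """Convert a Postgres-style UTC timestamp string back to the 14-digit form."""
    if not ts.endswith("+00"):
        raise ValueError("this sketch only handles UTC timestamps ending in +00")
    parsed = datetime.strptime(ts[:-3], PG_FORMAT)
    return parsed.strftime(MW_FORMAT)
```

Code that compares or stores the raw string without going through such a conversion would work on one backend and fail on the other, which is the kind of pitfall described above.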

If the decision against Postgres is made before the 1.39 release, the newly added schema files could be dropped, so that more third parties don't get the chance to start using them and later stay on outdated releases to keep Postgres running.

As a small team, we can't be all things to all people. We have to think about what our goals are and how we can best achieve them.

That is fair. I think the underlying problem is that the perf team has become the custodian of MediaWiki's database management. It has some overlap, but it's not really in the mandate. And I understand why you don't want to take on the work. WMF should have a team for the database platform, and they would become the steward of the rdbms library. But for lots of reasons that's not happening.

I think non-WMF users of MediaWiki would be best served by having a narrowly defined preferred platform which should be used for optimal support.

Honestly, this would be a PM decision if we had a third-party MediaWiki PM. Wikibase has its own PM for third-party installations; I don't know why MediaWiki doesn't.

I don't know of any major hassle caused by PG support recently. We even worked heavily to make support easier by building the abstract schema.

It's not really a major hassle, it's just a constant drain. Aaron is always disappearing into rabbit holes like https://gerrit.wikimedia.org/r/c/mediawiki/core/+/574101. It's not worth spending 2% of our time to support 2% of our users, especially given that most of those users would continue to use MediaWiki if PostgreSQL support were dropped, with an equivalent or better user experience.

PG has always been and will be "best-effort" support, and here is my thinking: we shouldn't spend any time on fixing PG beyond making sure gate-submit is green (so people can merge into master). Anything beyond that should be done by volunteers/maintainers of PG instances of MediaWiki. In comparison, I just wasted a full day debugging patches done in T298485 simply because of the over-engineering of LB and LBFactory, and I haven't made much progress. Something as simple as reloading the db configuration shouldn't be this complicated. If we removed a lot of this code and replaced it with haproxy/proxysql or a popular third-party solution, or even rewrote it from scratch, it would give back more time than dropping PG support from core.

Honestly, this would be a PM decision if we had a third-party MediaWiki PM. Wikibase has its own PM for third-party installations; I don't know why MediaWiki doesn't.

In June 2020 the MediaWiki PM position was axed and Cindy was reassigned. Presumably it was Corey's decision.

I think I'm probably gonna bug her as an SME (instead of a PM) to see what her opinion on this is.

aaron triaged this task as Lowest priority. Oct 4 2022, 6:31 PM
Merged patch:

[mediawiki/core] lockmanager: remove PostgreSqlLockManager

This obscure class could only be used by customizing $wgLockManagers
and was not worth the overhead of maintaining it.

https://gerrit.wikimedia.org/r/c/mediawiki/core/+/839586