Remove MSSQL support from MediaWiki core
Closed, Declined | Public

Description

This is a proposal to remove MSSQL support from MediaWiki core. Previous discussions favored keeping it, on the assumption that it is being used.

Problem

  • The MSSQL schema differs from the MySQL schema.
  • Current lack of maintenance makes it very difficult to support it effectively.
  • The number of installs using MSSQL is very small.
  • We do not run MSSQL on our CI, so we have code in core that we support, but are unable to test. T197995
  • Developers are expected to ensure that their schema changes work on non-free (as in speech) databases

Solution
Move MSSQL into an extension.

Event Timeline

MaxSem raised the priority of this task from to Needs Triage.
MaxSem updated the task description.
MaxSem subscribed.
Legoktm added a subscriber: Skizzerz.
Legoktm subscribed.

@Skizzerz is the maintainer for mssql (cc'd him), we've started drafting up https://www.mediawiki.org/wiki/Requests_for_comment/Moving_database_abstractions_out_of_MediaWiki_core which is related here IMO.

Cut the offensive tone @MaxSem, that was incredibly uncalled for. I spent at least 100 hours developing and testing making sure that it worked for far more than just "basic stuff" when I submitted the patch for support in 1.23. Does it work beyond 1.23? No, I went into it only planning to release updates for LTS releases of mediawiki because I don't get paid for this and there are far better uses of my time. I'll update it for the next LTS version, and the LTS version after that as well, and so on until the internal politics and abrasive behavior around here drive me away from mediawiki permanently (which may end up happening sooner rather than later at this rate; so much for that whole Code of Conduct thing).

If you want to actually do something productive with this, I highly encourage you to check out the linked RFC and assist in getting it rolling forward by providing your comments/thoughts on the infrastructure questions that still need to be answered with it.

Starting a discussion about the future of MSSQL backend support on a technical mailing list (and spending time trying to identify a list of potential problems created by the current situation) sounds like a more sensible approach than quickly dropping two lines of personal impressions (with enough "citation needed" potential) in some issue tracking tool and then drawing questionable conclusions.

Regarding the tone, please see https://www.mediawiki.org/wiki/Bug_management/Phabricator_etiquette - that link asks for constructive criticism.

I'm tempted to close this task as invalid.
Not because concerns and impressions shouldn't be discussed but because it's the wrong place and format.

For a good restart of this discussion, I propose to make it private while the interested parties figure out the next technical steps. The tone is disrespectful indeed, and @Skizzerz and whoever else has contributed to this project don't deserve it.

Qgil renamed this task from Kill MSSQL support with fire to Remove MSSQL support from MediaWiki core. Sep 30 2015, 1:45 PM
Qgil updated the task description.

@MaxSem, I was expecting that you would edit your own writing (or would at least comment here), but in the end I have done it myself. Please review whether the current description is valid.

Note that my rewrite is no endorsement or opinion on the content of this task. I just wanted to change disrespectful wording into neutral and informative wording. I hope this is enough to continue with the technical discussion.

Aklapper triaged this task as Lowest priority. Feb 27 2016, 8:22 PM

Thank you very much to the MSSQL Team. You did a good job. Since MediaWiki 1.27.0 the support for MS SQL Server is so good that we use it in production. OK, some things are still not perfect, but what’s perfect? Please, don’t cut off MSSQL! We really need it!

[This bug is pretty dead, so maybe irrelevant now] FWIW, I just attended EMWCon, and there was a surprising number of corporate users who use MSSQL support and seem happy with it.

Now that we have usage statistics:

select count(*) as num, event_database from (select * from MediaWikiPingback_15781718 union all select * from MediaWikiPingback_15781718_15423246) t group by event_database order by num desc;
+-------+----------------+
| num   | event_database |
+-------+----------------+
| 18132 | mysql          |
|   606 | sqlite         |
|   366 | postgres       |
|     4 | mssql          |
+-------+----------------+
4 rows in set (0.39 sec)

That's 0.02% usage share.
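
As a quick check of that figure, the share can be recomputed from the counts in the table above. This is a minimal sketch in Python; the dictionary only restates the query output, and the variable and function names are illustrative.

```python
# Pingback counts copied from the query output above.
counts = {"mysql": 18132, "sqlite": 606, "postgres": 366, "mssql": 4}

# Total number of installs that reported a database type.
total = sum(counts.values())


def share_percent(db_type: str) -> float:
    """Return the percentage of reporting installs that use db_type."""
    return 100.0 * counts[db_type] / total


for db_type, num in counts.items():
    print(f"{db_type:<9}{num:>6}  {share_percent(db_type):.2f}%")
```

The result covers only installs that actually sent a pingback.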

Please explain what "MediaWikiPingback" is and where it comes from and how it works. Or a link.

Perhaps the way forward is by allowing database drivers in extensions (if we don't already)?
And then moving drivers that we don't want to support in core into extensions?

Prior Art:

Please explain what "MediaWikiPingback" is and where it comes from and how it works. Or a link.

It's a feature in mw core that sends back usage stats. It can optionally be enabled in the installer. The data is stored in the eventlogging dataset afaik.

I think we should keep in mind that there is a possibility that mw pingback is undercounting mw installs behind corporate firewalls, which are also the installs most likely to use mssql.

I think the ideal situation here would be to have db driver extensions. However, the installer does not support this as of yet.

Perhaps the way forward is by allowing database drivers in extensions (if we don't already)?

An extension can implement IDatabase (interface) or extend Database (class). Site admins can enable their use through $wgDBtype. See also Database::factory() and Database::getClass(). This could use a better interface, but it exists and is used (codesearch), for example by the OdbcDatabase extension.
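
The general pattern described here (a configured type string selects a driver class that some other package has registered) can be sketched as follows. This is an analogy in Python, not MediaWiki code and not a description of how Database::factory() is implemented; every name in it (`register_driver`, `factory`, `OdbcLikeDatabase`) is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class Database(ABC):
    """Stand-in for a driver base class; hypothetical, not MediaWiki's Database."""

    @abstractmethod
    def get_type(self) -> str:
        """Return the type string this driver answers to."""


# Maps a type string (playing the role of $wgDBtype) to a driver class.
_DRIVERS: Dict[str, Type[Database]] = {}


def register_driver(db_type: str, cls: Type[Database]) -> None:
    """Let an add-on package make its driver class selectable by type string."""
    _DRIVERS[db_type] = cls


def factory(db_type: str) -> Database:
    """Instantiate the driver registered for db_type, or fail loudly."""
    try:
        return _DRIVERS[db_type]()
    except KeyError:
        raise ValueError(f"no driver registered for type {db_type!r}") from None


class OdbcLikeDatabase(Database):
    """Hypothetical driver shipped outside the core package."""

    def get_type(self) -> str:
        return "odbc"


register_driver("odbc", OdbcLikeDatabase)
db = factory("odbc")
print(db.get_type())
```

The sketch covers only connection-time lookup; it does not address installation or schema updates.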

This task is about removing support for MSSQL as provided/maintained by MediaWiki core.

This task is about removing support for MSSQL as provided/maintained by MediaWiki core.

Then I think that's exactly what we should do. :)

Declining this task, as the structure to do this isn't nearly in a workable state. Moving out the Database class is about 10% of what is required here; the installer/updater need to work (keep in mind that LocalSettings.php doesn't exist when the installer is being run, yet that needs to be able to load the abstraction and schema). And, if the support isn't in core, then people will simply start using more and more MySQL-specific features making it impossible for an extension to work anyway.

This task is therefore entirely unconducive to having a stable and working database abstraction layer and schema. There is an RFC in draft to fix all of the disparate schemas to actually align and operate in a way that works across all DBMSes. Once that RFC is complete, there is largely no reason to split other less-supported DBMSes out of core because everything will pretty much automatically work for them to begin with so long as people aren't trying to call $db->query() with raw SQL strings.

dbarratt added a project: TechCom-RFC.

I think the majority want MSSQL moved out of core, but let's bring in some more voices into that.

This task is therefore entirely unconducive to having a stable and working database abstraction layer and schema. There is an RFC in draft to fix all of the disparate schemas to actually align and operate in a way that works across all DBMSes. Once that RFC is complete, there is largely no reason to split other less-supported DBMSes out of core because everything will pretty much automatically work for them to begin with so long as people aren't trying to call $db->query() with raw SQL strings.

We're still not going to be running tests on MSSQL (or Oracle for that matter) so we have code in core that we support, but we cannot test. :(

I've been using some Laravel lately, and I have to say that its database migration structure is just incredibly nice to work with. It is a very easy, driver agnostic, method of defining, upgrading and maintaining schemas and gives you access to raw SQL to do the very driver specific things, if you need to reach that last 5% of crazy detail. I find myself longing for it within MediaWiki honestly.

We definitely need more work in this area I think. When people talk about usability for 3rd parties and simpler onboarding of new developers, it's this spit and polish, this ease and flexibility that matters.

I think the ideal situation here would be to have db driver extensions. However, the installer does not support this as of yet.

Indeed. Our database abstraction layer supports extensions, which can be used to establish a dedicated database connection through a type and implementation provided by an extension. However, this would currently only work for something like SqlBagOStuff (for ObjectCache), or ExternalStore (for revision text) – where the schemas are quite simple and set up by the administrators themselves.

In order for this to work for the main database schema, we need:

  • A way for the installer to enable the extension during the installation process - in order to let it register the db type and classes.
  • A way for extensions to provide a custom DatabaseUpdater class.
  • A way for extensions to provide a discovery path for DatabaseInstaller::getSqlFilePath() – for their versions of the install schema, and schema migration patches.

The third point could be avoided if we implement T191231 first, which would forego the need for database backend implementations to maintain their own version of the schema and update patches. Instead, they'd only need to interpret the shared abstract syntax.
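
To make the idea of a shared abstract syntax concrete, here is a minimal sketch in Python of one abstract change rendered for several backends. It is not the format proposed in T191231; the `AddColumn` class, the `TYPE_MAP` type mappings, the `render` function and the `page_example` column are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddColumn:
    """One abstract schema change, written once by the developer."""
    table: str
    column: str
    type: str          # abstract type: "integer" or "string"
    length: int = 255  # only used by length-aware string types


# Hypothetical mapping from abstract types to each backend's native type.
TYPE_MAP = {
    "mysql": {"integer": "INT", "string": "VARCHAR({n})"},
    "postgres": {"integer": "INTEGER", "string": "VARCHAR({n})"},
    "sqlite": {"integer": "INTEGER", "string": "TEXT"},
    "mssql": {"integer": "INT", "string": "NVARCHAR({n})"},
}


def render(change: AddColumn, dbms: str) -> str:
    """Translate the abstract change into backend-specific SQL."""
    sql_type = TYPE_MAP[dbms][change.type].format(n=change.length)
    # SQL Server's ALTER TABLE ... ADD does not take a COLUMN keyword.
    keyword = "ADD" if dbms == "mssql" else "ADD COLUMN"
    return f"ALTER TABLE {change.table} {keyword} {change.column} {sql_type};"


change = AddColumn(table="page", column="page_example", type="string", length=64)
for dbms in TYPE_MAP:
    print(f"{dbms}: {render(change, dbms)}")
```

In a design like this, each backend supplies only a type mapping and its syntax quirks, and the change itself is defined once.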

We definitely need more work in this area I think. When people talk about usability for 3rd parties and simpler onboarding of new developers, it's this spit and polish, this ease and flexibility that matters.

I totally agree that we need a better database abstraction layer (I'm a big fan of Doctrine's DBAL which is used by Drupal) and a better installer (I've created many issues related to the installer and usage with Docker for instance).

I think it would be awesome if our DBAL could support non-relational databases like MongoDB. And honestly, I would rather spend our time on a better database abstraction layer that could do things like that, than continuing to support non-free databases.

I think the majority want MSSQL moved out of core, but let's bring in some more voices into that.

[citation sorely needed]

I'm fairly certain that the majority don't care, as long as it works well enough. There are numerous disadvantages to moving it out of core and only dubious advantages ("getting rid of code we can't test" is about the only advantage, and even then we could set up tests for it post-merge via travis). This task is completely barking up the wrong tree.

We definitely need more work in this area I think. When people talk about usability for 3rd parties and simpler onboarding of new developers, it's this spit and polish, this ease and flexibility that matters.

I totally agree that we need a better database abstraction layer (I'm a big fan of Doctrine's DBAL which is used by Drupal) and a better installer (I've created many issues related to the installer and usage with Docker for instance).

I think it would be awesome if our DBAL could support non-relational databases like MongoDB. And honestly, I would rather spend our time on a better database abstraction layer that could do things like that, than continuing to support non-free databases.

Feel free to not spend any of your time supporting non-free databases. You are however not the dictator of how other people spend their time, and I find this comment to be incredibly insulting to state that supporting a non-free database is a waste of everyone's time. You've shown your colors, and I think it's time that you excuse yourself from this task. I'm not going to go through another 3 years of people blocking things that other people use out of purely ideological reasons, and I honestly should have just closed off this task when it was opened with its initial inflammatory remarks back then.

If something has no technical merit, it can be removed from core. If something has no tests, that's not great but it's not insurmountable -- we can simply add tests for it instead. Due to the nature of our testing environment, such tests would need to be outside of the WMF infrastructure and therefore post-merge, but that isn't an impossible task. Saying it should be removed because you don't see the point of it or because it is a waste of your time is acting in bad faith. Stop that.

The true technical solution to this is T191231. Support that one instead, and leave this task to the annals of history. I'm not going to cause a status war by declining it again, but that is truly where this task belongs.

@Skizzerz, the issue here is that core MW developers don't support it, and you yourself have stated that you can only support it for LTS releases. That doesn't sound like "MW supports MSSQL" at all to me. We can't possibly claim we support something no developer other than its maintainer tests with (and even the maintainer doesn't test frequently enough to fit in our release schedule). We can't even run CI for this backend unless it becomes free-as-liberty. This is a perfect use case to move something out of the main product and let its developers maintain it at the pace they are able to.

Isarra subscribed.

This is silly. This is maintained functionality (the maintainer is right here yelling at you) that clearly works (as evidenced by the fact that people are using it), and there is already another task/RFC that would be a proper solution to the proposed problem anyway.

[citation sorely needed]

I said "I think", and that's why I added TechCom to find out, since neither of us has quantitative data on the matter.

We do have data on the number of installs and it is very very small:
https://pingback.wmflabs.org/#database-type/database-type-timeseries
it represents less than 0.1% of installs. I'm not sure what the margin of error is on pingback, but it seems like it must be within the range of zero.

Feel free to not spend any of your time supporting non-free databases.

When you make a schema change, you're expected to ensure that the change works on all of the databases we support in core. So that's not really an option.

You are however not the dictator of how other people spend their time, and I find this comment to be incredibly insulting to state that supporting a non-free database is a waste of everyone's time. You've shown your colors, and I think it's time that you excuse yourself from this task. I'm not going to go through another 3 years of people blocking things that other people use out of purely ideological reasons, and I honestly should have just closed off this task when it was opened with its initial inflammatory remarks back then.

Look, I'm one of the most ambivalent towards the free software guiding principle, I even created an issue to run our tests on MSSQL T197995, but that was swiftly declined.

If something has no technical merit, it can be removed from core. If something has no tests, that's not great but it's not insurmountable -- we can simply add tests for it instead. Due to the nature of our testing environment, such tests would need to be outside of the WMF infrastructure and therefore post-merge, but that isn't an impossible task. Saying it should be removed because you don't see the point of it or because it is a waste of your time is acting in bad faith. Stop that.

I don't see what is wrong with moving it into an extension. Yes it would require some work, but there doesn't seem to be a reason to keep it in core. And this is nothing against MSSQL specifically, I think it would be great if all of the database drivers were extensions.

The true technical solution to this is T191231. Support that one instead, and leave this task to the annals of history. I'm not going to cause a status war by declining it again, but that is truly where this task belongs.

I don't think they are mutually exclusive.

dbarratt updated the task description.

We do have data on the number of installs and it is very very small:
https://pingback.wmflabs.org/#database-type/database-type-timeseries
it represents less than 0.1% of installs. I'm not sure what the margin of error is on pingback, but it seems like it must be within the range of zero.

This is incorrect. Pingback is opt-in, and does not work for installs behind firewalls, etc. And that's where it seems like most of the MSSQL users are based on anecdotal, in person evidence.

Feel free to not spend any of your time supporting non-free databases.

When you make a schema change, you're expected to ensure that the change works on all of the databases we support in core. So that's not really an option.

This is incorrect. Some unknown person added this to the policy a month ago (https://www.mediawiki.org/w/index.php?diff=2813515&oldid=2812201&title=Development_policy&type=revision) - I just reverted them.

Look, I'm one of the most ambivalent towards the free software guiding principle, I even created an issue to run our tests on MSSQL T197995, but that was swiftly declined.

This is an incredibly bizarre interpretation of the guiding resolution. Are you also advocating to remove Windows and Mac support from MediaWiki because they're non-free platforms?

I don't see what is wrong with moving it into an extension. Yes it would require some work, but there doesn't seem to be a reason to keep it in core. And this is nothing against MSSQL specifically, I think it would be great if all of the database drivers were extensions.

The true technical solution to this is T191231. Support that one instead, and leave this task to the annals of history. I'm not going to cause a status war by declining it again, but that is truly where this task belongs.

I don't think they are mutually exclusive.

Well, first, it's (currently) impossible to have database implementations in extensions. I'd suggest you listen to @Skizzerz on this because he's the one who's done a significant amount of research and work on this (e.g. https://www.mediawiki.org/wiki/Requests_for_comment/Moving_database_abstractions_out_of_MediaWiki_core, T191231).

I'm declining this for the last time. I suggest people who want to do something productive work towards T191231: RFC: Abstract schemas and schema changes.

There is no quantitative data. The pingbacks are horribly inaccurate, especially insofar as enterprise installations of MediaWiki are represented. Corporate firewalls as well as an attitude of sysadmins to not share installation data by default both cause installation numbers (across all database systems) to be largely underrepresented. In any case, I know personally of at least 20 companies and government organizations using MediaWiki with mssql or that have used MediaWiki with mssql, and I would not be surprised if the true number was in the hundreds. Yes, this isn't much compared to MySQL installations, but it is a non-negligible number.

All of your grievances with mssql as currently implemented are completely valid. The developer friction to get patches through to 5 different dbmses sucks. Moving it into an extension doesn't fix that, it only makes it intractably worse to the point that the extension shouldn't even exist. It would be a death sentence to any non-MySQL database. The RFC I linked a couple of comments up addresses that issue in a much more meaningful manner. With an abstract representation of schema changes, a developer makes one change once, and it works flawlessly across all databases. It also eliminates the differences between the schemas.

As mentioned twice above, and as I'll mention a third time, there is nothing stopping automated tests from being run with mssql. The only snag is that they have to be run post-merge instead of via our zuul/Jenkins setup. This is not insurmountable, and is something I will be personally working on once the above RFC is implemented (or simultaneously along with its implementation).

Do not reopen this, devote your energies to things that improve the ecosystem instead of causing it to wither on the vine.

We do have data on the number of installs and it is very very small:
https://pingback.wmflabs.org/#database-type/database-type-timeseries
it represents less than 0.1% of installs. I'm not sure what the margin of error is on pingback, but it seems like it must be within the range of zero.

If I remember correctly, pingback was introduced with the 1.28 release. MSSQL is only supported for LTS. The last LTS was 1.27. I had tried to install release 1.28 for MSSQL but had given up after some problems during the installation. Therefore I am surprised that there are pingbacks for MSSQL at all.

We have just upgraded our test system to 1.31 for MSSQL. Our tests are not yet completed. We had some problems with the installation. We put our solution in a git branch (https://github.com/Mik4sa/mediawiki/tree/1.31.0-mssql).

Unfortunately, our company guidelines do not allow pingback.

We would be very happy if there would continue to be support for MSSQL.