* Affected components: MediaWiki core and WMF-maintained extensions.
* Policy steward: #DBA
* Stakeholders:
** #DBA
** #TechCom
** Maintainers of core components and WMF-deployed extensions.
### Motivation
The MediaWiki database schema contains several `ENUM` columns (e.g. `categorylinks.cl_type`). `ENUM` columns can make keyset pagination difficult, as illustrated in the Oct 2015 example belowsee below in **Current uses**).
For this reason, along with the usual reasons for not usingIssues:
# `ENUM` columns, perhaps we should discourage using them can make keyset pagination difficult, at least for indexed columns, and instead favor `tinyint`, `varbinary`as illustrated in the Oct 2015 example below.
# Not all DBMSes support `ENUM`s.
# Changing an `ENUM` requires a schema change, or `binary(1)`which slows down development.
##### Requirements
* Agreement that `ENUM()` is rarely needed in WMF-deployed MediaWiki database schemas.
* Document some of the things it is currently used for and what we recommend instead.
* Document some of the typical use cases for it in the industry, and what we recommend instead.
* Decide whether to allow it at all for new schemas, and if so what the process for that should be.
-------
### Exploration
##### Oct 2015 example
Below is from an enwiki DB dump from October 2015:
{P2336}
The comparison in the `WHERE` clause apparently is using the string value, yet the `ORDER BY` clause is using the internal ID, so the category link from the file page gets missed. In fact, ApiQueryCategoryMembers has a [[https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/0958a0bce4bd8390c97ce2042f1eb04df8308293/includes/api/ApiQueryCategoryMembers.php#187|workaround]] for this involving multiple queries. This was noticed as part of T58041. The change that likely caused that issue not only severely hurt the performance of the script, but also its correctness.
##### Current uses
In core, there are eight `ENUM` columns in five different tables (see [tables.sql](https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/0958a0bce4bd8390c97ce2042f1eb04df8308293/maintenance/tables.sql)):
* `categorylinks.cl_type`
* `image.img_media_type`
* `image.img_major_mime`
* `oldimage.oi_media_type`
* `oldimage.oi_major_mime`
* `filearchive.fa_media_type`
* `filearchive.fa_major_mime`
* `uploadstash.us_media_type`.
There are also some `ENUM` columns in extension tables:
* ArticleFeedbackv5: `aft_feedback.aft_discuss`
* CentralAuth: `globaluser.gu_enabled_method`, `localuser.lu_attached_method`, `wikiset.ws_type`, `renameuser_status.ru_status`, `renameuser_queue.rq_status`,
* CentralNotice: `cn_notice_log.notlog_action`, `cn_template_log.tmplog_action`
* ContentTranslation: `cx_translations.translation_status`
* OAI: `updates.up_action`
* SacredText: `sacredtext_verses.st_religious_text`
* WikiLog: `wikilog_comments.wlc_status`
I'm not sure whether changing the data types of any of these columns (whether in core or extensions) could ever happen given the backlog of schema changes at WMF, though it's worth considering separately from this RFC.
##### Alternatives
Indexed columns could use `tinyint`, `varbinary`, or `binary(1)` (or more appropriate type).
Encourage normalization of schemas, with justification needed for exceptions.
Virtual columns.