Convert Math to AbstractSchema
Closed, Resolved, Public

Description

In support of T191231: RFC: Abstract schemas and schema changes, Math should be migrated to use the abstract schema.

Event Timeline

I looked into this task, and writing the JSON file by hand does not look like much fun. However, this tool might help: https://symfony.com/doc/current/doctrine/reverse_engineering.html

It's not overly difficult. I'm currently working on a script Amir started, so that it does most of the migration: https://github.com/Ladsgroup/db-analyzor-tools/blob/master/db_abstractor.py

That script is now ready and gives you basically everything except comments. The output is:

[
	{
		"name": "mathlatexml",
		"columns": [
			{
				"name": "math_inputhash",
				"type": "binary",
				"options": {
					"notnull": true,
					"length": 16
				}
			},
			{
				"name": "math_inputtex",
				"type": "text",
				"options": {
					"notnull": true
				}
			},
			{
				"name": "math_tex",
				"type": "text",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_mathml",
				"type": "mediumtext",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_svg",
				"type": "text",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_style",
				"type": "mwtinyint",
				"options": {
					"notnull": false
				}
			}
		],
		"indexes": [],
		"pk": [
			"math_inputhash"
		]
	}
]

and

[
	{
		"name": "mathoid",
		"columns": [
			{
				"name": "math_inputhash",
				"type": "binary",
				"options": {
					"notnull": true,
					"length": 16
				}
			},
			{
				"name": "math_input",
				"type": "text",
				"options": {
					"notnull": true
				}
			},
			{
				"name": "math_tex",
				"type": "text",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_mathml",
				"type": "text",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_svg",
				"type": "text",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_style",
				"type": "mwtinyint",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_input_type",
				"type": "mwtinyint",
				"options": {
					"notnull": false
				}
			},
			{
				"name": "math_png",
				"type": "blob",
				"options": {
					"notnull": false,
					"length": 16777215
				}
			}
		],
		"indexes": [],
		"pk": [
			"math_inputhash"
		]
	}
]

That's a good starting point at least.
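
Before hand-editing, a quick sanity check of the generated JSON can catch obvious slips. The following is a minimal sketch and not part of db_abstractor.py: the function name check_tables and the two rules it applies (every "pk" column is declared, and is marked notnull) are assumptions chosen for illustration, and the sample is an abridged copy of the mathlatexml output above.

```python
import json

# Abridged copy of the generated mathlatexml definition from this task.
GENERATED = json.loads("""
[
	{
		"name": "mathlatexml",
		"columns": [
			{ "name": "math_inputhash", "type": "binary",
			  "options": { "notnull": true, "length": 16 } },
			{ "name": "math_inputtex", "type": "text",
			  "options": { "notnull": true } }
		],
		"indexes": [],
		"pk": [ "math_inputhash" ]
	}
]
""")


def check_tables(tables):
    """Return a list of human-readable problems; empty means nothing was found."""
    problems = []
    for table in tables:
        # Index the declared columns by name for quick lookup.
        columns = {col["name"]: col for col in table.get("columns", [])}
        for pk_col in table.get("pk", []):
            col = columns.get(pk_col)
            if col is None:
                problems.append(f"{table['name']}: pk column '{pk_col}' is not declared")
            elif not col.get("options", {}).get("notnull", False):
                problems.append(f"{table['name']}: pk column '{pk_col}' is nullable")
    return problems


if __name__ == "__main__":
    for problem in check_tables(GENERATED):
        print(problem)
```

An empty result only means that these two assumed rules passed; it says nothing about naming conventions or missing indexes.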

I tried it first on the MathSearch extension, which is not deployed on WMF wikis and is therefore less critical. I get the impression that KEY is not recognized as a keyword.
Input

--
-- Used by the math search module to annotate meanings of identifiers.
--
CREATE TABLE /*_*/mathidentifier (
  identifier varchar(5) NOT NULL,
  noun varchar(40) NOT NULL,
  evidence double NOT NULL,
  pageTitle varchar(255) NOT NULL,
  pageID int(8) NOT NULL,
  KEY mathidentifier_key( identifier , pageTitle )
) /*$wgDBTableOptions*/;

Output

[
        {
                "name": "mathidentifier",
                "columns": [
                        {
                                "name": "identifier",
                                "type": "string",
                                "options": {
                                        "notnull": true,
                                        "length": 5
                                }
                        },
                        {
                                "name": "noun",
                                "type": "string",
                                "options": {
                                        "notnull": true,
                                        "length": 40
                                }
                        },
                        {
                                "name": "evidence",
                                "type": "double",
                                "options": {
                                        "notnull": true
                                }
                        },
                        {
                                "name": "pagetitle",
                                "type": "string",
                                "options": {
                                        "notnull": true,
                                        "length": 255
                                }
                        },
                        {
                                "name": "pageid",
                                "type": "integer",
                                "options": {
                                        "notnull": true,
                                        "length": 8
                                }
                        },
                        {
                                "name": "key",
                                "type": "mathidentifier_key",
                                "options": {
                                        "notnull": false
                                }
                        }
                ],
                "indexes": []
        }
]

Doing the rest manually is not hard ;)

But it's skipped on purpose, because the index needs a name, and that needs human input: https://github.com/Ladsgroup/db-analyzor-tools/blob/master/db_abstractor.py#L65-L69

OK, I will figure it out. However, the script does not match my line KEY mathidentifier_key( identifier , pageTitle ), because the name mathidentifier_key was not expected ;-) And it is somewhat confusing that the key name ends up as a column type.
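
The confusing column in the output above (name "key", type "mathidentifier_key") is at least easy to spot mechanically. A minimal sketch, assuming the list of suspicious names below is good enough; both SUSPECT_NAMES and find_misparsed_columns are made up for this illustration and are not part of db_abstractor.py:

```python
# SQL keywords that can start an index or constraint line in a CREATE TABLE body.
# The exact set is an assumption; extend it as needed.
SUSPECT_NAMES = {"key", "index", "unique", "primary", "constraint", "foreign", "fulltext"}


def find_misparsed_columns(table):
    """Return (name, type) pairs for columns that look like misread index lines."""
    return [
        (col["name"], col["type"])
        for col in table.get("columns", [])
        if col["name"].lower() in SUSPECT_NAMES
    ]


# Abridged copy of the generated mathidentifier definition from this task.
TABLE = {
    "name": "mathidentifier",
    "columns": [
        {"name": "identifier", "type": "string", "options": {"notnull": True, "length": 5}},
        {"name": "key", "type": "mathidentifier_key", "options": {"notnull": False}},
    ],
    "indexes": [],
}

if __name__ == "__main__":
    for name, col_type in find_misparsed_columns(TABLE):
        print(f"suspicious column {name!r}: type {col_type!r} is probably an index name")
```

A column that is legitimately called "key" would be flagged as well, so the result is a hint for manual review rather than a verdict.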

I filed T270882 for MathSearch; Math is easier ;-)

Yeah, there's no handling for it. The index entry has to be written by hand:

		"indexes": [
			{
				"name": "mathidentifier_key",
				"columns": [ "identifier", "pageTitle" ],
				"unique": false
			}
		],
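
For simple cases the index entry could also be derived from the SQL. The following is only a sketch of that idea and not what db_abstractor.py does (it skips indexes on purpose): the regular expression, the function name extract_indexes, and the decision to take the index name verbatim from the SQL are all assumptions made for illustration.

```python
import json
import re

# Matches lines such as:  KEY mathidentifier_key( identifier , pageTitle )
# or:  UNIQUE KEY some_name (a, b). Illustrative pattern, not a full SQL parser.
INDEX_RE = re.compile(
    r"^\s*(?P<unique>UNIQUE\s+)?(?:KEY|INDEX)\s+(?P<name>\w+)\s*\((?P<cols>[^)]*)\)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_indexes(create_table_sql):
    """Turn KEY / UNIQUE KEY lines into abstract-schema style index entries."""
    indexes = []
    for match in INDEX_RE.finditer(create_table_sql):
        # Split the column list on commas and drop surrounding whitespace.
        columns = [c.strip() for c in match.group("cols").split(",") if c.strip()]
        indexes.append({
            "name": match.group("name"),
            "columns": columns,
            "unique": match.group("unique") is not None,
        })
    return indexes


SQL = """
CREATE TABLE /*_*/mathidentifier (
  identifier varchar(5) NOT NULL,
  noun varchar(40) NOT NULL,
  evidence double NOT NULL,
  pageTitle varchar(255) NOT NULL,
  pageID int(8) NOT NULL,
  KEY mathidentifier_key( identifier , pageTitle )
) /*$wgDBTableOptions*/;
"""

if __name__ == "__main__":
    print(json.dumps(extract_indexes(SQL), indent="\t"))
```

For the mathidentifier statement this yields the single entry shown above. Index prefix lengths such as col(10), PRIMARY KEY lines, and a column that happens to be called "index" are not handled, and whether the extracted name follows the naming conventions still needs a human to decide.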

Change 652207 had a related patch set uploaded (by Physikerwelt; owner: Physikerwelt):
[mediawiki/extensions/Math@master] Convert Math to abstract schema

https://gerrit.wikimedia.org/r/652207

Please read https://www.mediawiki.org/wiki/Manual:Coding_conventions/Database

Naming, prefixes, index names, they all need to be fixed here.

I did read the document, and I still do not understand your plan. I do not think that renaming tables or fields would be a reasonable change here. Do you think the PK should have a name?

The PK doesn't need a name; for the PK you can just use the "pk" option. See https://github.com/wikimedia/mediawiki/blob/master/maintenance/tables.json for examples.

Hm, that is what your script did :-) Maybe you can give me a few hints on Gerrit about what can or should be improved. However, I suggest that renaming fields or tables be done in a separate change.

Change 652207 merged by jenkins-bot:
[mediawiki/extensions/Math@master] Convert Math to abstract schema

https://gerrit.wikimedia.org/r/652207