Hi DBAs, I would appreciate a review of the Linter extension before it is deployed. It creates a database table, whose definition I've copied below (it is also in the linter.sql file in mediawiki/extensions/Linter):
CREATE TABLE /*_*/linter (
  -- primary key
  linter_id int UNSIGNED AUTO_INCREMENT PRIMARY KEY not null,
  -- page id
  linter_page int UNSIGNED not null,
  -- error category
  linter_cat VARCHAR(30) not null,
  -- extra parameters about the error, JSON encoded
  linter_params blob NOT NULL
) /*$wgDBTableOptions*/;

-- Query by page
CREATE INDEX /*i*/linter_page ON /*_*/linter (linter_page);
Each row represents one "lint error"; there can be multiple lint errors per page (even for the same category). Technically, the unique key is the 'location' field in the JSON-encoded params plus linter_page plus linter_cat, but because the params are stored as a blob, I just added an auto-increment primary key. Uniqueness is enforced in the code.
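To make the uniqueness rule concrete, here is a rough Python sketch of how a page + category + location key could be used to skip duplicates before inserting. This is not the extension's actual code: the function names, the row shape (page, category, params JSON) and the example 'location' values are all assumptions made for illustration.

```python
import json


def lint_key(page_id, category, params_json):
    """Logical identity of a lint error: page + category + 'location' from params."""
    params = json.loads(params_json)
    location = params.get("location")
    # Re-serialize with sorted keys so equal locations always compare equal.
    return (page_id, category, json.dumps(location, sort_keys=True))


def rows_to_insert(existing_rows, new_rows):
    """Return the rows in new_rows whose key is not already stored (hypothetical helper)."""
    seen = {lint_key(*row) for row in existing_rows}
    fresh = []
    for row in new_rows:
        key = lint_key(*row)
        if key not in seen:
            seen.add(key)
            fresh.append(row)
    return fresh
```

Because only 'location' is read out of the params, two rows that differ in other params but share page, category and location are treated as the same error in this sketch.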
I used a JSON blob for the params because they can contain arbitrary data taken from the wikitext (position of the text, template name, etc.), so they wouldn't reliably fit in a VARCHAR.
If there's any more information that would be useful in reviewing, please let me know. T148583 has details on how we plan to use sampling to avoid flooding the databases when doing the initial population.