ProofreadPage stores a "proofreading quality level" alongside each Page: page revision. This quality level is currently stored inside the wikitext content of each revision using the <pagequality> tag. Ensuring that the <pagequality> tag is present, valid and unique is a fairly heavy task.
This approach also requires fetching the wikitext content to retrieve the proofreading quality level.
To circumvent that, the proofreading quality level of the current revision is also stored in the page_props database table.
ProofreadPage never displays the <pagequality> tag to editors but hides it behind buttons displayed near the change summary edit field.
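To illustrate why the wikitext-based storage is awkward, here is a minimal Python sketch of the lookup it implies. It is not ProofreadPage's actual code: the exact tag shape (`<pagequality level="3" user="Example" />`), the 0..4 numbering of the five levels, and the name `extract_quality_level` are all assumptions made for the example.

```python
import re

# Assumed tag shape: <pagequality level="3" user="Example" />
PAGEQUALITY_RE = re.compile(r'<pagequality\s+level="(\d)"\s+user="([^"]*)"\s*/>')

# Five quality levels, assumed here to be numbered 0..4.
VALID_LEVELS = range(5)


def extract_quality_level(wikitext):
    """Return the quality level if exactly one valid tag is present, else None."""
    matches = PAGEQUALITY_RE.findall(wikitext)
    if len(matches) != 1:
        # Missing or duplicated tag: the "present and unique" checks fail.
        return None
    level = int(matches[0][0])
    # The "valid" check: the level must be one of the known values.
    return level if level in VALID_LEVELS else None
```

Every read of the quality level of a non-current revision has to load the full revision text and run a check of this kind, which is the cost the change tag proposal aims to remove.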
Since the "proofreading quality level" was introduced around 2008, the change tags system has been implemented in MediaWiki. It might be relevant to migrate ProofreadPage to it. This way the content of Page: pages would be properly separated from metadata, and the "proofreading quality level" of each revision would be easily accessible from the database.
There are two options for implementing the "proofreading quality level" storage in the change tags system.
- Tag each revision with one of the 5 possible proofreading quality level change tags.
- Tag only the revisions that change the quality level, using the change tag of the new level.
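The difference between the two options can be sketched in Python on a page history represented as a list of quality levels, oldest revision first. The tag name format `proofreadpage-quality-N` and all function names are hypothetical, and the sketch assumes that under option 2 the first revision is tagged with its initial level.

```python
def tag_name(level):
    # Hypothetical tag naming scheme, not an existing ProofreadPage convention.
    return f"proofreadpage-quality-{level}"


def tags_option_1(levels):
    """Option 1: every revision carries the tag of its quality level."""
    return [tag_name(level) for level in levels]


def tags_option_2(levels):
    """Option 2: only revisions that change the level are tagged (None otherwise)."""
    tags = []
    previous = None  # assumption: the first revision always counts as a change
    for level in levels:
        tags.append(tag_name(level) if level != previous else None)
        previous = level
    return tags


def level_at(tags, index):
    """Option 2 lookup: walk back to the most recent tagged revision."""
    for i in range(index, -1, -1):
        if tags[i] is not None:
            return int(tags[i].rsplit("-", 1)[1])
    return None


history = [1, 1, 3, 3, 4]
sparse_tags = tags_option_2(history)
```

With this history, option 1 produces five tags while option 2 produces three, on revisions 0, 2 and 4. The trade-off is that, under option 2, reading the level of an arbitrary old revision requires a walk back through the history such as `level_at(sparse_tags, 3)`, which returns 3.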
I prefer option 2 because it makes it easy to see which revisions have changed the quality level. The quality level of the current revision, which is the most frequently used value, is already stored in the page_props table. Option 2 also decreases the noise when displaying tags.
Volumetry: The biggest Wikisource (fr) currently stores ~3M pages. Assuming that the average number of proofreading quality level changes per page is at most 5, option 2 would add at most about 15M tags to the biggest Wikisources.
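The estimate above is a simple product, written out here in Python so the inputs can be adjusted. The variable names are illustrative and the figures are the ones quoted above.

```python
pages = 3_000_000            # approximate number of Page: pages on fr.wikisource
max_changes_per_page = 5     # assumed upper bound on the average number of level changes

# Upper bound on the number of change tags added under option 2.
upper_bound = pages * max_changes_per_page
print(f"{upper_bound:,}")
```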
Pros:
- Allows accessing the proofreading quality level directly from the database.
- Separates the content from metadata.
- Simplifies ProofreadPage internals.
- Might allow the content of Page: pages to be treated as plain wikitext in the future, which would fix a lot of things, e.g. VisualEditor support.
Cons:
- Migration cost.
- The old revisions will still contain the "<pagequality>" tag in their wikitext so artefacts of the previous system will remain.