We want an index on page.page_content_model so we can easily figure out which pages have which content model.
Something like... CREATE INDEX /*i*/page_content_model ON /*_*/page (page_content_model); probably.
We want an index on page.page_content_model so we can easily figure out which pages have which content model.
Something like... CREATE INDEX /*i*/page_content_model ON /*_*/page (page_content_model); probably.
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | Trizek-WMF | T115110 Track Flow activation | |||
Open | None | T63747 Flow should be able to enumerate all flow enabled pages on a wiki | |||
Duplicate | None | T105060 Create an automated listing of all Flow Boards on a wiki | |||
Declined | None | T103389 Add index to page.page_content_model |
Note T105652: RfC: Content model storage would replace the page.content_model column (a string such as "Flow-board") with page.content_model_id.
We want an index on page.page_content_model so we can easily figure out which pages have which content model.
Something like... CREATE INDEX /*i*/page_content_model ON /*_*/page (page_content_model); probably.
When you have agreed what change you want to do, and ready to deploy it, for now tell a DBA to proceed with the following data:
If there is any pending change on the same table, it is preferred to do both at the same time, as a schema change affects traffic quite significantly.
I do not monitor the schema-change tag, I do the Database and blocked-on-operations.
Having said that, an index on page_content_model may not be useful if search results account for more than 30% of the rows.
Could you clarify this?
Do you mean an index may not be sufficient if more than 30% of pages have content model X? A large amount (much more than 30%) have 'wikitext' content model.
The goal is to implement T63747: Flow should be able to enumerate all flow enabled pages on a wiki.
A large amount (much more than 30%) have 'wikitext' content model.
Then an index there will not be useful in most cases (it will not speed up the search)- the mysql optimizer will ignore the usage of such index and do a full scan instead.
I think we'd generally be looking for one of the rarer values of page_content_model. For example, T63747: Flow should be able to enumerate all flow enabled pages on a wiki asks for a special page (or similar UI) that would list all Flow pages on the wiki, and page_content_model='flow-board' is much rarer than 'wikitext'. Presumably an index would still help there, even if the majority of pages are 'wikitext'?
Note T230607: stop using page_content_model - I would suggest declining this, since the field should be removed instead
Lowering priority per last comment - could someone comment if this should be declined in favor of T230607: stop using page_content_model, please?
@ MediaWiki-ContentHandler folks: Should this task be declined in favor of T230607: stop using page_content_model or not?
Thanks in advance for a reply / decision.