Re-index all pages on all wikis (insource and contentmodel don't play well together)
Closed, Declined (Public)

Description

This search for insource:"flowlist" insource:/[^\{d\|]flowlist/ -contentmodel:css doesn't exclude pages with the css content model. (The opposite query, insource:"flowlist" insource:/[^\{d\|]flowlist/ contentmodel:css, also does not seem to restrict results to css pages only.)

Event Timeline

debt triaged this task as Medium priority. Aug 9 2018, 5:21 PM
debt edited projects, added Discovery-Search (Current work); removed Discovery-Search.
debt subscribed.

This currently returns 19 results. It looks like we just need to re-index everywhere to capture the other files. This will take quite a bit of time (all pages on all wikis), so we'll have to plan for it.

debt renamed this task from insource and contentmodel don't play well together? to Re-index all pages on all wikis (insource and contentmodel don't play well together). Aug 9 2018, 5:21 PM

Will a reindex actually fix the problem reported though? It's a case of "contentmodel pages of a certain kind (CSS in this case) are not being appropriately included/excluded where they should be, at least when I use insource".

For example, this page shows up in the query. Why? It has always had a css content model (well, at least since content models were rolled out, I would guess), in the 6 years the page has existed. I would have guessed a previous index would have caught that....

In the six years it has existed, it has never been edited. That means the latest document our search index has for it dates from 2012, when content model wasn't a property of the search documents.
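
To illustrate why a stale document can slip past -contentmodel:css, here is a toy model in Python. It is not CirrusSearch code: the dicts standing in for search documents, the `content_model` field name, and the `search` and `reindex` helpers are all assumptions made up for this sketch. It only shows that an exclusion filter cannot drop a document that lacks the field altogether.

```python
# Toy model only: these dicts stand in for search documents.
# The field name "content_model" and both helpers are hypothetical.
docs = [
    # Indexed before the field existed, never edited since.
    {"title": "Stale.css", "source": "a flowlist rule"},
    {"title": "Fresh.css", "source": "a flowlist rule", "content_model": "css"},
    {"title": "Article", "source": "uses flowlist", "content_model": "wikitext"},
]


def search(documents, text, exclude_model=None):
    """Return titles whose source contains `text`, minus an excluded model."""
    hits = []
    for doc in documents:
        if text not in doc["source"]:
            continue
        # A document without the field never equals the excluded value,
        # so it survives the exclusion.
        if exclude_model is not None and doc.get("content_model") == exclude_model:
            continue
        hits.append(doc["title"])
    return hits


def reindex(doc, current_model):
    """Rebuild a document so it carries the page's current content model."""
    return {**doc, "content_model": current_model}


before = search(docs, "flowlist", exclude_model="css")
refreshed = [reindex(docs[0], "css")] + docs[1:]
after = search(refreshed, "flowlist", exclude_model="css")
```

In this sketch the stale page is returned before it is rebuilt and excluded afterwards, which matches the explanation above for this one page.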

I thought that might be it. Cool, good to know ~

Rather than run a manual reindex, I have put together T203622 and submitted a patch that will constantly reindex everything on a rolling 8 week cycle. This removes the need for anything to be done manually now, and in the future these problems will fix themselves within a known time frame.
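
One way a rolling cycle like this could be scheduled is sketched below. This is an assumption for illustration, not the approach taken in the T203622 patch: it splits the 8 weeks into daily slots and assigns each page to a slot by its numeric page id. The names `CYCLE_DAYS`, `slot_for_day` and `pages_due` are hypothetical.

```python
from datetime import date

CYCLE_DAYS = 8 * 7  # the rolling 8 week cycle, split into daily slots (assumed)


def slot_for_day(day: date) -> int:
    """Map a calendar day to a slot number in [0, CYCLE_DAYS)."""
    return day.toordinal() % CYCLE_DAYS


def pages_due(page_ids, day: date):
    """Return the page ids whose slot matches the given day."""
    slot = slot_for_day(day)
    return [page_id for page_id in page_ids if page_id % CYCLE_DAYS == slot]


# Usage: over any 56 consecutive days, every page comes up exactly once.
page_ids = list(range(1, 201))
start = date(2018, 9, 6)
visited = []
for offset in range(CYCLE_DAYS):
    visited.extend(pages_due(page_ids, date.fromordinal(start.toordinal() + offset)))
```

With a scheme like this, a page that is never edited would still be rebuilt within one full cycle, which is the "known time frame" property described above.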

A guaranteed reindex every 2 months is pretty sensible.