Search engines such as Cirrus should examine the content of all slots when updating the search index.
Related patch: mediawiki/extensions/Wikibase (master, +15 -1): Add a way to extract content scoped search index data
Mentioned In
- T226722: Index captions as description fields not label
- T213638: Placeholder text prompt for a caption shouldn't be inserted into the search index
- rEWBIaf8cfee130c1: Merge "Adding note about workaround pending T190066"
- rEWBIaf8701cff61f: Adding note about workaround pending T190066
Mentioned Here
- rEWBI2a0610b8a2d0: Use the CirrusSearchBuildDocumentParse hook
Two big questions here are:
- One document or multiple documents? (I think the current trend is toward one document.)
- If the answer is one document, how do we reconcile slots whose fields overlap? For example, if two slots both want to put something in opening_text, what happens?
For now, I'd blindly concatenate. That's the baseline.
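To make that baseline concrete, here is a minimal sketch in Python. It is not Cirrus code: the function name, the dict-of-dicts input shape, the newline separator, and the sorted slot order are all assumptions made for illustration, as are the slot and field names other than opening_text.

```python
from typing import Dict, List

# Hypothetical shape: each slot contributes a dict of field name -> text.
SlotFields = Dict[str, str]


def merge_slots_concat(slots: Dict[str, SlotFields], separator: str = "\n") -> Dict[str, str]:
    """Build one search document by concatenating same-named fields across slots."""
    collected: Dict[str, List[str]] = {}
    # Sorting slot names is an assumption; it only makes the output deterministic.
    for slot_name in sorted(slots):
        for field, text in slots[slot_name].items():
            if text:  # skip empty contributions
                collected.setdefault(field, []).append(text)
    return {field: separator.join(parts) for field, parts in collected.items()}


# Illustrative input: two slots that both want to write opening_text.
doc = merge_slots_concat({
    "main": {"opening_text": "A photo of a cat.", "text": "Full wikitext body."},
    "mediainfo": {"opening_text": "Cat sitting on a sofa"},
})
```

With this approach no slot's contribution is lost, but nothing distinguishes which slot a piece of text came from once it is in the merged field.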
We have to answer similar questions for a lot of things, including the generation of the HTML the user will see. I plan an RFC about that question.
- At least for Cirrus, it pretty much needs to be one document if we want any kind of interaction between fields of multiple content types.
- Again only with respect to Cirrus, I think this will depend heavily on how those fields get into the queries that are issued. The current approach, with a variety of hard-coded field names, really pushes for the ability to overwrite; for example, the work on file media info will overwrite the opening_text field on file pages. The two will have to be figured out in parallel, I suppose.
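One way to picture the overwrite-versus-concatenate tension is a per-field merge policy. The Python sketch below is purely illustrative and does not reflect how Cirrus builds documents: the policy names, the function, the ordered list of slots, and the sample text are assumptions. Only the opening_text field name comes from the discussion.

```python
from typing import Dict, List, Tuple

# Hypothetical per-field policies.
CONCAT = "concat"
OVERWRITE = "overwrite"


def merge_slots(
    slots: List[Tuple[str, Dict[str, str]]],
    strategies: Dict[str, str],
) -> Dict[str, str]:
    """Merge slot fields in the given order; later slots win for OVERWRITE fields."""
    doc: Dict[str, str] = {}
    for _slot_name, fields in slots:
        for field, text in fields.items():
            if not text:
                continue  # an empty value never clobbers existing content
            if strategies.get(field, CONCAT) == OVERWRITE or field not in doc:
                doc[field] = text
            else:
                doc[field] = doc[field] + "\n" + text
    return doc


# Illustrative input: the media info slot overwrites opening_text on a file page,
# while the other field falls back to concatenation.
doc = merge_slots(
    [
        ("main", {"opening_text": "Wikitext intro.", "text": "Wikitext body."}),
        ("mediainfo", {"opening_text": "Caption text.", "text": "Caption text."}),
    ],
    strategies={"opening_text": OVERWRITE},
)
```

Under a scheme like this, the set of overwritable fields would have to be agreed together with the queries that read them, which is why the two need to be worked out in parallel.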
Should we set up some kind of meeting to sync on this and develop a strategy? Maybe at the hackathon? I am personally still rather fuzzy on how this whole thing is supposed to work, and on the MCR details too, and I suspect I am not the only one :)
I worked around that in MediaInfo by using WikiPage::factory( $title )->getRevisionRecord() ... ought we raise a ticket to make the hook MCR compatible? Not really sure what's using the hook, so I'm not sure how to proceed ...
@Cparle, this ticket *is* about making sure all slots are passed to Cirrus. Cirrus should then also pass them on via its own hooks. Changing a hook signature isn't trivial, though; it's generally better to introduce a new hook.
I think this ticket is sufficient to track the need to do this. Your workaround should be fine for MediaInfo for now. Perhaps add a comment to your hook handler that points to this ticket.