In T163642 we made all strings from indexed statements part of the `all` field, which makes them searchable by plain search queries.
Unfortunately, only a subset of the statements is indexed. The reason is that indexing a statement today means populating the `statement_keyword` field. We do not want to do that for textual content (phrases, long text that needs tokenization), because such content is not suited to keyword matching.
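To illustrate why keyword matching is a poor fit for long text, here is a minimal Python sketch. It is not CirrusSearch or Elasticsearch code: the helper names, the naive tokenizer, and the sample `P1476=...` value are all hypothetical and chosen only for illustration. It assumes a keyword-style field treats the whole value as a single term, while a text-style field is split into tokens.

```python
import re


def keyword_match(indexed_value: str, query: str) -> bool:
    """Keyword-style matching: the whole value is one term, so only an exact match hits."""
    return indexed_value == query


def tokenize(text: str) -> list[str]:
    """Naive stand-in for an analyzer: lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def text_match(indexed_value: str, query: str) -> bool:
    """Text-style matching: every query token must appear among the indexed tokens."""
    indexed_terms = set(tokenize(indexed_value))
    return all(term in indexed_terms for term in tokenize(query))


# Hypothetical statement value, used only for illustration.
statement = "P1476=The Hitchhiker's Guide to the Galaxy"

# A plain search for two words from the title misses the keyword-style value...
keyword_hit = keyword_match(statement, "hitchhiker guide")
# ...but finds the same content once it has been tokenized.
text_hit = text_match(statement, "hitchhiker guide")
```

With a keyword-style field the user would have to type the full value exactly, which is why textual statements need to go into a tokenized text field instead.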
To increase recall on Wikidata using textual content, we need a new way to feed extra text content into an existing CirrusSearch field.
Currently the text fields are:
- `text`: populated using `\Wikibase\EntityContent::getTextForSearchIndex`
- `auxiliary_text`: not used by `EntityHandler`
We should evaluate the impact on the size of the index to determine whether we can feed all the textual properties or only a subset.