
[L] Reduce Cirrusdoc API payload
Closed, ResolvedPublic

Description

The payload returned by cirrusDoc is currently quite large, and we are trying to find a way to make it smaller.

One possible solution would be to reduce the information returned by CirrusDoc. The payload currently includes a lot of information that Search Preview does not use.

Currently we just need the following fields:

  • heading -> for the sections
  • title, redirect.title, category, heading, text, auxiliary_text, file_text, source_text... -> for the snippets
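
To illustrate the intended reduction, here is a minimal sketch of trimming a document down to an allow-list of fields. It is not the CirrusSearch implementation: the `pickFields` helper, the `CirrusDocLike` type, and the sample document shape (including `redirect` being a list of objects) are assumptions made for illustration only. Only the field names come from the list above.

```typescript
// Hypothetical shape: a cirrusDoc-style document is treated as a plain key/value map.
type CirrusDocLike = Record<string, unknown>;

// Field names taken from the task description; "redirect.title" is a nested path.
const SNIPPET_FIELDS: string[] = [
  "title", "redirect.title", "category", "heading",
  "text", "auxiliary_text", "file_text", "source_text",
];

// Hypothetical helper: keep only the requested fields, supporting one level of nesting.
function pickFields(doc: CirrusDocLike, fields: string[]): CirrusDocLike {
  const out: CirrusDocLike = {};
  for (const path of fields) {
    const [top, sub] = path.split(".");
    if (!(top in doc)) {
      continue; // Field not present in this document, nothing to copy.
    }
    const value = doc[top];
    if (sub === undefined) {
      out[top] = value;
    } else if (Array.isArray(value)) {
      // Assumed case: a list of objects, keep only the named sub-field of each entry.
      out[top] = value.map((item) => ({ [sub]: (item as CirrusDocLike)[sub] }));
    } else if (value !== null && typeof value === "object") {
      out[top] = { [sub]: (value as CirrusDocLike)[sub] };
    }
  }
  return out;
}

// Sample document with an extra field that Search Preview would not need.
const sampleDoc: CirrusDocLike = {
  title: "Foo",
  heading: ["History", "Usage"],
  redirect: [{ namespace: 0, title: "Bar" }],
  incoming_links: 12,
  text: "body text",
};

// Only the section headings are needed for the sections list.
const sectionsOnly = pickFields(sampleDoc, ["heading"]);
// The snippet fields drop "incoming_links" and the redirect namespace.
const snippetDoc = pickFields(sampleDoc, SNIPPET_FIELDS);
```

In practice the filtering would happen on the API side so the unused fields never leave the server; the sketch only shows which parts of a document would survive.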

Further information is in this Slack thread: https://wikimedia.slack.com/archives/CKZ5CPBQX/p1664883272109109

AC:

  • Expose which field an article snippet comes from (wikitext or stripped version)
  • Define arguments in CirrusSearch\Api\QueryCirrusDoc to allow us to retrieve just a couple of relevant fields
  • Change the UI to request just the data it needs

Event Timeline

Change 838165 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[mediawiki/extensions/CirrusSearch@master] [POC] Expose field from which text is highlighted

https://gerrit.wikimedia.org/r/838165

matthiasmullie renamed this task from [SPIKE] Cirrusdoc API investigation to reduce payload to Cirrusdoc API investigation to reduce payload. Oct 4 2022, 2:09 PM
matthiasmullie renamed this task from Cirrusdoc API investigation to reduce payload to Reduce Cirrusdoc API payload. Oct 4 2022, 2:16 PM
matthiasmullie updated the task description.
matthiasmullie subscribed.

Had a quick look; looks possible to expose the field. I updated the ticket description, removed "SPIKE" & I think we're ready to work on this.

CBogen renamed this task from Reduce Cirrusdoc API payload to [L] Reduce Cirrusdoc API payload. Oct 5 2022, 4:29 PM

Change 841465 had a related patch set uploaded (by Simone Cuomo; author: Simone Cuomo):

[mediawiki/extensions/CirrusSearch@master] Reduce Cirrusdoc API payload

https://gerrit.wikimedia.org/r/841465

Change 841465 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Reduce Cirrusdoc API payload

https://gerrit.wikimedia.org/r/841465

Change 843507 had a related patch set uploaded (by Simone Cuomo; author: Simone Cuomo):

[mediawiki/extensions/SearchVue@master] Reduce CirrusDoc request payload

https://gerrit.wikimedia.org/r/843507

Change 843507 merged by jenkins-bot:

[mediawiki/extensions/SearchVue@master] Reduce CirrusDoc request payload

https://gerrit.wikimedia.org/r/843507

I can confirm that "Change the UI to request just the data it needs" was completed and has been live since October.

The other open AC will be covered by the "expanding snippets" ticket.