
Migrate search integration of ArticlePlaceholder to elastic
Closed, Resolved · Public · 8 Story Points

Description

As * I want to kill the wb_terms table so that it will stop exploding

Details

ArticlePlaceholder currently uses the wb_terms table for search.
The use of wb_terms has caused us many issues over the years, including an outage [[ https://wikitech.wikimedia.org/wiki/Incident_documentation/20180524-wikidata | rather recently (May 2018) ]].
As a result, the ArticlePlaceholder search integration is currently disabled (it will be re-enabled with T195751).
To re-enable the feature while ensuring stability and reliability, we should first migrate it away from wb_terms and use elasticsearch instead.

Using elasticsearch directly in PHP poses some difficulties, as all of the elastic & search code is in Repo and ArticlePlaceholder only has access to Client.
In a discussion between @Legoktm, @Aleksey_WMDE and me at Wikimania this year, we came up with the idea of calling the repo wbsearchentities API internally from the client; this is probably the best path forward.

Impact & Priority

The feature should probably not be enabled again until this task is resolved, hence T195751 is blocked by this.
As for the priority of T195751 (turning the feature back on) that is down to @Lydia_Pintscher.
When writing this description, this task was marked as "High", so I'll leave it as that for now (2018-08-19).

Task

Make ArticlePlaceholder use elasticsearch instead of the wb_terms table by calling the repo wbsearchentities API module.

Acceptance criteria

  • ArticlePlaceholder no longer uses the wb_terms table for anything
  • ArticlePlaceholder uses wikidata search via the wbsearchentities API module
  • Calls to the API module should have a short timeout and should fail gracefully & log success / failure rates
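The criteria above can be sketched roughly as follows. This is only an illustration in Python, not the PHP that the extension would actually use: the function names, the `stats` counters, the 2-second default timeout and the `example.org` endpoint are all assumptions made for the example. Only the `wbsearchentities` request parameters and the `search` / `id` fields of the response come from the real API.

```python
import http.client
import json
import logging
import urllib.parse
import urllib.request

logger = logging.getLogger("articleplaceholder.search")

# Hypothetical counters standing in for a real metrics backend.
stats = {"success": 0, "failure": 0}


def build_search_url(api_url, term, language, limit=10):
    """Build a wbsearchentities request URL for the given repo API endpoint."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def fetch_json(url, timeout):
    """Fetch a URL and decode its JSON body; raises on network or parse errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def search_entities(api_url, term, language, timeout=2.0, fetch=fetch_json):
    """Return matching entity IDs, or an empty list if the API call fails."""
    url = build_search_url(api_url, term, language)
    try:
        data = fetch(url, timeout)
        ids = [hit["id"] for hit in data["search"]]
    except (OSError, ValueError, KeyError, TypeError,
            http.client.HTTPException) as error:
        # Fail gracefully: count and log the failure, return no results.
        stats["failure"] += 1
        logger.warning("wbsearchentities failed for %r: %s", term, error)
        return []
    stats["success"] += 1
    return ids


if __name__ == "__main__":
    # Placeholder endpoint; point this at a real repo api.php to try it out.
    print(search_entities("https://example.org/w/api.php", "Douglas Adams", "en"))
```

Passing the fetcher in as a parameter keeps the timeout and failure handling testable without network access; a caller that gets an empty list back can simply fall back to the normal search results.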

Pointers

This would involve replacing the search interactor with a new interactor / service that calls the API instead: https://github.com/wikimedia/mediawiki-extensions-ArticlePlaceholder/blob/master/includes/SearchHookHandler.php#L72

Event Timeline

Addshore renamed this task from Migrate search integration of article place holder to elastic to Migrate search integration of ArticlePlaceholder to elastic. · Aug 19 2018, 10:47 AM
Addshore updated the task description.
Addshore moved this task from Backlog to Ready on the wikidata-tech-focus board.
Addshore added a subscriber: Legoktm.
Addshore updated the task description.
Addshore set the point value for this task to 8.
Addshore moved this task from Incoming to In Progress on the Wikidata-Campsite board.
Addshore moved this task from incoming to in progress on the Wikidata board. · Sep 18 2018, 2:25 PM

I'm giving this a try

Restricted Application added a project: User-Ladsgroup. · Sep 25 2018, 11:46 AM

Change 462760 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/ArticlePlaceholder@master] Add API-based search integration

https://gerrit.wikimedia.org/r/462760

Change 462760 merged by jenkins-bot:
[mediawiki/extensions/ArticlePlaceholder@master] Add API-based search integration

https://gerrit.wikimedia.org/r/462760

Ladsgroup moved this task from Incoming to Done on the User-Ladsgroup board. · Oct 2 2018, 8:20 PM

We need to enable it first and then change the repo url (in a test wiki) and with that, we should get different results. That's the easiest way we can test it.

Addshore lowered the priority of this task from High to Normal. · Oct 9 2018, 1:44 PM
Salgo60 added a subscriber: Salgo60. · Oct 9 2018, 2:18 PM

Will this work with API:Search? That is, if I add a language that has ArticlePlaceholder activated, will I then be able to search it?

Addshore closed this task as Resolved.

We verified this on the beta cluster.
It will be deployed as part of T195751.