
Create an API for fulltext search across all wikis
Closed, Resolved
Public

Description

Summary
As requested, "an API for that across-all-wikis fulltext search yet? Maybe just on Toolforge?"

https://twitter.com/MagnusManske/status/928270803833155586

Description
Provide an API that would allow users to perform a full text search across all Wikimedia wikis. Due to the cost, it could perhaps be extremely rate limited or limited to users who request access.

When asked about the suggestion, a search engineer said the following:

"we asked for a budget to do it on toolforge. It wasn't approved.
well, it was approved in the sense of "we won't give you money for it, but if there's any left over then maybe"
in theory we could set up a ridiculously rate limited api for it without much work against the prod clusters.
(and also limiting keywords aggressively, like no regex)"
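
To illustrate the kind of gate described in the quote above (a very strict rate limit plus a ban on regex keywords), here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `TokenBucket` and `check_request` names, the choice of a token bucket, and the pattern used to detect regex keywords (modelled on the CirrusSearch `insource:/.../` and `intitle:/.../` syntax). It is not the Search team's implementation.

```python
import re
import time

# Assumption: regex searches are recognised by the CirrusSearch-style
# keywords insource:/.../ and intitle:/.../ .
REGEX_KEYWORD = re.compile(r"\b(?:insource|intitle):\s*/", re.IGNORECASE)


class TokenBucket:
    """Hypothetical limiter: bursts of up to `capacity` requests,
    refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        # Refill according to elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def check_request(query, bucket):
    """Return (accepted, reason) for a cross-wiki search request."""
    # Reject expensive regex queries before spending a token on them.
    if REGEX_KEYWORD.search(query):
        return False, "regex keywords are not allowed"
    if not bucket.allow():
        return False, "rate limit exceeded"
    return True, "ok"
```

For example, `TokenBucket(rate=1 / 60, capacity=1)` would allow roughly one accepted query per minute; the real limits would depend on what the production clusters can absorb.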

Event Timeline

debt moved this task from needs triage to search-icebox on the Discovery-Search board.
debt subscribed.

If this were in production, it would mean searching across all shards, which is incredibly expensive. We need to take into account operational concerns, available servers, money, etc. to see if we can take this on, and then consider how many people would actually use it.

MPhamWMF subscribed.

Closing out low/lowest priority tasks over 6 months old with no activity within the last 6 months in order to clean out the backlog of tickets we will not be addressing in the near term. Please feel free to reopen if you think a ticket is important, but bear in mind that given current priorities and resourcing, it is unlikely that the Search team will pick up these tasks in the foreseeable future. We hope that the requested changes have either been addressed by or made irrelevant by work the team has done or is doing -- e.g. upgrading Elasticsearch to a newer version will solve various ES-related problems -- or will be subsumed by future work in a more generalized way.

RhinosF1 removed a project: Discovery-Search.
RhinosF1 subscribed.

Re-opening tasks and removing from team workboard per IRC feedback given yesterday and discussion with MPham.

Gehel claimed this task.
Gehel subscribed.

Cloudelastic is covering this requirement.