Look into producing a list of frequent 'zero result' search terms on Wikimedia projects
From a community member: "One of the easy wins in search is to publish lists of popular search terms that don't currently have an obvious wikipedia article. In some case people will be able to create redirects to resolve them."

I believe the completion suggester helps with this in some capacity.

However, a list of common search queries that return zero results could help editors and search engineers work out why certain queries find no matches and how the search results might be improved.

This task aims to identify the technical, legal, and privacy concerns related to creating such a list.

A few initial questions:

  • Is it feasible to create a useful list of top queries with zero results for a wiki?
  • What are the technology, privacy, and security concerns?
  • How difficult would it be to automate something like this?


One possible outcome is a "No": the concerns and technical obstacles are insurmountable.

The other possible outcome is a "Yes", with a clear understanding of the resources it would take to accomplish. If it is something the Discovery team wishes to take on, a plan for implementation would then be pursued.

Privacy: we don't want to reveal any private information by accident. People can accidentally copy/paste sensitive information into the search box, and that text would then be included in any published list or index.
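
One way the privacy concern above might be reduced is to publish a query only when several distinct users have issued it, and to drop queries that look like pasted personal data. The following is a minimal sketch of that idea, not an existing Wikimedia tool: the record format `(query, user_token, hit_count)`, the `min_users` threshold, and the `SENSITIVE` pattern are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern for obviously sensitive input (long digit runs, e-mail
# addresses). A real filter would need far more care than this.
SENSITIVE = re.compile(r"\d{6,}|\S+@\S+\.\S+")


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())


def top_zero_result_queries(records, min_users=10, limit=100):
    """Return (query, distinct_user_count) pairs for zero-result queries.

    records: iterable of (query, user_token, hit_count) tuples. This log
    format is assumed for the example.
    """
    users_by_query = defaultdict(set)
    for query, user_token, hit_count in records:
        if hit_count != 0:
            continue  # only queries that returned nothing
        q = normalize(query)
        if not q or SENSITIVE.search(q):
            continue  # skip empty or sensitive-looking queries
        users_by_query[q].add(user_token)

    # Keep only queries issued by at least min_users distinct users.
    counts = {q: len(u) for q, u in users_by_query.items() if len(u) >= min_users}
    # Most users first, ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
```

Counting distinct users rather than raw query volume means a single person or bot repeating one query cannot push it onto the list. A threshold like this does not remove the privacy risk, and it does nothing about the quality of what remains.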

I've heard this asked by a few folks in the community as a way of identifying opportunities to create new articles or reword/redirect popular terms to wiki articles.

@TJones investigated the top unsuccessful search queries. The results show that generating a list is difficult, and that any resulting list would be of low value.

"I think the problem with all of these strategies is that so many high-frequency queries would be eliminated by any of them that any useful mining would be down to slogging through the low-impact long tail."

