
Help panel: potential degradation in search results
Closed, Invalid, Public

Description

As I was demoing the help panel in Test Wiki yesterday, I noticed that search results seemed much worse than I had previously experienced. Just now, I tried the same search terms I used on Dec 19 and got very different results. When I ran them through Special:Search on English Wikipedia, I got the good results from Dec 19. So I'm wondering whether something is wrong with Test Wiki's implementation, or whether there is a regression across the board.

Today's results from help panel in Test Wiki:

  • "how to change picture size"
    • Help:Wiki markup
    • Wikipedia:Requested articles/Applied arts and sciences/Computer science, computing, and Internet
    • [no more results]
  • "edit got deleted"
    • Wikipedia:Huggle/admin
    • Wikipedia:Requests/Permissions/Jamietw
    • Wikipedia:Requests/Permissions/TBloemink
  • "can i cite a documentary"
    • [no results]
  • "text i wrote disappeared"
    • [no results]

Original results from T209301

I just spent some time playing with the search API to get a sense of the feasibility of this idea with our current search capabilities. I am concerned that search results won't be good enough for this feature to be useful, and I hope we can discuss. The two main problems I see are:

  • The search includes "question words" in what it's searching. So if I search "how to add a photo", it prioritizes results that include "how" and "to". I wonder if there's a way to consider those to be stop words.
  • There are a lot of pages in the Wikipedia and Help namespaces that definitely do not contain helpful information. For instance, if I search "how to change picture size", I get several pages about "featured picture" candidates and "picture of the day". Maybe @Trizek-WMF's idea of using categories could help if we encouraged communities to apply a category to all pages that could be helpful to a newbie, and then only those would be included in the search.
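
A minimal sketch of the two ideas above, assuming the query is pre-processed on the client before it reaches the search API. The `QUESTION_WORDS` set, the `build_help_query` helper, and the category name in the usage note are hypothetical and only illustrate the approach; `incategory:` is an existing CirrusSearch keyword, but nothing here reflects how the help panel is actually implemented.

```python
# Hypothetical set of "question words" to drop; not an official stop-word list.
QUESTION_WORDS = {"how", "to", "can", "i", "do", "what", "why", "where", "when"}


def build_help_query(user_text, category=None):
    """Drop question words and optionally restrict results to one category."""
    kept = [w for w in user_text.lower().split() if w not in QUESTION_WORDS]
    # Fall back to the original text if every word was filtered out.
    query = " ".join(kept) if kept else user_text
    if category:
        # incategory: is a CirrusSearch keyword; the category itself is an assumption.
        query += f' incategory:"{category}"'
    return query
```

For example, `build_help_query("how to change picture size", category="Help for newcomers")` produces `change picture size incategory:"Help for newcomers"`, where "Help for newcomers" stands in for whatever category a community would choose.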

Also -- what are our expectations around how search performs in other languages? Is it known to perform similarly?

Here are some of the results I got when searching both the Wikipedia and Help namespaces (top three results for each). All top results came from the Wikipedia space:

  • "how to change picture size"
    • Wikipedia:How to improve image quality
    • Wikipedia:How to create charts for Wikipedia articles
    • Wikipedia:How to draw SVG circuits using Xcircuit
  • "edit got deleted"
    • Wikipedia:So your article got deleted
    • Wikipedia:Don't delete the main page
    • Wikipedia:Lamest edit wars
  • "can i cite a documentary"
    • Wikipedia:Articles for deletion/"web documentary"
    • Wikipedia:Peer review/Lord of the Universe (documentary)/archive1
    • Wikipedia:Articles for deletion/Bloody Island (documentary)
  • "text i wrote disappeared"
    • Wikipedia:WikiProject Articles for creation/Help desk/Archives/2018 February 3
    • Wikipedia:WikiProject Military history/Assessment/Battle of Ostrach
    • Wikipedia:WikiProject Articles for creation/Help desk/Archives/2015 February 5

And here are the results when just restricting to the Help namespace:

  • "how to change picture size"
    • Help:Infobox picture
    • Help:Pictures
    • Help:Files
  • "edit got deleted"
    • Help:Wikipedia: The Missing Manual/Editing, creating, and maintaining articles/Editing for the first time
    • Help:Wikipedia: The Missing Manual/Editing, creating, and maintaining articles/Documenting your sources
    • Help:Wikipedia: The Missing Manual/Customizing Wikipedia/Easier editing with JavaScript
  • "can i cite a documentary"
    • Help:Maintenance template removal
    • (no other results)
  • "text i wrote disappeared"
    • (no results)
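
The two searches above (Wikipedia plus Help namespaces, then Help only) could be reproduced with a request along these lines. This is a sketch, not the exact calls used for the results listed here: the `search_url` helper is hypothetical, while `list=search`, `srsearch`, `srnamespace`, and `srlimit` are standard MediaWiki Action API parameters, and 4 and 12 are the standard IDs for the project ("Wikipedia:") and "Help:" namespaces.

```python
from urllib.parse import urlencode

# Standard MediaWiki namespace IDs.
NS_PROJECT = 4   # "Wikipedia:" on Wikipedia wikis
NS_HELP = 12     # "Help:"


def search_url(term, namespaces, limit=3, api="https://en.wikipedia.org/w/api.php"):
    """Build a full-text search URL restricted to the given namespaces."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        # Multiple namespaces are separated by a pipe character.
        "srnamespace": "|".join(str(n) for n in namespaces),
        "srlimit": limit,
        "format": "json",
    }
    return f"{api}?{urlencode(params)}"


both_namespaces = search_url("how to change picture size", [NS_PROJECT, NS_HELP])
help_only = search_url("how to change picture size", [NS_HELP])
```

Fetching either URL returns JSON whose `query.search` list holds the matching page titles, so the top three results per namespace set can be compared side by side.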

Event Timeline

EBernhardson added a comment. Edited Mar 5 2019, 9:26 PM

Honestly, I'm a bit surprised you ever got reasonable results from testwiki; the content that exists there is very incomplete. I checked the top 3 pages you reported for "how to change picture size" and none of them exist on testwiki:

https://test.wikipedia.org/wiki/Wikipedia:How_to_improve_image_quality
https://test.wikipedia.org/wiki/Wikipedia:How_to_create_charts_for_Wikipedia_articles
https://test.wikipedia.org/wiki/Wikipedia:How_to_draw_SVG_circuits_using_Xcircuit
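
A check like this can also be done in bulk through the Action API rather than by opening each URL. The sketch below assumes a request with `action=query`, the titles joined by `|` in `titles`, and `formatversion=2`, in which case each nonexistent page carries `"missing": true`. The `missing_titles` helper is hypothetical, and `sample_response` is a hand-written illustration of the response shape, not captured output.

```python
def missing_titles(api_response):
    """Return titles flagged as missing in an action=query response (formatversion=2)."""
    pages = api_response.get("query", {}).get("pages", [])
    return [page["title"] for page in pages if page.get("missing")]


# Hand-written sample in the formatversion=2 shape; not real API output.
sample_response = {
    "query": {
        "pages": [
            {"title": "Wikipedia:How to improve image quality", "missing": True},
            {"pageid": 1, "ns": 0, "title": "Main Page"},
        ]
    }
}

not_found = missing_titles(sample_response)
```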

If you want a better testwiki demo, maybe there could be a config somewhere that tells it to query enwiki when running on testwiki?

Is it possible that the initial results were obtained in betalabs (which is configured to search on en.wikipedia.org) and the new results are from testwiki (which is configured to search itself)?
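
The difference between the two environments, and the config suggested above, can be pictured as a per-wiki lookup. This is only a sketch: `SEARCH_TARGET_BY_WIKI` and `search_api_for` are hypothetical names, not the help panel's actual configuration, and the mapping simply encodes what is described in this thread (betalabs searches en.wikipedia.org, testwiki searches itself).

```python
# Hypothetical per-wiki setting: which wiki's search API the help panel queries.
# None means "search the local wiki".
SEARCH_TARGET_BY_WIKI = {
    "betalabs": "https://en.wikipedia.org/w/api.php",
    "testwiki": None,
}


def search_api_for(wiki, local_api):
    """Return the configured remote search API, or the local one if none is set."""
    target = SEARCH_TARGET_BY_WIKI.get(wiki)
    return target or local_api
```

Under this sketch, pointing the `"testwiki"` entry at the en.wikipedia.org API would make a testwiki demo return the same results as betalabs.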

MMiller_WMF closed this task as Invalid. Mar 6 2019, 12:02 AM

@EBernhardson @SBisson -- you're right. I was using Test Wiki when I should have been using Beta. I forgot which one was hooked up to the en.wikipedia.org search results. Sorry about this -- I'm closing as invalid. Thank you for taking me seriously.