
Analyze Speaker-Reviewed M2 Data for Chinese
Closed, Resolved | Public | 3 Estimated Story Points


User Story: As a search developer, I want to have some idea of how Glent M2 is performing on Chinese queries so that I can make informed decisions about next steps (such as making improvements to Glent or running an A/B test).

Notes: The data has been reviewed by the speaker but the final analysis hasn't been done yet. The goal of the parent task was to do one or more of the CJK languages, and Japanese and Korean were done and the analysis reviewed. We still need to do Chinese and decide what to do there.

Acceptance Criteria:

  • Write-up is completed and reviewed by at least one other search engineer.
  • Next steps are decided (make improvements, do an A/B test, etc.) based on recommendations.

Event Timeline

TJones renamed this task from Review Chinese M2 Data to Analyze Speaker-Reviewed M2 Data for Chinese. Nov 16 2020, 8:49 PM
TJones updated the task description.
CBogen triaged this task as High priority. Nov 23 2020, 6:35 PM
CBogen moved this task from needs triage to Language Stuff on the Discovery-Search board.
MPhamWMF set the point value for this task to 3. Feb 8 2021, 4:45 PM

Summary: The stats for Glent M2 suggestions for Chinese are roughly similar to those for Korean and Japanese. A big difference is that a fair number of suggestions are traditional-to-simplified conversions. These were rated as good suggestions but probably don't make much difference in search results, since we already do traditional-to-simplified conversion for indexing and searching behind the scenes. It is possible, though, that Glent's traditional-to-simplified conversion is better for these queries than our rule-based one for searching and indexing.

Chinese production DYM also has very few suggestions for actual Chinese text (as opposed to suggestions for Latin/English text)—only ~5% of suggestions are Chinese.

As before, the new Glent M2 suggestions are largely orthogonal to the production DYM suggestions, so an A/B test seems like the right next step—though not until T265081 is done.

The full write-up is on MediaWiki.