TJones (Trey Jones)
Sr. Software Engineer, Search Platform Team

User Details

User Since
Jul 8 2015, 3:02 PM (175 w, 6 d)
Availability
Available
IRC Nick
Trey314159
LDAP User
Tjones
MediaWiki User
TJones (WMF)

I would have written a shorter comment, but I did not have the time.

I'm part of the Search Platform team and I spend my time working on search & relevance, trying to better support search in various languages, analyzing queries, and doing random mathy things. I tend to write long, detailed notes about my investigations (so as to improve the bus number of my work).

When I have to work on _GitHub,_ /‍‍/Phab,/‍‍/ and ''MediaWiki'' all on the same day, I sometimes suffer Severe Markup Incongruence Fatigue.

I � Unicode.

Recent Activity

Yesterday

TJones added a comment to T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias.

I've left some comments on the discussion page for the DWIM Gadget suggesting changes that should improve the performance of the gadget and make it work on the main search field on Special:Search.
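
For readers unfamiliar with the "wrong keyboard" problem named in T138958, here is a minimal sketch of the idea. It is not the DWIM gadget's code, and the helper name `swap_layout` is hypothetical. It assumes the standard US QWERTY and Russian ЙЦУКЕН layouts and simply re-types each character as the one found on the same physical key of the other layout.

```python
# Illustrative only: NOT the DWIM gadget's implementation.
# Characters on a US QWERTY layout, paired position by position with the
# letters on the same physical keys of the standard Russian layout.
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбю"

LATIN_TO_CYRILLIC = str.maketrans(QWERTY, JCUKEN)
CYRILLIC_TO_LATIN = str.maketrans(JCUKEN, QWERTY)


def swap_layout(query: str) -> str:
    """Re-type a query as if the other keyboard layout had been active.

    Hypothetical helper: if the query contains any Cyrillic letter it is
    mapped to Latin keys, otherwise it is mapped to Cyrillic keys.
    """
    lowered = query.lower()
    if any(ch in JCUKEN for ch in lowered):
        return lowered.translate(CYRILLIC_TO_LATIN)
    return lowered.translate(LATIN_TO_CYRILLIC)


print(swap_layout("ghbdtn"))     # привет
print(swap_layout("цшлшзувшф"))  # wikipedia
```

A real detector would also need to decide when to offer the swapped query at all (for example, only when the original gets few or no results); that policy is outside this sketch.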

Tue, Nov 20, 9:32 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T155104: Detect "wrong keyboard" queries for Hebrew/American keyboards on EN/HE Wikipedias.

Related to this, @Amire80 was able to update the Hebrew DWIM gadget for me so it would work (or work again, I think) on the search box on the Special:Search page, so we effectively have wrong-keyboard detection for the completion suggester in the upper corner, and in the main search input on the search page.

Tue, Nov 20, 9:20 PM · Discovery-Search, Discovery

Sat, Nov 17

TJones updated the task description for T147505: [Recurring task] CirrusSearch: what is updated during re-indexing.
Sat, Nov 17, 4:18 PM · Discovery-Search (Current work), Discovery
TJones moved T209156: Re-index Chinese Wikis to fix Surrogate Split from In progress to Done on the Discovery-Search (Current work) board.

All done!

Sat, Nov 17, 4:17 PM · Discovery-Search (Current work), Chinese-Sites, Discovery
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Waiting/Blocked to Done on the Discovery-Search (Current work) board.

Reindexing for the live search cluster (eqiad) is complete, and the example link now gives 46 results instead of ~94K. The spare cluster (codfw) is still running, so I won't move the re-indexing task to done until it has finished.

Sat, Nov 17, 3:27 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery

Fri, Nov 16

TJones renamed T209156: Re-index Chinese Wikis to fix Surrogate Split from Re-index Chinese Wikis to Re-index Chinese Wikis to fix Surrogate Split.
Fri, Nov 16, 8:25 PM · Discovery-Search (Current work), Chinese-Sites, Discovery
TJones added a comment to T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].

Almost done. Reindexing (T209156) is still in progress. The smaller wikis (Wikiversity, Wikiquote, Wikibooks, Wikivoyage, Wikinews) are done and checked and everything looks good so far. Wikisource, Wiktionary, and Wikipedia are still processing.

Fri, Nov 16, 8:11 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones moved T209156: Re-index Chinese Wikis to fix Surrogate Split from Backlog to In progress on the Discovery-Search (Current work) board.
Fri, Nov 16, 3:48 PM · Discovery-Search (Current work), Chinese-Sites, Discovery

Wed, Nov 14

TJones triaged T209537: Review a few more current metrics for accuracy as Normal priority.
Wed, Nov 14, 8:56 PM · Product-Analytics, Discovery-Search (Current work)
TJones moved T197128: Review current search metrics for accuracy and documentation from In progress to Done on the Discovery-Search (Current work) board.

I think we can close this ticket. The general consensus seems to be that the "anomalies" are mostly not errors and are just unexplained variation in usage patterns, which we don't necessarily need to track down. (There's one that still bothers me, though.)

Wed, Nov 14, 8:50 PM · Patch-For-Review, Product-Analytics, Discovery-Search (Current work)

Tue, Nov 13

TJones placed T178923: Review Japanese Morphological Libraries up for grabs.
Tue, Nov 13, 6:51 PM · Discovery-Search, Discovery
TJones placed T178924: Review Vietnamese Morphological Libraries up for grabs.
Tue, Nov 13, 6:51 PM · Discovery-Search, Discovery
TJones moved T177888: Review use of CJK vs ICU default language analyzers for "Chinese" Wikis from Tech Debt/Misc to Later on the Discovery-Search board.
Tue, Nov 13, 6:51 PM · Chinese-Sites, Discovery-Search
TJones moved T185721: Null or inconsistent search results using Khmer script from Up Next to Later on the Discovery-Search board.
Tue, Nov 13, 6:47 PM · Discovery, CirrusSearch, Discovery-Search
TJones moved T186401: searching I as 1 in Kabardian Wikipedia from Up Next to Later on the Discovery-Search board.
Tue, Nov 13, 6:47 PM · Discovery-Search, Elasticsearch, Discovery, I18n
TJones moved T203117: Greek language analysis generates unexpected empty tokens from Up Next to Later on the Discovery-Search board.
Tue, Nov 13, 6:47 PM · Discovery-Search
TJones added a comment to T178923: Review Japanese Morphological Libraries.

We've moved on to other tasks and aren't spending time looking at morphological libraries these days.

Tue, Nov 13, 6:45 PM · Discovery-Search, Discovery
TJones added a comment to T178924: Review Vietnamese Morphological Libraries.

We've moved on to other tasks and aren't spending time looking at morphological libraries these days.

Tue, Nov 13, 6:40 PM · Discovery-Search, Discovery
TJones moved T178924: Review Vietnamese Morphological Libraries from Up Next to Later on the Discovery-Search board.
Tue, Nov 13, 6:33 PM · Discovery-Search, Discovery
TJones moved T178923: Review Japanese Morphological Libraries from Up Next to Later on the Discovery-Search board.
Tue, Nov 13, 6:33 PM · Discovery-Search, Discovery
TJones moved T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias from Backlog to In progress on the Discovery-Search (Current work) board.
Tue, Nov 13, 6:21 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T209348: Port the elasticsearch plugin extra-analysis-surrogates to 6.4.2 as a noop plugin.

Is this necessary? If we have to reindex everything when we upgrade to ES 6, then we should be okay, because the analysis config builder checks for the presence of extra-analysis-surrogates and will configure itself correctly without it. Or is there some interim stage of the upgrade where it needs to exist?

Tue, Nov 13, 3:12 PM · Discovery-Search (Current work)

Fri, Nov 9

TJones added a comment to T209156: Re-index Chinese Wikis to fix Surrogate Split.

Sorry, @Liuxinyu970226. I noticed that "Re-index Chinese Wikis" was recurring in T147505. "3rd", etc. seem just as arbitrary, and keeping count accurately could be hard. What about renaming this task "Re-index Chinese Wikis to fix T168427" or "Re-index Chinese Wikis to fix Surrogate Split" or something else that points to what the reindexing will enable?

Fri, Nov 9, 3:22 PM · Discovery-Search (Current work), Chinese-Sites, Discovery
TJones updated the task description for T147505: [Recurring task] CirrusSearch: what is updated during re-indexing.
Fri, Nov 9, 3:08 PM · Discovery-Search (Current work), Discovery
TJones triaged T209156: Re-index Chinese Wikis to fix Surrogate Split as Normal priority.
Fri, Nov 9, 3:07 PM · Discovery-Search (Current work), Chinese-Sites, Discovery
TJones raised the priority of T209155: Deploy extra-analysis-surrogates & the experimental highlighter 5.5.2.4 to production from Low to Normal.
Fri, Nov 9, 3:05 PM · Discovery-Search (Current work), Chinese-Sites, Discovery
TJones triaged T209155: Deploy extra-analysis-surrogates & the experimental highlighter 5.5.2.4 to production as Low priority.
Fri, Nov 9, 3:04 PM · Discovery-Search (Current work), Chinese-Sites, Discovery
TJones renamed T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Characters in CJK extension C treated as U+FFFD when searching on zhWP to Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].
Fri, Nov 9, 3:01 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Backlog to Waiting/Blocked on the Discovery-Search (Current work) board.

I should have treated this ticket as an epic and created sub-tasks for it. The first part of the work (creating the plugin to re-merge surrogate pairs and setting up the config to use the new plugin) was done on this ticket and is complete, but there is more to do. I don't want to close this ticket because the problem isn't solved yet, but the work I was doing here is done. So, after flailing around on the workboard a bit, I've moved it to Waiting, and I'll open sub-tasks for the remaining related tasks.
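
As background on what "re-merging surrogate pairs" means, here is a minimal sketch of the UTF-16 arithmetic involved. It is not the plugin's code, and the function names are hypothetical. Characters in CJK Extension C lie above U+FFFF, so UTF-16 stores each one as a high and a low surrogate; the sketch assumes that a surrogate left without its partner is replaced by U+FFFD, which matches the symptom in the ticket title.

```python
from typing import List, Tuple

# Illustrative only: NOT the extra-analysis-surrogates plugin's code.


def split_surrogates(code_point: int) -> Tuple[int, int]:
    """Encode a supplementary-plane code point as a UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def merge_surrogates(high: int, low: int) -> int:
    """Decode a high/low surrogate pair back into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def remerge(units: List[int]) -> str:
    """Turn UTF-16 code units into text.

    Valid pairs are merged into one character; a surrogate without its
    partner becomes U+FFFD (the replacement character).
    """
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        has_next = i + 1 < len(units)
        if 0xD800 <= unit <= 0xDBFF and has_next and 0xDC00 <= units[i + 1] <= 0xDFFF:
            out.append(chr(merge_surrogates(unit, units[i + 1])))
            i += 2
        elif 0xD800 <= unit <= 0xDFFF:
            out.append("\ufffd")
            i += 1
        else:
            out.append(chr(unit))
            i += 1
    return "".join(out)


high, low = split_surrogates(0x2A700)  # first code point of CJK Extension C
print(hex(high), hex(low))             # 0xd869 0xdf00
print(remerge([high, low]) == chr(0x2A700))  # True
print(remerge([high]) == "\ufffd")           # True: a split pair is lost
```

If every split half is indexed as U+FFFD, then all such characters look identical to the search engine, which is one plausible reading of why a query could match far too many pages before the fix.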

Fri, Nov 9, 3:00 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Done to Backlog on the Discovery-Search (Current work) board.
Fri, Nov 9, 2:55 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Needs review to Done on the Discovery-Search (Current work) board.
Fri, Nov 9, 2:54 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery

Wed, Nov 7

TJones added a comment to T208917: Build pipeline to transform elastic explains into feature vectors and a tf graph.

Very interesting stuff. Thanks for sharing the numbers. A few things come to mind.

Wed, Nov 7, 10:31 PM · Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].

Change 471204 merged by jenkins-bot:
https://gerrit.wikimedia.org/r/471204

Wed, Nov 7, 8:02 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery

Tue, Nov 6

TJones added a comment to T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].

I ran a quick analysis of the effect of the analysis chain change on the indexing results. There isn't much to report, and nothing surprising, so I'm not going to do a full write-up. I compared before and after the surrogate merging on 10,000 Wikipedia articles (out of ~1M) and 10,000 Wiktionary articles (out of ~800K).

Tue, Nov 6, 7:45 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery

Fri, Nov 2

TJones added a comment to T208496: search platform maven projects failing post merge build.

Thanks for hunting this down, @Gehel!

Fri, Nov 2, 6:47 PM · Patch-For-Review, Release-Engineering-Team, Continuous-Integration-Config, Discovery-Search (Current work)
TJones awarded T208496: search platform maven projects failing post merge build a Like token.
Fri, Nov 2, 6:46 PM · Patch-For-Review, Release-Engineering-Team, Continuous-Integration-Config, Discovery-Search (Current work)
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from In progress to Needs review on the Discovery-Search (Current work) board.
Fri, Nov 2, 4:29 AM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery

Wed, Oct 31

TJones moved T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias from In progress to Backlog on the Discovery-Search (Current work) board.
Wed, Oct 31, 7:28 PM · Discovery-Search (Current work), Discovery
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Backlog to In progress on the Discovery-Search (Current work) board.
Wed, Oct 31, 7:28 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones claimed T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].
Wed, Oct 31, 7:28 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones moved T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish] from Up Next to Current work on the Discovery-Search board.
Wed, Oct 31, 7:27 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones updated subscribers of T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].

Things have become a bit more complicated—as they are wont to do!

Wed, Oct 31, 7:27 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery
TJones updated the task description for T168427: Characters in CJK extension C treated as U+FFFD when searching on zhWP [EPIC-ish].
Wed, Oct 31, 7:03 PM · Epic, MW-1.33-notes (1.33.0-wmf.4; 2018-11-13), Patch-For-Review, Discovery-Search (Current work), Chinese-Sites, Discovery

Mon, Oct 29

TJones added a comment to T205348: Calculate autocomplete examination probabilities from eventlogging data.

I see (sort of)—I misunderstood what the stats were measuring. I thought it was the percentage of what was clicked given that a click occurred (in which case they should sum to 1). I'll have to digest your description and think it through again. Thanks!

Mon, Oct 29, 4:16 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)

Thu, Oct 25

TJones added a comment to T205348: Calculate autocomplete examination probabilities from eventlogging data.

Sorry for catching up late.

Thu, Oct 25, 9:32 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T197128: Review current search metrics for accuracy and documentation.

Sounds good, @chelsyx —thanks for the updates and all the fixes!

Thu, Oct 25, 8:50 PM · Patch-For-Review, Product-Analytics, Discovery-Search (Current work)
TJones added a comment to T204688: cloudvps: shiny-r project trusty deprecation.

All instances in the shiny-r project need to upgrade as soon as possible.

Thu, Oct 25, 8:48 PM · Cloud-VPS (Ubuntu Trusty Deprecation)

Wed, Oct 24

TJones added a comment to T204686: cloudvps: search project trusty deprecation.

I just logged into rel.search.eqiad.wmflabs, and @EBernhardson seems to have something there other than the default home directory, so maybe he knows what it's for and who's responsible for it. He's out this week, though.

Wed, Oct 24, 8:32 PM · Discovery-Search (Current work), User-Smalyshev, Cloud-VPS (Ubuntu Trusty Deprecation)

Oct 19 2018

TJones added a comment to T182965: Change info message to adapt for stemming not being applicable in all languages.

Is there some canonical place where a list of which languages use stemmers should be documented? Since this ticket was opened, we've added stemmers to Esperanto, Malay, Bosnian, Croatian, Serbian, Serbo-Croatian, and Slovak. Korean has been approved for adding stemming, but it's waiting on software upgrades.

Oct 19 2018, 6:38 PM · WMDE-Design, Design, Advanced-Search, TCB-Team

Oct 16 2018

TJones claimed T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias.
Oct 16 2018, 5:22 PM · Discovery-Search (Current work), Discovery
TJones moved T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias from Backlog to In progress on the Discovery-Search (Current work) board.
Oct 16 2018, 5:20 PM · Discovery-Search (Current work), Discovery
TJones edited projects for T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias, added: Discovery-Search (Current work); removed Discovery-Search.
Oct 16 2018, 5:20 PM · Discovery-Search (Current work), Discovery

Oct 12 2018

TJones triaged T206874: Add Nori (Korean) configuration to AnalysisConfigBuilder as Normal priority.
Oct 12 2018, 6:52 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T178925: Review Korean Morphological Libraries.

Filtering out the problem parts of speech looks good, so this is ready to be built out in Analysis Config Builder, but we need to upgrade to ES 6.4.2 (or at least be able to build a test environment with all the usual-suspect plugins).

Oct 12 2018, 6:48 PM · Discovery-Search (Current work), Discovery
TJones moved T178925: Review Korean Morphological Libraries from In progress to Done on the Discovery-Search (Current work) board.
Oct 12 2018, 6:44 PM · Discovery-Search (Current work), Discovery

Oct 11 2018

TJones added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

Thanks for the data!

Oct 11 2018, 5:24 PM · Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T178925: Review Korean Morphological Libraries.

Indeed, @bmansurov, but your help is very much appreciated, too!

Oct 11 2018, 5:14 PM · Discovery-Search (Current work), Discovery

Oct 10 2018

TJones added a comment to T178925: Review Korean Morphological Libraries.

Speaker review is generally positive, but there are a couple of parts of speech that keep coming up as not really helpful, so I'm going to try filtering them and see what kind of diff that generates. This may or may not require another round of speaker review, depending on the impact.

Oct 10 2018, 2:35 PM · Discovery-Search (Current work), Discovery
TJones moved T178925: Review Korean Morphological Libraries from Needs review to In progress on the Discovery-Search (Current work) board.
Oct 10 2018, 2:33 PM · Discovery-Search (Current work), Discovery

Oct 5 2018

TJones added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

Are you evaluating MPC MRR based on re-ordering results "optimally" and scoring, which is indeed overfitted, or are you sorting results based on some data and evaluating on other data? My guess is that it would still do very well, because the most popular thing is going to be popular, but it could also be strongly overfitting on a longer tail that boosts the score a little here and a little there. It could also give a big boost to unique queries, which would always score perfectly, since there is no room for disagreement—and that long tail could make a big difference.
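
The concern above can be made concrete with a minimal sketch. It assumes MPC means a "most popular completion" ranker that orders candidates for a prefix by how often they were clicked, and that MRR is the mean of 1/rank of the clicked result. The function names and the tiny click logs are hypothetical and are not relforge code or real data.

```python
from collections import Counter, defaultdict

# Illustrative only: hypothetical data, NOT relforge code.


def build_mpc(log):
    """Build prefix -> titles ranked by click count from (prefix, title) pairs."""
    counts = defaultdict(Counter)
    for prefix, title in log:
        counts[prefix][title] += 1
    return {p: [t for t, _ in c.most_common()] for p, c in counts.items()}


def mrr(ranker, log):
    """Mean reciprocal rank of the clicked title; unseen items score 0."""
    total = 0.0
    for prefix, title in log:
        ranked = ranker.get(prefix, [])
        if title in ranked:
            total += 1.0 / (ranked.index(title) + 1)
    return total / len(log)


train = [("pa", "Paris"), ("pa", "Paris"), ("pa", "Panama"), ("zx", "Zxcvbn")]
held_out = [("pa", "Panama"), ("pa", "Paris"), ("qq", "Qqq")]

model = build_mpc(train)
print(mrr(model, train))     # 0.875: scored on the data it was built from
print(mrr(model, held_out))  # 0.5: scored on data it has not seen
```

In this toy log the one-off query "zx" contributes a perfect score when the ranker is evaluated on its own training data, while the unseen query "qq" contributes nothing on held-out data, which is the long-tail effect the comment describes.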

Oct 5 2018, 8:04 PM · Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.

Late to the party, but here's my 2¢ on the excellent discussion so far.

Oct 5 2018, 7:48 PM · Discovery-Search (Current work), Epic

Oct 4 2018

TJones added a comment to T178925: Review Korean Morphological Libraries.

@bmansurov, nothing to do differently on your end—that was a great review/analysis! I looked into the discrepancies to see what I could find, documented them for myself, and possibly to inform stuff you or anyone else looks at later, to see if any patterns of fixable or reportable problems emerge.

Oct 4 2018, 9:47 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T178925: Review Korean Morphological Libraries.

Thanks, Baha! I'm working on a reply right now, though it's taking a bit longer than expected. Your help is very much appreciated! (And I think you undersold your Korean skills on the language skills page.)

Oct 4 2018, 5:46 PM · Discovery-Search (Current work), Discovery

Oct 3 2018

TJones added a comment to T178925: Review Korean Morphological Libraries.

I've opened upstream tickets for Nori and CJK analyzers based on my analysis. (The Nori ticket got pushed back to Lucene.)

Oct 3 2018, 9:24 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T178925: Review Korean Morphological Libraries.

@bmansurov, if you have the time, I'd love for you to take a look! No one else has agreed to look yet, but even if they do, having multiple sets of eyes on it is a good thing. Thanks!

Oct 3 2018, 4:28 PM · Discovery-Search (Current work), Discovery

Oct 2 2018

TJones added a comment to T178925: Review Korean Morphological Libraries.

I've asked for help reviewing on the Korean Wikipedia Village Pump and the Korean Wiktionary discussion forum. I contacted five Wikipedians who have volunteered to help English speakers on Korean Wikipedia, and may contact a few more if I don't hear back from many of them. I also found contact info for the Elasticsearch engineer who wrote the blog post on the new Korean analyzer, so I've emailed him.

Oct 2 2018, 8:55 PM · Discovery-Search (Current work), Discovery
TJones awarded T205656: Convert relforge to a config format that supports nested structures a Like token.
Oct 2 2018, 3:10 PM · Discovery-Search

Sep 27 2018

revi awarded T178925: Review Korean Morphological Libraries a Like token.
Sep 27 2018, 7:03 PM · Discovery-Search (Current work), Discovery
TJones moved T178925: Review Korean Morphological Libraries from In progress to Needs review on the Discovery-Search (Current work) board.
Sep 27 2018, 6:51 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T178925: Review Korean Morphological Libraries.

First draft of my analysis of Nori is on MediaWiki.

Sep 27 2018, 6:51 PM · Discovery-Search (Current work), Discovery
TJones added a comment to T197128: Review current search metrics for accuracy and documentation.

For the anomalies, if there isn't time (yet, or at all) to investigate them individually, would it be possible to create a sampling tool that would let us look for obvious skews in the usual usage stats?

Sep 27 2018, 6:16 PM · Patch-For-Review, Product-Analytics, Discovery-Search (Current work)

Sep 25 2018

TJones renamed T204135: Warn when CirrusSearch is not configured to use local DC for an extended time from Warn when CirrusSearch is not configured to use local DCfor an extended time to Warn when CirrusSearch is not configured to use local DC for an extended time.
Sep 25 2018, 5:48 PM · Discovery-Search (Current work), Datacenter-Switchover-2018, Operations

Sep 20 2018

TJones added a comment to T204089: CirrusSearch: Add filter for exclusion of redirects or finding only them.

Note: this is a possible duplicate of T90807: Option to exclude redirection pages from search results, which was declined.

Sep 20 2018, 2:54 PM · Discovery-Search, Advanced-Search, TCB-Team, CirrusSearch

Sep 19 2018

TJones added a comment to T204868: Long single-token Unicode searches report regex error.

Another example in Korean that should get results:

Sep 19 2018, 7:18 PM · Discovery-Search
TJones created T204868: Long single-token Unicode searches report regex error.
Sep 19 2018, 7:14 PM · Discovery-Search

Sep 13 2018

TJones updated the task description for T147505: [Recurring task] CirrusSearch: what is updated during re-indexing.
Sep 13 2018, 5:08 PM · Discovery-Search (Current work), Discovery

Sep 12 2018

TJones moved T203005: Re-index Esperanto Wikis from In progress to Done on the Discovery-Search (Current work) board.
Sep 12 2018, 10:42 PM · Esperanto-Sites, I18n, Discovery-Search (Current work)
TJones added a comment to T203005: Re-index Esperanto Wikis.

All done, and basic tests on all Esperanto wikis show that stemming is working.

Sep 12 2018, 10:42 PM · Esperanto-Sites, I18n, Discovery-Search (Current work)
TJones added a comment to T203005: Re-index Esperanto Wikis.

Quick test on eowikibooks looks good. Re-indexing the rest.

Sep 12 2018, 6:30 PM · Esperanto-Sites, I18n, Discovery-Search (Current work)
TJones moved T203005: Re-index Esperanto Wikis from Waiting/Blocked to In progress on the Discovery-Search (Current work) board.
Sep 12 2018, 6:21 PM · Esperanto-Sites, I18n, Discovery-Search (Current work)

Sep 4 2018

TJones moved T178925: Review Korean Morphological Libraries from Backlog to In progress on the Discovery-Search (Current work) board.
Sep 4 2018, 7:39 PM · Discovery-Search (Current work), Discovery
TJones moved T178925: Review Korean Morphological Libraries from Up Next to Current work on the Discovery-Search board.
Sep 4 2018, 7:39 PM · Discovery-Search (Current work), Discovery
TJones moved T192502: Don't index empty strings caused by ICU Folding in Elasticsearch from Needs review to Done on the Discovery-Search (Current work) board.
Sep 4 2018, 7:38 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)

Aug 30 2018

TJones created T203199: Separate test coverage "increased" from "stayed the same".
Aug 30 2018, 9:10 PM · phpunit-patch-coverage
TJones moved T192502: Don't index empty strings caused by ICU Folding in Elasticsearch from In progress to Needs review on the Discovery-Search (Current work) board.
Aug 30 2018, 6:57 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T192502: Don't index empty strings caused by ICU Folding in Elasticsearch.

Patch incoming in a moment. Analysis across smallish samples from ~10 relevant Wikipedias shows little impact on most text, and no unintended consequences. Full write-up is on MediaWiki.

Aug 30 2018, 6:52 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)
TJones renamed T192502: Don't index empty strings caused by ICU Folding in Elasticsearch from Don't index empty strings in Elasticsearch to Don't index empty strings caused by ICU Folding in Elasticsearch.
Aug 30 2018, 3:59 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)

Aug 29 2018

TJones moved T192502: Don't index empty strings caused by ICU Folding in Elasticsearch from Backlog to In progress on the Discovery-Search (Current work) board.
Aug 29 2018, 9:21 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)
TJones added a project to T203117: Greek language analysis generates unexpected empty tokens: Discovery-Search.
Aug 29 2018, 9:03 PM · Discovery-Search
TJones created T203117: Greek language analysis generates unexpected empty tokens.
Aug 29 2018, 9:03 PM · Discovery-Search

Aug 28 2018

TJones claimed T192502: Don't index empty strings caused by ICU Folding in Elasticsearch.
Aug 28 2018, 8:33 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)
TJones moved T192502: Don't index empty strings caused by ICU Folding in Elasticsearch from Up Next to Current work on the Discovery-Search board.
Aug 28 2018, 8:32 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T73123: can't use incategory: or intitle: when category or title contains double quotes.

The new language analyzer has been in place for a while, and it doesn't make any difference for this issue.

Aug 28 2018, 8:17 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work), Discovery, TestMe, MediaWiki-Search
TJones moved T174705: Include phonetic search option to advanced / power user search from This Quarter to Later on the Discovery-Search board.
Aug 28 2018, 8:01 PM · Discovery-Search
TJones moved T184771: Set up RelForge test of phonetic title search from This Quarter to Later on the Discovery-Search board.
Aug 28 2018, 8:00 PM · Discovery-Search
TJones updated subscribers of T85514: Type-ahead search on en.wp: Entering "pass by" lists two entries but not the entry "pass-by-value" (with dashes).
Aug 28 2018, 4:49 PM · Discovery-Search (Current work), Discovery, CirrusSearch
TJones added a comment to T85514: Type-ahead search on en.wp: Entering "pass by" lists two entries but not the entry "pass-by-value" (with dashes).

@garegin16, can you check your search preferences? My guess is that you have either "strict mode" or "redirect mode" enabled. When I enable those, I only get the same two results as you. (Can you verify that you get 10 results in default mode, too, please? Thanks!)

Aug 28 2018, 4:48 PM · Discovery-Search (Current work), Discovery, CirrusSearch
TJones moved T203005: Re-index Esperanto Wikis from Backlog to Waiting/Blocked on the Discovery-Search (Current work) board.
Aug 28 2018, 4:25 PM · Esperanto-Sites, I18n, Discovery-Search (Current work)
TJones updated the task description for T147505: [Recurring task] CirrusSearch: what is updated during re-indexing.
Aug 28 2018, 4:24 PM · Discovery-Search (Current work), Discovery
TJones triaged T203005: Re-index Esperanto Wikis as Normal priority.
Aug 28 2018, 4:23 PM · Esperanto-Sites, I18n, Discovery-Search (Current work)