TJones (Trey Jones)
Sr. Software Engineer, Search Platform Team

User Details

User Since: Jul 8 2015, 3:02 PM (132 w, 2 d)
Availability: Available
IRC Nick: Trey314159
LDAP User: Tjones
MediaWiki User: TJones (WMF)

I would have written a shorter comment, but I did not have the time.

I'm part of the Search Platform team and I spend my time working on search & relevance, trying to better support search in various languages, analyzing queries, and doing random mathy things. I tend to write long, detailed notes about my investigations (so as to improve the bus number of my work).

When I have to work on _GitHub,_ /‍‍/Phab,/‍‍/ and ''MediaWiki'' all on the same day, I sometimes suffer Severe Markup Incongruence Fatigue.

I � Unicode.

Recent Activity

Thu, Jan 18

TJones added a comment to T23582: Transliteration of Crimean Wiki.

Unfortunately, the config used to enable the transliteration didn't quite work in the test environment during today's SWAT. The transliteration was enabled, but clicking on the link caused a "page not found" error. Manually altering the URL to the correct syntax did work (the main page came up in Cyrillic!), so it's the URL configuration that isn't working. I'm going to study the Kazakh wiki config more closely and try to figure out what the right config is. I'm not going to worry about the short URLs until I get the main transliteration enabled.

Thu, Jan 18, 9:41 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter
TJones added a comment to T185250: Investigate irrelevant sister project search results on Wikipedia.

Filtering quotations isn't going to solve the more general problem, and it will decrease the value of search on Wiktionary, because sometimes the quotations are where the information you want is going to be found.

Thu, Jan 18, 9:33 PM · Discovery-Search (Current work), Patch-For-Review, Discovery

Wed, Jan 17

TJones added a comment to T23582: Transliteration of Crimean Wiki.

The previous patch enables the Apache short-URL config on the beta cluster. It doesn't actually do much since the Crimean Tatar transliteration isn't enabled and there are no Crimean Tatar projects in the beta cluster. But it did let me make sure that the Apache config didn't break anything (the Chinese language conversion still works as expected on the beta cluster). I'll be trying to get the other configs into an upcoming SWAT deploy soon.

Wed, Jan 17, 8:14 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter

Tue, Jan 16

TJones added a comment to T184767: Make it possible to stop a survey after receiving a certain number of responses.

@mpopov—I agree 100% with everything you said about doing a survey right! (And it leads me to think we could use some good documentation (or a wikibook!) somewhere on best practices on how to run a survey, especially advice on a priori power analysis and the dangers of stopping your survey early, along with documentation on how to get the needed info (e.g., page view stats), etc.)

Tue, Jan 16, 8:15 PM · QuickSurveys, Surveys, Community-Liaisons, MediaWiki-Page-editing
TJones moved T183015: Create Serbian Elasticsearch Plugin/Analysis Chain Using Serbian Morphological Libraries from Backlog to In progress on the Discovery-Search (Current work) board.
Tue, Jan 16, 3:06 PM · I18n, Discovery-Search (Current work), Discovery
TJones added a comment to T184944: Microsurveys: Don't keep asking me too often, but do ask me again, eventually, on some surveys.
  • For most surveys, wouldn't you expect the average response from a large enough sample to reflect any general changes in opinion on a topic? If ~20% of people change their mind about a topic after a given survey, that should be obvious in later iterations of the survey, whether or not you re-survey any particular person. The size of the survey and the exact proportion of opinions held would determine the confidence interval and thus how small a shift you'd be able to detect, but with large surveys that should be a few percentage points at most, so any big shift should be clear, whether you re-survey any particular individual or not. I'd be curious to see an example of a specific question that really needs to be re-asked of the same person; none come to mind, but that doesn't mean there aren't any!
Tue, Jan 16, 1:56 PM · QuickSurveys, Surveys, Community-Liaisons, MediaWiki-Page-editing
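
The confidence-interval reasoning in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the normal-approximation (Wald) interval for a proportion and treats each survey wave as an independent simple random sample. The function names and the sample sizes are hypothetical and do not come from the ticket.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    p is the observed proportion, n the number of responses, and z the
    critical value (1.96 corresponds to roughly 95% confidence).
    """
    return z * math.sqrt(p * (1.0 - p) / n)


def shift_is_detectable(p_before: float, n_before: int,
                        p_after: float, n_after: int,
                        z: float = 1.96) -> bool:
    """True if the change between two independent waves exceeds its margin of error."""
    standard_error = math.sqrt(
        p_before * (1.0 - p_before) / n_before
        + p_after * (1.0 - p_after) / n_after
    )
    return abs(p_after - p_before) > z * standard_error


if __name__ == "__main__":
    # Illustrative sample sizes only; p = 0.5 is the worst case for the margin.
    for n in (100, 1000, 10000):
        print(n, round(margin_of_error(0.5, n), 3))
```

With different respondents in each wave, a shift of the size discussed in the comment shows up as long as both waves are large enough; re-surveying the same individuals is not required for that.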

Thu, Jan 11

TJones added a comment to T89970: Enable microsurveys for long-term tracking of editing experience.

For 100/week we want around 14/day, spread across the days and hours of operation, to get all the different kinds of users.

Thu, Jan 11, 9:39 PM · QuickSurveys, Surveys, Community-Liaisons, MediaWiki-Page-editing
TJones moved T182708: Investigate effect of phonetic search on Wikipedia title words from In progress to Needs review on the Discovery-Search (Current work) board.
Thu, Jan 11, 8:55 PM · Discovery-Search (Current work)
TJones added a comment to T182708: Investigate effect of phonetic search on Wikipedia title words.

Apologies for the long delay in getting this analysis done. My full write-up is on MediaWiki.

Thu, Jan 11, 8:55 PM · Discovery-Search (Current work)
TJones triaged T184771: Set up RelForge test of phonetic title search as Normal priority.
Thu, Jan 11, 8:54 PM · Discovery-Search (Current work)
TJones added a comment to T89970: Enable microsurveys for long-term tracking of editing experience.

@Whatamidoing-WMF, I don't have any additional brilliant insight on the other tickets you've opened, other than to say that they look like you've got the right idea.

Thu, Jan 11, 7:37 PM · QuickSurveys, Surveys, Community-Liaisons, MediaWiki-Page-editing

Wed, Jan 10

TJones added a comment to T160106: Test and analyze new Ukrainian language analyzers.

@dalekiy_obriy—Cool! Thanks for the update! That sounds like a really big update which should be really useful in improving stemmer coverage.

Wed, Jan 10, 5:03 PM · MW-1.29-release-notes, Epic, Discovery-Search (Current work), Discovery
TJones added a comment to T162279: Collect ideas for feature engineering of LTRank.

Another idea for a feature—some similarity measure between the search term and the matching term in a document. Though after talking to @dcausse it sounds too expensive because there's no good way to map matched terms to specific query terms, and even pulling out the matched terms is a pain.

Wed, Jan 10, 5:01 PM · Discovery-Search, Discovery

Fri, Jan 5

TJones added a comment to T156037: Load cirrussearch data into druid.

This ticket came up again in a discussion earlier in the week, and we decided that adding a few more use cases wouldn't hurt, even if we don't work on it for a while.

Fri, Jan 5, 5:52 PM · Discovery-Search, Discovery, Analytics, CirrusSearch
TJones added a comment to T89970: Enable microsurveys for long-term tracking of editing experience.

We should not pre-define the trigger. And we should make this tool more general, so that it can appear on any page we're interested in, such as Recent Changes, for example. There are some other fine points to be worked out, but I definitely could use something like this.

Fri, Jan 5, 4:09 PM · QuickSurveys, Surveys, Community-Liaisons, MediaWiki-Page-editing

Dec 20 2017

TJones added a comment to T175048: Search Relevance Survey test #3: analysis of test.

Nice! That is not a perfectly straight line, but it is remarkably good considering the mess that was the original input.

Dec 20 2017, 7:33 PM · Discovery-Analysis (Current work), Discovery
TJones awarded F11851691: plot.png a Pterodactyl token.
Dec 20 2017, 4:37 PM
TJones added a comment to T175048: Search Relevance Survey test #3: analysis of test.

Alrighty, here ya go! It's not as pretty as you were probably expecting!

Dec 20 2017, 2:31 PM · Discovery-Analysis (Current work), Discovery

Dec 18 2017

TJones added a comment to T23582: Transliteration of Crimean Wiki.

So, there is only a crh-language Wikipedia; there aren't any crh-language Wikibooks, etc. For the other languages with a language converter, the short URLs in production seem to be set up only for wikis that actually exist, whereas beta seems to set them up for all projects, even ones that don't exist.

Dec 18 2017, 8:05 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter
TJones added a comment to T23582: Transliteration of Crimean Wiki.

Thanks, @Gehel—did you mean a plan to test the config before deployment? Unfortunately I don't. Verifying it after deployment is easy, though.

Dec 18 2017, 7:25 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter

Dec 15 2017

TJones added a comment to T175048: Search Relevance Survey test #3: analysis of test.

I'll check out how it compares with respect to the distribution!

Dec 15 2017, 7:18 PM · Discovery-Analysis (Current work), Discovery
TJones added a comment to T178926: Review Serbian Morphological Libraries.

I've marked this as Done. Thanks to Vuk and Željko for reviewing this for me. There are the usual kind of issues expected because language is messy—a few ambiguous words, names of people or places, acronyms, foreign words, etc. But it's generally doing the right thing for Serbian, and it doesn't have any unwanted side effects for non-Serbian text.

Dec 15 2017, 5:55 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones moved T178926: Review Serbian Morphological Libraries from Needs review to Done on the Discovery-Search (Current work) board.
Dec 15 2017, 5:48 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones triaged T183015: Create Serbian Elasticsearch Plugin/Analysis Chain Using Serbian Morphological Libraries as Normal priority.
Dec 15 2017, 5:48 PM · I18n, Discovery-Search (Current work), Discovery
Liuxinyu970226 awarded T178926: Review Serbian Morphological Libraries a Goat token.
Dec 15 2017, 2:27 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery

Dec 13 2017

TJones created T182824: [epic] Show query-frequency-stratified results in A/B test results.
Dec 13 2017, 8:54 PM · Epic, Discovery-Search (Current work), Discovery-Analysis
TJones claimed T182708: Investigate effect of phonetic search on Wikipedia title words.
Dec 13 2017, 2:52 PM · Discovery-Search (Current work)

Dec 12 2017

TJones moved T182708: Investigate effect of phonetic search on Wikipedia title words from Backlog to In progress on the Discovery-Search (Current work) board.
Dec 12 2017, 8:52 PM · Discovery-Search (Current work)
TJones edited projects for T182708: Investigate effect of phonetic search on Wikipedia title words, added: Discovery-Search (Current work); removed Discovery-Search.
Dec 12 2017, 8:52 PM · Discovery-Search (Current work)
TJones triaged T182708: Investigate effect of phonetic search on Wikipedia title words as Normal priority.
Dec 12 2017, 7:29 PM · Discovery-Search (Current work)
TJones moved T178926: Review Serbian Morphological Libraries from In progress to Needs review on the Discovery-Search (Current work) board.
Dec 12 2017, 6:18 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery

Dec 11 2017

TJones added a comment to T178926: Review Serbian Morphological Libraries.

I've completed my analysis of stemmer #4 and it looks good, though it needs speaker review. (It's on a new page since the old page already has enough info and complexity in it.)

Dec 11 2017, 8:24 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones updated the task description for T176493: Analysis of testing on 18 wikis with > 1% of search traffic.
Dec 11 2017, 7:38 PM · Patch-For-Review, Discovery-Analysis (Current work), Discovery-Search (Current work), Discovery

Dec 8 2017

TJones added a comment to T182352: UDF for language detection.

The CLD3 page says it is intended to run in a browser and relies on Chromium... that's kinda weird. And I don't see a list of supported/identified languages. The R wrapper is excellent for testing, though!

Dec 8 2017, 3:15 PM · Discovery-Search, Discovery-Analysis, Analytics, Discovery

Dec 7 2017

TJones added a comment to T23582: Transliteration of Crimean Wiki.

I've posted an announcement to Crimean Tatar Village Pump (translation assistance to Russian or Crimean Tatar much appreciated), and the patches I think are needed to enable the transliteration on-wiki are linked in the two previous messages.

Dec 7 2017, 10:48 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter
TJones added a comment to T182352: UDF for language detection.

So, I think this is a very nifty idea, but there are some potential pitfalls to be aware of.

Dec 7 2017, 9:56 PM · Discovery-Search, Discovery-Analysis, Analytics, Discovery

Dec 5 2017

TJones added a comment to T178926: Review Serbian Morphological Libraries.

Player 4 Has Entered the Game!

Dec 5 2017, 9:52 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones added a comment to T167603: Any Chinese Wiki's projects about "Download as PDF" can not auto change to Simplified Chinese or Traditional Chinese.

I got asked to take a look at this ticket, and while the code is far from what I normally work on, I always like looking at issues dealing with language and all its interesting complexity.

Dec 5 2017, 8:09 PM · Proton, Readers-Web-Backlog, Electron-PDFs, Chinese-Sites
TJones moved T178926: Review Serbian Morphological Libraries from Needs review to In progress on the Discovery-Search (Current work) board.
Dec 5 2017, 6:13 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery

Dec 1 2017

TJones added a comment to T179945: Re-index English-language wikis to pick up kana mapping.

Looks good! I tested quickly on English Wikipedia, Wiktionary, Wikisource, and Wikibooks, and it's working. Thanks, @dcausse.

Dec 1 2017, 2:34 PM · Discovery-Search (Current work), Discovery, CirrusSearch

Nov 29 2017

TJones awarded T181479: Requesting access to terbium/wasat for Trey Jones a Doubloon token.
Nov 29 2017, 6:33 PM · Patch-For-Review, Ops-Access-Requests, Operations
TJones added a comment to T181479: Requesting access to terbium/wasat for Trey Jones.

@Dzhan, thanks for getting this moving so quickly! Happy to wait for Monday's ops meeting.

Nov 29 2017, 6:32 PM · Patch-For-Review, Ops-Access-Requests, Operations

Nov 27 2017

TJones added a comment to T178926: Review Serbian Morphological Libraries.

My full write-up is on MediaWiki.

Nov 27 2017, 9:29 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones moved T178926: Review Serbian Morphological Libraries from In progress to Needs review on the Discovery-Search (Current work) board.
Nov 27 2017, 9:18 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones added a comment to T175048: Search Relevance Survey test #3: analysis of test.

Cool stuff, @mpopov!

Nov 27 2017, 5:17 PM · Discovery-Analysis (Current work), Discovery

Nov 24 2017

zeljkofilipin awarded T178926: Review Serbian Morphological Libraries a Goat token.
Nov 24 2017, 9:35 AM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery

Nov 21 2017

TJones added a comment to T176493: Analysis of testing on 18 wikis with > 1% of search traffic.

Report of test on 18 languages is updated with interleaved results: https://analytics.wikimedia.org/datasets/discovery/reports/CirrusSearch_MLR_AB_test_on_18_wikis.html

Nov 21 2017, 4:19 PM · Patch-For-Review, Discovery-Analysis (Current work), Discovery-Search (Current work), Discovery

Nov 20 2017

TJones updated subscribers of T23582: Transliteration of Crimean Wiki.

Thanks so much, @Bawolff!

Nov 20 2017, 10:57 PM · Patch-For-Review, Wikimedia-Hackathon-2017, I18n, MediaWiki-Language-converter

Nov 13 2017

TJones added a comment to T180387: Enable hiragana/katakana mapping for other languages.

Not directly part of this ticket, but part of the related discussion: we should decide whether we should keep going down the list of unpacked analyzers, and whether we should pro-actively unpack the other analyzers, and whether we should just enable it for languages where it has a small but positive impact (and doesn't cause problems with the analyzer).

Nov 13 2017, 6:52 PM · Discovery-Search, Discovery, CirrusSearch
TJones renamed T180387: Enable hiragana/katakana mapping for other languages from Enable hiragana/katakana mapping for other languages. to Enable hiragana/katakana mapping for other languages.
Nov 13 2017, 6:52 PM · Discovery-Search, Discovery, CirrusSearch
TJones created T180387: Enable hiragana/katakana mapping for other languages.
Nov 13 2017, 6:30 PM · Discovery-Search, Discovery, CirrusSearch
TJones closed T180365: Remove unneeded language-specific config for lowercase filters as Declined.

Declining this since I opened it while thinking about the wrong field. D'oh.

Nov 13 2017, 4:34 PM · Technical-Debt, Discovery-Search (Current work)
TJones added a comment to T180365: Remove unneeded language-specific config for lowercase filters.

On 1000 articles from Wikipedia and 1000 entries from Wiktionary it doesn't make any difference for the text analyzer... but it does make a difference in the "lowercase" analyzer! And if you don't have the ICU Plugin installed, it affects the plain analyzers, too. Whoops. TIL.

Nov 13 2017, 4:34 PM · Technical-Debt, Discovery-Search (Current work)
TJones edited projects for T180365: Remove unneeded language-specific config for lowercase filters, added: Discovery-Search (Current work); removed Discovery-Search.
Nov 13 2017, 3:58 PM · Technical-Debt, Discovery-Search (Current work)
TJones updated the task description for T180365: Remove unneeded language-specific config for lowercase filters.
Nov 13 2017, 3:58 PM · Technical-Debt, Discovery-Search (Current work)
TJones moved T180365: Remove unneeded language-specific config for lowercase filters from Needs triage to Tech Debt/Misc on the Discovery-Search board.
Nov 13 2017, 3:57 PM · Technical-Debt, Discovery-Search (Current work)
TJones created T180365: Remove unneeded language-specific config for lowercase filters.
Nov 13 2017, 3:57 PM · Technical-Debt, Discovery-Search (Current work)
TJones renamed T177876: Investigate changing ICU tokenization from whitelist to blacklist from Investigate changing ICU tokenization from whitelist to blacklist. to Investigate changing ICU tokenization from whitelist to blacklist.
Nov 13 2017, 3:48 PM · Discovery-Search

Nov 9 2017

TJones added a comment to T180169: Make list of languages where using stemmed analyzer for Wikibase is beneficial.

@Smalyshev, I think this covers the info you need. Let me know if I can give more info or help with anything else. :)

Nov 9 2017, 11:10 PM · MediaWiki-extensions-WikibaseRepository, Wikidata, Discovery-Search (Current work), Discovery

Nov 8 2017

TJones added a comment to T176197: Allow hiragana searches to find katakana results and vice versa.

I've added posts on Italian Wikipedia & Wiktionary, and Swedish Wikipedia & Wiktionary.

Nov 8 2017, 3:27 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch

Nov 7 2017

TJones added a comment to T170099: Search returns random results when search query begins with a hyphen.

@debt, is there anything left to do for this task? I don't think we want to completely disable single-term negation searching because of the three use cases @dcausse outlined above (T170099#3429153).

Nov 7 2017, 6:53 PM · Discovery-Search, Discovery, CirrusSearch
TJones moved T178926: Review Serbian Morphological Libraries from Backlog to In progress on the Discovery-Search (Current work) board.
Nov 7 2017, 5:44 PM · User-zeljkofilipin, I18n, Discovery-Search (Current work), Discovery
TJones edited projects for T179945: Re-index English-language wikis to pick up kana mapping, added: Discovery-Search (Current work); removed Discovery-Search.
Nov 7 2017, 3:46 PM · Discovery-Search (Current work), Discovery, CirrusSearch
TJones moved T176197: Allow hiragana searches to find katakana results and vice versa from Needs review to Done on the Discovery-Search (Current work) board.
Nov 7 2017, 3:46 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch
TJones added a comment to T176197: Allow hiragana searches to find katakana results and vice versa.

The code has been merged, but not deployed. I've created T179945 to re-index English-language wikis after the code is deployed, and added it to T147505.

Nov 7 2017, 3:45 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch
TJones updated the task description for T147505: [Recurring task] CirrusSearch: what is updated during re-indexing.
Nov 7 2017, 3:44 PM · Discovery-Search (Current work), Discovery
TJones created T179945: Re-index English-language wikis to pick up kana mapping.
Nov 7 2017, 3:44 PM · Discovery-Search (Current work), Discovery, CirrusSearch

Nov 6 2017

TJones added a comment to T176197: Allow hiragana searches to find katakana results and vice versa.

Bugs filed:

  • ICU Tokenizer: U+0370 and above affect tokenization of characters after whitespace: issue 27290
  • Standard tokenizer incorrectly tokenizes hiragana: issue 27291
  • ICU Normalizer adds spaces before certain non-combining dakuten and handakuten: issue 27292
Nov 6 2017, 10:10 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch
TJones added a comment to T176197: Allow hiragana searches to find katakana results and vice versa.

Posted messages to:

Nov 6 2017, 7:48 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch
TJones moved T176197: Allow hiragana searches to find katakana results and vice versa from In progress to Needs review on the Discovery-Search (Current work) board.
Nov 6 2017, 7:46 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch
TJones added a comment to T179500: Evaluation precision of discernatron results vs our retrieval query.

@EBernhardson, sorry, I should have clarified: that's not real data! I just made it up to show the format and the kinds of info we might see.

Nov 6 2017, 6:15 PM · Discovery-Search, Discovery, CirrusSearch
TJones added a comment to T173650: Inappropriate/broken redirecting of Japanese in search.

FYI: my recommendation over on T176197 is to enable hiragana-to-katakana mapping for English, but not Japanese because it runs afoul of a couple of ugly tokenization bugs. We'll also look into whether the French- and Russian-language communities feel they might benefit from this; if so, we may expand beyond those two.

Nov 6 2017, 4:52 PM · Discovery-Search, Discovery, CirrusSearch
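
For readers unfamiliar with the hiragana-to-katakana mapping mentioned in the comment above, here is a minimal Python sketch of the idea. It is not the CirrusSearch or Elasticsearch configuration, which is not shown in this feed; it only relies on the fact that the main hiragana letters (U+3041 to U+3096) sit exactly 0x60 code points below their katakana counterparts in Unicode. The function and constant names are hypothetical.

```python
HIRAGANA_FIRST = 0x3041  # HIRAGANA LETTER SMALL A
HIRAGANA_LAST = 0x3096   # HIRAGANA LETTER SMALL KE
KANA_OFFSET = 0x60       # distance from a hiragana letter to its katakana twin


def hiragana_to_katakana(text: str) -> str:
    """Map each hiragana letter to katakana; leave every other character alone."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET)
        if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST
        else ch
        for ch in text
    )


if __name__ == "__main__":
    print(hiragana_to_katakana("ひらがな"))
```

Applying the same mapping to both the indexed text and the query is what lets a hiragana search match a katakana title and vice versa. The sketch deliberately ignores tokenization, which the comments in this feed identify as the hard part for Japanese.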
TJones added a comment to T176197: Allow hiragana searches to find katakana results and vice versa.

Whew! What a ride. This turned out to be much more complicated than anticipated for the Japanese analysis. I found three tokenization bugs, one of which depends on context in unexpected ways and so made me question my data collection, which led to me re-running everything... Anyway, because of the bugs in the tokenization, I recommend not deploying this for Japanese.

Nov 6 2017, 4:49 PM · MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch

Nov 2 2017

TJones added a comment to T179500: Evaluation precision of discernatron results vs our retrieval query.

EDIT: Please note that all the data below is just made up to show the possible formats.

Nov 2 2017, 2:30 PM · Discovery-Search, Discovery, CirrusSearch
TJones added a comment to T170779: Wikidata search suggestions do not display on screen if character whose decomposition contains nukta is present in search query.

@Smalyshev, thanks for tracking this one down! That was some weird behavior, but things getting normalized and not matching makes sense.

Nov 2 2017, 1:05 PM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Wikidata-Former-Sprint-Board, User-Smalyshev, Discovery-Search (Current work), ValueView, MediaWiki-extensions-WikibaseRepository, Wikidata
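
The remark above about things "getting normalized and not matching" can be illustrated with Python's standard `unicodedata` module. This sketch makes no claim about where the mismatch occurred in the Wikidata code; it only shows the general Unicode behavior for a character whose canonical decomposition contains a nukta. BENGALI LETTER RRA (U+09DC) is a composition exclusion, so even NFC leaves it in decomposed form, and a naive comparison against the precomposed character fails.

```python
import unicodedata

precomposed = "\u09dc"       # BENGALI LETTER RRA
decomposed = "\u09a1\u09bc"  # BENGALI LETTER DDA + BENGALI SIGN NUKTA


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # The raw strings differ even though they render identically.
    print(precomposed == decomposed)
    # NFC does not recompose this letter; it stays as two code points.
    print(len(unicodedata.normalize("NFC", precomposed)))
    # Normalizing both sides makes the comparison succeed.
    print(same_after_nfc(precomposed, decomposed))
```

If only one side of a comparison is normalized (for example, the search backend but not the code that matches suggestions against what the user typed), the two strings will not be equal even though they look the same on screen.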

Oct 27 2017

TJones updated the task description for T104814: Appropriately ignore diacritics for German-language wikis.
Oct 27 2017, 4:57 PM · Discovery-Search, Discovery, CirrusSearch
TJones updated the task description for T104814: Appropriately ignore diacritics for German-language wikis.
Oct 27 2017, 4:56 PM · Discovery-Search, Discovery, CirrusSearch
TJones added a comment to T179081: Full text search does not find article with accented word in dewiki.

Do you want to comment on the other e's that are also not folded correctly? I'll add a note to the other ticket to fix this documentation.

Oct 27 2017, 4:55 PM · Discovery-Search, Discovery, Regression, CirrusSearch
TJones added a comment to T179081: Full text search does not find article with accented word in dewiki.

@FriedhelmW can you point me at the documentation you want to change? If you are referring to "Folds character families. Diacritical folding automatically matches foreign terms" then I agree it should be updated, but please be careful not to make it incorrect in a different way. Diacritical folding is turned on for most languages, though the set of characters that are folded differs from language to language.

Oct 27 2017, 4:48 PM · Discovery-Search, Discovery, Regression, CirrusSearch
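
As a rough illustration of diacritical folding with a per-language set of exceptions, as described in the comment above, here is a Python sketch. It is an assumption-laden stand-in, not the analyzer used on the wikis: it strips combining marks after canonical decomposition, and the `keep` parameter is a hypothetical way to model characters that a given language does not fold. The German-style `keep` set in the usage example is illustrative only.

```python
import unicodedata


def fold_diacritics(text: str, keep: frozenset = frozenset()) -> str:
    """Strip combining marks from text, except for characters listed in keep."""
    folded = []
    # Normalize to NFC first so precomposed characters in keep can be matched.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            folded.append(ch)
            continue
        # Decompose the character and drop any combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        folded.append(
            "".join(c for c in decomposed if not unicodedata.combining(c))
        )
    return "".join(folded)


if __name__ == "__main__":
    print(fold_diacritics("Eugénie Grandet"))
    # Hypothetical exception set: umlauts are kept, other accents are folded.
    print(fold_diacritics("über café", keep=frozenset("äöüÄÖÜ")))
```

Varying the `keep` set per language is one simple way to picture how folding can be enabled almost everywhere while the set of folded characters still differs from language to language.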
TJones added a comment to T179081: Full text search does not find article with accented word in dewiki.

D'oh—thanks @FriedhelmW, I didn't check for that. The busy, busy WikiGnomes are always fixing things. So, I'd say that this is a specific example of what's happening with e's in T104814. That ticket is on my list for this year. Is it okay to close this ticket and/or fold it into T104814?

Oct 27 2017, 4:35 PM · Discovery-Search, Discovery, Regression, CirrusSearch
TJones added a comment to T170779: Wikidata search suggestions do not display on screen if character whose decomposition contains nukta is present in search query.

Note that you don’t need to change your interface to Bengali to see these effects, and the fact that it is the Bengali keyword for “category” doesn’t seem to matter either. You can search for single characters and get the described behavior.

Oct 27 2017, 4:30 PM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Wikidata-Former-Sprint-Board, User-Smalyshev, Discovery-Search (Current work), ValueView, MediaWiki-extensions-WikibaseRepository, Wikidata
TJones added a comment to T179081: Full text search does not find article with accented word in dewiki.

@FriedhelmW, can you post a screenshot or more detailed description of what you are seeing that is wrong? Or maybe another example? When I follow the link you provided, the article for Eugénie Grandet is the first result:

Oct 27 2017, 2:53 PM · Discovery-Search, Discovery, Regression, CirrusSearch
Liuxinyu970226 awarded T177871: Re-index un-fallbacked languages a Baby Tequila token.
Oct 27 2017, 12:01 PM · User-notice, Discovery-Search (Current work), Discovery, I18n

Oct 24 2017

TJones moved T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:36 PM · Discovery, Discovery-Search
TJones moved T138858: Serbian language search does not allows for use of bald Latin alphabet from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:35 PM · CirrusSearch, Discovery-Search, Discovery, MediaWiki-Internationalization
TJones moved T138857: Serbian language search differentiates between Cyrillic and Latin alphabets from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:35 PM · CirrusSearch, Discovery-Search, MediaWiki-Internationalization, Discovery
TJones moved T140292: A/B Test TextCat settings on non-WP projects from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:35 PM · CirrusSearch, Discovery-Search, Discovery
TJones moved T149307: CirrusSearch: Replace double quotes with spaces in queries from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:35 PM · CirrusSearch, Discovery-Search, Discovery
TJones moved T149121: Go over discernatron data to get an idea of where we need to improve from This Quarter to Later on the Discovery-Search board.
Oct 24 2017, 5:35 PM · Discovery-Search, Discovery, CirrusSearch
TJones moved T145564: Discernatron should remove redirects from result set from Tech Debt/Misc to Later on the Discovery-Search board.
Oct 24 2017, 5:35 PM · Discovery-Search, Discovery
TJones moved T145564: Discernatron should remove redirects from result set from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:34 PM · Discovery-Search, Discovery
TJones moved T155104: Detect "wrong keyboard" queries for Hebrew/American keyboards on EN/HE Wikipedias from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:34 PM · Discovery-Search, Discovery
TJones moved T174621: Investigate dropping obvious question words ('what is' 'who is') to get better results from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:33 PM · Discovery-Search, Discovery
TJones moved T87136: ~"daß" should not match "dass" from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:31 PM · Discovery-Search, Discovery, CirrusSearch
TJones moved T174116: Another look at multi-hyphen tokens on enwiki and zhwiki from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:28 PM · Chinese-Sites, Discovery-Search, Discovery
TJones moved T177888: Review use of CJK vs ICU default language analyzers for "Chinese" Wikis from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:28 PM · Chinese-Sites, Discovery-Search
TJones moved T177877: Investigate enabling Nynorsk Light Stemmer from This Quarter to Tech Debt/Misc on the Discovery-Search board.
Oct 24 2017, 5:28 PM · Discovery-Search
TJones moved T178923: Review Japanese Morphological Libraries from Needs triage to Up Next on the Discovery-Search board.
Oct 24 2017, 5:15 PM · Discovery-Search, Discovery
TJones moved T178924: Review Vietnamese Morphological Libraries from Needs triage to Up Next on the Discovery-Search board.
Oct 24 2017, 5:15 PM · Discovery-Search, Discovery
TJones moved T178925: Review Korean Morphological Libraries from Needs triage to Up Next on the Discovery-Search board.
Oct 24 2017, 5:15 PM · Discovery-Search, Discovery