
TJones (Trey Jones)
Staff Computational Linguist, Search Platform Team

User Details

User Since
Jul 8 2015, 3:02 PM (450 w, 14 h)
Availability
Available
IRC Nick
Trey314159
LDAP User
Tjones
MediaWiki User
TJones (WMF)

I would have written a shorter comment, but I did not have the time.

I'm part of the Search Platform team and I spend my time working on search & relevance, trying to better support search in various languages, analyzing queries, and doing random mathy things. I tend to write long, detailed notes about my investigations (so as to improve the bus number of my work).

When I have to work on _GitHub,_ /‍‍/Phab,/‍‍/ and ''MediaWiki'' all on the same day, I sometimes suffer Severe Markup Incongruence Fatigue.

I ♥ Unicode.

Recent Activity

Wed, Feb 21

TJones moved T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Wed, Feb 21, 11:08 PM · Discovery-Search (Current work)
TJones moved T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from In Progress to Needs review on the Discovery-Search (Current work) board.
Wed, Feb 21, 11:01 PM · Discovery-Search (Current work)

Tue, Feb 20

TJones changed the point value for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from 5 to 8.

Full write-up on MediaWiki.

Tue, Feb 20, 10:21 PM · Discovery-Search (Current work)

Tue, Feb 13

TJones updated the task description for T357473: Divehi wiki search button is misplaced on page load.
Tue, Feb 13, 9:25 PM · Local-Wiki-Template-And-Gadget-Issues, Desktop Improvements (Vector 2022)
TJones created T357473: Divehi wiki search button is misplaced on page load.
Tue, Feb 13, 9:22 PM · Local-Wiki-Template-And-Gadget-Issues, Desktop Improvements (Vector 2022)

Tue, Feb 6

TJones added a comment to T356651: Rebuild and deploy textify plugin.

T332337 has been committed, so this is ready to go.

Tue, Feb 6, 4:21 PM · Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Discovery-Search (Current work)
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Tue, Feb 6, 4:19 PM · Discovery-Search (Current work)

Mon, Feb 5

TJones updated the task description for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Mon, Feb 5, 3:56 PM · Discovery-Search (Current work)
TJones added a parent task for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair: T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Mon, Feb 5, 3:55 PM · Discovery-Search (Current work)
TJones added a subtask for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair: T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Mon, Feb 5, 3:55 PM · Discovery-Search (Current work)
TJones renamed T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair from Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, and word_break_helper to Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Mon, Feb 5, 3:55 PM · Discovery-Search (Current work)
TJones updated the task description for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Mon, Feb 5, 3:50 PM · Discovery-Search (Current work)
TJones added a subtask for T356651: Rebuild and deploy textify plugin: T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Mon, Feb 5, 3:49 PM · Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Discovery-Search (Current work)
TJones removed a subtask for T332337: Repair multi-script tokens split by the ICU tokenizer: T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Mon, Feb 5, 3:49 PM · Discovery-Search (Current work)
TJones edited parent tasks for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair, added: T356651: Rebuild and deploy textify plugin; removed: T332337: Repair multi-script tokens split by the ICU tokenizer.
Mon, Feb 5, 3:49 PM · Discovery-Search (Current work)
TJones created T356651: Rebuild and deploy textify plugin.
Mon, Feb 5, 3:48 PM · Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Discovery-Search (Current work)
TJones renamed T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from Update AnalysisConfigBuilder to use icu_token_repair to Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Mon, Feb 5, 3:45 PM · Discovery-Search (Current work)
TJones updated the task description for T332337: Repair multi-script tokens split by the ICU tokenizer.
Mon, Feb 5, 2:33 PM · Discovery-Search (Current work)
TJones changed the status of T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair, a subtask of T332337: Repair multi-script tokens split by the ICU tokenizer, from Open to In Progress.
Mon, Feb 5, 2:32 PM · Discovery-Search (Current work)
TJones changed the status of T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from Open to In Progress.
Mon, Feb 5, 2:32 PM · Discovery-Search (Current work)
TJones created T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Mon, Feb 5, 2:31 PM · Discovery-Search (Current work)

Fri, Jan 26

TJones added a comment to T332337: Repair multi-script tokens split by the ICU tokenizer.

More detailed writeup (overlapping plugin docs) on MediaWiki.

Fri, Jan 26, 9:34 PM · Discovery-Search (Current work)

Wed, Jan 24

TJones added a comment to T332337: Repair multi-script tokens split by the ICU tokenizer.

Gerrit patch for the plugin (which wasn't added here automatically): https://gerrit.wikimedia.org/r/c/search/extra/+/972478

Wed, Jan 24, 5:44 PM · Discovery-Search (Current work)
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from In Progress to Needs review on the Discovery-Search (Current work) board.
Wed, Jan 24, 3:45 PM · Discovery-Search (Current work)

Dec 5 2023

TJones renamed T311051: Missing space between paragraphs in extract received using API (all wikis) from Missing space between paragraphs in extract received using API (cswiki) to Missing space between paragraphs in extract received using API (all wikis).
Dec 5 2023, 9:52 PM · TextExtracts
TJones added a comment to T311051: Missing space between paragraphs in extract received using API (all wikis).

This happens across all wikis, not just cswiki.

Dec 5 2023, 9:51 PM · TextExtracts

Dec 4 2023

TJones renamed T352538: [EPIC] Evaluate the impact of the graph split from Evaluate the impact of the graph split to [EPIC] Evaluate the impact of the graph split.
Dec 4 2023, 4:36 PM · Epic, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
TJones moved T352538: [EPIC] Evaluate the impact of the graph split from Incoming to Epics on the Discovery-Search (Current work) board.
Dec 4 2023, 4:36 PM · Epic, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
TJones changed the point value for T332337: Repair multi-script tokens split by the ICU tokenizer from 8 to 13.
Dec 4 2023, 4:17 PM · Discovery-Search (Current work)

Nov 13 2023

TJones added a comment to T350974: search/glent fails on Java 11.

Not sure what to do about Spark, but the Java 11 failure is arguably a feature, not a bug! The script that changed is Adlam, and Java 11 got smarter about it. Some of the other changes listed there make me wonder what other texty corner cases are going to be affected by the upgrade.

Nov 13 2023, 7:37 PM · Discovery-Search (Current work), ci-test-error
TJones updated the task description for T351040: Re-implement the REST endpoint for related pages in PHP.
Nov 13 2023, 4:17 PM · Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, RESTBase Sunsetting

Oct 30 2023

TJones moved T346051: Refactor slow global analysis components from Needs review to Needs Reporting on the Discovery-Search (Current work) board.
Oct 30 2023, 4:04 PM · Discovery-Search (Current work)

Oct 26 2023

TJones claimed T332337: Repair multi-script tokens split by the ICU tokenizer.
Oct 26 2023, 8:35 PM · Discovery-Search (Current work)
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Oct 26 2023, 8:34 PM · Discovery-Search (Current work)
TJones awarded T349827: mediawiki.util "debounce (old signature)" test occasionally fails a Like token.
Oct 26 2023, 3:17 PM · MediaWiki-General, ci-test-error (WMF-deployed Build Failure)

Oct 24 2023

TJones added a comment to T346051: Refactor slow global analysis components.

Dev notes and details on MediaWiki.

Oct 24 2023, 10:21 PM · Discovery-Search (Current work)

Oct 23 2023

TJones moved T346051: Refactor slow global analysis components from In Progress to Needs review on the Discovery-Search (Current work) board.
Oct 23 2023, 8:14 PM · Discovery-Search (Current work)
TJones added a comment to T349246: Bad ranking of Wikidata item search results on Special:Search when non-default namespaces are included.

Somewhat unfortunately, this is the expected behavior.

Oct 23 2023, 4:07 PM · Discovery-Search, Wikidata

Oct 13 2023

TJones updated the task description for T346051: Refactor slow global analysis components.
Oct 13 2023, 7:28 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Oct 13 2023, 7:21 PM · Discovery-Search (Current work)

Oct 9 2023

TJones renamed T346718: [Search Update Pipeline] Set max parallelism explicitly on operators with a state from [Search Update Pipeline] Set max parallelism explicitely on operators with a state to [Search Update Pipeline] Set max parallelism explicitly on operators with a state.
Oct 9 2023, 3:39 PM · Discovery-Search (Current work)
TJones updated the task description for T346717: [Search Update Pipeline] Name and identify operators that have a state.
Oct 9 2023, 3:36 PM · Discovery-Search (Current work)

Oct 6 2023

TJones raised the priority of T293398: Search term entered without diacritics on Czech Wikipedia does not list expected match from Medium to High.
Oct 6 2023, 6:05 PM · Discovery-Search, CirrusSearch

Sep 27 2023

TJones added a comment to T346051: Refactor slow global analysis components.

We previously discussed how to bundle the new filters, but talked about it again today.

Sep 27 2023, 5:02 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 27 2023, 4:47 PM · Discovery-Search (Current work)

Sep 21 2023

TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 21 2023, 9:54 PM · Discovery-Search (Current work)

Sep 18 2023

TJones moved T346456: Improve concurrency limits configuration of the wdqs updater from needs triage to Current work on the Discovery-Search board.
Sep 18 2023, 3:48 PM · Discovery-Search (Current work), [DEPRECATED] wdwb-tech, Wikidata, serviceops, Wikidata-Query-Service
TJones removed a project from T328330: Create SLI / SLO on Search update lag: Epic.
Sep 18 2023, 3:11 PM · Data-Platform-SRE, Discovery-Search (Current work)

Sep 11 2023

TJones moved T332342: Standardize ASCII-folding/ICU-folding across analyzers from In Progress to Ready for Dev -- SWE on the Discovery-Search (Current work) board.

Moved back to ready for dev while working on T346051

Sep 11 2023, 3:41 PM · Discovery-Search (Current work)
TJones moved T346051: Refactor slow global analysis components from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Sep 11 2023, 3:40 PM · Discovery-Search (Current work)
TJones claimed T346051: Refactor slow global analysis components.
Sep 11 2023, 3:28 PM · Discovery-Search (Current work)
TJones renamed T346051: Refactor slow global analysis components from Refactor slow analysis components to Refactor slow global analysis components.
Sep 11 2023, 3:21 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 11 2023, 3:20 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 11 2023, 3:20 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 11 2023, 3:19 PM · Discovery-Search (Current work)
TJones created T346051: Refactor slow global analysis components.
Sep 11 2023, 3:16 PM · Discovery-Search (Current work)
TJones moved T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair from Ready for Dev -- SRE/Ops to Blocked/Waiting on the Discovery-Search (Current work) board.
Sep 11 2023, 3:15 PM · Discovery-Search (Current work)
TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Sep 11 2023, 3:13 PM · Discovery-Search (Current work)
TJones added a comment to T170625: Smarter handling of acronyms for word_break_helper in language analyzers.

This has been deployed, but the reindexing was stopped for being too slow. I'll move this ticket into needs reporting and open a new one for the new efficiency refactor.

Sep 11 2023, 3:13 PM · Discovery-Search (Current work)

Aug 28 2023

TJones added a comment to T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.

Sounds like my local reindexing is insufficient for detecting non-egregious slowdowns in indexing speed. (I know I have other overhead; I guess it's even more than I thought.) Should we pause the reindex and investigate more thoroughly on RelForge, with the possibility of reverting some changes after finding the slowest ones?

Aug 28 2023, 2:21 PM · Discovery-Search (Current work)

Aug 1 2023

TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Aug 1 2023, 2:10 PM · Discovery-Search (Current work)

Jul 31 2023

TJones claimed T332342: Standardize ASCII-folding/ICU-folding across analyzers.
Jul 31 2023, 8:44 PM · Discovery-Search (Current work)
TJones moved T332342: Standardize ASCII-folding/ICU-folding across analyzers from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Jul 31 2023, 7:03 PM · Discovery-Search (Current work)
TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from In Progress to Needs review on the Discovery-Search (Current work) board.
Jul 31 2023, 6:17 PM · Discovery-Search (Current work)
TJones added a comment to T170625: Smarter handling of acronyms for word_break_helper in language analyzers.

acronym_fixer is rather complicated, as expected. word_break_helper is a little complicated, unexpectedly! More on MediaWiki.

Jul 31 2023, 6:15 PM · Discovery-Search (Current work)
TJones updated the task description for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Jul 31 2023, 2:47 AM · Discovery-Search (Current work)

Jul 26 2023

TJones updated the task description for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Jul 26 2023, 1:56 PM · Discovery-Search (Current work)

Jul 25 2023

TJones added a comment to T219550: [EPIC] Harmonize language analysis across languages.

While harmonizing, I noticed that the Hebrew analysis chain was creating a lot of duplicate tokens. Adding a remove_duplicates filter removed 19.7% (Wikipedia) to 22.7% (Wiktionary) of all tokens—all non-Hebrew and many Hebrew tokens were duplicated! Did a lot of refactoring (checked off the task above!), too.

Jul 25 2023, 11:40 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Discovery-Search (Current work), Epic
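
Editorial illustration for the T219550 comment above: a minimal Python sketch of what removing duplicate tokens means, assuming tokens are represented as (position, text) pairs. The function name and token representation are hypothetical, chosen for this example; this is not the Elasticsearch filter's code or the Hebrew analysis chain.

```python
def remove_duplicate_tokens(tokens):
    """Drop tokens that repeat the same text at the same position.

    `tokens` is an iterable of (position, text) pairs; this shape is an
    assumption made for this sketch. The first occurrence is kept and
    the original order is preserved.
    """
    seen = set()
    unique = []
    for position, text in tokens:
        key = (position, text)
        if key not in seen:
            seen.add(key)
            unique.append((position, text))
    return unique


# Hypothetical token stream in which an analysis chain emitted "wiki" twice
# at position 0. The same text at a different position is not a duplicate.
stream = [(0, "wiki"), (0, "wiki"), (1, "search"), (2, "wiki")]
deduplicated = remove_duplicate_tokens(stream)
```

Only exact repeats at the same position are dropped, which is why the later "wiki" at position 2 survives.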
TJones updated the task description for T219550: [EPIC] Harmonize language analysis across languages.
Jul 25 2023, 11:36 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Discovery-Search (Current work), Epic

Jul 21 2023

Pols12 awarded T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair a Love token.
Jul 21 2023, 10:06 PM · Discovery-Search (Current work)
TJones added a comment to T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.

I'll do a write-up of the before-and-after impact of the reindexing and post a link here, but anyone can do the reindexing and finish the ticket without that.

Jul 21 2023, 3:16 PM · Discovery-Search (Current work)
TJones created T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Jul 21 2023, 3:15 PM · Discovery-Search (Current work)
TJones added a comment to T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis.

@Pols12, the code is deployed, but not activated yet. In our workflow, we generally close tickets when the code is deployed, separate from when the feature is available.

Jul 21 2023, 3:05 PM · Discovery-Search (Current work), CirrusSearch

Jul 17 2023

TJones changed the point value for T332337: Repair multi-script tokens split by the ICU tokenizer from 5 to 8.
Jul 17 2023, 3:50 PM · Discovery-Search (Current work)
TJones changed the point value for T332342: Standardize ASCII-folding/ICU-folding across analyzers from 5 to 8.
Jul 17 2023, 3:50 PM · Discovery-Search (Current work)

Jul 14 2023

TJones closed T268788: Create Elasticsearch filter so we can do aggressive_splitting without causing an invalid token order, a subtask of T268730: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards, as Declined.
Jul 14 2023, 8:00 PM · MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), Discovery-Search (Current work), CirrusSearch
TJones closed T268788: Create Elasticsearch filter so we can do aggressive_splitting without causing an invalid token order as Declined.

I'm going to close this one because we've deprecated aggressive_splitting as too aggressive for the text field. It is still used in the short_text field, but the context there is constrained and not language-specific, so it's less likely to accidentally get messy.

Jul 14 2023, 8:00 PM · Discovery-Search, CirrusSearch

Jul 10 2023

TJones renamed T341332: [EPIC] The CirrusSearch streaming updater should support private wikis from The CirrusSearch streaming updater should support private wikis to [EPIC] The CirrusSearch streaming updater should support private wikis.
Jul 10 2023, 3:43 PM · Epic, Discovery-Search (Current work), CirrusSearch
TJones moved T341332: [EPIC] The CirrusSearch streaming updater should support private wikis from Incoming to Epics on the Discovery-Search (Current work) board.
Jul 10 2023, 3:43 PM · Epic, Discovery-Search (Current work), CirrusSearch
TJones moved T340548: [EPIC] Deployment of the Search Update Pipeline on Flink / k8s from Incoming to Epics on the Discovery-Search (Current work) board.
Jul 10 2023, 3:42 PM · Epic, Discovery-Search (Current work), Data-Platform-SRE
TJones renamed T340548: [EPIC] Deployment of the Search Update Pipeline on Flink / k8s from Deployment of the Search Update Pipeline on Flink / k8s to [EPIC] Deployment of the Search Update Pipeline on Flink / k8s.
Jul 10 2023, 3:41 PM · Epic, Discovery-Search (Current work), Data-Platform-SRE
TJones triaged T341073: Normalise Mongolian script when searching as High priority.

If we can get a list of mappings, this should be technically straightforward. I will review the lists provided, and consult with a linguist I know who lives in Mongolia to see if I missed anything else obvious.

Jul 10 2023, 3:23 PM · Discovery-Search, I18n, Vertical-Writing
TJones moved T315118: Handle variation in apostrophe-like characters better from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Jul 10 2023, 3:08 PM · Discovery-Search (Current work), CirrusSearch
TJones moved T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Jul 10 2023, 3:08 PM · Discovery-Search (Current work), CirrusSearch

Jul 5 2023

TJones moved T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Jul 5 2023, 1:59 PM · Discovery-Search (Current work), CirrusSearch
TJones moved T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis from In Progress to Needs review on the Discovery-Search (Current work) board.
Jul 5 2023, 1:59 PM · Discovery-Search (Current work), CirrusSearch

Jun 26 2023

TJones added a comment to T339293: Wikimedia\Assert\PostconditionException: Postcondition failed: Regex failed: 4.

I think I found the input that causes the problem.

Jun 26 2023, 3:42 PM · API Platform, MediaWiki-REST-API, CirrusSearch, Wikimedia-production-error
TJones moved T315118: Handle variation in apostrophe-like characters better from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Jun 26 2023, 3:13 PM · Discovery-Search (Current work), CirrusSearch

Jun 23 2023

TJones updated subscribers of T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis.

My full writeup is on MediaWiki.

Jun 23 2023, 7:54 PM · Discovery-Search (Current work), CirrusSearch

Jun 15 2023

TJones added a comment to T170625: Smarter handling of acronyms for word_break_helper in language analyzers.

Sorry for the hokey pokey—you put the ticket in, you take the ticket out.. you put the ticket in, and you shake it all about—but the aggressive_splitting ticket (T219108) overlaps with this one too much. And! I discovered I can do what I want for acronym collapsing with a regex (probably.. still checking on details) rather than a custom filter, which makes this easier—and I'd feel better about deploying word_break_helper everywhere with that fix in place.

Jun 15 2023, 2:08 PM · Discovery-Search (Current work)
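
Editorial illustration for the T170625 comment above: a rough Python sketch of collapsing period-separated acronyms with a regex. The pattern and the function name are assumptions made for this example; the comment does not give the actual regex, and this simplified version handles only ASCII letters.

```python
import re

# Assumed pattern: single ASCII letters separated by periods, with an
# optional trailing period (so a sentence-final period is also consumed).
_ACRONYM = re.compile(r"\b[A-Za-z](?:\.[A-Za-z])+\b\.?")


def collapse_acronyms(text):
    """Remove the periods inside matches such as 'N.A.S.A.' (hypothetical helper)."""
    return _ACRONYM.sub(lambda m: m.group(0).replace(".", ""), text)


collapsed = collapse_acronyms("N.A.S.A. launched")
```

Ordinary words followed by a period are left alone, because the pattern requires a single letter before each period.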
TJones added a comment to T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis.

More details to come, but aggressive_splitting (which is a word_delimiter filter underneath) is just too aggressive. It breaks things ICU normalization does better, and word_break_helper (T170625) does or can do the good things aggressive_splitting does. My new plan is to deactivate aggressive_splitting on English-language wikis and replace it with a split_camelCase filter that addresses the original issue of "FilesystemHierarchyStandard" in this ticket, and delegate the good things it does to word_break_helper.

Jun 15 2023, 2:00 PM · Discovery-Search (Current work), CirrusSearch
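
Editorial illustration for the T219108 comment above: a minimal Python sketch of camelCase splitting, using the "FilesystemHierarchyStandard" example from that comment. The function name and regex are hypothetical and cover only ASCII letters; this is not the implementation of the split_camelCase filter.

```python
import re

# Assumed rule: break wherever a lowercase ASCII letter is immediately
# followed by an uppercase ASCII letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def split_camel_case(token):
    """Split a token at lowercase-to-uppercase transitions (hypothetical helper)."""
    # Insert a space at each boundary, then split on whitespace.
    return _CAMEL_BOUNDARY.sub(" ", token).split()


parts = split_camel_case("FilesystemHierarchyStandard")
```

A token with no lowercase-to-uppercase transition comes back as a single-element list.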
TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Jun 15 2023, 1:40 PM · Discovery-Search (Current work)

Jun 12 2023

TJones claimed T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis.
Jun 12 2023, 5:54 PM · Discovery-Search (Current work), CirrusSearch
TJones moved T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Jun 12 2023, 5:54 PM · Discovery-Search (Current work), CirrusSearch
TJones moved T315118: Handle variation in apostrophe-like characters better from In Progress to Needs review on the Discovery-Search (Current work) board.
Jun 12 2023, 3:18 PM · Discovery-Search (Current work), CirrusSearch

Jun 9 2023

TJones added a comment to T315118: Handle variation in apostrophe-like characters better.

Full write-up on MediaWiki.

Jun 9 2023, 9:52 PM · Discovery-Search (Current work), CirrusSearch

Jun 8 2023

TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from In Progress to Ready for Dev -- SWE on the Discovery-Search (Current work) board.
Jun 8 2023, 10:35 PM · Discovery-Search (Current work)

May 24 2023

TJones updated the task description for T219550: [EPIC] Harmonize language analysis across languages.
May 24 2023, 7:26 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Discovery-Search (Current work), Epic

May 22 2023

TJones updated the task description for T147505: [tracking] CirrusSearch: what is updated during re-indexing.
May 22 2023, 6:14 PM · Tracking-Neverending, Epic, Discovery-Search (Current work), Discovery-ARCHIVED
TJones moved T272606: [EPIC] Unpack all Elasticsearch analyzers from Epics to Needs Reporting on the Discovery-Search (Current work) board.

Holy Guacamole, Batman! It's all done!

May 22 2023, 6:12 PM · Epic, Discovery-Search (Current work)