TJones (Trey Jones)
Staff Computational Linguist, Search Platform Team

User Details

User Since: Jul 8 2015, 3:02 PM (453 w, 5 d)
Availability: Available
IRC Nick: Trey314159
LDAP User: Tjones
MediaWiki User: TJones (WMF) [ Global Accounts ]

I would have written a shorter comment, but I did not have the time.

I'm part of the Search Platform team and I spend my time working on search & relevance, trying to better support search in various languages, analyzing queries, and doing random mathy things. I tend to write long, detailed notes about my investigations (so as to improve the bus number of my work).

When I have to work on _GitHub,_ /‍‍/Phab,/‍‍/ and ''MediaWiki'' all on the same day, I sometimes suffer Severe Markup Incongruence Fatigue.

I � Unicode.

Recent Activity

Mon, Mar 18

TJones added a project to T358495: Enable dotted_I_fix (almost?) everywhere: Patch-For-Review.
Mon, Mar 18, 9:22 PM · Patch-For-Review, Discovery-Search (Current work)
TJones moved T358495: Enable dotted_I_fix (almost?) everywhere from In Progress to Needs review on the Discovery-Search (Current work) board.
Mon, Mar 18, 9:11 PM · Patch-For-Review, Discovery-Search (Current work)
TJones added a comment to T358495: Enable dotted_I_fix (almost?) everywhere.

Patch for review: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/1011442

Mon, Mar 18, 9:11 PM · Patch-For-Review, Discovery-Search (Current work)
TJones reassigned T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair from TJones to EBernhardson.

Swapping Assignee and Other Assignee with Erik, since he's still working on reindexing cloudelastic via the new update pipeline backfill mechanism.

Mon, Mar 18, 3:33 PM · Discovery-Search (Current work)

Wed, Mar 6

TJones updated Other Assignee for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair, added: EBernhardson.

Adding Erik as "other assignee" (never done that before) and increasing the points because Erik is doing more than usual for reindexing (watching the cloudelastic reindex using the new update pipeline backfilling mechanism), and I've been doing more than usual, gathering stats while fretting a bit over reindex speed.

Wed, Mar 6, 8:33 PM · Discovery-Search (Current work)

Mon, Mar 4

TJones created T359100: Analyze results of harmonization.
Mon, Mar 4, 7:50 PM · Discovery-Search (Current work)
TJones created T359092: Requesting access to kubernetes deployment for tjones.
Mon, Mar 4, 6:01 PM · Data-Platform-SRE (2024.03.04 - 2024.03.24), SRE, SRE-Access-Requests
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Mon, Mar 4, 4:15 PM · Discovery-Search (Current work)
TJones moved T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Mon, Mar 4, 4:15 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones claimed T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Mon, Mar 4, 4:11 PM · Discovery-Search (Current work)
TJones moved T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair from Blocked/Waiting to In Progress on the Discovery-Search (Current work) board.
Mon, Mar 4, 4:10 PM · Discovery-Search (Current work)
TJones added a comment to T356651: Rebuild and deploy textify plugin.

@RKemper, it does look like it's deployed everywhere it should be!

Mon, Mar 4, 1:41 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03), Discovery-Search (Current work)

Tue, Feb 27

TJones awarded T357473: Divehi wiki search button is misplaced on page load a Like token.
Tue, Feb 27, 4:15 PM · Local-Wiki-Template-And-Gadget-Issues, Desktop Improvements (Vector 2022)

Mon, Feb 26

TJones moved T358495: Enable dotted_I_fix (almost?) everywhere from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Mon, Feb 26, 3:38 PM · Patch-For-Review, Discovery-Search (Current work)
TJones claimed T358495: Enable dotted_I_fix (almost?) everywhere.

I prioritized this task to have a smaller task to work on as a break after the ginormous T332337 and T356643, and to have something more interruptible to work on while T342444 is running in the background.

Mon, Feb 26, 3:38 PM · Patch-For-Review, Discovery-Search (Current work)
TJones updated the task description for T219550: [EPIC] Harmonize language analysis across languages.
Mon, Feb 26, 3:33 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Discovery-Search (Current work), Epic
TJones created T358495: Enable dotted_I_fix (almost?) everywhere.
Mon, Feb 26, 3:31 PM · Patch-For-Review, Discovery-Search (Current work)
TJones triaged T332342: Standardize ASCII-folding/ICU-folding across analyzers as High priority.
Mon, Feb 26, 3:03 PM · Discovery-Search
TJones moved T332342: Standardize ASCII-folding/ICU-folding across analyzers from needs triage to Language Stuff on the Discovery-Search board.
Mon, Feb 26, 3:03 PM · Discovery-Search
TJones placed T332342: Standardize ASCII-folding/ICU-folding across analyzers up for grabs.

Moving this back to the backlog in favor of a smaller next harmonization project.

Mon, Feb 26, 3:02 PM · Discovery-Search

Wed, Feb 21

TJones moved T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Wed, Feb 21, 11:08 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones moved T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from In Progress to Needs review on the Discovery-Search (Current work) board.
Wed, Feb 21, 11:01 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)

Tue, Feb 20

TJones changed the point value for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from 5 to 8.

Full write-up on MediaWiki.

Tue, Feb 20, 10:21 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)

Feb 13 2024

TJones updated the task description for T357473: Divehi wiki search button is misplaced on page load.
Feb 13 2024, 9:25 PM · Local-Wiki-Template-And-Gadget-Issues, Desktop Improvements (Vector 2022)
TJones created T357473: Divehi wiki search button is misplaced on page load.
Feb 13 2024, 9:22 PM · Local-Wiki-Template-And-Gadget-Issues, Desktop Improvements (Vector 2022)

Feb 6 2024

TJones added a comment to T356651: Rebuild and deploy textify plugin.

T332337 has been committed, so this is ready to go.

Feb 6 2024, 4:21 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03), Discovery-Search (Current work)
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Feb 6 2024, 4:19 PM · Discovery-Search (Current work)

Feb 5 2024

TJones updated the task description for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Feb 5 2024, 3:56 PM · Discovery-Search (Current work)
TJones added a parent task for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair: T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Feb 5 2024, 3:55 PM · Discovery-Search (Current work)
TJones added a subtask for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair: T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Feb 5 2024, 3:55 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones renamed T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair from Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, and word_break_helper to Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Feb 5 2024, 3:55 PM · Discovery-Search (Current work)
TJones updated the task description for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Feb 5 2024, 3:50 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones added a subtask for T356651: Rebuild and deploy textify plugin: T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Feb 5 2024, 3:49 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03), Discovery-Search (Current work)
TJones removed a subtask for T332337: Repair multi-script tokens split by the ICU tokenizer: T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Feb 5 2024, 3:49 PM · Discovery-Search (Current work)
TJones edited parent tasks for T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair, added: T356651: Rebuild and deploy textify plugin; removed: T332337: Repair multi-script tokens split by the ICU tokenizer.
Feb 5 2024, 3:49 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones created T356651: Rebuild and deploy textify plugin.
Feb 5 2024, 3:48 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03), Discovery-Search (Current work)
TJones renamed T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from Update AnalysisConfigBuilder to use icu_token_repair to Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Feb 5 2024, 3:45 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones updated the task description for T332337: Repair multi-script tokens split by the ICU tokenizer.
Feb 5 2024, 2:33 PM · Discovery-Search (Current work)
TJones changed the status of T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair, a subtask of T332337: Repair multi-script tokens split by the ICU tokenizer, from Open to In Progress.
Feb 5 2024, 2:32 PM · Discovery-Search (Current work)
TJones changed the status of T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair from Open to In Progress.
Feb 5 2024, 2:32 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)
TJones created T356643: Enable icu_tokenizer (almost) everywhere and update AnalysisConfigBuilder to use icu_token_repair.
Feb 5 2024, 2:31 PM · MW-1.42-notes (1.42.0-wmf.20; 2024-02-27), Discovery-Search (Current work)

Jan 26 2024

TJones added a comment to T332337: Repair multi-script tokens split by the ICU tokenizer.

More detailed writeup (which partially overlaps the plugin docs) on MediaWiki.

Jan 26 2024, 9:34 PM · Discovery-Search (Current work)

Jan 24 2024

TJones added a comment to T332337: Repair multi-script tokens split by the ICU tokenizer.

Gerrit patch for the plugin (which wasn't added here automatically): https://gerrit.wikimedia.org/r/c/search/extra/+/972478

Jan 24 2024, 5:44 PM · Discovery-Search (Current work)
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from In Progress to Needs review on the Discovery-Search (Current work) board.
Jan 24 2024, 3:45 PM · Discovery-Search (Current work)

Dec 5 2023

TJones renamed T311051: Missing space between paragraphs in extract received using API (all wikis) from Missing space between paragraphs in extract received using API (cswiki) to Missing space between paragraphs in extract received using API (all wikis).
Dec 5 2023, 9:52 PM · TextExtracts
TJones added a comment to T311051: Missing space between paragraphs in extract received using API (all wikis).

This happens across all wikis, not just cswiki.

Dec 5 2023, 9:51 PM · TextExtracts

Dec 4 2023

TJones renamed T352538: [EPIC] Evaluate the impact of the graph split from Evaluate the impact of the graph split to [EPIC] Evaluate the impact of the graph split.
Dec 4 2023, 4:36 PM · Epic, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
TJones moved T352538: [EPIC] Evaluate the impact of the graph split from Incoming to Epics on the Discovery-Search (Current work) board.
Dec 4 2023, 4:36 PM · Epic, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
TJones changed the point value for T332337: Repair multi-script tokens split by the ICU tokenizer from 8 to 13.
Dec 4 2023, 4:17 PM · Discovery-Search (Current work)

Nov 13 2023

TJones added a comment to T350974: search/glent fails on Java 11.

Not sure what to do about Spark, but the Java 11 failure is arguably a feature, not a bug! The script that changed is Adlam, and Java 11 got smarter about it. Some of the other changes listed there make me wonder what other texty corner cases are going to be affected by the upgrade.

Nov 13 2023, 7:37 PM · Discovery-Search (Current work), ci-test-error
TJones updated the task description for T351040: Re-implement the REST endpoint for related pages in PHP.
Nov 13 2023, 4:17 PM · Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, RESTBase Sunsetting

Oct 30 2023

TJones moved T346051: Refactor slow global analysis components from Needs review to Needs Reporting on the Discovery-Search (Current work) board.
Oct 30 2023, 4:04 PM · Discovery-Search (Current work)

Oct 26 2023

TJones claimed T332337: Repair multi-script tokens split by the ICU tokenizer.
Oct 26 2023, 8:35 PM · Discovery-Search (Current work)
TJones moved T332337: Repair multi-script tokens split by the ICU tokenizer from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Oct 26 2023, 8:34 PM · Discovery-Search (Current work)
TJones awarded T349827: mediawiki.util "debounce (old signature)" test occasionally fails a Like token.
Oct 26 2023, 3:17 PM · MediaWiki-General, ci-test-error (WMF-deployed Build Failure)

Oct 24 2023

TJones added a comment to T346051: Refactor slow global analysis components.

Dev notes and details on MediaWiki.

Oct 24 2023, 10:21 PM · Discovery-Search (Current work)

Oct 23 2023

TJones moved T346051: Refactor slow global analysis components from In Progress to Needs review on the Discovery-Search (Current work) board.
Oct 23 2023, 8:14 PM · Discovery-Search (Current work)
TJones added a comment to T349246: Bad ranking of Wikidata item search results on Special:Search when non-default namespaces are included.

Somewhat unfortunately, this is the expected behavior.

Oct 23 2023, 4:07 PM · Discovery-Search, Wikidata

Oct 13 2023

TJones updated the task description for T346051: Refactor slow global analysis components.
Oct 13 2023, 7:28 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Oct 13 2023, 7:21 PM · Discovery-Search (Current work)

Oct 9 2023

TJones renamed T346718: [Search Update Pipeline] Set max parallelism explicitly on operators with a state from [Search Update Pipeline] Set max parallelism explicitely on operators with a state to [Search Update Pipeline] Set max parallelism explicitly on operators with a state.
Oct 9 2023, 3:39 PM · Discovery-Search (Current work)
TJones updated the task description for T346717: [Search Update Pipeline] Name and identify operators that have a state.
Oct 9 2023, 3:36 PM · Discovery-Search (Current work)

Oct 6 2023

TJones raised the priority of T293398: Search term entered without diacritics on Czech Wikipedia does not list expected match from Medium to High.
Oct 6 2023, 6:05 PM · Discovery-Search, CirrusSearch

Sep 27 2023

TJones added a comment to T346051: Refactor slow global analysis components.

We previously discussed how to bundle the new filters, but talked about it again today.

Sep 27 2023, 5:02 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 27 2023, 4:47 PM · Discovery-Search (Current work)

Sep 21 2023

TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 21 2023, 9:54 PM · Discovery-Search (Current work)

Sep 18 2023

TJones moved T346456: Improve concurrency limits configuration of the wdqs updater from needs triage to Current work on the Discovery-Search board.
Sep 18 2023, 3:48 PM · Discovery-Search (Current work), [DEPRECATED] wdwb-tech, Wikidata, serviceops, Wikidata-Query-Service
TJones removed a project from T328330: Create SLI / SLO on Search update lag: Epic.
Sep 18 2023, 3:11 PM · Data-Platform-SRE, Discovery-Search (Current work)

Sep 11 2023

TJones moved T332342: Standardize ASCII-folding/ICU-folding across analyzers from In Progress to Ready for Dev -- SWE on the Discovery-Search (Current work) board.

Moved back to ready for dev while working on T346051

Sep 11 2023, 3:41 PM · Discovery-Search
TJones moved T346051: Refactor slow global analysis components from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Sep 11 2023, 3:40 PM · Discovery-Search (Current work)
TJones claimed T346051: Refactor slow global analysis components.
Sep 11 2023, 3:28 PM · Discovery-Search (Current work)
TJones renamed T346051: Refactor slow global analysis components from Refactor slow analysis components to Refactor slow global analysis components.
Sep 11 2023, 3:21 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 11 2023, 3:20 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 11 2023, 3:20 PM · Discovery-Search (Current work)
TJones updated the task description for T346051: Refactor slow global analysis components.
Sep 11 2023, 3:19 PM · Discovery-Search (Current work)
TJones created T346051: Refactor slow global analysis components.
Sep 11 2023, 3:16 PM · Discovery-Search (Current work)
TJones moved T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair from Ready for Dev -- SRE/Ops to Blocked/Waiting on the Discovery-Search (Current work) board.
Sep 11 2023, 3:15 PM · Discovery-Search (Current work)
TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Sep 11 2023, 3:13 PM · Discovery-Search (Current work)
TJones added a comment to T170625: Smarter handling of acronyms for word_break_helper in language analyzers.

This has been deployed, but the reindexing was stopped for being too slow. I'll move this ticket into needs reporting and open a new one for the new efficiency refactor.

Sep 11 2023, 3:13 PM · Discovery-Search (Current work)

Aug 28 2023

TJones added a comment to T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.

Sounds like my local reindexing is insufficient for detecting non-egregious slowdowns in indexing speed. (I know I have other overhead; I guess it's even more than I thought.) Should we pause the reindex and investigate more thoroughly on RelForge, with the possibility of reverting some changes after finding the slowest ones?

Aug 28 2023, 2:21 PM · Discovery-Search (Current work)

Aug 1 2023

TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Aug 1 2023, 2:10 PM · Discovery-Search (Current work)

Jul 31 2023

TJones claimed T332342: Standardize ASCII-folding/ICU-folding across analyzers.
Jul 31 2023, 8:44 PM · Discovery-Search
TJones moved T332342: Standardize ASCII-folding/ICU-folding across analyzers from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.
Jul 31 2023, 7:03 PM · Discovery-Search
TJones moved T170625: Smarter handling of acronyms for word_break_helper in language analyzers from In Progress to Needs review on the Discovery-Search (Current work) board.
Jul 31 2023, 6:17 PM · Discovery-Search (Current work)
TJones added a comment to T170625: Smarter handling of acronyms for word_break_helper in language analyzers.

acronym_fixer is rather complicated, as expected. word_break_helper is a little complicated, unexpectedly! More on MediaWiki.

Jul 31 2023, 6:15 PM · Discovery-Search (Current work)
TJones updated the task description for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Jul 31 2023, 2:47 AM · Discovery-Search (Current work)

Jul 26 2023

TJones updated the task description for T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Jul 26 2023, 1:56 PM · Discovery-Search (Current work)

Jul 25 2023

TJones added a comment to T219550: [EPIC] Harmonize language analysis across languages.

While harmonizing, I noticed that the Hebrew analysis chain was creating a lot of duplicate tokens. Adding a remove_duplicates filter removed 19.7% (Wikipedia) to 22.7% (Wiktionary) of all tokens—all non-Hebrew and many Hebrew tokens were duplicated! Did a lot of refactoring (checked off the task above!), too.

Jul 25 2023, 11:40 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Discovery-Search (Current work), Epic
TJones updated the task description for T219550: [EPIC] Harmonize language analysis across languages.
Jul 25 2023, 11:36 PM · MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), Discovery-Search (Current work), Epic

Jul 21 2023

Pols12 awarded T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair a Love token.
Jul 21 2023, 10:06 PM · Discovery-Search (Current work)
TJones added a comment to T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.

I'll do a write-up of the before-and-after impact of the reindexing and post a link here, but anyone can do the reindexing and finish the ticket without that.

Jul 21 2023, 3:16 PM · Discovery-Search (Current work)
TJones created T342444: Reindex all wikis to enable apostrophe normalization, camelCase handling, acronym handling, word_break_helper, and icu_tokenizer/_repair.
Jul 21 2023, 3:15 PM · Discovery-Search (Current work)
TJones added a comment to T219108: Investigate applying aggressive_splitting everywhere, not just on English-language wikis.

@Pols12, the code is deployed, but not activated yet. In our workflow, we generally close tickets when the code is deployed, separate from when the feature is available.

Jul 21 2023, 3:05 PM · Discovery-Search (Current work), CirrusSearch

Jul 17 2023

TJones changed the point value for T332337: Repair multi-script tokens split by the ICU tokenizer from 5 to 8.
Jul 17 2023, 3:50 PM · Discovery-Search (Current work)
TJones changed the point value for T332342: Standardize ASCII-folding/ICU-folding across analyzers from 5 to 8.
Jul 17 2023, 3:50 PM · Discovery-Search

Jul 14 2023

TJones closed T268788: Create Elasticsearch filter so we can do aggressive_splitting without causing an invalid token order, a subtask of T268730: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards, as Declined.
Jul 14 2023, 8:00 PM · MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), Discovery-Search (Current work), CirrusSearch
TJones closed T268788: Create Elasticsearch filter so we can do aggressive_splitting without causing an invalid token order as Declined.

I'm going to close this one because we've deprecated aggressive_splitting as too aggressive for the text field. It is still used in the short_text field, but the context there is constrained and not language-specific, so it's less likely to accidentally get messy.

Jul 14 2023, 8:00 PM · Discovery-Search, CirrusSearch

Jul 10 2023

TJones renamed T341332: [EPIC] The CirrusSearch streaming updater should support private wikis from The CirrusSearch streaming updater should support private wikis to [EPIC] The CirrusSearch streaming updater should support private wikis.
Jul 10 2023, 3:43 PM · Epic, Discovery-Search (Current work), CirrusSearch
TJones moved T341332: [EPIC] The CirrusSearch streaming updater should support private wikis from Incoming to Epics on the Discovery-Search (Current work) board.
Jul 10 2023, 3:43 PM · Epic, Discovery-Search (Current work), CirrusSearch
TJones moved T340548: [EPIC] Deployment of the Search Update Pipeline on Flink / k8s from Incoming to Epics on the Discovery-Search (Current work) board.
Jul 10 2023, 3:42 PM · Epic, Discovery-Search (Current work), Data-Platform-SRE