TJones (Trey Jones)
Staff Computational Linguist, Search Platform Team

User Details

User Since
Jul 8 2015, 3:02 PM (341 w, 6 d)
Availability
Available
IRC Nick
Trey314159
LDAP User
Tjones
MediaWiki User
TJones (WMF)

I would have written a shorter comment, but I did not have the time.

I'm part of the Search Platform team and I spend my time working on search & relevance, trying to better support search in various languages, analyzing queries, and doing random mathy things. I tend to write long, detailed notes about my investigations (so as to improve the bus number of my work).

When I have to work on _GitHub,_ /‍‍/Phab,/‍‍/ and ''MediaWiki'' all on the same day, I sometimes suffer Severe Markup Incongruence Fatigue.

I � Unicode.

Recent Activity

Yesterday

TJones added a comment to T296225: [Search] Bug: opensearch API doesn't default to resolving redirects.

Hang on, this needs a lot more consideration before implementation as it will be actively undesirable, even harmful, in some circumstances. While no user is going to be confused about seeing "A Tale of Two Cities" when searching for "Tale of Two Cities", the same is not going to be true in every case - examples:

  • Diaoyu Islands → Senkaku Islands. Unless you know that one is an alternate name for the other, this is going to be very confusing (and given the sensitivity of place names in disputed territories, there could be other issues too).
Mon, Jan 24, 5:31 PM · Patch-For-Review, MediaWiki-Search, MediaWiki-Interface (autocomplete search), Discovery-Search
TJones updated the task description for T285265: Install extra-analysis-khmer on cloudelastic cluster.
Mon, Jan 24, 4:59 PM · Discovery-Search (Current work)
TJones added a comment to T285265: Install extra-analysis-khmer on cloudelastic cluster.

Sounds like you did what you could in terms of figuring out what happened. It's been a while, so the history is lost. As long as we try to pay attention to the situation in the future we've done what we can.

Mon, Jan 24, 4:59 PM · Discovery-Search (Current work)

Fri, Jan 21

TJones closed T189511: Locally override the name of crh from "Crimean Turkish" to "Crimean Tatar" as Resolved.

The local change has been made and future versions of CLDR should use the correct name

Fri, Jan 21, 11:32 PM · MW-1.37-notes (1.37.0-wmf.11; 2021-06-21), Language codes, Upstream, MW-1.31-release-notes (WMF-deploy-2018-03-13 (1.31.0-wmf.25)), WikimediaMessages, MediaWiki-extensions-CLDR

Mon, Jan 10

TJones added a comment to T189511: Locally override the name of crh from "Crimean Turkish" to "Crimean Tatar".

FYI, it looks like CLDR is finally going to make this change upstream, after only ~4 years.

Mon, Jan 10, 8:04 PM · MW-1.37-notes (1.37.0-wmf.11; 2021-06-21), Language codes, Upstream, MW-1.31-release-notes (WMF-deploy-2018-03-13 (1.31.0-wmf.25)), WikimediaMessages, MediaWiki-extensions-CLDR
TJones updated the task description for T297761: Create a Latin-to-Devanagari transliteration second-chance search for Hindi wikis.
Mon, Jan 10, 4:25 PM · Discovery-Search

Dec 16 2021

TJones updated the task description for T147505: [tracking] CirrusSearch: what is updated during re-indexing.
Dec 16 2021, 7:55 PM · Tracking-Neverending, Epic, Discovery-Search (Current work), Discovery
TJones added a comment to T294257: Reindex Hindi, Irish, Norwegian wikis to enable unpacked versions.

I wonder which other wikis would benefit from a better Latin-to-native transliterator.

Dec 16 2021, 4:00 PM · Discovery-Search (Current work)

Dec 15 2021

TJones moved T294257: Reindex Hindi, Irish, Norwegian wikis to enable unpacked versions from In Progress to Needs review on the Discovery-Search (Current work) board.

@MPhamWMF, move this to "needs reporting" when you are done looking at the summary/write-up.

Dec 15 2021, 11:03 PM · Discovery-Search (Current work)
TJones added a comment to T294257: Reindex Hindi, Irish, Norwegian wikis to enable unpacked versions.

Full write-up on MediaWiki.

Dec 15 2021, 11:02 PM · Discovery-Search (Current work)

Dec 14 2021

TJones added a project to T297761: Create a Latin-to-Devanagari transliteration second-chance search for Hindi wikis: Discovery-Search.
Dec 14 2021, 11:48 PM · Discovery-Search
TJones created T297761: Create a Latin-to-Devanagari transliteration second-chance search for Hindi wikis.
Dec 14 2021, 11:48 PM · Discovery-Search

Dec 8 2021

TJones moved T294257: Reindex Hindi, Irish, Norwegian wikis to enable unpacked versions from Ready for Development to In Progress on the Discovery-Search (Current work) board.
Dec 8 2021, 4:20 PM · Discovery-Search (Current work)

Dec 7 2021

TJones added a comment to T297071: Search query length limit in UI is wrong, short for non-Latin alphabets.

I can also put in 126 4-byte smiley faces (😀😃😄😁😆) or 4-byte CJK characters (丽丸𠄢乁你)

I think you can put 63 of them (tested in Chrome and Firefox); it's just that '😀'.length and '丽'.length return 2.
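The point about `.length` can be checked with a minimal sketch. JavaScript and TypeScript strings report their length in UTF-16 code units, so any character outside the Basic Multilingual Plane counts as 2. The helper names `utf16Length` and `codePointLength` are illustrative only and do not come from the task; the characters are written as escapes so the code points are unambiguous.

```typescript
// Length as reported by String.prototype.length: UTF-16 code units.
function utf16Length(s: string): number {
  return s.length;
}

// Length in Unicode code points: Array.from iterates a string by code point,
// so a surrogate pair is counted once.
function codePointLength(s: string): number {
  return Array.from(s).length;
}

const smiley = "\u{1F600}";  // 😀, outside the Basic Multilingual Plane
const cjkExtB = "\u{20122}"; // 𠄢, CJK Extension B

// 63 smileys are 63 code points but 126 UTF-16 code units.
const sixtyThreeSmileys = smiley.repeat(63);

console.log(utf16Length(smiley), codePointLength(smiley));
console.log(utf16Length(cjkExtB), codePointLength(cjkExtB));
console.log(utf16Length(sixtyThreeSmileys), codePointLength(sixtyThreeSmileys));
```

A counter based on `.length` would therefore show 126 where a code-point counter shows 63 for the same input.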

Dec 7 2021, 1:00 AM · Discovery-Search, MediaWiki-Search

Dec 6 2021

TJones added a comment to T276865: Implement text analysis to support stemming.

The default English analyzer (the inner details of which are here) is probably a good place to start.

Dec 6 2021, 6:37 PM · Toolhub
TJones added a comment to T297071: Search query length limit in UI is wrong, short for non-Latin alphabets.

Some odd data points: I can put in 255 1-byte Latin characters (ABCDE, repeating), but only 127 2-byte Cyrillic characters (АБВГД, repeating). I can also put in 126 4-byte smiley faces (😀😃😄😁😆) or 4-byte CJK characters (丽丸𠄢乁你). However, I can only put in 85 3-byte Devanagari (कखगघङ) or Hangul (가각갂갃간) characters. Smiley faces are not a big concern, but Cyrillic, Devanagari, and Hangul are—something fishy is going on.
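The 255, 127, and 85 counts are what one would expect if the field were capped at 255 UTF-8 bytes rather than 255 characters. That cap is an inference from the numbers in this comment, not something the task confirms, and the function names below are hypothetical. A small sketch of the arithmetic:

```typescript
// Number of bytes a string occupies in UTF-8, computed from code points.
// A lone surrogate is counted as 3 bytes here, the size of U+FFFD.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (const ch of Array.from(s)) {
    const cp = ch.codePointAt(0) as number;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// How many copies of one character fit in a byte budget.
// The 255-byte budget used below is an assumption inferred from the comment.
function maxRepeats(ch: string, budgetBytes: number): number {
  return Math.floor(budgetBytes / utf8ByteLength(ch));
}

const ASSUMED_BUDGET = 255;
console.log(maxRepeats("A", ASSUMED_BUDGET));         // Latin, 1 byte each
console.log(maxRepeats("\u0410", ASSUMED_BUDGET));    // Cyrillic А, 2 bytes each
console.log(maxRepeats("\u0915", ASSUMED_BUDGET));    // Devanagari क, 3 bytes each
console.log(maxRepeats("\uAC00", ASSUMED_BUDGET));    // Hangul 가, 3 bytes each
console.log(maxRepeats("\u{1F600}", ASSUMED_BUDGET)); // 😀, 4 bytes each
```

Under that assumption the 4-byte case comes out to 63, which matches the 126 reported here if the count was taken in UTF-16 code units.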

Dec 6 2021, 6:29 PM · Discovery-Search, MediaWiki-Search

Nov 23 2021

TJones created T296341: Investigate Tibetan Lucene Analyzer.
Nov 23 2021, 8:59 PM · Discovery-Search

Nov 22 2021

TJones added a comment to T258094: Improve Breton language analysis.

@VIGNERON & @Iriep, it would be great if you could take a look at the analysis I've written up so far.

Nov 22 2021, 4:34 PM · Discovery-Search
TJones edited projects for T258094: Improve Breton language analysis, added: Discovery-Search; removed Discovery-Search (Current work).
Nov 22 2021, 4:32 PM · Discovery-Search
TJones triaged T295735: Decide on the future of Cirrus development/integration environment as Medium priority.
Nov 22 2021, 4:07 PM · CirrusSearch, Discovery-Search
TJones raised the priority of T295735: Decide on the future of Cirrus development/integration environment from Medium to Needs Triage.
Nov 22 2021, 4:07 PM · CirrusSearch, Discovery-Search

Nov 15 2021

TJones updated the task description for T295705: Cleanup missing Commons index on Elasticsearch eqiad.
Nov 15 2021, 5:05 PM · Patch-For-Review, Discovery-Search (Current work)
TJones updated the task description for T295705: Cleanup missing Commons index on Elasticsearch eqiad.
Nov 15 2021, 5:04 PM · Patch-For-Review, Discovery-Search (Current work)
TJones updated the task description for T295705: Cleanup missing Commons index on Elasticsearch eqiad.
Nov 15 2021, 5:03 PM · Patch-For-Review, Discovery-Search (Current work)
TJones updated the task description for T295365: Alert when the rate of pages fixed by Saneitizer is too high.
Nov 15 2021, 4:50 PM · Discovery-Search (Current work), CirrusSearch

Nov 8 2021

TJones moved T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Nov 8 2021, 4:47 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones claimed T294067: Install and unpack Bengali analyzer.
Nov 8 2021, 4:14 PM · Discovery-Search (Current work)
TJones moved T294067: Install and unpack Bengali analyzer from Ready for Development to In Progress on the Discovery-Search (Current work) board.
Nov 8 2021, 4:14 PM · Discovery-Search (Current work)

Oct 26 2021

TJones added a comment to T293398: Search term entered without diacritics on Czech Wikipedia does not list expected match.

Thanks, Andre! It seems like a reasonable thing to look for evidence in the query logs and assess the impact on stemming and search results.

Oct 26 2021, 5:14 PM · Discovery-Search, CirrusSearch

Oct 25 2021

TJones removed the point value for T293862: Investigate using jvmquake to limit the time a JVM is unusable due to GC overhead.
Oct 25 2021, 4:04 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
TJones set the point value for T293862: Investigate using jvmquake to limit the time a JVM is unusable due to GC overhead to 5.
Oct 25 2021, 4:03 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
TJones moved T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Oct 25 2021, 3:58 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones set the point value for T294257: Reindex Hindi, Irish, Norwegian wikis to enable unpacked versions to 3.
Oct 25 2021, 3:41 PM · Discovery-Search (Current work)
TJones updated the task description for T294076: Blazegraph and MariaDB contain different sitelinks at Wikidata.
Oct 25 2021, 3:22 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
TJones created T294257: Reindex Hindi, Irish, Norwegian wikis to enable unpacked versions.
Oct 25 2021, 3:01 PM · Discovery-Search (Current work)

Oct 22 2021

TJones added a project to T272606: [EPIC] Unpack all Elasticsearch analyzers: Epic.
Oct 22 2021, 6:55 PM · Epic, Discovery-Search (Current work)
TJones removed projects from T272606: [EPIC] Unpack all Elasticsearch analyzers: MW-1.38-notes (1.38.0-wmf.1; 2021-09-21), Patch-For-Review.
Oct 22 2021, 6:55 PM · Epic, Discovery-Search (Current work)
TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Oct 22 2021, 6:53 PM · Epic, Discovery-Search (Current work)
TJones set the point value for T294147: Unpack Arabic, Latvian, Thai Elasticsearch Analyzers to 5.
Oct 22 2021, 6:52 PM · Discovery-Search (Current work)
TJones created T294147: Unpack Arabic, Latvian, Thai Elasticsearch Analyzers.
Oct 22 2021, 6:52 PM · Discovery-Search (Current work)
TJones added a comment to T293398: Search term entered without diacritics on Czech Wikipedia does not list expected match.

I don't dare to speak on behalf of Czech searchers. :)

Oct 22 2021, 5:34 PM · Discovery-Search, CirrusSearch

Oct 21 2021

TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Oct 21 2021, 9:45 PM · Epic, Discovery-Search (Current work)
TJones created T294067: Install and unpack Bengali analyzer.
Oct 21 2021, 9:43 PM · Discovery-Search (Current work)
TJones added a comment to T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers.

Given that ICU normalization and folding didn't have much effect on Devanagari script characters, does it make sense to lower the priority of unfolding other languages that use the same script? Is this true for other non-Latin scripts as well?

Oct 21 2021, 5:58 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)

Oct 20 2021

TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Oct 20 2021, 10:04 PM · Epic, Discovery-Search (Current work)
TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Oct 20 2021, 9:51 PM · Epic, Discovery-Search (Current work)
TJones moved T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers from In Progress to Needs review on the Discovery-Search (Current work) board.
Oct 20 2021, 9:44 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones added a comment to T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers.

Irish was a bit eventful—

  • It didn't have the usual dotted İ regression because we keep the Irish-specific lowercasing and add ICU normalization (rather than having it replace lowercasing).
  • I ran into a few instances of older orthography, using dotted characters like ḃ, which are written as bh in modern orthography. I added a character filter to do the mapping (ḃ => bh, etc.), and it has a small but positive impact.
  • Because of the way my script counts things, when small groups and large groups merged in Irish (which happened a lot), the large group was often counted as merging into the small group, inflating the merger impact numbers.
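The dotted-consonant mapping in the second bullet can be sketched as a plain string transformation. This is not the character filter from the patch: the comment only names `ḃ => bh, etc.`, so the other eight pairs below are an assumption based on the standard set of lenited consonants in Irish, and `mapDottedConsonants` is a hypothetical name.

```typescript
// Lowercase dotted consonants (older orthography) and their base letters.
// Only the b pair is named in the comment; the rest are assumed.
const DOTTED_PAIRS: Array<[string, string]> = [
  ["\u1E03", "b"], // ḃ
  ["\u010B", "c"], // ċ
  ["\u1E0B", "d"], // ḋ
  ["\u1E1F", "f"], // ḟ
  ["\u0121", "g"], // ġ
  ["\u1E41", "m"], // ṁ
  ["\u1E57", "p"], // ṗ
  ["\u1E61", "s"], // ṡ
  ["\u1E6B", "t"], // ṫ
];

// Build the lookup for both cases: ḃ -> bh and Ḃ -> Bh.
const DOTTED_TO_DIGRAPH = new Map<string, string>();
for (const [dotted, base] of DOTTED_PAIRS) {
  DOTTED_TO_DIGRAPH.set(dotted, base + "h");
  DOTTED_TO_DIGRAPH.set(dotted.toUpperCase(), base.toUpperCase() + "h");
}

// Replace each dotted consonant with consonant + h.
// NFC normalization first, so "b" + combining dot above (U+0307) is handled too.
function mapDottedConsonants(text: string): string {
  return Array.from(text.normalize("NFC"))
    .map((ch) => DOTTED_TO_DIGRAPH.get(ch) ?? ch)
    .join("");
}

console.log(mapDottedConsonants("an \u1E03ean"));
```

Running the mapping before tokenization lets older and modern spellings of the same word produce the same token.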
Oct 20 2021, 9:42 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)

Oct 18 2021

TJones added a comment to T293398: Search term entered without diacritics on Czech Wikipedia does not list expected match.

The usual heuristic is to not fold letters that are part of the alphabet for a given language. For Czech, the list of letters not to fold is: Áá Čč Ďď Éé Ěě Íí Ňň Óó Řř Šš Ťť Úú Ůů Ýý Žž
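A rough sketch of that heuristic: strip diacritics from every character except those in the language's exception list. It stands in for ICU folding with a much simpler "decompose and drop combining marks" step, so it illustrates the idea rather than the production analysis chain. `foldExceptCzech` is a hypothetical name.

```typescript
// Letters of the Czech alphabet that should keep their diacritics
// (the list from the comment above).
const CZECH_KEEP = new Set(
  Array.from("ÁáČčĎďÉéĚěÍíŇňÓóŘřŠšŤťÚúŮůÝýŽž".normalize("NFC"))
);

// Fold diacritics on everything except the Czech letters.
// Simplification: "folding" here means NFD decomposition followed by
// removal of combining marks in U+0300..U+036F; real ICU folding does more.
function foldExceptCzech(text: string): string {
  return Array.from(text.normalize("NFC"))
    .map((ch) =>
      CZECH_KEEP.has(ch)
        ? ch
        : ch.normalize("NFD").replace(/[\u0300-\u036f]/g, "")
    )
    .join("");
}

console.log(foldExceptCzech("Dvořák")); // Czech letters are left alone
console.log(foldExceptCzech("Gödel"));  // ö is not a Czech letter, so it folds
```

With this approach "foreign" diacritics are folded while a native letter such as ř still differs from r.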

Oct 18 2021, 7:56 PM · Discovery-Search, CirrusSearch
TJones added a comment to T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions.

@MPhamWMF, if you got what you want/need from the report, this is ready to be moved to "Needs Reporting"

Oct 18 2021, 4:02 PM · Discovery-Search (Current work)

Oct 13 2021

TJones updated TJones.
Oct 13 2021, 7:53 PM

Oct 4 2021

TJones updated subscribers of T292449: Revisit approach to automated bot detection.

Expanding on what @EBernhardson said over on Slack, we have implemented a very simple heuristic which, following Erik, we can call SillyBotDetection. (Whether the bots or the detection is silly is left as an exercise for the reader.) We have been primarily focused on generating user search query corpora for training or testing, though Erik may have done some additional bot detection for the clickstream data for training.

Oct 4 2021, 5:16 PM · Research

Sep 23 2021

TJones moved T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions from In Progress to Needs review on the Discovery-Search (Current work) board.

The Czech and Finnish Wikipedia samples showed clear but rather muted impact on user query results. The Galician results are a little more robust and show a more consistent pattern of searchers not using standard accents (rather than just problems with "foreign" diacritics).

Sep 23 2021, 9:20 PM · Discovery-Search (Current work)

Sep 22 2021

TJones added a comment to T290640: Undefined variable: wgFileBlacklist.

^ Should have only been on "private wikis", but is fixed too

Sep 22 2021, 6:38 PM · Patch-For-Review, MediaWiki-Uploading, Beta-Cluster-reproducible
TJones added a comment to T290640: Undefined variable: wgFileBlacklist.

I'm getting similar errors for $wgMimeTypeBlacklist when I reindex certain wikis:

Sep 22 2021, 4:08 PM · Patch-For-Review, MediaWiki-Uploading, Beta-Cluster-reproducible

Sep 17 2021

TJones moved T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions from Waiting to In Progress on the Discovery-Search (Current work) board.
Sep 17 2021, 6:43 PM · Discovery-Search (Current work)

Sep 16 2021

TJones moved T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions from In Progress to Waiting on the Discovery-Search (Current work) board.

The train moved backwards, so it's not time to reindex.

Sep 16 2021, 6:23 PM · Discovery-Search (Current work)

Sep 15 2021

TJones claimed T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions.
Sep 15 2021, 9:39 PM · Discovery-Search (Current work)
TJones moved T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions from Ready for Development to In Progress on the Discovery-Search (Current work) board.
Sep 15 2021, 9:38 PM · Discovery-Search (Current work)

Sep 13 2021

TJones moved T284578: Unpack Czech, Finnish, Galician Elasticsearch Analyzers from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Sep 13 2021, 3:52 PM · MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Discovery-Search (Current work)
TJones updated the task description for T290604: Create alerts for GC death spiral.
Sep 13 2021, 3:47 PM · Discovery-Search (Current work), CirrusSearch

Aug 31 2021

TJones updated the task description for T147505: [tracking] CirrusSearch: what is updated during re-indexing.
Aug 31 2021, 5:58 PM · Tracking-Neverending, Epic, Discovery-Search (Current work), Discovery
TJones placed T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions up for grabs.
Aug 31 2021, 2:35 PM · Discovery-Search (Current work)
TJones created T290079: Reindex Czech, Finnish, Galician wikis to enable unpacked versions.
Aug 31 2021, 2:34 PM · Discovery-Search (Current work)
TJones moved T284578: Unpack Czech, Finnish, Galician Elasticsearch Analyzers from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Aug 31 2021, 2:29 PM · MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Discovery-Search (Current work)

Aug 30 2021

TJones merged task T289646: Missing message "apihelp-cirrus-config-dump-param-prop" into T285574: apihelp-cirrus-config-dump-param-prop needs creating.
Aug 30 2021, 3:34 PM · Discovery-Search (Current work), CirrusSearch
TJones merged T289646: Missing message "apihelp-cirrus-config-dump-param-prop" into T285574: apihelp-cirrus-config-dump-param-prop needs creating.
Aug 30 2021, 3:34 PM · MW-1.38-notes (1.38.0-wmf.9; 2021-11-16), Discovery-Search (Current work), I18n, CirrusSearch

Aug 24 2021

TJones moved T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers from Ready for Development to In Progress on the Discovery-Search (Current work) board.
Aug 24 2021, 8:51 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Aug 24 2021, 7:16 PM · Epic, Discovery-Search (Current work)
TJones moved T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers from Incoming to Ready for Development on the Discovery-Search (Current work) board.
Aug 24 2021, 7:00 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones triaged T284578: Unpack Czech, Finnish, Galician Elasticsearch Analyzers as High priority.
Aug 24 2021, 6:59 PM · MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Discovery-Search (Current work)
TJones triaged T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers as High priority.
Aug 24 2021, 6:59 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Aug 24 2021, 6:58 PM · Epic, Discovery-Search (Current work)
TJones updated the task description for T272606: [EPIC] Unpack all Elasticsearch analyzers.
Aug 24 2021, 6:58 PM · Epic, Discovery-Search (Current work)
TJones created T289612: Unpack Hindi, Irish, Norwegian Elasticsearch Analyzers.
Aug 24 2021, 6:58 PM · MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Discovery-Search (Current work)
TJones moved T284578: Unpack Czech, Finnish, Galician Elasticsearch Analyzers from In Progress to Needs review on the Discovery-Search (Current work) board.

No problems or anything terribly unusual to report. Full details on MediaWiki.

Aug 24 2021, 5:56 PM · MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Discovery-Search (Current work)

Aug 16 2021

TJones updated the task description for T288230: Promote MediaInfo RDF format to stable.
Aug 16 2021, 3:38 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Aug 3 2021

TJones moved T280184: Enable reindexing the Commons "File" index in Cloudelastic by default from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Aug 3 2021, 5:57 PM · MW-1.37-notes (1.37.0-wmf.11; 2021-06-21), Discovery-Search (Current work)
TJones moved T283366: Unpack Basque, Catalan, Danish Elasticsearch Analyzers from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board.
Aug 3 2021, 5:56 PM · Discovery-Search (Current work)
TJones moved T284691: Reindex Basque, Catalan, Danish wikis to enable unpacked versions from In Progress to Needs review on the Discovery-Search (Current work) board.

Full notes on MediaWiki.

Aug 3 2021, 5:56 PM · Discovery-Search (Current work)

Aug 2 2021

TJones renamed T284691: Reindex Basque, Catalan, Danish wikis to enable unpacked versions from Reindex Basque, Catalan, Danish Wikis to Reindex Basque, Catalan, Danish wikis to enable unpacked versions.
Aug 2 2021, 8:04 PM · Discovery-Search (Current work)
TJones moved T284691: Reindex Basque, Catalan, Danish wikis to enable unpacked versions from Waiting to In Progress on the Discovery-Search (Current work) board.
Aug 2 2021, 7:53 PM · Discovery-Search (Current work)
TJones updated the task description for T287468: Manage the #wikimedia-search channel with ircservserv.
Aug 2 2021, 3:36 PM · Discovery-Search (Current work)

Jul 29 2021

TJones updated subscribers of T287718: Look into improving Dagbani Search.
Jul 29 2021, 10:37 PM · Dagbani-Sites, Discovery-Search
TJones updated subscribers of T287718: Look into improving Dagbani Search.
Jul 29 2021, 10:35 PM · Dagbani-Sites, Discovery-Search
TJones added a project to T287719: Look into improving Igbo Search: Igbo-Wikimedians-User-Group.
Jul 29 2021, 10:34 PM · Igbo-Wikimedians-User-Group, Discovery-Search
TJones added a comment to T287719: Look into improving Igbo Search.

Stop Words
I put together a list of words from Igbo Wikipedia, sorted by frequency.

Jul 29 2021, 10:28 PM · Igbo-Wikimedians-User-Group, Discovery-Search
TJones triaged T287719: Look into improving Igbo Search as Medium priority.
Jul 29 2021, 10:03 PM · Igbo-Wikimedians-User-Group, Discovery-Search
TJones created T287719: Look into improving Igbo Search.
Jul 29 2021, 10:03 PM · Igbo-Wikimedians-User-Group, Discovery-Search
TJones updated the task description for T287718: Look into improving Dagbani Search.
Jul 29 2021, 10:01 PM · Dagbani-Sites, Discovery-Search
TJones claimed T287718: Look into improving Dagbani Search.
Jul 29 2021, 10:01 PM · Dagbani-Sites, Discovery-Search
TJones triaged T287718: Look into improving Dagbani Search as Medium priority.
Jul 29 2021, 9:54 PM · Dagbani-Sites, Discovery-Search
TJones moved T287718: Look into improving Dagbani Search from needs triage to Language Stuff on the Discovery-Search board.
Jul 29 2021, 9:54 PM · Dagbani-Sites, Discovery-Search
TJones added a comment to T287718: Look into improving Dagbani Search.

Orthography
After a quick look, I think the characters ɛ, ɣ, ŋ, ɔ, ʒ are being processed correctly, which is good.

Jul 29 2021, 9:53 PM · Dagbani-Sites, Discovery-Search
TJones created T287718: Look into improving Dagbani Search.
Jul 29 2021, 9:51 PM · Dagbani-Sites, Discovery-Search

Jul 27 2021

TJones moved T284691: Reindex Basque, Catalan, Danish wikis to enable unpacked versions from Blocked (from outside the team) to Waiting on the Discovery-Search (Current work) board.

This was blocked by the removal of my directories and files on mwmaint1002 after the data center switchover. My files have been restored, but David needs to reindex over 800 wikis for the ores_articletopicsweighted_tags rename. David's reindex will cover all of the wikis relevant to this ticket except dkwikimedia, which I just reindexed by itself, since it only has ~600 documents.

Jul 27 2021, 4:39 PM · Discovery-Search (Current work)
TJones updated the task description for T147505: [tracking] CirrusSearch: what is updated during re-indexing.
Jul 27 2021, 4:29 PM · Tracking-Neverending, Epic, Discovery-Search (Current work), Discovery
TJones added a comment to T287304: Restore ~tjones/reindex directory from mwmaint1002.

Thanks, @jcrespo! It looks like everything I need is there.

Jul 27 2021, 2:35 PM · bacula, SRE, Data-Persistence-Backup

Jul 26 2021

TJones added a comment to T287304: Restore ~tjones/reindex directory from mwmaint1002.

Any idea when someone might have time to look at this?

Jul 26 2021, 10:48 PM · bacula, SRE, Data-Persistence-Backup
TJones renamed T287231: Consider moving WDQS "munging" of RDF into Wikibase RDF output code from Consider moving WDQS "munging" of RDF into WIkibase RDF output code to Consider moving WDQS "munging" of RDF into Wikibase RDF output code.
Jul 26 2021, 3:25 PM · Wikidata, Wikidata-Query-Service, wdwb-tech