EBernhardson (EBernhardson)
User

User Details

User Since
Oct 7 2014, 4:49 PM (210 w, 1 d)
Availability
Available
LDAP User
EBernhardson
MediaWiki User
EBernhardson (WMF) [ Global Accounts ]

Recent Activity

Tue, Oct 16

EBernhardson added a comment to T205348: Calculate autocomplete examination probabilities from eventlogging data.

We have a week's worth of autocomplete data for wikidata, so I took a stab at this. It's only in a Python notebook on SWAP for now, but I will hopefully clean it up into something. Currently it generates daily extracts from merged cirrus + eventlogging logs and writes them to ebernhardson.wikibasecompletionclicks in hive. I've then run the click/skip counts over them and generated the following examination probabilities:

Tue, Oct 16, 3:38 AM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)

Mon, Oct 15

EBernhardson created P7682 (An Untitled Masterwork).
Mon, Oct 15, 8:19 PM
EBernhardson created T207076: All patches to mediawiki/vendor fail CI due to new package.json requirement.
Mon, Oct 15, 7:28 PM · Continuous-Integration-Config

Fri, Oct 12

EBernhardson created P7676 (An Untitled Masterwork).
Fri, Oct 12, 11:31 PM
EBernhardson moved T205958: Wikibase\Repo\Search\Elastic\Tests\EntitySearchElasticFulltextTest::testSearchElastic fails on PHP 7.1 from Backlog to Needs review on the Discovery-Search (Current work) board.
Fri, Oct 12, 7:02 PM · Discovery-Search (Current work), User-Addshore, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), User-Ladsgroup, Patch-For-Review, Wikidata-Campsite-Iteration-∞, Wikidata-Campsite, wikidata-tech-focus, User-Smalyshev, Elasticsearch, PHP 7.1 support, MediaWiki-extensions-WikibaseRepository, Wikidata
EBernhardson moved T205958: Wikibase\Repo\Search\Elastic\Tests\EntitySearchElasticFulltextTest::testSearchElastic fails on PHP 7.1 from Needs triage to Current work on the Discovery-Search board.
Fri, Oct 12, 7:02 PM · Discovery-Search (Current work), User-Addshore, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), User-Ladsgroup, Patch-For-Review, Wikidata-Campsite-Iteration-∞, Wikidata-Campsite, wikidata-tech-focus, User-Smalyshev, Elasticsearch, PHP 7.1 support, MediaWiki-extensions-WikibaseRepository, Wikidata
EBernhardson moved T206042: Job cirrusSearchIncomingLinkCount failures "Read timeout is reached" from Title::getFirstRevision from Backlog to Needs review on the Discovery-Search (Current work) board.
Fri, Oct 12, 1:17 AM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Discovery-Search (Current work), CirrusSearch, Wikimedia-production-error
EBernhardson added a comment to T206042: Job cirrusSearchIncomingLinkCount failures "Read timeout is reached" from Title::getFirstRevision.

The attached patch doesn't really address the goal of this task, which seems to be identifying what changed in the Sept 13 deploy to alter the behaviour of a previously working query, but it does simplify the query used, so we might stop seeing this error.

Fri, Oct 12, 1:10 AM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Discovery-Search (Current work), CirrusSearch, Wikimedia-production-error

Thu, Oct 11

EBernhardson added a comment to T205558: Sister search / Cross-language search interaction with multicluster.

Proposed deployment process:

Thu, Oct 11, 11:58 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T205558: Sister search / Cross-language search interaction with multicluster.

Problems outlined in the description are detailed in docs/multi_cluster.txt in the patch. As far as I'm aware, this patch provides all of the machinery necessary in CirrusSearch to deploy multi cluster.

Thu, Oct 11, 11:24 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson claimed T205991: "Select options below to be global" missing a checkbox.
Thu, Oct 11, 11:23 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-10-02 (1.32.0-wmf.24)), CirrusSearch, MediaWiki-extensions-GlobalPreferences, Community-Tech
EBernhardson moved T205991: "Select options below to be global" missing a checkbox from In progress to Needs review on the Discovery-Search (Current work) board.
Thu, Oct 11, 11:22 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-10-02 (1.32.0-wmf.24)), CirrusSearch, MediaWiki-extensions-GlobalPreferences, Community-Tech
EBernhardson edited P7667 Vagrant CirrusSearch multi cluster sister search conf.
Thu, Oct 11, 9:47 PM
EBernhardson moved T205991: "Select options below to be global" missing a checkbox from Backlog to In progress on the Discovery-Search (Current work) board.
Thu, Oct 11, 6:16 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-10-02 (1.32.0-wmf.24)), CirrusSearch, MediaWiki-extensions-GlobalPreferences, Community-Tech
EBernhardson assigned T198352: Setup two elasticsearch clusters on relforge to test multi-instance to Gehel.
Thu, Oct 11, 6:16 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson edited P7667 Vagrant CirrusSearch multi cluster sister search conf.
Thu, Oct 11, 4:51 PM
EBernhardson created P7667 Vagrant CirrusSearch multi cluster sister search conf.
Thu, Oct 11, 4:21 PM
EBernhardson edited P7645 Re-run failed quibble test locally.
Thu, Oct 11, 3:10 AM
EBernhardson moved T205348: Calculate autocomplete examination probabilities from eventlogging data from In progress to Waiting/Blocked on the Discovery-Search (Current work) board.
Thu, Oct 11, 3:08 AM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T205558: Sister search / Cross-language search interaction with multicluster from In progress to Needs review on the Discovery-Search (Current work) board.
Thu, Oct 11, 3:08 AM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)

Wed, Oct 10

EBernhardson added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

In a quick test:

Wed, Oct 10, 10:50 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

Are you evaluating MPC MRR based on re-ordering results "optimally" and scoring, which is indeed overfitted, or are you sorting results based on some data and evaluating on other data? My guess is that it would still do very well, because the most popular thing is going to be popular, but it could also be strongly overfitting on a longer tail that boosts the score a little here and a little there. It could also give a big boost to unique queries, which would always score perfectly, since there is no room for disagreement—and that long tail could make a big difference.
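
To make the distinction concrete, here is a minimal sketch (not the relforge code) that fits an MPC ranking on one slice of clicks and then scores it both on that same slice and on a held-out slice. The function names and the toy click data are made up for illustration.

```python
from collections import Counter, defaultdict


def fit_mpc(clicks):
    """Build prefix -> items ordered by click count (most popular first)."""
    counts = defaultdict(Counter)
    for prefix, item in clicks:
        counts[prefix][item] += 1
    return {p: [item for item, _ in c.most_common()] for p, c in counts.items()}


def evaluate_mrr(model, clicks):
    """Mean reciprocal rank with 1-based ranks; unseen items score 0."""
    scores = []
    for prefix, item in clicks:
        results = model.get(prefix, [])
        scores.append(1.0 / (results.index(item) + 1) if item in results else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical (prefix, clicked item) pairs, split by time.
train = [("be", "Berlin"), ("be", "Berlin"), ("be", "Belgium"), ("xyz", "Xyzzy")]
test = [("be", "Belgium"), ("be", "Berlin"), ("qrs", "Qrs item")]

model = fit_mpc(train)
in_sample = evaluate_mrr(model, train)  # scored on the data it was fit on
held_out = evaluate_mrr(model, test)    # scored on data it has not seen
print(in_sample, held_out)
```

In this toy data the in-sample MRR is 0.875 and the held-out MRR is 0.5. The one-off prefix "xyz" scores perfectly in-sample, while the unseen prefix "qrs" contributes nothing on the held-out slice.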

Wed, Oct 10, 10:21 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T205494: Add autocomplete evaluation via MRR to relforge from Needs review to Done on the Discovery-Search (Current work) board.
Wed, Oct 10, 10:03 PM · Patch-For-Review, Discovery-Search (Current work)

Tue, Oct 9

EBernhardson moved T206232: Repair search satisfaction browser tests from Needs review to Done on the Discovery-Search (Current work) board.
Tue, Oct 9, 5:28 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)

Fri, Oct 5

EBernhardson created P7645 Re-run failed quibble test locally.
Fri, Oct 5, 11:05 PM
EBernhardson added a comment to T206352: Extract judgment data for search indexing.

This will allow indexing the data; how that data is then used depends on the use case. The most direct method is to implement a full text search keyword via the CirrusSearchAddQueryFeatures hook.

Fri, Oct 5, 6:40 PM · Discovery-Search, Scoring-platform-team, Elasticsearch, JADE

Thu, Oct 4

EBernhardson moved T206229: Port WikimediaEvents browser tests to nodejs from Needs triage to Tech Debt/Misc on the Discovery-Search board.
Thu, Oct 4, 5:21 PM · Discovery-Search, Browser-Tests
zeljkofilipin awarded T206229: Port WikimediaEvents browser tests to nodejs a Cookie token.
Thu, Oct 4, 4:31 PM · Discovery-Search, Browser-Tests
EBernhardson added a comment to T206232: Repair search satisfaction browser tests.

Relatedly, these should move to the standardized nodejs infra: T206229

Thu, Oct 4, 3:34 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T206232: Repair search satisfaction browser tests from Backlog to Needs review on the Discovery-Search (Current work) board.
Thu, Oct 4, 3:32 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson claimed T206232: Repair search satisfaction browser tests.
Thu, Oct 4, 3:31 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson created T206232: Repair search satisfaction browser tests.
Thu, Oct 4, 3:31 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson triaged T206229: Port WikimediaEvents browser tests to nodejs as Normal priority.
Thu, Oct 4, 3:28 PM · Discovery-Search, Browser-Tests
EBernhardson created T206229: Port WikimediaEvents browser tests to nodejs.
Thu, Oct 4, 3:28 PM · Discovery-Search, Browser-Tests

Wed, Oct 3

EBernhardson added a comment to T189242: [Bug] Image clicks are ignored by searchSatisfaction.

Per the tags from ReleaseTaggerBot this should have been deployed with wmf.22, and we are now on wmf.23 with .24 rolling out. So this should be deployed.

Wed, Oct 3, 7:46 PM · MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Patch-For-Review, Discovery-Search (Current work)

Tue, Oct 2

EBernhardson added a comment to T206042: Job cirrusSearchIncomingLinkCount failures "Read timeout is reached" from Title::getFirstRevision.

We started calling Title::getFirstRevision (via WikiPage::getOldestRevision) in https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/CirrusSearch/+/433986/3/includes/Updater.php as part of T195071. Nothing should be new to this code path on the CirrusSearch side, since that patch was deployed in late May.

Tue, Oct 2, 10:09 PM · MW-1.33-notes (1.33.0-wmf.1; 2018-10-23), Patch-For-Review, Discovery-Search (Current work), CirrusSearch, Wikimedia-production-error
EBernhardson closed T201479: Submitting multiple mjolnir patches receive -1 with OSError: [Errno 12] Cannot allocate memory as Resolved.

mjolnir tox jobs look to be running on m4executor; not sure what changed, but this doesn't seem to be a problem anymore.

Tue, Oct 2, 9:19 PM · Continuous-Integration-Config, Patch-For-Review
EBernhardson moved T205558: Sister search / Cross-language search interaction with multicluster from Backlog to In progress on the Discovery-Search (Current work) board.
Tue, Oct 2, 5:42 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson claimed T205558: Sister search / Cross-language search interaction with multicluster.
Tue, Oct 2, 5:42 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson edited projects for T177774: Refactor Elastic TTM Server implementation to allow experimenting new queries without breaking production usage , added: Discovery-Search; removed Discovery-Search (Current work).
Tue, Oct 2, 5:41 PM · Discovery-Search, Patch-For-Review, Discovery, Elasticsearch, MediaWiki-extensions-Translate
EBernhardson edited projects for T101236: TTMServer performance and coverage issues, added: Discovery-Search; removed Discovery-Search (Current work).
Tue, Oct 2, 5:41 PM · Discovery-Search, Epic, Language-Engineering April-June 2016, User-notice, Discovery, Elasticsearch, MediaWiki-extensions-Translate
EBernhardson moved T196032: Huge messages in eqiad.mediawiki.job.cirrusSearchElasticaWrite (and other?) topics from Waiting/Blocked to Done on the Discovery-Search (Current work) board.
Tue, Oct 2, 5:40 PM · Discovery-Search (Current work), EventBus, MediaWiki-JobQueue, Services (designing), Analytics
EBernhardson edited P7616 (An Untitled Masterwork).
Tue, Oct 2, 4:15 PM
EBernhardson created P7616 (An Untitled Masterwork).
Tue, Oct 2, 4:13 PM
TJones awarded T205656: Convert relforge to a config format that supports nested structures a Like token.
Tue, Oct 2, 3:10 PM · Discovery-Search

Mon, Oct 1

EBernhardson removed a project from T205558: Sister search / Cross-language search interaction with multicluster: Epic.
Mon, Oct 1, 9:44 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T205558: Sister search / Cross-language search interaction with multicluster from Up Next to Current work on the Discovery-Search board.
Mon, Oct 1, 9:44 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson updated the task description for T205558: Sister search / Cross-language search interaction with multicluster.
Mon, Oct 1, 9:44 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson updated the task description for T205558: Sister search / Cross-language search interaction with multicluster.
Mon, Oct 1, 9:42 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T205558: Sister search / Cross-language search interaction with multicluster.

Search config needs to know the name of the cluster to connect to. Currently it only has the local wiki config and not the config of the wiki being searched.

Mon, Oct 1, 9:41 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T193701: Explore using user clicks data to tune Wikidata search parameters.

Work has started in T205111 to collect wikidata autocomplete click data and use the click data to perform offline evaluation of a proposed autocomplete ranker. The ability to evaluate the relative quality of multiple rankers is an essential first step to being able to tune the ranker.

Mon, Oct 1, 9:08 PM · Epic, CirrusSearch, Wikidata, Discovery, Discovery-Search
EBernhardson removed a project from T204135: Warn when CirrusSearch is not configured to use local DC for an extended time: Patch-For-Review.
Mon, Oct 1, 8:59 PM · Discovery-Search (Current work), Datacenter-Switchover-2018, Operations
EBernhardson added a comment to T204776: Investigate brief CirrusSearch outage (MW exception spike for api.php).

I think for the purposes of this ticket we can call it complete. There isn't a whole lot that can be done about the network issues from the mediawiki side besides the already merged patch to fail gracefully. From the elasticsearch side the oversized impact (dropping requests for ~2 minutes) of this is expected to be mitigated by ongoing work to reduce the number of shards per cluster.

Mon, Oct 1, 8:58 PM · Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), CirrusSearch, Wikimedia-Incident, Wikimedia-production-error
EBernhardson moved T204776: Investigate brief CirrusSearch outage (MW exception spike for api.php) from Backlog to Done on the Discovery-Search (Current work) board.
Mon, Oct 1, 8:56 PM · Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), CirrusSearch, Wikimedia-Incident, Wikimedia-production-error
EBernhardson moved T198351: Refactor puppet to support multiple elasticsearch instances on same node from Needs review to Done on the Discovery-Search (Current work) board.
Mon, Oct 1, 8:50 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T195389: Text content of wiki page in search index can merge words making them unfindable. from Needs review to Done on the Discovery-Search (Current work) board.
Mon, Oct 1, 6:38 PM · MW-1.32-notes (WMF-deploy-2018-10-02 (1.32.0-wmf.24)), Patch-For-Review, Discovery-Search (Current work), Discovery, CirrusSearch
EBernhardson moved T205494: Add autocomplete evaluation via MRR to relforge from Backlog to Needs review on the Discovery-Search (Current work) board.
Mon, Oct 1, 5:21 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T205348: Calculate autocomplete examination probabilities from eventlogging data from Needs review to In progress on the Discovery-Search (Current work) board.
Mon, Oct 1, 5:21 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson moved T205348: Calculate autocomplete examination probabilities from eventlogging data from In progress to Needs review on the Discovery-Search (Current work) board.
Mon, Oct 1, 5:21 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

After re-reviewing my code this morning I found a bug where all of the ranks were off by one (so the first position scored 1/2 instead of 1/1). Re-running gives much higher numbers, but in relative terms things are pretty similar.
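
For reference, a minimal sketch of MRR with 1-based ranks. This is not the code from the patch; the function names and sample data are hypothetical.

```python
from typing import Hashable, Iterable, Sequence, Tuple


def reciprocal_rank(results: Sequence[Hashable], clicked: Hashable) -> float:
    """Return 1/rank of the clicked item using 1-based ranks, or 0.0 if absent."""
    for index, item in enumerate(results):
        if item == clicked:
            # enumerate() counts from 0, so the rank is index + 1.
            return 1.0 / (index + 1)
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[Hashable], Hashable]]
) -> float:
    """Average the reciprocal rank over (results, clicked) pairs."""
    pairs = list(queries)
    if not pairs:
        return 0.0
    return sum(reciprocal_rank(r, c) for r, c in pairs) / len(pairs)


sample = [
    (["Q1", "Q2", "Q3"], "Q1"),  # rank 1 -> 1.0
    (["Q1", "Q2", "Q3"], "Q2"),  # rank 2 -> 0.5
    (["Q1", "Q2", "Q3"], "Q9"),  # never shown -> 0.0
]
print(mean_reciprocal_rank(sample))  # 0.5
```

Dividing by index + 2 instead of index + 1 reproduces the kind of off-by-one described here: every score shrinks, but the relative ordering of rankers mostly survives.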

Mon, Oct 1, 5:09 PM · Patch-For-Review, Discovery-Search (Current work)

Fri, Sep 28

EBernhardson added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

Thinking about this, MPC MRR might be considered the optimal ordering if we don't have the ability to use additional context per query. If my line of thinking is correct MPC should be the maximum possible MRR on this click dataset if we have to return the same result set for the same prefix every time. MPC could be significantly improved on if short prefixes could vary their results based on some sort of context clues.
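
A small sketch of that argument, under the stated constraint that a prefix always returns the same result list. It builds the MPC ordering from click counts and compares its MRR against every possible fixed ordering for one prefix. The click data and function names are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def mpc_ranking(clicks):
    """Map each prefix to its clicked items, most-clicked first."""
    counts = defaultdict(Counter)
    for prefix, item in clicks:
        counts[prefix][item] += 1
    return {p: [item for item, _ in c.most_common()] for p, c in counts.items()}


def mrr(ranking, clicks):
    """Mean reciprocal rank of the clicked item, with 1-based ranks."""
    scores = []
    for prefix, item in clicks:
        results = ranking.get(prefix, [])
        scores.append(1.0 / (results.index(item) + 1) if item in results else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical click log for a single prefix.
clicks = [("be", "Berlin")] * 3 + [("be", "Belgium")] * 2 + [("be", "Beijing")]
mpc_score = mrr(mpc_ranking(clicks), clicks)

# Score every fixed ordering of the same three items for that prefix.
best_fixed = max(
    mrr({"be": list(order)}, clicks)
    for order in permutations(["Berlin", "Belgium", "Beijing"])
)
print(round(mpc_score, 4), round(best_fixed, 4))
```

On this data the best fixed ordering is the MPC ordering itself, so the two scores are equal. Beating it would require varying the results for the same prefix, which is the context-dependent case mentioned above.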

Fri, Sep 28, 10:12 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T205494: Add autocomplete evaluation via MRR to relforge.

I have a working example of this now, will need some cleanup and test cases written before uploading to gerrit.

Fri, Sep 28, 9:44 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson updated the task description for T205746: Cleanup wikidata autocomplete logs.
Fri, Sep 28, 8:59 PM · User-Smalyshev, Discovery-Search (Current work)
EBernhardson triaged T205746: Cleanup wikidata autocomplete logs as Normal priority.
Fri, Sep 28, 8:59 PM · User-Smalyshev, Discovery-Search (Current work)

Thu, Sep 27

EBernhardson triaged T205660: Search Satisfaction eventlogging doesn't include input location with click events as Low priority.
Thu, Sep 27, 7:54 PM · Discovery-Search
EBernhardson moved T205660: Search Satisfaction eventlogging doesn't include input location with click events from Needs triage to Tech Debt/Misc on the Discovery-Search board.
Thu, Sep 27, 7:54 PM · Discovery-Search
EBernhardson created T205660: Search Satisfaction eventlogging doesn't include input location with click events.
Thu, Sep 27, 7:53 PM · Discovery-Search
EBernhardson updated the task description for T205656: Convert relforge to a config format that supports nested structures.
Thu, Sep 27, 7:01 PM · Discovery-Search
EBernhardson triaged T205656: Convert relforge to a config format that supports nested structures as Normal priority.
Thu, Sep 27, 7:00 PM · Discovery-Search
EBernhardson moved T205656: Convert relforge to a config format that supports nested structures from Needs triage to Tech Debt/Misc on the Discovery-Search board.
Thu, Sep 27, 7:00 PM · Discovery-Search
EBernhardson created T205656: Convert relforge to a config format that supports nested structures.
Thu, Sep 27, 7:00 PM · Discovery-Search
EBernhardson added a comment to T23139: Option to sort search results by size, number of words and date in advanced.

See the sort parameter to the search api, added in the last few months: https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=qqq&srinterwiki=1&srsort=relevance
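
As a rough illustration, the same request can be built outside the sandbox. The sketch below only constructs the URL from the parameters in the link above (it makes no network call); the helper name is hypothetical, and only the "relevance" sort value shown in that link is used.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(query: str, sort: str = "relevance") -> str:
    """Build a list=search API URL with an explicit srsort value."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": query,
        "srinterwiki": 1,
        "srsort": sort,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_search_url("qqq"))
```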

Thu, Sep 27, 5:20 PM · Discovery-Search, CirrusSearch, MediaWiki-Search
EBernhardson renamed T62976: Prefix Search: Would be nice if search engine could highlight the result rather than js from Prefix Search: Would be nice if php could highlight the result rather than js to Prefix Search: Would be nice if search engine could highlight the result rather than js.
Thu, Sep 27, 5:17 PM · Discovery-Search, CirrusSearch, JavaScript, MediaWiki-Search
EBernhardson closed T88891: PHP Warning: Search backend error during regex search for 'insource:/~~~~/' after 128. Regex syntax error: expected ')' at position 10] as Resolved.

insource:/\~\~\~\~/ works as expected

Thu, Sep 27, 5:10 PM · Discovery-Search, CirrusSearch
EBernhardson closed T88891: PHP Warning: Search backend error during regex search for 'insource:/~~~~/' after 128. Regex syntax error: expected ')' at position 10], a subtask of T41480: Issues affecting translatewiki.net, as Resolved.
Thu, Sep 27, 5:10 PM · Tracking, MediaWiki-General-or-Unknown
EBernhardson added a comment to T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.

Another useful reference, this follows the development of autocomplete from MPC to ~2016: https://www.slideshare.net/YichenFeng1/tutorial-on-query-autocompletion

Thu, Sep 27, 3:34 PM · Discovery-Search (Current work), Epic

Wed, Sep 26

EBernhardson updated subscribers of T205558: Sister search / Cross-language search interaction with multicluster.
Wed, Sep 26, 4:51 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson created T205558: Sister search / Cross-language search interaction with multicluster.
Wed, Sep 26, 4:51 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson created T205552: Create vagrant role for testing multi-cluster elasticsearch (crosswiki/language).
Wed, Sep 26, 4:14 PM · Discovery-Search

Tue, Sep 25

EBernhardson created T205494: Add autocomplete evaluation via MRR to relforge.
Tue, Sep 25, 9:40 PM · Patch-For-Review, Discovery-Search (Current work)
EBernhardson added a comment to T204776: Investigate brief CirrusSearch outage (MW exception spike for api.php).

If the master does not receive acknowledgement from at least discovery.zen.minimum_master_nodes nodes within a certain time (controlled by the discovery.zen.commit_timeout setting and defaults to 30 seconds) the cluster state change is rejected

The part about minimum_master_nodes is slightly confusing. Our discovery.zen.minimum_master_nodes is set to 2, but it isn't clear whether 2 master-capable nodes must ack the state update or whether any 2 nodes could ack it. Additionally, the logstash messages only mention a timeout and not that the update was rejected.

Tue, Sep 25, 9:28 PM · Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), CirrusSearch, Wikimedia-Incident, Wikimedia-production-error
EBernhardson moved T202339: Evaluate adding an image quality score to media search result ranking from Needs review to Done on the Discovery-Search (Current work) board.
Tue, Sep 25, 9:21 PM · Discovery-Search (Current work)
EBernhardson moved T205348: Calculate autocomplete examination probabilities from eventlogging data from Backlog to In progress on the Discovery-Search (Current work) board.
Tue, Sep 25, 5:48 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson claimed T205348: Calculate autocomplete examination probabilities from eventlogging data.
Tue, Sep 25, 5:48 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)

Mon, Sep 24

EBernhardson added a comment to T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.

In addition to this, see T205348 for updating eventlogging to collect enough information to calculate examination probabilities.

Mon, Sep 24, 10:15 PM · Discovery-Search (Current work), Epic
EBernhardson added a comment to T205348: Calculate autocomplete examination probabilities from eventlogging data.

We need to update wikidata and search satisfaction logging to do one of the following:

  1. Record the searchToken for each result set displayed
  2. Record the list of pages for each result set displayed. This would probably necessitate recording an event for each displayed search.
  3. Keep an in-memory history of autocomplete results. When an item is selected look back through the history for all the possible prefixes and record what prefixes and what position held the item finally selected
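
Option 3 could look roughly like the sketch below. The real client-side logging would live in JavaScript; this Python version only illustrates the bookkeeping, and the class and method names are hypothetical.

```python
from typing import List, Tuple


class AutocompleteHistory:
    """Remember the result list shown for each prefix in one typing session."""

    def __init__(self) -> None:
        self._shown: List[Tuple[str, List[str]]] = []

    def record(self, prefix: str, results: List[str]) -> None:
        """Store the results displayed for a prefix, in display order."""
        self._shown.append((prefix, list(results)))

    def appearances(self, selected: str) -> List[Tuple[str, int]]:
        """Return (prefix, 1-based position) for each result set that held the item."""
        found = []
        for prefix, results in self._shown:
            if selected in results:
                found.append((prefix, results.index(selected) + 1))
        return found


history = AutocompleteHistory()
history.record("b", ["Q1", "Q2"])
history.record("be", ["Q3", "Q1"])
history.record("ber", ["Q3"])
print(history.appearances("Q3"))  # [('be', 1), ('ber', 1)]
```

The list returned on selection is what would be attached to the click event, so that earlier prefixes where the item was shown but not chosen can be counted as skips.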
Mon, Sep 24, 9:44 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson created T205348: Calculate autocomplete examination probabilities from eventlogging data.
Mon, Sep 24, 9:44 PM · MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), Patch-For-Review, Discovery-Search (Current work)
EBernhardson created T205301: Property searches in wikidatacompletionsearchclicks have mostly null values.
Mon, Sep 24, 4:43 PM · MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), Patch-For-Review, User-Smalyshev, Discovery-Search (Current work), Wikidata
EBernhardson added a comment to T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.

Other metrics that I've seen used for autocomplete in my review:

Mon, Sep 24, 4:26 PM · Discovery-Search (Current work), Epic
EBernhardson added a comment to T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.

I spent most of Friday digging through "User model based metrics for offline query suggestion evaluation". The two metrics provided, eSaved and pSaved, are not too dissimilar from MRR. The simpler of the two is pSaved. In pSaved the metric is the sum of P(S_ij = 1) across i and j, where i represents the number of letters typed and j represents the position of the suggestion. P(S_ij = 1) is defined relatively simply: the user is satisfied if the correct result is provided and the user examines that result. The paper gives a relatively simple algorithm for looking over your interaction logs and calculating the probability of examination. eSaved is a modification of pSaved that further accounts for length.
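
A rough sketch of how these pieces might fit together. It is my simplified reading, not the paper's exact formulation: examination probability is estimated per (i, j) cell as clicks / (clicks + skips), and the pSaved sum assumes the user stops typing at the first correct suggestion they examine. All names and sample observations are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[int, int]  # (letters typed i, suggestion position j)


def examination_probabilities(
    observations: Iterable[Tuple[int, int, bool]]
) -> Dict[Cell, float]:
    """Estimate P(E_ij = 1) as clicks / (clicks + skips) for each cell.

    Each observation is (i, j, clicked): the item the user eventually chose
    was on screen at position j after i letters, and was or was not clicked.
    """
    clicks: Dict[Cell, int] = defaultdict(int)
    skips: Dict[Cell, int] = defaultdict(int)
    for i, j, clicked in observations:
        if clicked:
            clicks[(i, j)] += 1
        else:
            skips[(i, j)] += 1
    cells = set(clicks) | set(skips)
    return {c: clicks[c] / (clicks[c] + skips[c]) for c in cells}


def p_saved(target_cells: List[Cell], exam: Dict[Cell, float]) -> float:
    """Sum P(S_ij = 1) over the cells where the correct item is shown.

    target_cells is in typing order. The "still typing" factor is an
    assumption of this sketch: a user satisfied earlier sees nothing later.
    """
    total = 0.0
    still_typing = 1.0
    for cell in target_cells:
        p_exam = exam.get(cell, 0.0)
        total += still_typing * p_exam
        still_typing *= 1.0 - p_exam
    return total


observations = [
    (1, 1, False), (1, 1, True),  # cell (1, 1): one skip, one click
    (2, 1, True), (2, 1, True),   # cell (2, 1): always clicked
    (2, 2, False),                # cell (2, 2): never clicked
]
exam = examination_probabilities(observations)
print(p_saved([(1, 1), (2, 1)], exam), p_saved([(2, 2)], exam))
```

With these numbers, a correct item shown first at (1, 1) and again at (2, 1) gives 0.5 + 0.5 * 1.0 = 1.0, while one shown only at (2, 2) gives 0.0.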

Mon, Sep 24, 4:22 PM · Discovery-Search (Current work), Epic

Fri, Sep 21

EBernhardson updated subscribers of T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.

This is a nice survey of the query autocomplete literature (circa 2016): https://staff.fnwi.uva.nl/m.derijke/wp-content/papercite-data/pdf/cai-survey-2016.pdf

Fri, Sep 21, 5:52 PM · Discovery-Search (Current work), Epic
EBernhardson created T205111: [EPIC] Transform wikidata autocomplete click logs into a useful dataset.
Fri, Sep 21, 3:52 PM · Discovery-Search (Current work), Epic

Thu, Sep 20

EBernhardson moved T204959: Add a way to configure timeouts of autocomplete queries from Needs review to Done on the Discovery-Search (Current work) board.
Thu, Sep 20, 10:59 PM · MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), Patch-For-Review, Discovery-Search (Current work), CirrusSearch
EBernhardson added a comment to T204776: Investigate brief CirrusSearch outage (MW exception spike for api.php).

Potentially the reason this dragged on longer than it should have is timeouts updating the cluster state: https://logstash.wikimedia.org/goto/1f67b952e7da4dec76fc66addb6b901b

Thu, Sep 20, 9:34 PM · Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), CirrusSearch, Wikimedia-Incident, Wikimedia-production-error
EBernhardson added a comment to T205005: Reindex produces a lot of Undefined index messages.

Is this from the task reindexing the same cluster that is having a restart (codfw, iirc)? I don't think elasticsearch tasks or scrolls are able to move between hosts, so they will fail if one of the participating nodes restarts. The error message also means we don't detect that case; there is probably no particular error handling on the Cirrus side of the reindexing code to detect errors.

Thu, Sep 20, 8:16 PM · Discovery-Search
EBernhardson triaged T204982: Collect per-node latency statistics from each node separately as Normal priority.
Thu, Sep 20, 5:40 PM · Discovery-Search (Current work)
EBernhardson added a comment to T143396: Display translated page title in search results.

This is now blocked until roughly December, waiting for data to populate the indices before it can be used.

Thu, Sep 20, 5:27 PM · Discovery-Search (Current work), Patch-For-Review, MediaWiki-Search, Discovery, MediaWiki-extensions-Translate
EBernhardson moved T143396: Display translated page title in search results from In progress to Waiting/Blocked on the Discovery-Search (Current work) board.
Thu, Sep 20, 5:27 PM · Discovery-Search (Current work), Patch-For-Review, MediaWiki-Search, Discovery, MediaWiki-extensions-Translate
EBernhardson moved T204363: Modify elasticsearch_shard_size_check plugin to display only indices and shard size from Needs triage to This Quarter on the Discovery-Search board.
Thu, Sep 20, 5:15 PM · Operations, Discovery-Search, Elasticsearch
EBernhardson moved T204776: Investigate brief CirrusSearch outage (MW exception spike for api.php) from Needs triage to Current work on the Discovery-Search board.
Thu, Sep 20, 5:15 PM · Discovery-Search (Current work), MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), CirrusSearch, Wikimedia-Incident, Wikimedia-production-error