@Gopavasanth "index names" means the names of the indices that the CirrusSearch extension creates in Elasticsearch.
CirrusSearch is the extension that provides search functionality using Elasticsearch as a backend.
Tue, Jan 16
@Oetterer I don't think this is related. This task only tracks progress on making the CirrusSearch extension compatible with the new extension registration process. It lists the pieces of code that make this refactoring problematic, not actual problems with Config factories.
Perhaps we'll end up having the same issues, but I don't think we have code that needs to be run just after the extension is loaded.
Fri, Jan 5
Wed, Jan 3
I remember that it's due to the type of API param we use. When setting an array as ApiBase::PARAM_TYPE, a default must be provided IIRC.
Using arrays was a way to expose the list of possible profiles, but the drawback was that the API would fail if you provided an unknown param. I agree that this is wrong: cirrus should be able to know whether a profile was explicitly set by the user.
Thu, Dec 21
This is not only the namespaces selected and saved by the users but also the list of namespaces searched by default.
Currently, when the extension is enabled, you can encounter a strange behavior that looks like a bug:
Wed, Dec 20
Same for me, I'd be in favor of trying to increase the refresh rate on wikidata_content.
Tue, Dec 19
I ported elasticsearch-memory and elasticsearch-indexing.
Dec 18 2017
Q45825730 is me, I used this one just to test.
If a large majority of such use cases involve searching for the entity id (QXXX) of the newly created item, we can perform an additional db match to compensate for the lag of the search index.
It's what we do for normal wikis: a db match is run in addition to the query sent to the search index.
If users search for the label or aliases of the newly created item, then this solution is pointless.
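To illustrate the idea, here is a minimal sketch of an exact db match run alongside the index query. It is not the CirrusSearch or Wikibase code: `index_search` and `db_lookup` are hypothetical callables, and the regex for entity ids is an assumption based on the QXXX form mentioned above.

```python
import re
from typing import Callable, List, Optional

# Assumption: item ids look like "Q" followed by digits (the QXXX form).
ENTITY_ID = re.compile(r"^Q[1-9]\d*$", re.IGNORECASE)


def search_with_db_match(
    query: str,
    index_search: Callable[[str], List[str]],   # hypothetical: queries the (lagging) search index
    db_lookup: Callable[[str], Optional[str]],  # hypothetical: exact id lookup in the database
) -> List[str]:
    """Return index results, with an exact db match prepended when the query is an entity id."""
    results = list(index_search(query))
    term = query.strip()
    if ENTITY_ID.match(term):
        exact = db_lookup(term.upper())
        # Only add the db hit if the index has not caught up yet.
        if exact is not None and exact not in results:
            results.insert(0, exact)
    return results
```

A label or alias query never matches the id pattern, so it falls through to the index alone, which is why this approach does not help in that case.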
Dec 14 2017
Dec 13 2017
The error EADDRINUSE /tmp/cirrussearch-integration-tagtracker means either that the tests are still running in the background or that we failed to clean up the socket when the tests finished or were killed.
It's perfectly fine to delete /tmp/cirrussearch-integration-tagtracker if you think the test is no longer running.
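If you want to check before deleting, here is a rough sketch (in Python, not the test suite's own nodejs code) that probes a Unix socket path and removes it only when nothing is listening. The function name is made up for illustration, and it assumes Linux behavior, where connecting to a leftover socket file is refused.

```python
import os
import socket


def remove_if_stale(path: str) -> bool:
    """Remove the socket file at `path` if no process is listening on it.

    Returns True when a stale file was removed, False otherwise.
    """
    if not os.path.exists(path):
        return False
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except ConnectionRefusedError:
        # Nobody is listening: the file is a leftover from a dead process.
        os.unlink(path)
        return True
    finally:
        probe.close()
    # The connection succeeded, so something is still using the socket.
    return False
```

For example, `remove_if_stale("/tmp/cirrussearch-integration-tagtracker")` would leave the file alone while a test run is still holding the socket.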
Finding a decent place for profiles has always been a pain and I could not find something sane. I'd like to address (improve) this problem by adding a ProfileManager in cirrus.
Dec 12 2017
Dec 7 2017
@zeljkofilipin we might be ready to port our selenium-CirrusSearch jenkins job to nodejs. I uploaded https://gerrit.wikimedia.org/r/#/c/395872/ to try to comply with the structure expected by jenkins:
- tests in tests/selenium/specs
- wdio config in tests/selenium/wdio.conf.jenkins.js
Dec 6 2017
Dec 1 2017
Reindex is done. @TJones could you check a few indices to make sure it worked as expected?
Nov 30 2017
Nov 29 2017
Nov 28 2017
Nov 27 2017
Nov 21 2017
I'm investigating two approaches here:
- provide a way inside logstash filters to blacklist some known fields (move them into a debug_blob field that is not indexed)
- investigate disabling dynamic mapping, where the first step would be to log all elastic queries to discover which fields we currently use. That would allow us to create the first static mapping.
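As a rough illustration of the first approach, the transformation could look like the sketch below. This is plain Python rather than an actual logstash filter, the function name is invented, and it assumes the debug_blob field is mapped as not indexed on the elastic side.

```python
import json
from typing import Any, Dict, Iterable


def quarantine_fields(
    event: Dict[str, Any],
    blacklist: Iterable[str],
    blob_field: str = "debug_blob",
) -> Dict[str, Any]:
    """Move blacklisted top-level fields of a log event into a single blob field."""
    blocked = set(blacklist)
    kept: Dict[str, Any] = {}
    quarantined: Dict[str, Any] = {}
    for key, value in event.items():
        if key in blocked:
            quarantined[key] = value
        else:
            kept[key] = value
    if quarantined:
        # Serialize to one string so no sub-field can create new mappings.
        kept[blob_field] = json.dumps(quarantined, sort_keys=True)
    return kept
```

Serializing the blacklisted fields into one string keeps them readable when looking at a single event while preventing them from adding fields to the mapping.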
Nov 20 2017
Moving back to backlog as this task actually covers 2 experiments and thought it was new:
- dbn group sizing : 20 and 35 (https://gerrit.wikimedia.org/r/#/c/387586/): do we have a task to run the report on this data?
- grouping by reordering query terms
List of affected wikis:
labtestwiki mediawikiwiki test2wiki testwiki testwikidatawiki zerowiki advisorywiki auditcomwiki betawikiversity bewikimedia boardgovcomwiki boardwiki cawikimedia chairwiki chapcomwiki checkuserwiki collabwiki commonswiki donatewiki electcomwiki enwikibooks enwikinews enwikiquote enwikisource enwikiversity enwikivoyage enwiktionary execwiki fdcwiki foundationwiki grantswiki iegcomwiki incubatorwiki internalwiki labswiki legalteamwiki loginwiki metawiki movementroleswiki nostalgiawiki nycwikimedia nzwikimedia officewiki ombudsmenwiki otrs_wikiwiki outreachwiki pa_uswikimedia projectcomwiki qualitywiki searchcomwiki simplewiktionary sourceswiki spcomwiki specieswiki stewardwiki strategywiki techconductwiki transitionteamwiki usabilitywiki votewiki wikidatawiki wikimania2005wiki wikimania2006wiki wikimania2007wiki wikimania2008wiki wikimania2009wiki wikimania2010wiki wikimania2011wiki wikimania2012wiki wikimania2013wiki wikimania2014wiki wikimania2015wiki wikimania2016wiki wikimania2017wiki wikimania2018wiki wikimaniateamwiki arbcom_enwiki enwiki simplewiki tenwiki wg_enwiki
Nov 17 2017
It was just a brief spike of 300 errors today around 15:00 UTC. Looking at the code, I see no obvious reason why it could happen, except a broken response from elastic.
I'll assume that this error was due to the rolling restart and won't try to hide it by calling isset on the result sets.
Moving to high because this code should not reach group2.
Nov 14 2017
@Smalyshev it's certainly the case yes, tuning the noop script should fix the issue.
Nov 13 2017
It's a bug, but the only explanation I have for why this error is so frequent is that your index may have become out of date. Reading the code, I understand that this error can only happen when trying to update some metadata in elasticsearch for a document that is not indexed.
Nov 9 2017
My fear is that the "too many fields" problem is going to be more painful than the mapping conflicts one.
Speaking only about short term solutions:
- For mapping conflicts we can rename
- For too many fields I don't yet have a short term solution
Typically logstash/elastic is not able to sustain this kind of event: https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2017.11.08/mediawiki?id=AV-ZBn5-gaOKEclNGWio&_g=h@8b5b71a
EventBus.events.params should be marked as debug only in some way
I think we should introduce a pattern where log emitters can freely send large and complex objects that would be only available for debugging purposes on a per event basis: no need to search/aggregate them.
Our current strategy is to index everything, and to allow aggregation we index 2 elastic fields per json field.
Looking at some EventBus logs I see : https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2017.11.09/mediawiki?id=AV-gV9NNSUnOz-leF_9O&_g=h@44136fa
Nov 7 2017
Nov 3 2017
Nov 2 2017
Nov 1 2017
The mirror config is now empty in labs and should not cause issues anymore.
Oct 31 2017
Oct 30 2017
@Nikerabbit nothing that I'm aware of. If this error is new I have no idea what could have happened.
Reading the code in CommonSettings, I think there's a bug there: it properly checks wmfAllServices to avoid adding a cluster, but it still adds the mirror config.
(So, I don't think this error is new).
Oct 27 2017
Oct 26 2017
Oct 25 2017
@dbarratt sadly I don't know all the details of this cluster, but you could get it working by not specifying an index:
Yes, the syntax is slightly different:
- you need to set Content-Type: application/x-ndjson
- every request must be formed of 2 lines:
  - first line: some metadata, such as the index you want to query
  - second line: the search request body
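The two-line format described above can be sketched as follows. This only builds the newline-delimited request body; the helper name and the index name `my_index` are placeholders, and sending it (with the Content-Type header above) is left out.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def build_msearch_body(searches: Iterable[Tuple[Dict[str, Any], Dict[str, Any]]]) -> str:
    """Build an ndjson multi-search body from (metadata, search body) pairs."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))  # first line: metadata such as the index
        lines.append(json.dumps(body))    # second line: the search request body
    # Each line must be newline-terminated, including the last one.
    return "\n".join(lines) + "\n"


payload = build_msearch_body([
    ({"index": "my_index"}, {"query": {"match_all": {}}}),  # my_index is a placeholder
    ({}, {"query": {"match_all": {}}}),                     # empty metadata: no index specified
])
```

Each JSON object must stay on a single line, which `json.dumps` does by default since it adds no line breaks.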