
autocomplete in querybuilder is missing results
Closed, Resolved | Public | BUG REPORT

Event Timeline

It looks like autocomplete is working.

image.png (177×386 px, 12 KB)

When you type "ma", which entity are you expecting to appear?

This will have something to do with search on your Wikibase.

The request that is actually sent in your case, which you can find in the browser's JS network log, is https://query.kunstmuseum.nl/proxy/wikibase/w/api.php?origin=*&action=wbsearchentities&format=json&limit=50&continue=0&language=en&uselang=en&search=ma&type=property
(which, as we can see, returns nothing)
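To reproduce that request outside the browser, something like the following minimal sketch may help. It builds a wbsearchentities URL and summarizes the JSON response (labels found, plus any warnings or error block). The base URL `https://wikibase.example/w/api.php` and the helper names `build_search_url`, `summarize` and `search` are assumptions for illustration, not part of Wikibase.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; replace with your own Wikibase api.php URL.
API_URL = "https://wikibase.example/w/api.php"


def build_search_url(api_url, term, entity_type="property", language="en", limit=50):
    """Build a wbsearchentities URL using the same parameters as the query builder request."""
    params = {
        "action": "wbsearchentities",
        "format": "json",
        "search": term,
        "type": entity_type,
        "language": language,
        "uselang": language,
        "limit": limit,
    }
    return api_url + "?" + urlencode(params)


def summarize(response_text):
    """Return (labels, warnings, error) extracted from a JSON API response body."""
    data = json.loads(response_text)
    labels = [hit.get("label", hit.get("id")) for hit in data.get("search", [])]
    return labels, data.get("warnings"), data.get("error")


def search(api_url, term, **kwargs):
    """Send the request and summarize the answer (requires network access)."""
    with urlopen(build_search_url(api_url, term, **kwargs)) as resp:
        return summarize(resp.read().decode("utf-8"))
```

Calling `search(API_URL, "ma")` against a real endpoint and getting an empty label list together with a non-empty warnings or error entry would point at the search backend rather than at the query builder UI.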

Removing the format=json part of that URL shows a warning of type 'error'... that looks like https://phabricator.wikimedia.org/T260276

The code change there appears to be only cosmetic: it merely stops showing this error.
I manually patched in the changes, which does indeed hide the error...
But what could be causing it?

On https://api.kunstmuseum.nl/w/index.php?search=figure&title=Special%3ASearch&fulltext=1 it's also showing "An error has occurred while searching: We could not complete your search due to a temporary problem. Please try again later." Unfortunately, there is nothing related in the Docker logs to point to a cause.

In /var/log/mediawiki/error.log I saw the same error with a stacktrace.

2020-10-08 07:43:01 807591ec944f my_wiki: [46cfcdb2d547f4ac352ea8f1] /w/api.php?action=wbsearchentities&search=gogh&format=json&language=en&uselang=en&type=item   ErrorException from line 333 of /var/www/html/includes/debug/MWDebug.php: PHP Warning: {"type":"error","message":"cirrussearch-backend-error","params":[]} [Called from Wikibase\Search\Elastic\EntitySearchElastic::getRankedSearchResults in /var/www/html/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php at line 318]
#0 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /var/www/html/includes/debug/MWDebug.php(333): trigger_error(string, integer)
#2 /var/www/html/includes/debug/MWDebug.php(188): MWDebug::sendMessage(string, array, string, integer)
#3 /var/www/html/includes/GlobalFunctions.php(1079): MWDebug::warning(string, integer, integer, string)
#4 /var/www/html/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(318): wfLogWarning(string)
#5 /var/www/html/extensions/Wikibase/repo/includes/Api/CombinedEntitySearchHelper.php(48): Wikibase\Search\Elastic\EntitySearchElastic->getRankedSearchResults(string, string, string, integer, boolean)
#6 /var/www/html/extensions/Wikibase/repo/includes/Api/TypeDispatchingEntitySearchHelper.php(54): Wikibase\Repo\Api\CombinedEntitySearchHelper->getRankedSearchResults(string, string, string, integer, boolean)
#7 /var/www/html/extensions/Wikibase/repo/includes/Api/SearchEntities.php(120): Wikibase\Repo\Api\TypeDispatchingEntitySearchHelper->getRankedSearchResults(string, string, string, integer, boolean)
#8 /var/www/html/extensions/Wikibase/repo/includes/Api/SearchEntities.php(254): Wikibase\Repo\Api\SearchEntities->getSearchEntries(array)
#9 /var/www/html/includes/api/ApiMain.php(1598): Wikibase\Repo\Api\SearchEntities->execute()
#10 /var/www/html/includes/api/ApiMain.php(537): ApiMain->executeAction()
#11 /var/www/html/includes/api/ApiMain.php(508): ApiMain->executeActionWithErrorHandling()
#12 /var/www/html/api.php(87): ApiMain->execute()
#13 {main}

Do you see any errors in the Elasticsearch service?

No, here's the log after restarting elasticsearch and attempting a few searches:

wikibase@wikibase:~/wikibase-docker$ docker-compose  up elasticsearch
Starting wikibase-docker_elasticsearch_1 ... done
Attaching to wikibase-docker_elasticsearch_1
elasticsearch_1    | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
elasticsearch_1    | OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=0
elasticsearch_1    | [2020-10-13T12:14:50,181][INFO ][o.e.e.NodeEnvironment    ] [7DEERnP] using [1] data paths, mounts [[/ (overlay)]], net usable_space [4.8gb], net total_space [14.9gb], types [overlay]
elasticsearch_1    | [2020-10-13T12:14:50,186][INFO ][o.e.e.NodeEnvironment    ] [7DEERnP] heap size [495.3mb], compressed ordinary object pointers [true]
elasticsearch_1    | [2020-10-13T12:14:50,189][INFO ][o.e.n.Node               ] [7DEERnP] node name derived from node ID [7DEERnPBTzujGriDPZxI8w]; set [node.name] to override
elasticsearch_1    | [2020-10-13T12:14:50,191][INFO ][o.e.n.Node               ] [7DEERnP] version[6.5.4], pid[1], build[default/tar/d2ef93d/2018-12-17T21:17:40.758843Z], OS[Linux/4.19.0-10-amd64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/11.0.1/11.0.1+13]
elasticsearch_1    | [2020-10-13T12:14:50,192][INFO ][o.e.n.Node               ] [7DEERnP] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.T6CKFqYS, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.cgroups.hierarchy.override=/, -Xms512m, -Xmx512m, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
elasticsearch_1    | [2020-10-13T12:14:52,638][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [aggs-matrix-stats]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [analysis-common]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [ingest-common]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [lang-expression]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [lang-mustache]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [lang-painless]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [mapper-extras]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [parent-join]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [percolator]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [rank-eval]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [reindex]
elasticsearch_1    | [2020-10-13T12:14:52,639][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [repository-url]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [transport-netty4]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [tribe]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-ccr]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-core]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-deprecation]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-graph]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-logstash]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-ml]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-monitoring]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-rollup]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-security]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-sql]
elasticsearch_1    | [2020-10-13T12:14:52,640][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-upgrade]
elasticsearch_1    | [2020-10-13T12:14:52,641][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded module [x-pack-watcher]
elasticsearch_1    | [2020-10-13T12:14:52,641][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded plugin [experimental-highlighter]
elasticsearch_1    | [2020-10-13T12:14:52,641][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded plugin [extra]
elasticsearch_1    | [2020-10-13T12:14:52,641][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded plugin [ingest-geoip]
elasticsearch_1    | [2020-10-13T12:14:52,641][INFO ][o.e.p.PluginsService     ] [7DEERnP] loaded plugin [ingest-user-agent]
elasticsearch_1    | [2020-10-13T12:14:57,904][INFO ][o.e.x.s.a.s.FileRolesStore] [7DEERnP] parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]
elasticsearch_1    | [2020-10-13T12:14:58,642][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [7DEERnP] [controller/73] [Main.cc@109] controller (64 bit): Version 6.5.4 (Build b616085ef32393) Copyright (c) 2018 Elasticsearch BV
elasticsearch_1    | [2020-10-13T12:14:59,519][INFO ][o.e.d.DiscoveryModule    ] [7DEERnP] using discovery type [single-node] and host providers [settings]
elasticsearch_1    | [2020-10-13T12:15:00,503][INFO ][o.e.n.Node               ] [7DEERnP] initialized
elasticsearch_1    | [2020-10-13T12:15:00,505][INFO ][o.e.n.Node               ] [7DEERnP] starting ...
elasticsearch_1    | [2020-10-13T12:15:00,716][INFO ][o.e.t.TransportService   ] [7DEERnP] publish_address {172.18.0.2:9300}, bound_addresses {0.0.0.0:9300}
elasticsearch_1    | [2020-10-13T12:15:00,903][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [7DEERnP] publish_address {172.18.0.2:9200}, bound_addresses {0.0.0.0:9200}
elasticsearch_1    | [2020-10-13T12:15:00,905][INFO ][o.e.n.Node               ] [7DEERnP] started
elasticsearch_1    | [2020-10-13T12:15:01,400][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [7DEERnP] Failed to clear cache for realms [[]]
elasticsearch_1    | [2020-10-13T12:15:01,508][INFO ][o.e.l.LicenseService     ] [7DEERnP] license [0c2df63d-7cd6-4d45-8610-e27c6818a667] mode [basic] - valid
elasticsearch_1    | [2020-10-13T12:15:01,535][INFO ][o.e.g.GatewayService     ] [7DEERnP] recovered [0] indices into cluster_state

Earlier I did notice the message "max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]", so I've now set vm.max_map_count=262144 in /etc/sysctl.conf, but that has not helped.
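The last line of the log above reports `recovered [0] indices into cluster_state`, which may mean that no search index exists on this node yet. A minimal sketch for checking this through Elasticsearch's `_cat/indices` endpoint follows. The address `http://localhost:9200` and the helper names `index_names` and `list_indices` are assumptions for illustration; inside docker-compose the host name will likely differ.

```python
import json
from urllib.request import urlopen

# Assumed address of the Elasticsearch HTTP port; adjust for your setup.
ES_URL = "http://localhost:9200"


def index_names(cat_indices_json):
    """Extract sorted index names from a `_cat/indices?format=json` response body."""
    return sorted(row["index"] for row in json.loads(cat_indices_json))


def list_indices(es_url=ES_URL):
    """Fetch the list of indices from a running node (requires network access)."""
    with urlopen(es_url + "/_cat/indices?format=json") as resp:
        return index_names(resp.read().decode("utf-8"))
```

An empty list from `list_indices()` would be consistent with the log line and would suggest that the index still has to be created before any search can succeed.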

We're a bit further... Via Extension:CirrusSearch I ended up on the CirrusSearch README, which mentions these steps:

Now run this script to generate your elasticsearch index:
 php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php
Now remove $wgDisableSearchUpdate = true from LocalSettings.php.  Updates should start heading to Elasticsearch.
Next bootstrap the search index by running:
 php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks --indexOnSkip
 php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipParse
Note that this can take some time.  For large wikis read "Bootstrapping large wikis" below.

After doing those steps I started getting results on the search page, but still not in the search box URL.

Is there a maintenance script that should have triggered this automatically from docker-compose?

Together with @despens we found that in the search box most letters give a result, but not 'g', for which you would expect 'van Gogh' to be found. It's still a mystery...

helmo claimed this task.

The last question is probably explained by the type of search that is done: it's probably a prefix search, while we expected a full-text search. With large datasets it probably makes sense to do it this way.
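To illustrate the difference, here is a deliberately simplified sketch. It is not how CirrusSearch matches or ranks results; the labels and the functions `prefix_search` and `substring_search` are made up for the example. A prefix match only finds labels that start with the typed text, so 'g' cannot find a label that begins with "Vincent".

```python
# Hypothetical labels, for illustration only.
LABELS = ["Vincent van Gogh", "Gouda", "figure", "material"]


def prefix_search(term, labels):
    """Return labels that start with the term, ignoring case."""
    needle = term.casefold()
    return [label for label in labels if label.casefold().startswith(needle)]


def substring_search(term, labels):
    """Return labels that contain the term anywhere, ignoring case."""
    needle = term.casefold()
    return [label for label in labels if needle in label.casefold()]


print(prefix_search("g", LABELS))     # ['Gouda']
print(substring_search("g", LABELS))  # ['Vincent van Gogh', 'Gouda', 'figure']
```

With a prefix match, typing "vin" would find "Vincent van Gogh", while "g" or "gogh" would not.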