[OPS] lucene-search-2 uses too much memory on labs
Closed, Declined (Public)


search in UI returns nothing

According to the tracking bug, querying Search directly via curl works, but Search in the UI does not; see the screenshot.

Version: unspecified
Severity: normal


Screen_shot_2013-03-21_at_11.16.10_AM.png (240×1 px, 64 KB)



Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:40 AM
bzimport set Reference to bz46459.

For the last couple of days the PHP entry points were returning Error 500 because the Thanks extension was not in mediawiki/extensions.git (that is fixed now). Lucene search polls all the wikis via the OAI extension, which was definitely serving Error 500 pages, and that might have broken the search system.

The two search instances are using puppetmaster::self, so their puppet configuration has to be updated manually. I updated them a few hours ago.

Doing a search does not work right now:

Gives out:


Need to investigate the PHP error logs and look at the search box logs.

deployment-search01:~$ curl -x localhost:8123 http://localhost/search/enwiki/Main
curl: (7) couldn't connect to host

I have restarted lucene-search2 there.

Search is working again. What is troublesome is that lucene-search2 should be restarted by puppet automatically whenever it dies. I am leaving this bug open to monitor it a bit more.
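Until puppet takes care of restarts, a manual stop-gap could look like the sketch below. It is only an illustration: the `watchdog` function name, the 5 second timeout and the restart command passed in are assumptions, not something taken from the puppet class or the init.d script. The probe URL would be the same one used in the curl check above.

```shell
#!/bin/sh
# Hypothetical stop-gap watchdog; not part of the puppet manifests.
# watchdog URL RESTART_COMMAND...
# Probes URL with curl and runs RESTART_COMMAND only when the probe fails.
watchdog() {
    url="$1"
    shift
    # -s: silent, -f: treat HTTP errors as failure, --max-time: give up after 5s
    if curl -s -f -o /dev/null --max-time 5 "$url"; then
        echo "up"
    else
        echo "down, restarting"
        "$@"
    fi
}

# Example invocation (service name assumed from this bug's summary):
#   watchdog http://localhost/search/enwiki/Main service lucene-search-2 restart
```

Passing the restart command as arguments keeps the function free of any hard-coded service name, so it can be dry-run with `true` as the command.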

The lucene process is probably being killed by the kernel OOM killer. We need to tweak the java -Xmx parameter to limit the amount of memory being used.
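One way to confirm that suspicion is to look for OOM kill messages in the kernel log. The helper below is a rough sketch: the `oom_kills` name is made up for illustration, and it assumes the kernel logs its kills as lines containing "Killed process", followed by the process name in parentheses.

```shell
#!/bin/sh
# Hypothetical helper, assuming the usual "Killed process <pid> (<name>)"
# wording of kernel OOM kill messages.
# oom_kills NAME: read kernel log lines on stdin and print those that
# report the OOM killer killing a process called NAME.
oom_kills() {
    grep -F "Killed process" | grep -F "($1)"
}

# Example invocation on the search box:
#   dmesg | oom_kills java
```

No output means the kernel log holds no record of such a kill, which would point to some other cause for the process dying.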

Both and have San Francisco article:

At when you enter San in search box, several search suggestions appear (wikipedia.png attachment). No search suggestions appear when the same is done at (wmflabs.png attachment).

Created attachment 12054
wikipedia screenshot


wikipedia.png (712×645 px, 91 KB)

Created attachment 12055
wmflabs screenshot


wmflabs.png (712×644 px, 116 KB)

Rewording the summary. The root cause is the Java process asking for 20 GB of memory on an instance that has only 4 GB.

I have hacked the script locally to limit memory to 2 GB. I will see how well it goes, then hack the puppet class and init.d script to let us easily tweak the memory settings for lucene.
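As a rough idea of what a tunable memory setting in the init.d script could look like, here is a minimal sketch. The `LUCENE_HEAP_MB` variable and both helper functions are hypothetical names, not what the actual script or puppet class uses; the 2000 default simply mirrors the local hack.

```shell
#!/bin/sh
# Hypothetical sketch; variable and function names are illustrative only.

# Heap cap in megabytes. 2000 mirrors the local hack; override it from the
# environment (or from whatever puppet ends up templating into the script).
LUCENE_HEAP_MB="${LUCENE_HEAP_MB:-2000}"

# heap_option MB: print the JVM maximum-heap flag for MB megabytes.
heap_option() {
    printf '%s\n' "-Xmx${1}m"
}

# heap_fits REQUESTED_MB TOTAL_MB: succeed only if the requested heap is no
# larger than the memory available on the instance.
heap_fits() {
    [ "$1" -le "$2" ]
}

JAVA_HEAP_OPT="$(heap_option "$LUCENE_HEAP_MB")"
```

With these numbers, `heap_fits 2000 4096` succeeds while `heap_fits 20480 4096` fails, which is the 20 GB on a 4 GB instance situation described above.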

Command running right now is:

/usr/bin/java -Xmx2000m -Dsun.rmi.transport.tcp.handshakeTimeout=10000 -Djava.rmi.server.codebase=file:///a/search/lucene-search/LuceneSearch.jar -Djava.rmi.server.hostname=deployment-search01 -classpath :/usr/share/java/udp2log-log4j.jar:/a/search/lucene-search/LuceneSearch.jar org.wikimedia.lsearch.config.StartupManager


Taking bug, raising priority. I need to fix that this week.

The deployment-search01 Icinga report is

I have restarted the lucene-search-2 service, which was apparently no longer listening, although there has been no OOM message :-] So we have some progress!

Pending ops review, updating summary to reflect that.

Peter has merged the changes and deployed them in production. I have to make sure they work fine in labs, and will most probably recreate the existing instances.

Most of the work has been completed, thus lowering priority.

Chad and Nik have migrated beta to the CirrusSearch extension, which uses an
ElasticSearch backend. Hence this Lucene search bug is no longer valid :-)