
rebuildTermSearchKey script on wikidata.org
Closed, Resolved (Public)

Description

Please run the rebuildTermSearchKey script on Wikidata.


Version: unspecified
Severity: enhancement

Details

Reference
bz46378

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:34 AM
bzimport set Reference to bz46378.

That's extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php, to be run on wikidatawiki.

This will rebuild the term_search_key column in the wb_terms table. That table is quite large, so this may take a while. Database updates are done in batches, and the script waits for slaves to catch up before starting the next batch.
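As a rough illustration of the batching described above (a sketch only, not taken from rebuildTermsSearchKey.php), the loop below walks a row range in fixed-size batches and pauses between them. The function names rebuild_in_batches, update_batch, and wait_for_replicas are hypothetical, and both helpers are stubs: a real run would issue the database update and poll replication lag instead.

```shell
#!/bin/sh
# Stub (assumption): a real version would block until replication lag is low.
wait_for_replicas() {
    :
}

# Stub (assumption): stands in for the per-batch database update and
# only reports the row range it was given.
update_batch() {
    echo "Updated rows $1 to $2."
}

# Walk rows FIRST..LAST in batches of SIZE, waiting after each batch.
rebuild_in_batches() {
    first=$1
    last=$2
    size=$3
    start=$first
    while [ "$start" -le "$last" ]; do
        end=$((start + size - 1))
        # Clamp the final batch to the last row.
        if [ "$end" -gt "$last" ]; then
            end=$last
        fi
        update_batch "$start" "$end"
        wait_for_replicas
        start=$((end + 1))
    done
}
```

Calling rebuild_in_batches 1 25 10 would process three batches, the last one shorter than the others. Smaller batches mean more pauses, which is the trade-off behind tuning the batch size when replication lag is a concern.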

greg added a comment. Mar 21 2013, 4:13 PM

Asher: Could

greg added a comment. Mar 21 2013, 9:17 PM

So, I didn't mean to save that.... Asher, I think Rob emailed CT about this. Sorry for the noise...

According to IRC, CT Woo asked Peter to review and needs to follow up with him.

greg added a comment. Mar 27 2013, 8:02 PM

Per https://bugzilla.wikimedia.org/show_bug.cgi?id=46613 this has been reviewed by Asher and is OK to run.

Great! So, here's what needs to happen:

  1. run rebuildTermsSearchKey.php. It will log a line after each batch, e.g.:

    Updated 500 search keys, up to row 20999.

Should the script die or be stopped for some reason, it can be restarted from the row it left off, using --start-row 20999. The batch size can be tuned using --batch-size.

When the script is done, mark the last row it reported (we'll need it later!)

  2. tell the wiki to start using the index, with $wgWBRepoSettings['withoutTermSearchKey'] = false; (or rather, by removing the line from the config that forces this to true).
  3. re-run rebuildTermsSearchKey.php from the row it finished at, to cover items that have been added/modified between the time the script finished and the time the settings were changed.
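If the script's output was saved to a file, the last reported row can be pulled out of it rather than copied by hand. The following is a minimal sketch under that assumption: the helper names last_reported_row and resume_command are hypothetical, and the log format is the "Updated N search keys, up to row R." line quoted above. The second helper only prints a command built from --start-row; it does not run anything, and other flags may be needed depending on the state of the wiki.

```shell
#!/bin/sh
# Hypothetical helper: print the row number from the last
# "Updated N search keys, up to row R." line in the given log file.
last_reported_row() {
    # sed reduces each matching line to its row number and drops the rest;
    # tail keeps only the most recent match.
    sed -n 's/^ *Updated [0-9][0-9]* search keys, up to row \([0-9][0-9]*\)\.$/\1/p' "$1" | tail -n 1
}

# Hypothetical helper: print (not run) a command that resumes from that row.
resume_command() {
    row=$(last_reported_row "$1")
    if [ -z "$row" ]; then
        echo "no progress line found in $1" >&2
        return 1
    fi
    echo "mwscript extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php wikidatawiki --start-row $row"
}
```

For example, resume_command rebuild.log (where rebuild.log is a hypothetical file holding the script's output) prints the restart command for the last row found in that log, or fails with a message if no progress line is present.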

Oh, in case we lose track of which row to restart from, use this:

select min(term_row_id) from wb_terms where term_search_key = '';

(off the top of my head)

greg added a comment. Mar 27 2013, 8:51 PM

(btw, that link was supposed to be http://rt.wikimedia.org/Ticket/Display.html?id=4801, not bug 46613)

Thanks Daniel.

Peter: Would this go on your plate, or someone else in Ops?

reedy@hume:/home/wikipedia/common/php-1.21wmf12$ mwscript extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php wikidatawiki
...Update 'Wikibase\RebuildTermsSearchKey' already logged as completed.
reedy@hume:/home/wikipedia/common/php-1.21wmf12$ mwscript extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php wikidatawiki --force

Fatal error: Class 'ObservableMessageReporter' not found in /home/wikipedia/common/php-1.21wmf12/extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php on line 59

(In reply to comment #9)

Fatal error: Class 'ObservableMessageReporter' not found in /home/wikipedia/common/php-1.21wmf12/extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php on line 59

Looks like we forgot to make that class available in production. Fixed in I6a1a7244975.

Now that I6a1a7244975 is merged, could Reedy try again?

Reedy: Could you try again?

Reedy added a comment. Apr 9 2013, 10:17 PM

(In reply to comment #12)

Reedy: Could you try again?

I did try before. It got to 25-30% completion, but was then cancelled due to replication lag issues.

Seems to be working OK again.

reedy@hume:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildTermsSearchKey.php wikidatawiki --only-missing --force
Updated 100 search keys, up to row 22873110.
Updated 100 search keys, up to row 22873222.
Updated 100 search keys, up to row 22873330.