Case insensitivity in the search box does not function for Titles with Mixed Case
Closed, Resolved (Public)

Description

Case insensitivity doesn't work for Titles with Mixed Case (see [[WP:MIXEDCAPS]]).

We had a bot creating thousands upon thousands of redirects from the lowercase versions of mixed-case article titles: http://en.wikipedia.org/wiki/User:BOTijo

This is obviously inferior to simply fixing case insensitivity in the search box so that it finds the mixed-case article automatically.

We've revoked the bot's authorization in hopes this bug can be fixed.


Version: unspecified
Severity: normal
URL: http://en.wikipedia.org/wiki/WP:MIXEDCAPS

bzimport set Reference to bz19882.
Xeno created this task. Jul 22 2009, 7:56 PM
brion added a comment. Jul 24 2009, 4:28 PM

"Go" searches to titles with mixed capitalization should work just fine on Wikipedia for the last year or two thanks to the TitleKey extension; there's no need to create redirects for that.

Can you provide some sample searches that fail? We might have had a regression in functionality or a break in indexing.
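
As an illustration of the case-insensitive "Go" match described above, here is a minimal Python sketch of a normalized title-key lookup. It is not TitleKey's actual code or schema; the class name `TitleKeyIndex`, its methods, and the choice of `casefold()` plus whitespace collapsing as the normalization are all assumptions made for the example.

```python
# Hypothetical sketch of a case-insensitive title index.
# Not the TitleKey extension's real implementation.

class TitleKeyIndex:
    def __init__(self):
        # Maps a normalized key to every real title that shares it.
        self._by_key = {}

    @staticmethod
    def make_key(title):
        # Assumed normalization: collapse whitespace, then case-fold.
        return " ".join(title.split()).casefold()

    def add(self, title):
        self._by_key.setdefault(self.make_key(title), []).append(title)

    def go(self, query):
        # Prefer an exact-case match; otherwise return the first
        # title stored under the same normalized key, if any.
        candidates = self._by_key.get(self.make_key(query), [])
        if query in candidates:
            return query
        return candidates[0] if candidates else None


index = TitleKeyIndex()
index.add("Congolese Union of Republicans")
result = index.go("congolese union of republicans")
missing = index.go("no such article")
```

With an index like this, a lowercase query resolves to the mixed-case title without any redirect page, which is why the bot-created redirects should be unnecessary once the lookup works.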

Xeno added a comment. Jul 27 2009, 6:03 PM

See any Page with Mixed Caps, e.g.

"Congolese Union of Republicans" (this is a new page I just found, might not be around when you get to this)

Typing "congolese union of republicans" into the search box does not take you to the article.

(In reply to comment #1)
> "Go" searches to titles with mixed capitalization should work just fine on
> Wikipedia for the last year or two thanks to the TitleKey extension; there's no
> need to create redirects for that.

[[en:Special:Version]] doesn't list TitleKey at this time.

rainman wrote:

We are using lucene as a prefix backend on en.wp at the moment.

catlow wrote:

Has TitleKey recently been removed? I'm sure this used to work fine. Will it be restored?

Anomie added a comment. Aug 6 2009, 4:19 PM

(In reply to comment #4)
> We are using lucene as a prefix backend on en.wp at the moment.

Lucene doesn't seem to use the SearchGetNearMatch hook, which AFAICT is what is needed to affect the "Go" button.
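
To make the point about the hook concrete, here is a rough Python sketch of a "Go" lookup that falls back to registered near-match handlers. Only the hook name SearchGetNearMatch comes from this thread; the functions `go_search` and `make_casefold_hook` are hypothetical and do not reflect MediaWiki's or Lucene's actual code.

```python
# Hypothetical sketch of a "Go" lookup with near-match hooks.
# Not MediaWiki code; it only illustrates why a missing hook matters.

def go_search(term, existing_titles, hooks):
    # An exact title match always wins.
    if term in existing_titles:
        return term
    # Otherwise each registered near-match handler gets a chance.
    for hook in hooks:
        title = hook(term)
        if title is not None:
            return title
    return None


def make_casefold_hook(existing_titles):
    # Assumed handler: match titles while ignoring case.
    by_key = {t.casefold(): t for t in existing_titles}
    return lambda term: by_key.get(term.casefold())


titles = {"Congolese Union of Republicans"}
query = "congolese union of republicans"

# No handler registered: the lowercase query finds nothing.
without_hook = go_search(query, titles, hooks=[])

# Handler registered: the same query resolves to the real title.
with_hook = go_search(query, titles, hooks=[make_casefold_hook(titles)])
```

If the active backend never consults the handlers, the outcome is the first case: the article exists, but a lowercase "Go" search misses it.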

rainman wrote:

It would be good if both could coexist. lucene.php should be loaded after TitleKey in CommonSettings.php, and $wmgUseTitleKey = false should be removed from lucene.php. The TitleKey index might also need rebuilding to cover the past couple of months.

brion added a comment. Aug 6 2009, 4:26 PM

I think I found our problem:

    } elseif ( in_array( $wgDBname, array( 'enwiki' ) ) ) {
        # Big RAM pool 1, via LVS
        $wgLuceneHost = '10.2.1.11';
        $wgLuceneSearchVersion = 2.1;
        $wgEnableLucenePrefixSearch = true;
        $wgLucenePrefixHost = '10.0.3.8'; #search8
        $wmgUseTitleKey = false;
    }

For some mysterious reason the Lucene configuration disables TitleKey on enwiki. Ouch! Removing this...

brion added a comment. Aug 6 2009, 4:30 PM

Ok, TitleKey is reenabled and I'm rebuilding the index.

brion added a comment. Aug 6 2009, 5:40 PM

Ok, me & Robert worked out the compat issue between TitleKey and MWSearch; should now be fixed with the adjustment from r54533.

TitleKey is still on to handle the "go" search, but no longer interferes with MWSearch's Lucene prefix search when it's enabled as long as we load them in the right order.

brion added a comment. Aug 6 2009, 6:56 PM

Ok, TitleKey index rebuild is now done and we have the best of both worlds. :) Case-insensitive match on 'go' searches works and we have the more advanced drop-down ajax search with the Lucene backend.
