Case insensitivity in the search box does not function for Titles with Mixed Case
Closed, Resolved, Public

Description

Case insensitivity doesn't work for Titles with Mixed Case (see [[WP:MIXEDCAPS]]).

We had a bot creating thousands upon thousands of redirects from the lowercase versions of mixed-case article titles: http://en.wikipedia.org/wiki/User:BOTijo

This is obviously sub-optimal compared with simply fixing case insensitivity in the search box so that it finds the mixed-case article automatically.

We've revoked the bot's authorization in hopes this bug can be fixed.


Version: unspecified
Severity: normal
URL: http://en.wikipedia.org/wiki/WP:MIXEDCAPS

bzimport added a project: MediaWiki-Search. Via Conduit, Nov 21 2014, 10:40 PM
bzimport set Reference to bz19882.
Xeno created this task. Via Legacy, Jul 22 2009, 7:56 PM
brion added a comment. Via Conduit, Jul 24 2009, 4:28 PM

"Go" searches to titles with mixed capitalization should work just fine on Wikipedia for the last year or two thanks to the TitleKey extension; there's no need to create redirects for that.

Can you provide some sample searches that fail? We might have had a regression in functionality or a break in indexing.

Xeno added a comment. Via Conduit, Jul 27 2009, 6:03 PM

See any Page with Mixed Caps, e.g.

"Congolese Union of Republicans" (this is a new page I just found, might not be around when you get to this)

Type in search box: "congolese union of republicans" = Does not get you to the article.

Anomie added a comment. Via Conduit, Aug 6 2009, 12:17 PM

(In reply to comment #1)

> "Go" searches to titles with mixed capitalization should work just fine on
> Wikipedia for the last year or two thanks to the TitleKey extension; there's no
> need to create redirects for that.

[[en:Special:Version]] doesn't list TitleKey at this time.

bzimport added a comment. Via Conduit, Aug 6 2009, 12:33 PM

rainman wrote:

We are using lucene as a prefix backend on en.wp at the moment.

bzimport added a comment. Via Conduit, Aug 6 2009, 3:01 PM

catlow wrote:

Has TitleKey recently been removed? I'm sure this used to work fine. Will it be restored?

Anomie added a comment. Via Conduit, Aug 6 2009, 4:19 PM

(In reply to comment #4)

> We are using lucene as a prefix backend on en.wp at the moment.

Lucene doesn't seem to use the SearchGetNearMatch hook, which AFAICT is what is needed to affect the "Go" button.
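
To make the role of a near-match hook concrete, here is a rough Python sketch of a "Go" lookup that tries an exact title first and then asks any registered near-match handlers. It is not MediaWiki or TitleKey code: the names `PAGES`, `go_search`, and `case_insensitive_handler` are hypothetical, and the page set is made up for illustration.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for the wiki's set of existing page titles.
PAGES = {"Congolese Union of Republicans", "Main Page"}

# A near-match handler takes the search term and returns a title or None.
NearMatchHandler = Callable[[str], Optional[str]]


def case_insensitive_handler(term: str) -> Optional[str]:
    """Return an existing title that equals the term ignoring case, if any."""
    folded = term.casefold()
    for title in sorted(PAGES):
        if title.casefold() == folded:
            return title
    return None


def go_search(term: str, handlers: List[NearMatchHandler]) -> Optional[str]:
    """Resolve a "Go" request: exact title first, then each handler in order."""
    if term in PAGES:
        return term
    for handler in handlers:
        match = handler(term)
        if match is not None:
            return match
    # No direct hit; a real site would fall back to full-text results here.
    return None
```

With no handler registered, a lowercase term for a mixed-case title finds nothing, which matches the behaviour reported in this bug; registering the case-insensitive handler lets the same term resolve to the article.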

bzimport added a comment. Via Conduit, Aug 6 2009, 4:25 PM

rainman wrote:

It would be good if the two could coexist. lucene.php should be loaded after TitleKey in CommonSettings.php, and $wmgUseTitleKey = false should be removed from lucene.php. Also, the TitleKey index might need rebuilding to cover the past couple of months.

brion added a comment. Via Conduit, Aug 6 2009, 4:26 PM

I think I found our problem:

} elseif ( in_array( $wgDBname, array( 'enwiki' ) ) ) {
    # Big RAM pool 1, via LVS
    $wgLuceneHost = '10.2.1.11';
    $wgLuceneSearchVersion = 2.1;
    $wgEnableLucenePrefixSearch = true;
    $wgLucenePrefixHost = '10.0.3.8'; #search8
    $wmgUseTitleKey = false;
}

For some mysterious reason the Lucene configuration disables TitleKey on enwiki. Ouch! Removing this...

brion added a comment. Via Conduit, Aug 6 2009, 4:30 PM

Ok, TitleKey is reenabled and I'm rebuilding the index.

brion added a comment. Via Conduit, Aug 6 2009, 5:40 PM

Ok, me & Robert worked out the compat issue between TitleKey and MWSearch; should now be fixed with the adjustment from r54533.

TitleKey is still on to handle the "go" search, but no longer interferes with MWSearch's Lucene prefix search when it's enabled as long as we load them in the right order.

brion added a comment. Via Conduit, Aug 6 2009, 6:56 PM

Ok, TitleKey index rebuild is now done and we have the best of both worlds. :) Case-insensitive match on 'go' searches works and we have the more advanced drop-down ajax search with the Lucene backend.
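
For readers wondering why the index had to be rebuilt, the following Python sketch shows the general idea of a case-folded title index. It is an illustration under assumptions, not TitleKey's actual implementation: `title_key`, `TitleKeyIndex`, and the sample titles are hypothetical.

```python
import unicodedata
from typing import Dict, Iterable, List, Optional


def title_key(title: str) -> str:
    """Fold a title to a case-insensitive lookup key."""
    return unicodedata.normalize("NFC", title).casefold()


class TitleKeyIndex:
    """Maps folded keys to the real titles that share that key."""

    def __init__(self) -> None:
        self._titles_by_key: Dict[str, List[str]] = {}

    def add(self, title: str) -> None:
        self._titles_by_key.setdefault(title_key(title), []).append(title)

    def rebuild(self, all_titles: Iterable[str]) -> None:
        """Discard the index and repopulate it from the full title list."""
        self._titles_by_key = {}
        for title in all_titles:
            self.add(title)

    def lookup(self, term: str) -> Optional[str]:
        """Prefer an exact-case title, else any title with the same key."""
        candidates = self._titles_by_key.get(title_key(term), [])
        if term in candidates:
            return term
        return candidates[0] if candidates else None


# Index built before a page existed, e.g. while indexing was switched off.
stale_index = TitleKeyIndex()
stale_index.rebuild(["Main Page"])

# Titles as they exist now, including a page created in the meantime.
current_titles = ["Main Page", "Congolese Union of Republicans"]
```

A page created while the index was not being maintained has no key entry, so a lowercase lookup misses it until `rebuild` is run over the current titles.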
