Page MenuHomePhabricator

Searches with hyphens yield a database query error
Open, MediumPublic

Description

MediaWiki 1.32.0
PHP 7.2.15-0ubuntu0.18.04.2 (apache2handler)
MariaDB 10.1.34-MariaDB

A database query error has occurred. This may indicate a bug in the software.


[42cc5c670a243b95937180db] /w/index.php?title=Special%3ASearch&search=Binding+Coil+of+Bahamut+-+Turn+2&go=Go Wikimedia\Rdbms\DBQueryError from line 1496 of /var/www/<domain>/htdocs/w/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
 Query: SELECT page_id,page_namespace,page_title FROM `ffxiv_mw_page`,`ffxiv_mw_searchindex` WHERE (page_id=si_page) AND ( MATCH(si_title) AGAINST('+binding +coil +ofu800 +bahamut +- +turn +2u800 ' IN BOOLEAN MODE) ) AND page_namespace IN ('0','200','800') ORDER BY MATCH(si_title) AGAINST('+binding +coil +ofu800 +bahamut +- +turn +2u800 ' IN NATURAL LANGUAGE MODE) DESC LIMIT 21 
 Function: SearchMySQL::searchInternal
 Error: 1064 syntax error, unexpected '-' (<endpoint>.rds.amazonaws.com)


Backtrace:

#0 /var/www/<domain>/htdocs/w/includes/libs/rdbms/database/Database.php(1466): Wikimedia\Rdbms\Database->makeQueryException(string, integer, string, string)
 #1 /var/www/<domain>/htdocs/w/includes/libs/rdbms/database/Database.php(1226): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
 #2 /var/www/<domain>/htdocs/w/includes/libs/rdbms/database/Database.php(1693): Wikimedia\Rdbms\Database->query(string, string)
 #3 /var/www/<domain>/htdocs/w/includes/search/SearchMySQL.php(191): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
 #4 /var/www/<domain>/htdocs/w/includes/search/SearchMySQL.php(178): SearchMySQL->searchInternal(string, boolean)
 #5 /var/www/<domain>/htdocs/w/includes/search/SearchDatabase.php(69): SearchMySQL->doSearchTitleInDB(string)
 #6 /var/www/<domain>/htdocs/w/includes/search/SearchEngine.php(140): SearchDatabase->doSearchTitle(string)
 #7 /var/www/<domain>/htdocs/w/includes/search/SearchEngine.php(169): SearchEngine->{closure}()
 #8 /var/www/<domain>/htdocs/w/includes/search/SearchEngine.php(141): SearchEngine->maybePaginate(Closure)
 #9 /var/www/<domain>/htdocs/w/includes/specials/SpecialSearch.php(341): SearchEngine->searchTitle(string)
 #10 /var/www/<domain>/htdocs/w/includes/specials/SpecialSearch.php(190): SpecialSearch->showResults(string)
 #11 /var/www/<domain>/htdocs/w/includes/specialpage/SpecialPage.php(569): SpecialSearch->execute(NULL)
 #12 /var/www/<domain>/htdocs/w/includes/specialpage/SpecialPageFactory.php(568): SpecialPage->run(NULL)
 #13 /var/www/<domain>/htdocs/w/includes/MediaWiki.php(288): MediaWiki\Special\SpecialPageFactory->executePath(Title, RequestContext)
 #14 /var/www/<domain>/htdocs/w/includes/MediaWiki.php(860): MediaWiki->performRequest()
 #15 /var/www/<domain>/htdocs/w/includes/MediaWiki.php(517): MediaWiki->main()
 #16 /var/www/<domain>/htdocs/w/index.php(42): MediaWiki->run()
 #17 {main}

Event Timeline

debt subscribed.

This is related to Mediawiki core...and not our production search.

I found a task that was created after this incident as an exact duplicate and merged it. As mentioned in the original post it was discovered on 1.32.0 (production) and is still present as of 1.33.0-rc.0 on a development install. I will be upgrading production to 1.33.0 soon but given the change log since rc.0 I do not anticipate any improvement.

TJones lowered the priority of this task from High to Medium.Aug 27 2020, 9:18 PM
Universal_Omega raised the priority of this task from Medium to High.Jan 7 2021, 6:13 AM

Raising priority since this is still happening and is causing Database errors.

RhinosF1 lowered the priority of this task from High to Medium.Jan 7 2021, 6:23 AM

Priority reflects reality not wishes. Unless anyone wants to work on this and it's high priority for them then it's inappropriate to change priority.

Universal_Omega raised the priority of this task from Medium to High.Jan 7 2021, 6:25 AM

Priority reflects reality not wishes. Unless anyone wants to work on this and it's high priority for them then it's inappropriate to change priority.

I do plan to work on this. I forgot to mention that in my previous comment.

Priority reflects reality not wishes. Unless anyone wants to work on this and it's high priority for them then it's inappropriate to change priority.

Priority also reflects a basis of functionality. This sort of behavior described in the task is a critical negative that is detrimental to the functionality of a wiki's core search capability.

Aklapper lowered the priority of this task from High to Medium.

Compared with existing higher priority tasks this looks like medium priority overall. Please feel welcome to work on this, which can be done without raising priority. Thanks!

Can't seem to figure it out. Hopefully someone else can.

Not Miraheze related beyond we use MediaWiki.

Is this lost in the fold still? This has been an ongoing issue for a while. It's been affecting our wiki for a long while, and it's frustrating. I see this is medium priority at the moment, though I have little concept of what that means in terms of this getting resolved. Here's someone posting about it back in 2014!

InnoDB full-text search does not support the use of a leading plus sign with wildcard ('+*'), a plus and minus sign combination ('+-'), or leading a plus and minus sign combination ('+-apple'). These invalid queries return a syntax error.

https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html

The search query generated is therefore not valid. I'd echo T225517#5257171, in that dropping the single character as a search condition is probably for the best. Wrapping it with double quotes likely won't have the desired effect, as apparently the search engine does not strictly match nonword characters. As explained in the link above, "test phrase" matches "test, phrase", and there's probably going to be other weirdness in matching "-".

The search query generated is therefore not valid. I'd echo T225517#5257171, in that dropping the single character as a search condition is probably for the best. Wrapping it with double quotes likely won't have the desired effect, as apparently the search engine does not strictly match nonword characters. As explained in the link above, "test phrase" matches "test, phrase", and there's probably going to be other weirdness in matching "-".

I recognize the easiest way around this on a MediaWiki installation is to simply disallow a hyphen, dash, etc. in the title of a page. But that seems heavy handed, particularly for a non-Wikipedia or MediaWiki-related wiki. In the end, maybe we'll have to convert titles like "Book:Laboratory Journal - Spring 2019" into "Book:Laboratory Journal, Spring 2019" on our personal installation if this can never be addressed. But is seems a shame that a hyphen or dash causes so many problems.