Page MenuHomePhabricator

Searching with hyphen can cause mysql query error
Closed, DuplicatePublic

Description

Reported on IRC, MW 1.31.2, MySQL 5.7.26-0ubuntu0.18.04.1

https://comicwiki.dk/index.php?title=Speciel%3AS%C3%B8gning&search=Blueberry%20-%20De%20samlede%20eventyr&fulltext=1

Search string was Blueberry - De samlede eventyr

Seems the issue has been around a long time - https://www.mediawiki.org/wiki/Topic:Rz5tsfylmgjdmlvp (since at least 1.21)

[XP@91yP-rZLZDBuH3rzr-wAAAAI] /index.php?title=Speciel%3AS%C3%B8gning&search=Blueberry%20-%20De%20samlede%20eventyr&fulltext=1 Wikimedia\Rdbms\DBQueryError from line 1457 of /var/www/html/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: SELECT page_id,page_namespace,page_title FROM `mw_page`,`mw_searchindex` WHERE (page_id=si_page) AND ( MATCH(si_title) AGAINST('+blueberry +- +deu800 +samlede +eventyr ' IN BOOLEAN MODE) ) AND page_namespace = '0' LIMIT 20 
Function: SearchMySQL::searchInternal
Error: 1064 syntax error, unexpected '-' (localhost)
Backtrace:

#0 /var/www/html/includes/libs/rdbms/database/Database.php(1427): Wikimedia\Rdbms\Database->makeQueryException(string, integer, string, string)
#1 /var/www/html/includes/libs/rdbms/database/Database.php(1200): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /var/www/html/includes/libs/rdbms/database/Database.php(1653): Wikimedia\Rdbms\Database->query(string, string)
#3 /var/www/html/includes/search/SearchMySQL.php(191): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
#4 /var/www/html/includes/search/SearchMySQL.php(178): SearchMySQL->searchInternal(string, boolean)
#5 /var/www/html/includes/specials/SpecialSearch.php(318): SearchMySQL->searchTitle(string)
#6 /var/www/html/includes/specials/SpecialSearch.php(185): SpecialSearch->showResults(string)
#7 /var/www/html/includes/specialpage/SpecialPage.php(565): SpecialSearch->execute(NULL)
#8 /var/www/html/includes/specialpage/SpecialPageFactory.php(568): SpecialPage->run(NULL)
#9 /var/www/html/includes/MediaWiki.php(288): SpecialPageFactory::executePath(Title, RequestContext)
#10 /var/www/html/includes/MediaWiki.php(861): MediaWiki->performRequest()
#11 /var/www/html/includes/MediaWiki.php(524): MediaWiki->main()
#12 /var/www/html/index.php(42): MediaWiki->run()
#13 {main}

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

I note, on my dev wiki (so roughly master) with MariaDB (rather than MySQL) 10.3.15-MariaDB-1:10.3.15+maria~disco-log it doesn't error

I feel this report is dejavu (like I've seen something else reported recently)... So maybe it's fixed in master (or some recent ish release branch) and something needs backporting...

Else we're doing something allowed only on newer mysql/mariadb, and older supported versions don't like it

The affected wiki is 1.31.2, not 1.32.1.

Pulled a mysql 5.7.26 container, and indeed in boolean mode mysql doesn't like +-. Interestingly I tried with mariadb 10.3.15 and it emits the same error. This suggests we are doing some sort of different query building on mysql vs mariadb?

With a little playing around it looks like double quoting would prevent the parse error, but +"-" doesn't match anything. This is somewhat expected, most search engines don't index things like the -. It might be worth dropping single non-character strings when prefixing everything with +?

On my dev wiki with $wgDebugDumpSql

[DBQuery] SearchMySQL::minSearchLength [0.006s]: SHOW GLOBAL VARIABLES LIKE 'ft\_min\_word\_len'
SearchMySQL::parseQuery: Would search with '+blueberry  +-  +deu800  +samlede  +eventyr '
SearchMySQL::parseQuery: Match with /\bBlueberry\b|\b\-\b|\bDe\b|\bsamlede\b|\beventyr\b/
[DBQuery] SearchMySQL::searchInternal [0.002s]: SELECT  page_id,page_namespace,page_title  FROM `mw_page`,`mw_searchindex`    WHERE (page_id=si_page) AND ( MATCH(si_title) AGAINST('+blueberry  +-  +deu800  +samlede  +eventyr ' IN BOOLEAN MODE) ) AND page_namespace = '0'  ORDER BY  MATCH(si_title) AGAINST('+blueberry  +-  +deu800  +samlede  +eventyr ' IN NATURAL LANGUAGE MODE) DESC  LIMIT 21
SearchMySQL::parseQuery: Would search with '+blueberry  +-  +deu800  +samlede  +eventyr '
SearchMySQL::parseQuery: Match with /\bBlueberry\b|\b\-\b|\bDe\b|\bsamlede\b|\beventyr\b/
[DBQuery] SearchMySQL::searchInternal [0.001s]: SELECT  COUNT(*) as c  FROM `mw_page`,`mw_searchindex`    WHERE (page_id=si_page) AND ( MATCH(si_title) AGAINST('+blueberry  +-  +deu800  +samlede  +eventyr ' IN BOOLEAN MODE) ) AND page_namespace = '0'
SearchMySQL::parseQuery: Would search with '+blueberry  +-  +deu800  +samlede  +eventyr '
SearchMySQL::parseQuery: Match with /\bBlueberry\b|\b\-\b|\bDe\b|\bsamlede\b|\beventyr\b/
[DBQuery] SearchMySQL::searchInternal [0.001s]: SELECT  page_id,page_namespace,page_title  FROM `mw_page`,`mw_searchindex`    WHERE (page_id=si_page) AND ( MATCH(si_text) AGAINST('+blueberry  +-  +deu800  +samlede  +eventyr ' IN BOOLEAN MODE) ) AND page_namespace = '0'  ORDER BY  MATCH(si_text) AGAINST('+blueberry  +-  +deu800  +samlede  +eventyr ' IN NATURAL LANGUAGE MODE) DESC  LIMIT 21
SearchMySQL::parseQuery: Would search with '+blueberry  +-  +deu800  +samlede  +eventyr '
SearchMySQL::parseQuery: Match with /\bBlueberry\b|\b\-\b|\bDe\b|\bsamlede\b|\beventyr\b/
[DBQuery] SearchMySQL::searchInternal [0.001s]: SELECT  COUNT(*) as c  FROM `mw_page`,`mw_searchindex`    WHERE (page_id=si_page) AND ( MATCH(si_text) AGAINST('+blueberry  +-  +deu800  +samlede  +eventyr ' IN BOOLEAN MODE) ) AND page_namespace = '0'
[DBQuery] Title::getTitleProtectionInternal [0.001s]: SELECT  pt_user AS `user`,pt_expiry AS `expiry`,pt_create_perm AS `permission`,comment_pt_reason.comment_text AS `pt_reason_text`,comment_pt_reason.comment_data AS `pt_reason_data`,comment_pt_reason.comment_id AS `pt_reason_cid`  FROM `mw_protected_titles` JOIN `mw_comment` `comment_pt_reason` ON ((comment_pt_reason.comment_id = pt_reason_id))   WHERE pt_namespace = '0' AND pt_title = 'Blueberry_-_De_samlede_eventyr'

Interesting, wonder whats different:

$ docker run --name some-mariadb -e MYSQL_ROOT_PASSWORD=password -d mariadb:10.3.15
a8a35ac96e5556b78a813df08eec9198e0ef6499e37428783d56ddfb8c531a65

$ docker exec -it some-mariadb mysql -u root -ppassword
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.3.15-MariaDB-1:10.3.15+maria~bionic mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> create database search;
Query OK, 1 row affected (0.000 sec)

MariaDB [(none)]> use search;
Database changed
MariaDB [search]> CREATE TABLE articles (
    ->           id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    ->           title VARCHAR(200),
    ->           body TEXT,
    ->           FULLTEXT (title,body)
    ->         ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.160 sec)

MariaDB [search]> INSERT INTO articles (title,body) VALUES
    ->         ('MySQL Tutorial','DBMS stands for DataBase ...'),
    ->         ('MySQL - Tutorial', 'In this tutorial we will ...');
Query OK, 2 rows affected (0.006 sec)
Records: 2  Duplicates: 0  Warnings: 0

MariaDB [search]> select * from articles where MATCH(title,body) AGAINST('+mysql +- +tutorial' IN BOOLEAN MODE) ORDER BY MATCH(title, body) AGAINST ('+mysql +- +tutorial' IN NATURAL LANGUAGE MODE) DESC;
ERROR 1064 (42000): syntax error, unexpected '-'

I'm just using the mariadb.org package for Ubuntu 19.04, stock config AFAIK, but possibly not currently what's distributed (as it's an old vm that's been upgraded through many many OS versions)...

Looks like you're using the packages from the same place, just for an older OS

I can post my mysql config/config vars if it helps

It's Ubuntu 18.04 with mysql 5.7.26, which is what was in the repository. It doens't seem like MariaDB would have made any difference.