
MySQL search issues flagged by Phabricator setup
Closed, ResolvedPublic

Description

We switched over to MySQL for our search backend today. The setup check now flags a few issues. These don't necessarily need to change; they are more of an "are you sure this is what you want?" message. I'm thinking we leave the stopwords at the default, change ft_min_word_len to 3, and change the default boolean search operator to AND.

@Springle thoughts? :)

Event Timeline

chasemp created this task. Feb 11 2015, 6:44 PM
chasemp raised the priority of this task from to Needs Triage.
chasemp updated the task description.
chasemp added projects: acl*sre-team, Phabricator.
chasemp added subscribers: chasemp, Springle.
Restricted Application added a subscriber: Aklapper. Feb 11 2015, 6:44 PM
chasemp triaged this task as High priority. Feb 11 2015, 6:47 PM
chasemp set Security to None.
demon added a comment. Feb 11 2015, 8:01 PM
  • +1 to changing the boolean syntax to AND instead of OR. Nobody expects OR by default.
  • +1 to lowering min word length to 3, as long as it doesn't have insane performance implications
  • Indifferent to using Phabricator's stopword file

+1 to all three options, imo.

The Aria engine uses the same fulltext stopword list as MyISAM did, which is fairly long [1]. We also need to increase the aria_pagecache_buffer_size.

[1] https://mariadb.com/kb/en/mariadb/stopwords/#myisam-stopwords
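To make the effect of the minimum word length and the stopword list concrete, here is a rough Python sketch of which terms from a piece of text would end up searchable. It is only a model: the tokenizer is a simple regex rather than the real MySQL/MariaDB parser, `indexable_terms` is a hypothetical helper name, and `SAMPLE_STOPWORDS` is a tiny illustrative sample rather than the actual stopword list linked above.

```python
import re

# Tiny illustrative sample (assumption); the real stopword list is much longer.
SAMPLE_STOPWORDS = {"the"}


def indexable_terms(text, min_word_len=4, stopwords=SAMPLE_STOPWORDS):
    """Approximate which words a fulltext index would keep.

    Words shorter than min_word_len, and words in the stopword set,
    are dropped. This is a simplified model, not the real tokenizer.
    """
    words = re.findall(r"[A-Za-z0-9_]+", text.lower())
    return [w for w in words if len(w) >= min_word_len and w not in stopwords]


# With a minimum length of 4, none of these three-letter words survive.
print(indexable_terms("fix the git log bug", min_word_len=4))
# Lowering the minimum to 3 keeps everything except the stopword.
print(indexable_terms("fix the git log bug", min_word_len=3))
```

Under this model, a task titled with short words such as "git" or "bug" is unsearchable at a minimum length of 4, which is the motivation for lowering it to 3.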

+1 to the word length (I just reported this, see T89369) and +1 to the operator. To be honest I would classify the fact that a search engine defaults to OR as a bug nowadays.

Question: Is the fact that the search ignores my attempts to type AND a bug I should report?

gerritbot added a subscriber: gerritbot.

Change 190775 had a related patch set uploaded (by Springle):
phabricator using mysql fulltext T89274, tweaked for mariadb/aria

https://gerrit.wikimedia.org/r/190775

Patch-For-Review

The ft_boolean_syntax fix for default AND behavior has been applied as it doesn't technically need a DB restart. The ft_min_word_len, stopwords, and table rebuild do need the restart.

Question: Is the fact that the search ignores my attempts to type AND a bug I should report?

The technical reason is that MySQL boolean fulltext syntax [1] uses + instead of AND. A Phabricator bug report pitching it as a user interface failure might be justified.

[1] http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
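As an illustration of the gap between what users type and what boolean mode expects, here is a minimal Python sketch that rewrites a query containing literal AND tokens into the + form. It is not how Phabricator handles queries; `to_boolean_query` is a hypothetical helper, and it deliberately ignores quoted phrases, OR, and the other boolean-mode operators.

```python
def to_boolean_query(user_query):
    """Rewrite 'a AND b' as '+a +b' for boolean-mode fulltext search.

    Assumptions: terms are whitespace-separated, only an uppercase AND
    is treated as an operator, and terms that already start with + or -
    are left untouched.
    """
    rewritten = []
    for term in user_query.split():
        if term == "AND":
            continue  # the operator is implied by the '+' prefix
        if term[0] in "+-":
            rewritten.append(term)  # already carries an explicit operator
        else:
            rewritten.append("+" + term)  # '+' marks the term as required
    return " ".join(rewritten)


print(to_boolean_query("search AND backend"))  # +search +backend
```

A front end could apply a translation like this before handing the string to the database, so that typing AND behaves the way users expect.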

@Springle I think the puppet change looks good. When is a good time for you to knock this out? I'll try to sync up with you on IRC as well. Thanks again.

Change 190775 merged by Springle:
phabricator using mysql fulltext T89274, tweaked for mariadb/aria

https://gerrit.wikimedia.org/r/190775

chasemp closed this task as Resolved. Feb 18 2015, 12:50 AM
chasemp claimed this task.

done