Page MenuHomePhabricator

SemanticForms autocompletion on property buggy (patch attached)
Closed, ResolvedPublic

Description

Author: mail

Description:
the fix

I brought this up on the mailing list a while ago:
http://groups.google.com/group/semantic-forms/browse_thread/thread/1d92f80ac8f8d4ee/2fce47200af23513?lnk=gst#2fce47200af23513

Now I updated to version 1.7.2 from 1.6 and found that the bug is still present, so I'm trying it via the bugtracker ;)

As it is now, SF's autocompletion does only work in some cases on properties - see for yourself at the specified URL (my Wiki is currently running vanilla SF 1.7.2, without my patch): Start with entering an 'a' into the "Test autocomplete on property (remote)" field. You will get "Abcd Efgh" and "abcd efgh" in the autocompletion list. Now empty the field and start with entering an 'e', which should bring up those two again, but nothing will come up. The attached patch fixes this problem.

The problem I fixed, was, that MySQL's LOWER() [1] does not work on BINARY, VARBINARY and BLOB. But page_title is (at least in my database) of type VARBINARY. I changed the SQL queries so that all arguments to LOWER() function calls are converted to UTF8 first.


Version: unspecified
Severity: normal
URL: http://patrick-nagel.net/wiki/Special:AddData/Test_autocomplete_on_property/X

attachment sf_autocomp_case3.patch ignored as obsolete

Details

Reference
bz19406

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 10:42 PM
bzimport set Reference to bz19406.

mail wrote:

Oh, I just noticed that SF_Utils.inc changed between SF 1.6 and SF 1.7.2, so my patch won't apply cleanly. I'll iron this out and attach a new patch.

mail wrote:

the fix (ported from SF 1.6 to SF 1.7.2)

attachment sf_autocomp_case4.patch ignored as obsolete

It appears that that page_title column used to be of type varbinary, but was then switched to varchar at least four years ago (version 1.5) - you must be using a very old wiki that's been upgraded along the way. In any case, thanks for the patch.

mail wrote:

(In reply to comment #3)

It appears that that page_title column used to be of type varbinary, but was
then switched to varchar at least four years ago (version 1.5) - you must be
using a very old wiki that's been upgraded along the way.

What is your source for this information? Both Wikis I'm having this problem on are recent installations (started with 1.11.0 or later).

I'm quite sure you can still get this with current Wiki installations - I think it's even the default. Have a look at the MediaWiki installer (config/index.php), at section Database character set:

"Select one:

  • MySQL 4.1/5.0 binary
  • MySQL 4.1/5.0 UTF-8
  • MySQL 4.0 backwards-compatible UTF-8

[...]

In binary mode, MediaWiki stores UTF-8 text to the database in binary fields. This is more efficient than MySQL's UTF-8 mode, and allows you to use the full range of Unicode characters. In UTF-8 mode, MySQL will know what character set your data is in, and can present and convert it appropriately, but it won't let you store characters above the Basic Multilingual Plane. "

MySQL 4.1/5.0 binary made more sense for me when I installed both Wikis (and it was set from the beginning, thus I regard it as default setting), and I guess that's where the VARBINARY comes from.

Oh, I see; my source was http://www.mediawiki.org/wiki/Manual:Page_table - it looks like that one doesn't explain the full story. Well, that's very good to know, and thanks again for the patch.

mail wrote:

Hi Yaron, thanks for including the patch in version 1.7.3.

I now realise that my patch actually fixed two issues - the one with the case and "VARBINARY users", and another one with autocompletion only working on the beginning of the property value, not on a beginning of a word in a property value. I had forgotten about the second fix already, and thus didn't mention it (which now causes a bit of confusion... sorry).

In line 133 of includes/SF_AutocompleteAPI.php you changed
[...] LIKE '% " [...]
back to
[...] LIKE '%\_" [...]
which looks logical when taking into account what line 130 does.

But the thing is, that column value_xsd of table smw_atts2 does *not* use underscores instead of spaces (at least not in my two Wiki databases), which leads to the problem that autocompletion only finds beginnings of property values, but not beginnings of words in property values.

Knowing this, and thinking a bit more about it, we also need to remove lines 130 and 131, since they make it impossible to have a space in the autocomplete search string (substring).

I'll use the following example property value to illustrate the situation: "Software Engineer / IT Security Engineer"

Current situation in 1.7.3 (and before my patch against 1.7.2):

  • Autocompleting on "Soft" shows the string
  • Autocompleting on "Security" does not show the string
  • Autocompleting on "Security Engineer" does not show the string

Situation after my (unchanged) old patch (against 1.7.2):

  • Autocompleting on "Soft" shows the string
  • Autocompleting on "Security" shows the string
  • Autocompleting on "Security Engineer" does not show the string

Situation after my new patch (against 1.7.3, which again changes "%\_" to "% " like my old patch did, and additionally removes the space-to-underscore conversion in the autocomplete substring):

  • Autocompleting on "Soft" shows the string
  • Autocompleting on "Security" shows the string
  • Autocompleting on "Security Engineer" shows the string

mail wrote:

fix for the second autocompletion issue

attachment sf_autocomp_case5.patch ignored as obsolete

Ah, I didn't notice the other part of your initial patch, although maybe that's for the best since there are more changes now. But unfortunately, this new patch doesn't work for me, for properties of type "page", since page titles are stored in the database with underscores instead of spaces - any space in the search string leads to no results. Do page titles have spaces in the "VARBINARY" representation? In any case, a true solution has to handle both, in one way or another - maybe just through an "if" statement.

mail wrote:

(In reply to comment #8)

But unfortunately, this new
patch doesn't work for me, for properties of type "page", since page titles are
stored in the database with underscores instead of spaces - any space in the
search string leads to no results. Do page titles have spaces in the
"VARBINARY" representation?

No, they also have underscores - I just didn't think of that, and didn't test with properties of type "page". Now the space-problem is fixed for type "string" properties, but it's broken for type "page" properties.

In any case, a true solution has to handle both, in
one way or another - maybe just through an "if" statement.

Yes, either with an if-statement, or doing a MySQL-String-Replace (http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_replace) in order to convert the underscores to spaces. Selecting the smw_sortkey column instead of the smw_title column would probably also work - since the only difference between those two seems to be that smw_sortkey values have spaces instead of underscores.

Which solution would you prefer?

mail wrote:

next try, replaces underscores to spaces in SQL (against 1.7.3)

This is the solution with MySQL's string replace.

Attached:

Thanks for the patch; this has been added to version 1.8.