Manual linking to Special:SearchByProperty difficult due to parameter encoding
Closed, ResolvedPublic

Description

Author: mediawiki.bugzilla

Description:
If I have a page or property with a hyphen or space in its name, I may want to do a Special:SearchByProperty on it, using the tidy slash-based syntax. For example, http://sandbox.semantic-mediawiki.org/wiki/User:Paulproteus contains:

[[Special:SearchByProperty/Your-Mom/Wears Polyester]]

This generates the following URL:

http://sandbox.semantic-mediawiki.org/wiki/Special:SearchByProperty/Your-Mom/Wears_Polyester

Visiting that URL shows that SMW interpreted the values this way: A list of all pages that have property "Your%Mom" with value "Wears_Polyester"

The sandbox wiki seems to be running SMW 1.4a. On learn.creativecommons.org/community (running SMW 1.3.4), the generated URLs use + to encode spaces in the value portion, and _ to encode spaces in the property portion. On that wiki, there are different sorts of encoding errors; further details on that are below.

There are two sensible fixes: changing the way links are generated so that spaces and hyphens are properly encoded, or fixing Special:SearchByProperty to decode them differently.

MORE INFO ON SMW 1.3.4 BEHAVIOR

The past behavior is not quite correct, but it is in some ways less wrong. http://learn.creativecommons.org/community/ATT_Knowledge_Network_Explorer_-_Education links to http://learn.creativecommons.org/community/Special:SearchByProperty/Tag/K-12%2Beducation , turning the " " in the value portion into a +. However, both non-alphanumeric characters there are corrupted. It does link the word "no" to http://learn.creativecommons.org/community/Special:SearchByProperty/Open_or_Free_Statement/no , properly encoding and decoding the "Open or Free Statement" into Open_or_Free_Statement.

WHY THIS MATTERS

This is the only way (I think) to, within the wiki, link to SearchByProperty results.

SUGGESTION

In my half-serious opinion, the best way to fix this sort of thing is to write a test and make quasi-random changes until the test passes, then read the patch that fixes the issue, be able to justify it as if you came up with it intentionally, and then commit.


Version: unspecified
Severity: normal

Details

Reference
bz16150

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:22 PM
bzimport set Reference to bz16150.
bzimport added a subscriber: Unknown Object (MLST).

I fear that the current processing is not "mis-decoding" but was a deliberate change in the way parameters are processed, with the goal of allowing SMW to create internal links with arbitrary parameter values from wiki pages (this was not possible before, creating all kinds of problems with special symbols).

The current encoding works as follows: first encode special symbols (including "%", "/", and "-") URL-style (using %), and then replace all "%" with "-". Multiple parameters are separated with /. One reason for the replacement of "%" is that we use the same encoding method everywhere, including in our XML-based export: In XML tags, % is not allowed as a symbol.
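The two steps described above can be sketched as follows. This is an illustration in Python, not SMW's actual code (SMW is written in PHP); the function names are hypothetical, and leaving only ASCII letters and digits unescaped is an assumption made for the sketch.

```python
from urllib.parse import unquote


def encode_param(value: str) -> str:
    """Percent-encode everything except ASCII letters/digits, then swap '%' for '-'."""
    out = []
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            # Assumption: every other byte, including "%", "/", "-" and " ", is escaped.
            out.append(f"%{byte:02X}")
    return "".join(out).replace("%", "-")


def decode_param(encoded: str) -> str:
    """Reverse the scheme: turn '-' back into '%', then percent-decode."""
    return unquote(encoded.replace("-", "%"))


def build_path(prop: str, value: str) -> str:
    """Join the encoded parameters with '/' after the special page name."""
    return "Special:SearchByProperty/" + encode_param(prop) + "/" + encode_param(value)


print(build_path("Your-Mom", "Wears Polyester"))
print(decode_param("Your-Mom"))
```

Under these assumptions, a literal hyphen has to be written as "-2D". Decoding the unescaped "Your-Mom" turns the hyphen into "%", and because "%Mo" is not a valid escape it is left alone, giving "Your%Mom" as in the original report.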

The replacement of " " by "_" is something that happens for all links in a wiki. I assume that we cannot avoid this at all. In the above encoding, one would write "-20" for spaces. This is another reason why we cannot use % encoding: "%20" could be used to encode " ", but then the URL does again contain "_" in this place.
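A rough model of why "%20" does not survive in an internal link while "-20" does. The function below is a hypothetical, heavily simplified stand-in for the wiki's link-target normalization (percent-decoding followed by space-to-underscore replacement), not MediaWiki's real code.

```python
from urllib.parse import unquote


def normalize_link_target(target: str) -> str:
    """Simplified model: percent-decode the link target, then turn spaces into underscores."""
    return unquote(target).replace(" ", "_")


# "%20" is decoded to a space and then becomes "_", so the space is lost.
print(normalize_link_target("Special:SearchByProperty/Tag/Wears%20Polyester"))

# "-20" contains nothing the wiki rewrites, so it reaches the special page intact.
print(normalize_link_target("Special:SearchByProperty/Tag/Wears-20Polyester"))
```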

I am rather sure that the new encoding is much more reliable than the old method. We can now encode essentially all parameter values in a way that allows us to make an internal wiki link and an XML entity. I also understand your request but I do not see a significantly better solution that is still able to encode all possible symbols reliably -- and changing the encoding would also change many existing URLs that might be used in other places.

I could offer to support a syntax like [[Special:SearchByProperty/Your-Mom::Wears Polyester]]. This syntax would then have no way to encode " " at all (the example would again mean "[[Special:SearchByProperty/Your-Mom::Wears_Polyester]]"), but - would work as expected.

mediawiki.bugzilla wrote:

Howdy Markus,

Sorry about the delay. I would appreciate that syntax, since it would be able to encode everything but " " (and it seems to me that using ugly URLs one should be able to encode " " anyway).

Unknown Object (User) added a comment. Nov 7 2012, 10:38 AM

In light of recent changes in SMW 1.8 and in respect of [1], does the described behaviour still prevail?

[1] https://gerrit.wikimedia.org/r/#/c/32023/

I think all parameters are still using the same escaping strategy as before. We should change this at some point. The special escaping was needed for RDF/XML export but there does not seem to be a reason why we should use a uniform escape strategy for all cases (so we can have special escaping for RDF and standard escaping for other functions).

Unknown Object (User) added a comment. Nov 7 2012, 11:10 AM

Well in this case I put this one on the 1.9 tracking ticket (bug 41842) so we don't have to break a leg or two here.

Removing as blocker for 1.9

Setting to new as no one is working on this