
Special:LinkSearch shows normal chars, if the real weblink is encoded.
Closed, Resolved · Public

Description

Special:LinkSearch displays decoded characters where the actual web link contains percent-encoded ones. For example, it shows ' where the link contains %27. This makes the output nearly unusable for finding and replacing web links.

Event Timeline

Luke081515 raised the priority of this task from to Needs Triage.
Luke081515 updated the task description. (Show Details)
Luke081515 added a subscriber: Luke081515.
Restricted Application added subscribers: StudiesWorld, Aklapper. Feb 14 2016, 2:16 PM

Example in the other direction:
URL in the article [[de:Olympische Sommerspiele 1912/Tennis/Dameneinzel/Halle]]: http://www.itftennis.com/procircuit/tournaments/women%27s-tournament/info.aspx?tournamentid=1020000015 (with %27!)

Test with the original URL containing %27: Spezial:LinkSearch/http://www.itftennis.com/procircuit/tournaments/women%27s-tournament/info.aspx?tournamentid=1020000015 => fail, no match

Test with the manipulated URL containing ' instead of %27: Spezial:LinkSearch/http://www.itftennis.com/procircuit/tournaments/women's-tournament/info.aspx?tournamentid=1020000015 => success, finds [[de:Olympische Sommerspiele 1912/Tennis/Dameneinzel/Halle]]

Valid URL-encoded chars should not be replaced by decoded chars.

Here is a list of all encoded characters that are replaced:

[%21 !], [%24 $], [%26 &], [%27 '], [%28 (], [%29 )], [%2A *], [%2B +], [%2C ,], [%2D -], [%2E .], [%30 0], [%31 1], [%32 2], [%33 3], [%34 4], [%35 5], [%36 6], [%37 7], [%38 8], [%39 9], [%3A :], [%3B ;], [%3D =], [%40 @], [%41 A], [%42 B], [%43 C], [%44 D], [%45 E], [%46 F], [%47 G], [%48 H], [%49 I], [%4A J], [%4B K], [%4C L], [%4D M], [%4E N], [%4F O], [%50 P], [%51 Q], [%52 R], [%53 S], [%54 T], [%55 U], [%56 V], [%57 W], [%58 X], [%59 Y], [%5A Z], [%5F _], [%61 a], [%62 b], [%63 c], [%64 d], [%65 e], [%66 f], [%67 g], [%68 h], [%69 i], [%6A j], [%6B k], [%6C l], [%6D m], [%6E n], [%6F o], [%70 p], [%71 q], [%72 r], [%73 s], [%74 t], [%75 u], [%76 v], [%77 w], [%78 x], [%79 y], [%7A z], [%7E ~]

The URL-decoded characters appear only in the externallinks table; the URL in the HTML of the articles remains the original URL (with encoded characters).
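To make the mismatch concrete, here is a minimal Python sketch (not MediaWiki's actual code). It assumes the stored URL has had exactly the escapes listed above decoded, and that the search compares strings verbatim. The names `DECODED_CHARS`, `normalize_like_stored` and `link_search` are hypothetical. Applying the same normalization to the search term before comparing is one way the encoded search could be made to match.

```python
import re

# Characters reported above as being stored decoded (assumption taken from
# this task's list): letters, digits, and ! $ & ' ( ) * + , - . : ; = @ _ ~
DECODED_CHARS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!$&'()*+,-.:;=@_~"
)

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_like_stored(url: str) -> str:
    """Decode only the percent-escapes whose character is in DECODED_CHARS."""

    def repl(match: re.Match) -> str:
        ch = chr(int(match.group(1), 16))
        # Leave every other escape (e.g. %2F or %20) exactly as written.
        return ch if ch in DECODED_CHARS else match.group(0)

    return _ESCAPE.sub(repl, url)


def link_search(term: str, stored_urls: list[str]) -> list[str]:
    """Hypothetical search: normalize the term, then compare verbatim."""
    wanted = normalize_like_stored(term)
    return [u for u in stored_urls if u == wanted]
```

With the stored value containing ' and the search term containing %27, a verbatim comparison finds nothing, while `link_search` finds the row because both sides end up in the same form.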

Thoken added a subscriber: Thoken. Mar 2 2016, 7:01 PM

Change 275906 had a related patch set uploaded (by Ferveo):
Normalize user provided URL link for Special:LinkSearch page

https://gerrit.wikimedia.org/r/275906

ferveo added a subscriber: ferveo. Mar 8 2016, 8:41 PM
Florian assigned this task to ferveo. Mar 10 2016, 4:52 PM
Florian triaged this task as Medium priority.
Florian moved this task from To triage to Special:LinkSearch on the MediaWiki-Special-pages board.
Florian removed a project: MediaWiki-General.
  • Please evaluate whether the original URL can be stored unchanged in the el_from column (without HTML escapes) and the normalized URL in the el_index column. This could also help to find internationalized domain names (IDN, RFC 5890).
  • For %27, please normalize »'« to %27, because »''« is part of wikitext syntax.
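The two suggestions above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the merged patch: `LinkRow`, `to_index_key`, `make_row` and `search` are hypothetical names, and the real index column has its own format, which this sketch does not model. The original URL is kept untouched, and a separate search key maps both ' and %27 to %27.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRow:
    original: str   # URL exactly as written in the wikitext
    index_key: str  # normalized form, used only for searching


def to_index_key(url: str) -> str:
    """Map a literal apostrophe to %27 so both spellings share one key."""
    return url.replace("'", "%27")


def make_row(url: str) -> LinkRow:
    """Store the untouched URL next to its normalized search key."""
    return LinkRow(original=url, index_key=to_index_key(url))


def search(term: str, rows: list[LinkRow]) -> list[str]:
    """Normalize the search term the same way, then return original URLs."""
    key = to_index_key(term)
    return [row.original for row in rows if row.index_key == key]
```

Because the result is taken from `original`, the search returns the URL as it appears in the article, which is what a find-and-replace of web links needs.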

Change 275906 merged by jenkins-bot:
Normalize user provided URL link for Special:LinkSearch page

https://gerrit.wikimedia.org/r/275906

@ferveo: Hi! Is this task still valid and should still be open? If yes, are you still working (or still plan to work) on this task? (If you do not plan to work on this task anymore, please remove yourself as assignee (via Add Action...Assign / Claim in the dropdown menu) so in theory others could work on it.) Thanks!

Aklapper closed this task as Resolved. Fri, Jan 3, 12:28 AM

No reply to my last comment, assuming this is done. Please reopen if I'm wrong.