Page MenuHomePhabricator

Blank anchor [[#|anchor link]] should not link to the Main Page
Open, MediumPublicBUG REPORT

Description

Author: charlottethewebb

Description:
# or foobar links to a null page name rather than a null section. The example links point to http://en.wikipedia.org/wiki/ which redirects to the main page.

(except for ## which points to the current url plus "#.23")

As far as html literalism goes the least astonishing behavior for # would be to make a link like this <a href="#">#</a>

that would be the same as "#top" except in cases where it is overridden by a javascript onClick attribute (which it is, most times you see anything like this).

course if there's some way to make it turn blue and clickable but not actually do anything that might be best... i'm not sure when this would intentionally be used except when attempting to create a blue link that does nothing ever.

That's how i stumbled upon it after all :-/


Version: 1.21.x
Severity: trivial
URL: http://en.wikipedia.org/w/index.php?oldid=263850468
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=45583

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 10:23 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz17006.
bzimport added a subscriber: Unknown Object (MLST).

Probably a bad check for fragments somewhere.... perhaps in multiple places. It should indeed be possible to add an empty fragment but _have_ a fragment... which perhaps complicates things. :D

I've tracked it down to Title::getLinkUrl(), the problem is that Title stores fragments without the leading #, then adds it back on when making a URL (Title::getFragmentForURL()), so it doesn't know the difference between a blank fragment and no fragment at all. So when making the URL, it sees no namespace, no text, and no fragment.

We could use false or null to distinguish between empty string and not-present... but then we have to be more anal about all the checks... or replace the != '' etc with a hasFragment() which'll be easier to remember.

We could also probably reasonably assume that if we've ended up with an empty title object, it's a self-link with empty fragment, so could handle it as a special case.

  • Bug 26766 has been marked as a duplicate of this bug. ***
  • Bug 32728 has been marked as a duplicate of this bug. ***
  • Bug 23799 has been marked as a duplicate of this bug. ***
  • Bug 34545 has been marked as a duplicate of this bug. ***
  • Bug 33437 has been marked as a duplicate of this bug. ***
  • Bug 38103 has been marked as a duplicate of this bug. ***
  • Bug 44270 has been marked as a duplicate of this bug. ***

Raising the importance according to the number of dupes - obviously very wanted feature and more and more actual when we create more and more gadgets nowadays.

  • Bug 58647 has been marked as a duplicate of this bug. ***

Title::hasFragment as mention by brion in comment 3 was added with gerrit 104750, so checking against false or null would be possible (for core)

Aklapper changed the subtype of this task from "Task" to "Bug Report".Feb 5 2022, 2:33 PM
Aklapper removed a subscriber: wikibugs-l-list.

Change #1287469 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/core@master] Add some parser tests for `[[#]]` and `[[#|x]]`

https://gerrit.wikimedia.org/r/1287469

IMO the root of the problem is that Title::getFragment() (and also related lower-level interfaces, like Parsoid's LinkTarget::getFragment()) does not distinguish between no fragment and empty fragment. As a result we can't distinguish a completely empty title (which should be invalid, unless maybe T374555 changes something here) from a title with an empty fragment (which represents a link to a section within a page).

I think that's fixable, although it will take some effort. Start by changing these interfaces: https://codesearch.wmcloud.org/search/?q=public+function+getFragment\(\)%3A+string to return ?string instead of string, then review all of the code to return null for no fragment and '' for empty fragment…

IMO the root of the problem is that Title::getFragment() (and also related lower-level interfaces, like Parsoid's LinkTarget::getFragment()) does not distinguish between no fragment and empty fragment. As a result we can't distinguish a completely empty title (which should be invalid, unless maybe T374555 changes something here) from a title with an empty fragment (which represents a link to a section within a page).

I think that's fixable, although it will take some effort. Start by changing these interfaces: https://codesearch.wmcloud.org/search/?q=public+function+getFragment\(\)%3A+string to return ?string instead of string, then review all of the code to return null for no fragment and '' for empty fragment…

Note that LinkTarget::hasFragment() exists, so the first step could be just to distinguish $x->getFragment()=='' && $x->hasFragment() from $x->getFragment()=='' && !$x->hasFragment() (ie, define ::hasFragment to be something other than just $this->getFragment()!==''.).

There's also LinkTarget::createFragmentTarget() which currently says "pass '' to remove the fragment", and LinkTarget::isSameLinkAs() which doesn't check hasFragment.

And even after we fix all of that there will probably be some code which tests ::getFragment() for falsiness, which will also fail to distinguish between '' and null. Sigh.

I think it's probably easiest to use a new name, ie introduce getFragmentOrNull to return ?string; that makes it easy to @deprecate getFragment and use the tools we already have to ensure that all callers can use the new return value. Unfortunately, getFragment is a great name, so we might actually want to do this in multiple steps: first (a) allow hasFragment to return false when getFragment()==null and change code to *allow* passing null as a fragment (but not *return* null yet, which could break callers), (b) deprecate getFragment() in favor of getFragmentNew() (which can return null), (c) once a deprecation period has passed, remove ::getFragment(), (d) once another deprecation period has passed, deprecate getFragmentNew() in favor of getFragment() again.

We could skip some of those steps if we can think of a better name for getFragmentNew() -- maybe getUrlFragment() ?

Change #1289460 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/services/parsoid@master] Begin to distinguish "missing fragment" from "empty fragment"

https://gerrit.wikimedia.org/r/1289460