
Make page titles case-insensitive (but preserve case-sensitivity for display)
Open, Lowest, Public, Feature

Description

Author: alpeterson

Description:
I finally got a good idea of how to reduce the number of redirects and annoying
piped links, and to start the transition to case-preserving, case-insensitive
page titles:

a "unique" boolean column / field, added to the table that lists the wiki pages.

If the matching entry is tagged "unique", that is it: stop the MySQL query and
output that result.

If it is not tagged "unique", then keep searching for potential pages to
return.

Basically, have a flag for uniquely spelled pages. This unique flag can be
turned on initially only for pages with English-only names, so as not to
interfere with UTF-8 named pages.
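
To make the proposal concrete, here is a minimal sketch of the lookup as I read it. Everything in it is an assumption for illustration: the page table is modelled as a plain dict, and `PAGES` and `resolve_title` are hypothetical names, not anything in MediaWiki. A case-insensitive match that is flagged unique ends the search; otherwise only an exact-case match counts.

```python
# Hypothetical stand-in for the page table: stored title -> "unique" flag.
PAGES = {
    "John Doe": True,   # the only capitalization of this name
    "Homsar": False,    # "Homsar" and "homsar" are two distinct pages
    "homsar": False,
}


def resolve_title(requested, pages=PAGES):
    """Return the stored title to load for `requested`, or None."""
    folded = requested.casefold()
    matches = [title for title in pages if title.casefold() == folded]
    # A single match flagged unique ends the search, whatever case was typed.
    if len(matches) == 1 and pages[matches[0]]:
        return matches[0]
    # Not unique: fall back to today's behaviour, an exact-case match only.
    return requested if requested in pages else None
```

With this table, `resolve_title("john doe")` finds "John Doe", while "Homsar" and "homsar" still resolve to their own separate pages.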

This would reduce the editing time on pages substantially, and would allow us
to drop the awkward (though probably loved, so I hate to bash it) convention of
capitalizing the first letter of a WikiName.

I am hoping to join the coding effort, I need to learn a bunch of stuff first,
but I really hope to get this feature... at least on my lesser wiki where
performance isn't a problem.. yet... -AP

Maybe MySQL doesn't support this? Maybe Postgres does? I mean the combining of
two queries into one, where if one is met, the other halts... because this could
result in a performance increase of up to 1/2 for the database end of things
(assuming checking a boolean is cheap, and checking the filename is expensive).

Anyway, when somebody writes a new page with a name that is already there, the
database has to be queried to see if anybody has edited that page in the
meantime. That same query could return "did you mean to edit this page with the
same spelling but different capitalization?" If the person says no, the unique
flag is removed from that page, and then the query must continue on to check
for all instances of a certain page...

(I was told that the MySQL query is already case-insensitive, which doesn't
make sense to me: that the selecting of the right page is done in PHP with the
list of results returned by the query when loading a page??? I might have
been mistaken. It makes sense for the search to be case-insensitive, and the
person was probably thinking that that was what I was talking about.)

Anyway, case-insensitivity could have other advantages for the database... (I'm
theorising, I'm not a database designer... so this is why I wanted to post
this idea on the wiki counterpart to this page, but [[m:case sensitivity]]
isn't something I feel like typing into my browser.)

Bah, the page names are probably all cached niftily...

Anyway, I am constantly annoyed by miscapitalizations and rarely correct
compromises... I really hope that my musings have resulted in some useful
ideas for this system.

-AP (my email address is changing soon)


Version: unspecified
Severity: enhancement

Details

Reference
bz453

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 21 2014, 6:52 PM
bzimport set Reference to bz453.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

*** Bug 9173 has been marked as a duplicate of this bug. ***

ayg wrote:

I'm seriously confused by comment #0. As far as I can see, this would quite
simply (hah) entail adding a field page_display_title that would allow any
capitalization and allow mixing underscores and spaces, uniformly lowercase
page_title, and use the normalized page_title as the DB key as now while using
page_display_title for display instead of the voodoo magic we do now. The issue
would be in converting everything so existing conflicts are dealt with cleanly
and nothing in the entire codebase breaks. I don't get the proposed solution,
which is also incidentally 2.5 years old.
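
A rough sketch of that two-column idea, using an in-memory SQLite table rather than MediaWiki's real schema. The helper names (`normalize_title`, `add_page`, `display_title_for`) are made up for illustration, and namespaces are ignored entirely; only `page_title` and `page_display_title` come from the comment above.

```python
import sqlite3


def normalize_title(text):
    """Fold a title to its lookup key: single underscores, all lowercase."""
    return "_".join(text.replace("_", " ").split()).lower()


def create_schema(conn):
    conn.execute(
        "CREATE TABLE page ("
        " page_title TEXT PRIMARY KEY,"        # normalized key used for lookups
        " page_display_title TEXT NOT NULL)"   # capitalization/spacing as typed
    )


def add_page(conn, display_title):
    # Raises sqlite3.IntegrityError if the normalized key already exists.
    conn.execute(
        "INSERT INTO page (page_title, page_display_title) VALUES (?, ?)",
        (normalize_title(display_title), display_title),
    )


def display_title_for(conn, requested):
    """Look up by normalized key; return the stored display form or None."""
    row = conn.execute(
        "SELECT page_display_title FROM page WHERE page_title = ?",
        (normalize_title(requested),),
    ).fetchone()
    return row[0] if row else None
```

After `add_page(conn, "John Doe")`, a request for "john_doe" or "JOHN DOE" returns "John Doe". A later `add_page(conn, "john doe")` fails on the primary key, which is a small-scale version of the conversion conflicts that would need handling on existing wikis.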

Anyway, I'm clarifying the purpose of this bug, since I didn't see any suitable
similar bugs.

andre wrote:

Hey, I'm just posting to let you know that allowing case-insensitive page title searches, while keeping case-sensitive pages, would in my eyes be a really great feature for the Wikipedia project, so I really vote for implementing this. It would reduce the number of necessary redirects from things like "John doe" to "John Doe" (not to mention things like "HHC Hardenberg" or "President of the United States", where it gets harder and harder), reduce the possibility of getting two articles about the same subject (which happens regularly, at least on the Dutch Wikipedia), and would not bother the Wikipedia user with a worthless "(Redirected from John doe)" message when entering "john doe" (as everyone does) in the search field. So... I really hope that this can be implemented.

dotcom wrote:

This feature is possible. I have implemented it at the Homestar Runner wiki. Not only does my implementation match links in a case-insensitive manner, it will also match plurals (configurable in LocalSettings on a per-wiki basis). The functions handle typed URLs, searches, and links in wikitext. Here are the specifications as listed on http://www.hrwiki.org/index.php/HRWiki:Autopipe/Autoredirect :


When a link is entered on a page that points to a page in the _main, project, user, or help namespace_, the system first tries to match it exactly as it's written. If the page does not exist, the system automatically tries to create a hidden piped link to a _capitalized/lowercase or plural form_ (except it does not attempt to match plural forms for the user namespace). Similarly, if the system cannot find a page based on a URL, it tries to automatically create a redirect. What this means is that it is no longer necessary to create a piped link from "goat" to "Goats", for example, and it is not necessary to have a corresponding redirect page. Pages where the capitalized and lowercase forms mean different things, like "Homsar" and "homsar", are unaffected.

If it should ever become necessary to create a page that ordinarily would be autoredirected, just type it into the URL, and you'll be presented with a "redirected from" link, such as http://www.hrwiki.org/index.php?title=goat&redirect=no. Then create the page like normal.
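
The fallback order in the specification above can be sketched roughly as follows. This is not the HRWiki code, which has not been posted in this thread. The function names are hypothetical, only the first letter's case is varied, and the plural rule is a naive trailing "s"; the real rules are configurable and may well differ.

```python
# Namespaces the specification lists as eligible for automatic matching.
AUTO_NAMESPACES = {"main", "project", "user", "help"}


def candidate_titles(title, allow_plural=True):
    """Yield alternative spellings to try, in order, after an exact miss."""
    variants = []
    if title:
        variants.append(title[0].upper() + title[1:])  # capitalized form
        variants.append(title[0].lower() + title[1:])  # lowercase form
    if allow_plural:
        for form in list(variants):
            variants.append(form + "s")                # naive plural
            if form.endswith("s"):
                variants.append(form[:-1])             # naive singular
    seen = {title}
    for form in variants:
        if form not in seen:
            seen.add(form)
            yield form


def resolve_link(title, existing, namespace="main"):
    """Return the existing page a link should point to, or None."""
    if title in existing:              # an exact match always wins
        return title
    if namespace not in AUTO_NAMESPACES:
        return None
    allow_plural = namespace != "user"  # no plural matching for user pages
    for form in candidate_titles(title, allow_plural):
        if form in existing:
            return form
    return None
```

With existing pages "Goats", "Homsar" and "homsar", a link to "goat" resolves to "Goats", while "Homsar" and "homsar" each keep resolving to themselves because the exact match is tried first.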

My implementation requires some methods to be added to Title.php; a minor change to Wiki.php, SearchEngine.php, and Linker.php; and some fairly substantial changes to a couple of spots in Parser.php. If this feature were to be seriously considered for MediaWiki proper, I would be more than happy to elaborate on how I did it.

happy.melon.wiki wrote:

IMO this should be WONTFIXed: we have redirects, which are *not* evil as many people believe, we have the recently-overhauled LuceneSearch and mwsuggest, and we have {{DISPLAYTITLE:...}} to correct page display. I don't think that any additional functionality is needed, certainly not on the level of a schema change.

dotcom wrote:

My implementation in comment #5 doesn't require a schema change. It makes many redirects unnecessary but not obsolete. Having used it for a while now I can't imagine living without it.

Go ahead and post the patch (against trunk, please). Can't hurt to have it in front of us to review, and it may very well be worth committing to MW.

*** Bug 22055 has been marked as a duplicate of this bug. ***

*** Bug 22211 has been marked as a duplicate of this bug. ***

a.koppad wrote:

I am trying to resolve this issue, using the word CHICKEN as an example.

The link http://en.wikipedia.org/wiki/CHICKEN reports that Wikipedia does not have an article with this exact name. However, it also says "Please search for CHICKEN in Wikipedia to check for alternative titles or spellings."

Clicking that search link leads to the relevant pages. Can this issue be marked as resolved for now?

Anu: The default "Article not found" page with a Search link is not a "resolution" for this request.
FYI, the underlying problem of this bug report is about *page titles* and described in https://meta.wikimedia.org/wiki/Case sensitivity_of_page_names

The correct link is https://meta.wikimedia.org/wiki/Case_sensitivity_of_page_names

Coincidentally, I also found https://meta.wikimedia.org/wiki/Case_insensitivity_of_page_names , where you can find more arguments for and against case sensitivity.

In any case, nobody is currently planning to work on this. Setting priority accordingly.

Unknown Object (User) subscribed. Sep 27 2017, 8:17 PM
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM
Aklapper removed a subscriber: bzimport.