
Source Editor autocorrects category entries
Closed, Resolved | Public | BUG REPORT

Description

Go to https://incubator.wikimedia.org/wiki/Wp/nys/Yira_djinang

Click on 'Edit source'

Make edit (e.g. insert new line 'test')

Click on 'Show preview'

What happens?:

[[Category:musician]] is autocorrected to [[Category:Musician]]

[[Category:Musician]] is not a current category, so under 'Parser profiling data:' it shows Musician in red

If you now press 'Publish changes' the page will be updated with the incorrect category

There was no warning about this change being made and the only clue that this happens is the red text if a new category is being created.
The red text is unlikely to be seen anyway as it appears right at the bottom of the wikipedia page, which is probably off the browser screen.

What should have happened instead?:
No change should have been made.

Clearly this could have been a major problem if I had been going through multiple pages making edits, unaware that I was introducing errors into the category entries.

Workaround is to use the Visual editor to correct the autocorrect

Software version (skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):
Using a MacBook Pro Mojave 10.14.6 with either Safari 14.1.2 or Firefox 109.0.1
N.B. Firefox Preferences -> General -> Language -> Check your spelling as you type, is unchecked

Event Timeline

matmarex added subscribers: jhsoby, matmarex.

I can reproduce in the normal wikitext editor.


I've never seen this feature before. It looks like it's provided by a gadget: https://incubator.wikimedia.org/wiki/MediaWiki:Gadget-AddPrefix.js @jhsoby is the author, please help :)

(Although the relevant localisation messages are defined in the WikimediaIncubator extension, which is a bit confusing: https://gerrit.wikimedia.org/g/mediawiki/extensions/WikimediaIncubator/+/8c3f1a460d150ad427926cfeca6b2ad57a3f8978/i18n/onwikimessages/en.json#27)

I'm pretty sure that the answer here is going to be that Incubator has some strange configuration that allowed the creation of titles like Category:Wp/nys/musician which are not following $wgCapitalLinks=true formatting. The 2017 editor's normally helpful title normalization is causing issues because the titles are weird. The expected title would be Category:Wp/nys/Musician with a capital M.

Page title semantics are slightly different in Incubator because of its heavy use of subpages: content is prefixed with two letters indicating the project, 2-3 letters indicating the language, and finally the "real" page title that will be used when the content is promoted from Incubator to an independent wiki.

It's a feature, not a bug. :-)

@bd808 is right about the titles. The Incubator uses prefixes (project/language code/) for all pages, and those "interfere" with normal capitalization of pages. In a normal wiki without prefixes, "Category:musician" would be corrected to "Category:Musician" when viewing the page because the first character in the pagename (the "m") is not capitalized. On the Incubator, however, that doesn't happen, because the first character in the page name is technically "W" (from the prefix "Wp/nys/").
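To illustrate the difference, here is a minimal Python sketch. It assumes that first-letter capitalization only ever touches the first character after the namespace. `normalize_title` is a hypothetical helper written for this comment, not MediaWiki's actual title code.

```python
def normalize_title(title: str) -> str:
    """Uppercase the first character of the page name, leaving the namespace alone.

    Hypothetical illustration of first-letter capitalization; not MediaWiki code.
    """
    namespace, sep, name = title.partition(":")
    if not sep:
        # No namespace present: the whole string is the page name.
        return title[:1].upper() + title[1:]
    return f"{namespace}:{name[:1].upper()}{name[1:]}"


# On a wiki without prefixes, the "m" is the first character and gets capitalized.
print(normalize_title("Category:musician"))         # Category:Musician
# On the Incubator, the first character is the "W" of the prefix, so nothing changes.
print(normalize_title("Category:Wp/nys/musician"))  # Category:Wp/nys/musician
```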

The AddPrefix gadget removes prefixes when you start editing, and adds them back when you finish editing (i.e. when you save or preview), so that you don't have to deal with prefixes at all. That way you can just write [[abc]] instead of [[Wp/nys/Abc|abc]]. As bd808 touches upon, this is also related to the $wgCapitalLinks setting, or what it would have been if the test wiki were an independent wiki. The orthographies of most languages written in the Latin (and Cyrillic, and Greek) alphabets follow the same convention of capitalizing the first letter in a sentence, but there are some that don't (notably many Taiwanese aboriginal languages). For those we have a separate setting (that the AddPrefix gadget uses) in MediaWiki:Globals.js that we can add the language code to.
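Roughly, the "add the prefix back" step behaves like the Python sketch below. This is not the gadget's code (the gadget is JavaScript). The regular expression, the `add_prefix` function and the `LOWERCASE_LANGS` set are all assumptions made for illustration, with `LOWERCASE_LANGS` standing in for the lowercase-language list in MediaWiki:Globals.js and "xyz" as a placeholder code.

```python
import re

# Hypothetical stand-in for the lowercase-language setting in MediaWiki:Globals.js.
LOWERCASE_LANGS = {"xyz"}  # "xyz" is a placeholder, not a real configuration entry

# Matches [[target]], [[target|label]] and [[Category:target]] in simplified form.
LINK_RE = re.compile(r"\[\[(Category:)?([^\]|]+)(\|[^\]]*)?\]\]")


def add_prefix(wikitext: str, project: str = "Wp", lang: str = "nys") -> str:
    """Re-add the Incubator prefix to links, capitalizing unless the language is exempt."""
    prefix = f"{project}/{lang}/"

    def rewrite(match: re.Match) -> str:
        namespace = match.group(1) or ""
        target = match.group(2)
        label = match.group(3) or ""
        if target.startswith(prefix):
            return match.group(0)  # already prefixed, leave untouched
        new_target = target
        if lang not in LOWERCASE_LANGS:
            new_target = target[:1].upper() + target[1:]
        if not namespace and not label:
            # Keep the text the editor typed as the visible link label.
            label = "|" + target
        return f"[[{namespace}{prefix}{new_target}{label}]]"

    return LINK_RE.sub(rewrite, wikitext)


print(add_prefix("[[abc]]"))                # [[Wp/nys/Abc|abc]]
print(add_prefix("[[Category:musician]]"))  # [[Category:Wp/nys/Musician]]
print(add_prefix("[[abc]]", lang="xyz"))    # [[Wp/xyz/abc|abc]]
```

The second call shows the behaviour reported in this task: a lowercase category name comes back with an uppercase initial unless the language code is in the lowercase list.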

The main point of this feature of capitalizing category names and page names is to avoid duplicate pages when projects are approved. Technically, you could have two different pages on the Incubator, like "Wp/nys/abc" and "Wp/nys/Abc", but those would actually be the same page ("Abc") when they are exported to an independent wiki, unless the language uses lowercase initials.
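As a rough illustration of that duplicate problem, the following Python sketch groups Incubator titles by the name they would have after export to a wiki that capitalizes first letters. The function name `find_export_collisions` and the sample titles are made up for this example.

```python
from collections import defaultdict


def find_export_collisions(titles, prefix="Wp/nys/"):
    """Group prefixed titles by their capitalized, prefix-free export name.

    Returns only the export names that more than one Incubator page maps to.
    """
    groups = defaultdict(list)
    for title in titles:
        if not title.startswith(prefix):
            continue  # not part of this test wiki
        name = title[len(prefix):]
        exported = name[:1].upper() + name[1:]
        groups[exported].append(title)
    return {name: pages for name, pages in groups.items() if len(pages) > 1}


sample = ["Wp/nys/abc", "Wp/nys/Abc", "Wp/nys/Xyz"]
print(find_export_collisions(sample))  # {'Abc': ['Wp/nys/abc', 'Wp/nys/Abc']}
```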

So the question to you, @Aarghdvaark, is: What are the orthographic conventions in Nyungar? Do sentences start with a capital letter? If so, you should probably rename the categories with lowercase initials to have uppercase initials. If not, we can add nys to the configuration for lowercase languages in Globals.js.

I did a visual scan of the 573 pages in Wp/nys and all start with a capital letter, except for language building block pages such as Wp/nys/-al.

Categories are a different matter: the majority start with a capital, but a significant number are lowercase.

I'm not averse to going with a convention that categories should start with a capital letter. But I am averse to correcting by hand all the pages with categories starting with a lowercase letter!

I can help to fix those automatically with my bot, that will only take me a few minutes.

jhsoby claimed this task.

It has been done now – all categories with lowercase initials were moved to uppercase, and all articles within them have been moved to the new categories.