
magic character conversions in Esperanto cause problems with foreign words, interlanguage links
Closed, Resolved, Public

Description

Author: gangleri

Description:
See: http://eo.wikipedia.org/wiki/Vikipedio:Bugzilla_1512

Regards Reinhardt


Version: unspecified
Severity: normal
URL: http://eo.wikipedia.org/wiki/Vikipedio:Bugzilla_1512

Details

Reference
bz1512

Event Timeline

bzimport raised the priority of this task from to High. Nov 21 2014, 8:12 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz1512.
bzimport added a subscriber: Unknown Object (MLST).

gangleri wrote:

Dear friends,

there are a few errors related to the special characters (c+x, ... u+x) used in
Esperanto, especially when the letter combinations "c" and "x", ... "u" and "x"
that generate them need to appear literally in titles such as [[Bordeaux]].

Both functions at [[:eo:]] and interlanguage links to and from [[:eo:]] are affected.

The error description is in English. Please feel free to contact me if you need
further details.

Regards Reinhardt

gangleri wrote:

A brief description in German is available at [[:de:Benutzer
Diskussion:FlaBot#eo: und Bugzilla:1512]] .

gangleri wrote:

A brief note for [[sourceforge:projects/pywikipediabot]] was posted at
http://sourceforge.net/mailarchive/forum.php?thread_id=6600723&forum_id=36014

rowan.collins wrote:

[attempt at a more descriptive summary; please correct it if I've misrepresented
the central problem under discussion]

gangleri wrote:

Hallo!

Please see the pages linked through interlanguage links to
http://www.wikipage.de/de/index.php/Andr%C3%A9_Malraux

Because the interwiki links at WikiPage are actually "n * n" and do not use a
gateway, as Wikipedia does with [[en:]], I could not observe interlanguage
failures at WikiPage.

At Wikipedia [[eo:André MALRAUXX]] will fail despite the fact that *two* X's are
used. The link would lead anyway to [[eo:André MALRAŬ]] which is a *temporary*
redirect to be able to reach these titles.

After some months of dealing with this error and asking bots to handle it
properly (to use double X's), I *now* recommend that interlanguage links to
these titles at [[eo:]] should be KISS, so that a normal "copy and paste" of the
title is the right way: [[eo:André MALRAUX]] and *not* [[eo:André MALRAUXX]],
[[eo:Bordeaux]] and *not* [[eo:Bordeauxx]].

I assume that the InterLanguage bug to [[eo:]] could be found (and hope that it
would be fixed) in the gateway configuration at [[en:]]. Other gateways as
[[meta:]] should be verified as well.

Note: Besides this error, the *original* bug report / linked description and the
*original* title ("interlanguage problems and others") mention many other bugs
with improper character conversion for many actions (edit, purge, watch) *at
[[eo:]]*, where you will return to [[eo:André MALRAŬ]] and *not* to [[eo:André
MALRAUX]] after these actions. If it is necessary to split these bugs, just
comment here.

Regards Reinhardt [[user:gangleri]]

gangleri wrote:

related to the Esperanto character conversion:
bug 2942: "Esperanto characters in image names are not handled correctly"

andreengels wrote:

I find this rather problematic. Check for example the Esperanto page
[[Bordeaux]]. This is being interpreted as [[Bordeaŭ]], which is redirected to
[[Bordeauxx]] = [[Bordeaux]] as a repair. However, if one goes to that page
(even if by typing Bordeauxx as the page one goes to), and then clicks the edit
page, one gets the edit page for [[Bordeaŭ]] = [[Bordeaux]]. The page itself
cannot be edited, except by changing the edit page URL by hand.
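
The behaviour described in this comment can be illustrated with a minimal Python sketch of the "x-system" decoding. This is an assumption about the convention as it is described in this thread (letter plus "x" becomes the accented letter, a doubled "x" stands for a literal "x"); it is not the actual MediaWiki code, and the names ACCENTED and decode_x_code are hypothetical.

```python
import re

# Assumed x-system mapping: base letter -> accented Esperanto letter.
ACCENTED = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}

# A base letter followed by a run of one or more x characters.
_X_RUN = re.compile(r"([cghjsuCGHJSU])([xX]+)")


def decode_x_code(text: str) -> str:
    """Interpret x-code (hypothetical helper, not MediaWiki's implementation).

    An odd run of x's marks an accented letter; every remaining pair of
    x's collapses to one literal x.
    """
    def repl(match: re.Match) -> str:
        letter, xs = match.group(1), match.group(2)
        n = len(xs)
        literal = xs[n - n // 2:]  # one literal x per pair of x's
        if n % 2:  # odd run: the first x turns the letter into its accented form
            accented = ACCENTED[letter.lower()]
            letter = accented.upper() if letter.isupper() else accented
        return letter + literal

    return _X_RUN.sub(repl, text)
```

Under these assumptions, decode_x_code("Bordeaux") yields "Bordeaŭ" and decode_x_code("Bordeauxx") yields "Bordeaux", which is why a title containing a literal "ux" is only reachable when every code path applies (or skips) the conversion consistently.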

gangleri wrote:

in response to comment #7

Halló André!

Thanks for your comment! I addressed this issue during the Wikimania MediaWiki
workshop and asked how and when reported bugs are handled / fixed, because
this particular bug had been reported half a year before. I do not recall the
answer exactly, but it seems that there are no "rigid" procedures and that fixing
depends on the availability of developers and on whether "people are able to
understand the reports". At the workshop I mentioned the impact on InterWiki
links and bots.

Magic character conversion in Esperanto is less tricky than editing BiDi text and
is not browser dependent. It seems that some / many functions do not take it
into account. Another example, besides the one you mentioned, is adding /
removing an article to / from your watchlist, which was mentioned at the
initial URL.

Without the available (or some) redirects such as [[eo:Rosa Luxemburg]] it would
not be possible to access these pages through wiki links:

[[eo:Speciala:Whatlinkshere/Roza_LUKSEMBURG]] shows "Rosa LUXEMBURG". However if
you click on the displayed link no redirect is available because this link
translates to
http://eo.wikipedia.org/w/index.php?title=Rosa_LUXEMBURG&redirect=no
and *not* to
http://eo.wikipedia.org/w/index.php?title=Rosa_LUXXEMBURG&redirect=no

It would be great if this could be fixed. Whoever would work on this will have
my full support providing test cases and feedback.

Best regards Reinhardt [[user:gangleri]]

*** Bug 2942 has been marked as a duplicate of this bug. ***

I've now disabled the interpretation of X-code in incoming links. That
was always problematic, but got nastier and more obvious once the
referrer check was removed before conversion.

(Editing should not be affected; items there should be properly
converted in both directions still).
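
The change described in this comment might be pictured as follows. This is only a sketch under stated assumptions: titles arriving through links are taken literally, while text passing through the edit form is still converted in both directions. The function names (to_x_code, from_x_code, title_for_request) and the from_edit_form flag are hypothetical and do not correspond to actual MediaWiki functions.

```python
import re

# Assumed x-system mapping and its inverse.
ACCENTED = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}
PLAIN = {accented: base for base, accented in ACCENTED.items()}

_DECODE = re.compile(r"([cghjsuCGHJSU])([xX]+)")
_ENCODE = re.compile(r"([cghjsuCGHJSUĉĝĥĵŝŭĈĜĤĴŜŬ])([xX]*)")


def from_x_code(text: str) -> str:
    """Edit form -> stored text: an odd run of x's marks an accented letter."""
    def repl(match: re.Match) -> str:
        letter, xs = match.group(1), match.group(2)
        n = len(xs)
        literal = xs[n - n // 2:]  # one literal x per pair of x's
        if n % 2:
            accented = ACCENTED[letter.lower()]
            letter = accented.upper() if letter.isupper() else accented
        return letter + literal

    return _DECODE.sub(repl, text)


def to_x_code(text: str) -> str:
    """Stored text -> edit form: double literal x's, spell accents as letter + x."""
    def repl(match: re.Match) -> str:
        letter, xs = match.group(1), match.group(2)
        if letter.lower() in PLAIN:
            base = PLAIN[letter.lower()]
            base = base.upper() if letter.isupper() else base
            return base + "x" + xs + xs  # odd run of x's
        return letter + xs + xs  # even run of x's (or none)

    return _ENCODE.sub(repl, text)


def title_for_request(raw_title: str, from_edit_form: bool) -> str:
    """Only edit-form input is interpreted as x-code; link targets stay literal."""
    return from_x_code(raw_title) if from_edit_form else raw_title
```

With this split, title_for_request("Bordeaux", from_edit_form=False) stays "Bordeaux", so a plain copy and paste of the title works in a link, while to_x_code and from_x_code remain inverses of each other for text shown in and saved from the edit box.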

gangleri wrote:

in reply to comment #10

I've now disabled the interpretation of X-code in incoming links. That
was always problematic, but got nastier and more obvious once the
referrer check was removed before conversion.

(Editing should not be affected; items there should be properly
converted in both directions still).

please see
bug 3609 migration of interlanguage links to Esperanto wikis

gangleri wrote:

leaving this bug as fixed

please see
bug 3615: blocks of code not handling magic character conversions in Esperanto
correctly - reason for page deletion

epriestley added a commit: Unknown Object (Diffusion Commit). Mar 4 2015, 8:15 AM