LanguageConverter fails with HTML entities, &#nnnn; and &#xnnn; in external links
Closed, Resolved, Public

Description

Author: milant

Description:
gangleri told me to report a bug that he found. It's explained at
http://sr.wikipedia.org/wiki/user:Gangleri/tests/bugzilla/HTML_Entity_Names

I will quote gangleri:
" [http://sr.wikipedia.org/ &Auml;] should just generate a link "Ä" "


Version: unspecified
Severity: normal
URL: http://sr.wikipedia.org/wiki/user:Gangleri/tests/bugzilla/HTML_Entity_Names#Auml

Details

Reference
bz5233

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 9:09 PM)
bzimport set Reference to bz5233.
bzimport added a subscriber: Unknown Object (MLST).

milant wrote:

This is on the Serbian Wikipedia.

gangleri wrote:

Thanks Milan!

*additional note*
At [[sr:user:Gangleri/tests/bugzilla/HTML_Entity_Names#Auml]]
[http://sr.wikipedia.org/ &Auml;] generates the link "&Аумл" and not "Ä".

At [[sr:user:Gangleri/tests/bugzilla/HTML_Entity_Names#x6d]]
[http://sr.wikipedia.org/ &#x6d;] generates the link "&#x6д" and not "m".
Please note that the "д" in "x6д" is a Cyrillic letter: URL-encoded, the string is "x6%D0%B4".
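
The percent-encoding quoted above can be checked with a few lines of Python. This is only an illustrative sketch using the standard library; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# "x6" followed by U+0434 CYRILLIC SMALL LETTER DE, i.e. the mangled tail
# of the numeric character reference.
mangled = "x6\u0434"

# quote() percent-encodes the UTF-8 bytes of non-ASCII characters.
encoded = quote(mangled)
print(encoded)                      # x6%D0%B4
print(unquote(encoded) == mangled)  # True
```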

LanguageConverter currently transliterates the text of HTML entities. Instead, HTML
entities should *first* be converted into UTF-8 characters, and only then should
LanguageConverter do its job.
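
The ordering problem can be sketched in Python. This is not MediaWiki's LanguageConverter code: the transliteration table and the function names `convert`, `convert_naive`, and `convert_decode_first` are hypothetical, and the table covers only the letters needed to reproduce the two reported outputs.

```python
import html

# Hypothetical, deliberately tiny Latin-to-Cyrillic table (assumption:
# the real Serbian conversion tables are far larger).
LATIN_TO_CYRILLIC = str.maketrans({
    "A": "\u0410",  # А
    "u": "\u0443",  # у
    "m": "\u043c",  # м
    "l": "\u043b",  # л
    "d": "\u0434",  # д
})


def convert(text: str) -> str:
    """Transliterate every character that has an entry in the toy table."""
    return text.translate(LATIN_TO_CYRILLIC)


def convert_naive(label: str) -> str:
    """Convert the raw label, so the letters inside an entity get mangled."""
    return convert(label)


def convert_decode_first(label: str) -> str:
    """Decode entities to real characters first, then convert."""
    return convert(html.unescape(label))


print(convert_naive("&Auml;"))         # entity name is transliterated
print(convert_naive("&#x6d;"))         # hex digit "d" is transliterated
print(convert_decode_first("&Auml;"))  # Ä
```

With the naive order, the entity is no longer recognizable as an entity once its letters have been replaced. With the decode-first order, the converter only ever sees real characters. Whether a decoded letter such as "m" is then transliterated depends on the variant being rendered, which this sketch does not model.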

best regards reinhardt [[user:gangleri]]

gangleri wrote:

[[sr:user:Gangleri/tests/bugzilla/HTML_Entity_Names#nbsp]] is the only HTML
entity which "works".

Not sure whether this bug blocks
Bug 3985: character conversion (tracking)
or
Bug 3969: unicode compatibility (tracking)
There is no tracking bug for "general UTF-8 character coding compatibility".