LocalSettings.php Encoding type (multi-language/Hebrew support as site name)
Closed, Declined (Public)

Description

Author: dov.wrk

Description:
The default encoding of LocalSettings.php is ANSI.
This is fine for English sites, but sites in other languages need multi-language support.
For example, I tried to change my wiki's name to a Hebrew name, but ran into encoding problems because LocalSettings.php is encoded as ANSI.
I found that converting the file's encoding to UTF-8 solves the problem.
I suggest changing the default encoding of LocalSettings.php to UTF-8.


Version: unspecified
Severity: normal
OS: Linux

Details

Reference
bz14653

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:11 PM
bzimport set Reference to bz14653.
bzimport added a subscriber: Unknown Object (MLST).

As far as I know LocalSettings.php is fully ASCII by default, and thus also UTF-8. Are you sure this is not a problem with the text editor you are using?
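The point that a pure-ASCII file is already valid UTF-8 can be checked directly. The following is only a sketch: `$wgSitename` is used as the example setting, and the two site names are made-up illustration values, not taken from the reporter's file.

```python
# A line containing only ASCII characters yields the same bytes
# whether it is encoded as ASCII or as UTF-8.
ascii_line = '$wgSitename = "MyWiki";'  # hypothetical example value
same_bytes = ascii_line.encode("ascii") == ascii_line.encode("utf-8")

# A Hebrew site name has no ASCII representation, but UTF-8 can encode it.
hebrew_line = '$wgSitename = "ויקי";'  # hypothetical example value
hebrew_bytes = hebrew_line.encode("utf-8")


def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


print(same_bytes)
print(is_pure_ascii(ascii_line.encode("utf-8")))
print(is_pure_ascii(hebrew_bytes))
```

If the file stops being pure ASCII only after the Hebrew name is typed in, the encoding the editor uses when saving is what decides the bytes that end up on disk.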

dov.wrk wrote:

I am sure about this.
The problem occurs on Linux systems (I am using FC7 on this particular server).
While the file's charset is ASCII, other languages are not recognized properly; once I opened the file and re-encoded it as UTF-8, the Linux machine recognized my language (Hebrew) very well.

So I suggest making this file UTF-8 by default.

I believe part of the issue is that some editors do not support UTF-8. Beyond that, the only real difference between the two encodings is how a program is told to read the file. The only way to force a program to read the file as UTF-8 is to add a BOM to the start of the file. However, in my experience not every program handles UTF-8 source files with a leading BOM cleanly, and as I remember PHP is one of them. Adding a BOM to the start of the file could quite possibly leave some people on various hosts with a lot of "headers already sent in LocalSettings.php" errors in their wiki.
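To make the BOM concern concrete: the UTF-8 BOM is the three bytes EF BB BF, and because PHP sends anything that precedes the opening `<?php` tag as output, those bytes can go out before any headers. A minimal sketch for checking and stripping a leading BOM follows; the function names and the sample bytes are hypothetical, chosen only for this illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical file contents: a BOM followed by the PHP opening tag.
sample = codecs.BOM_UTF8 + b"<?php\n"

print(has_utf8_bom(sample))
print(strip_utf8_bom(sample))
```

In practice the bytes would come from reading LocalSettings.php in binary mode; saving the file as "UTF-8 without BOM" in the editor avoids the problem in the first place.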

However, I think the real issue here may be multibyte characters, which PHP handles in its own way. For example, æ ends up converted to Ã¦. Perhaps the solution would be to type the text in an editor such as Notepad++, then use hex mode or some other conversion to turn the multibyte UTF-8 into single-byte ASCII and paste that into the string. Your text will then be output correctly on the web front end. Note that this is basically how all multibyte characters are handled in the language files.
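The æ example and the hex-conversion idea can be sketched as follows. This only illustrates the byte-level behaviour: `to_hex_escapes` is a hypothetical helper, and it assumes the result is pasted into a double-quoted PHP string, where `\xNN` escapes are interpreted as single bytes. It does not escape quotes, backslashes, or `$`.

```python
# "æ" is a single character but two bytes in UTF-8: 0xC3 0xA6.
text = "æ"
utf8_bytes = text.encode("utf-8")

# Reading those two bytes as a single-byte encoding (Latin-1 here)
# produces two separate characters instead of one.
misread = utf8_bytes.decode("latin-1")  # 'Ã¦'


def to_hex_escapes(s: str) -> str:
    """Render s as its UTF-8 bytes, using \\xNN escapes for non-ASCII bytes."""
    return "".join(
        chr(b) if b < 0x80 else f"\\x{b:02x}"
        for b in s.encode("utf-8")
    )


print(misread)
print(to_hex_escapes("æ"))
```

The escaped form contains only ASCII characters, so it survives any editor or transfer that mishandles UTF-8, at the cost of being unreadable in the source file.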

Not really a bug, as it's an issue with the end-user's editor, not the file itself.

All MediaWiki files, including LocalSettings.php, are all UTF-8 all the time.