mwdumper direct MySQL connection needs to distinguish UTF-8 and compat schemas
Closed, Declined · Public · BUG REPORT

Description

Currently mwdumper always tells the database it's on a UTF-8 connection when
speaking directly to the DB over JDBC.

This should work correctly with the UTF-8 schema, but not with the default
backwards-compatible schema; the result will be conversion by the DB into actual
Latin-1.

Some sort of horrible double-conversion might be necessary, assuming that even
works...
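
A rough sketch of what such a double conversion could look like is given below. It is not taken from mwdumper; the class and method names are hypothetical. The idea is to reinterpret each UTF-8 byte as one Latin-1 character before sending, so that when the server converts the string to Latin-1 it ends up storing the original UTF-8 bytes. One caveat, stated as an assumption: MySQL's latin1 is based on cp1252 rather than strict ISO-8859-1, so bytes in the 0x80-0x9F range may need extra handling that this sketch does not attempt.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of mwdumper.
public class DoubleEncode {

    // Encode the text as UTF-8, then reinterpret every byte as a single
    // ISO-8859-1 character. A later Latin-1 conversion by the database
    // would then write the original UTF-8 bytes unchanged.
    static String utf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Inverse operation: recover the real text from a string whose
    // characters each stand for one UTF-8 byte.
    static String latin1AsUtf8(String wrapped) {
        byte[] raw = wrapped.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "Z\u00fcrich";            // "Zürich", 6 characters
        String wrapped = utf8AsLatin1(title);    // one char per UTF-8 byte
        System.out.println(wrapped.length());    // 7, since "ü" takes 2 bytes in UTF-8
        System.out.println(latin1AsUtf8(wrapped).equals(title)); // true
    }
}
```

With the compat schema, the wrapped string would be the value sent over the UTF-8 connection; with the UTF-8 schema the text would be sent as-is.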


See Also: T16379: mwdumper crashes on non-latin input characters
Version: unspecified
Severity: normal
OS: Linux
Platform: PC

Details

Reference
bz9279

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:38 PM
bzimport set Reference to bz9279.

Test with binary schema as well...

and one more bugmail test. Sigh.

Assigning to brion. Probably created before new mwdumper issues were auto-assigned.

brion set Security to None.
Aklapper changed the subtype of this task from "Task" to "Bug Report". Feb 6 2022, 5:56 PM
hashar subscribed.

mwdumper can no longer process dumps generated since MediaWiki 1.31 (released in June 2018). The tool was started in 2005 and is no longer maintained; it is therefore being archived. See T351228 for reference.