
ERROR 1062: Duplicate entry for key 'name_title' (mwDumper failed to insert data into mysql)
Closed, Declined | Public | Bug Report

Description

bugs found

Recently I have been using mwdumper on the Chinese Wikipedia dump. However, I found that mwdumper fails to load some pages into the page, text and revision tables.
It seems that no page with a page ID greater than 4487034 is inserted into the MySQL tables, even though those pages do exist in the raw xml.bz2 file.

try to fix

I split the process into two steps: first using mwdumper to generate an SQL file, then loading that SQL file into MySQL.

After this, I found that the SQL file is probably correct (it includes the large page_ids).

However, the insert step is interrupted by a MySQL key error: ERROR 1062 (23000) at line 16389: Duplicate entry '4-\xE7\x9F\xA5\xE8\xAF\x86\xE9\x97\xAE\xE7\xAD\x94' for key 'name_title'
(I'm not familiar with the encoding, so I can't find its page_id.)
All of the lines after it then failed to be inserted into MySQL.
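
For anyone who wants to identify the page behind that error, here is a minimal sketch of decoding the key. It assumes the value has the form "namespace-title" and that the \xNN escapes are the UTF-8 bytes of the title; the function name decode_duplicate_entry is made up for this example.

```python
import re


def decode_duplicate_entry(entry):
    """Decode a 'namespace-title' key as printed in MySQL ERROR 1062.

    Assumption: the part before the first '-' is the numeric namespace and
    the \\xNN escapes in the rest are UTF-8 bytes of the page title.
    """
    namespace, _, escaped_title = entry.partition("-")
    # Replace each literal \xNN escape with the single byte it stands for.
    raw = re.sub(
        rb"\\x([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        escaped_title.encode("utf-8"),
    )
    return int(namespace), raw.decode("utf-8")


entry = r"4-\xE7\x9F\xA5\xE8\xAF\x86\xE9\x97\xAE\xE7\xAD\x94"
namespace, title = decode_duplicate_entry(entry)
print(namespace, title)  # prints: 4 知识问答
```

The name_title key covers the page_namespace and page_title columns of the page table, so the decoded pair (namespace 4, title 知识问答) can be used to look up the page_id of the row that is already present.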

Would it make sense to change "insert into text/page/revision" to "insert ignore into text/page/revision" so that this duplicate record is skipped?
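
As a possible workaround that does not touch mwdumper itself, the generated SQL file could be rewritten before loading. This is only a sketch: it assumes each statement in the file starts with "INSERT INTO" at the beginning of a line, and the names add_ignore and rewrite_dump are made up for this example.

```python
import re

# Match only an INSERT INTO at the start of a line, so that the same words
# appearing inside page text further along the line are left untouched.
_INSERT_RE = re.compile(r"^(\s*)INSERT\s+INTO\b", re.IGNORECASE)


def add_ignore(line):
    """Turn a leading 'INSERT INTO' into 'INSERT IGNORE INTO'."""
    return _INSERT_RE.sub(r"\1INSERT IGNORE INTO", line, count=1)


def rewrite_dump(lines):
    """Yield every line of the SQL dump with add_ignore applied."""
    for line in lines:
        yield add_ignore(line)


sample = [
    "INSERT INTO page (page_id,page_namespace,page_title) VALUES (1,4,'x');\n",
    "INSERT IGNORE INTO text (old_id) VALUES (1);\n",
    "COMMIT;\n",
]
print("".join(rewrite_dump(sample)), end="")
```

Note that INSERT IGNORE makes MySQL skip the conflicting row with a warning instead of an error, so the duplicate page is silently dropped while its rows in the revision and text tables may still be inserted.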

SqlWriter update

The page_counter field is no longer needed by the MediaWiki page table; it should be removed from SqlWriter15.java.

Event Timeline

iampkuhz assigned this task to awight.
iampkuhz raised the priority of this task from to Needs Triage.
iampkuhz updated the task description.
iampkuhz added a project: Utilities-mwdumper.
iampkuhz subscribed.
Aklapper renamed this task from mwDumper failed to insert data into mysql to mwDumper failed to insert data into mysql (ERROR 1062: Duplicate entry). Apr 23 2016, 9:00 AM
Aklapper renamed this task from mwDumper failed to insert data into mysql (ERROR 1062: Duplicate entry) to ERROR 1062: Duplicate entry for key 'name_title' (mwDumper failed to insert data into mysql).
Aklapper triaged this task as Medium priority.
Aklapper set Security to None.
awight subscribed.
Aklapper changed the subtype of this task from "Task" to "Bug Report". Feb 6 2022, 5:56 PM
hashar subscribed.

mwdumper can no longer process dumps generated since MediaWiki 1.31 (released in June 2018). The tool was started in 2005 and is no longer maintained, so it is being archived; see T351228 for reference.