
In replace.py the XML page generator no longer seems to work
Closed, Resolved | Public | BUG REPORT

Description

My bot runs in de.wiktionary.

bot@udo-t:~/pywikibot-core$ python3 pwb.py version
Pywikibot: [https] r-pywikibot-core (b32470d, g16369, 2022/04/07, 16:06:45, stable)
Release version: 7.1.0
setuptools version: 62.0.0
mwparserfromhell version: 0.6.4
wikitextparser version: n/a
requests version: 2.27.1
  cacerts: /etc/ssl/certs/ca-certificates.crt
    certificate test: ok
Python: 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110]
PYWIKIBOT_DIR: Not set
PYWIKIBOT_DIR_PWB: /home/bot/pywikibot-core
PYWIKIBOT_NO_USER_CONFIG: Not set
Config base dir: /home/bot/pywikibot-core
Usernames for family 'wiktionary':
        de: UT-Bot

I have a fix in user-fixes.py that has run smoothly over thousands of pages when given a file listing the pages to be changed.

So the fix itself is not the problem.

Now I want to run this fix with an XML dump as the page generator.

This attempt terminates immediately with the following error message:

bot@udo-t:~/pywikibot-core$ python3 pwb.py replace -lang:de -ns:0 -xml:./dump/dewiktionary-latest-pages-articles.xml -fix:nbsp
Traceback (most recent call last):
  File "/home/bot/pywikibot-core/pwb.py", line 496, in <module>
    main()
  File "/home/bot/pywikibot-core/pwb.py", line 480, in main
    if not execute():
  File "/home/bot/pywikibot-core/pwb.py", line 463, in execute
    run_python_file(filename, script_args, module)
  File "/home/bot/pywikibot-core/pwb.py", line 143, in run_python_file
    exec(compile(source, filename, 'exec', dont_inherit=True),
  File "./scripts/replace.py", line 1100, in <module>
    main()
  File "./scripts/replace.py", line 1082, in main
    gen = pagegenerators.XmlDumpPageGenerator(
AttributeError: module 'pywikibot.pagegenerators' has no attribute 'XmlDumpPageGenerator'
CRITICAL: Exiting due to uncaught exception <class 'AttributeError'>
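
The traceback says the name looked up on the pagegenerators module does not exist there. One way to confirm this independently of replace.py and of any fix is to ask Python directly whether the module exposes the attribute. The following is only a minimal sketch: the helper name module_has_attribute is hypothetical (it is not part of Pywikibot), and it assumes it is run from the pywikibot-core directory with a working user configuration.

```python
import importlib


def module_has_attribute(module_name: str, attribute: str) -> bool:
    """Return True if the module can be imported and exposes `attribute`.

    Hypothetical diagnostic helper, not part of Pywikibot.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself cannot be imported from this environment.
        return False
    return hasattr(module, attribute)
```

Calling module_has_attribute('pywikibot.pagegenerators', 'XmlDumpPageGenerator') on the revision shown above would be expected to return False, which matches the AttributeError in the traceback.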

I have three other fixes (with replace.py) that have been running automatically (via shell script and cron) for a very long time whenever there is a new XML dump.

On 21 March these still worked. Shortly after 1 April, when a new XML dump became available, all three fixes stopped working. If I now run each of them from the command line, I get the same error message as above.

By the way, I saw that replace.py and also pagegenerators.py were last changed on 26 March.

Event Timeline

Udo_T renamed this task from "In replace.py the XML page generator does not seem to work any more" to "In replace.py, the XML page generator no longer seems to work". Apr 10 2022, 11:28 AM
Udo_T renamed this task from "In replace.py, the XML page generator no longer seems to work" to "In replace.py the XML page generator no longer seems to work". Apr 10 2022, 12:38 PM
Udo_T updated the task description.

Hi @HieuEuro,

Can you please explain what my task has to do with task T305803?

Best regards
Udo

Xqt triaged this task as High priority. Apr 10 2022, 8:02 PM

Change 778690 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [bugfix] Fix XMLDumpPageGenerator usage

https://gerrit.wikimedia.org/r/778690

Change 778690 merged by jenkins-bot:

[pywikibot/core@master] [bugfix] Fix XMLDumpPageGenerator usage

https://gerrit.wikimedia.org/r/778690