interwikibot crashes when page is empty
Closed, Resolved (Public)

Description

interwiki.py -wiktionary found [[yi:אנו]], which is empty:

======Post-processing [[cs:'nv]]======
Updating links on page [[lt:'nv]].
Changes to be made: +Interwiki
@@ -22,0 +23 @@
+ [[cs:'nv]]

@@ -31 +32 @@
- [[yi:'nv]]
+ [[yi:'nv]]

NOTE: Updating live wiki...
Updating links on page [[tr:'nv]].
Changes to be made: +Interwiki
@@ -9,0 +10 @@
+ [[cs:'nv]]

@@ -18 +19 @@
- [[yi:'nv]]
+ [[yi:'nv]]


NOTE: Updating live wiki...
Not editing [[yi:'nv]]: page is empty
Dump cs (wiktionary) appended.
Traceback (most recent call last):
  File "I:\py\rewrite\pwb.py", line 222, in <module>
    run_python_file(filename, argv, argvu, file_package)
  File "I:\py\rewrite\pwb.py", line 81, in run_python_file
    main_mod.__dict__)
  File "I:\py\rewrite\scripts\interwiki.py", line 2647, in <module>
    main()
  File "I:\py\rewrite\scripts\interwiki.py", line 2622, in main
    bot.run()
  File "I:\py\rewrite\scripts\interwiki.py", line 2365, in run
    self.queryStep()
  File "I:\py\rewrite\scripts\interwiki.py", line 2343, in queryStep
    subj.finish()
  File "I:\py\rewrite\scripts\interwiki.py", line 1790, in finish
    if self.replaceLinks(page, new):
  File "I:\py\rewrite\scripts\interwiki.py", line 1846, in replaceLinks
    raise SaveError
TypeError: __init__() takes exactly 2 arguments (1 given)
Page [[lt:'nv]] saved
Page [[tr:'nv]] saved
<type 'exceptions.TypeError'>
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort

Event Timeline

JAnD raised the priority of this task from to Needs Triage.
JAnD updated the task description.
JAnD subscribed.
JAnD triaged this task as High priority. Jan 1 2015, 9:04 PM
JAnD set Security to None.

Okay, this looks like a bug in the script. Since cde6b11565e2068bf052041dd2ad563658c4faf3, the Error class in pywikibot.exceptions has required at least one additional parameter (the other one, self, is passed automatically). Since e8c2790c5b8bd38d905ff32c6f5ca9efd7d3721b, the interwiki.py script has been raising SaveError without a parameter. That second commit was uploaded about three years after the change to the Error class (in fact, that change was only the second change ever made to pywikibot.exceptions). At that time SaveError already [[https://github.com/wikimedia/pywikibot-core/blob/e8c2790c5b8bd38d905ff32c6f5ca9efd7d3721b/scripts/interwiki.py#L362|subclassed Error]], so this bug has probably been present ever since.

I'll upload a patch shortly (it shouldn't be too complex).
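
For illustration, the mismatch described above can be reproduced in isolation. This is only a minimal sketch: the two classes below are hypothetical stand-ins, not the real pywikibot.exceptions.Error or the SaveError from interwiki.py, and the attribute name is an assumption. It is written for Python 3, where the TypeError message is worded differently from the Python 2 traceback in the description.

```python
class Error(Exception):
    """Hypothetical stand-in for a base error that requires one argument."""

    def __init__(self, arg):
        super().__init__(arg)
        self.message = arg  # attribute name is an assumption for this sketch


class SaveError(Error):
    """Hypothetical stand-in for the script's SaveError subclass."""


def raise_without_argument():
    """Raise the bare class, as the script did, and report what is raised."""
    try:
        # Python instantiates the class with no arguments here, so
        # Error.__init__ fails before a SaveError object ever exists.
        raise SaveError
    except TypeError as exc:
        return type(exc).__name__


def raise_with_argument():
    """Raise with the required argument; SaveError is raised as intended."""
    try:
        raise SaveError("page is empty")
    except SaveError as exc:
        return str(exc)


if __name__ == "__main__":
    print(raise_without_argument())
    print(raise_with_argument())
```

The point of the sketch is that the TypeError replaces the intended SaveError, so any handler written for SaveError never sees it.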

Change 182425 had a related patch set uploaded (by Mpaa):
Interwikibot crashes when page is empty

https://gerrit.wikimedia.org/r/182425

Patch-For-Review

XZise subscribed.

XZise, sorry, didn't see your comment. We were working in parallel :-)

Yeah, I just wanted to make sure that there wasn't another underlying breakage, for example that the Error class originally didn't require a parameter and was changed recently (which could cause other errors). And, well, I wouldn't have used a unicode string ;) If you want, you can include the info from above in the commit message so that others don't think it was broken recently.

If possible, @JAnD, could you please test @Mpaa's patch? I'd say it's looking good (surprise ;) ).

The relevant code was added in 2011 by @Xqt, mentioning https://sourceforge.net/support/tracker.php?aid=3414669 , but that URL is now a 404. It would be nice to know what that problem was. I am wondering why this bug hasn't been hit in 2+ years.

Anyway, the problematic page appears to be https://yi.wiktionary.org/w/index.php?title=%D7%90%D7%A0%D7%95&action=history&uselang=en . It is worth noting that it has never been truly empty: it has three (3) characters, whereas page.isEmpty() returns true for any page with fewer than four (4) characters. This four-character threshold was introduced in 2004 specifically for interwiki.py (and is only used by interwiki.py). https://mediawiki.org/wiki/Special:Code/pywikipedia/405
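
As a rough sketch of the length rule described above: the function and constant names here are hypothetical, and the real page.isEmpty() may preprocess the text before measuring it, which is not modelled.

```python
# Threshold taken from the comment above: fewer than four characters
# counts as "empty". The name is an assumption made for this sketch.
MIN_NON_EMPTY_LENGTH = 4


def is_effectively_empty(text):
    """Return True if the text is shorter than the threshold."""
    return len(text) < MIN_NON_EMPTY_LENGTH


if __name__ == "__main__":
    # A three-character page is treated as empty; a four-character one is not.
    print(is_effectively_empty("abc"))
    print(is_effectively_empty("abcd"))
```

Under this rule a page with three characters, like the one linked above, is skipped as empty even though it has content.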

Change 182425 merged by jenkins-bot:
Interwikibot crashes when page is empty

https://gerrit.wikimedia.org/r/182425