
fixing_redirects.py / replace_links fails on links such as [[{{2001}}]]
Closed, Resolved (Public)

Description

Via IRC:

ERROR: InvalidTitle: u'{{2001}}' contains illegal char(s) u'{'
Traceback (most recent call last):
  File "pwb.py", line 248, in <module>
    if not main():
  File "pwb.py", line 242, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 120, in run_python_file
    main_mod.__dict__)
  File "./scripts/fixing_redirects.py", line 138, in <module>
    main()
  File "./scripts/fixing_redirects.py", line 131, in main
    bot.run()
  File "/home/glavkos/core/pywikibot/bot.py", line 1413, in run
    self.treat(page)
  File "/home/glavkos/core/pywikibot/bot.py", line 1700, in treat
    super(ExistingPageBot, self).treat(page)
  File "/home/glavkos/core/pywikibot/bot.py", line 1764, in treat
    super(NoRedirectPageBot, self).treat(page)
  File "/home/glavkos/core/pywikibot/bot.py", line 1627, in treat
    self.treat_page()
  File "./scripts/fixing_redirects.py", line 73, in treat_page
    newtext = pywikibot.textlib.replace_links(newtext, [page, target])
  File "/home/glavkos/core/pywikibot/textlib.py", line 615, in replace_links
    label=groups['label'])
  File "/home/glavkos/core/pywikibot/page.py", line 5224, in create_separated
    link.parse()
  File "/home/glavkos/core/pywikibot/page.py", line 4946, in parse
    u"%s contains illegal char(s) %s" % (repr(t), repr(m.group(0))))
pywikibot.exceptions.InvalidTitle: u'{{2001}}' contains illegal char(s) u'{'
<class 'pywikibot.exceptions.InvalidTitle'>
CRITICAL: Closing network session.

Page: Κοινότητα Ασπροποτάμου Τρικάλων

Command: python pwb.py fixing_redirects.py -cat
with category: Χωριά του Νομού Τρικάλων

The issue seems to be the link

[[{{2001}}]]

on the page. Πρότυπο:2001 (that is, Template:2001) contains

Ελληνική απογραφή 2001|2001
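
To illustrate why the title check trips, here is a minimal sketch. It assumes a simplified wikilink regex and a simplified illegal-character set; neither is pywikibot's actual pattern, and `find_illegal_titles` is a hypothetical helper. The point is that the raw wikitext is scanned before any template expansion, so the captured title is literally `{{2001}}`, whereas the expanded text would yield a valid title and label.

```python
import re

# Simplified wikilink pattern: [[title]] or [[title|label]].
# Assumption for illustration only; pywikibot's real regex is more involved.
LINK_RE = re.compile(r'\[\[(?P<title>[^\]|]*)(?:\|(?P<label>[^\]]*))?\]\]')

# Simplified set of characters that may not appear in a page title.
ILLEGAL_RE = re.compile(r'[\[\]{}|<>]')


def find_illegal_titles(text):
    """Return the link titles in `text` that contain an illegal character."""
    return [
        match.group('title')
        for match in LINK_RE.finditer(text)
        if ILLEGAL_RE.search(match.group('title'))
    ]


# Raw wikitext as stored on the page: the template is not expanded.
raw = 'Population in [[{{2001}}]]: 120'
# What the link looks like once the template content is substituted.
expanded = 'Population in [[Ελληνική απογραφή 2001|2001]]: 120'

print(find_illegal_titles(raw))
print(find_illegal_titles(expanded))
```

Only the unexpanded form is flagged, which matches the `InvalidTitle` error in the traceback.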

Event Timeline

valhallasw raised the priority of this task to Needs Triage.
valhallasw updated the task description.
valhallasw added subscribers: valhallasw, Glavkos.

We can extend the try/except block here: https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/textlib.py#L612 to also catch pywikibot.exceptions.InvalidTitle. What exactly should happen for links such as [[{{2001}}]]?
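
One possible answer, sketched below, is to leave such links untouched and carry on with the rest of the page. This is a self-contained illustration rather than the actual patch: `InvalidTitle` here is a local stand-in for pywikibot.exceptions.InvalidTitle, and `parse_title`, `replace_link_targets`, and the regex are hypothetical simplifications.

```python
import re


class InvalidTitle(Exception):
    """Local stand-in for pywikibot.exceptions.InvalidTitle."""


# Simplified wikilink pattern (assumption, not pywikibot's regex).
LINK_RE = re.compile(r'\[\[(?P<title>[^\]|]*)(?:\|(?P<label>[^\]]*))?\]\]')


def parse_title(title):
    """Hypothetical parser: raise InvalidTitle on illegal characters."""
    match = re.search(r'[\[\]{}|<>]', title)
    if match:
        raise InvalidTitle(
            '%r contains illegal char(s) %r' % (title, match.group(0)))
    return title.strip()


def replace_link_targets(text, old, new):
    """Point links at `old` to `new`; skip links whose title cannot be parsed."""
    def repl(match):
        try:
            title = parse_title(match.group('title'))
        except InvalidTitle:
            # Unparseable link such as [[{{2001}}]]: keep it exactly as written.
            return match.group(0)
        if title != old:
            return match.group(0)
        # Preserve the visible text by using the old title as the label.
        label = match.group('label') or title
        return '[[%s|%s]]' % (new, label)

    return LINK_RE.sub(repl, text)


print(replace_link_targets('See [[Old]] and [[{{2001}}]].', 'Old', 'New'))
```

Catching the exception per link, rather than around the whole page, means one template-generated link does not prevent the other redirects on the page from being fixed.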

Change 485387 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [bugfix] Ignore InvalidTitle for fixing_redirects

https://gerrit.wikimedia.org/r/485387

Looking at the stack trace, the user is running a very old version of pywikibot; the use of textlib.replace_links was removed in November 2015 (8a7c42f5).

The fix looks OK, but the issue is likely also present in the textlib.replace_links implementation; it might make sense to synchronize the two (although I realize this is a nontrivial effort with respect to testing).

Change 485387 merged by jenkins-bot:
[pywikibot/core@master] [bugfix] Ignore InvalidTitle for fixing_redirects

https://gerrit.wikimedia.org/r/485387