redirect.py crashes when running on pages that belong to previously deleted namespaces (uncaught exception: RuntimeError)
Closed, Duplicate | Public | BUG REPORT

Description

Background information:

  • On Wikimedia wikis, extra namespaces are sometimes deleted without first deleting or moving the pages in those namespaces, which leaves those pages inaccessible to users.
  • The technical details of this issue are described in T109238: Clean up broken namespace pages across Wikimedia sites
  • The list of affected pages on Wikimedia wikis is available as P1884
  • Take Arabic Wikiversity (arwikiversity) as an example. From the above paste, the list of affected pages is:
arwikiversity:  id=2394 ns=0 dbk=Topic:علم_المواد -> Topic:علم_المواد (no conflict) DRY RUN ONLY
arwikiversity:  id=1958 ns=0 dbk=Topic:ميكانيكا_تطبيقية -> Topic:ميكانيكا_تطبيقية (no conflict) DRY RUN ONLY
arwikiversity:  id=2393 ns=0 dbk=Topic:ميكانيكا_تطبيقية/box-footer -> Topic:ميكانيكا_تطبيقية/box-footer (no conflict) DRY RUN ONLY
arwikiversity:  id=1959 ns=0 dbk=Topic:ميكانيكا_تطبيقية/box-header -> Topic:ميكانيكا_تطبيقية/box-header (no conflict) DRY RUN ONLY
arwikiversity:  id=1949 ns=0 dbk=Topic:هندسة_ميكانيكية/Intro -> Topic:هندسة_ميكانيكية/Intro (no conflict) DRY RUN ONLY
for db in $( echo "select dbname from wiki" | sql meta ); do echo $db ; echo "select count(page_id) from page where page_namespace=2600" | sql $db ; done
[...]
5	arwikiversity

Steps to Reproduce:

Now back to Pywikibot: run the following command to fix broken redirects on arwikiversity:

python pwb.py redirect broken -delete -lang:ar -family:wikiversity

Actual Results:

Retrieving broken redirect special page...
Retrieving 9 pages from wikiversity:ar.
WARNING: Page [[ar:الإمتصاص]] on wikiversity:ar is skipped because it is not a redirect
WARNING: Page [[ar:باحث]] on wikiversity:ar is skipped because it is not a redirect
WARNING: Page [[ar:لغة C++.Net]] on wikiversity:ar is skipped because it is not a redirect
WARNING: Page [[ar:لغة Delphi.NET]] on wikiversity:ar is skipped because it is not a redirect
WARNING: Page [[ar:مستقبل الكمبيوتر]] on wikiversity:ar is skipped because it is not a redirect
WARNING: Page [[ar:وصف الجسد في اللغة الاسبانية]] on wikiversity:ar is skipped because it is not a redirect


>>> موضوع:ميكانيكا تطبيقية <<<

0 pages read
0 pages written
6 pages skipped
Execution time: 0 seconds
Script terminated by exception:

ERROR: RuntimeError: getredirtarget: No 'redirects' found for page موضوع:ميكانيكا تطبيقية.
Traceback (most recent call last):
  File "pwb.py", line 298, in <module>
    if not main():
  File "pwb.py", line 293, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 96, in run_python_file
    main_mod.__dict__)
  File "./scripts/redirect.py", line 770, in <module>
    main()
  File "./scripts/redirect.py", line 766, in main
    bot.run()
  File "/mnt/nfs/labstore-secondary-tools-home/meno25/core/pywikibot/bot.py", line 1511, in run
    self.treat(page)
  File "./scripts/redirect.py", line 679, in treat
    super(RedirectRobot, self).treat(page)
  File "/mnt/nfs/labstore-secondary-tools-home/meno25/core/pywikibot/bot.py", line 1738, in treat
    self.treat_page()
  File "./scripts/redirect.py", line 489, in delete_1_broken_redirect
    targetPage = redir_page.getRedirectTarget()
  File "/mnt/nfs/labstore-secondary-tools-home/meno25/core/pywikibot/page.py", line 1683, in getRedirectTarget
    return self.site.getredirtarget(self)
  File "/mnt/nfs/labstore-secondary-tools-home/meno25/core/pywikibot/site.py", line 3198, in getredirtarget
    .format(title))
RuntimeError: <exception str() failed>
CRITICAL: Exiting due to uncaught exception <type 'exceptions.RuntimeError'>

Expected Results:

  • The script should either skip those pages silently or print an error message to the user and continue with the remaining pages. I don't believe that crashing the whole script run like this is the desired behavior.
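
The "print an error and continue" behavior could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual redirect.py code or a proposed patch: `StubRedirectPage` and `treat_all` are hypothetical names, and the stub merely imitates the `RuntimeError` that `getRedirectTarget()` raises in the traceback above so that the example runs without Pywikibot or a live wiki.

```python
import logging

logger = logging.getLogger("redirect_sketch")


class StubRedirectPage:
    """Hypothetical stand-in for a Pywikibot page (not the real class)."""

    def __init__(self, title, target=None):
        self._title = title
        self._target = target

    def title(self):
        return self._title

    def getRedirectTarget(self):
        # Imitate the failure from the traceback: no target can be resolved.
        if self._target is None:
            raise RuntimeError(
                "getredirtarget: No 'redirects' found for page {}."
                .format(self._title))
        return self._target


def treat_all(pages):
    """Resolve each redirect; warn about and skip pages whose lookup fails."""
    resolved, skipped = [], []
    for page in pages:
        try:
            target = page.getRedirectTarget()
        except RuntimeError as error:
            # Report the problem and move on instead of aborting the run.
            logger.warning("Skipping %s: %s", page.title(), error)
            skipped.append(page.title())
            continue
        resolved.append((page.title(), target))
    return resolved, skipped
```

With a guard like this, one unresolvable page such as موضوع:ميكانيكا تطبيقية would be counted as skipped, and the remaining pages would still be processed.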