
delinker.py script of pywikibot can't handle deleted files that have no extension
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue:

  • Read the documentation of the delinker.py script.
  • On ckbwiki, run python pwb.py delinker -localonly.

What happens?:
After a while, we got this:

....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Update 'since' to scripts.ini file

493 read operations
23 skip operations
Execution time: 352 seconds
Read operation time: 0.7 seconds
Skip operation time: 15.3 seconds
Script terminated by exception:

ERROR: 'خولە پیزە' does not have a valid extension (djvu, flac, gif, jpeg, jpg, mid, midi, mp3, mpeg, mpg, oga, ogg, ogv, opus, pdf, png, svg, tif, tiff, wav, webm, webp, xcf). (ValueError)
Traceback (most recent call last):
  File "C:\Pywikibot\pwb.py", line 39, in <module>
    sys.exit(main())
  File "C:\Pywikibot\pwb.py", line 35, in main
    runpy.run_path(str(path), run_name='__main__')
  File "C:\Users\Aram\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\Aram\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\Aram\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Pywikibot\pywikibot\scripts\wrapper.py", line 513, in <module>
    main()
  File "C:\Pywikibot\pywikibot\scripts\wrapper.py", line 497, in main
    if not execute():
  File "C:\Pywikibot\pywikibot\scripts\wrapper.py", line 484, in execute
    run_python_file(filename, script_args, module)
  File "C:\Pywikibot\pywikibot\scripts\wrapper.py", line 147, in run_python_file
    exec(compile(source, filename, 'exec', dont_inherit=True),
  File "C:\Pywikibot\scripts\delinker.py", line 164, in <module>
    main()
  File "C:\Pywikibot\scripts\delinker.py", line 160, in main
    bot.run()
  File "C:\Pywikibot\pywikibot\bot.py", line 1650, in run
    page = self.init_page(item)
  File "C:\Pywikibot\scripts\delinker.py", line 90, in init_page
    return pywikibot.FilePage(self.site, item['title'])
  File "C:\Pywikibot\pywikibot\page\_filepage.py", line 62, in __init__
    raise ValueError(
ValueError: 'خولە پیزە' does not have a valid extension (djvu, flac, gif, jpeg, jpg, mid, midi, mp3, mpeg, mpg, oga, ogg, ogv, opus, pdf, png, svg, tif, tiff, wav, webm, webp, xcf).
CRITICAL: Exiting due to uncaught exception ValueError: 'خولە پیزە' does not have a valid extension (djvu, flac, gif, jpeg, jpg, mid, midi, mp3, mpeg, mpg, oga, ogg, ogv, opus, pdf, png, svg, tif, tiff, wav, webm, webp, xcf).

It looks like someone created File:خولە پیزە (without any extension) in the file namespace on ckbwiki as a test page. The question is why MediaWiki allowed that action in the first place. Fortunately, I had a query for this purpose here. Using the search field, I found the file title, who deleted it, and when. I undeleted it and tried to move it to another namespace, but MediaWiki does not allow files to be moved to another namespace. As you can see in the log for the file, I renamed it to the username of the original uploader plus a random extension, just to make it a valid file name, and then deleted it again. But I still got the same error for the old name خولە پیزە. I tried many ways to get rid of the error, but none of them worked. I then tried Wikimedia Commons and got the same error, this time for Mahnmal Falkensee.

What should have happened instead?:
I'm not sure what the right behavior is here. The script could, for example, automatically skip files with no extension, or do anything else that prevents it from raising this error and terminating.
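The "skip automatically" idea could be sketched roughly as follows. This is only an illustration and not the delinker.py code: `VALID_EXTENSIONS` is copied from the error message above, while `has_valid_extension` and `split_titles` are hypothetical helper names invented for this sketch.

```python
# Extensions taken from the ValueError message in the traceback above.
VALID_EXTENSIONS = {
    'djvu', 'flac', 'gif', 'jpeg', 'jpg', 'mid', 'midi', 'mp3', 'mpeg',
    'mpg', 'oga', 'ogg', 'ogv', 'opus', 'pdf', 'png', 'svg', 'tif',
    'tiff', 'wav', 'webm', 'webp', 'xcf',
}


def has_valid_extension(title):
    """Return True if the title ends in '.<ext>' with a known extension."""
    _, dot, ext = title.rpartition('.')
    # rpartition yields an empty separator when the title contains no dot.
    return bool(dot) and ext.lower() in VALID_EXTENSIONS


def split_titles(titles):
    """Split titles into (usable, skipped) instead of failing on the first bad one."""
    usable, skipped = [], []
    for title in titles:
        if has_valid_extension(title):
            usable.append(title)
        else:
            skipped.append(title)
    return usable, skipped


usable, skipped = split_titles(['Example.jpg', 'خولە پیزە', 'Mahnmal Falkensee'])
```

With this approach the two titles reported in this task would end up in `skipped`, and the run would continue with the remaining files.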

Event Timeline

I think there should be some kind of fallback from a FilePage to a Page object in such a case.
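A minimal sketch of such a fallback, assuming the constructor raises ValueError for a bad extension as the traceback shows. The `Page` and `FilePage` classes below are simplified stand-ins named after the pywikibot classes, not the real ones, and `make_page` is a hypothetical helper.

```python
class Page:
    """Stand-in for a generic wiki page (not the real pywikibot class)."""

    def __init__(self, site, title):
        self.site = site
        self.title = title


class FilePage(Page):
    """Stand-in for a file page that rejects titles without a known extension."""

    # Shortened list for the sketch; the real check covers more extensions.
    valid_extensions = ('gif', 'jpeg', 'jpg', 'pdf', 'png', 'svg')

    def __init__(self, site, title):
        _, dot, ext = title.rpartition('.')
        if not dot or ext.lower() not in self.valid_extensions:
            raise ValueError(f'{title!r} does not have a valid extension')
        super().__init__(site, title)


def make_page(site, title):
    """Try FilePage first; fall back to a plain Page if the title is rejected."""
    try:
        return FilePage(site, title)
    except ValueError:
        return Page(site, title)


page = make_page('ckbwiki', 'خولە پیزە')
```

Here `page` is a plain `Page`, so the bot could go on processing it instead of stopping with an uncaught exception.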
@Enag2000: are you (still) working on this issue?

@Xqt, @Enag2000 has been inactive for six months and this task is important to address. Can you please do something about it? Thanks.

Xqt removed Enag2000 as the assignee of this task. (Jul 14 2024, 4:58 PM)
Xqt added a subscriber: Enag2000.
Xqt triaged this task as Medium priority.

Change #1054503 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [fix] Ignore extension check in delinker.py scripts/delinker.py

https://gerrit.wikimedia.org/r/1054503

Change #1054503 merged by Xqt:

[pywikibot/core@master] [fix] Ignore extension check in delinker.py scripts/delinker.py

https://gerrit.wikimedia.org/r/1054503