checkimages.py -commons -duplicates options no longer work
Closed, Resolved · Public

Description

~$ python /home/.../core/pwb.py /home/.../core/scripts/checkimages.py -duplicates -nologerror -commons -break
Retrieving the latest 80 files for checking...

Loading the allowed licenses...


>> Loaded the real-time page... <<
Checking if [[Bears trailer screenshot.jpg]] is on commons...
Traceback (most recent call last):
  File "/home/.../core/pwb.py", line 239, in <module>
    if not main():
  File "/home/.../core/pwb.py", line 233, in main
    run_python_file(filename, argv, argvu, file_package)
  File "/home/.../core/pwb.py", line 88, in run_python_file
    main_mod.__dict__)
  File "/home/.../core/scripts/checkimages.py", line 1837, in <module>
    main()
  File "/home/vito/.../core/scripts/checkimages.py", line 1818, in main
    if not Bot.checkImageOnCommons():
  File "/home/vito/.../core/scripts/checkimages.py", line 917, in checkImageOnCommons
    total=1), None)
TypeError: PageGenerator object is not an iterator
<type 'exceptions.TypeError'>
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort

Event Timeline

Vituzzu raised the priority of this task from to Needs Triage.
Vituzzu updated the task description.
Vituzzu subscribed.
Vituzzu renamed this task from "checkimages.py -commons -duplicates no longer work" to "checkimages.py -commons -duplicates options no longer work". (Jul 8 2015, 5:02 PM)
Vituzzu updated the task description.
Vituzzu set Security to None.

Change 223582 had a related patch set uploaded (by XZise):
[FIX] checkimages: Convert generator to iterator

https://gerrit.wikimedia.org/r/223582

First off, please only use Pywikibot or Pywikibot-compat, and not Pywikibot-compat-to-core, unless you want to request that a feature present in compat be transferred to core.

Now, regarding your bug: it seems to have been introduced in ff628236, as APISite.getFilesFromAnHash returns a list while APISite.allimages returns a generator object. That object needs to be converted into an iterator first, because next() won't work on it otherwise.
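A minimal sketch of the failure mode described above, using only the standard library. FakePageGenerator and first_or_none are hypothetical names made up for illustration; the sketch assumes that the object returned by APISite.allimages is iterable but does not itself implement the iterator protocol, which is what the "PageGenerator object is not an iterator" message suggests. It is not the actual patch.

```python
class FakePageGenerator:
    """Hypothetical stand-in for an object that is iterable
    (defines __iter__) but is not an iterator (no __next__)."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # Each call hands out a fresh iterator over the stored items.
        return iter(self._items)


def first_or_none(iterable):
    """Return the first item of any iterable, or None if it is empty."""
    # iter() turns the iterable into an iterator that next() accepts.
    return next(iter(iterable), None)


gen = FakePageGenerator(["Bears trailer screenshot.jpg"])

# Calling next() directly on the iterable fails with a TypeError,
# even though a default value is supplied.
try:
    next(gen, None)
    direct_call_failed = False
except TypeError:
    direct_call_failed = True

first = first_or_none(gen)
empty = first_or_none(FakePageGenerator([]))
```

A list has the same limitation: next([1, 2]) also raises TypeError, so wrapping the value in iter() is safe whichever of the two return types the caller receives.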

Well, I had a working checkimages.py until I had to switch to core's version, so I considered it a migration issue. Anyway, thanks for fixing up my task and for the patch.

Yeah, that is understandable. Anyway, I'd suggest adding Pywikibot or Pywikibot-compat to show where you found the bug (like a version tag).

Apart from that, have you been able to test the patch? I admit that I personally haven't, and @Xqt noted that the closing bracket might be in the wrong position. If you could test the patch (as you have a use case), it would show me either that I need to fix it or that @Xqt's assessment is wrong.

Change 223582 merged by jenkins-bot:
[FIX] checkimages: Convert generator to iterator

https://gerrit.wikimedia.org/r/223582

It works! Ty, though I've just found another bug .__.

Here's the new bug, it shouldn't be related to your patch though.

Checking if [[Unione Sportiva Sassuolo Calcio - Anni '80.jpg]] is on commons...
Unione Sportiva Sassuolo Calcio - Anni '80.jpg has a duplicate! Reporting it...
Traceback (most recent call last):
  File "pwb.py", line 239, in <module>
    if not main():
  File "pwb.py", line 233, in main
    run_python_file(filename, argv, argvu, file_package)
  File "pwb.py", line 88, in run_python_file
    main_mod.__dict__)
  File "./scripts/checkimages.py", line 1837, in <module>
    main()
  File "./scripts/checkimages.py", line 1822, in main
    if not Bot.checkImageDuplicated(duplicates_rollback):
  File "./scripts/checkimages.py", line 1004, in checkImageDuplicated
    data = time.strptime(self.timestamp, u"%Y-%m-%dT%H:%M:%SZ")
  File "/usr/lib/python2.7/_strptime.py", line 467, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/usr/lib/python2.7/_strptime.py", line 322, in _strptime
    found = format_regex.match(data_string)
TypeError: expected string or buffer
<type 'exceptions.TypeError'>
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort

This happened using the -break, -commons, -duplicates and -nologerror options.

Change 223852 had a related patch set uploaded (by XZise):
[FIX] checkimages: Convert generator to iterator

https://gerrit.wikimedia.org/r/223852

Change 223852 merged by jenkins-bot:
[FIX] checkimages: Convert generator to iterator

https://gerrit.wikimedia.org/r/223852

Okay, it looks like this code wasn't checked when file revisions were added to PWB. self.timestamp is now a Timestamp, which doesn't need to be parsed. I currently can't easily check which change is the culprit, but d298facc changed the code to its current version. (The other patch, the one responsible for the original bug, only changed it slightly, so it is not responsible for the new bug.)

Otherwise, I'm not sure whether you should file a new bug report, since its origin is different. What do the others say?
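A minimal sketch of why the second traceback occurs and one way to tolerate both input types. It assumes the Timestamp class is a datetime subclass. The Timestamp class, the to_struct_time helper and the sample date below are hypothetical stand-ins written for illustration, not code from checkimages.py or from the merged patch.

```python
import time
from datetime import datetime

# Format string taken from the traceback above.
FORMAT = "%Y-%m-%dT%H:%M:%SZ"


class Timestamp(datetime):
    """Hypothetical stand-in for a datetime-based timestamp class."""


def to_struct_time(value):
    """Return a time.struct_time for either a datetime or an ISO string."""
    if isinstance(value, datetime):
        # Already a parsed object: no strptime call is needed.
        return value.timetuple()
    return time.strptime(value, FORMAT)


ts = Timestamp(2015, 7, 8, 17, 2, 0)

# time.strptime only accepts strings, so passing a datetime-like
# object raises TypeError, as in the traceback.
try:
    time.strptime(ts, FORMAT)
    strptime_failed = False
except TypeError:
    strptime_failed = True

from_object = to_struct_time(ts)
from_string = to_struct_time("2015-07-08T17:02:00Z")
```

Both calls to to_struct_time describe the same moment, so code that previously relied on the parsed string can keep working with the object form.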

Change 224308 had a related patch set uploaded (by XZise):
[FIX] checkimages: Expect Timestamp instance

https://gerrit.wikimedia.org/r/224308

Change 224308 merged by jenkins-bot:
[FIX] checkimages: Expect Timestamp instance

https://gerrit.wikimedia.org/r/224308