Cosmetic_changes.py crashes on wrong digits in ISBN; needs an -ignore parameter
Closed, Resolved (Public)

Description

I ran cc with -newpages:500. After encountering a page containing an ISBN with the wrong number of digits, cc terminated with:

Script terminated by exception:

ERROR: InvalidIsbnException: ISBN-13: The ISBN 8080632441 is not 13 digits long. / ISBN-10: The ISBN checksum of 8080632441 is incorrect.
Traceback (most recent call last):
  File "pwb.py", line 239, in <module>
    if not main():
  File "pwb.py", line 233, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 111, in run_python_file
    main_mod.__dict__)
  File ".\scripts\cosmetic_changes.py", line 144, in <module>
    main()
  File ".\scripts\cosmetic_changes.py", line 137, in main
    bot.run()
  File "...\core\pywikibot\bot.py", line 1805, in run
    super(MultipleSitesBot, self).run()
  File "...\core\pywikibot\bot.py", line 1619, in run
    self.treat(page)
  File "...\core\pywikibot\bot.py", line 1906, in treat
    super(ExistingPageBot, self).treat(page)
  File "...\core\pywikibot\bot.py", line 1970, in treat
    super(NoRedirectPageBot, self).treat(page)
  File "...\core\pywikibot\bot.py", line 1833, in treat
    self.treat_page()
  File ".\scripts\cosmetic_changes.py", line 74, in treat_page
    changedText = ccToolkit.change(self.current_page.get())
  File "...\core\pywikibot\cosmetic_changes.py", line 279, in change
    new_text = self._change(text)
  File "...\core\pywikibot\cosmetic_changes.py", line 273, in _change
    text = self.safe_execute(method, text)
  File "...\core\pywikibot\cosmetic_changes.py", line 260, in safe_execute
    result = method(text)
  File "...\core\pywikibot\cosmetic_changes.py", line 957, in fix_ISBN
    text, strict=False if self.ignore == CANCEL_MATCH else True)
  File "...\core\pywikibot\cosmetic_changes.py", line 206, in _reformat_ISBNs
    text, lambda match: _format_isbn_match(match, strict=strict))
  File "...\core\pywikibot\textlib.py", line 1593, in reformat_ISBNs
    text = isbnR.sub(match_func, text)
  File "...\core\pywikibot\cosmetic_changes.py", line 206, in <lambda>
    text, lambda match: _format_isbn_match(match, strict=strict))
  File "...\core\pywikibot\cosmetic_changes.py", line 175, in _format_isbn_match
    scripts_isbn.is_valid(isbn)
  File "...\core\scripts\isbn.py", line 1376, in is_valid
    getIsbn(isbn)
  File "...\core\scripts\isbn.py", line 1344, in getIsbn
    % (e13, e10))
scripts.isbn.InvalidIsbnException: ISBN-13: The ISBN 8080632441 is not 13 digits long. / ISBN-10: The ISBN checksum of 8080632441 is incorrect.
<class 'scripts.isbn.InvalidIsbnException'>
CRITICAL: Closing network session.
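
For reference, the ISBN-10 half of the error message can be checked by hand. The sketch below is not the code in scripts/isbn.py; it is a minimal standalone check (the function name isbn10_checksum_ok is made up for illustration) using the standard ISBN-10 rule: weight the ten characters from 10 down to 1 and require the sum to be divisible by 11. For 8080632441 the weighted sum is 224, and 224 mod 11 is 4, so the checksum fails.

```python
def isbn10_checksum_ok(code: str) -> bool:
    """Return True if a 10-character ISBN has a valid check digit.

    Hypothetical helper for illustration only; not part of Pywikibot.
    """
    digits = code.replace("-", "").replace(" ", "").upper()
    if len(digits) != 10:
        return False
    total = 0
    for position, char in enumerate(digits):
        if char == "X" and position == 9:
            value = 10  # "X" stands for 10 and is only allowed as check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run 10, 9, ..., 1 from left to right.
        total += (10 - position) * value
    return total % 11 == 0


print(isbn10_checksum_ok("8080632441"))     # the ISBN from the traceback
print(isbn10_checksum_ok("0-306-40615-2"))  # a well-formed ISBN-10
```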

Mentor: @jayvdb

Event Timeline

Wesalius raised the priority of this task from to Needs Triage.
Wesalius updated the task description.
Wesalius subscribed.
jayvdb renamed this task from "Cosmetic_changes.py crashes on wrong digits in ISBN" to "Cosmetic_changes.py crashes on wrong digits in ISBN; needs an -ignore parameter". (Jan 9 2016, 3:22 AM)
jayvdb updated the task description.

Not sure this is worth GCI; writing up the task may take longer than simply fixing it, and the lessons learnt are not very re-usable even within the Pywikibot project.

@Xize So should the -ignore parameter also ignore the wrong ISBN? Is this what I need to do? And if we ignore the wrong ISBN, what will its use be?

The standalone cc script must be invoked with the -ignore parameter to prevent this exception. page._cosmetic_changes_hook() invokes cc with the needed ignore parameter. No idea what to do here; IMHO this request could be closed.

I think the default for the cosmetic_changes script should be to warn about invalid ISBNs in a page, but keep processing the other ISBNs on the page and continue processing pages. An exception for an invalid ISBN doesn't feel like the correct response.
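
A rough sketch of the warn-and-continue behaviour described above, written as standalone Python rather than against the real Pywikibot code. Everything here is an assumption for illustration: ISBN_RE is a simplified pattern, is_valid_isbn and reformat_isbns are made-up names, and the default "reformatting" step (uppercasing) is only a stand-in for whatever cosmetic_changes actually does to a valid ISBN.

```python
import logging
import re

log = logging.getLogger(__name__)

# Simplified, hypothetical pattern: "ISBN " followed by digits, hyphens or X.
ISBN_RE = re.compile(r"ISBN (?P<code>[0-9Xx-]+)")


def is_valid_isbn(code):
    """Check ISBN-10 or ISBN-13 check digits (illustrative helper)."""
    digits = code.replace("-", "").upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", digits):
        total = sum((10 - i) * (10 if c == "X" else int(c))
                    for i, c in enumerate(digits))
        return total % 11 == 0
    if re.fullmatch(r"[0-9]{13}", digits):
        total = sum(int(c) * (3 if i % 2 else 1)
                    for i, c in enumerate(digits))
        return total % 10 == 0
    return False


def reformat_isbns(text, reformat=str.upper):
    """Reformat valid ISBNs; warn about invalid ones and leave them as-is.

    Returns the new text and the list of invalid ISBNs that were skipped.
    """
    invalid = []

    def replace(match):
        code = match.group("code")
        if not is_valid_isbn(code):
            # Warn instead of raising, so the remaining ISBNs still get handled.
            log.warning("Skipping invalid ISBN %s", code)
            invalid.append(code)
            return match.group(0)
        return "ISBN " + reformat(code)

    return ISBN_RE.sub(replace, text), invalid


new_text, skipped = reformat_isbns("A: ISBN 8080632441, B: ISBN 080442957x")
print(new_text)
print(skipped)
```

With this shape, the caller decides what to do with the skipped list (log it, report it per page, or ignore it), and one bad ISBN no longer stops the run.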

Xqt triaged this task as Medium priority. (Feb 4 2019, 11:34 AM)
This comment was removed by Dvorapa.

Change 699486 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [IMPR] set -ignore option to CANCEL.MATCH by default

https://gerrit.wikimedia.org/r/699486

Change 699486 merged by jenkins-bot:

[pywikibot/core@master] [IMPR] set -ignore option to CANCEL.MATCH by default

https://gerrit.wikimedia.org/r/699486