
ISBN.py breaks while formatting
Closed, Resolved · Public

Description

ISBN.py wrongly breaks while formatting the ISBN.

The command line was:

python pwb.py isbn -page:Franz-Eher-Verlag -format -lang:de -family:wikipedia -simulate

The ISBN 3-00-013343-7 is valid and is listed in the DNB catalogue [1], but the script fails.
[1] https://portal.dnb.de/opac.htm?referrer=Wikipedia&method=enhancedSearch&index=num&term=3000133437&operator=and
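
As a quick cross-check that the number itself is well-formed, here is a minimal sketch of the standard ISBN-10 check-digit test (weights 10 down to 1, sum divisible by 11). The function name isbn10_checksum_ok is made up for this illustration and is not part of isbn.py.

```python
def isbn10_checksum_ok(isbn):
    """Return True if the ISBN-10 check digit is consistent.

    Hypothetical helper for illustration; not taken from isbn.py.
    """
    chars = [c for c in isbn.upper() if c not in "- "]
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # "X" stands for 10 and is only allowed as check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run from 10 (first digit) down to 1 (check digit).
        total += (10 - position) * value
    return total % 11 == 0


print(isbn10_checksum_ok("3-00-013343-7"))  # True (weighted sum is 88)
```

So the checksum is fine, and the failure has to come from the hyphenation step rather than from the number.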

Traceback (most recent call last):
  File "C:\pwb\SVN\core\pwb.py", line 228, in <module>
    run_python_file(filename, argv, argvu, file_package)
  File "C:\pwb\SVN\core\pwb.py", line 85, in run_python_file
    main_mod.__dict__)
  File ".\scripts\isbn.py", line 1662, in <module>
    main()
  File ".\scripts\isbn.py", line 1657, in main
    bot.run()
  File ".\scripts\isbn.py", line 1526, in run
    self.treat(page)
  File ".\scripts\isbn.py", line 1504, in treat
    new_text = self.isbnR.sub(_hyphenateIsbnNumber, new_text)
  File ".\scripts\isbn.py", line 1415, in _hyphenateIsbnNumber
    i.format()
  File ".\scripts\isbn.py", line 1334, in format
    ISBN.format(self)
  File ".\scripts\isbn.py", line 1225, in format
    % self.code)
__main__.InvalidIsbnException: ISBN 3-00-013343-7: publisher number unknown.
<class '__main__.InvalidIsbnException'>
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort

Event Timeline

Xqt raised the priority of this task to Medium.
Xqt updated the task description.
Xqt added a project: Pywikibot.
Xqt subscribed.
Restricted Application added subscribers: Aklapper, Unknown Object (MLST). · May 2 2015, 10:49 AM
Xqt set Security to None.

Should we fix that, or suggest that the user install an external library?

jayvdb subscribed.

The internal data tables in isbn.py are extremely outdated and of poor quality.

In core, you can install one of the external ISBN libraries stdnum, isbnlib or isbn_hyphenate, support for which was added in T85240. They will be used instead of the internal data in isbn.py.

"Should we fix that or suggest the user to install an external library?"

Add a warn() when no other library is found?
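
A rough sketch of what such a warning could look like, assuming Python 3 and using only the standard library to test whether one of the three modules named above can be imported. The function name find_isbn_backend and the warning text are invented for this example; this is not how isbn.py actually selects its backend.

```python
import importlib.util
import warnings

# Module names as mentioned in this task; import order is an assumption.
CANDIDATES = ("stdnum", "isbnlib", "isbn_hyphenate")


def find_isbn_backend(candidates=CANDIDATES):
    """Return the first importable module name, or None after warning.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    warnings.warn(
        "No external ISBN library found; falling back to the internal "
        "tables in isbn.py, which may be outdated."
    )
    return None


print(find_isbn_backend())
```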

Also, treat() should catch these exceptions. (And run looks like it should override Bot.run.)
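
To illustrate the idea, here is a simplified sketch in which the substitution callback catches the exception and leaves the offending ISBN untouched, so one bad number does not abort the whole run. The regex, the InvalidIsbnException stand-in and the function safe_format_isbns are assumptions made for this example; the real treat() in isbn.py is structured differently.

```python
import re


class InvalidIsbnException(Exception):
    """Stand-in for the exception shown in the traceback."""


# Simplified pattern, assumed for the example; isbn.py uses its own regex.
ISBN_RE = re.compile(r"ISBN ([\d-]+[\dXx])")


def safe_format_isbns(text, formatter):
    """Apply formatter to each ISBN, keeping the original text on failure."""

    def replace(match):
        try:
            return "ISBN " + formatter(match.group(1))
        except InvalidIsbnException as error:
            # Report the problem and leave this ISBN as it was.
            print("Leaving %s unchanged: %s" % (match.group(0), error))
            return match.group(0)

    return ISBN_RE.sub(replace, text)
```

With this shape, a page containing one ISBN that cannot be hyphenated would still get its other ISBNs formatted.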

This bug does not come from an outdated table. The table is right for the given ISBN.

if rest[:length] > start and rest[:length] <= end:

probably should be

if rest[:length] >= start and rest[:length] <= end:
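
A small self-contained sketch of why the exclusive lower bound fails for this ISBN. The range table below is illustrative (assumed values, not copied from isbn.py), and split_publisher is a hypothetical helper that mirrors only the comparison in question. After removing the group "3", the remaining digits start with "00", which is exactly the start of the first range.

```python
# Illustrative (start, end) publisher ranges for group "3".
# Assumed values for this example, not copied from isbn.py.
RANGES = [("00", "02"), ("04", "19"), ("200", "699")]


def split_publisher(rest, ranges, inclusive_start=True):
    """Return (publisher, remainder), or None if no range matches."""
    for start, end in ranges:
        length = len(start)
        prefix = rest[:length]
        if inclusive_start:
            lower_ok = prefix >= start  # proposed fix
        else:
            lower_ok = prefix > start   # original comparison
        if lower_ok and prefix <= end:
            return prefix, rest[length:]
    return None


rest = "000133437"  # ISBN 3-00-013343-7 without the group "3" and hyphens
print(split_publisher(rest, RANGES, inclusive_start=False))  # None
print(split_publisher(rest, RANGES, inclusive_start=True))   # ('00', '0133437')
```

With the strict comparison, a publisher number equal to the first value of a range is never matched, which would explain the "publisher number unknown" error even though the table covers it.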

Change 208376 had a related patch set uploaded (by Xqt):
[bugfix] Fix ISBN formatting

https://gerrit.wikimedia.org/r/208376

Change 208377 had a related patch set uploaded (by Xqt):
[bugfix] Fix ISBN formatting

https://gerrit.wikimedia.org/r/208377

Change 208376 merged by jenkins-bot:
[bugfix] Fix ISBN formatting

https://gerrit.wikimedia.org/r/208376

Change 208377 merged by jenkins-bot:
[bugfix] Fix ISBN formatting

https://gerrit.wikimedia.org/r/208377