pywikibot.Link should support interwiki prefix be-x-old
Closed, Resolved, Public

Description

After T11823,

$ python /shared/pywikipedia/core/pwb.py shell
Welcome to the Pywikibot interactive shell!
>>> import pywikibot as pyb
>>> pywikibot.Link(u"be-tarask:Страсбург", source=pywikibot.Site("en", "wikipedia"))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/data/project/pywikibot/public_html/core/pywikibot/page.py", line 4749, in __repr__
    return "pywikibot.page.Link(%r, %r)" % (self.title, self.site)
  File "/data/project/pywikibot/public_html/core/pywikibot/page.py", line 4937, in title
    self.parse()
  File "/data/project/pywikibot/public_html/core/pywikibot/page.py", line 4831, in parse
    self._text, self._site, prefix, e))
SiteDefinitionError: be-tarask:Страсбург is not a local page on wikipedia:en, and the interwiki prefix be-tarask is not supported by PyWikiBot!:
Unknown URL 'https://be-x-old.wikipedia.org/wiki/$1'.
>>> pywikibot.Link(u"be-x-old:Страсбург", source=pywikibot.Site("en", "wikipedia"))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/data/project/pywikibot/public_html/core/pywikibot/page.py", line 4749, in __repr__
    return "pywikibot.page.Link(%r, %r)" % (self.title, self.site)
  File "/data/project/pywikibot/public_html/core/pywikibot/page.py", line 4937, in title
    self.parse()
  File "/data/project/pywikibot/public_html/core/pywikibot/page.py", line 4831, in parse
    self._text, self._site, prefix, e))
SiteDefinitionError: be-x-old:Страсбург is not a local page on wikipedia:en, and the interwiki prefix be-x-old is not supported by PyWikiBot!:
Unknown URL 'https://be-x-old.wikipedia.org/wiki/$1'.

Expected: instead of raising SiteDefinitionError, it should parse the link as be-tarask:Страсбург.

Event Timeline

zhuyifei1999 raised the priority of this task to Needs Triage.
zhuyifei1999 updated the task description.
zhuyifei1999 added a project: Pywikibot.

The Link.parse method could in theory check the family's interwiki_replacements and replace them before getting the actual site associated with this link.
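
A minimal sketch of that idea, assuming a plain dict stands in for the family's interwiki_replacements. `REPLACEMENTS` and `normalize_interwiki_prefix` are hypothetical names used only for illustration; this is not how Link.parse is actually written:

```python
# Hypothetical stand-in for a family's interwiki_replacements mapping
# (old prefix -> current prefix). The data is an assumption for this sketch.
REPLACEMENTS = {"be-x-old": "be-tarask"}


def normalize_interwiki_prefix(text, replacements=REPLACEMENTS):
    """Return the link text with an outdated interwiki prefix replaced."""
    prefix, sep, rest = text.partition(":")
    if not sep:
        return text  # no prefix present, nothing to replace
    new_prefix = replacements.get(prefix.strip().lower())
    if new_prefix is None:
        return text  # prefix is not listed as replaced
    return new_prefix + ":" + rest


print(normalize_interwiki_prefix("be-x-old:Страсбург"))
```

The replacement would have to happen before the prefix is resolved to a site, so that only the current prefix ever reaches the site lookup.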

Ah dang it, that isn't actually the problem… the problem is that we don't resolve be-x-old.wikipedia.org to the new site. With Python 3 you actually see the original traceback, and the cause is written right in the exception text: Unknown URL 'https://be-x-old.wikipedia.org/wiki/$1'.
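
The Python 3 behaviour referred to here is implicit exception chaining: an exception raised inside an `except` block keeps the original one in `__context__`, and the traceback prints both. A small self-contained illustration, where `SiteDefinitionError`, `site_from_url` and `parse_link` are local, hypothetical stand-ins and not the Pywikibot ones:

```python
class SiteDefinitionError(Exception):
    """Local stand-in for the exception named in the traceback."""


def site_from_url(url):
    # Always fails in this sketch, mimicking the unknown-URL case.
    raise SiteDefinitionError("Unknown URL '{0}'.".format(url))


def parse_link(prefix, url):
    try:
        return site_from_url(url)
    except SiteDefinitionError as e:
        # Raising inside `except` chains the original error implicitly.
        raise SiteDefinitionError(
            "the interwiki prefix {0} is not supported: {1}".format(prefix, e))


try:
    parse_link("be-tarask", "https://be-x-old.wikipedia.org/wiki/$1")
except SiteDefinitionError as outer:
    caught = outer  # keep a reference; `outer` is unbound after the block
    print(repr(outer.__context__))  # the original "Unknown URL" error
```

Python 2 has no `__context__`, which is why only the outer error shows up there.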

The actual bug is that there is no hostname provided for obsolete sites, so this might be a duplicate of T74674: Unable to communicate with obsolete / non-existing wikis.

Yeah, be-tarask in url works:

>>> pywikibot.Site(url="https://be-tarask.wikipedia.org/wiki/$1")
Site("be-tarask", "wikipedia")

Why not add checks for obsolete/moved wikis in Family.from_url ?

Because it compares the article paths, and to do that it needs to query the wiki, which is not possible (hence T74674, because the hostname is not available). If that becomes possible, Family.from_url might be made to work for obsolete wikis as well.
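
For comparison, a rough sketch of a hostname-only lookup that maps a renamed code to its current one without contacting the wiki. It deliberately skips the article-path comparison described above, so it is not equivalent to Family.from_url; `KNOWN_CODES`, `RENAMED_CODES` and `code_from_url` are hypothetical names with made-up data:

```python
from urllib.parse import urlparse

# Hypothetical data for illustration only.
KNOWN_CODES = {"be-tarask", "en", "cs"}
RENAMED_CODES = {"be-x-old": "be-tarask"}  # old code -> current code


def code_from_url(url, domain="wikipedia.org"):
    """Guess a site code from the hostname alone; return None if unknown."""
    host = urlparse(url).hostname or ""
    suffix = "." + domain
    if not host.endswith(suffix):
        return None
    code = host[: -len(suffix)]
    # Map a renamed code to its current name before the membership check.
    code = RENAMED_CODES.get(code, code)
    return code if code in KNOWN_CODES else None


print(code_from_url("https://be-x-old.wikipedia.org/wiki/$1"))
```

Whether trusting the hostname alone is acceptable is exactly the open question; the sketch only shows where a renamed-code table would slot in.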

BTW, be-tarask works in pywikibot.Site(), but not as an interwiki link prefix: pywikibot.Page(pywikibot.Site(), "be-tarask:page")
be-x-old doesn't work at all.

Script terminated by exception:

ERROR: SiteDefinitionError: be-tarask:Баўгарыя is not a local page on wikipedia:cs, and the interwiki prefix be-tarask is not supported by Pywikibot!
Unknown URL 'https://be-x-old.wikipedia.org/wiki/$1'.
Traceback (most recent call last):
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/page.py", line 5609, in parse
    newsite = self._site.interwiki(prefix)
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/site.py", line 956, in interwiki
    return self._interwikimap[prefix].site
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/site.py", line 710, in __getitem__
    raise self._iw_sites[prefix].site
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/site.py", line 673, in site
    self._site = pywikibot.Site(url=self.url)
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/__init__.py", line 1238, in Site
    code, fam = _code_fam_from_url(url)
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/__init__.py", line 1195, in _code_fam_from_url
    raise SiteDefinitionError("Unknown URL '{0}'.".format(url))
pywikibot.exceptions.SiteDefinitionError: Unknown URL 'https://be-x-old.wikipedia.org/wiki/$1'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pwb.py", line 253, in <module>
    if not main():
  File "pwb.py", line 246, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 115, in run_python_file
    main_mod.__dict__)
  File "./neexistujici-kotvy.py", line 272, in <module>
    main()
  File "./neexistujici-kotvy.py", line 258, in main
    bot.run()  # guess what it does
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/bot.py", line 1505, in run
    self.treat(page)
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/bot.py", line 1733, in treat
    self.treat_page()
  File "./neexistujici-kotvy.py", line 162, in treat_page
    if testovana_stranka.isRedirectPage():
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/page.py", line 811, in isRedirectPage
    return self.site.page_isredirect(self)
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/page.py", line 218, in site
    return self._link.site
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/page.py", line 5705, in site
    self.parse()
  File "/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core/pywikibot/page.py", line 5616, in parse
    .format(self._text, self._site, prefix, e))
pywikibot.exceptions.SiteDefinitionError: be-tarask:Баўгарыя is not a local page on wikipedia:cs, and the interwiki prefix be-tarask is not supported by Pywikibot!
Unknown URL 'https://be-x-old.wikipedia.org/wiki/$1'.
CRITICAL: Closing network session.

This happens every single time Pywikibot tries to parse [[be-tarask:something]].

Dalba triaged this task as High priority. (Aug 9 2018, 10:46 AM)
Xqt claimed this task.
Xqt removed Xqt as the assignee of this task.
Xqt subscribed.