
OSM Wiki actually has item_namespace
Open, High | Public | Bug Report

Description

I set up a user-config.py file with the osm family to connect to the OSM Wiki. The following code:

import pywikibot
site = pywikibot.Site()
item = pywikibot.ItemPage(site, "Key:amenity")

crashes with

AttributeError: APISite instance has no attribute 'item_namespace'

even though the item namespace exists: see https://wiki.openstreetmap.org/wiki/Key:amenity, which has "Data item" in the sidebar.

Event Timeline

I guess the problem is that the site is an APISite, whereas ItemPage needs the DataSite interface. The usual way is to get the repository from the APISite:

repo = site.data_repository()
item = pywikibot.ItemPage(repo, title)  # title is an item ID such as "Q4980"

repo = site.data_repository()
item = pywikibot.ItemPage(repo, "Q4980")

works, but it does not solve anything, as both

repo = site.data_repository()
item = pywikibot.ItemPage(repo, "highway=motorway")
print(item)

repo = site.data_repository()
item = pywikibot.ItemPage(repo, "Tag:highway=motorway")
# fails with 'Tag:highway=motorway' is not a valid item page title
print(item)

still fail. What I want is to get the matching data item for a given page (https://wiki.openstreetmap.org/wiki/Tag:highway%3Dmotorway).

So far I have only found code that simply fetches every single data item and caches all of them.
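
The caching approach mentioned above could look roughly like the following sketch. It assumes that each data item's JSON carries a sitelink under the site key "wiki" that names the documented page (the traceback further down in this task suggests such sitelinks exist). The function name build_title_index and the sample entities are hypothetical, and the item IDs are placeholders rather than real OSM Wiki items.

```python
def build_title_index(entities):
    """Map wiki page titles to data item IDs.

    `entities` is an iterable of entity dicts in the Wikibase JSON shape.
    Entities without a "wiki" sitelink are skipped.
    """
    index = {}
    for entity in entities:
        sitelink = entity.get("sitelinks", {}).get("wiki")
        if sitelink is None:
            continue
        index[sitelink["title"]] = entity["id"]
    return index


# Made-up sample data; the IDs are placeholders, not real OSM Wiki items.
sample_entities = [
    {"id": "Q1", "sitelinks": {"wiki": {"site": "wiki", "title": "Key:amenity"}}},
    {"id": "Q2", "sitelinks": {"wiki": {"site": "wiki", "title": "Tag:highway=motorway"}}},
    {"id": "Q3", "sitelinks": {}},  # no sitelink, so it is left out of the index
]
title_index = build_title_index(sample_entities)
```

With such an index, looking up the item for a page is a plain dictionary access, at the cost of downloading every item once.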

Key:amenity is a regular page, not an item. You should be able to access the item through:

import pywikibot
site = pywikibot.Site('en', 'osm')
page = pywikibot.Page(site, 'Key:amenity')
item = page.data_item()

However, OSM is not a DataSite. When I update the family file to make it one, it throws "ValueError: Cannot parse a site out of wiki." because of a sitelink to "wiki" (the local page?).

As a workaround, the API can be queried directly: https://wiki.openstreetmap.org/w/api.php?action=wbgetentities&sites=wiki&titles=Key:bridge:movable&languages=en|fr&format=json

https://github.com/matkoniecz/data_item_spillover_cleanup/blob/master/data_item_extractor.py has some code that (partially) parses this response into something usable. I have not found any way to get the data using normal pywikibot code.
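
For illustration, the wbgetentities request above could be built and its response reduced to item IDs along these lines. This is a minimal sketch, not the code from the linked repository: the helper names build_wbgetentities_url and extract_item_ids are made up here, and the response handling assumes the usual Wikibase shape, with an "entities" mapping in which pages without an item are flagged by a "missing" key. Fetching the URL is left out so that the sketch does not depend on network access.

```python
import json
from urllib.parse import urlencode

API_URL = "https://wiki.openstreetmap.org/w/api.php"


def build_wbgetentities_url(title, languages=("en",)):
    """Build a wbgetentities URL that looks up the item linked to `title`."""
    params = {
        "action": "wbgetentities",
        "sites": "wiki",  # OSM Wiki uses the bare name "wiki" for the local site
        "titles": title,
        "languages": "|".join(languages),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_item_ids(response_text):
    """Return the IDs of all entities in the response that actually exist."""
    data = json.loads(response_text)
    return [
        entity["id"]
        for entity in data.get("entities", {}).values()
        if "missing" not in entity and "id" in entity
    ]


url = build_wbgetentities_url("Key:bridge:movable", languages=("en", "fr"))
```

The resulting URL can be fetched with any HTTP client and the body passed to extract_item_ids.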

Xqt triaged this task as High priority. (Dec 9 2020, 12:22 PM)

osm is its own DataSite

>>> site = pywikibot.Site('osm:en')
>>> repo = site.data_repository()
>>> site
APISite("en", "osm")
>>> repo
DataSite("en", "osm")
>>>

but there are some problems with supporting osm:

item.get() loads the content and stores it in the _content attribute. The problem is the initialization after that:

# make use of lazy initialization (T245809)
print(self.DATA_ATTRIBUTES)
for key, cls in self.DATA_ATTRIBUTES.items():
    value = cls.fromJSON(self._content.get(key, {}), self.repo)  # <-- this will fail
    setattr(self, key, value)
    data[key] = value

Xqt changed the subtype of this task from "Task" to "Bug Report". (Dec 9 2020, 1:56 PM)

Traceback (most recent call last):
  File "<pyshell#73>", line 1, in <module>
    d = i.get()
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 4539, in get
    data = super().get(force, *args, **kwargs)
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 4149, in get
    data = WikibaseEntity.get(self, force=force)
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 3917, in get
    value = cls.fromJSON(self._content.get(key, {}), self.repo)
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 3561, in fromJSON
    return cls(repo, data)
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 3551, in __init__
    self.update(data)
  File "C:\Python39\lib\_collections_abc.py", line 856, in update
    self[key] = other[key]
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 3602, in __setitem__
    val = SiteLink.fromJSON(val, self.repo)
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 6193, in fromJSON
    sl = cls(data['title'], data['site'])
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 6136, in __init__
    site, namespace, title = SiteLink._parse_namespace(title, site)
  File "C:\pwb\GIT\core\pywikibot\page\__init__.py", line 6160, in _parse_namespace
    site = pywikibot.site.APISite.fromDBName(site)
  File "C:\pwb\GIT\core\pywikibot\site\__init__.py", line 177, in fromDBName
    raise ValueError('Cannot parse a site out of %s.' % dbname)
ValueError: Cannot parse a site out of wiki.
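
The failure mode in this traceback can be reproduced in isolation with a simplified stand-in. The sketch below is not pywikibot's implementation: KNOWN_DBNAMES, site_from_dbname and parse_sitelinks are hypothetical, and the real lookup in fromDBName is more involved. It only shows that a single sitelink whose site name ("wiki") cannot be resolved aborts the parsing of the whole item.

```python
# Hypothetical table of resolvable database names; it merely stands in for
# whatever lookup pywikibot really performs.
KNOWN_DBNAMES = {
    "enwiki": ("en", "wikipedia"),
    "wikidatawiki": ("wikidata", "wikidata"),
}


def site_from_dbname(dbname):
    """Resolve a database name to a (code, family) pair or raise ValueError."""
    try:
        return KNOWN_DBNAMES[dbname]
    except KeyError:
        raise ValueError('Cannot parse a site out of %s.' % dbname) from None


def parse_sitelinks(sitelinks):
    """Resolve every sitelink's site; one unknown name aborts the whole loop."""
    return {
        name: site_from_dbname(link["site"])
        for name, link in sitelinks.items()
    }


# The OSM Wiki item links to its local page under the bare site name "wiki".
osm_sitelinks = {"wiki": {"site": "wiki", "title": "Key:amenity"}}
try:
    parse_sitelinks(osm_sitelinks)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```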

Change 647259 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [bugfix] Enable creating ItemPage for osm

https://gerrit.wikimedia.org/r/647259