
`APISite.is_data_repository` does not work if `self` is an `APISite`
Open, Needs Triage, Public

Description

It is not possible to use that function to determine if the site is a data repository when the site is not already a DataSite instance. And in that case an isinstance() would work too.

If possible, that should be done together with T85331, because the current system doesn't allow a Wikibase repo to use another Wikibase repo (although I'm not sure whether that is possible at all).

Event Timeline

XZise raised the priority of this task from to Needs Triage.
XZise updated the task description.
XZise added subscribers: Aklapper, Unknown Object (MLST), XZise and 2 others.

I believe Wikimedia will soon have repos that talk to other repos. If I understand correctly, the Commons Metadata project will make use of Wikidata somehow.

matej_suchanek subscribed.

I think the method now works as expected.

>>> import pywikibot
>>> site = pywikibot.Site('en', 'wikipedia')
>>> site.is_data_repository()
False
>>> repo = site.data_repository()
>>> repo
DataSite("wikidata", "wikidata")
>>> repo.is_data_repository()
True
>>> pywikibot.Site('wikidata', 'wikidata')
DataSite("wikidata", "wikidata")
>>> pywikibot.Site('en', 'wiktionary').is_data_repository()
False

The only thing we could do better is to override this method in DataSite():

def is_data_repository(self):
    return True

since Site.data_repository() always returns DataSite().

Note that this task is referenced from two places in code, so some cleanup would be useful.

>>> site = pywikibot.Site('wikidata', 'wikidata', interface='APISite')
>>> site
APISite("wikidata", "wikidata")
>>> site.is_data_repository()
False
>>> site.data_repository()
DataSite("wikidata", "wikidata")

Though I'm not sure why someone would do interface='APISite' in the first place.

I am wondering whether this method is useful for anything but confusion. If the site should be a repo, it is a DataSite. That it's a DataSite you can test via isinstance(). The identity testing in the current APISite.is_data_repository() method is either obsolete (for repos themselves), or it doesn't work (for the mentioned case).
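To make the two cases concrete, here is a minimal sketch that uses stand-in classes rather than the real pywikibot ones. It assumes the method is implemented as an identity test against `data_repository()`, as described above; the class bodies and the `_repo` cache are hypothetical and only mimic the behaviour seen in the sessions earlier in this thread.

```python
class APISite:
    """Stand-in for a client site; not the real pywikibot class."""

    _repo = None  # hypothetical cache holding the single repository object

    def __init__(self, code, family):
        self.code = code
        self.family = family

    def data_repository(self):
        # Always hand back the same cached DataSite instance.
        if APISite._repo is None:
            APISite._repo = DataSite('wikidata', 'wikidata')
        return APISite._repo

    def is_data_repository(self):
        # Assumed implementation: identity test against the repository.
        return self is self.data_repository()


class DataSite(APISite):
    """Stand-in for a repository site."""


# Same wiki, but created through the plain APISite interface.
plain = APISite('wikidata', 'wikidata')
repo = plain.data_repository()

print(repo.is_data_repository())    # obsolete: repo is already a DataSite
print(plain.is_data_repository())   # does not work: plain is not the cached object
print(isinstance(repo, DataSite), isinstance(plain, DataSite))
```

In this model the identity test never tells you more than isinstance() already does.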

IMO the best solution would be:

  1. deprecate and remove APISite.is_data_repository()
  2. always query for the repository
  3. override DataSite.data_repository() with return self (for better performance)
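A rough sketch of what that proposal could look like, again with stand-in classes and not the actual pywikibot code. The `_SITES` cache and the `get_site()` helper are hypothetical names introduced only for illustration; a real client would look up its repository from the wiki's configuration instead of hard-coding Wikidata.

```python
_SITES = {}  # hypothetical cache keyed by (class name, family, code)


def get_site(cls, code, family):
    """Return a cached site object, creating it on first use."""
    key = (cls.__name__, family, code)
    if key not in _SITES:
        _SITES[key] = cls(code, family)
    return _SITES[key]


class APISite:
    """Stand-in client site without any is_data_repository() method."""

    def __init__(self, code, family):
        self.code = code
        self.family = family

    def data_repository(self):
        # Step 2: always query for the repository (hard-coded in this sketch).
        return get_site(DataSite, 'wikidata', 'wikidata')


class DataSite(APISite):
    """Stand-in repository site."""

    def data_repository(self):
        # Step 3: a repository is its own repository, so skip the lookup.
        return self


client = get_site(APISite, 'en', 'wikipedia')
repo = client.data_repository()

print(repo.data_repository() is repo)   # repository returns itself
print(isinstance(client, DataSite))     # callers test the type directly
```

Callers that used is_data_repository() would switch to isinstance(site, DataSite), and code that needs the repository would call data_repository() unconditionally.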

I am wondering whether this method is useful for anything but confusion. If the site should be a repo, it is a DataSite. That it's a DataSite you can test via isinstance().

  • deprecate and remove APISite.is_data_repository()

+1. Agreed, makes sense. I don't write bots for Wikidata, though, so I can't speak for Wikidata bot coders.

Though I'm not sure why someone would do interface='APISite' in the first place.

In fact, you may need to treat the site as a client wiki. We shouldn't prevent this.

In fact, you may need to treat the site as a client wiki. We shouldn't prevent this.

DataSite inherits from APISite. Anything valid in APISite (as a client wiki) should be equally valid in DataSite (as a repository wiki), but not necessarily the other way around. If you need to treat Wikidata as a client wiki, using only methods provided in APISite, whether you initialize it as APISite or DataSite should make no difference.

In fact, you may need to treat the site as a client wiki. We shouldn't prevent this.

whether you initialize it as APISite or DataSite should make no difference.

It does, see for example WikidataSPARQLPageGenerator. This is the case where you may want the generator to return pages in the project namespace connected to items, for instance.

I think this task is invalid because a DataSite is not an APISite but a subclass of it. is_data_repository can test either for a DataSite instance or for identity, because all sites are held in pywikibot._sites (except when the users differ). I am also unhappy with the current Site comparison:

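import pywikibot

class foo(pywikibot.tools.ComparableMixin):
    # Not a site at all; it only supplies the same comparison key.
    def _cmpkey(self):
        return 'wikipedia', 'de'

site = pywikibot.Site()  # assuming the default site is de.wikipedia
bar = foo()
site == bar
True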

The latter quacks like a duck, but it is a frog!
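The effect can be reproduced without pywikibot. The sketch below uses a simplified stand-in for the mixin, assuming equality is decided by `_cmpkey()` alone, and contrasts it with a type-checking variant. `StrictComparableMixin` and the `Strict*` classes are hypothetical; they show one possible way to avoid the problem and are not something proposed in this task.

```python
class ComparableMixin:
    """Simplified stand-in: equality looks at _cmpkey() only."""

    def __eq__(self, other):
        return self._cmpkey() == other._cmpkey()


class StrictComparableMixin:
    """Hypothetical variant that also requires a compatible type."""

    def __eq__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented  # let Python fall back to identity
        return self._cmpkey() == other._cmpkey()


class Site(ComparableMixin):
    def __init__(self, code, family):
        self.code = code
        self.family = family

    def _cmpkey(self):
        return self.family, self.code


class Frog(ComparableMixin):
    def _cmpkey(self):
        return 'wikipedia', 'de'


class StrictSite(StrictComparableMixin, Site):
    pass


class StrictFrog(StrictComparableMixin, Frog):
    pass


print(Site('de', 'wikipedia') == Frog())              # keys match, types ignored
print(StrictSite('de', 'wikipedia') == StrictFrog())  # types differ
print(StrictSite('de', 'wikipedia') == StrictSite('de', 'wikipedia'))
```

Putting the type check in the mixin rather than in one class matters here: if only one side checked the type, Python would still try the other operand's reflected `__eq__` and could report equality again.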

The breaking change was made in b9d252e (very early Pywikibot 2).