This Phabricator task is for fixing two separate bugs in handling Commons files where mediainfo doesn't exist.
Bug 1 is that missing mediainfo is not handled, and there is no method for creating empty mediainfo.
Code
gen = pagegenerators.RecentChangesPageGenerator(
    site=self.site,
    namespaces=[6],  # File namespace
    changetype="new",
    total=100
)

# Seek to first page without mediainfo.
for page in gen:
    if 'mediainfo' not in page.latest_revision.slots:
        item = page.data_item()
        # Get fails as there is no mediainfo.
        item.get()
Result
Traceback (most recent call last):
  File "/Users/kimmovirtanen/wikitech/core/tests/file_tests.py", line 400, in test_file_exist_but_without_item
    item.get()
  File "/Users/kimmovirtanen/wikitech/core/pywikibot/page/_wikibase.py", line 427, in get
    data = self.file.latest_revision.slots['mediainfo']['*']
KeyError: 'mediainfo'
Fix
Treat the missing key as NoWikibaseEntityError and add get_data_for_new_entity() for users who want to create a new item for the file.
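A minimal sketch of the idea behind this fix, not the actual patch. It assumes the revision slots behave like a plain dict and that the mediainfo slot stores its content as a JSON string under '*', as the traceback line suggests. The exception class is a local stand-in for Pywikibot's NoWikibaseEntityError, and read_mediainfo is a hypothetical helper name.

```python
import json


class NoWikibaseEntityError(Exception):
    """Local stand-in for pywikibot.exceptions.NoWikibaseEntityError."""


def read_mediainfo(slots):
    """Return the parsed mediainfo slot (hypothetical helper).

    Raises NoWikibaseEntityError instead of a bare KeyError when the
    revision has no 'mediainfo' slot.
    """
    if 'mediainfo' not in slots:
        raise NoWikibaseEntityError('file has no mediainfo slot')
    # Assumption: the slot content is a JSON string stored under '*'.
    return json.loads(slots['mediainfo']['*'])


# A revision with only the main slot, as for a freshly uploaded file.
try:
    read_mediainfo({'main': {'*': 'wikitext'}})
except NoWikibaseEntityError as error:
    print('no entity:', error)
```

With this shape, a caller can catch one well-defined exception and decide whether to create a new item for the file.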
Bug 2 (T222159) is that empty statements are returned as a list instead of a dictionary.
Code
# Namespace 6 corresponds to files
gen = pagegenerators.RandomPageGenerator(total=1000, site=site, namespaces=[6])

# Seek to first page with mediainfo.
for page in gen:
    if 'mediainfo' in page.latest_revision.slots:
        item = page.data_item()
        # Get fails on the first item that has no statements in its mediainfo.
        data = item.get()
Result
Traceback (most recent call last):
  File "/Users/kimmovirtanen/pywikibot/latestfiles.py", line 23, in <module>
    data=item.get()
  File "/Users/kimmovirtanen/pywikibot/venv/lib/python3.10/site-packages/pywikibot/page/_wikibase.py", line 446, in get
    return super().get(force=force)
  File "/Users/kimmovirtanen/pywikibot/venv/lib/python3.10/site-packages/pywikibot/page/_wikibase.py", line 275, in get
    value = cls.fromJSON(self._content.get(key, {}), self.repo)
  File "/Users/kimmovirtanen/pywikibot/venv/lib/python3.10/site-packages/pywikibot/page/_collections.py", line 213, in fromJSON
    for key, claims in data.items():
AttributeError: 'list' object has no attribute 'items'
CRITICAL: Exiting due to uncaught exception AttributeError: 'list' object has no attribute 'items'
Fix
The Pywikibot fix is to detect the incorrect list and convert it to a dictionary when the data is loaded.
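The conversion can be sketched roughly as follows. This is an illustration only, not the code in the patch: statements_as_dict is a hypothetical helper, and the input is assumed to be the already-decoded mediainfo content as a plain dict.

```python
def statements_as_dict(content):
    """Return a copy of content where an empty 'statements' list is {}.

    Hypothetical helper: a dict value, a missing key, or a non-empty
    list is left unchanged.
    """
    fixed = dict(content)
    statements = fixed.get('statements')
    if isinstance(statements, list) and not statements:
        # Empty statements arrive as [] but the claim parser calls
        # .items() on the value, so it needs a dict.
        fixed['statements'] = {}
    return fixed


print(statements_as_dict({'labels': {}, 'statements': []}))
# prints {'labels': {}, 'statements': {}}
```

Doing the conversion once at load time keeps the downstream fromJSON() code unchanged.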
How to test that it is working
import pywikibot

site = pywikibot.Site('commons', 'commons')
page = pywikibot.FilePage(site, 'Image:Montemurro1857.png')
item = page.data_item()
data = item.get()
print(data)