Page MenuHomePhabricator

Make WbQuantity handle case without errors
Closed, ResolvedPublic

Description

T115269 means that
in addition to

"datavalue":{
  "value":{
    "amount":"+700",
    "unit":"1",
    "upperBound":"+710",
    "lowerBound":"+690"
  },
  "type":"quantity"
},

clients may now also encounter this:

"datavalue":{
  "value":{
    "amount":"+700",
    "unit":"1"
  },
  "type":"quantity"
},

WbQuantity should both be able to handle this and we should ensure that we don't add errors unless they are explicitly set.

Before deploying this should be announced on the mailing list as some users may rely on the current error=None -> +/-0 "feature".

Details

Event Timeline

Restricted Application added subscribers: pywikibot-bugs-list, Aklapper. · View Herald TranscriptNov 7 2016, 9:05 PM

Licking this cookie since I have a patch in the works (sine I've been changing WbQuantity lately). Just need to add tests.

We should probably also ensure that this gets pushed to 2.0 and that a new rc is built with it included to avoid messing up (by adding false +/-0) the clean-up work planned by the Wikidata crew.

Change 320649 had a related patch set uploaded (by Lokal Profil):
Make uncertainties in WbQuantity optional

https://gerrit.wikimedia.org/r/320649

Lokal_Profil added a comment.EditedNov 9 2016, 9:11 PM

Change 320649 had a related patch set uploaded (by Lokal Profil):
Make uncertainties in WbQuantity optional
https://gerrit.wikimedia.org/r/320649

There is a lot of extra code in here just to ensure this is compatible with older versions of WikiBase.

Specifically WbQuantity now requires a site object which it then only uses to check the MediaWiki version of the site. To me that feels like unnecessary overhead (and would require a lot of people update their client code).

A solution would be to simply ignore older versions of WikiBase (everything works fine as long as you supply an error) and maybe just listen for "pywikibot.data.api.APIError: invalid-snak" when someone tries to add a Claim where the target is WbQuantity and if so raise an error/warning suggesting "this might be due to you not specifying an error"? The downside is that this occurs much later in the workflow (when adding a claim not when creating the WbQuantity object).

Great job keeping the backwards compatibility! I say submit the code with the current logic and remove the extra logic later if the code becomes unmaintainable (this will eventually happen after a few more breaking changes to wikibase).

Great job keeping the backwards compatibility! I say submit the code with the current logic and remove the extra logic later if the code becomes unmaintainable (this will eventually happen after a few more breaking changes to wikibase).

My worry is that the current solution would require everyone who wants their bot to correctly handle Quantities on Wikidata to update their code (adding the site object). Whereas if they ignore the warning they can happily go on mishandling them (which affects T142087).

After thinking about this some more one solution would be to have _require_errors() return False instead if no site object is provided. That way Wikidata users will automatically get the desired behaviour without changing anything. User of older WikiBase installations will get an initial crash but it will be directly preceded by the warning message and they could easily get their code running again by simply providing a site object.

If we go with that solution the patch should of course not be merged until after the change goes live on Wikidata.

Strainu added a comment.EditedNov 10 2016, 10:13 AM

My worry is that the current solution would require everyone who wants their bot to correctly handle Quantities on Wikidata to update their code (adding the site object). Whereas if they ignore the warning they can happily go on mishandling them (which affects T142087).
After thinking about this some more one solution would be to have _require_errors() return False instead if no site object is provided. That way Wikidata users will automatically get the desired behaviour without changing anything.

Having the default accommodate the newest version of WikiBase seems like a good idea, but it would kind of break the whole idea of keeping backwards compatibility. I don't think the current version would qualify as "incorrect" though, even from an ontology POV. It would simply be a QuantityValue instead of an UnboundedQuantityValue. I would expect a maintenance script to be run after the change if the intention is to move all Quantities with upper/lower bound 0 to UnboundedQuantityValue. I'm not familiar enough with ontologies to be able to say if this is needed/desirable.

Change 320649 merged by jenkins-bot:
[Breaking]Make uncertainties in WbQuantity optional

https://gerrit.wikimedia.org/r/320649

Magul closed this task as Resolved.Nov 21 2016, 2:06 PM
daniel added a subscriber: daniel.EditedNov 21 2016, 2:12 PM

@Strainu defaulting to +/-0 was nearly always definitely wrong. +/-0 means ABSOLUTE precision. That does not exist for any measured quantity, only for definitions and exact counts.

Omitted bounds does not have the same semantics as +/-0 at all. A bot is going to change many (not all) instances of +/-0 to unbounded, because +/-0 was abused a lot on wikidata to prevent the uncertainty bounds to be displayed. The same bot will also change many (not all) +/-1 quantities to unbound, because +/-1 was the old default.

In any case, please DO NOT use +/-0 as the default for value imports. It's semantically wrong in almost all cases. +/-0 can be used only if the value is KNOWN to be an EXACT count or definition.

OK. So the current patch was merged which allows pywikibot to go on dealing with Wikidata.

To avoid the scenario which @daniel mentions a follow-up patch can be submitted which changes _require_errors() to False instead if no site object is provided. I we put together that patch but am ok with either solution.

More importantly this change (or changes) should be pushed to the 2.0 branch to ensure they can also deal with unbound errors correctly.

Change 322672 had a related patch set uploaded (by Lokal Profil):
Require site object only for older MediaWiki installations

https://gerrit.wikimedia.org/r/322672

Magul reopened this task as Open.Nov 21 2016, 3:37 PM

Since backwards compatibility will be broken anyway, the question is if we should keep the extra code that handles older versions of Wikibase. Is there any data on the number of Wikibase installations out there?

Since backwards compatibility will be broken anyway, the question is if we should keep the extra code that handles older versions of Wikibase. Is there any data on the number of Wikibase installations out there?

I pinged the pywikibot list to see if anyone is using pywikibot for other wikibase installations but my gut feeling says it's unlikely.

@Strainu No reply on the list. I would suggest merging the patch as I've already seen a couple of questions from people running on Wikidata.

Change 322672 merged by jenkins-bot:
Require site object only for older MediaWiki installations

https://gerrit.wikimedia.org/r/322672

Lokal_Profil closed this task as Resolved.Nov 28 2016, 12:22 PM

Many thanks @Strainu

Holger added a subscriber: Holger.Jan 19 2017, 6:10 PM

Is this fix already in 2.0rc5?
Getting this error in lots of places.
Thanks :)

Lokal_Profil added a comment.EditedJan 19 2017, 9:39 PM

Is this fix already in 2.0rc5?
Getting this error in lots of places.
Thanks :)

This is not in the PyPI version sadly. You'll need to get the git master from gerrit or the github mirror (you can still use pip with that url). For more details on the state of the PyPI version see T152907 and T106121

Please consider fixing this in 2.0rc5/the pip install version. I just came across this, and spent the time to debug my code to make sure that wasn't the problem, tracked down the problematic value on a Wikidata entry, wrote a test script that demonstrates the issue, tested it on several computers, and came here to submit the bug - only to then find this thread. If the pip version was up-to-date, none of that would have been necessary. :-(

Per T150210#2954454 you will need to raise this on the tasks for making a new release. At this point 2.0r5 is so far behind master that it might not be possible to reliably backport individual patches.

Per T150210#2954454 you will need to raise this on the tasks for making a new release. At this point 2.0r5 is so far behind master that it might not be possible to reliably backport individual patches.

OK, raised at T159280. It sounds like this isn't a case of backporting bug fixes, rather the version available through pip needs to be updated to the master version.

Hi!
I am sorry, but
with this version:

Pywikibot: pywikibot/__init__.py (c4ccfae, -1 (unknown), 2017/05/05, 01:02:04, n/a)
Release version: 2.0rc5
httplib2 version: 0.9.1
  cacerts: D:\SNMs\refinery\pywikibot\core_stable\externals\httplib2\python3\httplib2\cacerts.txt
    certificate test: ok
Python: 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul  5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
  unicode test: ok
PYWIKIBOT2_DIR: Not set
PYWIKIBOT2_DIR_PWB:
PYWIKIBOT2_NO_USER_CONFIG: Not set
Config base dir: D:\SNMs\refinery\pywikibot\core_stable
Usernames for family "wikipedia":
        ru: Sovmogil (no sysop configured)

and with this script --

import pywikibot

from pywikibot import pagegenerators as pg

def list_template_usage(site_obj, tmpl_name):
    """
    Takes Site object and template name and returns a generator.

    The function expects a Site object (pywikibot.Site()) and
    a template name (String). It creates a list of all
    pages using that template and returns them as a generator.
    The generator will load 50 pages at a time for iteration.
    """
    name = "{}:{}".format(site_obj.namespace(10), tmpl_name)
    tmpl_page = pywikibot.Page(site_obj, name)
    ref_gen = pg.ReferringPageGenerator(tmpl_page, onlyTemplateInclusion=True)
    filter_gen = pg.NamespaceFilterPageGenerator(ref_gen, namespaces=[0])
    generator = site_obj.preloadpages(filter_gen, groupsize=20, templates=False, langlinks=False)
    return generator

site = pywikibot.Site("ru", 'wikipedia')
tmpl_gen = list_template_usage(site, "geo-stub")

for page in tmpl_gen:
    item = pywikibot.ItemPage.fromPage(page)
    print(item.getID())

I am stil getting this error --

D:\SNMs\refinery\pywikibot\core_stable>python testpull02.py
Retrieving 20 pages from wikipedia:ru.
Q5214867
Q4163312
Q4289585
Traceback (most recent call last):
  File "testpull02.py", line 25, in <module>
    item = pywikibot.ItemPage.fromPage(page)
  File "D:\SNMs\refinery\pywikibot\core_stable\pywikibot\page.py", line 3556, in fromPage
    if not lazy_load and not i.exists():
  File "D:\SNMs\refinery\pywikibot\core_stable\pywikibot\page.py", line 3158, in exists
    self.get()
  File "D:\SNMs\refinery\pywikibot\core_stable\pywikibot\page.py", line 3576, in get
    c = Claim.fromJSON(self.repo, claim)
  File "D:\SNMs\refinery\pywikibot\core_stable\pywikibot\page.py", line 3995, in fromJSON
    claim.type, lambda value, site: value)(value, site)
  File "D:\SNMs\refinery\pywikibot\core_stable\pywikibot\page.py", line 3942, in <lambda>
    'quantity': lambda value, site: pywikibot.WbQuantity.fromWikibase(value),
  File "D:\SNMs\refinery\pywikibot\core_stable\pywikibot\__init__.py", line 512, in fromWikibase
    upperBound = eval(wb['upperBound'])
KeyError: 'upperBound'

I've read it was kind of fixed, but it seems to be not... Any suggestions how it is to be better handled please.

Thank you for your time

Lokal_Profil added a comment.EditedMay 5 2017, 1:17 PM

@Sovmogil you need to use the newest version on PyPI. I.e. 3.0.xxx.

Try running pip install --upgrade pywikibot

Thank you @Lokal_Profil . The solution worked.