
Make the "allowed units" constraint type recognize classes of units
Open, Needs Triage, Public

Description

According to the 2020 report on Property constraints (T244043):

To make this constraint type more correctly used and easier to maintain, it is suggested that implementations recognize the definition of classes of units (e.g., the class unit of length), usually better than trying to list most or all the instances of such a class (e.g., the instances "light-year", "astronomical unit", "foot", "parsec", "ångström", "metre", "centimetre", "millimetre", and potentially hundreds of other instances).

Event Timeline

I think this is mostly an issue for @Ivan_A_Krestinin? WikibaseQualityConstraints already applies unit conversion when checking “allowed units” constraints – if existing constraints redundantly specify many different “length” units, I assume that’s either because the constraint authors are unaware it’s not necessary, or because it’s still necessary for KrBot’s checks.

I also observe that the property with the highest number of allowed units, conversion to SI unit, presumably wouldn’t profit from this at all – its 84 units should all be different “classes”, otherwise it’s not very much of a standard conversion. From a glance at the list, that indeed appears to be the case (the classes being “length”, “mass”, “length squared”, “length over time”, etc.).

(It also follows that for this particular constraint, the checker should in fact not apply unit conversions, but we don’t have a way to express that in the constraint statement yet.)

We could create a class representing a "non-SI unit convertible to SI in Wikidata", with the corresponding unit Items as its instances. However, I'm not sure whether this is currently possible at the implementation level: I don't know to what extent classes of units are hardcoded in WikibaseQualityConstraints, or whether new classes can be created and used immediately in "allowed units" constraints.

WikibaseQualityConstraints doesn’t have any lists of units of its own, hard-coded or otherwise. (It knows two individual units, “year” and “second”, to calculate differences.) It uses the standard unit conversions from Wikibase – the same ones that are used for the RDF export – and otherwise just compares if two item IDs are the same or not.

Great! So, as far as WikibaseQualityConstraints (and not other implementations) is concerned, we could use "coherent SI unit" in the "allowed units" constraint of "conversion to SI unit" (P2370)?

No, because it only checks the identity of items, it doesn’t care about their statements. (I assume you meant that this would allow any unit with P31 coherent SI unit?)
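To illustrate the distinction drawn here, the following is a minimal Python sketch, not the actual WikibaseQualityConstraints code. It contrasts an identity-only comparison with a hypothetical class-aware check of the kind this task asks for. The `INSTANCE_OF` table, both function names, and all item IDs are placeholders invented for illustration, and unit conversion is left out to keep the contrast visible.

```python
# Hypothetical data: unit item ID -> set of "instance of" (P31) class IDs.
# Every ID here is a made-up placeholder, not a real Wikidata item ID.
INSTANCE_OF = {
    "Q_METRE": {"Q_COHERENT_SI_UNIT", "Q_UNIT_OF_LENGTH"},
    "Q_FOOT": {"Q_UNIT_OF_LENGTH"},
}


def allowed_by_identity(actual_unit, allowed_items):
    """Identity-only check: the unit passes only if its own ID is listed."""
    return actual_unit in allowed_items


def allowed_by_class(actual_unit, allowed_items):
    """Hypothetical class-aware check: the unit also passes if any of its
    P31 classes is listed among the allowed items."""
    classes = INSTANCE_OF.get(actual_unit, set())
    return actual_unit in allowed_items or bool(classes & set(allowed_items))


# Listing the class item does nothing under an identity-only check ...
print(allowed_by_identity("Q_METRE", ["Q_COHERENT_SI_UNIT"]))
# ... but would admit the unit if statements on the unit were consulted.
print(allowed_by_class("Q_METRE", ["Q_COHERENT_SI_UNIT"]))
```

Under the identity-only check, putting a class item such as "coherent SI unit" into the constraint has no effect, because the class item's ID never equals the ID of any unit actually used in a value.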

(I assume you meant that this would allow any unit with P31 coherent SI unit?)

That's it! :-)

No, because it only checks the identity of items, it doesn’t care about their statements.

Now I'm lost. You mean the Wikibase code (or its configuration) does include QIDs of classes of units? In any case, from the last answer I interpret that WikibaseQualityConstraints also doesn't recognize arbitrary/new classes in "allowed units" constraints, is that right?

In production, there’s a JSON file with unit conversions. (On other installs it may be configured differently.) When checking an “allowed units” constraint, for each allowed unit, WikibaseQualityConstraints will effectively search for the allowed unit and the actual unit in that file, replace each of them with the corresponding standard unit if present (e. g. Q531 → Q11573, from the first block), and then check if the result is the same item ID. (This is hidden behind some layers of indirection – WikibaseQualityConstraints itself never sees the JSON file.)
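The procedure described above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the real implementation: the `STANDARD_UNIT` table is a stand-in for the unit conversion data, reduced to a plain mapping from unit to standard unit, and the function names are invented. Only the Q531 → Q11573 pair comes from the comment above; the other IDs are placeholders.

```python
# Simplified stand-in for the unit conversion data:
# unit item ID -> item ID of the corresponding standard unit.
STANDARD_UNIT = {
    "Q531": "Q11573",    # pair mentioned in the discussion
    "Q_FOOT": "Q11573",  # placeholder ID, for illustration only
}


def to_standard_unit(unit_id):
    """Replace a unit with its standard unit if one is known;
    otherwise return the unit unchanged."""
    return STANDARD_UNIT.get(unit_id, unit_id)


def is_unit_allowed(actual_unit, allowed_units):
    """Normalize the actual unit and each allowed unit, then compare
    the resulting item IDs for equality."""
    actual = to_standard_unit(actual_unit)
    return any(to_standard_unit(allowed) == actual for allowed in allowed_units)


# A constraint that lists only Q11573 also accepts Q531,
# because both normalize to the same item ID.
print(is_unit_allowed("Q531", ["Q11573"]))
# Two non-standard units match each other via their shared standard unit.
print(is_unit_allowed("Q_FOOT", ["Q531"]))
# A unit with no entry and no identical allowed unit is rejected.
print(is_unit_allowed("Q_KILOGRAM", ["Q11573"]))
```

In this sketch, a single allowed unit per standard unit is enough; listing further units that convert to the same standard unit adds nothing to the outcome of the check.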