
Disallow wikidata.org URIs to non-Item pages in unit/globe/calendar values
Closed, Resolved | Public | 3 Estimated Story Points

Description

Certain DataValues use URIs internally, and we are slowly introducing more of these cases:

  • Calendar models in time values.
  • Globes in coordinates.
  • Units in quantities.

In all these cases our current validators allow everything, as long as the URL starts with http or https and doesn't exceed 255 characters. The relevant code can be seen in Wikibase\Repo\ValidatorBuilders.
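To illustrate how permissive that rule is, here is a minimal Python sketch of the check as described above. It is not the actual ValidatorBuilders code (which is PHP); the function name is hypothetical, and it assumes the 255-character limit applies to the whole URI string.

```python
from urllib.parse import urlsplit

MAX_URI_LENGTH = 255  # limit quoted in the description; assumed to cover the whole string


def is_accepted_by_lax_check(uri: str) -> bool:
    """Hypothetical stand-in for the described rule: http(s) scheme, at most 255 characters."""
    scheme = urlsplit(uri).scheme  # urlsplit lower-cases the scheme
    return scheme in ("http", "https") and len(uri) <= MAX_URI_LENGTH


# All of these pass, although only canonical Item URIs should.
for candidate in (
    "http://wikidata.org/entity/Q1",     # "www" missing
    "http://www.wikidata.org/wiki/Q1",   # /wiki/ instead of /entity/
    "http://www.wikidata.org/entity/P31",  # a Property, not an Item
):
    print(candidate, is_accepted_by_lax_check(candidate))
```

Under this rule nothing ties the URI to wikidata.org at all, which is why the suggestions below focus on the host, the path and the entity type.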

In all these cases I suggest that we:

  • Disallow http://wikidata.org URIs, where the "www" is missing.
  • Disallow http://www.wikidata.org/wiki/ URIs, which use "wiki" instead of "entity". More generally, disallow every wikidata.org URI except canonical /entity/ URIs.
  • Clarify what canonical /entity/ URIs include.
  • Disallow every entity type except Items.

Event Timeline

thiemowmde raised the priority of this task from to Needs Triage.
thiemowmde updated the task description.

Change 219016 had a related patch set uploaded (by Thiemo Mättig (WMDE)):
Rework EntityLabelUnitFormatter into ItemLabelUnitFormatter

https://gerrit.wikimedia.org/r/219016

Lydia_Pintscher set Security to None.
Lydia_Pintscher moved this task from ready to go to consider for next sprint on the Wikidata board.
Jonas renamed this task from Disallow wikidata.org URIs to non-Item pages in unit/globe/calendar values to [Story] Disallow wikidata.org URIs to non-Item pages in unit/globe/calendar values. (Aug 13 2015, 3:03 PM)

Change 219016 abandoned by Thiemo Mättig (WMDE):
Rework EntityLabelUnitFormatter into ItemLabelUnitFormatter

Reason:
Became obsolete with I776a4aa.

https://gerrit.wikimedia.org/r/219016

Sorry, https://gerrit.wikimedia.org/r/218917 was about formatting, but this ticket is about validation, which is not resolved.

ItamarWMDE renamed this task from [Story] Disallow wikidata.org URIs to non-Item pages in unit/globe/calendar values to Disallow wikidata.org URIs to non-Item pages in unit/globe/calendar values. (Oct 27 2022, 9:12 AM)
ItamarWMDE added a project: Wikidata Dev Team.

FWIW, the current validation rules are:

  • coordinate globes and time calendars must be URIs below http://www.wikidata.org/entity/. (Always Wikidata, hard-coded in ValidatorBuilders::$wikidataBaseUri.)
  • quantity units must be URIs below the local wiki’s /entity/ URI. (On Wikidata, this is the same URI as for coordinate globes and time calendars, but on other wikis it’s different. All Wikibases refer to Wikidata for globes and calendars, but units are local.)

So I think the first two bullet points of the suggestions in the task description (disallow missing “www”, and disallow “/wiki/” or other non-“/entity/” paths) are already done. However, we don’t yet validate that the part after the /entity/ is a valid item ID (I confirmed locally that http://www.wikidata.org/entity/abcde is accepted as a globe, and it definitely shouldn’t be).
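The gap can be sketched roughly as follows, in Python rather than the PHP of ValidatorBuilders. The function names are hypothetical, and the Item ID pattern (an uppercase "Q" followed by a positive integer without leading zeros) is a simplifying assumption, not necessarily the exact rule Wikibase applies.

```python
import re

# Base URI named in the comment above (ValidatorBuilders::$wikidataBaseUri).
WIKIDATA_BASE_URI = "http://www.wikidata.org/entity/"

# Assumed, simplified Item ID shape: "Q" plus a positive integer.
ITEM_ID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def has_entity_prefix(uri: str, base_uri: str = WIKIDATA_BASE_URI) -> bool:
    """Prefix-only check: the URI merely has to start with the base URI."""
    return uri.startswith(base_uri)


def is_item_uri(uri: str, base_uri: str = WIKIDATA_BASE_URI) -> bool:
    """Stricter check: the part after the base URI must also look like an Item ID."""
    if not uri.startswith(base_uri):
        return False
    return ITEM_ID_PATTERN.fullmatch(uri[len(base_uri):]) is not None


uri = "http://www.wikidata.org/entity/abcde"
print(has_entity_prefix(uri))  # passes the prefix-only check
print(is_item_uri(uri))        # rejected once the ID part is checked
```

For quantity units, the same idea would apply with the local wiki's /entity/ URI passed as base_uri instead of the Wikidata one.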

Task Triage Notes:

  • This task is truly about making sure the part after /entity/ is a valid entity ID
  • We can also reaffirm that the above comment is correct
  • We will discuss this further at story time, to improve the task description

Fun fact: when this was implemented in 2015, we forgot to define the wikibase-validator-bad-prefix message used by these errors.

{
    "error": {
        "code": "modification-failed",
        "info": "⧼wikibase-validator-bad-prefix⧽",
        "messages": [
            {
                "name": "wikibase-validator-bad-prefix",
                "parameters": [
                    "http://example.com/"
                ],
                "html": "⧼wikibase-validator-bad-prefix⧽"
            }
        ],
        "docref": "See https://www.wikidata.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."
    },
    "servedby": "mw1358"
}
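The ⧼…⧽ wrapper in the response above is what an undefined message key looks like when rendered. A minimal Python sketch of that fallback, assuming a plain dictionary as the message catalog: the function name and the "Bad prefix: $1" wording are invented for illustration and are not the real message text.

```python
from typing import Mapping, Sequence


def render_message(key: str, params: Sequence[str], catalog: Mapping[str, str]) -> str:
    """Render a message, falling back to the bracketed key when it is undefined."""
    template = catalog.get(key)
    if template is None:
        # Undefined message: show the key itself, wrapped in ⧼ ⧽.
        return f"⧼{key}⧽"
    text = template
    # Substitute $n placeholders from highest to lowest so "$1" never clobbers "$10".
    for index in range(len(params), 0, -1):
        text = text.replace(f"${index}", params[index - 1])
    return text


params = ["http://example.com/"]

# Without a definition, the caller only sees the bare key.
print(render_message("wikibase-validator-bad-prefix", params, {}))

# With a (hypothetical) definition, the parameter becomes visible.
catalog = {"wikibase-validator-bad-prefix": "Bad prefix: $1"}
print(render_message("wikibase-validator-bad-prefix", params, catalog))
```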

Change 889813 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Add strict types to ValidatorBuilders

https://gerrit.wikimedia.org/r/889813

Change 889814 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Validate entity ID part of URIs in data values

https://gerrit.wikimedia.org/r/889814

Change 889813 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Add strict types to ValidatorBuilders

https://gerrit.wikimedia.org/r/889813

Change 889814 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Validate entity ID part of URIs in data values

https://gerrit.wikimedia.org/r/889814