
wbsearchentities should fail for queries longer than 250 characters
Open, Needs Triage, Public

Description

The Wikidata API's wbsearchentities action "Searches for entities using labels and aliases".

Wikidata has a limit of 250 characters for labels and aliases.

As a result, any search using wbsearchentities with a query longer than 250 characters returns an empty array. A user may therefore be led to think that no item exists for their query (possibly leading to the creation of duplicates), when in fact the query was too long to match any label or alias in the first place.

Consider the following example. The full name of the work described by Q106923254 is "Manuel de l'amateur de la gravure sur bois et sur métal au XVe siècle: Contenant un catalogue des gravures xylographiques se rapportant aux saints et saintes, sujets religieux, mystiques et profanes calendriers, alphabets, armoiries, portraits et suivi d'une specification des impostures: avec des notes critiques, bibliographiques et iconologiques". The user who created it decided to name it "Manuel de l'amateur de la gravure sur bois et sur métal au XVe siècle", because the original title exceeded the 250-character limit.

A user (an app or a service, such as Wikidata's reconciliation service) wanting to know whether there is an item in Wikidata for this work may be tempted to use wbsearchentities with the full name as the query. The Wikidata API would then return an empty array, as if the item did not exist.

I think a safer option would be for the API to return an error instead.
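
Until the API behaves this way, a caller could guard against over-long queries on its own side. The following is only a minimal sketch of that idea, not a description of how the API or any existing client works. The names build_search_url, QueryTooLongError and MAX_TERM_LENGTH are hypothetical, and counting the limit with Python's len() (Unicode code points) is an assumption about how the 250-character limit is measured.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
# Label/alias limit described in this task; assumed to be counted in
# Unicode code points, which is what len() returns for a str.
MAX_TERM_LENGTH = 250


class QueryTooLongError(ValueError):
    """Hypothetical error: the query cannot match any label or alias."""


def build_search_url(query: str, language: str = "en") -> str:
    """Build a wbsearchentities URL, refusing queries over the limit."""
    if len(query) > MAX_TERM_LENGTH:
        raise QueryTooLongError(
            f"query has {len(query)} characters; labels and aliases "
            f"are limited to {MAX_TERM_LENGTH}"
        )
    params = {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"
```

With this guard, a query at or under the limit yields a normal request URL, while a longer one raises an error that the caller has to handle explicitly, for example by truncating the title or searching by another property, instead of reading an empty result as "no such item".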