
Feed Entity Suggester with ASCII equivalents of labels
Open, Needs Triage · Public

Description

The Wikidata Entity Suggester helps to identify items as values for claims at Wikidata, based on their labels. Unfortunately, this is difficult if the label contains special characters that are hard to enter because the keyboard does not provide direct input for them.

I therefore propose to suggest an item based on the ASCII equivalent of the label(s) as well.

Currently there is (approved) bot editing going on at Wikidata to add ASCII equivalents of labels as aliases (at least for the languages en and es). This is also justified by the difficulties with the Entity Suggester mentioned above. It appears to me that we are adding aliases here to compensate for a functional deficit, not because the entity is actually known under the ASCII equivalent of the label.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jul 16 2017, 9:18 PM
Emijrp added a subscriber: Emijrp. · Jul 17 2017, 8:04 AM

I leave this link here to a discussion about why a full version of this feature could be undesirable: https://www.wikidata.org/w/index.php?title=User_talk:Emijrp&oldid=522355148#Aliases_on_given_name

Restricted Application added a subscriber: PokestarFan. · Aug 1 2017, 10:34 PM

To be clear: this is not related to the property/entity suggester. The switch to ElasticSearch will fix this.

The switch to ElasticSearch will fix this.

If there are proper names/aliases, of course. Accent folding and such should probably take care of most Latin diacritics, but won't do transliterations and such.

You are welcome to test on http://elastic-wikidata.wmflabs.org/wb.html if it works better with ES and complain if it doesn't :)
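
To illustrate the distinction drawn above, here is a minimal Python sketch of accent folding using only the standard library. It is not the analyzer Elasticsearch or Wikidata actually uses; the function names `fold_accents` and `label_matches` are hypothetical, and the approach (canonical decomposition, then dropping combining marks) is just one simple way to fold diacritics.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD).

    Illustrative only: a real search analyzer may fold more characters.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def label_matches(query: str, label: str) -> bool:
    """Hypothetical prefix match on folded, case-insensitive forms."""
    return fold_accents(label).casefold().startswith(fold_accents(query).casefold())


print(fold_accents("José"))                 # Jose
print(label_matches("jose", "José Martí"))  # True
print(fold_accents("Москва"))               # unchanged: no transliteration happens
```

With this simple approach, letters that have no canonical decomposition (for example "Ł" or "ø") also stay as they are, and text in other scripts is not transliterated, so such labels would still need real aliases or a richer folding step.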

But this task is about that, right? Transliterations are something else, but handling diacritics is already a good step. :)

Restricted Application removed a subscriber: Liuxinyu970226. · Sep 17 2017, 2:11 PM
Izno reopened this task as Open. · Sep 17 2017, 4:46 PM