
', ˊ, ʹ, ʼ, or ˈ. I.e. who the heck came up with these?
Open, Needs Triage, Public

Description

Many of the sources used for importing Skolt Saami labels into Wikidata use the wrong characters, which makes much of the content useless to the community. Most of the affected labels seem to be place names.

So:

  • Every ' and ˊ needs to be turned into ʹ or ʼ, depending on which one is correct in that position.
  • Keep the version with the wrong characters, but deprecate it with the reason "error in referenced source or sources".
  • Mark the corrected version as preferred with the reason "error in referenced source has been fixed on Wikidata".

This task requires at least a basic knowledge of the current orthography of Skolt Saami.
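
As a rough aid for the first step, here is a minimal Python sketch that flags labels containing the two characters named above. The function name `find_suspect_chars` is hypothetical, and the sketch deliberately does not choose between ʹ and ʼ, since that decision needs knowledge of the orthography.

```python
import unicodedata

# The two characters the task description says need replacing.
SUSPECT_CHARS = {
    "\u0027",  # ' APOSTROPHE
    "\u02CA",  # ˊ MODIFIER LETTER ACUTE ACCENT
}

# The two replacement candidates named in the task description.
# Which one is right depends on the word, so this sketch never picks one.
TARGET_CHARS = (
    "\u02B9",  # ʹ MODIFIER LETTER PRIME
    "\u02BC",  # ʼ MODIFIER LETTER APOSTROPHE
)


def find_suspect_chars(label):
    """Return (index, character, Unicode name) for each suspect character."""
    return [
        (index, char, unicodedata.name(char))
        for index, char in enumerate(label)
        if char in SUSPECT_CHARS
    ]


if __name__ == "__main__":
    # "Späʹsseb" typed with an ASCII apostrophe instead of ʹ.
    for hit in find_suspect_chars("Sp\u00e4'sseb"):
        print(hit)
```

Anything this flags still has to be reviewed by someone who knows the current orthography.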

Event Timeline

Marking the existing statement as deprecated and adding a second, almost identical statement seems excessive to me. I would suggest simply correcting the characters and, if the statement has an external reference with the incorrect characters, adding "stated as" to the reference instead.

Also, I don't know how you're currently finding the things that need fixing, but I happened to have some queries open from trying to find monolingual text statements by language code that might be useful: This query lists all the sms values for monolingual text properties for the subset of predicates that have been used between 500,000 and 1 million times (for reference, this query lists all the monolingual text predicates and how many times they're used). By adjusting the range, it should be possible to check all of them except ps:P1476.
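
The linked queries are not reproduced here. As a loosely related sketch, and not the query from this comment, the following Python helper builds a SPARQL string that looks for `sms` values of one monolingual text predicate containing ' or ˊ. The helper name `build_sms_query`, the variable names, and the `LIMIT` default are assumptions made for illustration; the `ps:` prefix is assumed to be predefined, as it is on the Wikidata Query Service.

```python
import re

# ' APOSTROPHE and ˊ MODIFIER LETTER ACUTE ACCENT, the two characters to look for.
SUSPECT_CHARS = "'\u02CA"


def build_sms_query(property_id, limit=1000):
    """Build a SPARQL query for sms values of one predicate with suspect characters."""
    # Only accept plain property IDs such as "P1705" so nothing else ends up in the query.
    if not re.fullmatch(r"P[1-9][0-9]*", property_id):
        raise ValueError(f"not a property ID: {property_id!r}")
    return (
        "SELECT ?statement ?value WHERE {\n"
        f"  ?statement ps:{property_id} ?value .\n"
        '  FILTER(LANG(?value) = "sms")\n'
        f'  FILTER(REGEX(STR(?value), "[{SUSPECT_CHARS}]"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_sms_query("P1705"))
```

Running one predicate at a time keeps each query small, which is in the same spirit as splitting the predicates by usage range.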

> Marking the existing statement as deprecated and adding a second, almost identical statement seems excessive to me. I would suggest simply correcting the characters and, if the statement has an external reference with the incorrect characters, adding "stated as" to the reference instead.

I chose the excessive route so that the next time someone imports labels, they can't miss that the spellings in the source are not the correct ones. Ideally, whatever spelling people type would still show up in searches, since not everyone has a Skolt keyboard and people use whatever looks close enough. So I'm totally open to suggestions :)

> Also, I don't know how you're currently finding the things that need fixing, but I happened to have some queries open from trying to find monolingual text statements by language code that might be useful: This query lists all the sms values for monolingual text properties for the subset of predicates that have been used between 500,000 and 1 million times (for reference, this query lists all the monolingual text predicates and how many times they're used). By adjusting the range, it should be possible to check all of them except ps:P1476.

Mainly by accident, so that query is a goldmine! Spb! (=Späʹsseb! aka Thanks!)

I have created one item for the incorrect palatalization markers and one for the currently correct marker, so that these can be used as the reason for preferred rank and the reason for deprecation (along with "error in referenced source"). These have also been added to the properties themselves.

EDIT: These items do not reflect the correct apostrophe, which I should probably take into account too.
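
For reference, the deprecate-and-prefer scheme from the description can be sketched as plain data. This is only an illustration: the function name `plan_statements` and the dictionary keys are hypothetical and do not follow the real Wikibase JSON format, and the reasons are given as the label texts from the description, not as item IDs.

```python
REASON_DEPRECATED = "error in referenced source or sources"
REASON_PREFERRED = "error in referenced source has been fixed on Wikidata"


def plan_statements(source_value, corrected_value, language="sms"):
    """Describe the statement pair: source spelling deprecated, corrected one preferred."""
    if source_value == corrected_value:
        raise ValueError("nothing to correct: both spellings are identical")
    return [
        {
            "value": source_value,  # spelling as it appears in the source
            "language": language,
            "rank": "deprecated",
            "reason": REASON_DEPRECATED,
        },
        {
            "value": corrected_value,  # spelling after manual review
            "language": language,
            "rank": "preferred",
            "reason": REASON_PREFERRED,
        },
    ]


if __name__ == "__main__":
    # ASCII apostrophe in the source, ʹ (U+02B9) in the corrected value.
    for statement in plan_statements("Sp\u00e4'sseb", "Sp\u00e4\u02b9sseb"):
        print(statement)
```

The corrected value is always supplied by a reviewer here; nothing in the sketch guesses it.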