
Create a page to list all forms with a certain spelling
Open, Needs Triage, Public

Description

This is a proposal to create a special page that lists all lexemes containing forms with the same spelling (e.g. a page listing all lexemes that have the lemma or form "orange", in any language). This would provide a more Wiktionary-like view of the data: in Wiktionary, there is one page for each string of characters, which contains all the forms spelt the same. E.g. https://en.wiktionary.org/wiki/orange lists all forms spelt "orange" in various languages.

With such a special page, we could use Cognate to link Wiktionary pages to Wikidata and vice versa, making it possible to navigate between Wikidata and Wiktionary.

Each lexeme should link to the special page so that people can navigate from the lexeme to the list of its homographs. When a lexeme has multiple lemmas or forms, there should be one link for each unique spelling.
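To illustrate "one link for each unique spelling", here is a minimal sketch in Python. It assumes the lexeme is available as a dict shaped like the Wikibase lexeme JSON (`lemmas` keyed by language code, `forms` each carrying `representations`). The base URL, the page name "Special:ListLexemes" (only one of the names floated in this task) and both function names are hypothetical.

```python
from urllib.parse import quote

# Hypothetical base URL: the special page does not exist yet, and
# "Special:ListLexemes" is only one of the names suggested in this task.
SPECIAL_PAGE_BASE = "https://www.wikidata.org/wiki/Special:ListLexemes/"


def unique_spellings(lexeme):
    """Return every distinct lemma/form spelling, in first-seen order."""
    seen = {}  # dict keys keep insertion order and drop duplicates
    for lemma in lexeme.get("lemmas", {}).values():
        seen.setdefault(lemma["value"], None)
    for form in lexeme.get("forms", []):
        for representation in form.get("representations", {}).values():
            seen.setdefault(representation["value"], None)
    return list(seen)


def spelling_links(lexeme):
    """Build one (hypothetical) special-page URL per unique spelling."""
    return [SPECIAL_PAGE_BASE + quote(s, safe="") for s in unique_spellings(lexeme)]
```

A lexeme whose lemma and singular form are both "orange" and whose plural form is "oranges" would get two links, not three.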

It would be different from a search:

  • It would return things in a consistent order (e.g. by language then by part of speech)
  • It would pay attention to capitalisation and accents
  • It would not include partial matches

E.g. the search https://www.wikidata.org/w/index.php?search=the&ns146=1 finds "the", "thé" and "The quick brown fox jumps over the lazy dog", whereas the proposed page for "the" would list only exact matches.

There are two suggested ways to implement this:

  • As a special page (e.g. "Special:ListLexemes/orange")
  • As a namespace (e.g. "Homograph:orange")

Since the page would be entirely generated and is not expected to be user-editable, a special page seems most appropriate; it is not clear what benefit a dedicated namespace would have.

Event Timeline

Restricted Application added a subscriber: Aklapper.
iecetcwcpggwqpgciazwvzpfjpwomjxn renamed this task from Create special page to list all lexemes with a certain spelling to Create a page to list all forms with a certain spelling. May 29 2018, 12:01 PM
This comment was removed by iecetcwcpggwqpgciazwvzpfjpwomjxn.
Vvjjkkii renamed this task from Create a page to list all forms with a certain spelling to 8ecaaaaaaa. Jul 1 2018, 1:08 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
Mbch331 renamed this task from 8ecaaaaaaa to Create a page to list all forms with a certain spelling. Jul 1 2018, 6:56 AM
Mbch331 raised the priority of this task from High to Needs Triage.
Mbch331 updated the task description.
Mbch331 added a subscriber: Aklapper.

I've reorganised the description; it got messed up by the vandalism and was really difficult to read.

I removed the bit about the entity suggester since it's really unclear to me how that would work and it's something that could be added later once we actually have this. I think there's more chance of this being implemented if we don't make it too complicated. :)

Aklapper updated the task description.

@Micru: This task is not invalid ("when the problem is not a bug, or when it is a change that is outside the power of the component's developers") unless specific reasons are provided.

As the actual creator of this ticket, I would still like to see it implemented. As far as I can tell, nothing has changed which would make the suggestions in this ticket superfluous.

Since I realised I hadn't linked it yet, https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_data/Archive/2017/07 is the previous discussion which inspired me.

Just adding a link to Hauki here as it does part of the job:

https://hauki.toolforge.org/lex/en/orange

I agree that the Special page should be fully integrated in Wikidata, in order to allow for linking to the Wiktionary projects and bring the two projects closer together.