Create a searchable and shared glossary
Open, Needs Triage, Public

Description

The idea is to have a dictionary of the terms used in MediaWiki extensions. The goal is to create glossaries that expand on T150261: Publish best practices on MediaWiki.org concerning glossaries.

That system can be:

  • searchable, to find what a term means on various projects or see possible synonyms (linked)
  • translatable, for maximum adaptability
  • general and specific, to see all possible terms, or just terms about a specific topic.

A term has one to many definitions
A term has zero to many synonyms
A term has zero to many language equivalents (not translations, but terms matching the same definition in another language)
A definition is linked to one or more projects (extensions)

A term has zero to many translations
A synonym has zero to many translations or zero to many language equivalents
A definition has zero to many translations
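
The relations above can be sketched as a small data model. This is only an illustration in Python, not a proposed implementation: the class and field names (`Term`, `Definition`, `equivalents` and so on) are hypothetical, synonyms are kept as plain strings for brevity (so their own translations are not modelled), and the sample definition text and the pairing of "Sujet" with "Topic" are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Definition:
    text: str
    language: str        # language code of the definition text
    projects: list[str]  # one or more projects (extensions)
    translations: dict[str, str] = field(default_factory=dict)  # language code -> translated text

    def __post_init__(self):
        # "A definition is linked to one or more projects"
        if not self.projects:
            raise ValueError("a definition must be linked to at least one project")


@dataclass
class Term:
    label: str
    language: str
    definitions: list[Definition]  # one to many
    synonyms: list[str] = field(default_factory=list)                # zero to many
    equivalents: dict[str, list[str]] = field(default_factory=dict)  # language code -> equivalent terms
    translations: dict[str, str] = field(default_factory=dict)       # language code -> translated label

    def __post_init__(self):
        # "A term has one to many definitions"
        if not self.definitions:
            raise ValueError("a term needs at least one definition")


# Hypothetical sample entry; the definition text is a placeholder.
topic = Term(
    label="Topic",
    language="en",
    definitions=[Definition("Placeholder definition text", "en", projects=["Flow"])],
    equivalents={"fr": ["Sujet"]},
)
```

The two `__post_init__` checks enforce the "one to many" relations; every "zero to many" relation simply defaults to an empty container.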

A user may want to search for a term, a translation, a synonym or a group of terms linked to a project. For example, I want to see what "Topic" means, whether "Sujet" is a synonym of something, or all the definitions linked to Flow.
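
The searches described above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the dictionary layout, the function names `find_term` and `definitions_for_project`, and the sample entries (placeholder definitions, "Thread" as a synonym, "Sujet" as a French equivalent).

```python
# Hypothetical sample entries; labels, definitions and project links are placeholders.
GLOSSARY = [
    {
        "term": "Topic",
        "synonyms": ["Thread"],
        "equivalents": {"fr": ["Sujet"]},
        "definitions": [
            {"text": "Placeholder definition of a Flow topic", "projects": ["Flow"]},
        ],
    },
    {
        "term": "Message group",
        "synonyms": [],
        "equivalents": {},
        "definitions": [
            {"text": "Placeholder definition of a message group", "projects": ["Translate"]},
        ],
    },
]


def find_term(glossary, query):
    """Return entries whose term, synonym or language equivalent equals query (case-insensitive)."""
    wanted = query.casefold()
    hits = []
    for entry in glossary:
        names = [entry["term"], *entry["synonyms"]]
        for labels in entry["equivalents"].values():
            names.extend(labels)
        if any(name.casefold() == wanted for name in names):
            hits.append(entry)
    return hits


def definitions_for_project(glossary, project):
    """Return (term, definition text) pairs for every definition linked to project."""
    return [
        (entry["term"], definition["text"])
        for entry in glossary
        for definition in entry["definitions"]
        if project in definition["projects"]
    ]
```

With this layout, `find_term(GLOSSARY, "Sujet")` finds the "Topic" entry through its language equivalent, and `definitions_for_project(GLOSSARY, "Flow")` lists only the definitions linked to Flow.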

Event Timeline

Where and how should these terms be stored? As plain wikitext on the wiki, in a Scribunto data module, or in Wikidata?

A Scribunto implementation might look like this:

local glossary = {
  ['term'] = {
    -- A term has zero to many language equivalents
    ['en'] = {
      -- A term has one to many definitions
      [1] = { ['definition'] = 'Some definition', ['aliases'] = {'first synonym', 'second synonym'} }, -- A term has zero to many synonyms
      [2] = { ['definition'] = 'Some other definition', ['aliases'] = {'other first synonym', 'other second synonym'} },
    },
    ['eo'] = {
      -- no equivalent for the first definition in this language
      [2] = { ['definition'] = 'Iu difino', ['aliases'] = {'alia una sinonimo', 'alia dua sinonimo'} },
    },
  },
  ['term2'] = { -- another entry, filled with a structure similar to the previous one
  },
}

-- A Scribunto data module must return its table
return glossary

The advantage is that it is far more flexible regarding how the data is rendered. The drawback is that I have no idea how hard it would be to manage such a structure within the translation extension; I don't think the extension can handle it as is in its current state.
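
One hypothetical way to make such a nested structure tractable for a string-based translation tool is to flatten it into key/value pairs. The sketch below mirrors the Scribunto table in Python; the function name `flatten` and the `term/language/number/...` key scheme are assumptions made for illustration, not a description of how the translation extension works.

```python
def flatten(glossary):
    """Flatten term -> language -> definition number -> fields into flat 'a/b/c' keys."""
    flat = {}
    for term, languages in glossary.items():
        for lang, definitions in languages.items():
            for number, entry in definitions.items():
                prefix = f"{term}/{lang}/{number}"
                flat[f"{prefix}/definition"] = entry["definition"]
                # Aliases are numbered from 1, in the order they appear.
                for i, alias in enumerate(entry.get("aliases", []), start=1):
                    flat[f"{prefix}/alias/{i}"] = alias
    return flat


# Same sample data as the Scribunto sketch, restricted to the first term.
glossary = {
    "term": {
        "en": {
            1: {"definition": "Some definition",
                "aliases": ["first synonym", "second synonym"]},
            2: {"definition": "Some other definition",
                "aliases": ["other first synonym", "other second synonym"]},
        },
        "eo": {
            # no equivalent for the first definition in this language
            2: {"definition": "Iu difino",
                "aliases": ["alia una sinonimo", "alia dua sinonimo"]},
        },
    },
}

flat = flatten(glossary)
```

Each string then has a stable key such as `term/en/1/definition`, and a definition missing in one language (here `term/eo/1`) simply has no key.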

I'm not a developer, but I see that glossary as something separate where translators can find terms. A dictionary.

Sure, but if such an approach were adopted, this is not the form in which most users would consult the glossary; it just makes the data more structured, with all the pros and cons of such an approach. Just like Wikidata, in fact.

The glossary needs to contain both software and project terms.

  • The software term glossary should be usable from Translatewiki, from MediaWiki.org and from Meta (for technical help).
  • The project term glossary should be used mainly from Meta.

Note that all localization projects have implemented their own glossary service. See Mozilla, Drupal, WordPress, KDE, Transifex, Crowdin… We should not reinvent the wheel, but rather think about how to connect a glossary service to both TranslateWiki and the Wikimedia project galaxy.

One concrete task for someone would then be to check out one or more of those services and write a summary of how they work, whether they are open source, etc.

There is a task for glossary support in Translate: T52092: Implement glossary (terminology) support. Whatever is implemented, it should be easy to use in our translation extensions.

Uninteresting:

Interesting:

To summarize:

  • Terminator is the only dedicated glossary management service I’ve found, but is probably outdated.
  • Zanata is a full localization platform with an integrated glossary management service.
  • Transvision manages full strings (not only terms) but is interesting because it gets translations from several distinct sources.