Link adaptation - add link by auto-completion
Closed, Declined (Public)

Description

Migrated from: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4184

Context

Wikidata provides equivalences for Wikipedia links across languages (e.g., Cheese - Formaggio), so it is possible to guess when the user is typing the translation of a link and to suggest creating that link.

This will not be perfect, since the translated word may differ from the Wikidata label (requiring the user to modify the inserted word or ignore the suggestion altogether, depending on the case), but it will speed up the process in many cases. Word boundary detection may also cause problems (e.g., if the user types "a ", will we be able to suggest "a day in the life" if that is one of the suggestions from the source links?).

Narrative

As a user, I can get suggestions for creating links based on the source text, so that I can add links just by typing without extra selection.

Acceptance Criteria

  • Given a link in the source English text ("Cheese"), when the user types "for" in the Italian translation, a suggested text (in grey) is shown for "formaggio".
  • If the user accepts the suggestion, a link pointing to the corresponding article (based on Wikidata) is created.
  • Suggestions for insertion are shown below the current cursor position.
  • To avoid frequent false positives, suggestions may be shown only once at least 2-3 typed characters match.
  • A link from the source is not suggested if it is already present in the target (i.e., only the first "formaggio" will be linked; it would be annoying to show the suggestion every subsequent time the user writes "formaggio").
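
The criteria above could be sketched roughly as follows. This is only an illustration, not the ContentTranslation implementation: `SourceLink`, `suggest_link`, and `MIN_PREFIX_LENGTH` are hypothetical names, the Wikidata lookup is assumed to have already produced a target label for each source link, and the threshold of 3 is one reading of the "2-3 characters" criterion.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

# Assumed threshold, one reading of the "2-3 characters" criterion.
MIN_PREFIX_LENGTH = 3


@dataclass(frozen=True)
class SourceLink:
    source_title: str  # link text in the source article, e.g. "Cheese"
    target_label: str  # label assumed to come from Wikidata, e.g. "Formaggio"


def suggest_link(
    typed: str,
    source_links: Iterable[SourceLink],
    already_linked: Set[str],
) -> Optional[SourceLink]:
    """Return the first source link whose target label starts with `typed`."""
    if len(typed) < MIN_PREFIX_LENGTH:
        return None  # too short: would trigger frequent false positives
    prefix = typed.casefold()
    for link in source_links:
        if link.target_label in already_linked:
            continue  # only the first occurrence in the target gets a suggestion
        if link.target_label.casefold().startswith(prefix):
            return link
    return None


links = [SourceLink("Cheese", "Formaggio"), SourceLink("Milk", "Latte")]
suggestion = suggest_link("for", links, already_linked=set())
```

With these inputs, typing "for" yields the "Formaggio" link, while typing only "fo", or typing "for" after "Formaggio" has already been linked, yields no suggestion.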

A number of linguistic features are approximated with "quick fixes":

  • Word boundary detection (simple non-internationalized regex-based approach, e.g. /[A-Za-z_]/)
  • Word stem matching in link target set (match "word" prefixes)
  • Multi-word phrase matching (match on first "word" only)
  • Likely match detection (match the first n characters in the editor after the last "word boundary")
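
A minimal sketch of how these four quick fixes might fit together is shown below. The function names and the default `n=3` are assumptions made for illustration; only the character class `[A-Za-z_]` comes from the list above.

```python
import re
from typing import List

# Word boundary detection: simple, non-internationalized character class.
# Anything outside [A-Za-z_] (including accented letters) counts as a boundary.
WORD_AT_END = re.compile(r"[A-Za-z_]+\Z")
WORD_AT_START = re.compile(r"[A-Za-z_]+")


def current_fragment(editor_text: str) -> str:
    """Return the characters typed after the last word boundary."""
    match = WORD_AT_END.search(editor_text)
    return match.group(0) if match else ""


def first_word(label: str) -> str:
    """Multi-word phrase matching: only the first "word" of a label is used."""
    match = WORD_AT_START.match(label)
    return match.group(0) if match else ""


def likely_matches(editor_text: str, labels: List[str], n: int = 3) -> List[str]:
    """Return labels whose first word starts with the typed fragment.

    A match is only attempted once the fragment has at least `n` characters.
    """
    fragment = current_fragment(editor_text).lower()
    if len(fragment) < n:
        return []
    # Word stem matching: a plain prefix comparison against the first word.
    return [label for label in labels if first_word(label).lower().startswith(fragment)]


labels = ["Formaggio", "A day in the life"]
matches = likely_matches("Mi piace il for", labels)
```

The sketch also reproduces the boundary problem raised in the Context section: with `n=1`, the text "It was a" matches "A day in the life", but "It was a " (with a trailing space) leaves an empty fragment and matches nothing.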

Design details

Content-translation-designs.pdf.jpg (600×800 px, 57 KB)

Event Timeline

Amire80 raised the priority of this task from to Low. Feb 20 2015, 4:18 PM
Amire80 added a subscriber: Amire80.
Arrbee set Security to None.
Pginer-WMF added a subscriber: Pginer-WMF.

Other tickets cover simpler approaches to facilitate the addition of links present in the source paragraphs. In particular: