
[timebox 2h] Conceive way to display long lemma
Closed, Resolved | Public | 0 Estimated Story Points


Research task for T195367


  • research currently supported ways of dealing with (too) long words - working cross-browser and without losing valuable page content (cut words, ...)
  • amend T195367 according to the findings


Event Timeline

Jakob_WMDE set the point value for this task to 0. (Jul 10 2018, 1:07 PM)

I saw an impressive feat of word wrapping in the press release section while playing with the display width. Can we learn something there?
-> It actually uses soft hyphens (&shy;) in places where the language used allows for syllabification. These would need to be rendered into the HTML (on the server side) and would require a library that knows the correct spots.
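
A minimal sketch of that server-side step, in TypeScript. It assumes the break positions come from some hyphenation library (none is chosen here); insertSoftHyphens is a hypothetical helper name and the break positions in the usage example are purely illustrative.

```typescript
// U+00AD SOFT HYPHEN: invisible unless the browser breaks the line at it.
const SOFT_HYPHEN = "\u00AD";

// Inserts a soft hyphen before each given index (UTF-16 code unit offsets).
// Indices outside the word, at its edges, or duplicated are ignored.
function insertSoftHyphens(word: string, breakpoints: number[]): string {
  const sorted = [...new Set(breakpoints)]
    .filter((index) => index > 0 && index < word.length)
    .sort((a, b) => a - b);

  let result = "";
  let previous = 0;
  for (const index of sorted) {
    result += word.slice(previous, index) + SOFT_HYPHEN;
    previous = index;
  }
  return result + word.slice(previous);
}

// Usage: break positions would be supplied by the hyphenation library.
const hyphenated = insertSoftHyphens("hyphenation", [2, 6]);
console.log(hyphenated.split(SOFT_HYPHEN).join("|"));
```

The plain text is unchanged apart from the inserted characters, so stripping U+00AD restores the original value.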

Modern CSS offers automatic hyphenation (the hyphens property), using rules provided by the browser engine. As hyphenation rules are language-specific, this requires the language to be marked up correctly.

  • would work as-is for
    • items, properties (language marked-up on the container element .firstHeading)
    • lexeme sense glosses (language and directionality marked-up right on the very element)
  • requires changes to
    • the requirements of T195367/T199081 for lexeme lemmas
    • the way we render lexeme form representations

The semantic value of our mark-up would benefit from these changes regardless of whether the CSS change comes into effect.

There is a Less mixin in MediaWiki (limited threshold of originality, but there is precedent), and it is used sporadically. It offers cross-browser compatibility and some degree of backwards compatibility via word-break.

We could use it on

  • .wb-entitypage .wikibase-title-label -> T203242
  • .wb-lexemepage .lemma-widget_lemma-value -> T195367
  • .wb-lexemepage .representation-widget_representation-value -> T203240
  • .wb-lexemepage .wikibase-lexeme-sense-gloss-value -> T203241
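
To make the proposal concrete, the following TypeScript sketch prints the kind of plain CSS rule that enabling hyphenation on the selectors listed above could amount to. It is not the MediaWiki mixin: hyphenationRule is a hypothetical helper, and the choice of overflow-wrap as the fallback and of the two vendor prefixes is an assumption.

```typescript
// Builds one CSS rule that enables automatic hyphenation for the given selectors.
// The declarations are an assumption, not a copy of the MediaWiki Less mixin.
function hyphenationRule(selectors: string[]): string {
  const declarations = [
    "overflow-wrap: break-word;", // fallback where no hyphenation rules exist
    "-webkit-hyphens: auto;",
    "-ms-hyphens: auto;",
    "hyphens: auto;",
  ];
  const body = declarations.map((declaration) => `\t${declaration}`).join("\n");
  return `${selectors.join(",\n")} {\n${body}\n}`;
}

// Usage with the selectors named in this task.
console.log(
  hyphenationRule([
    ".wb-entitypage .wikibase-title-label",
    ".wb-lexemepage .lemma-widget_lemma-value",
    ".wb-lexemepage .representation-widget_representation-value",
    ".wb-lexemepage .wikibase-lexeme-sense-gloss-value",
  ]),
);
```

Hyphenation only takes effect where the element, or one of its ancestors, carries a lang attribute the browser has rules for.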


Bildschirmfoto von 2018-08-30 10-26-41.png (965×1 px, 154 KB)
Bildschirmfoto von 2018-08-30 10-26-19.png (965×1 px, 118 KB)


Bildschirmfoto von 2018-08-30 10-27-00.png (965×1 px, 158 KB)
Bildschirmfoto von 2018-08-30 10-27-47.png (965×1 px, 125 KB)

As the range of lexeme term languages includes options that cannot be known to the browser engines providing the hyphenation rules (think "mis" or "en-x-Q123"), there will be cases where this yields unsatisfactory results (e.g. falling back to English rules). Additionally, differences between browser engines might produce results that are perceived as unsatisfactory but which we cannot control. Both could be overcome by using custom hyphenation rules and applying them on render, yet this would constitute a substantial challenge to cover what can be assumed to be a limited number of cases.


  • inconsistencies in browser implementation
  • lack of support for the mis language code
  • possible effort for implementing a special language lookup (getting the value of LexemeLanguageCodePropertyId from Q123 in the case of en-x-Q123). Can we trust browsers to use the prefix only? FWIW, my Firefox breaks "en" differently from "fr", but "fr" just like "fr-x-Q123".
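
The lookup concern in the last bullet can be sketched as follows, in TypeScript. This is only an illustration: resolveHyphenationLang and the lookupItemLanguage callback are hypothetical names, the callback stands in for reading LexemeLanguageCodePropertyId from the referenced item, and treating mis, mul, und and zxx as codes without hyphenation rules is an assumption.

```typescript
// ISO 639 special-purpose codes; assumed here to never have browser hyphenation rules.
const CODES_WITHOUT_RULES = new Set(["mis", "mul", "und", "zxx"]);

// Hypothetical callback: returns the language code stored on an item, or null.
type ItemLanguageLookup = (itemId: string) => string | null;

// Returns a language code to use for hyphenation, or null if none can be determined.
function resolveHyphenationLang(
  termLanguage: string,
  lookupItemLanguage: ItemLanguageLookup = () => null,
): string | null {
  const subtags = termLanguage.split("-");
  const primary = (subtags[0] ?? "").toLowerCase();

  // Usable prefix: reduce to the primary subtag ("en-x-Q123" -> "en").
  if (primary !== "" && primary !== "x" && !CODES_WITHOUT_RULES.has(primary)) {
    return primary;
  }

  // Otherwise look for an item ID in the private-use part ("mis-x-Q123" -> "Q123").
  const privateUseIndex = subtags.findIndex((subtag) => subtag.toLowerCase() === "x");
  const itemId = privateUseIndex >= 0 ? subtags[privateUseIndex + 1] : undefined;
  if (itemId !== undefined && /^Q\d+$/.test(itemId)) {
    return lookupItemLanguage(itemId);
  }
  return null;
}

// Usage: the lookup table below is made up for the example.
const exampleLookup: ItemLanguageLookup = (itemId) => (itemId === "Q123" ? "de" : null);
console.log(resolveHyphenationLang("en-x-Q123"));
console.log(resolveHyphenationLang("mis-x-Q123", exampleLookup));
```

Reducing to the primary subtag mirrors the "prefix only" behaviour observed in Firefox; whether other engines do the same would still need checking.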

@Lydia_Pintscher Would you please give this a read, check whether this could work for you, and let us know if you have suggestions for other people (e.g. UX) we should maybe involve before going with it?

Yes this looks good to me. Thanks for the research!