Don't capitalize Wikidata descriptions when viewing article
Open, Stalled, Low · Public

Description

I've noticed that, in the beta version, the Wikidata description at the top of the page starts with an upper-case letter, even though the description on Wikidata starts with a lower-case one. This is rather ugly in some languages. Can this be un-forced?

Restricted Application added a subscriber: Aklapper. · Mar 26 2016, 11:14 PM
Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. · Mar 28 2016, 3:51 PM

Sure it can. Any links to actual pages where we can see this?

Well, if you use the random function on the Dutch Wikipedia you can see a lot of examples. Anyway, I've used the nearby function to find https://nl.m.wikipedia.org/wiki/Peerd_van_Ome_Loeks. It displays normally (lowercase) inside the nearby feature. The page itself shows a forced capitalised version, which is rather ugly.

jhobs triaged this task as Low priority. · Apr 26 2016, 5:12 PM
jhobs changed the task status from Open to Stalled.

Doesn't seem to be a CSS added feature.

@Lydia_Pintscher and @Nirzar, what's the expectation here? I think in the apps there's capitalization of the description field (CC @JMinor and @Dbrant in case they can add context; CC @Deskana for history on this, too).

Doesn't seem to be a CSS added feature.

I see this in the source code:

.heading-holder .tagline:first-letter {
    text-transform: capitalize;
}

@dr0ptp4kt The history here is that this was done because it worked better and looked more correct in English. In all honesty, we neglected to consider how this would look in other languages. In retrospect, it's obvious that some languages have different rules and conventions on when capitalisation is correct, which means our initial approach is suboptimal.

@Lydia_Pintscher and @Nirzar, what's the expectation here? I think in the apps there's capitalization of the description field (CC @JMinor and @Dbrant in case they can add context; CC @Deskana for history on this, too).

Yeah, to me this looks like a case where user input should not be messed with. I suggest just using what comes from Wikidata.

@Dbrant, @JMinor, @Nirzar, @JKatzWMF, thoughts on consistently just keeping the user-provided casing?

What user value do you propose this change adds? So far I see mostly hyperbole in this task (e.g. "rather ugly in some languages"), rather than a solid rationale for work to be undertaken. We can do better than that.

Incidentally, this afternoon I asked a native Dutch speaker whether this looked wrong in Dutch, showing him the specific example cited in the task, and he said that it looked fine.

Doesn't seem to be a CSS added feature.

I see this in the source code:

.heading-holder .tagline:first-letter {
    text-transform: capitalize;
}

I was checking in the browser debugger tools and I missed it! Thanks

CCing some language folks, they probably have good insight regarding this matter.

@Dbrant, @JMinor, @Nirzar, @JKatzWMF, thoughts on consistently just keeping the user-provided casing?

What user value do you propose this change adds? So far I see mostly hyperbole in this task (e.g. "rather ugly in some languages"), rather than a solid rationale for work to be undertaken. We can do better than that.

I think the same can be said about capitalizing it in the first place ;-)

I agree with @Lydia_Pintscher and @Sjoerddebruin. Just show whatever Wikidata has.

What user value do you propose this change adds? So far I see mostly hyperbole in this task (e.g. "rather ugly in some languages"), rather than a solid rationale for work to be undertaken. We can do better than that.

I think the same can be said about capitalizing it in the first place ;-)

Fair point, so allow me to explain the rationale. This change was made because having a lower case description created a capitalisation inconsistency with the article layout; the article title and first character of the article were capitalised, but the description was not. It looked odd, and created an inconsistent scan line.

Yes, that's more or less what I understood from your previous explanation.

I'd rather let the communities decide about this rather than force a software solution.

The problem with "just let the contributor decide" is that, in this case, the editors don't seem to have a consistent policy or knowledge of how these will be used. They edit one description at a time, but this inconsistency only becomes problematic in contexts where many descriptions are displayed together along with titles, etc.

To me this is much like page titles, which can be special-cased when needed, but where there is a default which is software-enforced. By saying "just display inconsistent formatting if that's what editors write" we are shifting some of the cognitive work of parsing and reading the descriptions to our readers. We could, by policy and community engagement I suppose, ask editors to take that work back and use "Wikipedia case" for all descriptions (or whatever the consensus ends up being), or we could use a software solution, in languages where it makes sense, to reduce the burden on both editors and readers.

For an example of why inconsistent casing makes parsing a list of items more effort, compare:
https://en.m.wikipedia.org/wiki/Special:Nearby
which uses the editors' casing and mixes strings like "hospital" with "One of the 50 hills of San Francisco", with either Android's or iOS's search presentation, which normalizes case. I don't have eye-tracking studies or dwell-time data to prove it, but I feel strongly that the latter is easier to read, more visually pleasing, and makes the use of the descriptions look much more intentional.

JKatzWMF added a comment. · Edited · Apr 27 2016, 5:19 PM

If we leave it alone, some will be capitalized and some will not, and there is a cost to inconsistency for the reader. This is a cost we bear throughout the projects, but the top of the page is where the user first lands, orients themselves, and begins digesting information. The cost of having inconsistency here is much higher than anywhere else. This is one reason why lead images are either on or off and not opt-in (if there is an eligible photo). For this reason, this is one place where I think order is more important than individual preferences. I also do not like forcing a software solution, but I would like to keep it as is until a community standard evolves (the way it has with titles).

Um, what @JMinor said :)

The community standard is very clearly defined: https://www.wikidata.org/wiki/Help:Description

The standard for entry of the descriptions is quite well defined indeed. That said, rigorously applying something that's quite literally a guideline outside its original context (i.e. display of the descriptions within Wikidata) seems unwise to me.

@Lydia_Pintscher sorry for my ignorance of Wikidata norms. I knew there were standards for good descriptions, but took from this thread that capitalization was not covered.

Lest I get back on my hobby horse of auto-generated descriptions...
It seems like what we have here is a conflict of use cases. According to the official guidelines, the purpose of the description is to "disambiguate items with the same or similar labels," which is subtly but crucially different from our use case of "a one-line summary of the subject." I fear that if we're not aligned on these use cases at these early stages, then we'll have deeper issues than capitalization later on.

Given the rationale:

Fair point, so allow me to explain the rationale. This change was made because having a lower case description created a capitalisation inconsistency with the article layout; the article title and first character of the article were capitalised, but the description was not. It looked odd, and created an inconsistent scan line.

I have a question for/need help from language folks (ping @Amire80). Is there any language we can think of where keeping the capitalization will break reading comprehension? Here's an example where you can edit the HTML with other languages and test it with native speakers: http://jsbin.com/xexeqix/edit?html,output

I'm trying to understand if this is really a blocker for showing the descriptions on stable.

I'd also like the perspective/rationale from Design (ping @Nirzar). Design direction and consistency is very important for providing a useful reading experience, given we don't break other important things like language comprehension.

the first letter being lowercase gives a sense of incompleteness in a sentence < like this.

It's difficult to quantify or rationalize this, but sentence case gives a better sense of human intervention. In branding, on the other hand, companies sometimes use all lowercase to suggest the "casualness" of the company. That's why Facebook's "f" is lowercase. Sentence case is the standard in English prose, and our communication also follows it throughout the product.

thoughts on consistently just keeping the user-provided casing?

I strongly believe we should use "Sentence case" descriptions. But the bigger problem is that CSS doesn't have sentence case as an option. It has capitalize, which capitalises the first letter of every word. That's just title case.

Overall this is an obvious choice. The Communications department within WMF also uses sentence case.

As far as I know, letter case doesn't exist in other scripts such as Devanagari (used for Hindi and Marathi) and, if I am not wrong, Hebrew, according to Wikipedia.

Case exists only in Latin, Cyrillic, Greek and Armenian. Also, it is used differently in different Latin-based languages, although sentence case is pretty universal.
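A quick way to check whether capitalisation can affect a given description at all is to test whether its first character changes under case conversion. The following is a minimal Python sketch, not anything from the MobileFrontend code; the function name and the sample words are assumptions chosen for illustration, and the result depends on the Unicode case mappings shipped with the Python version in use.

```python
def first_letter_has_case(text: str) -> bool:
    """Return True if the first character differs between upper() and lower()."""
    if not text:
        return False
    first = text[0]
    return first.upper() != first.lower()


# Hypothetical sample words ("village") in scripts mentioned in this thread.
samples = {
    "Latin": "village",
    "Cyrillic": "деревня",
    "Devanagari": "गाँव",
    "Hebrew": "כפר",
}

for script, word in samples.items():
    print(script, first_letter_has_case(word))
```

For descriptions where this returns False, forcing the first letter to upper case is a no-op, so the display question only arises for cased scripts.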

@Nirzar Just a note that the current CSS implementation applies text-transform: capitalize only through the :first-letter pseudo-element, so it capitalises just the first letter of the description. That is proper sentence casing, not title casing.
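To illustrate the difference outside CSS, here is a minimal Python sketch. The function names are hypothetical and this is not how MobileFrontend does it; it only mimics capitalising the first letter of every word versus capitalising the first letter of the whole string.

```python
def title_case(text: str) -> str:
    """Uppercase the first character of every space-separated word."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def sentence_case(text: str) -> str:
    """Uppercase only the first character and leave the rest untouched."""
    return text[:1].upper() + text[1:]


description = "village in Kafkanistan"
print(title_case(description))     # every word starts with a capital
print(sentence_case(description))  # only the first word does
```

Python's built-in `str.capitalize()` would not be a substitute here, because it also lowercases the remaining characters and would turn "Kafkanistan" into "kafkanistan".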

@Jhernandez ooo that's excellent. My CSS knowledge is fading :(

CONVERT TO ALL CAPS .. REALLY THE BEST ;)

Jdlrobson moved this task from Backlog to Tracking on the MinervaNeue board. · Aug 16 2017, 8:49 PM

I wonder if this leads people to write descriptions starting with caps when doing mobile edits: https://www.wikidata.org/w/index.php?title=Special:RecentChanges&tagfilter=mobile+edit

@Esc3300 I believe mobile edit and mobile app edit are edits from the native Android/iOS apps, and this task is about mobile web, which doesn't have Wikidata description editing functionality and hasn't even been rolled out; it is only in beta.

If this is the case, mobile phones have autocorrect enabled by default, so people typing edits on their phones will be submitting sentence-cased descriptions because the operating systems are correcting the text to be that way. So that would probably be the reason why most descriptions from mobile apps are capitalized.

RHo added a comment. · Aug 22 2017, 12:44 PM

hi @Jhernandez @Esc3300 - actually the keyboard has been made to default to lowercase when adding/editing Wikidata descriptions specifically to reduce the incidence of incorrect capitalization. In addition we have included a point in the help text explaining not to capitalize unless the first word is a proper noun.

Currently the ability to edit Wikidata descriptions is in the Android app only, and has now been slowly rolled out to all languages except English, with edits being monitored for quality.

Kaartic removed a subscriber: Kaartic. · Aug 22 2017, 1:09 PM

The mobile edit tag also applies if the wikidata.org mobile domain is being used. And yes @Esc3300, autocapitalisation does happen on a mobile device. I've experienced it first hand, e.g. at https://m.wikidata.org/wiki/Special:SetLabelDescriptionAliases/Q31887667/en. Anyway, this is getting a little off topic... :)

It feels like this is a won't fix, in that I don't see any way to resolve this such that all descriptions are consistent and uppercase without enforcing capitalisation in a Wikidata validation layer.

It seems a bit odd that people view a description "Village in Kafkanistan" and then are expected to type "village in Kafkanistan" for the item of the neighboring village, or get reverted when they change "village in Kafkanistan" to "Village in Kafkanistan".

To summarise the problems I'm hearing here from all sides:

  1. editors expect the displayed case of descriptions to match what they see when editing them, and a mismatch can cause confusion
  2. descriptions are inconsistent despite guidelines; Wikipedia clients want to be consistent in how they display them
  3. from a design perspective it makes sense to render sentence case in this context, as otherwise it will look like an incomplete sentence.

I still see this as an editing problem. When I edit a Wikidata description, the editor should guide me not to use sentence case if that is indeed a policy, or it should remove any leading uppercase letter. That solves 1 and 2. I liken this problem to code linting. Some developers like to use tabs and some like spaces. The only way you can make consistency happen is by rejecting input when the rules are broken, and so enforcing them.
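As a rough illustration of the linting idea, here is a minimal Python sketch of an edit-time check. The function name and message are hypothetical, and this is not an existing Wikidata validation API. It only warns rather than rewriting the text, on the assumption that software cannot reliably tell a proper noun from an ordinary first word.

```python
from typing import Optional


def lint_description(description: str) -> Optional[str]:
    """Return a warning if the description starts with an uppercase letter."""
    first = description[:1]
    if first and first.isupper():
        return (
            f'Description "{description}" starts with an uppercase letter; '
            "use lowercase unless the first word is a proper noun."
        )
    return None


for candidate in ["Village in Kafkanistan", "village in Kafkanistan"]:
    print(candidate, "->", lint_description(candidate))
```

A stricter variant could block the save instead of warning, at the cost of false positives on descriptions that legitimately begin with a proper noun.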

Wikidata is a data store. Just as we wouldn't expect clients to have to render dates as mm/dd/yy, we shouldn't expect them to have to use a particular case. We should be caring about the content, not how it's used. I think #3 is up to the client. Rather than say it's wrong, it would be helpful to point out examples where it doesn't work. Right now these seem to be hypothetical and/or rare.

I'm not sure about (2.): obviously there are descriptions with caps, possibly due to Android auto-completion, but the bulk of descriptions at Wikidata are bot generated and are unlikely to have incorrect caps.

Thanks for all the info @RHo!

My guesses definitely don't apply to the Android app; they do apply to edits via mobile web on wikidata.org.

Elitre added a subscriber: Elitre. · Sep 5 2017, 2:02 PM