I've noticed that, in the beta version, the Wikidata description at the top of the page starts with an upper-case letter, even though the description on Wikidata starts with a lower-case one. This is rather ugly in some languages; can this be un-forced?
Well, if you use the random function on the Dutch Wikipedia you can see a lot of examples. Anyway, I've used the nearby function to find https://nl.m.wikipedia.org/wiki/Peerd_van_Ome_Loeks. It displays normally (lowercase) inside the nearby feature. The page itself shows a forced capitalised version, which is rather ugly.
@dr0ptp4kt The history here is that this was done because it worked better and looked more correct in English. In honesty, we neglected to consider how this would look in other languages. In retrospect, it's obvious that some languages have different rules and conventions on when capitalisation is correct which would mean our initial approach is suboptimal.
What user value do you propose this change adds? So far I see mostly hyperbole in this task (e.g. "rather ugly in some languages"), rather than a solid rationale for work to be undertaken. We can do better than that.
Fair point, so allow me to explain the rationale. This change was made because having a lower case description created a capitalisation inconsistency with the article layout; the article title and first character of the article were capitalised, but the description was not. It looked odd, and created an inconsistent scan line.
The problem with "just let the contributor decide" is that, in this case, the editors don't seem to have a consistent policy or knowledge of how these will be used. They edit one description at a time, but this inconsistency only becomes problematic in contexts where many descriptions are displayed together along with titles, etc.
To me this is much like page titles, which can be special-cased when needed, but where there is a default which is software-enforced. By saying "just display inconsistent formatting if that's what editors write" we are shifting some of the cognitive work of parsing and reading the descriptions onto our readers. We could, by policy and community engagement I suppose, ask editors to take that work back and use "Wikipedia case" for all descriptions (or whatever the consensus ends up being), or we could use a software solution, in languages where it makes sense, to reduce the burden on both editors and readers.
For an example of why casing makes parsing a list of items more effort, compare:
Which uses the editors' casing and mixes strings like "hospital" with "One of the 50 hills of San Francisco", vs. either Android's or iOS's search presentation, which normalizes for case. I don't have eye-tracking studies or dwell times to prove it, but I feel strongly that the latter is easier to read, visually more pleasing, and makes the use of the descriptions look much more intentional.
If we leave it alone, some will be capitalized and some will not, and there is a cost to inconsistency for the reader. This is a cost we bear throughout the projects, but the top of the page is where the user first lands, orients themselves, and begins digesting information. The cost of having inconsistency here is much higher than anywhere else. This is one reason why lead images are either on or off and not opt-in (if there is an eligible photo). For this reason, this is one place where I think order is more important than individual preferences. I also do not like forcing a software solution, but I would like to keep it as is until a community standard evolves (the way it has with titles).
The standard for entry of the descriptions is quite well defined indeed. That said, rigorously applying something that's quite literally a guideline outside its original context (i.e. display of the descriptions within Wikidata) seems unwise to me.
Lest I get back on my hobby horse of auto-generated descriptions...
It seems like what we have here is a conflict of use cases. According to the official guidelines, the purpose of the description is to "disambiguate items with the same or similar labels," which is subtly but crucially different from our use case of "a one-line summary of the subject." I fear that if we're not aligned on these use cases at these early stages, then we'll have deeper issues than capitalization later on.
Given the rationale:
I have a question for/need help from language folks (ping @Amire80). Is there any language we can think of where forcing the capitalization will break reading comprehension? Here's an example where you can edit the HTML with other languages and test it with native speakers: http://jsbin.com/xexeqix/edit?html,output
I'm trying to understand if this is really a blocker for showing the descriptions on stable.
I'd also like the perspective/rationale from Design (ping @Nirzar). Design direction and consistency are very important for providing a useful reading experience, provided we don't break other important things like language comprehension.
the first letter being lowercase gives a sense of incompleteness in a sentence < like this.
It's difficult to quantify or rationalize this, but sentence case conveys a stronger sense of human intervention. In branding, by contrast, companies sometimes use all lowercase to suggest the "casualness" of the company; that's why Facebook's F is lowercase. Sentence case is the standard in English prose, and our communication also follows it throughout the product.
Thoughts on consistently just keeping the user-provided casing?
I strongly believe we should use "Sentence case" descriptions. But the bigger problem is that CSS text-transform doesn't have sentence case as an option. It has capitalize, which capitalises the first letter of every word. That's just title case.
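To make the distinction concrete, here is a minimal sketch (in Python, purely for illustration; it is not code from any of the clients discussed here) of sentence case versus what a capitalize-every-word transform produces. The helper name sentence_case is made up for this example, and it works on the first code point only, which is an assumption that may not hold for every script.

```python
def sentence_case(text: str) -> str:
    """Upper-case only the first character and leave the rest untouched."""
    if not text:
        return text
    return text[0].upper() + text[1:]


description = "village in Kafkanistan"

# Sentence case: only the first letter changes.
sentence = sentence_case(description)

# Roughly what a "capitalize every word" transform gives: title case.
title = " ".join(word[:1].upper() + word[1:] for word in description.split(" "))

# Python's built-in str.capitalize() is not a substitute: it also
# lower-cases the rest of the string, which damages proper nouns.
naive = description.capitalize()

print(sentence)  # Village in Kafkanistan
print(title)     # Village In Kafkanistan
print(naive)     # Village in kafkanistan
```

The point of the third variant is that any sentence-case helper has to leave everything after the first letter alone, otherwise proper nouns inside the description get lower-cased.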
Overall this is an obvious choice. The Communications department within WMF also uses sentence case.
As far as I know, letter case doesn't exist in other scripts such as Devanagari (for Hindi, Marathi) and, if I am not wrong, Hebrew, according to Wikipedia.
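A quick way to check this, sketched in Python: upper-casing the first character of a Devanagari or Hebrew string leaves it unchanged, so a forced first-letter capital only has a visible effect in cased scripts. The sample descriptions below are invented for illustration and are not taken from Wikidata.

```python
def first_letter_upper(text: str) -> str:
    """Upper-case the first character only; a no-op for caseless scripts."""
    return text[:1].upper() + text[1:]


# Hypothetical sample descriptions, keyed by language code.
samples = {
    "en": "village in Kafkanistan",
    "hi": "भारत का एक गाँव",  # Devanagari script
    "he": "כפר בישראל",  # Hebrew script
}

# True where forcing a capital actually alters the string.
changed = {lang: first_letter_upper(text) != text for lang, text in samples.items()}
print(changed)
```

This only shows that caseless scripts are unaffected; it says nothing about cased languages whose conventions differ from English, which is the situation the Dutch example in this task is about.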
@Esc3300 I believe "mobile edit" and "mobile app edit" are edits from the native Android/iOS apps, and this task is about mobile web, which doesn't have Wikidata description editing functionality, and the feature isn't even rolled out; it is only on beta.
If this is the case, mobile phones have autocorrect enabled by default, so people typing edits on their phones will be submitting sentence-cased descriptions because the operating systems are correcting the text to be that way. That would probably be the reason why most descriptions from mobile apps are capitalized.
hi @Jhernandez @Esc3300 - actually the keyboard has been made to default to lowercase when adding/editing Wikidata descriptions specifically to reduce the incidence of incorrect capitalization. In addition we have included a point in the help text explaining not to capitalize unless the first word is a proper noun.
Currently the ability to edit Wikidata descriptions is in the Android app only, and has now been slowly rolled out to all languages except English, with edits being monitored for quality.
"mobile edit" also applies if the wikidata.org mobile domain is being used. And yes @Esc3300, autocapitalisation does happen on a mobile device. I've experienced it first-hand. E.g. https://m.wikidata.org/wiki/Special:SetLabelDescriptionAliases/Q31887667/en Anyway, this is getting a little off topic... :)
It feels like this is a "won't fix", in that I don't see any way to resolve this such that all descriptions are consistent and uppercase without enforcing capitalisation in a Wikidata validation layer.
It seems a bit odd that people view a description "Village in Kafkanistan" and then are expected to type "village in Kafkanistan" for the item of the neighboring village,
or get reverted when they change "village in Kafkanistan" to "Village in Kafkanistan".
To summarise the problems I'm hearing here from all sides:
1. Editors expect the case of a displayed description to match what they see when they edit it, and a mismatch can cause confusion when editing.
2. Descriptions are inconsistent despite guidelines; Wikipedia clients want to be consistent in how they display them.
3. From a design perspective it makes sense to render sentence case in this context, as otherwise it will look like an incomplete sentence.
I still see this as an editing problem. When editing a Wikidata description, the editor should guide me not to use sentence case, if that is indeed a policy, or it should remove any leading uppercase letter. That solves 1 and 2. I liken this problem to code linting. Some developers like to use tabs and some like spaces. The only way you can make consistency happen is by rejecting input that breaks the rules and enforcing them.
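As a rough sketch of what such a lint step might look like (Python for illustration only; lint_description and lower_first are hypothetical names, not part of Wikidata or any client), one option warns the editor and the other bluntly rewrites the input:

```python
from typing import Optional


def lint_description(description: str) -> Optional[str]:
    """Return a warning if the description starts with an upper-case letter."""
    stripped = description.lstrip()
    if stripped and stripped[0].isupper():
        return (
            "Description starts with an upper-case letter; "
            "use lower case unless the first word is a proper noun."
        )
    return None


def lower_first(description: str) -> str:
    """Blunt alternative: lower-case the first character unconditionally."""
    return description[:1].lower() + description[1:]


print(lint_description("Village in Kafkanistan"))  # prints the warning
print(lint_description("village in Kafkanistan"))  # None
print(lower_first("Dutch painter"))                # dutch painter
```

The last line shows the catch with automatic removal: software cannot tell a proper noun from a stray capital, so a warning that leaves the decision to the editor is probably the safer of the two.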
Wikidata is a data store. Just as we wouldn't expect clients to have to render dates as mm/dd/yy, we shouldn't expect them to have to use a particular case. We should be caring about the content, not how it's used. I think #3 is up to the client. Rather than saying it's wrong, it would be helpful to point out examples where it doesn't work. Right now these seem to be hypothetical and/or rare.
I'm not sure about (2.): obviously there are descriptions with caps, possibly due to Android auto-completion, but the bulk of descriptions at Wikidata are bot generated and are unlikely to have incorrect caps.