Plural in pt (pt-PT): 0 must be plural (use CLDR's pt_PT rule)
Open, Medium, Public

Description

Plural in pt and pt-BR should be:

  • 1 = single,
  • everything else = plural.

In particular, 0 should be plural. There have been various requests to fix it for pt and pt-BR in the past.

Despite the above fixes, the problem exists again.

Event Timeline

Nemo_bis triaged this task as Medium priority. Edited Nov 18 2016, 7:11 AM
Nemo_bis added a subscriber: Nemo_bis.

Please report in CLDR instead and provide sources for your claim. There are already reports for the matter:

The current rule for pt-br is http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html#pt , while pt_PT (i.e. our pt) is http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html#pt_PT

While the differing opinions on what rules to use where should be settled in CLDR, MediaWiki core should learn to use the CLDR's "pt" rules for our pt-br and the CLDR's "pt_PT" for our pt.

OK, so I'll leave pt-BR for the Brazilians to address.

For European Portuguese ("pt" in MediaWiki, "pt-PT" in CLDR), the rule stated in CLDR at:

http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html#pt_PT

is:

cardinal
one	1
other   0, 2~16, 100, 1000, 10000, 100000, 1000000, …
        0.0~1.5, 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, …

This confirms my statement: 1 = single, everything else = plural; in particular, 0 is plural.

I'm looking at the following file for the implementation of that definition in MediaWiki (is this the correct place?):

https://phabricator.wikimedia.org/diffusion/MW/browse/master/languages/data/plurals.xml

it states:

<pluralRules locales="pt">
    <pluralRule count="one">n = 0..2 and n != 2 @integer 0, 1 @decimal 0.0, 1.0, 0.00, 1.00, 0.000, 1.000, 0.0000, 1.0000</pluralRule>
    <pluralRule count="other"> @integer 2~17, 100, 1000, 10000, 100000, 1000000, … @decimal 0.1~0.9, 1.1~1.7, 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, …</pluralRule>
</pluralRules>

This makes 0 single instead of plural, does it not?

pt should just be added to this group:

<pluralRules locales="ast ca de en et fi fy gl it ji nl sv sw ur yi">
    <pluralRule count="one">i = 1 and v = 0 @integer 1</pluralRule>
    <pluralRule count="other"> @integer 0, 2~16, 100, 1000, 10000, 100000, 1000000, … @decimal 0.0~1.5, 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, …</pluralRule>
</pluralRules>

The rule is exactly the same as English:

-infinity files
-1.000000000000000000000000000x files
-1 file
-0.999999999999999999999999999x files
+0.999999999999999999999999999x files
+1 file
+1.000000000000000000000000000x files
+infinity files
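
The difference between the two rule sets quoted above can be checked with a small sketch. It is not MediaWiki's evaluator: the function names are hypothetical, the two conditions are copied from the quoted XML, and `n = 0..2` is read as "n is an integer in the range 0 to 2", following CLDR plural rule syntax. The operands are `n` (absolute value), `i` (integer digits) and `v` (number of visible fraction digits).

```python
from decimal import Decimal


def operands(number: str):
    """Return the CLDR plural operands (n, i, v) for a decimal string."""
    text = number.lstrip("+-")            # plural rules look at the absolute value
    n = Decimal(text)
    integer_part, _, fraction = text.partition(".")
    i = int(integer_part)
    v = len(fraction)                     # visible fraction digits, trailing zeros included
    return n, i, v


def mediawiki_pt_rule(number: str) -> str:
    """Condition quoted for locales="pt": n = 0..2 and n != 2."""
    n, _, _ = operands(number)
    is_integer_value = n == n.to_integral_value()
    return "one" if is_integer_value and 0 <= n <= 2 and n != 2 else "other"


def cldr_pt_pt_rule(number: str) -> str:
    """Condition quoted for the English group: i = 1 and v = 0."""
    _, i, v = operands(number)
    return "one" if i == 1 and v == 0 else "other"


if __name__ == "__main__":
    for sample in ["0", "1", "2", "0.5", "1.0", "-1"]:
        print(sample, mediawiki_pt_rule(sample), cldr_pt_pt_rule(sample))
```

Under these assumptions the two rules agree on 1 and 2 and disagree on 0, which is the case this task is about. They also disagree on "1.0", because the English-group rule requires no visible fraction digits.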

Ok, if your concern is only with pt-PT this report becomes much easier.

Nemo_bis renamed this task from Plural in pt and pt-BR: 0 must be plural to Plural in pt (pt-pt): 0 must be plural (use CLDR's pt_PT rule). Nov 20 2016, 8:57 AM
Nemo_bis removed a project: Upstream.

As far as I know, "0" is still singular in pt-BR
https://translatewiki.net/wiki/Thread:Support/PLURAL_for_pt-br
http://unicode.org/cldr/trac/ticket/2746#comment:5
T25707: Wrong rule for {{plural:}} of zero in pt-BR: zero has to be plural

A search for "zero" in "Corpus Brasileiro v. 2.3"¹ gives the following distribution of person and number:

35304 cases.
Distribution
There was 5 different values of pessnum.
P 33185
S 2041
1S 44
£NP-LONG 19
0 15

¹ http://www.linguateca.pt/acesso/corpus.php?corpus=CBRAS

He7d3r renamed this task from Plural in pt (pt-pt): 0 must be plural (use CLDR's pt_PT rule) to Plural in pt (pt-PT and pt-BR): 0 must be plural (use CLDR's pt_PT rule). Nov 20 2016, 1:14 PM

However, searching for specific examples of phrases containing "zero <something>", I got these statistics in the same corpus:

Plural example    #     Singular example   #
zero pontos       25    zero ponto         72
zero graus        18    zero grau          141
zero dias         8     zero dia           18
zero ciclos       1     zero ciclo         0
zero acidentes    4     zero acidente      139
zero natimortos   2     zero natimorto     0
zero defeitos     14    zero defeito       52

Now I no longer know what this report is about. I would greatly appreciate it if you followed my advice at T151008#2805007.

He7d3r renamed this task from Plural in pt (pt-PT and pt-BR): 0 must be plural (use CLDR's pt_PT rule) to Plural in pt (pt-PT): 0 must be plural (use CLDR's pt_PT rule). Nov 22 2016, 12:01 PM
He7d3r removed a project: Regression.
He7d3r removed a subscriber: He7d3r.

> However, searching for specific examples of phrases containing "zero <something>", I got these statistics in the same corpus:

From the counts it seems like the singular zero has more hits than plural zero. Does the "distribution of person and number" from the previous comment contradict this? I couldn't understand the output format used there.

Hi waldyrious, He7d3r's point is in regard to pt-BR, which is now outside the scope of this task T151008. So as not to confuse matters further, could perhaps pt-BR be discussed elsewhere? Besides, if one believes that Brazilians say "zero carro" then CLDR is correct and MediaWiki is already configured according to CLDR, so appropriately for pt-BR, and nothing needs to be done for pt-BR.

For pt, however, CLDR is correct and MediaWiki is incorrectly configured, which is what this task is about.

And I believe that all that needs to be done is to move pt from where it is into the group:

<pluralRules locales="ast ca de en et fi fy gl it ji nl sv sw ur yi">

joining English, Italian, German, etc.
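
The proposed change can be sketched on a reduced stand-in for plurals.xml. Everything here is illustrative: the `<plurals>` wrapper element, the `move_locale` helper, and the trimmed rule bodies (sample lists omitted) are assumptions, and the thread does not settle whether the real file is edited by hand or regenerated from CLDR.

```python
import xml.etree.ElementTree as ET

# Reduced stand-in for the two groups quoted in this task; the wrapper
# element and the empty "other" rules are simplifications.
FRAGMENT = """
<plurals>
  <pluralRules locales="pt">
    <pluralRule count="one">n = 0..2 and n != 2</pluralRule>
    <pluralRule count="other"></pluralRule>
  </pluralRules>
  <pluralRules locales="ast ca de en et fi fy gl it ji nl sv sw ur yi">
    <pluralRule count="one">i = 1 and v = 0</pluralRule>
    <pluralRule count="other"></pluralRule>
  </pluralRules>
</plurals>
"""


def move_locale(root, locale, target_member):
    """Move `locale` into the group that already contains `target_member`."""
    source = target = None
    for group in root.findall("pluralRules"):
        locales = group.get("locales", "").split()
        if locale in locales:
            source = group
        if target_member in locales:
            target = group
    if source is None or target is None:
        raise ValueError("locale or target group not found")
    if source is target:
        return
    remaining = [code for code in source.get("locales").split() if code != locale]
    if remaining:
        source.set("locales", " ".join(remaining))
    else:
        root.remove(source)               # drop a group left without locales
    merged = sorted(target.get("locales").split() + [locale])
    target.set("locales", " ".join(merged))


root = ET.fromstring(FRAGMENT.strip())
move_locale(root, "pt", "en")
groups = [group.get("locales") for group in root.findall("pluralRules")]

if __name__ == "__main__":
    print(groups)
```

After the move, the stand-in contains a single group in which pt sits between nl and sv, so pt picks up the `i = 1 and v = 0` rule.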

> From the counts it seems like the singular zero has more hits than plural zero. Does the "distribution of person and number" from the previous comment contradict this? I couldn't understand the output format used there.

I don't remember the specifics of that corpus, so I just ran a grep on a Wikipedia dump to get some updated statistics:
https://public-paws.wmcloud.org/User:He7d3r/examples/2023-08-10-plural-pt-zero.ipynb

I believe both MW and CLDR would need an update since we usually use plural for zero in pt-BR, as in these examples extracted from above:

154 zero fontes
  8 zero fonte
109 zero emissões
  6 zero emissão
 84 zero pontos
 30 zero ponto
 56 zero resultados
  3 zero resultado
 24 zero casas
  0 zero casa
 17 zero votos
  1 zero voto
 13 zero referências
  1 zero referência

The exceptions seem to be measurement units, such as:

3207 zero hora
  27 zero horas
 118 zero grau
  42 zero graus
  11 zero volt
   8 zero volts
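
To make the contrast between the two lists easier to read, the counts above can be turned into a plural share per noun. This is only arithmetic on the numbers quoted in this comment; the dictionary layout and the `plural_share` helper are illustrative names, not part of the linked notebook.

```python
# (plural hits, singular hits) after "zero", copied from the lists above.
GENERAL_NOUNS = {
    "fonte": (154, 8),
    "emissão": (109, 6),
    "ponto": (84, 30),
    "resultado": (56, 3),
    "casa": (24, 0),
    "voto": (17, 1),
    "referência": (13, 1),
}
MEASUREMENT_UNITS = {
    "hora": (27, 3207),
    "grau": (42, 118),
    "volt": (8, 11),
}


def plural_share(plural_hits: int, singular_hits: int) -> float:
    """Fraction of hits that use the plural form."""
    return plural_hits / (plural_hits + singular_hits)


if __name__ == "__main__":
    for label, table in [("general", GENERAL_NOUNS), ("units", MEASUREMENT_UNITS)]:
        for noun, (plural_hits, singular_hits) in table.items():
            share = plural_share(plural_hits, singular_hits)
            print(f"{label:8} {noun:12} {share:.1%} plural")
```

With these counts, every general noun is above a 50% plural share and every measurement unit is below it.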

Thanks both for the updates! @He7d3r, please clarify your position: do you think the exceptions justify treating pt-BR differently and keeping this issue about pt-PT only? Or is your opinion that, precisely because they are exceptions, the default should be plural zero and thus this issue should refer to both pt-PT and pt-BR?