Fix translateAndCapitalizeNamespaces for Portuguese
Closed, Resolved · Public

Description

Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1323/
Reported by: heldergeovane
Created on: 2011-06-30 14:48:20

Per the discussion at
https://pt.wikipedia.org/wiki/Wikipédia:Esplanada/propostas/Incentivar_o_uso_de_"Imagem"_em_vez_de_"Arquivo"_ou_"Ficheiro"_(12mar2011)?uselang=en
please change the function translateAndCapitalizeNamespaces (from cosmetic_changes.py) so that the bots stop making the following changes:

  • Image --> Ficheiro
  • File --> Ficheiro
  • Arquivo --> Ficheiro
  • Imagem --> Ficheiro

This is necessary to avoid linguistic problems, considering that "Arquivo" is the preferred word in Brazil but "Ficheiro" is preferred in Portugal.

For image files, the word "Imagem" is common to both Portuguese variants, and as such it is preferred, so this should be the name used when changing the namespace name of images. The use of "Ficheiro" and "Arquivo" is preferred only for other kinds of files (such as PDF or OGG), which are not images.

So, in short, the bots should make the following changes:

  • For images (i.e. files with one of the following extensions: png, gif, jpg, jpeg, svg, tiff, tif), change:
    • Image --> Imagem
    • File --> Imagem
    • Ficheiro --> Imagem
    • Arquivo --> Imagem
  • For other files (i.e. files with one of the following extensions: xcf, pdf, mid, ogg, ogv, djvu, oga):
    • Arquivo --> Do not change (we should respect the variant used by the editors)
    • Ficheiro --> Do not change (we should respect the variant used by the editors)
    • File --> Do not change (or change randomly to "Ficheiro" or "Arquivo", since it is indeed a "file" and both pt and pt-BR are acceptable)
    • Image --> Do not change (or change randomly to "Ficheiro" or "Arquivo", since it is indeed a "file" and both pt and pt-BR are acceptable)
    • Imagem --> Do not change (or change randomly to "Ficheiro" or "Arquivo", since it is indeed a "file" and both pt and pt-BR are acceptable)
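
The rules above can be sketched as a small standalone function. This is only a minimal illustration, not the Pywikibot implementation: the names IMAGE_EXTENSIONS, FILE_NS_ALIASES and preferred_file_namespace are hypothetical, and for non-image files the sketch takes the conservative "do not change" option rather than picking "Ficheiro" or "Arquivo" at random.

```python
import os

# Extensions listed in this report as image formats.
IMAGE_EXTENSIONS = {'png', 'gif', 'jpg', 'jpeg', 'svg', 'tiff', 'tif'}

# File namespace names mentioned in this report (hypothetical constant,
# not taken from Pywikibot's namespace data).
FILE_NS_ALIASES = {'image', 'file', 'ficheiro', 'arquivo', 'imagem'}


def preferred_file_namespace(namespace, filename):
    """Return the namespace name to use for a file link on pt.* wikis.

    Images get "Imagem"; any other file keeps the name the editor wrote.
    """
    if namespace.strip().lower() not in FILE_NS_ALIASES:
        # Not a file namespace at all: leave it alone.
        return namespace
    # Take the text after the last dot as the extension, case-insensitively.
    ext = os.path.splitext(filename)[1].lstrip('.').lower()
    if ext in IMAGE_EXTENSIONS:
        return 'Imagem'
    # Non-image file (pdf, ogg, ...): respect the editor's variant.
    return namespace
```

With this sketch, preferred_file_namespace('File', 'Foo.png') gives 'Imagem', while preferred_file_namespace('Arquivo', 'Doc.pdf') leaves 'Arquivo' unchanged.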

Details

Reference
bz55242
bzimport raised the priority of this task from to Normal.
bzimport set Reference to bz55242.
bzimport added a subscriber: Unknown Object (????).
Legoktm created this task. Oct 5 2013, 4:49 AM

Raising the priority since this bug is still affecting bots on all Portuguese wikis.

  • priority: 5 --> 7

The bot doesn't see the extension of those links. To implement this behavior, the code needs to be redesigned; maybe a future feature. If there is a way to fix namespace aliases without looking at the extension, we could do it sooner. I've deactivated translateAndCapitalizeNamespaces for the file namespace for now.
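
One way the extension could be made visible, sketched under assumptions: a regular expression that captures both the namespace prefix and the file extension of a wikitext link. The pattern, the name FILE_LINK and the helper file_links are hypothetical and deliberately simplified; this is not how cosmetic_changes.py parses links.

```python
import re

# Hypothetical, simplified pattern: matches the start of a link such as
# [[Ficheiro:Mapa.svg|thumb]] and captures the namespace and extension.
FILE_LINK = re.compile(
    r'\[\[\s*(?P<ns>Image|File|Ficheiro|Arquivo|Imagem)\s*:'
    r'\s*(?P<name>[^|\]]*?)\.(?P<ext>[A-Za-z0-9]+)\s*(?=[|\]])',
    re.IGNORECASE,
)


def file_links(text):
    """Return (namespace, lowercase extension) for each file link in text."""
    return [(m.group('ns'), m.group('ext').lower())
            for m in FILE_LINK.finditer(text)]
```

A caller could then decide per link whether the namespace should be rewritten, instead of treating every file link the same way.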

  • assigned_to: nobody --> xqt
  • status: open --> open-later

I guess the priority could be lowered since the code is deactivated.

  • priority: 7 --> 5

It was reported against compat, but probably exists in the same code that appears in core.

jayvdb set Security to None.
jayvdb moved this task from Backlog to Wikimedia prod/Cloud Services issues on the Pywikibot board.
jayvdb removed a subscriber: Unknown Object (????).
jayvdb updated the task description. Oct 16 2015, 6:39 AM
Restricted Application added a subscriber: Aklapper. Oct 16 2015, 6:39 AM
jayvdb updated the task description. Oct 16 2015, 7:03 AM
jayvdb removed a project: Pywikibot-compat.

Removed compat, as this feature request is unlikely to be implemented there.

Hi. This bug is rather old. Is the issue reported here still happening? Thanks.

Restricted Application added a subscriber: Cyberpower678. Dec 2 2017, 6:53 PM
MarcoAurelio raised the priority of this task from Normal to Needs Triage. Dec 2 2017, 6:55 PM

(retriage needed, no action on this for years)

Dvorapa moved this task from Ready to go to Backlog on the Pywikibot board. May 17 2018, 10:27 PM
Dvorapa moved this task from Backlog to Ready to go on the Pywikibot board.
Xqt triaged this task as Lowest priority. May 18 2018, 10:30 AM
Xqt added a project: goodfirstbug.

It's not a bug, it's a feature request.

@Xqt I cannot see this script in pywikibot-core. Does this still exist? Thanks.

cosmetic_changes? Yes

I was looking for translateAndCapitalizeNamespaces.py, but it did indeed sound like something from cosmetic_changes.py. Thanks.

So, to sum up, you want cosmetic_changes not to change links containing Imagem.

Change 441180 had a related patch set uploaded (by MarcoAurelio; owner: MarcoAurelio):
[pywikibot/core@master] [IMPR|WIP] cosmetic_changes: skip changing 'Imagem' links for pt.* wikis

https://gerrit.wikimedia.org/r/441180

Dvorapa updated the task description. Edited Jun 20 2018, 9:54 AM

But only for these files: png, gif, jpg, jpeg, svg, tiff, tif
For xcf, pdf, mid, ogg, ogv, djvu, oga the behavior should stay as before, just Arquivo shouldn't be changed (or none of them)

Maybe it'd be easier to just ignore the whole NS:6 for pt.* wikis?

This is the current behavior of cosmetic_changes. This task is basically waiting for file-link extension parsing to be added to cc.

Change 441180 abandoned by MarcoAurelio:
[IMPR|WIP] cosmetic_changes: skip changing 'Imagem' links for pt.* wikis

https://gerrit.wikimedia.org/r/441180

Dvorapa raised the priority of this task from Lowest to Low. Jun 20 2018, 12:22 PM
Dvorapa claimed this task.
Dvorapa moved this task from Backlog to Deactivated code on the Pywikibot-cosmetic-changes.py board.
Dvorapa moved this task from Backlog to Doing on the goodfirstbug board.

Change 441200 had a related patch set uploaded (by Dvorapa; owner: Dvorapa):
[pywikibot/core@master] [IMPR] Fix Portuguese file namespace translation in cc

https://gerrit.wikimedia.org/r/441200

Change 441200 merged by jenkins-bot:
[pywikibot/core@master] [IMPR] Fix Portuguese file namespace translation in cc

https://gerrit.wikimedia.org/r/441200

Dvorapa closed this task as Resolved. Jul 15 2018, 1:09 PM