Babel AutoCreate creates incorrectly named and duplicated categories on multiple WMF wikis
Closed, Resolved, Public

Description

It is creating wrongly named categories: instead of following the pattern "Category:User <code>-<level>", since 16 September 2015 it has been creating new categories named "Category:<code>-<level>". See mw:Extension:Babel#Configuration. --V.Riullop (talk) 10:40, 17 September 2015 (UTC)

Event Timeline

Steinsplitter raised the priority of this task to High.
Steinsplitter updated the task description.
Steinsplitter subscribed.

What are exact steps to reproduce and see the problem?

Aklapper renamed this task from "Babel AutoCreate" to "Babel AutoCreate creates incorrectly named and duplicated categories on Commons" (Sep 17 2015, 11:24 AM).

It's not specific to Commons; I found incorrect Babel categories (Ko-N, En-3) created on kowikinews and kowikiversity.

Aklapper renamed this task from "Babel AutoCreate creates incorrectly named and duplicated categories on Commons" to "Babel AutoCreate creates incorrectly named and duplicated categories on multiple WMF wikis" (Sep 17 2015, 11:40 AM).

The two most recent Babel code changes touch category-related code.
They are https://gerrit.wikimedia.org/r/#/c/237481/ and the change it follows up on, https://gerrit.wikimedia.org/r/#/c/213300/

CC'ing @Nemo_bis and @Parent5446

Yes, this needs a backport. https://gerrit.wikimedia.org/r/#q,If0cb5d9a303a28dc7d7f7bb7bcdf6263e63655e9,n,z

Never mind, it's already deployed. No idea then.

Nemo_bis added a subscriber: Legoktm.

> var_dump( $wgBabelCategoryNames );
array(7) {
  [0]=>
  string(8) "%code%-0"
  [1]=>
  string(8) "%code%-1"
  [2]=>
  string(8) "%code%-2"
  [3]=>
  string(8) "%code%-3"
  [4]=>
  string(8) "%code%-4"
  [5]=>
  string(8) "%code%-5"
  ["N"]=>
  string(8) "%code%-N"
}

But the config for that wiki is supposed to be this:

		'0' => 'User %code%-0',
		'1' => 'User %code%-1',
		'2' => 'User %code%-2',
		'3' => 'User %code%-3',
		'4' => 'User %code%-4',
		'5' => false,
		'N' => 'User %code%-N'
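
To make the difference between the two sets of templates concrete, here is a minimal sketch of how a per-level template could expand into a category name. It is written in Python rather than PHP purely for illustration and is not Babel's actual code; the function name `category_for` is hypothetical, and treating `false` as "create no category for this level" is an assumption based on the configuration quoted above.

```python
from typing import Optional, Union

# Per-level templates as quoted in the thread. False is assumed to mean
# "do not create a category for this level".
INTENDED_TEMPLATES: dict[str, Union[str, bool]] = {
    "0": "User %code%-0",
    "1": "User %code%-1",
    "2": "User %code%-2",
    "3": "User %code%-3",
    "4": "User %code%-4",
    "5": False,
    "N": "User %code%-N",
}


def category_for(code: str, level: str,
                 templates: dict[str, Union[str, bool]] = INTENDED_TEMPLATES
                 ) -> Optional[str]:
    """Return the category title for a language code and level, or None.

    Hypothetical helper: substitutes the language code for %code% in the
    template configured for the given level.
    """
    template = templates.get(level)
    if not isinstance(template, str):
        # Level is disabled (False) or not configured at all.
        return None
    return "Category:" + template.replace("%code%", code)
```

With the intended templates, `category_for("ko", "N")` gives "Category:User ko-N"; with a template of "%code%-N" instead, the same call gives "Category:ko-N", which is the shape of the bad categories reported here.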

I also found incorrect categories on cawiktionary, previously configured per T49287 ("Configure $wgBabelCategoryNames for ca.wiktionary").

legoktm@terbium:~$ mwscript eval.php --wiki=commonswiki
> return $wmgBabelCategoryNames;
array(7) {
  [0]=>
  string(13) "User %code%-0"
  [1]=>
  string(13) "User %code%-1"
  [2]=>
  string(13) "User %code%-2"
  [3]=>
  string(13) "User %code%-3"
  [4]=>
  string(13) "User %code%-4"
  [5]=>
  bool(false)
  ["N"]=>
  string(13) "User %code%-N"
}
> return $wgBabelCategoryNames;
array(7) {
  [0]=>
  string(8) "%code%-0"
  [1]=>
  string(8) "%code%-1"
  [2]=>
  string(8) "%code%-2"
  [3]=>
  string(8) "%code%-3"
  [4]=>
  string(8) "%code%-4"
  [5]=>
  string(8) "%code%-5"
  ["N"]=>
  string(8) "%code%-N"
}
>
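
The output above shows the site setting ($wmgBabelCategoryNames) being lost in favour of unprefixed values. One way such a result can arise is operand order in a "plus"-style merge. The sketch below models PHP's array union (`$left + $right`), in which the left operand wins whenever a key exists on both sides. It is an illustration in Python, not the actual registration code: the helper name `php_array_plus` is hypothetical, and labelling the unprefixed values "extension defaults" is an assumption.

```python
def php_array_plus(left: dict, right: dict) -> dict:
    """Model of PHP's `$left + $right`.

    Keys already present in `left` keep their value; only keys missing
    from `left` are taken from `right`.
    """
    merged = dict(left)
    for key, value in right.items():
        merged.setdefault(key, value)
    return merged


# A subset of the levels shown in the thread, for brevity.
site_config = {"0": "User %code%-0", "5": False, "N": "User %code%-N"}
# Assumed to be the extension's built-in defaults.
extension_defaults = {"0": "%code%-0", "5": "%code%-5", "N": "%code%-N"}

# Defaults on the left: every key already exists, so the site config is ignored.
defaults_win = php_array_plus(extension_defaults, site_config)

# Site config on the left: the configured values survive, and defaults
# would only fill in keys the site did not set.
config_wins = php_array_plus(site_config, extension_defaults)
```

Here `defaults_win` equals `extension_defaults`, matching the symptom in the dump, while `config_wins` equals `site_config`.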

Change 239140 had a related patch set uploaded (by Legoktm):
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/239140

Change 239141 had a related patch set uploaded (by Legoktm):
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/239141

Change 239142 had a related patch set uploaded (by Legoktm):
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/239142

Change 239142 merged by Legoktm:
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/239142

Change 239141 merged by Legoktm:
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/239141

Change 239140 merged by jenkins-bot:
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/239140

Fix has been deployed, so no new bad categories will be created. I'll work on a script to delete the bad ones that were created.
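
As a rough idea of how a cleanup script might pick out the bad categories, the sketch below flags titles shaped like "<code>-<level>" that lack the "User " prefix. It is a hypothetical filter in Python, not the script mentioned above: the regular expression is an assumption about what language codes look like, it performs no deletion, and any real cleanup would need each match reviewed first.

```python
import re

# Assumed shape of a misnamed category: a 2-3 letter language code,
# optional subtags, then a level of 0-5 or N, with no "User " prefix.
BAD_NAME = re.compile(r"[A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*-[0-5N]")


def looks_misnamed(title: str) -> bool:
    """Return True if the title matches the unprefixed "<code>-<level>" shape."""
    name = title.removeprefix("Category:")
    return BAD_NAME.fullmatch(name) is not None


def misnamed(titles: list[str]) -> list[str]:
    """Filter a list of category titles down to the suspicious ones."""
    return [t for t in titles if looks_misnamed(t)]
```

Applied to the examples reported earlier in this task, "Ko-N" and "En-3" are flagged, while "User ko-N" is not.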

Change 240565 had a related patch set uploaded (by Legoktm):
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/240565

Change 240565 merged by jenkins-bot:
registration: Fix merging of array_plus

https://gerrit.wikimedia.org/r/240565

"I'll work on a script to delete the bad ones that were created."

@Legoktm: Has that happened in the meantime?

Nuked a bunch of incorrectly named categories on Wikidata and Commons.

Blocked the bot on both Wikidata and Commons. This bot is causing more harm than good.

Any update on this? There are still around 4000 of these categories that need deleting; I've even seen a few for which someone created a Wikidata item.

@matej_suchanek: That's not the same problem as the one here, see T63993 for those.

"Any update on this? There are still around 4000 of these categories that need deleting"

You'd probably have better luck asking a deletion bot at https://meta.wikimedia.org/wiki/Steward_requests/Miscellaneous

"I'll work on a script to delete the bad ones that were created."

@Legoktm: Has that happened in the meantime?

@Legoktm, ping again. Or should we just close out this ticket?

I'm guessing most of the other communities have already dealt with this, though (I just did the ~37 on mw.wiki).

I asked a few weeks ago at https://meta.wikimedia.org/wiki/Steward_requests/Miscellaneous#Badly_named_Babel_AutoCreate_categories and a couple of people have been helping delete them. It seems like progress has stalled though.

Sorry for dropping the ball on this.

In the meantime, Babel AutoCreate is still running and creating wrong categories, at least on it.wiki, it.wikiquote and en.wikiquote: for example, "Category:User zh-Hans" or "Category:User de-AT" with wrong capitalization, where the correct categories "User zh-hans" and "User de-at" already exist.

@Superchilum This particular ticket was about a bug last year which has been fixed. It is not yet closed because we're still waiting for the last three badly named categories to be deleted (if anyone has admin access on wikimediafoundation.org, please have a look at the metawiki link above). See T63993 for the problem of inconsistent capitalisation (which was "fixed" a couple of days ago... in a way which uses the wrong capitalisation).

Nikki removed Nikki as the assignee of this task (Oct 15 2016, 12:57 PM).