Gather language assets for Swedish
Closed, Resolved, Public

Description

Event Timeline

Halfak created this task. Apr 1 2016, 9:00 AM
Restricted Application added subscribers: Josve05a, Aklapper. Apr 1 2016, 9:00 AM
Halfak updated the task description. Apr 1 2016, 9:02 AM
Halfak added a subscriber: Ladsgroup.

@Ladsgroup, can you run Bad-Words-Detection-System on svwiki?

Started the bot. You'll have the results on this page.

Is there anything I, as a Swede, can do to help?

@Josve05a It would be great if you could review the list generated by the bot and divide the words into three lists:

  • Words that are not acceptable anywhere in Wikipedia, like 'f**ck' or 'sh*t'. I understand these words can legitimately appear in articles about the subject, but that's fine because we consider the proportion of added/changed words, not their total number.
  • Words that are not okay in Wikipedia articles but are fine on talk pages, like 'Hey', 'LOL', etc.
  • Words that are none of the above, i.e. false positives picked up by the bot.

Put the lists somewhere and we will do the rest.

Thanks
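For illustration only, here is a minimal sketch of why the proportion of added words matters more than the raw count. The word sets, the tokenizer, and the function name `added_word_proportions` are all hypothetical placeholders; this is not the actual Bad-Words-Detection-System or ORES code, and the real Swedish lists would come from the reviewed bot output.

```python
import re

# Hypothetical placeholder lists; the real ones come from the reviewed bot output.
BADWORDS = {"badword1", "badword2"}   # not acceptable anywhere
INFORMALS = {"hey", "lol"}            # fine on talk pages, not in articles

TOKEN_RE = re.compile(r"\w+", re.UNICODE)


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def added_word_proportions(added_text):
    """Return the share of added words that fall in each list.

    Using a proportion rather than a count means a long, legitimate edit
    that mentions one listed word scores much lower than a short edit
    made up mostly of listed words.
    """
    tokens = tokenize(added_text)
    if not tokens:
        return {"badwords": 0.0, "informals": 0.0}
    bad = sum(t in BADWORDS for t in tokens)
    informal = sum(t in INFORMALS for t in tokens)
    return {"badwords": bad / len(tokens), "informals": informal / len(tokens)}
```

With these placeholder lists, an edit that adds "hey lol this is fine" has two informal words out of five added words, while an edit that adds nothing scores zero on both lists.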

Ok thanks, I'll get on that!

It's great! Thank you. Please keep in mind that you should sort only the "generated words", not the "generated common words". I'm saying this because your list contains 379 "Unknown" words, but there are only 250 generated bad words in total, so my guess is that you are also working on the generated common words.

Johan added a comment. Apr 5 2016, 6:44 PM

I've gone through the rest of them, sorted them, and left comments on a group of words that are often used in vandalism but that I guess have more common legitimate uses than most of the words on the list. @Josve05a, feel free to go through my edits and see if there's anything you disagree with – some of the ones left weren't exactly obvious.

Halfak updated the task description. May 12 2016, 8:18 PM
Halfak assigned this task to Ladsgroup.

Change 289162 had a related patch set uploaded (by Ladsgroup):
ores: install aspell-sv

https://gerrit.wikimedia.org/r/289162

Change 289162 merged by Alexandros Kosiaris:
ores: install aspell-sv

https://gerrit.wikimedia.org/r/289162

Ladsgroup updated the task description.
Ladsgroup moved this task from Review to Done on the Scoring-platform-team (Current) board.
Ladsgroup closed this task as Resolved.
Restricted Application added a project: artificial-intelligence. Jul 21 2017, 11:07 AM