
Language assets for Czech
Closed, Resolved, Public

Description

This task is done when we have a set of language utilities in revscoring for Czech.

  • Run BWDS to get potential badword list
  • Human review of BWDS list
  • Integrate with available badwords lists
  • Integrate into revscoring
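As a rough illustration of the "integrate with available badwords lists" step, the sketch below merges a human-reviewed candidate list with an existing list and compiles the result into one whole-word matcher. It is only a sketch under assumptions: `merge_word_lists` and `compile_matcher` are hypothetical helper names, the words are neutral placeholders, and none of this is the revscoring or BWDS API.

```python
import re


def merge_word_lists(*lists):
    """Merge candidate lists: lowercase, strip, drop blanks and duplicates.

    Hypothetical helper for illustration; keeps first-seen order.
    """
    seen = set()
    merged = []
    for words in lists:
        for word in words:
            normalized = word.strip().lower()
            if normalized and normalized not in seen:
                seen.add(normalized)
                merged.append(normalized)
    return merged


def compile_matcher(words):
    """Build one case-insensitive regex that matches any word as a whole word."""
    if not words:
        # An empty alternation would match at every word boundary.
        raise ValueError("word list must not be empty")
    # Escape each word so punctuation is matched literally; try longer words first.
    alternatives = sorted((re.escape(w) for w in words), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


# Placeholder data standing in for a reviewed BWDS list and an existing list.
bwds_reviewed = ["Foo", "bar "]
existing_list = ["bar", "baz", ""]

merged = merge_word_lists(bwds_reviewed, existing_list)
matcher = compile_matcher(merged)

print(merged)                                          # ['foo', 'bar', 'baz']
print(matcher.findall("Foo met BAZ at the foobar."))   # ['Foo', 'BAZ']
```

Whole-word matching keeps "foobar" from being flagged just because it contains a listed word. Whether that is the right trade-off for Czech inflection is a judgment call for the human review step.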

Event Timeline

Halfak renamed this task from Language assess for Czech to Language assests for Czech. Apr 2 2016, 6:27 PM
Halfak updated the task description.
Halfak renamed this task from Language assests for Czech to Language assets for Czech. Apr 3 2016, 8:26 AM

@Danny_B & @Petrb, you can help us get started here by gathering curse words, racial slurs, and informal language ("hahaha", "woo hoo!", etc.) for Czech. A good way to do this would be to look at the abuse filter or search the internet for lists that have already been compiled. E.g., I found http://www.youswear.com/index.asp?language=Czech

@Ladsgroup, can you kick off Bad-Words-Detection-System for czwiki?

just for the record: cswiki ;-)

Started. You will have it in ~7 hours. I will check again in 24 hours. Check this page