Language assets for Czech
This task is done when we have a set of language utilities in revscoring for Czech.

  • Run BWDS to get potential badword list
  • Human review of BWDS list
  • Integrate with available badwords lists
  • Integrate into revscoring
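
As a rough illustration of the middle steps (review, then merge with existing lists), here is a minimal sketch of turning a reviewed word list into a matcher. It does not use the revscoring API; `build_matcher`, `find_matches` and every list below are hypothetical names, and the entries are placeholders rather than real Czech words. The informal patterns are modeled on the "hahaha" / "woo hoo" examples mentioned in this task.

```python
import re


def build_matcher(patterns):
    """Compile a case-insensitive, whole-word regex from regex fragments."""
    # Put longer alternatives first so they win over shorter ones.
    alternatives = sorted(set(patterns), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def find_matches(matcher, text):
    """Return every substring of text that the matcher recognizes."""
    return [m.group(0) for m in matcher.finditer(text)]


# Placeholder data only (assumption): stand-ins for real list entries.
bwds_candidates = ["foo", "bar", "baz"]  # raw candidates from a BWDS run
rejected_in_review = {"baz"}             # false positives dropped by reviewers
existing_list = ["qux"]                  # an already available badword list

# Keep only reviewed candidates, merge with the existing list, escape literals.
reviewed = [w for w in bwds_candidates if w not in rejected_in_review]
badwords = build_matcher([re.escape(w) for w in reviewed + existing_list])

# Informal language is written as regex fragments so repeats are covered.
informals = build_matcher([r"(?:ha)+h?", r"woo+ ?hoo+"])
```

For example, `find_matches(badwords, "Foo and baz and qux")` picks up "Foo" and "qux" but skips the rejected "baz", and `find_matches(informals, "hahaha woo hoo!")` picks up both informal expressions.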

Event Timeline

@Danny_B & @Petrb, you can help us get started here by gathering curse words, racial slurs and informal language ("hahaha", "woo hoo!", etc.) for Czech. A good way to do this would be to look at the abuse filter or to search the internet for lists that have already been generated. E.g. I found

@Ladsgroup, can you kick off Bad-Words-Detection-System for czwiki?

just for the record: cswiki ;-)

Started. You will have it in ~7 hours. I will check again in 24 hours. Check this page