@Josve05a It would be great if you could review the list generated by the bot and divide it into three lists:
- Words that are not acceptable anywhere in Wikipedia, like 'f**ck' and 'sh*t'. I understand these words are fine in articles about the subject itself, but that's not a problem, since we are considering the proportion of added/changed words, not the total number of them.
- Words that are not okay in Wikipedia articles but are fine on talk pages, like 'Hey', 'LOL', etc.
- Words that are none of the above, i.e. false positives picked up by the bot.
Put it somewhere and we will do the rest.
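
For illustration only, here is a rough sketch of what "proportion of added/changed words" could look like in practice. It is not the bot's actual code: the word sets, the tokenizer and the function names (`added_words`, `flagged_proportion`) are all hypothetical placeholders, and the real lists would come from the reviewed output.

```python
import re
from collections import Counter

# Hypothetical placeholder lists; the real ones would come from the reviewed bot output.
BAD_WORDS = {"badword"}          # list 1: not acceptable anywhere
INFORMAL_WORDS = {"hey", "lol"}  # list 2: fine on talk pages, not in articles


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def added_words(old_text, new_text):
    """Return the words that occur more often in the new revision than in the old one."""
    diff = Counter(tokenize(new_text)) - Counter(tokenize(old_text))
    return list(diff.elements())


def flagged_proportion(old_text, new_text, flagged):
    """Share of the added words that appear in the given flagged set."""
    added = added_words(old_text, new_text)
    if not added:
        return 0.0
    return sum(word in flagged for word in added) / len(added)
```

Because only the added words are counted, an article that already contains a listed word many times does not score high unless an edit adds more of them.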
It's great! Thank you. Please keep in mind that you should not sort the "generated common words", only the "generated words". I'm saying this because your list contains 379 "Unknown" words, but there are only 250 generated bad words in total, so my guess is that you are also working on the generated common words.
I've gone through the rest of them, sorted them, and left comments on a group of words that are often used in vandalism but that, I guess, have more common legitimate uses than most of the words on the list. @Josve05a, feel free to go through my edits and see if there's anything you disagree with – some of the ones left weren't exactly obvious.