Once we have a set of word tokens identified in the wikitext, we want to know if any of them are informal words. This will allow us to build the credibility signals related to informal words.
Implementation is mostly regex pattern matching.
Implementation details:
[1] This utility will live under structured-data/packages.
[2] is_informal_word(word_token) -> True/False
Returns True if the word token matches any informal word pattern in any language, and False otherwise.
For this, you will need to copy/paste the list of informal patterns for each language. Refer to the informal patterns for English in the link, then browse through the same directory for the other languages.
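A minimal sketch of how is_informal_word could look, to make the intended behavior concrete. Everything except the function name and its True/False contract is an assumption: the INFORMAL_PATTERNS dictionary, its language keys, and the patterns in it are hypothetical placeholders, not the real lists that need to be copied from the linked directory. The sketch also assumes that a pattern must match the whole token and that matching is case-insensitive; adjust both if the source lists expect otherwise.

```python
import re

# Hypothetical placeholder patterns for illustration only.
# The real per-language lists must be copied from the linked directory.
INFORMAL_PATTERNS = {
    "en": [r"lol+", r"gonna", r"awesome"],
    "es": [r"(?:ja){2,}"],
}

# Compile one alternation per language once, at import time,
# so each call only runs a handful of precompiled regexes.
_COMPILED_PATTERNS = [
    re.compile("(?:" + "|".join(patterns) + ")", re.IGNORECASE)
    for patterns in INFORMAL_PATTERNS.values()
]


def is_informal_word(word_token: str) -> bool:
    """Return True if the token matches an informal pattern in any language."""
    # fullmatch is an assumption: the whole token must match, so a
    # pattern like "gonna" does not fire on a longer, unrelated word.
    return any(regex.fullmatch(word_token) for regex in _COMPILED_PATTERNS)
```

With these placeholder patterns, is_informal_word("Gonna") returns True and is_informal_word("encyclopedia") returns False. Compiling per language rather than into one global regex keeps it easy to add or drop a language later.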
