In addition to looking for big patterns, we also need to identify categories that can't be readily detected other than by manual inspection (e.g., typos and gibberish) and gauge their extent.
Manual inspection also gives us a sample of typos sent through the API, which we can use to see how many would get suggestions if suggestions were enabled.
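
As a rough illustration of gauging extent from a hand-inspected sample, the sketch below draws a uniform random sample of queries and, once each has been labelled by hand, reports each category's share with an approximate 95% interval. The function names, the category labels, and the choice of a Wilson score interval are all assumptions made for this example, not part of the original analysis.

```python
import math
import random
from collections import Counter


def draw_sample(queries, k, seed=0):
    """Return up to k queries chosen uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    pool = list(queries)
    return rng.sample(pool, min(k, len(pool)))


def wilson_interval(hits, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion hits / n."""
    if n == 0:
        return (0.0, 0.0)
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def summarize(labels):
    """Map each hand-assigned category to (count, share, interval)."""
    n = len(labels)
    counts = Counter(labels)
    return {cat: (c, c / n, wilson_interval(c, n)) for cat, c in counts.items()}
```

Here `labels` would hold one hand-assigned category per sampled query (hypothetical values such as `"typo"`, `"gibberish"`, or `"other"`). The interval gives a sense of how far the share seen in the sample might differ from the share in the full query log.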