We need a way to manage the NSFW content in the Commons query completion candidates. Manual review and blacklisting simply won't scale: both the variety of terms and the burden of maintaining the list are too much for us to handle.
There are a variety of NSFW image classifiers openly available. Determine a reasonable one to use, and wire up a system to maintain a set of titles and classifications in Hive. This will run daily. We should evaluate whether there is a benefit to keeping a running dataset of classified titles rather than re-classifying everything daily (I suspect we shouldn't re-classify, but experimentation is needed).
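
As a rough illustration of the "running dataset" option, the daily job could classify only the titles it has not seen before and carry earlier results forward. This is a minimal sketch, not a proposed implementation: `update_classifications`, the dict standing in for the Hive table, and the `classify` callable are all hypothetical names, and the classifier itself is still to be chosen.

```python
from typing import Callable, Dict, Iterable


def update_classifications(
    existing: Dict[str, float],
    todays_titles: Iterable[str],
    classify: Callable[[str], float],
) -> Dict[str, float]:
    """Return yesterday's classifications plus scores for unseen titles.

    `existing` stands in for the running dataset (assumed here to map a
    title to an NSFW score); `classify` is a placeholder for whichever
    classifier ends up being selected.
    """
    updated = dict(existing)  # copy so the input dataset is not mutated
    for title in todays_titles:
        if title not in updated:
            # Only pay the classification cost for titles not seen before.
            updated[title] = classify(title)
    return updated


if __name__ == "__main__":
    # Stub classifier for demonstration only; a real one would score the image.
    def stub_classifier(title: str) -> float:
        return 0.5

    running = {"File:A.jpg": 0.1}
    running = update_classifications(
        running, ["File:A.jpg", "File:B.jpg"], stub_classifier
    )
    print(running)
```

The sketch keys on title alone. Whether that is enough, or whether previously classified titles ever need to be re-scored, is part of the experimentation mentioned above.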