After talks at Wikimania, it seems ORES can handle Detox. I think this app and revscoring should be peers, so we need better, well-documented dependency injection for ORES. Detox, on the other hand, needs to be cleaned of all unnecessary items and dependencies (I would suggest having a "clean" repo and a "research" repo). To add it to our prod cluster, we will probably need a security review as well. I couldn't find the source code; it would be great to add some useful links here.
Event Timeline
Our current thinking is to make Detox into its own service that exposes a scoring API. This way ORES can just submit revision IDs or diffs to the API and get back scores instead of running the models itself. How does that sound?
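To make the proposal concrete, here is a minimal sketch of what the exchange between ORES and a standalone Detox service might look like. Nothing here is an existing Detox or ORES API: the payload fields (`rev_ids`, `diffs`, `scores`) and both function names are assumptions made up for illustration, and no network call is made.

```python
import json


def build_score_request(rev_ids=(), diffs=()):
    """Build the JSON body ORES might send to a hypothetical Detox endpoint.

    The field names "rev_ids" and "diffs" are assumptions, not a real schema.
    """
    if not rev_ids and not diffs:
        raise ValueError("provide at least one revision ID or diff")
    return json.dumps({"rev_ids": list(rev_ids), "diffs": list(diffs)})


def parse_score_response(body):
    """Map each submitted item to its score.

    Assumes the service replies with {"scores": {"<item key>": <number>}}.
    """
    return {key: float(value) for key, value in json.loads(body)["scores"].items()}
```

With this shape, ORES would only need to know the request and response formats, not anything about how the Detox models are built or run.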
That seems like it would be overly complicated and would result in a bunch of duplicated infrastructure. Why not just let ORES host the prediction model itself? If you give us a list of features, it seems that we might be able to generate them relatively easily in realtime with ORES. Once we can generate the feature set, hosting a model is quite simple.
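For comparison, the alternative described above (generate the features in realtime, then host the model directly) could be sketched roughly as follows. This is plain Python, not the revscoring API, and the two features and the logistic model are hypothetical stand-ins, since the thread does not list Detox's actual feature set or model type.

```python
import math


# Hypothetical features; the real Detox feature list is not given in this thread.
def char_count(text):
    return float(len(text))


def uppercase_ratio(text):
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0


FEATURES = [char_count, uppercase_ratio]


def extract_features(text):
    """Compute every registered feature for one piece of text, in order."""
    return [feature(text) for feature in FEATURES]


class LogisticModel:
    """Stand-in for a hosted prediction model (assumed linear, for illustration)."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def score(self, feature_values):
        z = self.bias + sum(w * x for w, x in zip(self.weights, feature_values))
        return 1.0 / (1.0 + math.exp(-z))
```

The point of the sketch is the split: once `extract_features` can be computed in realtime, scoring is a single call on whichever model object is loaded.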
This conversation happened. https://etherpad.wikimedia.org/p/detox_model
And we created a next-step task: T139978
I noticed in the etherpad there was some concern w.r.t. ops, legal, security. Could you spell this out a bit? Thanks!