This will be subtasked, but essentially we're implementing the design that was discussed here: https://docs.google.com/document/d/10cTkWcxOE89kx_HejlAbRyiRjlhXL13Cii0hfPOki4c/edit#heading=h.6fww1k1wmrtd
Description
Event Timeline
I see that this epic has cloud-services-team as a dependency on https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2017-18_Q1#Analytics_Engineering. Is there a specific ask of Cloud Services here or just a general need for some support as components are deployed?
Just support. We plan to deploy a datastore (Druid? ClickHouse? other?) to Labs with the data from the Data Lake for public access. I think Labs team support would mean doing code reviews for Puppet and general consulting on the appropriate way to do things on Labs.
@Nuria thanks for the clarification. I think we can handle helping with that. If you don't have hardware budget for deploying the new datastore we'll probably have to scrounge around and see if there is capacity in any of our existing data storage systems. That would be my only concern at this point.
We do have budget for 3 Druid-like hosts. Besides refreshes, that is our top-priority ask for next year.
One version of the history schema has been simplified and loaded to test how Druid can work as a direct back-end for AQS: