
Replicate flowdb from X1 to analytics-store
Closed, Resolved · Public

Description

The Flow team is about to start tying into the analytics infrastructure. As part of this, we need the flowdb from X1 replicated to stat1003.

Details

Reference
bz73047

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:51 AM
bzimport set Reference to bz73047.

Clarification: the replication can be to analytics-store, which is a db that stat1003 has access to. Stat1003 is just a box where people crunch numbers, not a db.

Clarification: currently stat1003 can access x1-analytics-slave, but it would be nice to have that db replicated to analytics-store so that Flow data can be joined with event logging data. Erik / Flow folks, please correct me if this is a bad assumption. Judging from the couple of example queries you guys showed me, this sounded necessary.

Krenair set Security to None.
Restricted Application added subscribers: Matanya, Aklapper. · Aug 17 2015, 3:49 PM

This is not a trivial task.

I would like to bring some order to all the production and analytics db boxes so that we can provide you with a quality service. There is some incoming hardware in production, plus some logical reorganization pending on the Flow servers, so I would like to delay this request for a bit while we refactor the internal architecture.

Are you ok to wait while we make a plan ourselves? That hopefully would mean faster resources for you guys.

@jcrespo, that's ok with me, and this task has waited a while already. If there's a good way forward, it seems worthwhile. Let us know if you'd like to plan the reorganization together.

jcrespo added a comment. Edited · Aug 17 2015, 5:48 PM

@Milimetric It would be great to meet at some point with several of you. I also have some ideas to improve the service, but I need feedback on whether they are worth investing time in.

@jcrespo, feel free to schedule a meeting. I'm in EST but I'm flexible, and my calendar is up to date.

Krinkle moved this task from Triage to Backlog on the DBA board. Sep 23 2015, 4:27 AM
Krinkle moved this task from Backlog to Triage on the DBA board. Sep 23 2015, 7:02 AM
Neil_P._Quinn_WMF renamed this task from Replicate flowdb from X1 to stat1003 to Replicate flowdb from X1 to analytics-store. Oct 13 2015, 6:26 PM
jcrespo claimed this task. Oct 29 2015, 1:52 PM
jcrespo moved this task from Triage to In progress on the DBA board.
jcrespo added a subscriber: Springle.

I am trying this now with the resources we have. I cannot guarantee how well it will work, but I cannot continue blocking this for so long :-/

Change 249743 had a related patch set uploaded (by Jcrespo):
Addind needed files to setup research access to a flow replica

https://gerrit.wikimedia.org/r/249743

Change 249743 merged by Jcrespo:
Addind needed files to setup research access to a flow replica

https://gerrit.wikimedia.org/r/249743

I've added flowdb to analytics-store. I cannot guarantee how well it will work, since x1 traffic is very "particular".

Let's assume it is in "beta", and you can give me feedback about how it works (I will also monitor its lag and performance).

I have granted access to the user "research"; please tell me if that is enough. Test it and I will close the ticket afterwards.

@matthiasmullie might be interested in this, not sure.

Change 250392 had a related patch set uploaded (by Jcrespo):
Monitor x1 replication on dbstore hosts

https://gerrit.wikimedia.org/r/250392

Change 250392 merged by Jcrespo:
Monitor x1 replication on dbstore hosts

https://gerrit.wikimedia.org/r/250392

jcrespo closed this task as Resolved. Nov 2 2015, 11:49 AM