Move piwik's database to db1108 (eventlogging slave)
Closed, Declined (Public)

Description

We are currently running a mysql instance on bohrium, alongside Piwik's PHP code. A better strategy would be to decouple the two, for example by moving the database to db1108 (the Eventlogging slave), which is currently underutilized.

This would mean:

  1. Stop Piwik, take a snapshot of the database, copy it to db1108
  2. Add credentials for Piwik to db1108
  3. Move the bacula backup to db1108
  4. Start Piwik again
  5. Mysql cleanup
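
The steps above can be sketched as a dry-run plan. This is only an illustration, not the procedure that was actually used: the database name `piwik`, the database user `piwik`, the dump file name and the use of short host names in the grants are all assumptions, and the script only prints commands, it never runs them. Steps with no obvious single command are left without one on purpose.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    description: str
    commands: Tuple[str, ...]  # printed for the operator; this script never runs them


def build_plan(db_name: str, src_host: str, dst_host: str, db_user: str) -> List[Step]:
    """Return the five steps from the task description as an ordered plan."""
    dump = f"{db_name}.sql"  # hypothetical dump file name
    return [
        Step(
            f"Stop Piwik, take a snapshot of the database, copy it to {dst_host}",
            (
                f"[{src_host}] (stop the web server serving Piwik; service name not assumed)",
                # --databases adds CREATE DATABASE / USE statements to the dump
                f"[{src_host}] mysqldump --single-transaction --databases {db_name} > {dump}",
                f"[{dst_host}] mysql < {dump}  (after transferring the dump file)",
            ),
        ),
        Step(
            f"Add credentials for Piwik to {dst_host}",
            (
                # '<secret>' is a placeholder, not a real password
                f"[{dst_host}] CREATE USER '{db_user}'@'{src_host}' IDENTIFIED BY '<secret>';",
                f"[{dst_host}] GRANT ALL PRIVILEGES ON `{db_name}`.* TO '{db_user}'@'{src_host}';",
            ),
        ),
        Step(f"Move the bacula backup to {dst_host}", ()),
        Step("Start Piwik again", ()),
        Step(f"MySQL cleanup on {src_host}", ()),
    ]


def render(plan: List[Step]) -> str:
    """Format the plan as a numbered checklist with indented commands."""
    lines = []
    for number, step in enumerate(plan, start=1):
        lines.append(f"{number}. {step.description}")
        lines.extend(f"     {command}" for command in step.commands)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_plan("piwik", "bohrium", "db1108", "piwik")))
```

Keeping the plan as data makes the ordering explicit (Piwik stays stopped from step 1 until step 4), which is the part of the procedure that matters most for consistency of the copied data.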

Event Timeline

elukey triaged this task as Medium priority. (Jun 26 2018, 2:14 PM)
elukey created this task.

The main question remains security, since as far as I can see Piwik does not support TLS encryption/authentication to the database. The connection will be within eqiad, more or less similar to what is currently happening with Eventlogging, but nonetheless private data would be sent in cleartext across the DC.

@Muehlenhoff: in light of what we discussed in Prague, do you have a suggestion about what's best to do?

The current state is that DBAs don't support Piwik and, as a consequence, we will not support db1108 and db1107 at all if the two are mixed together. You may think we already don't support the eventlogging database, but you may be unaware of the many things we do that are transparent to you and that you would now have to take charge of.

Makes sense. Is there anything that we can do to get Piwik into a good state so that it can be hosted on db1108?
I am fully aware of what you guys are doing, and I don't underestimate it; this is why I opened a task to ask for a wider and broader opinion on the subject :)
It is very likely that the new foundation's website will need Piwik support from Analytics, and hosting the mysql instance on bohrium is not the best option at the moment.

No, the issue is that we don't support Piwik. The only way to support it is if we (you?) got a third DBA budgeted specifically to handle such infrastructure, or if I got word from my manager that this is more important than my current project (having working backups of our data).

The rest is me overreaching, so feel free to ignore me, but if I had to support it, I wouldn't add it to db1108 anyway (and adding it to a replica is by itself a really bad idea). You want to set up high availability at the mysql layer, with a couple of proxies in between. Adding it to an existing service, in my opinion, and I could be wrong, will only leave you with 2 problems instead of one (instability on _both_ Piwik and eventlogging). This is consistent with lessons we learned in other parts of the infrastructure, and it informed other changes you may have heard about from unrelated services: we are moving to multi-instance and to non-monolithic architectures with separation of concerns precisely because of that. Let me insist that this is my own opinion, and I don't have any problem with you, on your own, experimenting with your own way of doing things (e.g. adding it to a supported service), as long as we are no longer involved with it, don't have to handle its alerts and crashes, and it basically doesn't create any other work for us.

I don't mean this badly or in a mean way, but I think db1108 would be the second-worst place to host it, ahead only of the wikireplica service/toolsdb. My recommendation would be to purchase 2 machines "analyticsdb[12]XX[12]" and 2 proxies per datacenter (eqiad and codfw) and host it there (we still wouldn't support it, so this is just advice on what I _would_ do if I were you). My managers and colleagues have advised many times against sharing and reusing hw just because it looks available, and whenever I dismissed that advice, they turned out to be right, so I don't think I would be alone in this line of thinking.

We (as Analytics) have been supporting Piwik as a best-effort project, and Manuel helped in the past when there were issues with db outages (which ended in restoring a backup). I learned a ton of things and should now be able to handle the load without much supervision; plus, Piwik is not a service that needs a ton of maintenance.

I understand the scalability concerns, but for a little service that has to handle a few super small websites it seems a bit of an overkill. We (as Analytics) need to support these small use cases that are not in the scope of Eventlogging (ingestion of events, processing, quick dashboarding etc. for a low volume of requests), and Piwik is the best compromise we have found so far (happy to discuss alternatives).

db1108 is highly underutilized, and having it down for any reason would only mean that the Eventlogging replica is not functioning well, not the entire Eventlogging storage (we don't even fail over from db1107 to db1108 anymore in case of failures). I agree that a proxy with two databases in master/standby would be ideal, but Piwik doesn't seem to be a use case for that.

Anyhow, I understand that this task is probably going nowhere, so I'll back off and keep going with the mysql instance on bohrium.

Maybe we should try to reach a consensus here.
I do believe the Analytics Team is aware of all the work we do behind the scenes on the existing analytics hosts and the help we provide.

Also, I can see the point of trying to get Piwik out of bohrium and maybe using the underutilized Eventlogging servers.

On the other hand, it is well known that DBAs do not support Piwik, and Jaime's point that having Piwik share resources with Eventlogging could make that service less stable is a good one. I am not aware of any recent issues with Piwik apart from some months ago, when we had to restore it from backups because of a storage crash.

Maybe moving EventLogging to multi-instance could be a way of advancing on this topic, so we could have both things isolated (isolated on the same server, of course).
This would require lots of puppet work from the Analytics side to adapt mariadb::core_multiinstance to all the eventlogging-specific things, though, which is something for Analytics to decide whether they want to spend time on.

I would definitely like to work on making the multi-instance puppet code available for Analytics, so if this solution would be a compromise, I can schedule time for it during this quarter :)

Declining this for the moment; it doesn't seem like a good way to go.