
Move tendril to a VM
Closed, Declined · Public

Description

Tendril is currently on neon (no introduction needed) and since the process is just a web front end to a db backend, it should be fine (and maybe better off) in its own VM.

It could also share krypton, which hosts other PHP applications.

Event Timeline

JohnLewis raised the priority of this task from to Needs Triage.
JohnLewis updated the task description.
JohnLewis added projects: Operations, DBA.
Restricted Application added a subscriber: Aklapper. Oct 29 2015, 12:11 PM

I am not ok with this task.

jcrespo added a comment. Edited Oct 29 2015, 1:32 PM

This task is not blocking anything, so it is not immediately actionable.

By the time it is needed, tendril may have been converted into something else (grafana + scripts).

It handles *very private data*, so it should be part of the production network. I do not think there is infrastructure there yet for handling VMs. Because of firewall and network restrictions, it might not even be possible now. If something is implemented in production, it will be containers, not VMs.

I do not see a reason to change something that is not broken. There should be a real reason to create overhead, better than "it should be fine". This is not a trivial change. It should be justified why it is needed; I shouldn't need to justify why it is not needed. There are no resources to do this now, and only a few people can do it. When it is needed, we can discuss the best way to proceed. This is taking very few resources from neon, so I do not see why it should be actionable now.

Justifying this has also already taken time I could have spent fixing more important issues, such as production errors.

It handles *very private data*, so it should be part of the production network. I do not think there is infrastructure there yet for handling VMs. Because of firewall and network restrictions, it might not even be possible now. If something is implemented in production, it will be containers, not VMs.

Production already has virtualisation with VMs: it's Ganeti. To put it into perspective, it runs several production sites as well as 'key' infrastructure like the mail servers (mx*). In theory they're treated like production hardware, so the firewall is not an issue at all.

I do not see a reason to change something that is not broken. There should be a real reason to create overhead, better than "it should be fine". This is not a trivial change. It should be justified why it is needed; I shouldn't need to justify why it is not needed. There are no resources to do this now, and only a few people can do it. When it is needed, we can discuss the best way to proceed. This is taking very few resources from neon, so I do not see why it should be actionable now.
Justifying this has also already taken time I could have spent fixing more important issues, such as production errors.

That actually describes a lot of operations bugs. Things can't be rejected because 'they're not worth it now' or 'we gain nothing now'. Take bugs like moving other services to VMs: zirconium, as a key example, was the first hardware misc server to have all of its services converted to VMs. Nothing was gained; it happened only because an opsen agreed to give some time to work with a volunteer, puppetise things and so on.

In the long run the arguments of 'not worth it' or 'nothing is broken' are not exactly strong.

Also, the understanding you gave on IRC is that this is just a static code front end, so I don't understand why this is a complex task. All the data is on a db machine, so adjusting firewalls and grants is all I see as necessary, unless there is more that is not in puppet.

jcrespo added a comment. Edited Oct 29 2015, 2:39 PM

Things can't be rejected

I have not rejected it; I said I do not agree with it and that I am not going to work on it. Good luck convincing someone to work on it and, even if you do, convincing me that this will not create downtime, which would force me to block it.

chasemp triaged this task as Lowest priority. Oct 29 2015, 3:36 PM
chasemp set Security to None.
Dzahn added a comment. Oct 29 2015, 5:05 PM

Hey, nobody ever said "labs". This has always just been about a move within the production network from one host to another.

He was just trying to find something to move off of neon because neon is kind of overloaded.

We also moved other misc. services to (production) VMs, so I don't see this suggestion as unreasonable.

I have not rejected it; I said I do not agree with it and that I am not going to work on it. Good luck convincing someone to work on it and, even if you do, convincing me that this will not create downtime, which would force me to block it.

ori closed this task as Declined. Oct 29 2015, 5:18 PM
ori claimed this task.
ori added a subscriber: ori.

We have one DBA. However much Neon is overloaded and is a SPOF, it is less overloaded and less of a SPOF than Jaime is. The question we should be asking is how to move tasks off of Jaime rather than neon. We should help him stay sane by keeping the set of open #database tasks restricted to things that are urgent and actionable. Let's not re-open this (at least) until we have a second DBA.