Package the discovery teams dashboards
Closed, Resolved · Public

Description

Put together something that makes the dashboards easy for someone to start hacking on from nothing, and makes the instance we use to show the world easily re-creatable.

This was originally to be a puppet role, but on further guidance we are setting up a vagrant role and a simple shell script that provisions as necessary. This will probably end up as a repository that can be cloned either to the "prod" (labs instance) server or a development box. On the "prod" box the provision script is run directly. On a dev box vagrant up would boot up a machine similar enough to the one in labs for development.
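As a rough illustration of the "one repository, two targets" idea described above, a provisioning script could begin by working out where it is running. This is only a sketch: it assumes the dev box uses Vagrant's default /vagrant synced folder, and /srv/dashboards is a hypothetical clone location on the labs instance, not a path taken from the actual patch.

```shell
#!/bin/sh
# Sketch only: tell a Vagrant dev box apart from the labs ("prod") instance.
set -e

# $1: directory to probe. Defaults to /vagrant, Vagrant's default synced folder.
detect_env() {
    if [ -d "${1:-/vagrant}" ]; then
        echo "vagrant"
    else
        echo "labs"
    fi
}

# $1: environment name as printed by detect_env.
repo_dir() {
    case "$1" in
        vagrant) echo "/vagrant" ;;
        *)       echo "/srv/dashboards" ;;   # hypothetical clone location on labs
    esac
}

env_name="$(detect_env)"
echo "environment: $env_name"
echo "repository:  $(repo_dir "$env_name")"
```

The rest of the script can then use the same steps in both places and branch only where the two environments genuinely differ.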

See: http://searchdata.wmflabs.org
See: http://www.rstudio.com/products/shiny/download-server/

Ironholds assigned this task to EBernhardson.
Ironholds added a subscriber: Ironholds.
Restricted Application added a subscriber: Aklapper. Jun 18 2015, 7:41 PM
Deskana set Security to None.

The puppet that runs in labs is the regular production puppet, which is probably not a good place for this. If we really want to puppetize this, I think I'll build it out to run as part of labs-vagrant.

Also, the existing packaging for shiny and shiny server leaves a lot to be desired...

After looking over all this stuff, I'm not sure there is enough benefit to puppetizing this at the moment to go through all the legwork. R and shiny have what most would not even consider the bare minimum of packaging.

Since the original problem was just that shiny doesn't start on boot, we should just fix that. I'm not sure which server this is on, though. I have access to the analytics group in wikitech, but searchdata.wmflabs.org is not a proxy created within the analytics group.

Change 221827 had a related patch set uploaded (by EBernhardson):
Create puppet role for discovery dashboards

https://gerrit.wikimedia.org/r/221827

@Joe @Ottomata Can one of you review the above patch when you have a chance? Thank you!

ksmith added a subscriber: ksmith. Jul 21 2015, 5:59 PM

@Joe @Ottomata: Ping. Will one of you have a chance to get to this soon, and if not, is there anyone who could? The patch has been "maturing" for a few weeks now. Thanks!

EBernhardson changed the title from "Create puppet manifest for Shiny" to "Package the discovery teams dashboards". Aug 12 2015, 4:54 AM
EBernhardson edited the task description.

@Joe @Ottomata Re-ping. What can we do to get this moving again?

EBernhardson added a comment. Edited Aug 12 2015, 8:42 PM

Per advice on the git reviews against puppet, I've started up a repository at wikimedia/discovery/dashboard to act as a container which will submodule the repos that hold the actual dashboarding code. This will integrate with Vagrant directly instead of through mediawiki-vagrant. Additionally, it will provision via shell script rather than puppet.

I'm trying to make that shell script provision properly on a labs instance and in vagrant. I haven't put the new incarnation up for review yet because it doesn't start the daemon under some circumstances I haven't figured out.

Change 231458 had a related patch set uploaded (by EBernhardson):
Provision discovery team dashboards

https://gerrit.wikimedia.org/r/231458

Change 221827 abandoned by EBernhardson:
Create puppet role for discovery dashboards

Reason:
abandoned in favor of I0003eb5a

https://gerrit.wikimedia.org/r/221827

Change 231458 had a related patch set uploaded (by JanZerebecki):
Provision discovery team dashboards

https://gerrit.wikimedia.org/r/231458

JanZerebecki added a comment. Edited Aug 18 2015, 12:22 PM

(as a completely disinterested 3rd party with no weight in this discussion but just sharing my experiences)

I think using MWV for things that aren't MW is... unideal for a lot of reasons. Wikimetrics suffered from this heavily. The biggest reason is that any break in the core MW puppet stuff (due to composer or any other reason) causes your unrelated-to-MW thing to have problems as well. Puppet is also overkill for a lot of these things. And new services should probably run in Jessie in labs, and MWV pins to Trusty.

I've switched to setting up per-project Vagrant setups with Bash (https://github.com/wiki-ai/ores for ORES and https://github.com/wikimedia/analytics-quarry-web for quarry, one in progress somewhere for wikimetrics) and it has been fairly painless.

(This is not a -1 nor a 'OMG MWV sucks!' but a 'this might be more complications than you need, be careful')

Does operations do security support and automated upgrades for VMs built using a bash script?

I think bash is always worse than puppet.

Please add a module to operations/puppet.git so that by simply selecting it in labs you can build a new VM with it. If you want to use it also from Vagrant you can make your Vagrant git repository share submodules with operations/puppet.git .

(as a completely disinterested 3rd party with no weight in this discussion but just sharing my experiences)

I think using MWV for things that aren't MW is... unideal for a lot of reasons. Wikimetrics suffered from this heavily. The biggest reason is that any break in the core MW puppet stuff (due to composer or any other reason) causes your unrelated-to-MW thing to have problems as well. Puppet is also overkill for a lot of these things. And new services should probably run in Jessie in labs, and MWV pins to Trusty.

I've switched to setting up per-project Vagrant setups with Bash (https://github.com/wiki-ai/ores for ORES and https://github.com/wikimedia/analytics-quarry-web for quarry, one in progress somewhere for wikimetrics) and it has been fairly painless.

(This is not a -1 nor a 'OMG MWV sucks!' but a 'this might be more complications than you need, be careful')

Does operations do security support and automated upgrades for VMs built using a bash script?

No. What security problems could your labs instance have?

I think bash is always worse than puppet.

Agreed!

Please add a module to operations/puppet.git so that by simply selecting it in labs you can build a new VM with it. If you want to use it also from Vagrant you can make your Vagrant git repository share submodules with operations/puppet.git .

As explained further above, Operations is not willing to provide support for this element. So, no, it's not going to make it into puppet. The abandoned patch was a patch to try and get it into puppet. You can go look at it to see the response it got.

As explained further above, Operations is not willing to provide support for this element. So, no, it's not going to make it into puppet. The abandoned patch was a patch to try and get it into puppet. You can go look at it to see the response it got.

No, it was written in puppet, but in the wrong repository. It was proposed in mediawiki/vagrant not in operations/puppet.

Dzahn added a subscriber: Dzahn. Aug 18 2015, 2:46 PM

What Jan said, it was in the wrong repo.

Dzahn added a comment. Aug 18 2015, 2:48 PM

I think bash is always worse than puppet.

Agreed!

Agreed again

Missed adding @yuvipanda to the discussion as I quoted him above.

EBernhardson added a comment. Edited Aug 18 2015, 4:12 PM

The problem with operations/puppet is it only solves half the problem. There are two issues to solve for here:

  1. Install and hack on dashboards on a developer's machine
  2. Deploy dashboards to a public server

Putting the code into operations/puppet does nothing for 1 and basically means doing this all twice. Additionally, I think 1 is a much more common use case than 2.

You don't need to do anything for 1, the Shiny R packages contain the ability to local-launch an instance.

That doesn't do everything though: there are .debs to install for dependencies, and there are R packages to compile. The process of setting up a dev environment goes from "vagrant up" to "run these 10 shell commands and hope it works out".
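To make the dependency steps mentioned here concrete, the following sketch prints the kind of commands a provisioning script would wrap. The package lists are illustrative assumptions (r-base and the shiny R package), not the dashboards' actual requirements, and the CRAN mirror URL is just one possible choice.

```shell
#!/bin/sh
# Sketch only: the package lists below are assumptions for illustration.
set -e

DEB_PACKAGES="r-base"   # assumed system (.deb) dependencies
R_PACKAGES="shiny"      # assumed R packages to compile and install

# Emit one shell command per line so the plan can be reviewed before running.
provision_plan() {
    echo "apt-get update"
    echo "apt-get install -y $DEB_PACKAGES"
    for pkg in $R_PACKAGES; do
        echo "Rscript -e 'install.packages(\"$pkg\", repos = \"https://cran.r-project.org\")'"
    done
}

# Print the plan by default. To apply it, pipe the output to a root shell.
provision_plan
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere; a real script would run the commands directly.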

I think it's totally OK to use a simple bash provisioning script instead of puppet for things like this. And as @Ironholds said, ops isn't supporting these anyway, so the people supporting it (aka @EBernhardson) should use what they find most convenient... This can change if / when they request Ops to support it.

Even for simple bash scripts I don't think it is OK to use that instead of puppet. Anyway, the bash script in question is not simple, unlike the scripts you linked as an example.

What ops supports for instances in labs using puppet is that the instance gets regular puppet runs and thus is maintained with updates. right?

I was asked to provision another instance with the same tools but different metrics, see T108404. I was not aware that @EBernhardson is willing to manually maintain these and manually update the instances. Are you?

Putting the code into operations/puppet does nothing for 1 and basically means doing this all twice. Additionally, I think 1 is a much more common use case than 2.

No, you only need to do it once, even when solving both use cases. If you want to use it also from Vagrant you can make your Vagrant git repository share submodules with operations/puppet.git .

Ironholds added a comment. Edited Aug 18 2015, 7:14 PM

Even for simple bash scripts I don't think it is OK to use that instead of puppet. Anyway, the bash script in question is not simple, unlike the scripts you linked as an example.

What ops supports for instances in labs using puppet is that the instance gets regular puppet runs and thus is maintained with updates. right?

I was asked to provision another instance with the same tools but different metrics, see T108404. I was not aware that @EBernhardson is willing to manually maintain these and manually update the instances. Are you?

Within Discovery, Mikhail and myself handle the ongoing maintenance of the dashboards; this includes the updates. This is not a task the Cirrus engineers work on directly.

I'm honestly not sure whether "I was not aware that @EBernhardson is willing to manually maintain these and manually update the instances. Are you?" is sarcastic or not - if it is not, no, it is not the responsibility of Discovery engineers to maintain software for other projects without agreement in advance, and that's not what the training and knowledge transfer we've agreed to constitutes. If it *is* sarcastic - sarcasm is almost never the right answer and certainly isn't here. We're all trying to make sure dashboard instances are fully provisioned and as easy to set up as possible. Language and attitude aimed at other users serves only to chill conversation, and reluctant engineers are likely to be less effective allies than enthusiastic ones.

Dzahn removed a subscriber: Dzahn. Aug 19 2015, 5:00 AM

That was not sarcasm. That was the task others asked me to do, seemingly based on a misunderstanding. I'll change the tasks accordingly.

EBernhardson added a comment. Edited Aug 19 2015, 3:44 PM

In terms of instance maintenance, I was planning on maintaining whichever scripts/puppetization goes through and updating it as necessary so it keeps our instance up to date. I'm not especially planning to maintain other people's instances; it is more the mw-v approach, where people maintain their own instances via use of the shared code. Anyone can fix whatever isn't working right in the shared code and benefit us all.

Tbh I'm not sure how to tie operations/puppet into a Vagrant instance, but I'll find some time and see how that ends up.

The question "Are you?" from my post above was directed at @yuvipanda, as he advocated a solution for which I don't know how the updates and other maintenance are supposed to happen automatically. (We talked about a way to do that in the context of the PaaS plans, but that is not implemented yet as far as I know.)

I'm not especially planning to maintain other people's instances

Thank you, that removes any doubt. That is why I was arguing for operations/puppet, as instances created with puppet receive some maintenance from operations.

Tbh I'm not sure how to tie operations/puppet into a Vagrant instance, but I'll find some time and see how that ends up.

For using puppet with vagrant see https://docs.vagrantup.com/v2/provisioning/puppet_apply.html , which is also the thing that Mediawiki Vagrant uses. I assumed one would use only submodules (i.e. a puppet module as its own git repository) from operations/puppet in Vagrant, but I have no experience with that so it could be that using the full operations/puppet is easier. That probably depends on how many dependencies from the module for this task to other modules there are and if those are already broken out as submodules.

When using the full operations/puppet in vagrant one cannot add the VM to its site.pp, so one needs another way to specify what roles/modules to apply in the VM. One way would be to use a manifests_path that is outside operations/puppet to have a different site.pp and point module_path to the one inside operations/puppet.
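Outside of Vagrant's own provisioner settings, the same split (a manifest kept outside the checkout, modules taken from inside it) can be expressed as a plain `puppet apply` invocation. The sketch below only builds and prints that command line; both default paths are hypothetical, and it assumes the checkout keeps its modules in a top-level modules/ directory.

```shell
#!/bin/sh
# Sketch only: both default paths are hypothetical examples.
set -e

# $1: path to a checkout of operations/puppet
# $2: path to a site manifest kept outside that checkout
puppet_apply_cmd() {
    echo "puppet apply --modulepath $1/modules $2"
}

# Print the command rather than running it, so it can be inspected first.
puppet_apply_cmd \
    "${PUPPET_CHECKOUT:-/srv/operations-puppet}" \
    "${LOCAL_MANIFEST:-/vagrant/manifests/site.pp}"
```

Whether the modules apply cleanly outside their usual environment is a separate question that this sketch does not address.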

[...] As instances created with puppet receive some maintenance from operations.

Not true at all, especially for labs! We require that puppet runs successfully all the time, which is the case whether you use puppet or not. Outside of that, having your own puppet code doesn't guarantee any maintenance from operations in any form or way.

For using puppet with vagrant see https://docs.vagrantup.com/v2/provisioning/puppet_apply.html , which is also the thing that Mediawiki Vagrant uses. I assumed one would use only submodules (i.e. a puppet module as its own git repository) from operations/puppet in Vagrant, but I have no experience with that so it could be that using the full operations/puppet is easier. That probably depends on how many dependencies from the module for this task to other modules there are and if those are already broken out as submodules.

When using the full operations/puppet in vagrant one cannot add the VM to its site.pp, so one needs another way to specify what roles/modules to apply in the VM. One way would be to use a manifests_path that is outside operations/puppet to have a different site.pp and point module_path to the one inside operations/puppet.

If you're going to attempt to run our operations/puppet.git in vagrant do be prepared to block out a week or so from your schedule... :) vagrant_lxc / labs-vagrant (which goes the other way) can work fairly easily in labs, however.

Even for simple bash scripts I don't think it is OK to use that instead of puppet. Anyway, the bash script in question is not simple, unlike the scripts you linked as an example.

What ops supports for instances in labs using puppet is that the instance gets regular puppet runs and thus is maintained with updates. right?

Regular puppet runs for general labs-wide updates happen whether your code lives in puppet or not.

I was asked to provision another instance with the same tools but different metrics, see T108404. I was not aware that @EBernhardson is willing to manually maintain these and manually update the instances. Are you?

My understanding is that Erik will maintain the instances he is responsible for, and if others want their own instances they will be responsible for their own instances. Similar to the labs-vagrant model.

No, you only need to do it once, even when solving both use cases. If you want to use it also from Vagrant you can make your Vagrant git repository share submodules with operations/puppet.git .

Good luck with convincing opsen to add more git submodules to operations/puppet :)

In general, I think operations/puppet for this is mowing a lawn with a tank, requiring certified tank operators (ops) to +2 every change. I far prefer the Discovery team building their own lawnmower, so at least someone (they!) can maintain it....

There is a module in operations/puppet called 'puppetception' that might be a fix to these, but unfortunately I do not have the time to work on that atm.

mpopov added a subscriber: mpopov. Aug 20 2015, 6:00 PM

Regular puppet runs for general labs-wide updates happen whether your code lives in puppet or not.

How does that work for instances which use e.g. puppet master self?

How does that work for instances which use e.g. puppet master self?

I'll prefix this by saying that you should only use those for testing patches. However, yes, they too have a cronjob running puppet every 20mins, and a different cronjob that tries to rebase the local git repo to production every hour (maybe?)

Not sure how that's relevant here?

How does that work for instances which use e.g. puppet master self?

I'll prefix this by saying that you should only use those for testing patches. However, yes, they too have a cronjob running puppet every 20mins, and a different cronjob that tries to rebase the local git repo to production every hour (maybe?)

Which makes it likely to break, which is why I agree that it should only be used for testing patches. So an instance that needs to be reliable and maintained for providing a service used by others needs to use a normal puppet master provided by operations using operations/puppet.

Not sure how that's relevant here?

The relevance is that you are suggesting not to do what we usually do and what AFAIK is the current consensus in operations (add a module to operations puppet and apply that to an instance). So I want to know if your new way of doing things fulfills the requirements:

  • automatically keeping up to date with the settings required to run in labs (which AFAIK is only done by using operations/puppet)
  • automatically updating (within a distro release) to e.g. get security updates
  • automatic setup of new instances from roles (or similar things) set for that instance
  • automatic configuration changes to follow changes done to the roles (or similar things) set for that instance
  • consensus from others in operations

(I suspect all this would be provided by T106475 once it is evaluated, integrated and ready to use, but that is not an option for now. When that becomes an option, migrating from normal labs usage of operations/puppet to using a puppet module to build an image is probably fairly low cost.)

I currently don't understand how the solution you are suggesting we use now fulfills any of these, could you explain?

I feel like there are a lot of things going on in this ticket that are above my head in terms of expectations, so I'd like to just note a few facts:

  1. puppet runs on all instances, all the time. This is what the labs team cares about for security updates, system changes, etc. As long as you don't explicitly disable this, you can do whatever you want with your instance and the labs team is happy. So security updates and other infrastructure level changes (new DNS! New LDAP! Whatever) are taken care of by this.
  2. just because something is in the operations/puppet.git repo doesn't mean it gets 'ops support'. 'Ops support' requires either someone with +2 who cares enough *and* has the time, or someone officially tasked with caring about it by management.

As far as I can see, the search and discovery team isn't getting any ops support for this dashboard (see #2 above) and they are OK with that. If the wikidata team wants ops support for its dashboard, I suggest either finding someone with +2 who cares enough, or taking it up to management to provide ops support for it.

https://gerrit.wikimedia.org/r/#/c/230928/ looks like a fairly nice way to replicate the vagrant setup across multiple hosts.

Joe added a comment. Aug 24 2015, 8:40 AM

So, I get to this very late, guilty as charged.

I am not sure I understand why we would object against using bash to provide software installation and all in labs.

Is this a production service that our main sites would connect to directly? Are the VMs hosting this going to have any special access to production, or host security-sensitive data like passwords?

If the answer is yes, this is not ok, but it is also not ok to keep this in labs

If the answer is no, I don't see any problem, as long as we keep puppet enabled to provide the base system up to date.

If the wikidata team wants ops support for its dashboard, I suggest either finding someone with +2 who cares enough, or taking it up to management to provide ops support for it.

I didn't say it needs full support from the operations team. I listed a few requirements that the current "add a module to operations/puppet and use it via wikitech" fulfills that I want, because I don't want to do more manual work than necessary. Do you want to tell me if your suggestion fulfills these requirements?

I am not sure I understand why we would object against using bash to provide software installation and all in labs.

Because someone else might end up needing to use / maintain / support it? That someone might be me.

Is this a production service that our main sites would connect to directly? Are the VMs hosting this going to have any special access to production, or host security-sensitive data like passwords?

If the answer is yes, this is not ok, but it is also not ok to keep this in labs

@yuvipanda Why do you _require_ something that is not ok in production instead of accepting something that is also ok in production? Note that one of the patches is written in puppet instead of bash, which is what you seem to argue against.

If the wikidata team wants ops support for its dashboard, I suggest either finding someone with +2 who cares enough, or taking it up to management to provide ops support for it.

I didn't say it needs full support from the operations team. I listed a few requirements that the current "add a module to operations/puppet and use it via wikitech" fulfills that I want, because I don't want to do more manual work than necessary. Do you want to tell me if your suggestion fulfills these requirements?

Those are your requirements, and not the discovery team's nor mine :) I don't know why I'm supposed to fulfill them nor the discovery team :)

@yuvipanda Why do you _require_ something that is not ok in production instead of accepting something that is also ok in production? Note that one of the patches is written in puppet instead of bash, which is what you seem to argue against.

So first, I'm not requiring anything. I'm a completely uninterested observer who is just trying to make clear that this is not going to get any ops support just because it is in ops/puppet.git. And this is never going to reach production. If we're going to apply 'production' standards to this, then the fact that this runs on R itself will be a much bigger issue than whether the small amount of code needed to provision this is in bash or puppet.

So can I state that my position on this is 'I think Ops really does not care if this is in operations/puppet.git or not, since we are not expected to support it, and hence it should be done in whatever way the people building it feel comfortable' and then disengage from this conversation?

Consider me added to the "I really don't care" category.

Discovery's interest in this is twofold; first, to provision Shiny in
a way we can reuse if lightning strikes the datacentre. Second, to
provision Shiny in a way that allows other teams and orgs to use it on
Labs infrastructure.

I've heard a lot of arguing over the "best" way of doing this. I've
not heard anyone argue that the existing way of doing things does not
solve for both problems. From a Discovery point of view this endless
back and forth is blocking cards that are, for all intents and
purposes, complete, and blocking a training session for a partner
organisation to boot.

I would very much appreciate it if people would just provision the
dashboards they want using the tools they are given and then, once
those dashboards exist and everyone is happy with them, /then/ we come
back round and harden them up. Because we have spent several weeks
going back and forth producing nothing of consumer value.

Change 231458 merged by OliverKeyes:
Provision discovery team dashboards

https://gerrit.wikimedia.org/r/231458

Deskana closed this task as "Resolved". Aug 25 2015, 8:11 PM

Done!