Access to dumps servers
Closed, Resolved (Public)

Description

I'm working on T199252, which concerns generation and distribution of sitemaps. It appears at this point that the server that hosts dumps (either dumpsdata1001.eqiad.wmnet or labstore1006.wikimedia.org, I believe, based on puppet roles) is probably the best place for these files to end up. However, I don't currently have access to either machine.

Could someone please set me up with appropriate groups to access these machines?

@ArielGlenn is probably the person to approve this, and/or to tell me if I'm misunderstanding the puppet configs and these are actually the wrong machines.

Event Timeline

Restricted Application added a subscriber: Aklapper.

It's the labstore boxes you want, either 1006 or 1007 depending. Maybe you just want to make the file available and ask someone to drop it into the right location? That would likely be someone on the WMCS team.

fgiunchedi triaged this task as Medium priority. Aug 7 2018, 9:51 AM

FWIW, I would prefer general access instead of having to ask someone to move files for me. There are a number of open Phab tickets that request sitemap generation for different wikis, including on an ongoing basis, and I expect that this work is going to result in a general solution. I don't know if we will actually end up automating these runs going forward, but having the ability to do so if we find that sitemaps improve our search engine indexing would be extremely helpful.

> I don't know if we will actually end up automating these runs going forward, but having the ability to do so if we find that sitemaps improve our search engine indexing would be extremely helpful

> I expect that this work is going to result in a general solution

Allow me to strongly suggest using puppet for that, not because access cannot be provided, but because deploying the steps through puppet leaves a trail of what was done, so it can later be improved, automated, and properly maintained. If access is just granted, custom solutions tend to be forgotten and become unmaintained or abandoned. For example, running scripts from mwmaint is far from ideal: most of those that run for over a week should be puppetized for accountability. In your case, if your plan is to set up a cron and a script to run, puppet is the right way to deploy it (even if it is a quick and dirty one for a specific solution).

Please ignore me if I am misunderstanding what you want to do.

@jcrespo Right, obviously this would end up in puppet if it were something that we were going to do as more than a one-off. But even when putting something into puppet, not being able to see whether it's working right is awkward at best, and actively counterproductive at worst. All I'm asking for is the access that I need to be able to do that.

bd808 added subscribers: Bstorm, bd808.

@Bstorm can you work with @Imarlier to get him access to labstore1006/7 (this may require a new user security role for these hosts) and show him where to put things so that the dumps.wikimedia.org vhost can serve them up?

@Imarlier do you need sudo or just login access to the server? Also, should everyone in the perf-team group be included in that?

Change 451394 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] dumps: give access to perf-team

https://gerrit.wikimedia.org/r/451394

@Bstorm I'd say all of perf-team should be added.

As for sudo, without knowing more about how the box is set up it's hard for me to say for sure, but I'm guessing that it would be helpful. In particular, being able to CRUD files within the dumps vhost (or at least a subpath thereof) would be necessary.

In that case I'll merge the patch we figured was probably the right answer here. Since the web function can move between the two servers for failover, we are giving access to both labstore1006 and 7.

Change 451394 merged by Bstorm:
[operations/puppet@production] dumps: give access to perf-team

https://gerrit.wikimedia.org/r/451394

Ok, you are good-to-go on access. The dumps are served out of /srv/dumps/xmldatadumps/public to the web.

Each of these boxes is also an NFS server, since they both have that role for different clients. Be gentle. Our NFS servers are an excellent source of icinga pages, in general. 😅
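
As an illustration of "being gentle" when dropping files under the served directory, here is a minimal Python sketch, not anything deployed on these hosts. It writes the content to a temporary file in the destination directory and then renames it into place, so the web server never serves a half-written file. Only the `/srv/dumps/xmldatadumps/public` path comes from this thread; the `publish_file` helper, the `sitemaps` subdirectory, and the file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Path quoted in this thread. The "sitemaps" subdirectory used in the usage
# note below is a hypothetical example, not an agreed location.
PUBLIC_ROOT = Path("/srv/dumps/xmldatadumps/public")


def publish_file(content: bytes, dest: Path) -> Path:
    """Write content to dest via a temp file in the same directory, then rename.

    os.replace is atomic on POSIX when source and destination are on the same
    filesystem, so readers see either the old file or the new one, never a
    partial write.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.chmod(tmp_name, 0o644)  # mkstemp creates the file as 0600
        os.replace(tmp_name, dest)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return dest
```

Under those assumptions, a call would look like `publish_file(xml_bytes, PUBLIC_ROOT / "sitemaps" / "sitemap-index.xml")`. One write and one rename per file also keeps the I/O on the box small.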

Perfect -- I'm in! And I shall be exceedingly kind to these boxes :-)

Thanks much.

Imarlier assigned this task to Bstorm.

So, just to understand for myself: this task aside, the sitemaps.wikimedia.org site is now hosted on misc_static_sites, and not on the dumps servers or these labs servers?

@Dzahn sitemaps.wikimedia.org is hosted on misc_static_sites.

Dumps servers would not work because the production front-end Varnish servers need to be able to directly route to the host where sitemaps live, and that couldn't happen if they were on dumps. I presume the same would be true with any other lab servers due to network security policies, but that's not an area I know a ton about.

> any other lab servers due to network security policies

Ok thanks, honestly I don't understand how it is/was related to labs at all.

I do understand that Varnish needs to be in front of it and I saw the list of requirements on T202910#4538651.

Those are all fulfilled by using the existing releases-servers and they are already made for uploads of files.

Let me try to resolve all of that as part of T202910.

So technically this access here is not needed and maybe should be reverted for good measure if it's not used?

I think that's a good idea, personally. I'll roll it on back.
https://gerrit.wikimedia.org/r/c/operations/puppet/+/455902

Unless we have any objections soonish :)