Description

Now that we run staged dumps (i.e. stubs for all wikis, then tables for all wikis, and so on), it takes a while for a full dump run to complete. Folks will want the dump files available sooner rather than later, so we should copy them over sooner instead of waiting for the full run to finish. This may require rethinking the space available in labs, so that per wiki we can keep the last known good full dump, possibly a more recent partial dump, plus the current files being copied over.
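To make the retention scheme concrete, here is a minimal sketch of the "keep the last known good full dump plus anything newer" selection for one wiki. It is illustrative only: the dated run directories, the status.txt file name, and the 'done' marker are assumptions for this sketch, not the actual dumps layout.

```python
import os

def runs_to_keep(wiki_dir, status_name="status.txt"):
    """Pick which dump run directories to keep for one wiki.

    Assumes (hypothetically) that each run lives in a dated subdirectory
    containing a status file whose first line is 'done' for a complete run,
    or something else ('partial', 'in-progress') otherwise.
    """
    runs = sorted((d for d in os.listdir(wiki_dir) if d.isdigit()), reverse=True)
    keep = []
    for run in runs:  # newest first
        try:
            with open(os.path.join(wiki_dir, run, status_name)) as fh:
                state = fh.readline().strip()
        except OSError:
            continue  # no status file yet, skip it
        keep.append(run)  # newer partials and the run currently being copied
        if state == "done":
            break  # last known good full dump: keep it and stop
    return keep
```

Anything older than the last good run could then be pruned to free space in labs.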
| Status | Subtype | Assigned | Task |
| --- | --- | --- | --- |
| Resolved | | ArielGlenn | T107750 Make dumps run via cron on each snapshot host |
| Resolved | | ArielGlenn | T107757 staged dumps implementation |
| Resolved | | ArielGlenn | T108077 copy partial dumps from dataset host to labs |
Event Timeline
Coren, I've added you on this so we can chat about space available in labs for the dumps copy.
About 3x one full run, to be on the safe side. One run these days takes (guesstimate) 2.5T, so 3 × 2.5T ≈ 7.5T; call it 8T to be safe. I forget what the last round of negotiations landed us with; what have we got allocated now?
That's... not an issue. :-) Since we moved to labstore1003, there is some 40T available for dumps (with the caveat that this lives on media that is not otherwise backed up or very redundant, under the presumption that it holds only copies of data).
Changes to list-last-n-good-dumps coming up, yet to be tested. See https://gerrit.wikimedia.org/r/234973
Tested and merged. https://gerrit.wikimedia.org/r/#/c/234982/ is the change to generate the list of the last three good dumps and use that for rsync; also merged. We should see the new behavior tomorrow when the new dump run starts.
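For illustration only, a rough Python sketch of the idea behind generating a "last N good dumps" list and feeding it to rsync. The directory layout, status file, wiki list, destination, and function names here are all hypothetical; the actual behavior is whatever the merged gerrit changes above implement.

```python
import os
import subprocess

def last_n_good(dump_root, wiki, n=3, status_name="status.txt"):
    """Newest n run directories for a wiki whose (hypothetical) status
    file reports the run as 'done'."""
    wiki_dir = os.path.join(dump_root, wiki)
    runs = sorted((d for d in os.listdir(wiki_dir) if d.isdigit()), reverse=True)
    good = []
    for run in runs:
        try:
            with open(os.path.join(wiki_dir, run, status_name)) as fh:
                if fh.readline().strip() == "done":
                    good.append(os.path.join(wiki, run))
        except OSError:
            continue
        if len(good) == n:
            break
    return good

def rsync_good_runs(dump_root, wikis, dest, listfile="/tmp/dumps-to-sync.txt"):
    """Write one combined file list and hand it to rsync --files-from."""
    with open(listfile, "w") as out:
        for wiki in wikis:
            for path in last_n_good(dump_root, wiki):
                out.write(path + "\n")
    # -r is needed because --files-from disables the recursion implied by -a
    subprocess.check_call(
        ["rsync", "-a", "-r", "--files-from=" + listfile, dump_root, dest]
    )
```

Invocation would look something like `rsync_good_runs("/path/to/public/dumps", ["enwiki", "dewiki"], "labstore-host::dumps")`, with all of those names made up for the example.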