Rack and Setup (3) Logstash Servers
Closed, Resolved · Public

Description

Needs everything
Rack in b4, c7, d3
cable, label and asset tag
switch cfg
dns/dhcp
racktables

  • rob edit addition --

Going forward, we should have split tickets for onsite and software work, but this work is already done, so this task differs from the normal template:

  • - mgmt dns entries created/updated (both asset tag & hostname) - via this task by chris
  • - system bios and mgmt setup and tested - via this task by chris
  • - network switch setup (port description & vlan) - via this task by chris
  • - production dns entries created/updated (just hostname, no asset tag entry) - via this task by chris
  • - install-server module updated (dhcp and netboot/partitioning)
  • - install OS (note jessie or trusty) [done via this task when network sub-task(s) complete]
  • - accept/sign puppet/salt keys [done via this task post os-installation]
  • - service implementation [done via this task post puppet/salt acceptance]

Event Timeline

Cmjohnson claimed this task.
Cmjohnson raised the priority of this task from to Needs Triage.
Cmjohnson updated the task description.
Cmjohnson added projects: ops-eqiad, acl*sre-team.
Restricted Application added a subscriber: Aklapper. Apr 21 2015, 3:25 PM

@bd808 are you still taking the post-ops setup? I can certainly help. I know you're pretty crazy busy at this point.

Change 205859 had a related patch set uploaded (by Cmjohnson):
Adding dns entries for logstash1004-6 (https://phabricator.wikimedia.org/T96692)

https://gerrit.wikimedia.org/r/205859

Change 205859 merged by Cmjohnson:
Adding dns entries for logstash1004-6 (https://phabricator.wikimedia.org/T96692)

https://gerrit.wikimedia.org/r/205859

IPs have been set up

logstash1004 1H IN A 10.64.0.162
logstash1005 1H IN A 10.64.16.185
logstash1006 1H IN A 10.64.48.109

logstash1004 1H IN A 10.65.4.11
logstash1005 1H IN A 10.65.4.12
logstash1006 1H IN A 10.65.4.13
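
The records above can be sanity-checked before they are committed to a zone file. The following is a minimal sketch and not part of the original change: the helper names `parse_a_record` and `check_records`, and the choice to check against 10.0.0.0/8, are assumptions made for illustration.

```python
import ipaddress

# The six records from this comment, copied verbatim.
RECORDS = """\
logstash1004 1H IN A 10.64.0.162
logstash1005 1H IN A 10.64.16.185
logstash1006 1H IN A 10.64.48.109
logstash1004 1H IN A 10.65.4.11
logstash1005 1H IN A 10.65.4.12
logstash1006 1H IN A 10.65.4.13
"""


def parse_a_record(line):
    """Split 'name ttl class type address' into (name, IPv4Address)."""
    name, _ttl, rr_class, rr_type, address = line.split()
    if (rr_class, rr_type) != ("IN", "A"):
        raise ValueError(f"not an IN A record: {line!r}")
    return name, ipaddress.ip_address(address)


def check_records(text, network="10.0.0.0/8"):
    """Parse every record; fail on addresses outside `network` or reused."""
    net = ipaddress.ip_network(network)
    seen = set()
    parsed = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, addr = parse_a_record(line)
        if addr not in net:
            raise ValueError(f"{name}: {addr} is outside {net}")
        if addr in seen:
            raise ValueError(f"{name}: {addr} is assigned twice")
        seen.add(addr)
        parsed.append((name, addr))
    return parsed


if __name__ == "__main__":
    for name, addr in check_records(RECORDS):
        print(name, addr)
```

Each hostname appears twice on purpose (once per address block), so the check looks for duplicate addresses rather than duplicate names.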

Switch Configuration Completed
ge-4/0/3 up up logstash1004 in private1-a-eqiad
ge-4/0/1 up up logstash1005 in private1-b-eqiad
ge-3/0/25 up up logstash1006 in private1-d-eqiad

@bd808 are you still taking the post-ops setup? I can certainly help. I know you're pretty crazy busy at this point.

I was going to start on it this afternoon. I made T96814 yesterday to track the puppet changes that I think will be needed first. I will certainly poke you if I get stuck/sidetracked.

Change 205901 had a related patch set uploaded (by Cmjohnson):
Adding dhcpd entries for new logstash1004-6 (T96692)

https://gerrit.wikimedia.org/r/205901

Change 205901 merged by Cmjohnson:
Adding dhcpd entries for new logstash1004-6 (T96692)

https://gerrit.wikimedia.org/r/205901

The setup is complete and can be installed at any time. They are set to install jessie. Please let me know if you want to change to precise.

Cmjohnson reassigned this task from Cmjohnson to bd808. Apr 23 2015, 6:33 PM

Assigning this to @bd808. Removing the on-site project flag as the on-site portion of this ticket has been completed.

Cmjohnson set Security to None.

I have four Puppet patches for this:

Once these are live on the prod cluster I can move the current Elasticsearch indices over to the new hosts. When that is done there is one more Puppet patch:

That final patch will convert the current Elasticsearch nodes on logstash100[1-3] to client only mode and reduce their resource consumption.
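
The Puppet patches themselves are not shown in this task, so the following is only a rough sketch of one common way to do this kind of migration, not necessarily what was done here. It assumes shards are drained with Elasticsearch's allocation filtering (`cluster.routing.allocation.exclude._host`, matching nodes by hostname) and that "client only" means a node that is neither master-eligible nor a data node. The function name `drain_request` and the short hostnames are illustrative assumptions; nothing is sent to a cluster.

```python
import json

# Hosts that should stop holding index data (assumed to match by short hostname).
OLD_NODES = ["logstash1001", "logstash1002", "logstash1003"]


def drain_request(hosts):
    """Build the body for PUT /_cluster/settings that moves shards off `hosts`."""
    return {
        "transient": {
            "cluster.routing.allocation.exclude._host": ",".join(hosts),
        }
    }


# Node-level settings (elasticsearch.yml) for a client-only node:
# it holds no data and cannot be elected master.
CLIENT_ONLY_NODE = {"node.master": False, "node.data": False}


if __name__ == "__main__":
    print(json.dumps(drain_request(OLD_NODES), indent=2))
    print(json.dumps(CLIENT_ONLY_NODE, indent=2))
```

Draining first and switching the old nodes to client-only afterwards keeps every shard on a data node throughout the move.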

RobH claimed this task. Apr 23 2015, 7:17 PM
RobH added a comment. Apr 23 2015, 9:36 PM

Argh, this task is a mess... In the future let's use the template laid out on https://wikitech.wikimedia.org/wiki/Phabricator#Hardware.2FServer_Setup_.2F_Deployment_Stage_Workflow to keep a clearer picture of what has and has not been done.

RobH updated the task description. Apr 23 2015, 9:41 PM

These precursor patches are merged in ops/puppet.

RobH added a comment. Apr 28 2015, 4:54 PM

partman layout guidelines from RT ticket 9199:

~~~~
Yes. software RAID1 for the OS partition(s) and software RAID0 for a
single data partition for Elasticsearch spread across as many spindles
as we can get should be great.

Bryan

~~~~

RobH added a comment. Apr 28 2015, 10:54 PM

I've created logstash.cfg for logstash partitioning. This sets up the following:

/ : ext3, RAID1, 250GB (just sda and sdb)
/var/lib/elasticsearch : xfs, RAID0, rest of the space and use all four disks
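
As a rough worked example of what this layout yields, here is a minimal sketch of the capacity arithmetic. The disk size is not stated in this task, so `disk_gb` is a hypothetical input, and the function `layout_capacity_gb` is illustrative only. It assumes four equal disks, a root mirror carved from the first two, and a RAID0 array that uses all remaining space on every disk; filesystem and metadata overhead are ignored.

```python
def layout_capacity_gb(disk_gb, disks=4, root_gb=250):
    """Return (root_usable_gb, data_usable_gb) for the layout described above.

    Assumptions (not stated in the task): all disks are the same size, the
    root partition exists only on the first two disks, and the RAID0 array
    spans whatever space is left on all disks.
    """
    if disks < 2:
        raise ValueError("the root mirror needs at least two disks")
    if disk_gb < root_gb:
        raise ValueError("disks are smaller than the root partition")
    # RAID1 mirrors its members, so usable space equals one member.
    root_usable = root_gb
    # RAID0 stripes its members, so usable space is their sum:
    # two disks minus the root partition, plus the remaining whole disks.
    data_usable = 2 * (disk_gb - root_gb) + (disks - 2) * disk_gb
    return root_usable, data_usable


if __name__ == "__main__":
    # Hypothetical 1000 GB disks, chosen only to make the arithmetic concrete.
    print(layout_capacity_gb(1000))
```

With the hypothetical 1000 GB disks this gives 250 GB for `/` and 2 * 750 + 2 * 1000 = 3500 GB for `/var/lib/elasticsearch`.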

There is presently an issue with Jessie installs (likely due to the release), so I've attached the blocking task. Once it is resolved, logstash1004-1006 can be installed. (logstash1004 has trusty, since I was using it for partman recipe testing; it needs a reinstall to jessie.)

Once these are all installed and in service, we should reinstall the logstash1001-1003 to match the partitioning scheme.

Actually once the new logstash100[4-6] are fully in service we can pull the large disks from logstash100[1-3] entirely. Their role in the new cluster will be only hosting Logstash and Kibana with all Elasticsearch being on the 3 new nodes.

RobH updated the task description. Apr 29 2015, 3:59 PM
RobH updated the task description. Apr 29 2015, 4:36 PM
RobH reassigned this task from RobH to bd808. Apr 29 2015, 4:42 PM
RobH added a subscriber: RobH.

These are now all installed, with puppet having run and salt-keys accepted.

I'm assigning this task to @bd808 for final service implementation.

RobH added a comment. Apr 29 2015, 4:43 PM

I've also created a blocked task T97545 for the reinstallation/update of the logstash1001-1003 hosts.

bd808 moved this task from To Do to In Dev/Progress on the User-bd808 board.
bd808 closed this task as Resolved. May 6 2015, 3:17 AM
bd808 moved this task from In Dev/Progress to Done on the User-bd808 board.
bd808 moved this task from Done to Archive on the User-bd808 board. May 12 2015, 6:23 AM
Restricted Application added a subscriber: Liuxinyu970226. Aug 19 2019, 2:28 PM