
eqiad: (3) servers for logstash service
Closed, ResolvedPublic

Description

The Logstash cluster currently consists of three 'misc'-class nodes, each of which has had 2x3TB disks added in RAID0. These machines were allocated while the service was in development, and the disks were added as the service was adopted and volume increased. However, we are quickly outgrowing the current config as more event sources are added. RAM appears to be the current bottleneck.

Now that Logstash has proved its usefulness, let's purchase equipment explicitly spec'd for the task with sufficient capacity for future growth.

Factors:

  • Elasticsearch replicates data between nodes, so redundant storage is not required on individual nodes. For comparison, the elastic10xx nodes also use RAID0.
  • Logstash nodes have 16GB RAM and have had trouble with Elasticsearch OOM failures. For comparison, elastic10xx nodes have 96GB RAM.
  • Currently the daily Logstash indices are ~30GB each; stored for 30 days, that is ~900GB of storage required per node (see the back-of-envelope sketch after this list).
  • It is unclear how much our storage requirements will increase based on desired additional logging, or even whether all potential log sources have been enumerated.
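A back-of-envelope sketch of the storage arithmetic above, as a small Python snippet; the daily index size, retention window, and growth multipliers are taken from (or assumed to extend) the figures in this description, not measured values.

```
# Rough primary-data storage estimate per node, assuming each node holds one
# full copy of the data (Elasticsearch handles replication, so no RAID redundancy).
DAILY_INDEX_GB = 30      # ~30GB of Logstash indices per day (figure from above)
RETENTION_DAYS = 30      # indices kept for 30 days

base_gb = DAILY_INDEX_GB * RETENTION_DAYS          # 900 GB today
for growth in (1.0, 1.5, 2.0):                     # hypothetical growth factors
    print("growth x%.1f -> %.0f GB per node" % (growth, base_gb * growth))
```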

For scalability and clarity, we may wish to divide the service among nodes by task:

    • Redis nodes for an input message queue (see the sketch after this list)
    • Logstash nodes for ingest
    • Elasticsearch master nodes for cluster management which store no data
    • Elasticsearch client nodes for data storage
    • Kibana web frontend service nodes
  • Some roles might be combined onto a common set of hosts, such as Redis + Logstash
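To make the Redis-as-input-queue role concrete, here is a minimal sketch of how a log shipper could hand events to such a queue: it pushes JSON documents onto a Redis list, which Logstash's redis input (in its list mode) then pops and processes. The hostname and list key below are placeholders, not the cluster's actual configuration.

```
import json
import time

import redis  # redis-py client, assumed to be available on the shipping host

# Placeholder connection details -- not the production Logstash Redis endpoint.
queue = redis.StrictRedis(host="logstash-redis.example.org", port=6379)

event = {
    "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    "type": "example",
    "host": "app1001",
    "message": "request served in 42ms",
}

# Logstash's redis input with a list data type pops entries from a named list;
# "logstash" is used here as an assumed key name, not a confirmed setting.
queue.rpush("logstash", json.dumps(event))
```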

Current event sources:

  • Apache2
  • HHVM
  • MediaWiki
  • scap
  • Job queue runner
  • Hadoop
  • OCG
  • Parsoid

Future sources:

  • Syslog from all nodes (includes Puppet)
  • Zookeeper (T84908)
  • Kafka (T84907)
  • Monolog (T76759)
  • Icinga
  • Any local logs which may be tailed
  • ?

I would like help from others to identify all additional event sources we wish to store, and to determine appropriate host specs and the number of hosts.

Related Objects

Event Timeline

Gage raised the priority of this task from to Needs Triage.
Gage updated the task description. (Show Details)
Gage changed Security from none to None.
Gage added subscribers: Gage, bd808, Manybubbles, mark.

From past experience, once you start aggregating logs in a useful way you will continually find more event sources to add. I think we should plan on a system that can be grown horizontally as we find more things we want to do with it. Luckily, this should be pretty easy. Any number of Logstash agents can be used to feed data to the backing Elasticsearch cluster and Elasticsearch itself is designed to scale by adding nodes and increasing shard count on indices.

Separating Elasticsearch from Logstash + Redis + Apache would be a good start. The existing misc boxes should be able to handle ingesting a lot of log traffic if relieved of their duty of indexing the resulting documents and delivering search results.

Here's my proposal:

  • Add 3 hosts matching the machine spec used for the MediaWiki Elasticsearch cluster
  • Move the Logstash Elasticsearch backend to those hosts
  • Keep the current logstash100[123] boxes to run the rest of the stack (Apache + Redis + Logstash)

Scale the system by:

  • Adding Elasticsearch nodes if and when Elasticsearch becomes a bottleneck on the 3-node cluster (shard daily indices across more than one shard and distribute them across more hosts; see the sketch after this list)
  • Separating Redis and Logstash onto separate machines if there is CPU or RAM contention between the two processes on those hosts
  • Adding more hosts running Logstash if and when needed due to the CPU consumed by Logstash processing
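As a concrete illustration of the sharding knob mentioned above, the shard and replica counts for the daily Logstash indices can be set ahead of time with an Elasticsearch index template. The sketch below (Python with the requests library) assumes a hypothetical cluster address and template name, and the counts shown are examples rather than a recommendation.

```
import requests  # assumed to be available; any HTTP client would do

# Placeholder Elasticsearch endpoint -- not the production cluster address.
ES = "http://logstash-es.example.org:9200"

template = {
    "template": "logstash-*",      # applies to every daily logstash-YYYY.MM.DD index
    "settings": {
        "number_of_shards": 2,     # example: spread each day's index across 2 primaries
        "number_of_replicas": 2,   # example: 3 total copies of each shard
    },
}

resp = requests.put(ES + "/_template/logstash_example", json=template)
resp.raise_for_status()
print(resp.json())
```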

Unanswered questions:

  • Do we need a Logstash processing cluster in codfw, or can we ship logs directly to eqiad?
  • Do we need a Logstash Elasticsearch cluster in codfw, either as a replica of the combined data from both DCs or as a codfw-only log storage system?
  • Do we need a separate "ops only" Elasticsearch cluster for log storage, or can we redact or partition the data in the same cluster?

Here's my proposal:

  • Add 3 hosts matching the machine spec used for the MediaWiki Elasticsearch cluster

Does it make sense to use spinning disks rather than SSDs? A high indexing rate loves SSDs, but keeping lots of data around favors the bigger rotating disks. What about RAM?

At this moment, the base size of the data set is 990G for the last 30 days. With the growth that has been seen as a result of the MediaWiki conversion to Monolog+Redis, by mid-February that base storage may grow to something like 1.5T. I would like to see each index have 2 replicas (3 total copies) available across the backing Elasticsearch cluster, so with 3 nodes we would want 1.5T plus a healthy growth factor available for index storage.

I personally think we will be fine with spinning disks for IOPS. If at some point in the growth of the Logstash cluster we need to ingest messages at a rate where disk IOPS become a problem, we can start sharding the daily index. If we get to a point where that doesn't help, we can add a set of nodes with SSDs to hold the current day's index and then use index routing in Elasticsearch itself to move each index from the SSD nodes to the spinning-disk nodes at the end of its day.

RAM, I think, is a more-the-merrier thing. If we are using a 30G fixed heap, then we should have 45-60G available for the JVM plus filesystem cache for the index files, plus more for OS overhead.
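A small sketch of the capacity math implied in this comment, as a Python snippet; the projected base size, replica count, node count, and the reading of the heap/headroom numbers are assumptions carried over from the discussion, not a sizing recommendation.

```
# Disk: total index footprint = primary data * total copies, spread over the nodes.
projected_primary_tb = 1.5   # projected 30-day base after the Monolog+Redis growth
total_copies = 3             # 1 primary + 2 replicas of each index
nodes = 3

per_node_tb = projected_primary_tb * total_copies / nodes
print("~%.1f TB of index storage per node, before growth headroom" % per_node_tb)

# RAM: one reading of the comment above -- a 30G fixed heap plus a further
# 45-60G for JVM overhead and filesystem cache, plus some OS overhead.
heap_gb = 30
headroom_gb = (45, 60)
print("target RAM per node: roughly %d-%dGB plus OS overhead"
      % (heap_gb + headroom_gb[0], heap_gb + headroom_gb[1]))
```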

RobH subscribed.

There is currently ongoing discussion on https://rt.wikimedia.org/Ticket/Display.html?id=9199 in regards to the disks being SATA or SAS.

RobH renamed this task from "Production hardware for Logstash service" to "eqiad: (3) servers for logstash service". Feb 26 2015, 11:39 PM
RobH changed the task status from Open to Stalled.
RobH claimed this task.
RobH raised the priority of this task from Medium to High. Edited Mar 10 2015, 7:34 PM

I didn't follow up on the discussion, which included the potential upgrade to SAS disks to match current standards, until today.

As such, the new quote request is in on the associated RT ticket; I expect we'll have movement on this task within the next 24 hours.

Edit addition: There was movement, and still is. Quote options are being discussed on the RT ticket.

The order for these systems has been placed on https://rt.wikimedia.org/Ticket/Display.html?id=9199. The ETA as of now is 2015-04-24 (Friday) for delivery. We'll need a day or two to receive them, get them racked and ready for installation, and then install them. I wouldn't expect to see these ready for service implementation sooner than the 30th. (It could be faster of course, but I don't like to promise too short a timeline.)

Resolving this request, as the servers were ordered and are now on-site being set up.

Setup of systems can be followed on T96692.