prometheus-wmf-elasticsearch-exporter interferes with prometheus-wmf-elasticsearch-exporter-9* unit on elastic nodes
Closed, Resolved (Public)
Authored By

	• Mathew.onipe
	Apr 15 2019, 4:02 PM

Description

Elastic nodes should run only the two port-specific prometheus-wmf-elasticsearch-exporter-9* units. The unsuffixed prometheus-wmf-elasticsearch-exporter unit also starts and binds the same port as prometheus-wmf-elasticsearch-exporter-9200, which prevents the latter from starting. This probably needs a correction in puppet. We should align these units so that only the correct ones are started.

Event Timeline

• Mathew.onipe created this task. Apr 15 2019, 4:02 PM

Restricted Application edited projects, added Discovery-Search; removed Discovery-Search (Current work). Apr 15 2019, 4:02 PM

Restricted Application added a subscriber: Aklapper.

• Mathew.onipe triaged this task as High priority. Apr 16 2019, 8:41 AM

After investigating, I found that prometheus-wmf-elasticsearch-exporter was created before the multi-instance setup, so this unit is no longer needed. It is present on some nodes and absent on others; it is probably absent on nodes that were set up after the multi-instance change. Here are two ways to get rid of it completely:

Use puppet with ensure => absent on this resource (prometheus-wmf-elasticsearch-exporter)
Use cumin to delete the unit on the nodes where it is present

@Gehel what do you think?
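
For the cumin option, a minimal sketch of a dry-run helper that prints the cleanup commands for one unit instead of running them. The function name print_unit_cleanup and the unit-file location under /etc/systemd/system are assumptions for illustration only; the real unit name and path on the elastic nodes should be checked before anything is removed.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of the original task): prints the
# commands that would stop, disable and remove a legacy systemd unit.
# Nothing is executed here.
# Assumption: the unit file lives directly under /etc/systemd/system.
print_unit_cleanup() {
    unit="$1"
    if [ -z "$unit" ]; then
        echo "usage: print_unit_cleanup <unit-name>" >&2
        return 1
    fi
    # Append .service only when the name does not already carry the suffix.
    case "$unit" in
        *.service) ;;
        *) unit="${unit}.service" ;;
    esac
    printf 'systemctl stop %s\n' "$unit"
    printf 'systemctl disable %s\n' "$unit"
    printf 'rm -f /etc/systemd/system/%s\n' "$unit"
    printf 'systemctl daemon-reload\n'
}
```

After review, the printed lines could be joined with ' ; ' and passed as the command string to cumin for the affected hosts.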

• Mathew.onipe edited projects, added Discovery-Search (Current work); removed Discovery-Search. Apr 16 2019, 10:59 AM

Redundant units have been cleaned up via cumin:

sudo cumin 'elastic[2025-2026,2028,2031,2034,2047,2052].codfw.wmnet' 'rm /etc/systemd/system/multi-user.target.wants/prometheus-elasticsearch-exporter-9600.service ; systemctl daemon-reload'

(and similar commands for other nodes)

It looks like we only have the units we need left:

gehel@cumin2001:~$ sudo cumin 'A:elastic' 'systemctl list-units -a | grep prometheus-elastic'
65 hosts will be targeted:
elastic[2025-2054].codfw.wmnet,elastic[1017-1020,1022-1052].eqiad.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====                                                                                                                                                                                             
(1) elastic2028.codfw.wmnet                                                                                                                                                                                        
----- OUTPUT of 'systemctl list-u...ometheus-elastic' -----                                                                                                                                                        
  prometheus-elasticsearch-exporter-9200.service                                            loaded    active     running         Prometheus exporter for Elasticsearch                                             
  prometheus-elasticsearch-exporter-9400.service                                            loaded    active     running         Prometheus exporter for Elasticsearch                                             
===== NODE GROUP =====                                                                                                                                                                                             
(32) elastic[2027,2029-2030,2032-2033,2035-2036,2039-2040,2043-2044,2048-2049,2053-2054].codfw.wmnet,elastic[1024-1027,1035,1039,1042-1052].eqiad.wmnet                                                            
----- OUTPUT of 'systemctl list-u...ometheus-elastic' -----                                                                                                                                                        
  prometheus-elasticsearch-exporter-9200.service                                            loaded    active   running   Prometheus exporter for Elasticsearch                                                     
  prometheus-elasticsearch-exporter-9600.service                                            loaded    active   running   Prometheus exporter for Elasticsearch                                                     
===== NODE GROUP =====                                                                                                                                                                                             
(32) elastic[2025-2026,2031,2034,2037-2038,2041-2042,2045-2047,2050-2052].codfw.wmnet,elastic[1017-1020,1022-1023,1028-1034,1036-1038,1040-1041].eqiad.wmnet                                                       
----- OUTPUT of 'systemctl list-u...ometheus-elastic' -----                                                                                                                                                        
  prometheus-elasticsearch-exporter-9200.service                                            loaded    active   running   Prometheus exporter for Elasticsearch                                                     
  prometheus-elasticsearch-exporter-9400.service                                            loaded    active   running   Prometheus exporter for Elasticsearch                                                     
================                                                                                                                                                                                                   
PASS:  |████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100% (65/65) [00:01<00:00, 60.52hosts/s]     
FAIL:  |                                                                                                                                                                     |   0% (0/65) [00:01<?, ?hosts/s]     
100.0% (65/65) success ratio (>= 100.0% threshold) for command: 'systemctl list-u...ometheus-elastic'.
100.0% (65/65) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
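
The check above was done by eye. As a rough sketch, the same per-host check could be scripted. The function name check_exporter_units is hypothetical, and the pattern accepts both the prometheus-wmf-elasticsearch-exporter and prometheus-elasticsearch-exporter spellings because both appear in this task; that is an assumption, not a confirmed naming rule.

```shell
#!/bin/sh
# Hypothetical check (not part of the original task): reads lines in the
# shape of `systemctl list-units` output on stdin and succeeds only if there
# are exactly two port-suffixed exporter units and no unsuffixed legacy unit.
# Assumption: the unit name is the first whitespace-separated field.
check_exporter_units() {
    awk '
        $1 ~ /^prometheus-(wmf-)?elasticsearch-exporter-9[0-9]+\.service$/ { suffixed++ }
        $1 ~ /^prometheus-(wmf-)?elasticsearch-exporter\.service$/ { legacy++ }
        END {
            printf "suffixed=%d legacy=%d\n", suffixed, legacy
            exit ((suffixed == 2 && legacy == 0) ? 0 : 1)
        }
    '
}
```

On a single host this could be fed with: systemctl list-units -a | grep prometheus | check_exporter_units. A non-zero exit status then marks a host that still needs cleanup.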

Gehel moved this task from Incoming to Needs Reporting on the Discovery-Search (Current work) board. Apr 16 2019, 1:00 PM

• Mathew.onipe closed this task as Resolved. Apr 16 2019, 1:20 PM
