
ElasticSearch unassigned shard check apifeatureusage-2020.06.30@codfw and enwiki_general_1587198756@eqiad
Closed, Resolved (Public)

Description

Caught by icinga:

chi@eqiad:

enwiki_general_1587198756              0  r UNASSIGNED

(explain: P12807)

chi@codfw:

apifeatureusage-2020.07.18             0  p UNASSIGNED                                   
apifeatureusage-2020.07.04             0  p UNASSIGNED                                   
apifeatureusage-2020.07.19             0  p UNASSIGNED                                   
apifeatureusage-2020.06.30             0  p UNASSIGNED                                   
apifeatureusage-2020.07.12             0  p UNASSIGNED                                   
apifeatureusage-2020.07.01             0  p UNASSIGNED                                   
apifeatureusage-2020.08.03             0  p UNASSIGNED                                   
apifeatureusage-2020.07.26             0  p UNASSIGNED

(explain: P12808)
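
For reference, explain output like the pastes above is presumably what the cluster allocation explain API returns. A minimal sketch, assuming the same localhost:9243 endpoint used later in this task; the helper names explain_body and explain_shard and the ES_URL variable are illustrative and do not come from the task:

```shell
# Build the JSON body for GET /_cluster/allocation/explain.
# Args: index name, shard number, "true" for a primary or "false" for a replica.
explain_body() {
  printf '{"index":"%s","shard":%d,"primary":%s}' "$1" "$2" "$3"
}

# Ask the cluster why a given shard copy is unassigned.
# ES_URL is an assumption; point it at the cluster being queried.
explain_shard() {
  curl -s -k -X GET "${ES_URL:-https://localhost:9243}/_cluster/allocation/explain?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(explain_body "$1" "$2" "$3")"
}

# Example invocations (not run here):
#   explain_shard enwiki_general_1587198756 0 false    # replica on chi@eqiad
#   explain_shard apifeatureusage-2020.06.30 0 true    # primary on chi@codfw
```

The "r" and "p" in the shard listings above map to "primary": false and "primary": true respectively.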

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2020-09-28T07:24:32Z] <dcausse> T263970: forcing allocation of enwiki_general_1587198756 (chi@eqiad)
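
The SAL entry does not record the exact command that was used. One way to force allocation of an unassigned replica is the cluster reroute API with an allocate_replica command; another is POST /_cluster/reroute?retry_failed=true, which retries shards that exhausted their allocation attempts. A hedged sketch of the first option, where the helper names, ES_URL, and the target node name are all placeholders:

```shell
# Build a cluster reroute body that places one replica shard on a chosen node.
# Args: index name, shard number, target node name.
allocate_replica_body() {
  printf '{"commands":[{"allocate_replica":{"index":"%s","shard":%d,"node":"%s"}}]}' "$1" "$2" "$3"
}

# Submit the reroute request. ES_URL is an assumption, not taken from the task.
force_allocate_replica() {
  curl -s -k -X POST "${ES_URL:-https://localhost:9243}/_cluster/reroute" \
    -H 'Content-Type: application/json' \
    -d "$(allocate_replica_body "$1" "$2" "$3")"
}

# Example invocation (not run here; NODE_NAME is a placeholder):
#   force_allocate_replica enwiki_general_1587198756 0 NODE_NAME
```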

dcausse renamed this task from ElasticSearch unassigned shard check apifeatureusage-2020.06.30@codfw and enwiki_general_1587198756@codfw to ElasticSearch unassigned shard check apifeatureusage-2020.06.30@codfw and enwiki_general_1587198756@eqiad. Sep 28 2020, 7:24 AM

Mentioned in SAL (#wikimedia-operations) [2020-09-28T08:56:10Z] <dcausse> T263970: recovering lost apifeature indices (copying eqiad indices -> codfw)

Change 630546 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] logstash: keep at least 2 copies of each shard for apifeatureusage

https://gerrit.wikimedia.org/r/630546

Change 630546 merged by Gehel:
[operations/puppet@production] logstash: keep at least 2 copies of each shard for apifeatureusage

https://gerrit.wikimedia.org/r/630546

Curator config has been changed for apifeatureusage. We still want to reset the replica count for existing indices once the data transfer is completed.

Mentioned in SAL (#wikimedia-operations) [2020-10-27T06:42:20Z] <ryankemper> T263970 Set number of replicas to 2 (from previous value of 1) for all codfw indices matching apifeatureusage*, new shards have been assigned without issue
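
A replica count change like this one can be made through the index settings API, which accepts a wildcard index pattern. This is a sketch of that approach rather than the exact command that was run; the helper names and ES_URL are assumptions:

```shell
# Build the settings body for a given replica count.
replica_settings_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# Apply the replica count to every index matching a pattern.
# Args: index pattern, replica count. ES_URL is an assumption.
set_replicas() {
  curl -s -k -X PUT "${ES_URL:-https://localhost:9243}/$1/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(replica_settings_body "$2")"
}

# Example invocation (not run here):
#   set_replicas 'apifeatureusage*' 2
```

This only affects indices that already exist; new daily indices take their replica count from whatever creates them.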

New state of cluster after setting replica count to 2 for apifeatureusage*:

ryankemper@elastic2038:~$ curl -X GET -s -k 'https://localhost:9243/_cat/indices/apifeatureusage*?v'
health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   apifeatureusage-2020.10.12 dx5-2vxHTsu6h5zNl2Md5w   1   2    8388841            0      1.6gb        560.4mb
green  open   apifeatureusage-2020.10.23 I0CjCP6IRTmpbQ4RxxfDOQ   1   2    8554945            0      1.6gb        579.8mb
green  open   apifeatureusage-2020.09.22 us7FEwNcQeK4WZHxP-0NSQ   1   2    8640015            0      1.7gb        583.5mb
green  open   apifeatureusage-2020.08.18 FxQVYqDyS86g6f-OvOleXg   1   2    8640360            0      1.7gb        595.2mb
green  open   apifeatureusage-2020.09.13 K0UISZk5TnqpGvHbyCgAvg   1   2    8640000            0      1.7gb        581.5mb
green  open   apifeatureusage-2020.10.22 OsoXKJ1ARkOqxV1pUBohAw   1   2    8195940            0      1.6gb        559.2mb
green  open   apifeatureusage-2020.08.14 cCmaVNoARqO-wCqe6z_L-Q   1   2    8640293            0      1.7gb          600mb
green  open   apifeatureusage-2020.08.08 XAXB43q8TXytSvFIoG7_4Q   1   2    8640033            0      1.8gb          620mb
green  open   apifeatureusage-2020.09.01 0Xg-0F98RLi_nrQLV__JuA   1   2    8640000            0      1.6gb        557.6mb
green  open   apifeatureusage-2020.08.09 jTdOqiXwR3e7tBcZWFMi7g   1   2    8639937            0      1.8gb        621.2mb
green  open   apifeatureusage-2020.10.17 qUxgiHPsTamdPb8A8iLXsQ   1   2    8638708            0      1.7gb        588.4mb
green  open   apifeatureusage-2020.10.03 GlWklRv2TXaXak3_BwFlVg   1   2    8639827            0      1.6gb        573.4mb
green  open   apifeatureusage-2020.07.31 NAmvRVtKRWSj5soUfHTo2Q   1   2    8640000            0      1.7gb        602.9mb
green  open   apifeatureusage-2020.09.10 FLg5rk_rRxyMGYtTLUcJHA   1   2    8655128            0      1.6gb        574.2mb
green  open   apifeatureusage-2020.09.11 _-HiFnFGS6-s0AAJCMMWjg   1   2    8640000            0      1.6gb        574.4mb
green  open   apifeatureusage-2020.09.26 OmJcbnHnTG2f1x8EfYGNeQ   1   2    8640000            0      1.6gb        566.9mb
green  open   apifeatureusage-2020.08.31 NHCsGOo4QeiCw7vVBqqhcg   1   2    8684974            0      1.6gb          567mb
green  open   apifeatureusage-2020.09.25 qWlgWULyRc67GYUaF_Y5cg   1   2    8630086            0      1.6gb        566.4mb
green  open   apifeatureusage-2020.10.09 2aEAMz_tQ3WnA71XK-xsKQ   1   2    8639759            0      1.6gb        573.4mb
green  open   apifeatureusage-2020.09.03 vdE6rWB3Ti-lpHjQYIEopw   1   2    8640000            0      1.6gb        556.7mb
green  open   apifeatureusage-2020.08.16 M9kjLBeuRu2Mcc14i8X6rg   1   2    8640050            0      1.7gb        595.4mb
green  open   apifeatureusage-2020.08.26 lr0pvP3ySDuidfqNAL2LqA   1   2    8640000            0      1.6gb        577.7mb
green  open   apifeatureusage-2020.08.28 SBPSL_8QQz-xNegZXRBtsQ   1   2    8640000            0      1.6gb        548.7mb
green  open   apifeatureusage-2020.08.30 _pOE68_kTAelSp6lFjpE1w   1   2    8877794            0      1.7gb          582mb
green  open   apifeatureusage-2020.08.10 d5RGG0uFQGGAAbqYTlYxew   1   2    8640324            0      1.7gb        611.7mb
green  open   apifeatureusage-2020.09.19 AzGwCXWfRQKI4SsY7ICNyg   1   2    9672954            0      1.8gb        630.3mb
green  open   apifeatureusage-2020.08.20 J74FCi4EQzKhSDfhM33iYQ   1   2    8713625            0      1.7gb          582mb
green  open   apifeatureusage-2020.10.24 ODg8zAtzSQ6bX3GXSfDI0A   1   2    8639684            0      1.7gb        582.9mb
green  open   apifeatureusage-2020.10.06 SyH3XXmESImttNIIX4Y8UA   1   2    8639468            0      1.6gb        576.2mb
green  open   apifeatureusage-2020.09.04 kqvaz3gUTbeEJAuqMKKhCA   1   2    8640000            0      1.6gb        568.7mb
green  open   apifeatureusage-2020.10.15 uLF7hTLnQ2a40HoPLGt5JA   1   2    8640316            0      1.7gb        580.6mb
green  open   apifeatureusage-2020.09.18 50IdJC2tTrWOz7XlLJX6SQ   1   2    8605547            0      1.6gb        570.4mb
green  open   apifeatureusage-2020.07.30 vL77nD26SQi_-EkFC4kQlw   1   2    8640000            0      1.7gb        607.6mb
green  open   apifeatureusage-2020.08.21 BDOa1ewvRkiCVttq5vkWlA   1   2    8640305            0      1.6gb          566mb
green  open   apifeatureusage-2020.10.25 wbgH6qBmS3WKfaz0SBjjGA   1   2    8640238            0      1.7gb        594.6mb
green  open   apifeatureusage-2020.08.02 _LUT3E57R1SfftQdRlBGLg   1   2    8640000            0      1.7gb        601.9mb
green  open   apifeatureusage-2020.09.07 8EDfqPl7QTmJrPfDwOBIYg   1   2    8640000            0      1.6gb        578.3mb
green  open   apifeatureusage-2020.10.05 TtKxoTIjTV6AHi97NY0aBA   1   2    8640099            0      1.7gb        583.1mb
green  open   apifeatureusage-2020.08.12 prK4up3zQv2-6Ssjc3Pz6A   1   2    8575288            0      1.7gb        594.3mb
green  open   apifeatureusage-2020.08.24 fcqPOd9-SQG7sBedxPaF5Q   1   2    8639922            0      1.6gb        556.4mb
green  open   apifeatureusage-2020.10.13 DyJbprfuSFuAApDFTx_xmA   1   2    8591711            0      1.6gb        575.5mb
green  open   apifeatureusage-2020.08.15 7MAUaT4JQxKFAkt4BGRF4g   1   2    8639957            0      1.7gb        598.8mb
green  open   apifeatureusage-2020.09.23 -VswYN5bSi-_WETQIGopVg   1   2    8640000            0      1.6gb        563.7mb
green  open   apifeatureusage-2020.09.21 p1tRED2hSaSNBTQ_sQYyRA   1   2    8640000            0      1.7gb          582mb
green  open   apifeatureusage-2020.10.11 sDGST5QvTZqGAcn81f5FFw   1   2    8640172            0      1.6gb        580.1mb
green  open   apifeatureusage-2020.10.19 UBwEvAzJTtWe_9XuerRebg   1   2    8641105            0      1.7gb          595mb
green  open   apifeatureusage-2020.08.06 O3yfMRCUQveYGlb0NF_aRw   1   2   45432527            0      8.6gb          2.8gb
green  open   apifeatureusage-2020.10.21 2csjV_SkR82VVqrI0EdZVg   1   2    8641130            0      1.7gb        591.3mb
green  open   apifeatureusage-2020.09.29 xx5hVgzZRV-YFh9x_10m1Q   1   2    8633022            0      1.6gb        570.1mb
green  open   apifeatureusage-2020.10.10 hSwKew3ST5qXVTDSqDsp8g   1   2    8640305            0      1.6gb        573.5mb
green  open   apifeatureusage-2020.10.02 7kWsaOE5TKmQNUF1-dwdXg   1   2    8640369            0      1.6gb        573.9mb
green  open   apifeatureusage-2020.08.22 JleLmcEKQpaWL1lYTTjToA   1   2    8639717            0      1.6gb        557.2mb
green  open   apifeatureusage-2020.09.27 hZ4SdzRkTEqveMbk6kbPAA   1   2    8640000            0      1.6gb        576.2mb
green  open   apifeatureusage-2020.09.16 wGLUxv6LRzywBKG2OkiuWQ   1   2    8640156            0      1.6gb        569.3mb
green  open   apifeatureusage-2020.08.04 5ssr5azNQruOxjLa67vYjw   1   2    8640000            0      1.7gb        604.6mb
green  open   apifeatureusage-2020.08.17 A1KTxP9fR1GoIsLatM9usQ   1   2    8640238            0      1.7gb          595mb
green  open   apifeatureusage-2020.10.18 Unrtat8kQWi8QT5CbMQSxA   1   2    8639740            0      1.7gb          602mb
green  open   apifeatureusage-2020.09.14 cR0itqEAQYSPxq9W8FSo4Q   1   2    8640000            0      1.6gb          575mb
green  open   apifeatureusage-2020.10.16 D59DjKGLRZWhXFLwqEYl_Q   1   2    8640377            0      1.6gb        573.9mb
green  open   apifeatureusage-2020.09.09 -lWWvU8sSsatYZMtXruY8w   1   2    8640000            0      1.7gb        585.9mb
green  open   apifeatureusage-2020.08.27 M5GBrj8USB-O7U04z4OeEg   1   2    8640000            0      1.6gb        562.5mb
green  open   apifeatureusage-2020.09.08 9iUcQ09CTKitClI0qa2LXg   1   2   12478909            0      2.4gb        819.9mb
green  open   apifeatureusage-2020.09.30 92Oz3Z8uTKaEUtNSrcY4JA   1   2    8639471            0      1.6gb        577.1mb
green  open   apifeatureusage-2020.08.01 X51ZN9CKRECGYxFzECxJyw   1   2    8640000            0      1.7gb        596.7mb
green  open   apifeatureusage-2020.10.26 bGKAp2JZSMS2-KIYOPyg1g   1   2    8615027            0      1.7gb        600.4mb
green  open   apifeatureusage-2020.09.28 VNArkhOCTm2Hamjw810d8A   1   2    8640000            0      1.6gb          571mb
green  open   apifeatureusage-2020.09.06 jrLMI_RIQO-XfkWrYpKawQ   1   2    8640000            0      1.6gb        572.1mb
green  open   apifeatureusage-2020.08.19 W1m2OuoKQbCqG18mEieJLQ   1   2    9997798            0      1.9gb        674.4mb
green  open   apifeatureusage-2020.10.20 m1JfDR5AR42vmf7K3Y7D7w   1   2    8638945            0      1.7gb          593mb
green  open   apifeatureusage-2020.08.29 wDu0Equ6SvG32fQKChYteg   1   2    8640000            0      1.6gb        556.3mb
green  open   apifeatureusage-2020.08.13 8rTmiCebSK-ij3tYYPjvuA   1   2    8639827            0      1.7gb        600.3mb
green  open   apifeatureusage-2020.09.12 966YT90zR0uZLwsHSqG2aw   1   2    8640000            0      1.7gb          584mb
green  open   apifeatureusage-2020.09.20 em7_Cl99S1-dV54E2b1tag   1   2    8640000            0      1.6gb        576.3mb
green  open   apifeatureusage-2020.08.03 ZoFNMXLtSQedZ_YdFrwujQ   1   2    8640000            0      1.8gb        626.1mb
green  open   apifeatureusage-2020.08.23 8q6koCeZTkaPR1njipUyQw   1   2    8640026            0      1.6gb        556.9mb
green  open   apifeatureusage-2020.10.07 ZWY9iMkISIWV08v1F4ssgA   1   2    8640095            0      1.7gb        580.2mb
green  open   apifeatureusage-2020.09.15 47KY-olHSxSMNvpw1qOmWg   1   2    8650428            0      1.6gb        576.2mb
green  open   apifeatureusage-2020.08.07 yVNc_pSSR8C549kIUWm58g   1   2    8640143            0      1.7gb        614.1mb
green  open   apifeatureusage-2020.09.17 YWV4udXSSyaTFBB8yAfLKw   1   2    8640289            0      1.6gb        566.3mb
green  open   apifeatureusage-2020.10.08 q4sMDis2RfqDY26A1-D8Gw   1   2    8640176            0      1.6gb        579.2mb
green  open   apifeatureusage-2020.10.27 TWKKWcANTjuuLpglWYIN0Q   1   2    2419520            0    589.3mb        200.1mb
green  open   apifeatureusage-2020.07.29 Z_R29EomSm2quYoTxWA2eg   1   2    8640000            0      1.7gb        605.3mb
green  open   apifeatureusage-2020.10.04 GTyex99xSReqrQQqEYEhZA   1   2    8325192            0      1.6gb          557mb
green  open   apifeatureusage-2020.08.11 tZ2CPMu5Roq0cA5H31FEig   1   2    8639659            0      1.7gb        596.6mb
green  open   apifeatureusage-2020.08.25 We9lg23pTTK4FIucLG0-nA   1   2    8783046            0      1.6gb        579.4mb
green  open   apifeatureusage-2020.08.05 QC-XbN-BSji4MU2qkg2x4g   1   2   46811896            0      8.8gb          2.9gb
green  open   apifeatureusage-2020.09.05 B-voT0RiSqSYNOmeBSVWsw   1   2    8640000            0      1.6gb        561.2mb
green  open   apifeatureusage-2020.09.02 6zKkQCAXRo6Vn1XT_RogwQ   1   2    8640000            0      1.6gb        556.9mb
green  open   apifeatureusage-2020.10.14 mx-jLtUkR2auQ9LAxBLKew   1   2    8639576            0      1.6gb        576.2mb
green  open   apifeatureusage-2020.10.01 hgWaSwcWQvGKx-H4Z2I30g   1   2    8640455            0      1.7gb        581.2mb
green  open   apifeatureusage-2020.09.24 ACFaGAYmTxOIQ41-TRVuNg   1   2    8498867            0      1.6gb        565.3mb
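
A listing this long is easier to check mechanically than by eye. The sketch below reads _cat/indices?v output on stdin and prints any index that is not green or whose rep column differs from the expected value; check_indices is a hypothetical helper name:

```shell
# Read `_cat/indices?v` output on stdin and print every index that is not
# green or whose replica count differs from the expected one ($1).
# Exits non-zero if any such index is found.
check_indices() {
  awk -v want="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map header names to column numbers
    $col["health"] != "green" || $col["rep"] != want { print $col["index"]; bad = 1 }
    END { exit bad }
  '
}

# Example invocation (not run here):
#   curl -X GET -s -k 'https://localhost:9243/_cat/indices/apifeatureusage*?v' | check_indices 2
```

Looking columns up by header name avoids depending on the column order of the _cat output.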

Is this the reason for the Icinga alerts on logstash2*? I just acked them because they had been unhandled CRIT for quite a while and I'd like to remove the noise so we see signal from other unhandled CRITs.

@Dzahn I tried to find the history of the alerts you mentioned in Icinga, but without success, probably because I'm not familiar with the UI.

But I looked at the #wikimedia-operations IRC logs, and it looks like the alerts associated with apifeatureusage and enwiki_general resolved a month ago:

134040:[2020-09-27 09:17:57] <icinga-wm> PROBLEM - ElasticSearch unassigned shard check - 9243 on search.svc.eqiad.wmnet is CRITICAL: CRITICAL - enwiki_general_1587198756[0](2020-09-24T14:34:37.564Z) https://wikitech.wikimedia.org/wiki/Search%23Administration
134211:[2020-09-28 00:24:39] <stashbot> T263970: ElasticSearch unassigned shard check apifeatureusage-2020.06.30@codfw and enwiki_general_1587198756@codfw - https://phabricator.wikimedia.org/T263970
134218:[2020-09-28 00:39:57] <icinga-wm> RECOVERY - ElasticSearch unassigned shard check - 9243 on search.svc.eqiad.wmnet is OK: OK - All good https://wikitech.wikimedia.org/wiki/Search%23Administration
134339:[2020-09-28 01:56:16] <stashbot> T263970: ElasticSearch unassigned shard check apifeatureusage-2020.06.30@codfw and enwiki_general_1587198756@eqiad - https://phabricator.wikimedia.org/T263970
135634:[2020-09-28 21:08:10] <icinga-wm> RECOVERY - ElasticSearch unassigned shard check - 9243 on search.svc.codfw.wmnet is OK: OK - All good https://wikitech.wikimedia.org/wiki/Search%23Administration

So I think those logstash2* alerts you're referencing were a separate issue.

Comparing the output of curl -X GET -s -k 'https://localhost:9243/_cat/indices/apifeatureusage*?v' on eqiad and codfw elasticsearch hosts shows the same indices on both cirrus clusters, which indicates that the data was recovered properly.
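
The same comparison can be scripted instead of eyeballed. A minimal sketch, assuming the index names from each cluster have been saved to a file with the _cat h=index column selector; diff_index_lists and the file names are hypothetical:

```shell
# Print lines that appear in only one of two files (one index name per line).
# Empty output means both lists contain the same names.
diff_index_lists() {
  a=$(mktemp)
  b=$(mktemp)
  sort "$1" > "$a"
  sort "$2" > "$b"
  comm -3 "$a" "$b" | tr -d '\t'   # comm -3 drops common lines; strip its column tabs
  rm -f "$a" "$b"
}

# Example invocation (not run here):
#   on an eqiad host: curl -s -k 'https://localhost:9243/_cat/indices/apifeatureusage*?h=index' > eqiad.txt
#   on a codfw host:  curl -s -k 'https://localhost:9243/_cat/indices/apifeatureusage*?h=index' > codfw.txt
#   diff_index_lists eqiad.txt codfw.txt
```

This compares index names only. Adding docs.count to the h= selector would compare document counts as well, although the current day's index can legitimately differ while it is still being written.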

With the new replica counts set, this ticket should be done.