
Session storage Cassandra cluster configuration
Closed, Resolved (Public)

Description

Create a new Puppet role for the new session storage service.

Issues:

  • The use of a multi-instance Cassandra configuration, despite only having a single instance configured.
  • The use of a JBOD configuration, despite having since learned that we cannot recover from a single disk failure. Moving away from JBOD would likely require repartitioning.
  • The assumption that admin::groups would match those used for RESTBase (this is almost certainly wrong).

TBA:

  • LVS configuration
  • Secrets, keys, certs, etc.
  • Templating variables for deployment

Event Timeline

jijiki triaged this task as Medium priority. Feb 12 2019, 10:20 AM
jijiki created this task.
Restricted Application added a subscriber: Aklapper. Feb 12 2019, 10:20 AM

Change 487885 had a related patch set uploaded (by Effie Mouzeli; owner: Eevans):
[operations/puppet@production] WIP: initial (strawman) configuration for session storage

https://gerrit.wikimedia.org/r/487885

Eevans moved this task from Backlog to In-Progress on the User-Eevans board. Feb 13 2019, 3:26 PM
jijiki moved this task from Backlog/Radar to Pio Kato on the User-jijiki board. Feb 21 2019, 9:00 AM

Change 492196 had a related patch set uploaded (by Eevans; owner: Eevans):
[labs/private@master] sessions: add (dummy) key material for session storage cluster

https://gerrit.wikimedia.org/r/492196

Change 492196 merged by Giuseppe Lavagetto:
[labs/private@master] sessions: add (dummy) key material for session storage cluster

https://gerrit.wikimedia.org/r/492196

Change 492338 had a related patch set uploaded (by Eevans; owner: Eevans):
[labs/private@master] sessions: updated key material

https://gerrit.wikimedia.org/r/492338

Change 492338 merged by Elukey:
[labs/private@master] sessions: updated key material

https://gerrit.wikimedia.org/r/492338

Change 487885 merged by Alexandros Kosiaris:
[operations/puppet@production] Initial configuration for session storage service

https://gerrit.wikimedia.org/r/487885

Change 496445 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] Icinga: add host groups for session storage service

https://gerrit.wikimedia.org/r/496445

Change 496445 merged by Alexandros Kosiaris:
[operations/puppet@production] Icinga: add host groups for session storage service

https://gerrit.wikimedia.org/r/496445

Change 496462 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/dns@master] Add sessionstore[12]00[123]-a.site.wmnet RRs

https://gerrit.wikimedia.org/r/496462

Change 496472 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] sessionstore: Switch to using the -a addresses

https://gerrit.wikimedia.org/r/496472

Change 496462 merged by Alexandros Kosiaris:
[operations/dns@master] Add sessionstore[12]00[123]-a.site.wmnet RRs

https://gerrit.wikimedia.org/r/496462
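
For readers unfamiliar with the bracket shorthand in the change title above, here is a minimal sketch of how `sessionstore[12]00[123]-a.site.wmnet` expands into individual record names. The `expand` helper is hypothetical (it is not part of the DNS repository tooling), it only handles simple character classes such as `[12]`, not ranges like `[1-3]`, and `site` is left as the literal placeholder used in the title.

```python
import itertools
import re
from typing import List


def expand(pattern: str) -> List[str]:
    """Expand simple bracketed character classes, e.g. 'a[12]b' -> ['a1b', 'a2b'].

    Hypothetical helper for illustration only; ranges such as [1-3] are not supported.
    """
    # With one capture group, re.split alternates: literal, class, literal, ...
    parts = re.split(r"\[([^\]]+)\]", pattern)
    # Odd positions are character classes (one choice per character);
    # even positions are literal text (a single choice).
    choices = [list(part) if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for name in expand("sessionstore[12]00[123]-a.site.wmnet"):
        print(name)
```

The pattern covers six names in total: three hosts for each of the two leading digits.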

Change 496472 merged by Alexandros Kosiaris:
[operations/puppet@production] sessionstore: Switch to using the -a addresses

https://gerrit.wikimedia.org/r/496472

The sessionstore hosts are set up:

akosiaris@sessionstore1001:~$ nodetool -p 7189 status
Datacenter: codfw
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.192.48.132  102.81 KiB  256     28.1%             8d062d5f-b31f-4f58-bde2-ec0e33a489fd  d
UN  10.192.32.101  164.41 KiB  256     37.2%             31a83e9e-63a6-4f51-99fb-0040b74cb51d  c
UN  10.192.16.95   76.96 KiB   256     34.6%             b461130b-258f-4a93-81e5-63bd31d12406  b
Datacenter: eqiad
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.64.0.144    159.49 KiB  256     32.8%             59918268-6b49-45bb-a29d-9ede6385bdce  a
UN  10.64.48.178   102.8 KiB   256     33.0%             bcdedfef-87e6-4f54-bc33-6e149da90115  d
UN  10.64.32.85    107.83 KiB  256     34.4%             5af82e95-c935-4492-9d8c-04542c0c6ede  c
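
A check like the one above can also be scripted. The following is a rough sketch, not anything deployed on these hosts, of parsing `nodetool status` text and flagging nodes that are not `UN` (Up/Normal). The `Node` type, the function names, and the regular expression are assumptions made for illustration; the sample lines are copied from the output above.

```python
import re
from typing import List, NamedTuple


class Node(NamedTuple):
    datacenter: str
    status: str   # "U" (up) or "D" (down)
    state: str    # "N", "L", "J" or "M" (normal/leaving/joining/moving)
    address: str
    rack: str


# Node rows start with a two-letter status/state code, then the address;
# the rack is the last whitespace-separated field on the line.
NODE_LINE = re.compile(r"^([UD])([NLJM])\s+(\S+)\s+.*\s(\S+)\s*$")


def parse_nodetool_status(text: str) -> List[Node]:
    """Parse the text printed by `nodetool status` into Node records."""
    nodes = []
    datacenter = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Datacenter:"):
            datacenter = line.split(":", 1)[1].strip()
            continue
        match = NODE_LINE.match(line)
        if match:
            status, state, address, rack = match.groups()
            nodes.append(Node(datacenter, status, state, address, rack))
    return nodes


def unhealthy(nodes: List[Node]) -> List[Node]:
    """Return every node that is not Up/Normal."""
    return [n for n in nodes if (n.status, n.state) != ("U", "N")]


SAMPLE = """\
Datacenter: codfw
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.192.48.132  102.81 KiB  256          28.1%             8d062d5f-b31f-4f58-bde2-ec0e33a489fd  d
Datacenter: eqiad
=================
UN  10.64.0.144    159.49 KiB  256          32.8%             59918268-6b49-45bb-a29d-9ede6385bdce  a
"""

if __name__ == "__main__":
    parsed = parse_nodetool_status(SAMPLE)
    for node in parsed:
        print(node.datacenter, node.address, node.status + node.state, node.rack)
    print("unhealthy:", unhealthy(parsed))
```

In practice the text would come from running `nodetool -p 7189 status` on a host; how that output is captured is left out here.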

Mentioned in SAL (#wikimedia-operations) [2019-03-18T08:52:06Z] <moritzm> restarting ferm on sessionstore, was stuck in resolving one of the -a records, which were only merged in a subsequent step (T215883)

Eevans renamed this task from Create puppet role for session storage service to Session storage Cassandra cluster configuration. Mar 20 2019, 7:26 PM
Dzahn added a subscriber: Dzahn.

Subtask resolved. sessionstore1001 now has the actual super_password from the private repo, etc.

Eevans closed this task as Resolved. Apr 25 2019, 3:48 PM
Eevans claimed this task.

Considering this has since been scoped as being just the Cassandra cluster, I believe we can call it complete.