
cloudcephosd10[48-52] service implementation
Closed, Resolved, Public

Description

  • cloudcephosd1048
  • cloudcephosd1049 -- ready to be pooled
  • cloudcephosd1050 -- ready to be pooled
  • cloudcephosd1051 -- ready to be pooled
  • cloudcephosd1052

Event Timeline

Mentioned in SAL (#wikimedia-cloud-feed) [2025-06-03T17:37:01Z] <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-06-03T17:42:08Z] <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-06-03T17:50:23Z] <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-06-03T17:55:42Z] <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-06-03T18:29:01Z] <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-06-03T18:34:14Z] <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T395910)

Andrew changed the task status from Open to Stalled. (Jun 6 2025, 4:50 PM)
Andrew triaged this task as Medium priority.

Change #1167708 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Cloudcephosd1048: Configure ceph with a single nic

https://gerrit.wikimedia.org/r/1167708

We may need to hold off on this for now.

The jumbo-frame requirement complicates the plan: the parent interface is configured for regular 1500-byte frames, and I don't think Linux will allow a sub-interface to use a larger MTU than its parent.

I will put a note about this on T399180 also.
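
The MTU constraint in the comment above can be sketched roughly as follows. This is only an illustration, not anything taken from the puppet change: the 9000-byte jumbo size, the function names, and the interface names in the usage note are all assumptions, and the check simply encodes the expectation that a sub-interface's MTU must not exceed its parent's.

```python
from pathlib import Path

# Assumed jumbo-frame size; the task does not state the exact value in use.
JUMBO_MTU = 9000


def read_mtu(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read an interface's current MTU from sysfs (Linux only)."""
    return int(Path(sysfs_root, iface, "mtu").read_text().strip())


def child_mtu_fits(parent_mtu: int, child_mtu: int) -> bool:
    """Return True if a sub-interface with child_mtu fits on its parent.

    Encodes the rule discussed above: the sub-interface cannot carry
    frames larger than the parent interface is configured for.
    """
    return child_mtu <= parent_mtu
```

On a host, `child_mtu_fits(read_mtu("eno1"), JUMBO_MTU)` would report whether a jumbo-frame storage VLAN could sit on top of `eno1` (a hypothetical interface name). With the parent at 1500 bytes the check fails, which is the difficulty described above.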

Regarding the jumbo-frame complication with the plan to move to a single link: we are arranging to connect a second 25G link on each of these new hosts for the storage VLAN. See the tasks below:

T394333: Q4:rack/setup/install cloudcephosd10[48-51]
T399869

Andrew changed the task status from Stalled to In Progress. (Aug 20 2025, 9:22 PM)

Change #1180693 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloudceph: add new OSDs: cloudcephosd1042-1051

https://gerrit.wikimedia.org/r/1180693

Change #1180693 merged by Andrew Bogott:

[operations/puppet@production] cloudceph: add new OSDs: cloudcephosd1042-1051

https://gerrit.wikimedia.org/r/1180693

Mentioned in SAL (#wikimedia-cloud-feed) [2025-08-22T14:15:16Z] <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-08-22T14:20:17Z] <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-08-22T14:24:03Z] <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T395910)

Change #1167708 abandoned by Andrew Bogott:

[operations/puppet@production] Cloudcephosd1048: Configure ceph with a single nic

Reason:

This turns out to be a whole thing, on account of needing jumbo frames on one nic but not the other

https://gerrit.wikimedia.org/r/1167708

Andrew renamed this task from cloudcephosd10[48-51] service implementation to cloudcephosd10[48-52] service implementation. (Aug 26 2025, 3:36 PM)
Andrew updated the task description.

Change #1182172 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Make cloudcephosd1052 an osd node

https://gerrit.wikimedia.org/r/1182172

Change #1182172 merged by Andrew Bogott:

[operations/puppet@production] Make cloudcephosd1052 an osd node

https://gerrit.wikimedia.org/r/1182172

Mentioned in SAL (#wikimedia-cloud-feed) [2025-09-04T16:32:03Z] <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T395910)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-09-04T16:36:16Z] <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T395910)

1050 and 1051 won't be pooled immediately; they're being reserved for T405478.