Papaul (Papaul)
User

User Details

User Since
Dec 18 2014, 3:39 PM (415 w, 4 d)
Availability
Available
LDAP User
Papaul
MediaWiki User
Unknown

Recent Activity

Thu, Dec 1

Papaul closed T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet as Resolved.

@Andrew all yours

Thu, Dec 1, 8:34 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Thu, Dec 1, 8:33 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Thu, Dec 1, 7:27 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Thu, Dec 1, 6:30 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Thu, Dec 1, 5:37 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Thu, Dec 1, 5:16 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul placed T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5] up for grabs.
Thu, Dec 1, 4:32 PM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Thu, Dec 1, 4:30 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul closed T323718: decommission graphite2003.codfw.wmnet as Resolved.

Complete

Thu, Dec 1, 3:30 PM · SRE, ops-codfw, User-fgiunchedi, decommission-hardware
Papaul closed T322256: Q3:rack/setup/install db1206 as Resolved.

@Marostegui this is complete

Thu, Dec 1, 12:24 AM · SRE, DBA, ops-eqiad, DC-Ops
Papaul updated the task description for T322256: Q3:rack/setup/install db1206.
Thu, Dec 1, 12:23 AM · SRE, DBA, ops-eqiad, DC-Ops

Wed, Nov 30

Papaul added a comment to T322256: Q3:rack/setup/install db1206.

@Jclark-ctr thanks

Wed, Nov 30, 11:27 PM · SRE, DBA, ops-eqiad, DC-Ops
Papaul updated the task description for T322256: Q3:rack/setup/install db1206.
Wed, Nov 30, 11:26 PM · SRE, DBA, ops-eqiad, DC-Ops
Papaul updated the task description for T323718: decommission graphite2003.codfw.wmnet.
Wed, Nov 30, 7:09 PM · SRE, ops-codfw, User-fgiunchedi, decommission-hardware
Papaul reassigned T323222: Degraded RAID on ganeti2013 from Papaul to MoritzMuehlenhoff.

@MoritzMuehlenhoff Disk replaced

Wed, Nov 30, 6:52 PM · SRE, ops-codfw
Papaul closed T323960: ManagementSSHDown as Resolved.

This is fixed.

Wed, Nov 30, 6:42 PM · ops-codfw
Papaul moved T323718: decommission graphite2003.codfw.wmnet from Backlog to Non-Urgent on the ops-codfw board.
Wed, Nov 30, 6:42 PM · SRE, ops-codfw, User-fgiunchedi, decommission-hardware
Papaul closed T322988: db2173 HW errors, a subtask of T322987: db2173 crashed and didn't alert, as Resolved.
Wed, Nov 30, 6:42 PM · Patch-For-Review, observability, DBA
Papaul closed T322988: db2173 HW errors, a subtask of T321130: Add column cuc_private to cu_changes on wmf wikis, as Resolved.
Wed, Nov 30, 6:42 PM · DBA, Schema-change-in-production
Papaul closed T322988: db2173 HW errors as Resolved.

@Marostegui main board replaced. The server is back up and running. Sorry it took this long to get this fixed.

Wed, Nov 30, 6:42 PM · DBA, SRE, ops-codfw
Papaul closed T323925: codfw: ManagementSSHDown for ores2009 and thumbor2004 as Resolved.

ores2009 mgmt is back up

Wed, Nov 30, 6:41 PM · SRE, serviceops, ops-codfw
Papaul added a comment to T323925: codfw: ManagementSSHDown for ores2009 and thumbor2004.

thumbor2004 had a broken iDRAC card. I replaced it.

Wed, Nov 30, 5:29 PM · SRE, serviceops, ops-codfw

Tue, Nov 29

Papaul claimed T322256: Q3:rack/setup/install db1206.
Tue, Nov 29, 7:43 PM · SRE, DBA, ops-eqiad, DC-Ops
Papaul added a comment to T322256: Q3:rack/setup/install db1206.

@Jclark-ctr Netbox shows the server racked in B8, but the task says it is in rack B1 (db1206 B1 U36 Port 26). Can you please double-check?

Tue, Nov 29, 7:41 PM · SRE, DBA, ops-eqiad, DC-Ops
Papaul closed T313978: Q1:rack/setup/install db1204, db1205 as Resolved.

This is complete

Tue, Nov 29, 5:59 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul updated the task description for T313978: Q1:rack/setup/install db1204, db1205.
Tue, Nov 29, 5:58 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul added a comment to T322256: Q3:rack/setup/install db1206.

I will take a look once I have the OS going on db120[4-5].

Tue, Nov 29, 5:02 PM · SRE, DBA, ops-eqiad, DC-Ops
Papaul added a comment to T313978: Q1:rack/setup/install db1204, db1205.

Waiting on John to connect those servers to a 1G port, since they are currently connected to a 10G port, so I can redo the switch configuration and start the OS install.

Tue, Nov 29, 4:24 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul updated the task description for T313978: Q1:rack/setup/install db1204, db1205.
Tue, Nov 29, 4:23 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul updated the task description for T313978: Q1:rack/setup/install db1204, db1205.
Tue, Nov 29, 3:58 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul updated the task description for T313978: Q1:rack/setup/install db1204, db1205.
Tue, Nov 29, 3:05 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul added a comment to T313978: Q1:rack/setup/install db1204, db1205.

ACK

Tue, Nov 29, 1:24 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul added a comment to T323222: Degraded RAID on ganeti2013.

@MoritzMuehlenhoff thanks for the update

Tue, Nov 29, 1:23 PM · SRE, ops-codfw
Papaul added a comment to T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.

@Jclark-ctr can you please double-check and confirm that all those servers are R440, not R640 as Netbox says? Thanks

Tue, Nov 29, 1:39 AM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul added a comment to T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.

@Andrew can you please specify the partman recipe to use for those servers in the task description?
Thank you

Tue, Nov 29, 1:25 AM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Tue, Nov 29, 1:23 AM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul added a comment to T313978: Q1:rack/setup/install db1204, db1205.

@Jclark-ctr can you please confirm that those servers are connected to a 10G interface?
@Marostegui @jcrespo I am trying to set up those servers and I don't know whether they should use an IPv6 address; it is not mentioned in the description.

Tue, Nov 29, 12:53 AM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul closed T319433: Q2:rack/setup/install arclamp1001.eqiad.wmnet as Resolved.

This is done

Tue, Nov 29, 12:34 AM · serviceops, SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T319433: Q2:rack/setup/install arclamp1001.eqiad.wmnet.
Tue, Nov 29, 12:33 AM · serviceops, SRE, ops-eqiad, DC-Ops

Mon, Nov 28

Papaul added a comment to T323512: db2174 lost power.

I tested the HW on the server and all looks good. The only error I had was error code 2000-0251, which is not a big issue; see the link below for more information on the error code. I think the task can be closed. Thanks.
https://www.dell.com/support/kbdoc/en-us/000139065/resolving-error-code-2000-0251-when-launching-the-epsa-diagnostics-on-dell-pc

Mon, Nov 28, 4:06 PM · SRE, DBA, ops-codfw
Papaul added a comment to T323222: Degraded RAID on ganeti2013.

@MoritzMuehlenhoff unfortunately this server is out of warranty.

Mon, Nov 28, 3:53 PM · SRE, ops-codfw
Papaul triaged T323222: Degraded RAID on ganeti2013 as Medium priority.
Mon, Nov 28, 3:51 PM · SRE, ops-codfw
Papaul triaged T323925: codfw: ManagementSSHDown for ores2009 and thumbor2004 as High priority.
Mon, Nov 28, 3:51 PM · SRE, serviceops, ops-codfw
Papaul created T323925: codfw: ManagementSSHDown for ores2009 and thumbor2004.
Mon, Nov 28, 3:51 PM · SRE, serviceops, ops-codfw

Wed, Nov 23

Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Wed, Nov 23, 9:22 PM · SRE, ops-eqiad, DC-Ops
Papaul added a comment to T313830: Q1:rack/setup/install contint1002.

@Dzahn yes, the server has a public IP address

Wed, Nov 23, 8:50 PM · serviceops-collab, ops-eqiad, SRE, DC-Ops
Papaul closed T313830: Q1:rack/setup/install contint1002 as Resolved.

@LSobanski this is done

Wed, Nov 23, 7:42 PM · serviceops-collab, ops-eqiad, SRE, DC-Ops
Papaul closed T313830: Q1:rack/setup/install contint1002, a subtask of T294276: contint hardware refresh, as Resolved.
Wed, Nov 23, 7:41 PM · Continuous-Integration-Infrastructure, serviceops-collab, Release-Engineering-Team (Seen)
Papaul updated the task description for T313830: Q1:rack/setup/install contint1002.
Wed, Nov 23, 7:41 PM · serviceops-collab, ops-eqiad, SRE, DC-Ops
Papaul updated the task description for T319433: Q2:rack/setup/install arclamp1001.eqiad.wmnet.
Wed, Nov 23, 7:07 PM · serviceops, SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Wed, Nov 23, 7:06 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313830: Q1:rack/setup/install contint1002.
Wed, Nov 23, 7:04 PM · serviceops-collab, ops-eqiad, SRE, DC-Ops
Papaul updated the task description for T319433: Q2:rack/setup/install arclamp1001.eqiad.wmnet.
Wed, Nov 23, 7:03 PM · serviceops, SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T313983: Q1:rack/setup/install cloudvirt10[54-61].eqiad.wmnet.
Wed, Nov 23, 6:01 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Papaul updated the task description for T313830: Q1:rack/setup/install contint1002.
Wed, Nov 23, 4:58 PM · serviceops-collab, ops-eqiad, SRE, DC-Ops
Papaul updated the task description for T313830: Q1:rack/setup/install contint1002.
Wed, Nov 23, 4:35 PM · serviceops-collab, ops-eqiad, SRE, DC-Ops
Papaul closed T321122: Q2:rack/setup/install dbprov1004 as Resolved.

@jcrespo this is done

Wed, Nov 23, 2:27 AM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul updated the task description for T321122: Q2:rack/setup/install dbprov1004.
Wed, Nov 23, 2:26 AM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops

Tue, Nov 22

Papaul closed T317892: Q2:rack/setup/install puppetdb1003 as Resolved.

@MoritzMuehlenhoff this is complete

Tue, Nov 22, 11:12 PM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T317892: Q2:rack/setup/install puppetdb1003.
Tue, Nov 22, 11:11 PM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T321122: Q2:rack/setup/install dbprov1004.
Tue, Nov 22, 10:37 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul updated the task description for T317892: Q2:rack/setup/install puppetdb1003.
Tue, Nov 22, 10:36 PM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T321122: Q2:rack/setup/install dbprov1004.
Tue, Nov 22, 10:24 PM · SRE, Data-Persistence-Backup, ops-eqiad, DC-Ops
Papaul closed T322419: Troubleshoot why latest idrac version is not working on Dell servers as Resolved.

This is now fixed by @jbond and @Volans

Tue, Nov 22, 10:19 PM · SRE, ops-codfw
Papaul updated the task description for T317892: Q2:rack/setup/install puppetdb1003.
Tue, Nov 22, 10:17 PM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T317892: Q2:rack/setup/install puppetdb1003.
Tue, Nov 22, 8:58 PM · SRE, ops-eqiad, DC-Ops

Mon, Nov 21

Papaul added a comment to T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].

@Ottomata the SSD looks like the first disk, /dev/sda. Below is what I have:

Virtual Disk 238: RAID1, 446.625GB, Ready
Virtual Disk 239: RAID10, 21.829TB, Ready
Mon, Nov 21, 9:09 PM · SRE, ops-eqiad, DC-Ops
Papaul added a comment to T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].

@BTullis thanks for the update. It looks like we have an issue with the partman recipe. Can you please take a look and let me know? Thanks

────────────────────┤ [!] Partition disks ├───────────
  │ │
  │ │                        383.6 GB is too small
  │ │ You asked for 383.6 GB to be used for guided partitionin
  │ │ selected partitioning recipe requires at least 6.0 TB.
  │ │
  └─│     <Go Back>                                        <Co
    │
Mon, Nov 21, 5:43 PM · SRE, ops-eqiad, DC-Ops
Papaul added a comment to T323512: db2174 lost power.

I checked and all looks good on the server. @Ladsgroup can you confirm that all is good on your end for this server?
Thanks

Mon, Nov 21, 5:06 PM · SRE, DBA, ops-codfw
Papaul renamed T323512: db2174 lost power from db1174 lost power to db2174 lost power.
Mon, Nov 21, 5:04 PM · SRE, DBA, ops-codfw
Papaul moved T323512: db2174 lost power from Backlog to Hardware Failure / Troubleshoot on the ops-codfw board.
Mon, Nov 21, 5:04 PM · SRE, DBA, ops-codfw

Sat, Nov 19

Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Sat, Nov 19, 1:13 AM · SRE, ops-eqiad, DC-Ops

Fri, Nov 18

Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Fri, Nov 18, 9:42 PM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Fri, Nov 18, 9:41 PM · SRE, ops-eqiad, DC-Ops
Papaul closed T313960: Q1:rack/setup/install kafka-logging100[45] as Resolved.

@herron this is complete

Fri, Nov 18, 4:10 PM · SRE Observability, observability, SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T313960: Q1:rack/setup/install kafka-logging100[45].
Fri, Nov 18, 4:09 PM · SRE Observability, observability, SRE, ops-eqiad, DC-Ops
Papaul added a comment to T313960: Q1:rack/setup/install kafka-logging100[45].

@jbond I think the issue was what @Volans mentioned above. I didn't have the issue with another node that I worked on yesterday (kafka-jumbo1010). Thanks to both of you

Fri, Nov 18, 3:19 PM · SRE Observability, observability, SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Fri, Nov 18, 4:58 AM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Fri, Nov 18, 4:56 AM · SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].
Fri, Nov 18, 4:48 AM · SRE, ops-eqiad, DC-Ops
Papaul added a comment to T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].

@Ottomata @BTullis what HW RAID are we using for those servers ?
Thanks

Fri, Nov 18, 4:03 AM · SRE, ops-eqiad, DC-Ops
Papaul added a comment to T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].

@Jclark-ctr I have no node in Netbox with the name kafka-jumbo1013, but I do have a node wmf10606 with purchase date 2022-06-07 that is set to offline in Netbox. Can you please track down where kafka-1013 is and update Netbox for me?

Fri, Nov 18, 3:52 AM · SRE, ops-eqiad, DC-Ops
Papaul updated subscribers of T313960: Q1:rack/setup/install kafka-logging100[45].

@Volans I tried to run the reimage cookbook on kafka-logging1005 and I am getting the error below:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/remote.py", line 335, in query
    hosts = query.Query(self._config).execute(query_string)
  File "/usr/lib/python3/dist-packages/cumin/query.py", line 65, in execute
    raise InvalidQueryError(
cumin.backends.InvalidQueryError: Unable to parse the query 'D{kafka-logging1005:.eqiad.wmnet}' neither with the default backend 'puppetdb' nor with the global grammar:
puppetdb: Expected end of text, found '{'  (at char 1), (line:1, col:2)
global: Expected end of text, found ':'  (at char 17), (line:1, col:18)
Fri, Nov 18, 3:10 AM · SRE Observability, observability, SRE, ops-eqiad, DC-Ops
Papaul updated the task description for T313960: Q1:rack/setup/install kafka-logging100[45].
Fri, Nov 18, 2:57 AM · SRE Observability, observability, SRE, ops-eqiad, DC-Ops
Papaul updated subscribers of T313960: Q1:rack/setup/install kafka-logging100[45].

@jbond if you have time tomorrow: I got the error below on kafka-logging1004. I checked that the upgrade completed with no issue, but the cookbook failed with the error below. Thanks

Fri, Nov 18, 1:06 AM · SRE Observability, observability, SRE, ops-eqiad, DC-Ops
Papaul claimed T306939: Q4:(Need By: TBD) rack/setup/install kafka-jumbo101[0-5].

@BTullis thank you, I will take over this task

Fri, Nov 18, 12:45 AM · SRE, ops-eqiad, DC-Ops

Thu, Nov 17

Papaul added a comment to T322988: db2173 HW errors.

I have a case open with Dell

Service Request: 1115331653
Thu, Nov 17, 10:53 PM · DBA, SRE, ops-codfw
Papaul closed T321339: Port with no description on access switch as Resolved.

This was already fixed.

Thu, Nov 17, 9:39 PM · ops-codfw
Papaul updated the task description for T322578: codfw:test new Supermicro server.
Thu, Nov 17, 2:03 AM · Patch-For-Review, SRE, ops-codfw
Papaul moved T323222: Degraded RAID on ganeti2013 from Backlog to Hardware Failure / Troubleshoot on the ops-codfw board.
Thu, Nov 17, 1:58 AM · SRE, ops-codfw
Papaul moved T323220: Broken disk on ganeti2013 from Backlog to Hardware Failure / Troubleshoot on the ops-codfw board.
Thu, Nov 17, 1:57 AM · SRE, ops-codfw

Wed, Nov 16

Papaul updated the task description for T322578: codfw:test new Supermicro server.
Wed, Nov 16, 1:11 AM · Patch-For-Review, SRE, ops-codfw
Papaul updated the task description for T322578: codfw:test new Supermicro server.
Wed, Nov 16, 1:10 AM · Patch-For-Review, SRE, ops-codfw
Papaul closed T321128: Q1:rack/setup/install dbprov2004 as Resolved.

The R650 is working fine; no issues to report on my end. The only problem, which I think we already know about, is that the server has one power supply on the left and the other on the right, unlike all the other servers, which have both power supplies on the right.

Wed, Nov 16, 1:06 AM · Data-Persistence-Backup, SRE, ops-codfw, DC-Ops
Papaul added a comment to T322419: Troubleshoot why latest idrac version is not working on Dell servers.

I had a chat with @jbond on IRC; he is looking into this.

Wed, Nov 16, 12:05 AM · SRE, ops-codfw
Papaul closed T322238: ulsfo: cp4052 repro whole provisioning process as Resolved.

It turns out that two issues were causing the R450 to fail during provisioning:
1 - The BIOS was set to UEFI
2 - The serial communication settings were different from those on the older servers (R440, R430)

Wed, Nov 16, 12:03 AM · SRE, ops-ulsfo

Tue, Nov 15

Papaul added a comment to T321128: Q1:rack/setup/install dbprov2004.

The serial communication issue we had was fixed by @Volans's patch

Tue, Nov 15, 9:06 PM · Data-Persistence-Backup, SRE, ops-codfw, DC-Ops
Papaul updated the task description for T321128: Q1:rack/setup/install dbprov2004.
Tue, Nov 15, 9:05 PM · Data-Persistence-Backup, SRE, ops-codfw, DC-Ops
Papaul added a comment to T321128: Q1:rack/setup/install dbprov2004.

@jcrespo thank you for the update

Tue, Nov 15, 5:34 PM · Data-Persistence-Backup, SRE, ops-codfw, DC-Ops
Papaul added a comment to T321128: Q1:rack/setup/install dbprov2004.

@Volans All looks good on the R650. The only issue is that the provision cookbook didn't set up the serial communication, as happened with the R450. Do you want us to keep the server for this week as well to give you time to look into it?

Tue, Nov 15, 4:17 PM · Data-Persistence-Backup, SRE, ops-codfw, DC-Ops