Volans (Riccardo Coccioli)
SRE

User Details

User Since
Feb 10 2016, 11:25 AM (326 w, 6 d)
Availability
Available
IRC Nick
volans
LDAP User
Volans
MediaWiki User
RCoccioli (WMF) [ Global Accounts ]

Recent Activity

Yesterday

Volans added a comment to T155761: DNS repo: add Jenkins job to ensure there are no duplicates.

I have a local patch that I'm testing to validate the whole dataset (manual + netbox). The preliminary results are below. I will have a look at the reported errors (which seem legit at first sight) and also at the warnings, which might not be reported correctly anymore (there seem to be a bit too many of them).

Tue, May 17, 9:54 PM · Traffic, Infrastructure-Foundations, DNS, SRE-tools
Volans added a comment to T307538: Write a GitLab "Migrating a Project" runbook / manual based on Blubber migration.

SRE foundations pointed me at @Volans who wrote and knows everything about runbooks implementations and could surely recommend an existing tool or guide us toward using cookbook if it is at all possible :]

Tue, May 17, 2:34 PM · Release-Engineering-Team (GitLab-a-thon 🦊), User-brennen, User-dduvall, GitLab (Project Migration)
Volans closed T304497: Implement persistence of spicerack and cumin logs to survive host reimage/refresh/failure as Resolved.

I've merged the change and @jcrespo has run it manually. The backup is working fine. Resolving.

Tue, May 17, 1:10 PM · Data-Persistence-Backup, bacula, Infrastructure-Foundations, Data-Persistence (Consultation)

Thu, May 12

Volans added a comment to P27820 Var inheritance with inlcuded file - Jinja2.

Could you try this?

Thu, May 12, 4:14 PM
Volans added a comment to T307260: sre.hosts.reimage: wait reboot time timeout on aqs nodes .

We looked at the logs with John and Papaul during our last meeting and agreed that it took a long time for mdadm+mkfs to create the software RAID partition and format it. Hence we decided to just increase the current timeout in spicerack; I'll make the patch.

Thu, May 12, 10:30 AM · Infrastructure-Foundations, SRE-tools, DC-Ops

Wed, May 11

Volans added a comment to T296452: Upgrade Netbox to 3.2.

Then manually ran the "import from puppetDB" Netbox script for bast4003 (not sure if that should have been automated or not).

Wed, May 11, 2:46 PM · Patch-For-Review, Infrastructure-Foundations, netbox
Volans added a comment to T296452: Upgrade Netbox to 3.2.

/srv/netbox-exports/dns.git doesn't exist as expected, and the DNS generation went fine.

Wed, May 11, 2:44 PM · Patch-For-Review, Infrastructure-Foundations, netbox

Mon, May 2

Volans added a comment to T307349: Accidental removal of some files under /srv/deployment on deploy1002.

Mentioned in SAL (#wikimedia-operations) [2022-05-02T12:48:01Z] <volans> swapped /srv/deployment directory on deploy1002 with the one from the latest backup - T307349

Mon, May 2, 12:58 PM · Parsoid, Deployments, Release-Engineering-Team (Doing), bacula, SRE
Volans updated the task description for T155761: DNS repo: add Jenkins job to ensure there are no duplicates.
Mon, May 2, 10:00 AM · Traffic, Infrastructure-Foundations, DNS, SRE-tools
Volans edited projects for T155761: DNS repo: add Jenkins job to ensure there are no duplicates, added: SRE-tools, DNS; removed SRE.

Ack, let me repurpose this one.

Mon, May 2, 9:58 AM · Traffic, Infrastructure-Foundations, DNS, SRE-tools

Sat, Apr 30

Volans added a project to T307260: sre.hosts.reimage: wait reboot time timeout on aqs nodes : SRE-tools.
Sat, Apr 30, 9:50 AM · Infrastructure-Foundations, SRE-tools, DC-Ops
Volans triaged T307260: sre.hosts.reimage: wait reboot time timeout on aqs nodes as Medium priority.

@Papaul is it normal that it's so slow to just create an empty partition?
We can surely increase the number or add some tweak in spicerack to make it a bit more dynamic. I'll take a look at it next week.

Sat, Apr 30, 9:50 AM · Infrastructure-Foundations, SRE-tools, DC-Ops
Volans added a comment to T155761: DNS repo: add Jenkins job to ensure there are no duplicates.

@fgiunchedi yes and no: duplicates within the operations/dns repository are currently caught, but duplicates within the automatically-generated data, or between the manual and the generated data, are not.
What we could do is refactor the zone validator a bit to inject the Netbox-generated data into the zonefiles for each INCLUDE before parsing the file. That should allow catching all issues, but would also mean that some wrong data in Netbox might make CI fail on a totally valid dns patch.
What are your thoughts?
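
A rough sketch of the idea above, in Python: expand each INCLUDE with the generated snippet first, then look for duplicates across the combined data. The simplified `name IN type value` record layout, the `expand_includes` and `find_duplicates` helpers, the in-memory `generated` mapping and the sample records are all hypothetical; this is not the actual zone validator code.

```python
from collections import Counter


def expand_includes(zone_lines, generated):
    """Replace each `$INCLUDE name` line with the lines in generated[name].

    `generated` is a hypothetical mapping of include name -> list of lines.
    Includes that are not in the mapping are kept untouched.
    """
    expanded = []
    for line in zone_lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "$INCLUDE" and parts[1] in generated:
            expanded.extend(generated[parts[1]])
        else:
            expanded.append(line)
    return expanded


def find_duplicates(lines):
    """Return the sorted (name, type, value) records that appear more than once.

    Assumes one record per line in the form `name IN type value`;
    comments (after `;`), directives and other layouts are skipped.
    """
    records = []
    for line in lines:
        parts = line.split(";", 1)[0].split()
        if len(parts) < 4 or parts[0].startswith("$") or parts[1] != "IN":
            continue
        records.append((parts[0].lower(), parts[2].upper(), " ".join(parts[3:])))
    counts = Counter(records)
    return sorted(record for record, count in counts.items() if count > 1)


# Illustrative data only: one manual record duplicated by the generated snippet.
manual = [
    "host1 IN A 10.0.0.1",
    "$INCLUDE netbox/example.org",
]
generated = {
    "netbox/example.org": ["host1 IN A 10.0.0.1", "host2 IN A 10.0.0.2"],
}
duplicates = find_duplicates(expand_includes(manual, generated))
print(duplicates)
```

The trade-off mentioned in the comment still applies: once the generated data is part of the check, a bad record on the Netbox side would show up as a failure on an otherwise valid manual patch.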

Sat, Apr 30, 9:43 AM · Traffic, Infrastructure-Foundations, DNS, SRE-tools

Tue, Apr 26

Volans edited P23803 SRE Observability contact hosts.
Tue, Apr 26, 11:02 AM
Volans added a comment to T306809: sre.dns.netbox cookbook dosn't support period terminated domains .

Sure, but they could cause various unwanted issues in different contexts, such as not matching the fingerprint in the known hosts file for SSH connections:

cumin1001 $ SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh root@sretest1001.eqiad.wmnet.
The authenticity of host 'sretest1001.eqiad.wmnet. (2620:0:861:107:10:64:48:138)' can't be established.
[...SNIP...]

That's why I would prefer to keep the data in Netbox consistent, without the ending period, so that the behaviour is the same everywhere. For the DNS-specific bits the period is always added automatically.
Thoughts?
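
As a minimal illustration of that convention (stored names without the ending period, period added only where DNS needs it), the two hypothetical helpers below are a sketch and not code from Netbox, Spicerack or the cookbooks:

```python
def to_stored_name(name):
    """Return the name as it would be stored: lowercase, no single trailing period."""
    name = name.strip().lower()
    return name[:-1] if name.endswith(".") else name


def to_dns_name(name):
    """Return the absolute form used in DNS-specific output, with the ending period."""
    return to_stored_name(name) + "."


# Both spellings normalise to the same stored value, so lookups keyed on the
# name (for example a known hosts entry) see one consistent string.
print(to_stored_name("sretest1001.eqiad.wmnet."))
print(to_dns_name("sretest1001.eqiad.wmnet"))
```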

Tue, Apr 26, 9:33 AM · SRE, Traffic, DNS, netbox, SRE-tools, Infrastructure-Foundations
Volans added a comment to T306809: sre.dns.netbox cookbook dosn't support period terminated domains .

The DNS Name field in Netbox is an FQDN; the Netbox UI help message for the field itself is: Hostname or FQDN (not case-sensitive)
For this reason I think that this task should be closed as invalid, and instead we should move towards having consistent data in Netbox.
The DNS Name field is used in multiple places in different automation, from Homer to various Netbox scripts and is always considered to be an FQDN.

Tue, Apr 26, 8:49 AM · SRE, Traffic, DNS, netbox, SRE-tools, Infrastructure-Foundations

Fri, Apr 22

Volans renamed T306654: Request sudo access for Jclark-ctr from WIP: request sudo access for Jclark-ctr to Request sudo access for Jclark-ctr.
Fri, Apr 22, 9:18 AM · Infrastructure-Foundations (FY2021/2022-Q4), SRE, SRE-Access-Requests
Volans moved T306654: Request sudo access for Jclark-ctr from Untriaged to In Discussion on the SRE-Access-Requests board.
Fri, Apr 22, 8:55 AM · Infrastructure-Foundations (FY2021/2022-Q4), SRE, SRE-Access-Requests
Volans updated the task description for T306654: Request sudo access for Jclark-ctr.
Fri, Apr 22, 8:35 AM · Infrastructure-Foundations (FY2021/2022-Q4), SRE, SRE-Access-Requests
Volans added a comment to T306654: Request sudo access for Jclark-ctr.

I've updated the task description according to T306654#7873125.
As for the puppet-merge on the puppetmasters, does the datacenter-ops have +2 on the operations/puppet repository on Gerrit?

Fri, Apr 22, 8:34 AM · Infrastructure-Foundations (FY2021/2022-Q4), SRE, SRE-Access-Requests
Volans updated the task description for T306654: Request sudo access for Jclark-ctr.
Fri, Apr 22, 8:33 AM · Infrastructure-Foundations (FY2021/2022-Q4), SRE, SRE-Access-Requests
Volans added a comment to T306654: Request sudo access for Jclark-ctr.

on apt.wikimedia.org

  • sudo puppet agent
Fri, Apr 22, 8:28 AM · Infrastructure-Foundations (FY2021/2022-Q4), SRE, SRE-Access-Requests
Volans merged T297133: Manage DHCP of Ganeti VMs from Netbox into T306661: Update makevm to include completion of the installation with the puppet runs.
Fri, Apr 22, 8:27 AM · Infrastructure-Foundations (FY2021/2022-Q4)
Volans merged task T297133: Manage DHCP of Ganeti VMs from Netbox into T306661: Update makevm to include completion of the installation with the puppet runs.
Fri, Apr 22, 8:27 AM · SRE-tools, Infrastructure-Foundations

Thu, Apr 21

Volans added a comment to T306552: Spicerack: add network devices support.

Thanks for opening the task to discuss details. As first feedback, my primary question is: how do you envision reconciling this new third way to configure the network devices with the existing two?
Basically, if we make a change via this method, would homer then be out of sync? Or would anything we plan to do with this method be automatically included in homer runs, and so be a noop for homer based on the updated Netbox configuration and the new state of the network device?

Thu, Apr 21, 1:26 PM · SRE, Infrastructure-Foundations, netops, Spicerack, SRE-tools
Volans triaged T305676: Validate all yaml files in puppet.git as Medium priority.
Thu, Apr 21, 9:28 AM · Puppet, SRE, Infrastructure-Foundations
Volans triaged T305979: allow certain users to disable puppet on mwdebug hosts as Medium priority.
Thu, Apr 21, 9:26 AM · Infrastructure-Foundations, serviceops, SRE, SRE-Access-Requests
Volans triaged T306580: haproxy tls terminator autobanning as Medium priority.
Thu, Apr 21, 9:25 AM · SRE, Traffic
Volans closed T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) as Resolved.

@KSiebert thanks, it's all done. There was some confusion about which email the account should be associated with.
@TheresNoTime I'm resolving this, feel free to reopen it in case you encounter any issue.

Thu, Apr 21, 8:49 AM · SRE, LDAP-Access-Requests
Volans added a comment to T306117: Grant Access to nda for jmads.

Sorry, I overlooked the request. As your account uses an @wikimedia.org email address, I've granted you the wmf group in LDAP and revoked the nda one, as they can't coexist.
But don't worry, if/when the contract ends you will be able to request the nda one again if needed.

Thu, Apr 21, 7:04 AM · SRE, LDAP-Access-Requests

Wed, Apr 20

Volans added a comment to T306057: Request to grant cparle and mfossati login to an-airflow1003.eqiad.wmne.

I've +1ed the patch, @Ottomata feel free to merge whenever works for you.

Wed, Apr 20, 4:04 PM · SRE, SRE-Access-Requests, Generated Data Platform, Data-Engineering
Volans triaged T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) as Medium priority.
Wed, Apr 20, 3:48 PM · SRE, LDAP-Access-Requests
Volans moved T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) from Awaiting User Input to Code Review Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 3:46 PM · SRE, LDAP-Access-Requests
Volans closed T306437: Grant Access to ldap/wmf for AAssaf as Resolved.

Patch merged, resolving.

Wed, Apr 20, 3:46 PM · SRE, LDAP-Access-Requests
Volans closed T306225: Grant Access to ldap/wmf for "Mary Yang" as Resolved.

Patch merged, resolving.

Wed, Apr 20, 3:45 PM · SRE Observability (FY2021/2022-Q4), SRE, LDAP-Access-Requests
Volans added a comment to T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime).

Granted ldap/wmf to uid=samtar, revoked the pre-existing ldap/nda one as they can't coexist on the same account.
Don't worry: if/when the contract is over, you can get the nda one back.

Wed, Apr 20, 3:35 PM · SRE, LDAP-Access-Requests
Volans added a comment to T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime).

And yes I did (starling-ctr@wikimedia) :-)

Wed, Apr 20, 3:19 PM · SRE, LDAP-Access-Requests
Volans moved T306225: Grant Access to ldap/wmf for "Mary Yang" from Manager Approval Pending to Code Review Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 3:09 PM · SRE Observability (FY2021/2022-Q4), SRE, LDAP-Access-Requests
Volans moved T306437: Grant Access to ldap/wmf for AAssaf from Manager Approval Pending to Code Review Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 3:09 PM · SRE, LDAP-Access-Requests
Volans moved T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) from Code Review Pending to Awaiting User Input on the LDAP-Access-Requests board.
Wed, Apr 20, 3:09 PM · SRE, LDAP-Access-Requests
Volans moved T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) from Manager Approval Pending to Code Review Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 3:09 PM · SRE, LDAP-Access-Requests
Volans added a comment to T306225: Grant Access to ldap/wmf for "Mary Yang".

As clarified in the related task above, granted ldap/wmf to uid=maryyang.

Wed, Apr 20, 3:00 PM · SRE Observability (FY2021/2022-Q4), SRE, LDAP-Access-Requests
Volans added a comment to T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime).

Hey @Volans, I'm already in the ldap/nda group from previous volunteer work :-) I believe the only reason for this request was for the gerrit +2 ACLs (which the nda group doesn't provide)?

Wed, Apr 20, 2:58 PM · SRE, LDAP-Access-Requests
Volans added a comment to T306437: Grant Access to ldap/wmf for AAssaf.

LDAP wmf group granted for aassaf.

Wed, Apr 20, 2:57 PM · SRE, LDAP-Access-Requests
Volans added a comment to T306437: Grant Access to ldap/wmf for AAssaf.

Do they have an @wikimedia.org email account? As per https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#WMF_group we usually grant the ldap/wmf group only to staff and contractors with an @wikimedia.org email account.

They do.

Wed, Apr 20, 2:55 PM · SRE, LDAP-Access-Requests
Volans added a comment to T306437: Grant Access to ldap/wmf for AAssaf.

@Jdforrester-WMF I can check what's the difference in Gerrit, it depends on the repositories I guess.
Do they have an @wikimedia.org email account? As per https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#WMF_group we usually grant the ldap/wmf group only to staff and contractors with an @wikimedia.org email account.

Wed, Apr 20, 2:46 PM · SRE, LDAP-Access-Requests
Volans moved T303857: Need a service account on deploy servers for automated train pre-sync operations from Untriaged to In Discussion on the SRE-Access-Requests board.
Wed, Apr 20, 2:38 PM · Release-Engineering-Team (Radar), SRE-Access-Requests, serviceops, SRE, Infrastructure-Foundations
Volans closed T306117: Grant Access to nda for jmads as Resolved.

Granted ldap/nda group, confirmation of NDA on file is in T249873#7865953. Resolving.

Wed, Apr 20, 1:59 PM · SRE, LDAP-Access-Requests
Volans moved T306225: Grant Access to ldap/wmf for "Mary Yang" from Backlog to Manager Approval Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 1:54 PM · SRE Observability (FY2021/2022-Q4), SRE, LDAP-Access-Requests
Volans moved T306437: Grant Access to ldap/wmf for AAssaf from Backlog to Manager Approval Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 1:54 PM · SRE, LDAP-Access-Requests
Volans moved T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) from Awaiting User Input to Manager Approval Pending on the LDAP-Access-Requests board.
Wed, Apr 20, 1:54 PM · SRE, LDAP-Access-Requests
Volans moved T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime) from Backlog to Awaiting User Input on the LDAP-Access-Requests board.
Wed, Apr 20, 1:54 PM · SRE, LDAP-Access-Requests
Volans added a comment to T306518: Grant Access to ldap/wmf for Samtar (TheresNoTime).

For contractors we usually grant the ldap/nda group instead, at the practical level they are almost equivalent, so that should work too.
@TheresNoTime Would it be ok for you to convert this request into requesting the ldap/nda group?

Wed, Apr 20, 1:53 PM · SRE, LDAP-Access-Requests
Volans updated subscribers of T306472: Troubleshooting Mail Delivery Issues from Coupa.
Wed, Apr 20, 1:34 PM · Infrastructure-Foundations, Mail
Volans moved T305978: Grant Access to ldap/wmf for Nathillard from Code Review Pending to Awaiting User Input on the LDAP-Access-Requests board.
Wed, Apr 20, 10:55 AM · Infrastructure-Foundations, SRE, LDAP-Access-Requests
Volans moved T305979: allow certain users to disable puppet on mwdebug hosts from In Discussion to SRE Meeting Review on the SRE-Access-Requests board.
Wed, Apr 20, 10:54 AM · Infrastructure-Foundations, serviceops, SRE, SRE-Access-Requests
Volans moved T305979: allow certain users to disable puppet on mwdebug hosts from Untriaged to In Discussion on the SRE-Access-Requests board.
Wed, Apr 20, 10:54 AM · Infrastructure-Foundations, serviceops, SRE, SRE-Access-Requests
Volans moved T249873: Requesting access to analytics-privatedata-users for Jim Maddock from Untriaged to Awaiting User Input on the SRE-Access-Requests board.
Wed, Apr 20, 10:28 AM · SRE, SRE-Access-Requests
Volans added a comment to T250560: jmads requesting Kerberos password.

@jmads your kerberos account should still be valid, as far as I can tell. Please verify it and feel free to close this task if all is working as expected.

Wed, Apr 20, 10:28 AM · Analytics
Volans added a comment to T249873: Requesting access to analytics-privatedata-users for Jim Maddock.

@jmads the access patch has been merged, it will be deployed across the fleet within the next 30 minutes.
Feel free to close this task once verified that all is working as expected.

Wed, Apr 20, 10:27 AM · SRE, SRE-Access-Requests
Volans updated the task description for T249873: Requesting access to analytics-privatedata-users for Jim Maddock.
Wed, Apr 20, 9:13 AM · SRE, SRE-Access-Requests
Volans updated the task description for T249873: Requesting access to analytics-privatedata-users for Jim Maddock.
Wed, Apr 20, 8:53 AM · SRE, SRE-Access-Requests
Volans added a comment to T306225: Grant Access to ldap/wmf for "Mary Yang".

Pending clarification from @dr0ptp4kt on the similar request T306437#7864599

Wed, Apr 20, 8:45 AM · SRE Observability (FY2021/2022-Q4), SRE, LDAP-Access-Requests
Volans added a comment to T306490: Cumin should group similar SSH errors.

Would it be possible to group the similar SSH errors where the only difference is the target hostname?
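
One possible way to do that kind of grouping, sketched in Python: replace the target hostname in each message with a placeholder and group the hosts by the resulting template. The `group_errors` helper, the `<host>` placeholder and the sample messages are hypothetical and do not reflect how Cumin reports errors today.

```python
from collections import defaultdict


def group_errors(errors):
    """Group hosts whose error messages differ only by the target hostname.

    `errors` maps hostname -> error message. Returns a dict mapping the
    message template (hostname replaced by `<host>`) to the sorted hosts.
    """
    groups = defaultdict(list)
    for host, message in errors.items():
        groups[message.replace(host, "<host>")].append(host)
    return {template: sorted(hosts) for template, hosts in groups.items()}


# Illustrative input only.
errors = {
    "b.example.org": "ssh: connect to host b.example.org port 22: Connection timed out",
    "a.example.org": "ssh: connect to host a.example.org port 22: Connection timed out",
    "c.example.org": "Permission denied (publickey).",
}
grouped = group_errors(errors)
for template, hosts in grouped.items():
    print(f"{len(hosts)} host(s): {template} -> {', '.join(hosts)}")
```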

Wed, Apr 20, 8:23 AM · SRE-tools, Infrastructure-Foundations

Tue, Apr 19

Volans added a comment to T306437: Grant Access to ldap/wmf for AAssaf.

@dr0ptp4kt could you please clarify if this access request (and the other one related to the same project) is for the NDA group rather than the WMF one? The NDA group seems more appropriate for non-staff and is the same one used, for example, for research contractors.
As for accessing tools usually WMF and NDA are equivalent, so that shouldn't affect usability.

Tue, Apr 19, 4:02 PM · SRE, LDAP-Access-Requests
Volans added a comment to T305948: Requesting access to analytics-privatedata-users for Essex Igyan eigyan.

@eigyan the access request has been merged, it will be deployed within the next 30 minutes.
Please resolve this task once confirmed that it's all working as expected.

Tue, Apr 19, 3:58 PM · SRE, SRE-Access-Requests
Volans moved T305948: Requesting access to analytics-privatedata-users for Essex Igyan eigyan from Ready To Go to Awaiting User Input on the SRE-Access-Requests board.
Tue, Apr 19, 3:58 PM · SRE, SRE-Access-Requests
Volans added a comment to T306117: Grant Access to nda for jmads.

Pending the related T249873 at this point, to do all together.

Tue, Apr 19, 2:36 PM · SRE, LDAP-Access-Requests
Volans updated subscribers of T249873: Requesting access to analytics-privatedata-users for Jim Maddock.

Adding @BGerdemann for approval (contract side), please also provide a contract end date.
Adding @odimitrijevic for approval (analytics side).
Adding @KFrancis for confirming that there is still a valid NDA on file.

Tue, Apr 19, 1:38 PM · SRE, SRE-Access-Requests
Volans updated the task description for T249873: Requesting access to analytics-privatedata-users for Jim Maddock.
Tue, Apr 19, 1:15 PM · SRE, SRE-Access-Requests
Volans closed T306274: Add EChetty to #wmf-nda as Resolved.

@EChetty I've added you to the WMF-NDA project as staff member, no need for manager approval in this case. Resolving.
Feel free to re-open if you encounter any issue related to this access request.

Tue, Apr 19, 10:22 AM · WMF-NDA-Requests
Volans added a member for WMF-NDA: EChetty.
Tue, Apr 19, 10:21 AM
Volans triaged T306429: check_user: manager information not present anymore as Medium priority.
Tue, Apr 19, 10:17 AM · User-jbond, Infrastructure-Foundations
Volans moved T305634: Requesting access to analytics-privatedata-users for drochford (superset access with no server access) from Ready To Go to Awaiting User Input on the SRE-Access-Requests board.
Tue, Apr 19, 8:50 AM · SRE, SRE-Access-Requests
Volans updated the task description for T305634: Requesting access to analytics-privatedata-users for drochford (superset access with no server access).
Tue, Apr 19, 8:50 AM · SRE, SRE-Access-Requests

Apr 14 2022

Volans added a comment to T211750: Introduce Python code formatters usage.

Thanks for the detailed write-up of all the issues. It would be great at some point to carve these into Phabricator tasks so we could try to start whittling them down. For example, does git grep -E '^#!.*python2?$' give us all the python2 code that we need to check for black compatibility?

Apr 14 2022, 10:40 PM · Infrastructure-Foundations, User-Kormat, tox-wikimedia, Patch-For-Review, SRE, SRE-tools

Apr 13 2022

Volans added a comment to T211750: Introduce Python code formatters usage.

Are we ready to consider running black on our puppet repo?

Apr 13 2022, 9:47 PM · Infrastructure-Foundations, User-Kormat, tox-wikimedia, Patch-For-Review, SRE, SRE-tools
Volans committed rOSHOa45ae7a9b41f: setup.py: add missing types for requests (authored by Volans).
setup.py: add missing types for requests
Apr 13 2022, 4:48 PM
Volans committed rOSHOae7335c2f5af: capirca: catch also requests exceptions (authored by Volans).
capirca: catch also requests exceptions
Apr 13 2022, 4:48 PM
Volans added a comment to T45956: Rename $wmf* to $wmg* in wmf-config.

Perfect, thanks for clarifying.

Apr 13 2022, 12:52 PM · Technical-Debt, Wikimedia-Site-requests
Volans updated subscribers of T305979: allow certain users to disable puppet on mwdebug hosts.

let them run "puppet disable/enable" either directly or with a wrapper around it. (the one used by cumin?).

Apr 13 2022, 11:02 AM · Infrastructure-Foundations, serviceops, SRE, SRE-Access-Requests
Volans added a comment to T45956: Rename $wmf* to $wmg* in wmf-config.

Will the work on this task also change the key wmfMasterDatacenter in siteinfo's ['query']['general']['wmf-config']?
If so, please ping me when that happens, as I have to adjust spicerack accordingly.

Apr 13 2022, 9:45 AM · Technical-Debt, Wikimedia-Site-requests

Apr 12 2022

Volans edited P24020 Gerrit I6d8efdca3fa2f0ef206763fcd8efd4af14b00af4 usage example.
Apr 12 2022, 10:01 PM

Apr 11 2022

Volans renamed T305840: Cannot verify NTP status asw1-b12-drmrs from Cannot verify NTP satus asw1-b12-drmrs to Cannot verify NTP status asw1-b12-drmrs.
Apr 11 2022, 12:57 PM · SRE, Infrastructure-Foundations, netops

Apr 8 2022

Volans updated subscribers of T305676: Validate all yaml files in puppet.git.
Apr 8 2022, 9:38 AM · Puppet, SRE, Infrastructure-Foundations

Apr 7 2022

Volans edited P24020 Gerrit I6d8efdca3fa2f0ef206763fcd8efd4af14b00af4 usage example.
Apr 7 2022, 2:05 PM
Volans edited P24020 Gerrit I6d8efdca3fa2f0ef206763fcd8efd4af14b00af4 usage example.
Apr 7 2022, 1:44 PM
Volans edited P24020 Gerrit I6d8efdca3fa2f0ef206763fcd8efd4af14b00af4 usage example.
Apr 7 2022, 1:33 PM
Volans added a comment to T305589: Upgrading Wikidough and durum VMs to bullseye.

AIUI the decom cookbook doesn't support VMs yet (?)

Apr 7 2022, 12:13 PM · Traffic, SRE

Apr 6 2022

Volans added a comment to T300977: Maybe restrict domains accessible by webproxy.

If I may add my use case too, I would like to be able to restrict the access to the webproxies from the cumin hosts (cluster::management puppet role) and potentially other sensitive hosts. Ideally to an allow-list of URLs or something similar.

Apr 6 2022, 7:06 PM · Patch-For-Review, Research, Product-Analytics, SRE, netops, Infrastructure-Foundations, Data-Engineering
Volans created P24161 Example native "grouping".
Apr 6 2022, 2:08 PM

Apr 5 2022

Volans closed T304434: reimage cookbook failure due to ipmi settings as Resolved.

With the above patch merged the problem should not happen anymore, if it does please re-open the task, I'm boldly resolving it for now.

Apr 5 2022, 1:50 PM · Infrastructure-Foundations, cloud-services-team (Kanban)

Mar 31 2022

Volans created P24020 Gerrit I6d8efdca3fa2f0ef206763fcd8efd4af14b00af4 usage example.
Mar 31 2022, 5:42 PM
Volans renamed T212866: Create Spicerack cookbook to drain/reboot/uncordon a Kubernetes worker from Create Spicerack cook book to drain/reboot/uncordon a Kubernetes worker to Create Spicerack cookbook to drain/reboot/uncordon a Kubernetes worker.
Mar 31 2022, 2:45 PM · Prod-Kubernetes, Infrastructure-Foundations, Kubernetes, SRE, SRE-tools

Mar 30 2022

Volans edited P23803 SRE Observability contact hosts.
Mar 30 2022, 1:20 PM
Volans created P23803 SRE Observability contact hosts.
Mar 30 2022, 12:22 PM

Mar 29 2022

Volans added a comment to T304851: codfw: setup a HP server to test PXE/DHCP.

All done from my side, thanks a lot!

Mar 29 2022, 4:43 PM · Infrastructure-Foundations
Volans added a comment to T304321: Most Icinga http checks ignore the URL parameter.

As John is out I took a stab at the implementation in https://gerrit.wikimedia.org/r/c/operations/puppet/+/773272, and decided to convert it to Python, as I think it gives us more flexibility.

Mar 29 2022, 1:34 PM · SRE Observability (FY2021/2022-Q4), Patch-For-Review, SRE, Sustainability (Incident Followup), observability
Volans added a comment to T304851: codfw: setup a HP server to test PXE/DHCP.

I'd also like to use this host for a couple of Force PXE tests for T304434 if possible

Mar 29 2022, 11:16 AM · Infrastructure-Foundations

Mar 28 2022

Volans added a comment to T300246: Add alert for varnishkafka low/zero messages per second to alertmanager.

I asked a question in #wikimedia-traffic on IRC about this. It seems that the ideal way to do it would be to get the pooled/depooled status of hosts into prometheus, so that we could integrate this status into the alert. Several people thought that this would be useful.

Tagging @fgiunchedi who might be able to advise further. In the meantime I'd be quite keen to get the alert deployed as-is, without the exclusion based on pooled/depooled status.

Mar 28 2022, 1:47 PM · Patch-For-Review, Data-Engineering, Data-Engineering-Kanban