crusnov (Cas Rusnov)
User

User Details

User Since
Oct 15 2018, 5:56 PM (52 w, 3 d)
Availability
Available
LDAP User
CRusnov
MediaWiki User
Unknown

Recent Activity

Yesterday

crusnov triaged T235736: cp3032 and cp3040 occasional failed fetches as Normal priority.
Thu, Oct 17, 6:46 PM · Operations, Traffic
crusnov triaged T235743: Prepare and check storage layer for mnwwiki as Normal priority.
Thu, Oct 17, 6:46 PM · cloud-services-team (Kanban), Data-Services, DBA, Operations
crusnov triaged T235755: Increased latency in POST requests as Normal priority.
Thu, Oct 17, 6:45 PM · Performance-Team, serviceops, Operations
crusnov triaged T235676: dwisehaupt needs access to Icinga for frack hosts as Normal priority.
Thu, Oct 17, 2:46 PM · fundraising-tech-ops, Icinga, Operations
crusnov triaged T235677: Automatic pickup of Gerrit clone master doesn't happen (due to git-lfs not installed on production misc) as Normal priority.
Thu, Oct 17, 2:45 PM · Gerrit, Release-Engineering-Team, Operations, Wikimedia Design Style Guide
crusnov assigned T235688: SSH access for Lex Nasser, analytics intern to Dzahn.
Thu, Oct 17, 2:45 PM · Analytics, SRE-Access-Requests, Operations

Wed, Oct 16

crusnov triaged T235716: update librenms report as Normal priority.
Wed, Oct 16, 10:14 PM · Operations, netbox
crusnov added a comment to T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.

Thanks for bugging about this, I have silenced the particular alerts for the time being so as to reduce spam. I should have time to debug it more this week and we'll try to get it to shut up.

Wed, Oct 16, 3:11 PM · SRE-tools
crusnov added a comment to T235458: Make pipermail show RTL emails better by emitting dir=auto.

Okay cool, like I said I'm not completely versed :)

Wed, Oct 16, 1:34 AM · Patch-For-Review, I18n, Operations, Wikimedia-Mailing-lists, RTL

Tue, Oct 15

crusnov moved T235550: Rename multimedia-team to structured-data-team from Backlog to List maintenance on the Wikimedia-Mailing-lists board.
Tue, Oct 15, 11:57 PM · Operations, Wikimedia-Mailing-lists
crusnov added a comment to T235458: Make pipermail show RTL emails better by emitting dir=auto.

After discussing this a bit, it looks like it's not currently possible with pipermail without upstream changes (willing to be wrong here, but afaict).

Tue, Oct 15, 11:56 PM · Patch-For-Review, I18n, Operations, Wikimedia-Mailing-lists, RTL
crusnov moved T235458: Make pipermail show RTL emails better by emitting dir=auto from Backlog to Mailman v3 on the Wikimedia-Mailing-lists board.
Tue, Oct 15, 11:55 PM · Patch-For-Review, I18n, Operations, Wikimedia-Mailing-lists, RTL
crusnov claimed T235550: Rename multimedia-team to structured-data-team.

Question, do you need an alias for the old list name?

Tue, Oct 15, 8:57 PM · Operations, Wikimedia-Mailing-lists
crusnov added a comment to T235458: Make pipermail show RTL emails better by emitting dir=auto.

I did a little digging but I don't immediately see where this is configured. Anyone more experienced with Mailman should look at this.

Tue, Oct 15, 8:43 PM · Patch-For-Review, I18n, Operations, Wikimedia-Mailing-lists, RTL
crusnov triaged T235458: Make pipermail show RTL emails better by emitting dir=auto as Normal priority.
Tue, Oct 15, 8:43 PM · Patch-For-Review, I18n, Operations, Wikimedia-Mailing-lists, RTL
crusnov triaged T235488: Jobrunners: allow to check that they are in sync with the etcd data as Normal priority.
Tue, Oct 15, 6:16 PM · Operations, serviceops
crusnov triaged T234999: Create wikimedia sustainability mailing list as Normal priority.
Tue, Oct 15, 5:58 PM · Operations, Wikimedia-Mailing-lists
crusnov closed T235526: Update prod SSH key for nathante as Resolved.

Done and done.

Tue, Oct 15, 5:55 PM · Operations, SRE-Access-Requests
crusnov claimed T235526: Update prod SSH key for nathante.
Tue, Oct 15, 5:45 PM · Operations, SRE-Access-Requests

Tue, Oct 8

crusnov created T234997: Make Netbox Active/Active.
Tue, Oct 8, 8:11 PM · Patch-For-Review, Traffic, Operations

Mon, Oct 7

crusnov added a comment to T233183: Automate generation of Management DNS records from Netbox.

After discussing this a bit and thinking about it quite a lot, I'm highly in favor of a machine git repo for the generated side. This has a nice side benefit: we can easily expose it to the network via https on the netbox servers (and thus both publicly and to the dns servers).

Mon, Oct 7, 8:12 PM · Traffic, Operations, Patch-For-Review, User-crusnov, Goal, SRE-tools

Thu, Oct 3

crusnov added a comment to T233183: Automate generation of Management DNS records from Netbox.

Thanks for the extensive feedback & validation suggestions! I'll see what I can come up with.

Thu, Oct 3, 3:15 PM · Traffic, Operations, Patch-For-Review, User-crusnov, Goal, SRE-tools

Wed, Oct 2

crusnov claimed T229397: Puppet: get row/rack info from Netbox.
Wed, Oct 2, 5:32 PM · Patch-For-Review, Puppet, Operations
crusnov added a comment to T234452: Puppet breakage in automation-framework VMs.

Thanks for the heads up, we'll loop around to fix these up.

Wed, Oct 2, 4:16 PM · Operations
crusnov added a comment to T230449: Automate selection of IP address for interface.

This script has been released and appears to work correctly!

Has this been tested in the cloud Netbox test instance first?
Where can I see some example of generated IPs?

Wed, Oct 2, 3:49 PM · User-crusnov, Goal, SRE-tools
crusnov updated the task description for T228387: Bare metal cloud: management interfaces.
Wed, Oct 2, 4:04 AM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov closed T230449: Automate selection of IP address for interface, a subtask of T228387: Bare metal cloud: management interfaces, as Resolved.
Wed, Oct 2, 4:04 AM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov closed T230449: Automate selection of IP address for interface as Resolved.

This script has been released and appears to work correctly!

Wed, Oct 2, 4:04 AM · User-crusnov, Goal, SRE-tools

Tue, Sep 24

crusnov added a subtask for T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails: T233728: Netbox: netbox_dump_run service failed.
Tue, Sep 24, 11:49 PM · SRE-tools
crusnov added a parent task for T233728: Netbox: netbox_dump_run service failed: T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.
Tue, Sep 24, 11:49 PM · netbox
crusnov added a comment to T233728: Netbox: netbox_dump_run service failed.

This is the "switching to http" problem discussed in T232767, I believe. I haven't taken the time to debug it more fully, but I suspect it's something in Netbox's configuration or possibly a bug in Django REST framework.

Tue, Sep 24, 3:07 PM · netbox
crusnov added a comment to T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.

In debugging alerts on Netbox, I noticed that, unrelated to the CSV dumper, the ganeti sync sometimes returns a 500 error. This is caused by this error:

Tue, Sep 24, 3:42 AM · SRE-tools

Mon, Sep 23

crusnov added a comment to T218956: Should we deploy sshguard on external IP addresses?.

I think the general consensus I've heard is that external load balancers don't or can't have firewall rules, but perhaps we should consider it on a case-by-case basis for other external services.

Mon, Sep 23, 11:28 PM · User-crusnov

Sep 18 2019

crusnov added a comment to T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.

I spent some time debugging the problem with csv dumps from netbox2001. The basic gist is that when dumping a larger table, its pagination routine fails because the second page it tries to retrieve from the API is returning an http instead of an https url as the "next page" URL (and when netbox2001 tries to access an http url, it times out eventually because :80 is blocked). I suspect a bug in Netbox itself. I traced the execution in pdb, and it showed that this value is coming from the remote end.

Sep 18 2019, 4:31 AM · SRE-tools
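
The pagination failure described in the comment above can be illustrated with a small client-side sketch. This is an assumption-laden example, not the fix that was deployed: `force_https`, `iter_results`, and the `fetch_page` callable are hypothetical names, and the response shape (a dict with `results` and `next`) is assumed rather than taken from the task. It shows one way a client could upgrade an `http` "next page" URL to `https` before following it.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(next_url):
    """Return next_url with an http scheme upgraded to https.

    None is passed through unchanged, since the last page of a paginated
    response is assumed to carry no "next" link.
    """
    if next_url is None:
        return None
    parts = urlsplit(next_url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


def iter_results(fetch_page, first_url):
    """Yield every result, following "next" links one page at a time.

    fetch_page is a hypothetical callable that takes a URL and returns a
    dict with "results" (a list) and "next" (a URL or None).
    """
    url = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["results"]
        # Rewrite the scheme so the follow-up request never targets port 80.
        url = force_https(page.get("next"))
```

A workaround like this only hides the symptom on the client; the comment's suspicion that the server side produces the wrong URL would still need to be investigated separately.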
crusnov added a comment to T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.

I spent some time debugging the problem with csv dumps from netbox2001. The basic gist is that when dumping a larger table, its pagination routine fails because the second page it tries to retrieve from the API is returning an http instead of an https url as the "next page" URL (and when netbox2001 tries to access an http url, it times out eventually because :80 is blocked). I suspect a bug in Netbox itself. I traced the execution in pdb, and it showed that this value is coming from the remote end.

Sep 18 2019, 4:30 AM · SRE-tools
crusnov added a comment to T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.

After increasing the CPU count to 4 on both front-ends, the number of 500 errors that occur is much lower.

Sep 18 2019, 4:28 AM · SRE-tools
crusnov renamed T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails from Netbox API Occasionally 500s to Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.
Sep 18 2019, 4:28 AM · SRE-tools
crusnov closed T224517: netbox / netmon1002: netbox report related service units failed as Resolved.

netmon1002 is cleaned up now and should not be alerting on this basis anymore.

Sep 18 2019, 4:26 AM · observability, netbox, Operations
crusnov created T233183: Automate generation of Management DNS records from Netbox.
Sep 18 2019, 3:49 AM · Traffic, Operations, Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov closed T218709: Add Spicerack module for Ganeti as Resolved.
Sep 18 2019, 3:47 AM · SRE-tools, User-crusnov

Sep 17 2019

crusnov added a comment to T223291: Netbox: move it to dedicated Ganeti VMs.

This has been completed modulo some growing pains.

Sep 17 2019, 5:42 PM · netbox
crusnov closed T223291: Netbox: move it to dedicated Ganeti VMs as Resolved.
Sep 17 2019, 5:42 PM · netbox

Sep 12 2019

crusnov created T232767: Netbox API Occasionally 500s and Netbox2001 dumpcsv fails.
Sep 12 2019, 6:16 PM · SRE-tools

Sep 11 2019

crusnov closed T231502: More special cases for Netbox LibreNMS report as Resolved.
Sep 11 2019, 3:23 AM · netbox
crusnov added a comment to T230449: Automate selection of IP address for interface.

Shifting gears on this project to use the custom scripts interface added in 2.6.3. I shall make an 'add management interface' script that automatically assigns an IP address.

Sep 11 2019, 3:23 AM · User-crusnov, Goal, SRE-tools
crusnov added a comment to T223291: Netbox: move it to dedicated Ganeti VMs.

FWIW netbox.wikimedia.org points at netbox1001.wikimedia.org now. I am working on fixing some minor remaining issues with reports and making backups correct (the database is currently backed up correctly, but netbox proper needs its dumps backed up).

Sep 11 2019, 3:21 AM · netbox

Sep 9 2019

crusnov added a comment to T223291: Netbox: move it to dedicated Ganeti VMs.

Also adding @akosiaris for input as he initially wrote these classes.

The initial intention was for $includes to be an array indeed. So I favor b) as well.

@crusnov : Can you please fix this today, either by (partly) reverting https://gerrit.wikimedia.org/r/514395 or by adapting the type hints to use an array? This has prevented puppet runs on puppetdb2001 for ~ three days now and is blocking the setup of the new Buster-based puppetdb instances.

Ah, my mistake, apologies. As we discussed on IRC, I shall untypehint the offending part and open a ticket to address it later.

Sep 9 2019, 4:35 PM · netbox
crusnov created T232358: postgres::slave module type for includes parameter is inconsistent.
Sep 9 2019, 4:19 PM · Puppet, Operations, netbox
crusnov added a comment to T223291: Netbox: move it to dedicated Ganeti VMs.

Also adding @akosiaris for input as he initially wrote these classes.

The initial intention was for $includes to be an array indeed. So I favor b) as well.

@crusnov : Can you please fix this today, either by (partly) reverting https://gerrit.wikimedia.org/r/514395 or by adapting the type hints to use an array? This has prevented puppet runs on puppetdb2001 for ~ three days now and is blocking the setup of the new Buster-based puppetdb instances.

Sep 9 2019, 4:14 PM · netbox

Sep 4 2019

crusnov committed rOBPY79ab2f724da4: Fix patches (authored by crusnov).
Fix patches
Sep 4 2019, 6:22 PM
crusnov committed rOBPY98cd21f0b7fd: fix setup (authored by crusnov).
fix setup
Sep 4 2019, 6:22 PM
crusnov committed rOBPY72680b58ffa2: add final newline (authored by crusnov).
add final newline
Sep 4 2019, 6:22 PM
crusnov committed rOBPY63e81a3c13c5: Update changelog (authored by crusnov).
Update changelog
Sep 4 2019, 6:22 PM
crusnov committed rOBPYb86c1a8d0d0a: Patch requirements in debian branch (authored by crusnov).
Patch requirements in debian branch
Sep 4 2019, 6:22 PM

Sep 3 2019

crusnov added a comment to T231068: Spicerack: improve support for Ganeti VMs.

Okay, I've implemented changes to the netbox and ganeti modules as linked above, which should allow all of the operations requested. I have not implemented writing status to Ganeti VMs since this information is updated automatically, but it should be relatively straightforward to implement if desired.

Sep 3 2019, 4:47 AM · SRE-tools
crusnov added a comment to T223291: Netbox: move it to dedicated Ganeti VMs.

@Volans Ah hah, thanks for this. I was given to believe the 'default' would include the ferm config and didn't even think of looking.

Sep 3 2019, 2:38 AM · netbox

Aug 30 2019

crusnov updated the task description for T224946: Netbox Alert Cleanups.
Aug 30 2019, 3:15 PM · observability, User-crusnov, netbox, SRE-tools
crusnov updated the task description for T224946: Netbox Alert Cleanups.
Aug 30 2019, 3:10 PM · observability, User-crusnov, netbox, SRE-tools

Aug 29 2019

crusnov added a comment to T231068: Spicerack: improve support for Ganeti VMs.

Finding which cluster an instance belongs to, or whether an instance given by fqdn is a Ganeti instance at all, could be done as easily as trying to look it up in every configured cluster and checking if there's information. We could provide utility functions to perform those actions trivially.

Aug 29 2019, 4:07 AM · SRE-tools
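
The lookup strategy proposed in the comment above can be sketched as follows. This is a minimal illustration under stated assumptions, not Spicerack's actual API: `find_cluster`, `is_ganeti_instance`, and the shape of `clusters` (a mapping from cluster name to a lookup callable) are all hypothetical.

```python
def find_cluster(fqdn, clusters):
    """Return the name of the first cluster that knows fqdn, or None.

    clusters maps a cluster name to a hypothetical lookup callable that
    returns instance information for an FQDN, or None when the cluster
    has no such instance.
    """
    for name, lookup in clusters.items():
        if lookup(fqdn) is not None:
            return name
    return None


def is_ganeti_instance(fqdn, clusters):
    """Return True when any configured cluster has information on fqdn."""
    return find_cluster(fqdn, clusters) is not None
```

Trying every configured cluster costs one lookup per cluster in the worst case, which is probably acceptable when the number of clusters is small.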
crusnov created T231512: Netbox: Add CSV dump rotation.
Aug 29 2019, 3:58 AM · SRE-tools

Aug 27 2019

crusnov added a comment to T230449: Automate selection of IP address for interface.

Work is progressing. I've taken the step of setting up vagrant so I can comfortably hack on Netbox without breaking anybody.

Aug 27 2019, 4:18 PM · User-crusnov, Goal, SRE-tools
crusnov added a comment to T209182: Setup Swift Storage for Netbox image (was: netbox won't allow me to upload photos of the rack).

I have confirmed content-type is set correctly; however, Swift sets content-disposition to attachment, which causes the browser to download. Incoming patch to strip this at the apache level.

Aug 27 2019, 4:46 AM · Patch-For-Review, netbox, Operations

Aug 22 2019

crusnov closed T230964: Netbox LibreNMS report fails as Resolved.

Fix deployed with https://gerrit.wikimedia.org/r/531763

Aug 22 2019, 11:09 PM · netbox, Operations
crusnov claimed T230964: Netbox LibreNMS report fails.
Aug 22 2019, 3:30 PM · netbox, Operations

Aug 19 2019

crusnov created T230725: Make contact group for Netbox report alerts.
Aug 19 2019, 3:01 PM · observability, Operations
crusnov closed T221507: Netbox report to validate network equipment data as Resolved.
Aug 19 2019, 2:57 PM · netbox, User-crusnov, SRE-tools, Operations, netops

Aug 14 2019

crusnov updated the task description for T217072: Spicerack module for Netbox.
Aug 14 2019, 4:47 PM · netbox, Patch-For-Review, User-crusnov, SRE-tools
crusnov added a comment to T217072: Spicerack module for Netbox.

At least the first one i

Aug 14 2019, 4:47 PM · netbox, Patch-For-Review, User-crusnov, SRE-tools

Aug 13 2019

crusnov added a comment to T230449: Automate selection of IP address for interface.

There is an undocumented API which creates a new IP address for a given prefix:

Aug 13 2019, 11:18 PM · User-crusnov, Goal, SRE-tools
crusnov committed rLPRIb3c1c8624751: netbox: Add fake secrets for reorg (authored by crusnov).
netbox: Add fake secrets for reorg
Aug 13 2019, 9:59 PM
crusnov added a comment to T230449: Automate selection of IP address for interface.

I have spent time looking at adding the API required for this functionality. I believe I have figured out how to do it and will produce a patch shortly.

Aug 13 2019, 9:15 PM · User-crusnov, Goal, SRE-tools
crusnov created T230449: Automate selection of IP address for interface.
Aug 13 2019, 9:15 PM · User-crusnov, Goal, SRE-tools
crusnov updated the task description for T228387: Bare metal cloud: management interfaces.
Aug 13 2019, 9:11 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov closed T223292: Netbox: generate CSV backups as Resolved.

This has been fully deployed now and tested. It is automated.

Aug 13 2019, 6:43 PM · netbox
crusnov closed T228670: Import management interfaces into Netbox from DNS, a subtask of T228387: Bare metal cloud: management interfaces, as Resolved.
Aug 13 2019, 6:43 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov closed T228670: Import management interfaces into Netbox from DNS as Resolved.
Aug 13 2019, 6:43 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov added a comment to T228670: Import management interfaces into Netbox from DNS.

FWIW there was no mgmt DNS information for some hosts:

Aug 13 2019, 6:43 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov updated the task description for T228670: Import management interfaces into Netbox from DNS.
Aug 13 2019, 6:37 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov updated the task description for T228670: Import management interfaces into Netbox from DNS.
Aug 13 2019, 6:37 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov added a comment to T228670: Import management interfaces into Netbox from DNS.

Script has completed running. Several edge cases worked out with Arzhel (frack, etc). MGMT interfaces should be largely correct now.

Aug 13 2019, 6:36 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov moved T228670: Import management interfaces into Netbox from DNS from In Progress to Pending on the User-crusnov board.
Aug 13 2019, 5:22 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov committed rOSNB78bed5d03506: switch swagger to nonpublic mode (authored by crusnov).
switch swagger to nonpublic mode
Aug 13 2019, 3:40 PM
crusnov committed rLPRIb1029da4af40: make netbox tokens available to hosts that need em (authored by crusnov).
make netbox tokens available to hosts that need em
Aug 13 2019, 3:17 PM

Aug 6 2019

crusnov closed T209182: Setup Swift Storage for Netbox image (was: netbox won't allow me to upload photos of the rack) as Resolved.

Okay after some finagling, uploading (and downloading) images should work. A particularity of swift storage is that they download instead of viewing, but they work!

Aug 6 2019, 6:09 PM · Patch-For-Review, netbox, Operations
crusnov committed rLPRI6dcd0ece5ca3: netbox: add dummy swift url key (authored by crusnov).
netbox: add dummy swift url key
Aug 6 2019, 3:33 AM

Aug 1 2019

crusnov moved T222629: Netbox: Set up deploy groups for scap to ensure primary is deployed before secondary from Backlog to Complete on the User-crusnov board.
Aug 1 2019, 10:25 PM · User-crusnov, netbox, SRE-tools

Jul 31 2019

crusnov added a comment to T226331: Upgrade Netbox to 2.6.1.

What was the issue?

  • There were some root-owned files in the tree because of testing that happened
  • There were missing deps for Swift in the build package
Jul 31 2019, 3:21 PM · Patch-For-Review, netbox
crusnov added a comment to T226331: Upgrade Netbox to 2.6.1.

What was the issue?

Jul 31 2019, 3:20 PM · Patch-For-Review, netbox

Jul 30 2019

crusnov added a comment to T209182: Setup Swift Storage for Netbox image (was: netbox won't allow me to upload photos of the rack).

Netbox has been deployed with the change that should enable this. We're testing.

Jul 30 2019, 10:52 PM · Patch-For-Review, netbox, Operations
crusnov closed T226331: Upgrade Netbox to 2.6.1 as Resolved.
Jul 30 2019, 10:52 PM · Patch-For-Review, netbox
crusnov added a comment to T226331: Upgrade Netbox to 2.6.1.

Obviously some finagling happened, but in the end the upgrade is good.

Jul 30 2019, 10:51 PM · Patch-For-Review, netbox
crusnov moved T228670: Import management interfaces into Netbox from DNS from Backlog to In Progress on the SRE-tools board.
Jul 30 2019, 10:51 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov moved T222837: Discussion about synchronizing Ganeti VM network interfaces to Netbox from In Progress to Pending release/deployment on the SRE-tools board.
Jul 30 2019, 10:50 PM · SRE-tools
crusnov moved T221507: Netbox report to validate network equipment data from In Progress to Pending release/deployment on the SRE-tools board.
Jul 30 2019, 10:50 PM · netbox, User-crusnov, SRE-tools, Operations, netops
crusnov moved T228670: Import management interfaces into Netbox from DNS from Backlog to In Progress on the User-crusnov board.
Jul 30 2019, 10:50 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools
crusnov moved T218956: Should we deploy sshguard on external IP addresses? from Backlog to Complete on the User-crusnov board.
Jul 30 2019, 10:50 PM · User-crusnov
crusnov moved T224946: Netbox Alert Cleanups from In Progress to Pending on the User-crusnov board.
Jul 30 2019, 10:50 PM · observability, User-crusnov, netbox, SRE-tools
crusnov moved T221507: Netbox report to validate network equipment data from In Progress to Complete on the User-crusnov board.
Jul 30 2019, 10:50 PM · netbox, User-crusnov, SRE-tools, Operations, netops
crusnov committed rOSNB4b8dc43ebe70: Fix imports for settings mod. (authored by crusnov).
Fix imports for settings mod.
Jul 30 2019, 10:42 PM

Jul 29 2019

crusnov committed rLPRI9d0685255bf8: netbox: Add dummy redis passwords (authored by crusnov).
netbox: Add dummy redis passwords
Jul 29 2019, 10:50 PM