BBlack (Brandon Black)
WMF Operations Engineer

User Details

User Since
Nov 4 2014, 4:29 PM (124 w, 1 d)
Availability
Available
IRC Nick
bblack
LDAP User
BBlack
MediaWiki User
BBlack (WMF)

Recent Activity

Yesterday

BBlack created T161148: AuthDNS CM/CI refactor.
Wed, Mar 22, 7:27 PM · DNS, Operations, Traffic
BBlack renamed T161145: Fix the general problem of randomly-bad puppet agent cron timings within redundant clusters from "Fix the general problem of randomly-bad puppet agent cron timings within redundancy clusters" to "Fix the general problem of randomly-bad puppet agent cron timings within redundant clusters".
Wed, Mar 22, 7:22 PM · Operations
BBlack created T161145: Fix the general problem of randomly-bad puppet agent cron timings within redundant clusters.
Wed, Mar 22, 7:00 PM · Operations
BBlack added a comment to T137962: [Spec] Tracking and blocking specific IP/user-agent combinations.

Re: 304-vs-200, I was able to get some 304s, but only when I dropped the INM (If-None-Match) and relied on IMS (If-Modified-Since). It seems like the ETags might be inconsistent between serial requests to ORES for the same resource? (The LM timestamps are a bit inconsistent too.)

Wed, Mar 22, 1:40 AM · Patch-For-Review, ORES, Revision-Scoring-As-A-Service-Backlog
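
The 304-vs-200 observation above is consistent with the precedence rule in RFC 7232: when a request carries If-None-Match, the server evaluates only that header and ignores If-Modified-Since. The following is a minimal sketch of that rule, not ORES or cache code; the function names and the example ETag values are hypothetical, and the comma-splitting of the header is a simplification.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _weak_equal(a, b):
    """Weak ETag comparison: ignore a leading W/ on either side."""
    strip = lambda tag: tag[2:] if tag.startswith("W/") else tag
    return strip(a) == strip(b)


def conditional_status(request_headers, current_etag, last_modified):
    """Return 304 or 200 for a GET, following RFC 7232 precedence.

    request_headers: dict of header name -> value (exact-case keys assumed).
    current_etag:    ETag the server would send now, including quotes.
    last_modified:   timezone-aware datetime of the resource.
    """
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # If-None-Match wins; If-Modified-Since is not consulted at all.
        candidates = [tag.strip() for tag in inm.split(",")]
        if "*" in candidates or any(_weak_equal(t, current_etag) for t in candidates):
            return 304
        return 200

    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            return 200  # unparseable date: treat the request as unconditional
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if last_modified <= since:
            return 304
    return 200


# Hypothetical values: the ETag changed between requests, the timestamp did not.
lm = datetime(2017, 3, 22, 1, 0, tzinfo=timezone.utc)
ims_value = "Wed, 22 Mar 2017 01:40:00 GMT"
both = {"If-None-Match": '"etag-earlier"', "If-Modified-Since": ims_value}
ims_only = {"If-Modified-Since": ims_value}
print(conditional_status(both, '"etag-now"', lm))
print(conditional_status(ims_only, '"etag-now"', lm))
```

If the ETag differs on every response, the first request can never match, while the second can still be answered as not modified.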

Tue, Mar 21

BBlack merged T82996: ulsfo: add a DNS recursor into T96852: Define 3-host infra cluster for traffic pops.
Tue, Mar 21, 10:58 PM · Operations, Traffic
BBlack merged task T82996: ulsfo: add a DNS recursor into T96852: Define 3-host infra cluster for traffic pops.
Tue, Mar 21, 10:58 PM · Operations
BBlack renamed T96852: Define 3-host infra cluster for traffic pops from "Deploy infra ganeti cluster @ ulsfo" to "Define 3-host infra cluster for traffic pops".
Tue, Mar 21, 10:57 PM · Operations, Traffic
BBlack added a comment to T147569: Evaluate/Deploy TCP BBR when available (kernel 4.9+).

This is it. We're currently still testing/deploying the kernel that allows it to be enabled. After that we can do some testing/evaluation on BBR itself and report here. Our current thinking is we expect we'll enable it as the default congestion control for our public edge nodes, but probably not elsewhere in our network (app/db/etc in core DCs), as it's unclear whether it might lose to cubic in some corner cases on fast/local networks.

Tue, Mar 21, 4:46 PM · Performance-Team, Operations, Traffic

Fri, Mar 17

BBlack edited the description of T156256: Select or Acquire Address Space for Asia Cache DC.
Fri, Mar 17, 4:39 PM · Operations, Traffic
BBlack edited the description of T156256: Select or Acquire Address Space for Asia Cache DC.
Fri, Mar 17, 4:36 PM · Operations, Traffic

Thu, Mar 16

BBlack closed T107430: Decom bits.wikimedia.org hostname as "Resolved".

This was resolved on the server side back in early Dec when the MW config patch finally landed. There might be trailing traffic to this ticket in the form of reports/fixups of long-broken things, but there's really nothing left to actually do to finish decom.

Thu, Mar 16, 12:41 PM · Patch-For-Review, MW-1.28-release (WMF-deploy-2016-08-09_(1.28.0-wmf.14)), Operations, Traffic

Mon, Mar 13

BBlack added a comment to T160109: What happened 2017-03-09 04:00 - 06:00 UTC.

Presently, we don't have any sort of "live" feed on security incidents other than what you've already mentioned. For some classes of incident, such a thing would be a liability.

Mon, Mar 13, 1:39 PM · Operations, Traffic, Performance-Team
BBlack added a comment to T160109: What happened 2017-03-09 04:00 - 06:00 UTC.

Yes, the anomalies you're describing here were the result of a DDoS attack against us, and our mitigations to reduce user impact. The incident doc is private...

Mon, Mar 13, 1:33 PM · Operations, Traffic, Performance-Team

Wed, Mar 8

BBlack added a comment to T101525: Set up LVS for current AuthDNS.

Some further thoughts that haven't been captured here:

Wed, Mar 8, 12:19 PM · netops, Operations, Traffic

Tue, Mar 7

BBlack created T159870: baham (ns1) CPU-related issues.
Tue, Mar 7, 7:40 PM · Traffic, Operations, ops-codfw

Mon, Mar 6

BBlack added a comment to T152622: Wikipedia.cz and other domains owned by WMCZ have invalid certificate.

Probably this should be merged into T133548, unless it's altered to be about implementing some other solution outside of that scope.

Mon, Mar 6, 7:18 PM · Operations, Traffic, Domains, User-Urbanecm

Fri, Feb 24

BBlack added a comment to T156029: Select location for Asia Cache DC.

The goal is to have Wikipedia's servers run on renewable energy. It's as simple as that.

Fri, Feb 24, 4:13 PM · Operations, Traffic
BBlack added a comment to T156029: Select location for Asia Cache DC.

I think the wikitech discussion seems like it was, in fact, a good discussion of the issue. So if your goal is discussion, I don't see the issue here. The metawiki page does contain a lot of information that comes from the Operations team.

Fri, Feb 24, 3:27 PM · Operations, Traffic
BBlack added a comment to T156029: Select location for Asia Cache DC.

I really don't mean to be overly facile here, but if you're interested in having a discussion, we can have that at any usual public discussion venue. The wikitech mailing list might be a good start. Our IRC channels (e.g. Freenode #wikimedia-operations) work as well for informal, async discussion. Setting up a BOF or other sort of session on the topic at Wikimania might make sense as well. It would be helpful to understand better what sort of discussion you'd like to have. Is this a policy discussion about how the organization or Operations (or Finance) specifically weights green issues in various decisions in general? Or do you want to bring some information to the table that we might not be aware of about the evolving metrics of vendor green-ness and how to evaluate them? or.. ?

Fri, Feb 24, 2:24 PM · Operations, Traffic
BBlack added a comment to T156029: Select location for Asia Cache DC.

I think you can discuss that anywhere you like (within reason!).

Fri, Feb 24, 1:33 PM · Operations, Traffic
BBlack added a comment to T156029: Select location for Asia Cache DC.

This probably isn't the ideal location, but I can speak to the issue here since it's obvious that some will come looking here for that answer. The TL;DR is that environmental considerations were not a major factor in this decision. The basics of this decision go something like this:

Fri, Feb 24, 1:06 PM · Operations, Traffic

Wed, Feb 22

BBlack closed T156029: Select location for Asia Cache DC as "Resolved".

Singapore approved by Legal and selected, which is pretty much our ideal candidate on a range of issues.

Wed, Feb 22, 5:04 PM · Operations, Traffic
BBlack closed T156029: Select location for Asia Cache DC, a subtask of T156028: Name Asia Cache DC site, as "Resolved".
Wed, Feb 22, 5:04 PM · Operations, Traffic

Feb 13 2017

BBlack added a comment to T137962: [Spec] Tracking and blocking specific IP/user-agent combinations.

  1. One of the best defenses you can have is to be sure that unauthenticated URLs are reasonably-cacheable (e.g. images and the root login page itself, etc). If the service emits good cacheability headers, the front edge can absorb a lot of load on cache hits. At a first cursory glance, even https://ores.wikimedia.org/ doesn't look like it's cacheable.

Given that an ORES response returns data for an arbitrary collection of revisions, that would probably be a bad idea.

Feb 13 2017, 11:11 PM · Patch-For-Review, ORES, Revision-Scoring-As-A-Service-Backlog
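
To make "good cacheability headers" in the quoted point concrete, here is a simplified heuristic for whether a shared cache at the edge could serve a response from cache without contacting the origin. It is only a sketch under stated assumptions: it looks at Cache-Control alone, treats no-cache as not cacheable for simplicity, and does not implement the full HTTP caching rules. The function name is hypothetical.

```python
def shared_cache_ttl(headers):
    """Return the freshness lifetime in seconds for a shared cache,
    or None if this simplified check considers the response uncacheable."""
    directives = {}
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"')

    # Simplification: any of these means the edge cannot absorb hits on its own.
    if any(d in directives for d in ("no-store", "private", "no-cache")):
        return None

    # s-maxage applies to shared caches and overrides max-age when present.
    for key in ("s-maxage", "max-age"):
        if key in directives:
            try:
                ttl = int(directives[key])
            except ValueError:
                return None
            return ttl if ttl > 0 else None
    return None


print(shared_cache_ttl({"Cache-Control": "public, s-maxage=300, max-age=60"}))
print(shared_cache_ttl({"Cache-Control": "private, max-age=60"}))
```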

Feb 10 2017

BBlack added a comment to T82849: lvs servers report 'Memory allocation problem' on bootup.

Yeah, ipvsadm says "memory allocation problem" if you give it any kind of not-useful arguments (like deleting a non-existent service, etc.).

Feb 10 2017, 3:46 PM · Traffic, Pybal, Operations

Feb 6 2017

BBlack added a comment to T156919: Port RCStream clients to EventStreams.

I know a chunk of our rcstream traffic that I've observed in the past came from Google. I'm not sure who/what in Google drives that...

Feb 6 2017, 5:52 PM · Analytics-Kanban, Wikimedia-Stream

Feb 3 2017

BBlack moved T156320: $wgServer with initial https:// does not force HTTPS from Triage to BadHerald on the Traffic board.
Feb 3 2017, 1:37 PM · Operations, HTTPS, Traffic, Security
BBlack moved T138027: Add global last-access cookie for top domain (*.wikipedia.org) from Triage to Caching on the Traffic board.
Feb 3 2017, 1:19 PM · Analytics-Kanban, Patch-For-Review, Operations, Traffic
BBlack added a project to T138027: Add global last-access cookie for top domain (*.wikipedia.org): Traffic.
Feb 3 2017, 1:17 PM · Analytics-Kanban, Patch-For-Review, Operations, Traffic
BBlack added a comment to T154801: Investigate varnishd child crashes when multiple nodes get depooled/pooled concurrently.

Recording this while I remember it:

Feb 3 2017, 1:13 PM · Wikimedia-Incident, Traffic, Operations

Jan 30 2017

BBlack added a comment to T156100: DNS: dynamically generate entries for service discovery.

We should probably divorce the RO/RW distinction from the core design here. Not all services will have an RW/RO distinction (I would expect most not to), and those will be things we try to eliminate with better (active/active) design over time. If a specific service needs a split into "active/passive RW + active/active RO", we can solve that by calling it two separate services at this level: foo-rw and foo-ro, with different active/passive rules and distinct failover.

Jan 30 2017, 5:23 PM · Patch-For-Review, Wikimedia-Multiple-active-datacenters, Services (watching), Performance-Team, discovery-system, User-Joe, User-mobrovac, MediaWiki-Configuration, Operations, Wikimedia-Developer-Summit (2017)
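
The foo-rw / foo-ro idea in the comment above can be sketched as a small data model in which the discovery layer knows only a per-service failover mode, with no notion of RO or RW. Everything here is hypothetical illustration rather than the actual design: the class, the function, and the datacenter names dc-a and dc-b are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryService:
    name: str
    active_active: bool   # False means active/passive
    datacenters: tuple    # ordered by preference; first pooled one is primary


def serving_datacenters(service, pooled):
    """Return the datacenters that should receive traffic for a service.

    pooled: set of datacenter names currently considered healthy/pooled.
    """
    candidates = [dc for dc in service.datacenters if dc in pooled]
    if service.active_active:
        return candidates      # every pooled site serves
    return candidates[:1]      # only the most-preferred pooled site serves


# One logical service "foo" expressed as two independent discovery entries.
foo_rw = DiscoveryService("foo-rw", active_active=False, datacenters=("dc-a", "dc-b"))
foo_ro = DiscoveryService("foo-ro", active_active=True, datacenters=("dc-a", "dc-b"))

print(serving_datacenters(foo_rw, {"dc-a", "dc-b"}))
print(serving_datacenters(foo_ro, {"dc-a", "dc-b"}))
print(serving_datacenters(foo_rw, {"dc-b"}))
```

Because each entry carries its own rule, failing over foo-rw does not change how foo-ro is served.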

Jan 27 2017

RobH awarded T133717: Letsencrypt all the prod things we can - planning a Love token.
Jan 27 2017, 5:55 PM · Operations, Traffic
BBlack edited the description of T156256: Select or Acquire Address Space for Asia Cache DC.
Jan 27 2017, 3:37 PM · Operations, Traffic
BBlack updated subscribers of T156256: Select or Acquire Address Space for Asia Cache DC.
Jan 27 2017, 3:34 PM · Operations, Traffic

Jan 26 2017

BBlack added a comment to T143925: Productionize and deploy Public EventStreams.

cache_misc for this are all implemented and live now. The config declaration is now:

Jan 26 2017, 4:03 PM · Operations, Traffic, Patch-For-Review, Analytics-Kanban, EventBus, Wikimedia-Stream
BBlack added a comment to T155524: convert stream.wikimedia.org from GS to LE certificate.

stream.wikimedia.org is part of cache_misc now, so if we have an expiring certificate here, I don't think we need to replace it.

Jan 26 2017, 3:59 PM · Patch-For-Review, Operations, Traffic
BBlack moved T156030: Select site vendor for Asia Cache Datacenter from Triage to Asia Cache DC on the Traffic board.
Jan 26 2017, 3:57 PM · Traffic, Operations
BBlack added a comment to T137161: Fix nits in Fundraising HTTPS/HSTS configs in wikimedia.org domain.

What about benefactorevents / eventdonations?

Jan 26 2017, 3:10 PM · Traffic, fundraising-tech-ops, Operations
BBlack added a comment to T128559: store.wikimedia.org HTTPS issues.

Any updates here? What we're asking for here is a modern HTTPS-only configuration. I'd think an e-commerce vendor would be all about that...

Jan 26 2017, 3:09 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack edited the description of T105905: Switch blog to HTTPS-only.
Jan 26 2017, 3:03 PM · Operations, Traffic, HTTPS, Wikimedia-Blog
BBlack closed T105905: Switch blog to HTTPS-only as "Resolved".

Confirmed correct current operation:

  1. All HTTP access seems to redirect to HTTPS
  2. All HTTPS requests send response header: strict-transport-security: max-age=31536000; includeSubDomains; preload
Jan 26 2017, 3:03 PM · Operations, Traffic, HTTPS, Wikimedia-Blog
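
The second check in the list above can be automated by parsing the Strict-Transport-Security value. A minimal sketch follows; the function name and the returned dictionary shape are hypothetical, and the header value is the one quoted in the confirmation.

```python
def parse_hsts(value):
    """Parse a Strict-Transport-Security header value into its directives."""
    directives = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directive names are case-insensitive; max-age may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"')
    max_age = directives.get("max-age")
    return {
        "max_age": int(max_age) if max_age is not None else None,
        "include_subdomains": "includesubdomains" in directives,
        "preload": "preload" in directives,
    }


policy = parse_hsts("max-age=31536000; includeSubDomains; preload")
print(policy)
print(policy["max_age"] // 86400, "days")
```

A max-age of 31536000 seconds corresponds to 365 days.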
BBlack closed T105905: Switch blog to HTTPS-only, a subtask of T104728: make blog links from wmfwiki front page use HTTPS links, as "Resolved".
Jan 26 2017, 3:03 PM · Operations, Traffic, HTTPS, Wikimedia-Blog
BBlack closed T105905: Switch blog to HTTPS-only, a subtask of T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination, as "Resolved".
Jan 26 2017, 3:03 PM · Patch-For-Review, HTTPS, Operations, Traffic
Nemo_bis awarded T156026: Enable Service in Asia Cache DC a Mountain of Wealth token.
Jan 26 2017, 3:02 PM · Operations, Traffic
BBlack added a project to T156030: Select site vendor for Asia Cache Datacenter: Traffic.
Jan 26 2017, 3:01 PM · Traffic, Operations
BBlack moved T156256: Select or Acquire Address Space for Asia Cache DC from Triage to Asia Cache DC on the Traffic board.
Jan 26 2017, 3:00 PM · Operations, Traffic

Jan 25 2017

BBlack added a parent task for T156256: Select or Acquire Address Space for Asia Cache DC: T156027: Configuration for Asia Cache DC hosts.
Jan 25 2017, 2:58 PM · Operations, Traffic
BBlack added a subtask for T156027: Configuration for Asia Cache DC hosts: T156256: Select or Acquire Address Space for Asia Cache DC.
Jan 25 2017, 2:58 PM · Traffic, Operations
BBlack added a parent task for T156256: Select or Acquire Address Space for Asia Cache DC: T156031: Turn up network links for Asia Cache DC.
Jan 25 2017, 2:58 PM · Operations, Traffic
BBlack added a subtask for T156031: Turn up network links for Asia Cache DC: T156256: Select or Acquire Address Space for Asia Cache DC.
Jan 25 2017, 2:58 PM · Traffic, Operations
BBlack created T156256: Select or Acquire Address Space for Asia Cache DC.
Jan 25 2017, 2:57 PM · Operations, Traffic

Jan 23 2017

BBlack moved T156029: Select location for Asia Cache DC from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Operations, Traffic
BBlack moved T156032: Hardware installation for Asia Cache DC from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Operations, Traffic
BBlack moved T156027: Configuration for Asia Cache DC hosts from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Traffic, Operations
BBlack moved T156028: Name Asia Cache DC site from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Operations, Traffic
BBlack moved T156033: Hardware purchasing for Asia Cache DC from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Operations, Traffic
BBlack moved T156031: Turn up network links for Asia Cache DC from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Traffic, Operations
BBlack moved T156026: Enable Service in Asia Cache DC from Triage to Asia Cache DC on the Traffic board.
Jan 23 2017, 5:38 PM · Operations, Traffic
BBlack added a subtask for T156033: Hardware purchasing for Asia Cache DC: T156030: Select site vendor for Asia Cache Datacenter.
Jan 23 2017, 5:36 PM · Operations, Traffic
BBlack added a parent task for T156030: Select site vendor for Asia Cache Datacenter: T156033: Hardware purchasing for Asia Cache DC.
Jan 23 2017, 5:36 PM · Traffic, Operations
BBlack added a subtask for T156026: Enable Service in Asia Cache DC: T156031: Turn up network links for Asia Cache DC.
Jan 23 2017, 5:35 PM · Operations, Traffic
BBlack added a parent task for T156031: Turn up network links for Asia Cache DC: T156026: Enable Service in Asia Cache DC.
Jan 23 2017, 5:35 PM · Traffic, Operations
BBlack added a subtask for T156032: Hardware installation for Asia Cache DC: T156033: Hardware purchasing for Asia Cache DC.
Jan 23 2017, 5:33 PM · Operations, Traffic
BBlack added a parent task for T156033: Hardware purchasing for Asia Cache DC: T156032: Hardware installation for Asia Cache DC.
Jan 23 2017, 5:33 PM · Operations, Traffic
BBlack added a parent task for T156032: Hardware installation for Asia Cache DC: T156026: Enable Service in Asia Cache DC.
Jan 23 2017, 5:33 PM · Operations, Traffic
BBlack added a subtask for T156026: Enable Service in Asia Cache DC: T156032: Hardware installation for Asia Cache DC.
Jan 23 2017, 5:33 PM · Operations, Traffic
BBlack added a parent task for T156030: Select site vendor for Asia Cache Datacenter: T156031: Turn up network links for Asia Cache DC.
Jan 23 2017, 5:31 PM · Traffic, Operations
BBlack added subtasks for T156031: Turn up network links for Asia Cache DC: T156030: Select site vendor for Asia Cache Datacenter, T156029: Select location for Asia Cache DC.
Jan 23 2017, 5:31 PM · Traffic, Operations
BBlack added a parent task for T156029: Select location for Asia Cache DC: T156031: Turn up network links for Asia Cache DC.
Jan 23 2017, 5:31 PM · Operations, Traffic
BBlack added a subtask for T156030: Select site vendor for Asia Cache Datacenter: T156029: Select location for Asia Cache DC.
Jan 23 2017, 5:30 PM · Traffic, Operations
BBlack added a parent task for T156029: Select location for Asia Cache DC: T156030: Select site vendor for Asia Cache Datacenter.
Jan 23 2017, 5:30 PM · Operations, Traffic
BBlack added a subtask for T156028: Name Asia Cache DC site: T156030: Select site vendor for Asia Cache Datacenter.
Jan 23 2017, 5:28 PM · Operations, Traffic
BBlack added a parent task for T156030: Select site vendor for Asia Cache Datacenter: T156028: Name Asia Cache DC site.
Jan 23 2017, 5:28 PM · Traffic, Operations
BBlack added a subtask for T156028: Name Asia Cache DC site: T156029: Select location for Asia Cache DC.
Jan 23 2017, 5:27 PM · Operations, Traffic
BBlack added a parent task for T156029: Select location for Asia Cache DC: T156028: Name Asia Cache DC site.
Jan 23 2017, 5:27 PM · Operations, Traffic
BBlack added a subtask for T156027: Configuration for Asia Cache DC hosts: T156028: Name Asia Cache DC site.
Jan 23 2017, 5:27 PM · Traffic, Operations
BBlack added a parent task for T156028: Name Asia Cache DC site: T156027: Configuration for Asia Cache DC hosts.
Jan 23 2017, 5:27 PM · Operations, Traffic
BBlack added a subtask for T156026: Enable Service in Asia Cache DC: T156027: Configuration for Asia Cache DC hosts.
Jan 23 2017, 5:27 PM · Operations, Traffic
BBlack added a parent task for T156027: Configuration for Asia Cache DC hosts: T156026: Enable Service in Asia Cache DC.
Jan 23 2017, 5:27 PM · Traffic, Operations
BBlack created T156033: Hardware purchasing for Asia Cache DC.
Jan 23 2017, 5:26 PM · Operations, Traffic
BBlack created T156032: Hardware installation for Asia Cache DC.
Jan 23 2017, 5:26 PM · Operations, Traffic
BBlack created T156031: Turn up network links for Asia Cache DC.
Jan 23 2017, 5:26 PM · Traffic, Operations
BBlack created T156030: Select site vendor for Asia Cache Datacenter.
Jan 23 2017, 5:26 PM · Traffic, Operations
BBlack created T156029: Select location for Asia Cache DC.
Jan 23 2017, 5:26 PM · Operations, Traffic
BBlack created T156028: Name Asia Cache DC site.
Jan 23 2017, 5:26 PM · Operations, Traffic
BBlack created T156027: Configuration for Asia Cache DC hosts.
Jan 23 2017, 5:26 PM · Traffic, Operations
BBlack created T156026: Enable Service in Asia Cache DC.
Jan 23 2017, 5:26 PM · Operations, Traffic

Jan 11 2017

BBlack created P4735 dns + etcd + confd + .....
Jan 11 2017, 6:35 PM

Jan 5 2017

BBlack added a comment to T148780: mobile-safari has very few internally-referred pageviews.

It's not really my feature, I just happened to write the very short config patch to turn it on, because nobody else had at the time. For the history on this, see also:
T149858, especially from T87276#2055761 onwards, where some issues with Safari's support of both spellings were raised. Also the original discussion in https://meta.wikimedia.org/wiki/Research_talk:Wikimedia_referrer_policy , and the original code change for it here: https://gerrit.wikimedia.org/r/#/c/186104/2 . I'm not opposed to any particular path, but I'd want someone to fully research all the implications of any change to the previous misspelled variant, and possibly look at using the header instead of the meta tag.

Jan 5 2017, 7:45 PM · Analytics, Operations, Traffic, Reading-Web-Backlog
BBlack added a comment to T148131: Deploy redundant unified certs.

These are now deployed (digicert in esams, globalsign elsewhere). Pending closing this until we document switching off either of the certs...

Jan 5 2017, 5:53 PM · Wikimedia-Incident, Operations, Traffic
BBlack closed T150561: Extra RTT on TLS handshakes as "Resolved".
Jan 5 2017, 5:52 PM · Traffic, Operations

Dec 20 2016

BBlack added a comment to T138027: Add global last-access cookie for top domain (*.wikipedia.org).

Yes, we can do Q3. The actual work on our end is fairly minimal, just need to pencil it in and remember to get it done!

Dec 20 2016, 8:35 PM · Analytics-Kanban, Patch-For-Review, Operations, Traffic

Dec 19 2016

BBlack added a comment to T148131: Deploy redundant unified certs.

In case such an incident happens before the changes in January and I'm not around, the procedure to switch GlobalSign to Digicert globally would be:

Dec 19 2016, 7:42 PM · Wikimedia-Incident, Operations, Traffic
BBlack added a comment to T148131: Deploy redundant unified certs.

Status update - Digicert unified certs (RSA+ECDSA) are now deployed and stapled alongside the GlobalSign ones on all cache terminators. They're not being used for user traffic, but they're ready as a warm standby in the case that we need to deal with another issue like the past GlobalSign OCSP/revocation incident.

Dec 19 2016, 7:25 PM · Wikimedia-Incident, Operations, Traffic
BBlack moved T143925: Productionize and deploy Public EventStreams from Triage to Caching on the Traffic board.
Dec 19 2016, 4:47 PM · Operations, Traffic, Patch-For-Review, Analytics-Kanban, EventBus, Wikimedia-Stream
BBlack added a project to T143925: Productionize and deploy Public EventStreams: Traffic.
Dec 19 2016, 4:47 PM · Operations, Traffic, Patch-For-Review, Analytics-Kanban, EventBus, Wikimedia-Stream
BBlack edited the description of T104681: HTTPS Plans (tracking / high-level info).
Dec 19 2016, 4:33 PM · Tracking, Operations, Traffic, HTTPS
BBlack added a comment to T128559: store.wikimedia.org HTTPS issues.

@Ppena (or anyone) - who's responsible in the WMF for store.wikimedia.org? This is a pretty basic request and it's been outstanding for months. It's one of the few non-conforming exceptions to our HTTPS policy remaining!

Dec 19 2016, 4:29 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack added a comment to T105905: Switch blog to HTTPS-only.

@EdErhart-WMF - Any update on setting the appropriate Strict-Transport-Security header on this service?

Dec 19 2016, 4:27 PM · Traffic, Operations, HTTPS, Wikimedia-Blog
BBlack closed T140128: HTTPS-only for stream.wikimedia.org as "Declined".

We're going to leave this as-is and assume eventstream replacement (which will be HTTPS-only from the get-go) will handle this for us.

Dec 19 2016, 4:18 PM · Patch-For-Review, HTTPS, Operations, Wikimedia-Stream, Performance-Team, Traffic