BBlack (Brandon Black)
WMF Operations Engineer

User Details

User Since
Nov 4 2014, 4:29 PM (146 w, 17 h)
Availability
Available
IRC Nick
bblack
LDAP User
BBlack
MediaWiki User
BBlack (WMF)

Recent Activity

Mon, Aug 21

BBlack added a comment to T173422: Investigate the increase in the number of requests to Swift after the Page Previews deploy.

No qualms on the cache end of things!

Mon, Aug 21, 5:40 PM · Readers-Web-Backlog (Tracking), Traffic, Page-Previews, Operations
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Heh yeah I guess you're right. Still, I added it to the current page, and we seem to have picked up some new translations over the weekend. I can pull the link back out of there on the next update if that makes more sense.

Mon, Aug 21, 4:40 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack moved T172198: setup/install cp402[5-8].ulsfo.wmnet from Triage to Caching on the Traffic board.
Mon, Aug 21, 3:59 PM · Patch-For-Review, Traffic, Operations
BBlack added a comment to T172418: Get translations for "IE8 on XP won't work".

I see we have a few new translations up today, I'll incorporate them shortly! :)

Mon, Aug 21, 3:52 PM · User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T173422: Investigate the increase in the number of requests to Swift after the Page Previews deploy.

You can see a view of cache_upload's overall 2xx (and everything else) here: https://grafana.wikimedia.org/dashboard/db/varnish-aggregate-client-status-codes?panelId=2&fullscreen&orgId=1&var-site=All&var-cache_type=upload&var-status_type=2&from=now-30d&to=now .

Mon, Aug 21, 2:49 PM · Readers-Web-Backlog (Tracking), Traffic, Page-Previews, Operations

Sat, Aug 19

BBlack added a comment to T147199: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support).

If a corporation is insane enough to still run XP and force their users to run IE, we can only hope that yet another site they can't use will be the final straw forcing them to do something.

Yes, our mission is indeed to try to force corporate IT upgrades, and not to make knowledge freely available.

Nevertheless, my point stands. The claim I quoted is clearly false.

Sat, Aug 19, 5:09 PM · User-notice, Patch-For-Review, Operations, Traffic
BBlack added a comment to T147199: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support).

Both IE7 and IE8 for XP are what's being cut off in this transition, with IE8 being the newest IE that's even available for XP, AFAIK. However, we're not doing this with the express intent of deprecating older browser tech; it's just the natural fallout of raising our minimum security level for network connections. There are no further firm plans yet on Operations' end of things regarding deprecating specific browser versions, but we will most likely run through similar cipher/protocol deprecations in the future which may take out older browsers along the way like this one did.

Sat, Aug 19, 4:56 PM · User-notice, Patch-For-Review, Operations, Traffic

Thu, Aug 17

BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

After a couple of other minor nits, going to push the above as it stands. We can iterate further as necessary, at least it's an improvement on the original!

Thu, Aug 17, 9:32 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

The patch above contains the same changes as a real changeset (they're just hard to review that way; it's simpler to check manually at https://pinkunicorn.wikimedia.org/test-sec-warning ).

Thu, Aug 17, 9:00 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Thanks! Updated for all of the above as best I can. I'm not 100% sure on the language-name text prefix for Arabic and Chinese, but I took a good stab based on http://mediaglyphs.org/mg/?p=langnames . I guess someone who knows better can recommend a further fixup?

Thu, Aug 17, 8:52 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Update: noticed I had en-US firefox links in all of the translations. Updated them all now.

Thu, Aug 17, 8:35 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Testing updated HTML with some translations and a translate link (and other minor cleanups) at https://pinkunicorn.wikimedia.org/test-sec-warning . Will push something like this to the real one at https://en.wikipedia.org/test-sec-warning before upping percentage. Thoughts? Further tweaks? Mistakes? :)

Thu, Aug 17, 8:24 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Hopefully in the former case, they'll complain to their IT department and they'll fix it, and hopefully in the latter they'll blindly trust our Firefox links and find their way out of this mess from there :)

Thu, Aug 17, 8:13 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

It's a very valid question :)

Thu, Aug 17, 6:44 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Ok thanks!

Thu, Aug 17, 6:25 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Update: Today is the start date for going to 5%. Before we pull that trigger sometime later (perhaps much later) today, I'm working on a few other things:

Thu, Aug 17, 4:22 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack moved T172418: Get translations for "IE8 on XP won't work" from Triage to TLS on the Traffic board.
Thu, Aug 17, 2:35 PM · User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack closed T173506: cp3036 crashed as Resolved.
14:32 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp3036.*
Thu, Aug 17, 2:33 PM · ops-esams, Operations, Traffic
BBlack created T173506: cp3036 crashed.
Thu, Aug 17, 2:30 PM · ops-esams, Operations, Traffic

Mon, Aug 14

BBlack added a comment to T128374: Sort out analytics service dependency issues for cp* cache hosts.

I think there's still some work to do here, if nothing else to audit the situation as it stands. There are basically two things to sort out for all of the varnish-logging bits and pieces:

  1. Have we killed the hard dependency on Varnish being online? (Can we start the logger first and have it connect/reconnect as Varnish goes up and down?)
  2. Have we re-ordered the systemd level dependencies to ensure we're not losing log events? (Can we make Varnish services dependent on the loggers being ready to receive events?)
Mon, Aug 14, 6:17 PM · User-Elukey, Varnish, Traffic, Analytics, Operations
BBlack changed the status of T170518: Non zero rated LVS IPs from Open to Stalled.

Re-evaluating alternatives here, hold on actual implementation for now.

Mon, Aug 14, 6:13 PM · Patch-For-Review, Operations, Traffic

Tue, Aug 8

RandomDSdevel awarded T147199: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support) a Doubloon token.
Tue, Aug 8, 12:46 AM · User-notice, Patch-For-Review, Operations, Traffic

Mon, Aug 7

BBlack added a comment to T169175: What is a reasonable per-IP ratelimit for maps.

It's just per-IP. So yes that sounds fine: if you're peaking at 80/s total, then let's put an upper sanity bound at 100/s misses for now. Any preferences for a deploy time to be sure whoever needs to be around to check for any fallout is around?
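
As a rough illustration of what a per-IP bound of 100/s means, here is a minimal token-bucket sketch. It is not the mechanism actually deployed on the caches; the class name, the burst size, and the explicit `now` argument are assumptions made for the example, and in practice the limit would only be applied to cache misses.

```python
from collections import defaultdict


class PerIPRateLimiter:
    """Hypothetical per-IP token bucket; names and defaults are illustrative."""

    def __init__(self, rate: float = 100.0, burst: float = 100.0):
        self.rate = rate      # tokens refilled per second, per IP
        self.burst = burst    # maximum tokens an IP can accumulate
        self._tokens = defaultdict(lambda: burst)  # each new IP starts full
        self._last = {}       # time of the previous request, per IP

    def allow(self, ip: str, now: float) -> bool:
        """Return True if a request from `ip` at time `now` (seconds) may pass."""
        last = self._last.get(ip, now)
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, self._tokens[ip] + (now - last) * self.rate)
        self._last[ip] = now
        if tokens >= 1.0:
            self._tokens[ip] = tokens - 1.0
            return True
        self._tokens[ip] = tokens
        return False
```

Each client IP gets its own bucket, so one heavy client hitting the bound does not affect requests from other addresses.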

Mon, Aug 7, 4:33 PM · Discovery-Analysis, Operations, Traffic, Maps-Sprint, Maps, Discovery

Thu, Aug 3

BBlack added a comment to T147199: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support).

Even while FF 52 is still supported by Mozilla, it's unlikely that Mozilla's security efforts can actually prevent all the possible exploits that breach the underlying WinXP.

Thu, Aug 3, 11:20 PM · User-notice, Patch-For-Review, Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Works for me. If you can paste back the text form of whatever you want here, I can get the page updated.

Thu, Aug 3, 5:29 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

The current message text (which needs massaging and updating anyways) is visible explicitly at: https://en.wikipedia.org/test-sec-warning

Thu, Aug 3, 5:20 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

And for those wanting to follow the changes in 3DES percentage of requests as we go: https://grafana.wikimedia.org/dashboard/db/tls-ciphers?panelId=11&fullscreen&orgId=1

Thu, Aug 3, 2:09 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T147199: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support).

Cross-ticket updates: There's a separate sub-ticket for the Communications side of this change at T163251, and a timeline has been laid out there in T163251#3478043 . The TL;DR of the timeline is we'll ramp from 5% to ~29% blocked over the period of Aug 17 -> Oct 12, then 100% blocked on Oct 17, then protocol-disabled on Nov 17.

Thu, Aug 3, 2:07 PM · User-notice, Patch-For-Review, Operations, Traffic
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

How are we doing on the translation side for this, by the way?

Thu, Aug 3, 1:47 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic

Tue, Aug 1

BBlack added a comment to T164327: replace ulsfo aging servers.

Excellent news! I'll try to squeeze in replacing one of the clusters ASAP, which will decom another 6x of the old cp to let us move further.

Tue, Aug 1, 5:05 PM · Patch-For-Review, Traffic, Operations, ops-ulsfo

Mon, Jul 31

BBlack added a comment to T154227: URLs with title query string parameter and additional query string parameters do not redirect to mobile site.

It seems reasonable to relax the regex in question a bit (to allow additional parameters).

Mon, Jul 31, 4:57 PM · Unplanned-Sprint-Work, Readers-Web-Kanban-Board, Patch-For-Review, Traffic, Operations, Readers-Web-Backlog (Tracking), Puppet, Need-volunteer, Mobile
BBlack added a comment to T172124: PyBal Feature: progressive depooling strategy for monitored failures.

It's also an interesting thought to consider progressively scaling the weight. For example, you could make the strategy configurable such that the first failure sets weight=configured_weight*0.5, the next weight=0, and the next deletes. However, the way that weighting is handled in sh for the public services is not ideal (excess churn due to lack of true chashing), so it's probably best to avoid staging through smaller shifts of weight until some future time when we've got a proper chashing ipvs scheduler.
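
The staged strategy floated above (half weight, then weight 0, then delete) could be sketched roughly as follows. This is only an illustration of the idea, not PyBal code: the `Backend` class, the function names, and the use of integer weights are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    """Hypothetical stand-in for a pooled server; not PyBal's real data model."""
    name: str
    configured_weight: int
    failures: int = 0


def next_weight(backend: Backend) -> Optional[int]:
    """Return the weight to apply for the number of failures seen so far.

    0 failures  -> full configured weight
    1 failure   -> half the configured weight (integer division)
    2 failures  -> weight 0 (no new traffic, entry kept)
    3+ failures -> None, meaning "delete the server from the service"
    """
    if backend.failures == 0:
        return backend.configured_weight
    if backend.failures == 1:
        return backend.configured_weight // 2
    if backend.failures == 2:
        return 0
    return None


def record_failure(backend: Backend) -> Optional[int]:
    """Count one more monitoring failure and return the resulting weight."""
    backend.failures += 1
    return next_weight(backend)
```

As noted in the comment, the intermediate half-weight step is probably best skipped until a consistent-hashing scheduler is available, which would reduce this to two stages: weight 0, then delete.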

Mon, Jul 31, 3:30 PM · Pybal, Traffic, Operations
BBlack added subtasks for T172124: PyBal Feature: progressive depooling strategy for monitored failures: T86650: Add support for setting weight=0 when depooling, T171850: Backport ipvsadm.
Mon, Jul 31, 3:27 PM · Pybal, Traffic, Operations
BBlack added a parent task for T86650: Add support for setting weight=0 when depooling: T172124: PyBal Feature: progressive depooling strategy for monitored failures.
Mon, Jul 31, 3:27 PM · Operations, Traffic, Patch-For-Review, Pybal
BBlack added a parent task for T171850: Backport ipvsadm: T172124: PyBal Feature: progressive depooling strategy for monitored failures.
Mon, Jul 31, 3:27 PM · Pybal, Traffic, Operations
BBlack created T172124: PyBal Feature: progressive depooling strategy for monitored failures.
Mon, Jul 31, 3:27 PM · Pybal, Traffic, Operations
BBlack added a subtask for T172103: IPVS issues with UDP services, pybal depooling strategy: T86650: Add support for setting weight=0 when depooling.
Mon, Jul 31, 3:18 PM · Pybal, Traffic, Operations
BBlack added a parent task for T86650: Add support for setting weight=0 when depooling: T172103: IPVS issues with UDP services, pybal depooling strategy.
Mon, Jul 31, 3:17 PM · Operations, Traffic, Patch-For-Review, Pybal
BBlack added a comment to T172116: Improve OCSP fetching and monitoring strategies.

Hmm I wrote that backwards above. The OCSP file-freshness checks look at age-of-mtime, not the timestamp within. In any case, we can still move them to crit=~3d and warn=~2d.
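
A freshness check based on age-of-mtime with those thresholds might look something like this. It is a minimal sketch, not the actual monitoring check: the function name and the returned status strings are assumptions, and only the warn=~2d / crit=~3d values come from the comment.

```python
import os
import time

DAY = 86400            # seconds
WARN_AGE = 2 * DAY     # warn=~2d, from the comment above
CRIT_AGE = 3 * DAY     # crit=~3d, from the comment above


def ocsp_freshness(path, now=None, warn=WARN_AGE, crit=CRIT_AGE):
    """Classify a file by the age of its mtime (not any timestamp inside it)."""
    if now is None:
        now = time.time()
    age = now - os.stat(path).st_mtime
    if age >= crit:
        return "CRITICAL"
    if age >= warn:
        return "WARNING"
    return "OK"
```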

Mon, Jul 31, 2:16 PM · Patch-For-Review, Operations, Traffic
BBlack created T172116: Improve OCSP fetching and monitoring strategies.
Mon, Jul 31, 2:08 PM · Patch-For-Review, Operations, Traffic
BBlack closed T172101: OCSP update failed for /etc/update-ocsp.d/globalsign-2016-ecdsa-unified.conf as Resolved.

Ran it again and it's ok now.

Mon, Jul 31, 1:56 PM · Operations, Traffic
BBlack added a subtask for T172103: IPVS issues with UDP services, pybal depooling strategy: T171850: Backport ipvsadm.
Mon, Jul 31, 1:51 PM · Pybal, Traffic, Operations
BBlack added a parent task for T171850: Backport ipvsadm: T172103: IPVS issues with UDP services, pybal depooling strategy.
Mon, Jul 31, 1:51 PM · Pybal, Traffic, Operations
BBlack added a comment to T172103: IPVS issues with UDP services, pybal depooling strategy.

+1. There are a number of tricky things here to get to these simple goals, though, and since the sysctls affect all services, we have to have the TCP cases in mind as well:

Mon, Jul 31, 1:49 PM · Pybal, Traffic, Operations
BBlack added a comment to T134893: Unhandled pybal error causing services to be depooled in etcd but not in lvs.

The added PyBal IPVS diff check is flapping a bit with UNKNOWN for some hosts (lvs100[3,6,9], lvs200[3,6]) with message:

HTTPConnectionPool(host='localhost', port=9090): Read timed out. (read timeout=1.0)
$ grep -c "PyBal IPVS diff check" icinga.log
34

When specifying the timeout in Requests you can use a tuple to put different values for connect and read timeouts. My guess is that sometimes on those hosts PyBal is not able to reply within the 1s timeout and we might need a larger one.
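
For reference, the tuple form of the Requests timeout looks roughly like this. The helper names and the 5-second read timeout are assumptions for illustration; the URL is passed in by the caller because the exact endpoint the check queries is not shown here.

```python
def split_timeout(connect: float, read: float) -> tuple:
    """Build the (connect, read) tuple accepted by Requests' `timeout` argument."""
    if connect <= 0 or read <= 0:
        raise ValueError("timeouts must be positive")
    return (connect, read)


def fetch_status(url: str, connect: float = 1.0, read: float = 5.0) -> int:
    """Fetch `url` with separate connect and read timeouts; return the status code."""
    # Imported here so split_timeout() is usable without the library installed.
    import requests

    response = requests.get(url, timeout=split_timeout(connect, read))
    return response.status_code
```

Keeping the connect timeout short while allowing a longer read timeout still detects a dead listener quickly, but tolerates a daemon that is slow to produce its reply.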

Mon, Jul 31, 1:38 PM · Patch-For-Review, Operations-Software-Development, Pybal, Operations, Traffic

Thu, Jul 27

BBlack added a project to T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster: Traffic.
Thu, Jul 27, 2:26 PM · Traffic, wikiba.se, Operations, Wikidata-Sprint-2016-11-08, Wikidata
BBlack added a comment to T171850: Backport ipvsadm.

Yeah, that's not a bad idea. Perhaps we should morph this into a stretch-for-LVS ticket, and start with the always-almost-ready-to-use lvs1007-12? :)

Thu, Jul 27, 2:25 PM · Pybal, Traffic, Operations
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

@Johan Yeah I've been OoO and catching up slowly too. We also have Wikimania coming up on the horizon of course. Want to shoot for a start date the Thursday after Wikimania, Aug 17th, 3 weeks out from today?

Thu, Jul 27, 1:53 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic

Wed, Jul 26

BBlack added a comment to T170740: PuppetDB misbehaving on 2017-07-15.

So, things fell over again with a ton of puppetfail spam. As a stopgap, I've done the following:

Wed, Jul 26, 5:53 PM · Patch-For-Review, Puppet, Operations
BBlack added a comment to T104442: Investigate better DNS cache/lookup solutions.

So to recap a small part of IRC discussion today in the wake of issues with rebooting hydrogen, I think our short-term improvement plan looks like this:

Wed, Jul 26, 5:16 PM · Patch-For-Review, Traffic, Operations

Tue, Jul 25

BBlack added a comment to T171028: Degraded RAID on cp1008.

It was decommed a long time ago, and then I revived it as a quasi-production testing machine for "temporary" use for a little while, and probably poorly documented that, and now "temporary" has stretched on for a really, really long time. cp1008 is the correct machine.

Tue, Jul 25, 7:12 PM · Traffic, Operations
BBlack added a comment to T154026: On mobile, http://wikipedia.org/wiki/Foo redirects to https://www.m.wikipedia.org/wiki/Foo which does not exist.

As @Aklapper said, there's a broad range of issues embedded in this. In the example given in the title, this is what actually happens and which actors are responsible for the redirects:

Tue, Jul 25, 6:32 PM · Readers-Web-Backlog (Tracking), Operations, Puppet, Wikimedia-Apache-configuration, Mobile
BBlack added a comment to T171498: Implement machine-local forwarding DNS caches.
  • I'm worried a little bit that this will hide issues like the ones you mentioned under the carpet. The cases where services are latency/failure-sensitive especially are issues we should be fixing. I'm worried that with a local recursor we'll just make them manifest even less often and in even more corner-cases :/
Tue, Jul 25, 1:50 PM · Traffic, Operations

Mon, Jul 24

BBlack moved T171498: Implement machine-local forwarding DNS caches from Triage to DNS Infra on the Traffic board.
Mon, Jul 24, 6:16 PM · Traffic, Operations
BBlack created T171498: Implement machine-local forwarding DNS caches.
Mon, Jul 24, 5:50 PM · Traffic, Operations
BBlack added subtasks for T104442: Investigate better DNS cache/lookup solutions: T98006: Anycast (Auth)DNS, T164327: replace ulsfo aging servers.
Mon, Jul 24, 5:34 PM · Patch-For-Review, Traffic, Operations
BBlack added a parent task for T98006: Anycast (Auth)DNS: T104442: Investigate better DNS cache/lookup solutions.
Mon, Jul 24, 5:34 PM · Patch-For-Review, netops, Operations, Traffic
BBlack added a parent task for T164327: replace ulsfo aging servers: T104442: Investigate better DNS cache/lookup solutions.
Mon, Jul 24, 5:34 PM · Patch-For-Review, Traffic, Operations, ops-ulsfo
BBlack added a comment to T104442: Investigate better DNS cache/lookup solutions.

Add T171318 to the list too. There's doubtless a long tail of issues we'll never fully realize that would be helped by work here. Part of the reason this ticket's still idling so long is that it doesn't offer any simple path forward, just problems and problematic solutions. So let's step through things here:

Mon, Jul 24, 5:33 PM · Patch-For-Review, Traffic, Operations

Jul 21 2017

elukey awarded T164768: Explicitly limit varnishd transient storage a Love token.
Jul 21 2017, 1:40 PM · Patch-For-Review, Traffic, Operations

Jul 14 2017

BBlack added a comment to T128559: store.wikimedia.org HTTPS issues.

Digging a little deeper, Shopify open-sources a lot of their infrastructure code. It seems likely that they already support the appropriate attributes at least in the lower levels of their stack (who knows in the user interface), as the specific options exist in their modified clone of Rails: https://github.com/Shopify/rails-mirror/blob/master/actionpack/lib/action_dispatch/middleware/ssl.rb#L20

Jul 14 2017, 8:19 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack merged T152622: Wikipedia.cz and other domains owned by WMCZ have invalid certificate into T133548: Create a secure redirect service for large count of non-canonical / junk domains.
Jul 14 2017, 8:11 PM · Patch-For-Review, HTTPS, Traffic, Operations
BBlack merged task T152622: Wikipedia.cz and other domains owned by WMCZ have invalid certificate into T133548: Create a secure redirect service for large count of non-canonical / junk domains.
Jul 14 2017, 8:11 PM · Operations, Traffic, Domains, User-Urbanecm
BBlack added a comment to T152622: Wikipedia.cz and other domains owned by WMCZ have invalid certificate.

To be clear then: this ticket is about our (WMF's) hosting of wikipedia.cz not having a valid SSL cert, and maybe touches on broader issues of ownership and delegation rules.

Jul 14 2017, 8:11 PM · Operations, Traffic, Domains, User-Urbanecm
BBlack updated the task description for T128559: store.wikimedia.org HTTPS issues.
Jul 14 2017, 8:05 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack added a comment to T128559: store.wikimedia.org HTTPS issues.

It seems like Shopify has been making some improvements on this front since we last checked.

Jul 14 2017, 8:02 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

@Johan - Do you have any kind of estimate on Community's inputs here and time needed before a start date? Can we set a tentative one and begin editing the various copy?

Jul 14 2017, 1:06 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic

Jul 13 2017

BBlack added a comment to T169683: Thumbor should return informative and nice-looking errors.

We could perhaps go after this in the most-general sense. In our common VCL (across all clusters), if we get an error from a backend ([45]xx) which has no body content, we can always turn that into a synth() response and probably add missing reason text in most cases as well (varnish does that by default when you re-set the status code). This way any app that sends errors with empty bodies gets converted to the standardized error templates that Varnish already has.
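
The real change would live in VCL, but the decision logic can be modelled in a few lines of Python. This is only a sketch of the idea: the function name and the HTML template are placeholders, not Varnish's actual error templates.

```python
from http import HTTPStatus


def normalize_error(status: int, reason: str, body: bytes):
    """Give 4xx/5xx responses with an empty body a standardized body.

    Responses that already carry a body, and non-error responses, pass through
    unchanged. A missing reason phrase is filled in from the standard table.
    """
    if 400 <= status <= 599 and not body:
        if not reason:
            try:
                reason = HTTPStatus(status).phrase
            except ValueError:
                reason = "Error"  # status code with no standard phrase
        # Placeholder template, standing in for the cache's own error page.
        body = f"<html><body><h1>{status} {reason}</h1></body></html>".encode()
    return status, reason, body
```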

Jul 13 2017, 11:18 PM · Patch-For-Review, Performance-Team, Thumbor
BBlack added a comment to T170628: HTTP 429 on thumbnail images for specific SVG file on Commons.

So, error code 429 is Too Many Requests, generally used by ratelimiters. In this case, it seems that Thumbor (our internal service that renders thumbnails of images) issues a 429 because the SVG is failing to render (it might be invalid SVG in some sense, or at least making our SVG parsing tools explode). Making this more user-friendly is discussed in T169683.

Jul 13 2017, 10:08 PM · Performance-Team (Radar), Operations, Thumbor, Commons, media-storage
BBlack created T170598: Extending our HSTS value beyond ~1y.
Jul 13 2017, 4:35 PM · Operations, Traffic
BBlack moved T170567: Support TLSv1.3 from Triage to TLS on the Traffic board.
Jul 13 2017, 1:59 PM · Patch-For-Review, Operations, Traffic
BBlack created T170567: Support TLSv1.3.
Jul 13 2017, 1:59 PM · Patch-For-Review, Operations, Traffic
BBlack added a comment to T170546: Optimize Wikipedia PNG Logo.

I think the main point here is we'd rather have a reproducible method for optimizing these images which works on our Linux and open-source based infrastructure. Having a third party optimize one of our many PNGs once manually is interesting, but this doesn't scale to the many other PNGs which may be spread around many other repos and sources, and more importantly the work will be lost the next time someone uploads new PNG content updates (e.g. visual re-designs or tweaks for new display types).

Jul 13 2017, 1:12 PM · Performance-Team (Radar), Wikimedia-Site-requests

Jul 11 2017

BBlack updated the task description for T128559: store.wikimedia.org HTTPS issues.
Jul 11 2017, 5:29 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack updated the task description for T104681: HTTPS Plans (tracking / high-level info).
Jul 11 2017, 5:19 PM · Tracking, Operations, Traffic, HTTPS
BBlack updated the task description for T104681: HTTPS Plans (tracking / high-level info).
Jul 11 2017, 5:17 PM · Tracking, Operations, Traffic, HTTPS
BBlack updated the task description for T104681: HTTPS Plans (tracking / high-level info).
Jul 11 2017, 5:16 PM · Tracking, Operations, Traffic, HTTPS
BBlack added a comment to T104681: HTTPS Plans (tracking / high-level info).

So with these changes and cleanups in the past few weeks, we're basically down to two outstanding issues here from the original context:

Jul 11 2017, 5:14 PM · Tracking, Operations, Traffic, HTTPS
BBlack closed T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination as Resolved.

Resolving this and moving the last remaining ticket up the tree as a direct child of the tracker. There's no point having a sub-category for one thing.

Jul 11 2017, 5:12 PM · Patch-For-Review, HTTPS, Traffic, Operations
BBlack closed T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination, a subtask of T40516: Enable HSTS on Wikimedia sites, as Resolved.
Jul 11 2017, 5:12 PM · Operations, Traffic, HTTPS
BBlack closed T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination, a subtask of T104681: HTTPS Plans (tracking / high-level info), as Resolved.
Jul 11 2017, 5:12 PM · Tracking, Operations, Traffic, HTTPS
BBlack edited parent tasks for T128559: store.wikimedia.org HTTPS issues, added: T104681: HTTPS Plans (tracking / high-level info); removed: T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination.
Jul 11 2017, 5:11 PM · Operations, Traffic, Wikimedia-Shop, HTTPS
BBlack added a subtask for T104681: HTTPS Plans (tracking / high-level info): T128559: store.wikimedia.org HTTPS issues.
Jul 11 2017, 5:11 PM · Tracking, Operations, Traffic, HTTPS
BBlack removed a subtask for T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination: T128559: store.wikimedia.org HTTPS issues.
Jul 11 2017, 5:11 PM · Patch-For-Review, HTTPS, Traffic, Operations
BBlack closed T137161: Fix nits in HTTPS/HSTS configs in externally-hosted fundraising domains as Resolved.
Jul 11 2017, 5:10 PM · Traffic, Operations
BBlack closed T137161: Fix nits in HTTPS/HSTS configs in externally-hosted fundraising domains, a subtask of T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination, as Resolved.
Jul 11 2017, 5:10 PM · Patch-For-Review, HTTPS, Traffic, Operations
BBlack removed a subtask for T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination: Unknown Object (Task).
Jul 11 2017, 4:26 PM · Patch-For-Review, HTTPS, Traffic, Operations
BBlack added a comment to T170193: revoke eventdonations.wikimedia.org SSL cert if there is one....

Ah I missed the part above where it stated that it expired in a week or two. In that case, there's little point for this particular certificate.

Jul 11 2017, 4:16 PM · Patch-For-Review, Domains, Traffic, Operations, fundraising-tech-ops
BBlack added a comment to T170193: revoke eventdonations.wikimedia.org SSL cert if there is one....

I think in this case we should revoke unless the expiry is already very close (it might be!). This is a private key that is out of our control, and I honestly don't even understand all the machinations of the change of vendors involved here. It was one thing to trust them to represent one of our hostnames in a TLS public key when they were an active vendor with a contractual relationship; it's another to trust that that key is still secure in the aftermath of that relationship ending.

Jul 11 2017, 4:15 PM · Patch-For-Review, Domains, Traffic, Operations, fundraising-tech-ops
BBlack added a comment to T124954: Decrease max object TTL in varnishes.

[..] We don't believe it should be possible at this time for an object to exist in the caching layers for more than 4 days, assuming there are no application-layer HTTP bugs in play (e.g. the application incorrectly giving a 304 Not Modified response to a conditional request from the cache, for content which has in fact been modified).

Is there an upper limit to how long or how often the same cache object can be "304-whitewashed"?

Jul 11 2017, 3:06 PM · Traffic, Operations
BBlack removed a parent task for T156919: Port RCStream clients to EventStreams: T168919: stream.wikimedia.org: remove legacy rcstream/socket.io HTTPS redirect hole punches.
Jul 11 2017, 1:00 AM · Analytics-Kanban, Wikimedia-Stream
BBlack removed a subtask for T168919: stream.wikimedia.org: remove legacy rcstream/socket.io HTTPS redirect hole punches: T156919: Port RCStream clients to EventStreams.
Jul 11 2017, 1:00 AM · Operations, Traffic
BBlack closed T168919: stream.wikimedia.org: remove legacy rcstream/socket.io HTTPS redirect hole punches as Resolved.

This hole was removed today in https://gerrit.wikimedia.org/r/#/c/364252 , so this is resolved assuming we don't revert (unlikely!). \o/

Jul 11 2017, 12:58 AM · Operations, Traffic
BBlack closed T168919: stream.wikimedia.org: remove legacy rcstream/socket.io HTTPS redirect hole punches, a subtask of T104681: HTTPS Plans (tracking / high-level info), as Resolved.
Jul 11 2017, 12:58 AM · Tracking, Operations, Traffic, HTTPS

Jul 10 2017

BBlack added a comment to T137161: Fix nits in HTTPS/HSTS configs in externally-hosted fundraising domains.

benefactors - It wasn't part of the original task here, we've just been questioning whether it's also being removed at the same time, since it seems related.

Jul 10 2017, 6:28 PM · Traffic, Operations
BBlack added a comment to T169175: What is a reasonable per-IP ratelimit for maps.

+1 on using a similar rate to the APIs on text. I wonder what the peak (ab?)users' rates on upload.wikimedia.org look like as well, and whether one shared ratelimit for both might make sense.

Jul 10 2017, 6:27 PM · Discovery-Analysis, Operations, Traffic, Maps-Sprint, Maps, Discovery
BBlack added a comment to T163251: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members.

Sorry, we've been wanting to make forward progress on this for several months now, but it keeps falling to the bottom of our priority queue. I'll pick up here from the related email thread as well and try to cover the basics:

Jul 10 2017, 5:36 PM · Patch-For-Review, User-Johan, Community-Liaisons (Jul-Sep 2017), Operations, Traffic
BBlack added a comment to T167299: Upgrade BIOS/RBSU/etc on lvs1007.

Update?

Jul 10 2017, 4:27 PM · ops-eqiad, Traffic, netops, Operations
BBlack reopened T137161: Fix nits in HTTPS/HSTS configs in externally-hosted fundraising domains as "Open".

All of these hostnames are still in DNS AFAICS:

Jul 10 2017, 3:57 PM · Traffic, Operations
BBlack reopened T137161: Fix nits in HTTPS/HSTS configs in externally-hosted fundraising domains, a subtask of T132521: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination, as Open.
Jul 10 2017, 3:57 PM · Patch-For-Review, HTTPS, Operations, Traffic