
HTTPS-only for stream.wikimedia.org
Closed, Declined · Public

Description

stream.wikimedia.org was moved behind cache_misc in T134871. It was noted during this transition that most of the current clients are using unencrypted HTTP, and that our default/sample websocket client implementations tend to break on a 301 redirect rather than follow it, so this transition may be painful.

We should start by ensuring our own documentation uses https:// and/or wss:// URLs for the stream service as appropriate. We should then announce the problem to the community, with a plea to update their URLs for secure access, and set a future date on which this service will redirect to HTTPS with a 301, like all of our other public, cache-terminated hostnames in wikimedia.org.
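As a rough illustration of the client-side change being asked for, the sketch below upgrades a stream URL from http:// or ws:// to https:// or wss://, so a client never has to cope with the 301. The function name `to_secure_url` and the `/example` path are hypothetical and only serve the illustration; this is not taken from any existing client.

```python
from urllib.parse import urlsplit, urlunsplit

# Insecure schemes and their TLS equivalents.
SECURE_SCHEME = {"http": "https", "ws": "wss"}


def to_secure_url(url: str) -> str:
    """Return url with http/ws upgraded to https/wss (hypothetical helper).

    URLs that already use another scheme are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme not in SECURE_SCHEME:
        return url
    netloc = parts.netloc
    # An explicit ":80" would pin the client to the plaintext port,
    # so drop it and let the secure scheme's default port apply.
    if parts.port == 80:
        netloc = netloc.rsplit(":", 1)[0]
    return urlunsplit(
        (SECURE_SCHEME[parts.scheme], netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(to_secure_url("http://stream.wikimedia.org/example"))
    print(to_secure_url("ws://stream.wikimedia.org:80/example"))
```

Rewriting the URL up front avoids depending on whether a given websocket library follows redirects at all.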

Details

Related Gerrit Patches:
operations/puppet : production · rcstream: log X-Forwarded-Proto

Event Timeline

BBlack created this task. Jul 12 2016, 4:48 PM
Restricted Application added a project: Operations. Jul 12 2016, 4:48 PM
Restricted Application added subscribers: Zppix, Aklapper.

Change 299892 had a related patch set uploaded (by Ori.livneh):
rcstream: log X-Forwarded-Proto

https://gerrit.wikimedia.org/r/299892

Change 299892 merged by Ori.livneh:
rcstream: log X-Forwarded-Proto

https://gerrit.wikimedia.org/r/299892

BBlack added a comment. Sep 6 2016, 1:28 PM

Have we sent any announcement about this to the community? We might have already, just not tracked in here.

BBlack added a comment. Sep 6 2016, 3:15 PM

Looking at the past couple days of access logs from ori's nginx patch above, it looks like the current split is still 88% insecure, 12% secure :/

I did some digging and found that /shared/pywikipedia/core/pywikibot/comms/rcstream.py in tools suggests stream.wikimedia.org on port 80.

That's at least one bot in tools fixed. Can you filter those access logs down to labs entries only (208.80.155.128 - 208.80.155.255), and write the result to my home directory on some production server?
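The range quoted above, 208.80.155.128 - 208.80.155.255, is exactly the network 208.80.155.128/25, so the filter can be sketched with Python's standard `ipaddress` module. This is a minimal sketch under an assumption: it presumes the client IP is available as a whitespace-separated field of each log line. The function names and the `ip_field` parameter are hypothetical.

```python
import ipaddress

# 208.80.155.128 - 208.80.155.255 expressed as a single /25 network.
LABS_NET = ipaddress.ip_network("208.80.155.128/25")


def is_labs_ip(addr: str) -> bool:
    """True if addr is a valid IP address inside the labs range."""
    try:
        return ipaddress.ip_address(addr) in LABS_NET
    except ValueError:
        # Not an IP address at all (e.g. "-" or a malformed field).
        return False


def filter_labs(lines, ip_field=0):
    """Yield only the log lines whose ip_field-th field is a labs IP.

    ip_field is an assumption about the log layout, not a known format.
    """
    for line in lines:
        fields = line.split()
        if len(fields) > ip_field and is_labs_ip(fields[ip_field]):
            yield line


if __name__ == "__main__":
    sample = [
        "208.80.155.130 http GET /example",
        "10.64.0.1 https GET /example",
    ]
    for line in filter_labs(sample):
        print(line)
```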

Dzahn added a subscriber: Dzahn. Sep 14 2016, 6:39 PM

The current access log is only about 9% HTTPS, and a chunk of that is all Catchpoint monitoring.

About 32% are from python-requests UA, of which under 1% use https :/
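Percentages like these can be derived by tallying the logged X-Forwarded-Proto value, overall and per user agent. The sketch below assumes the log has already been parsed into (user agent, protocol) pairs; the parsing step, the function names, and the sample data are hypothetical and do not reflect the real log format.

```python
from collections import Counter, defaultdict


def https_share(protos):
    """Percentage of entries in protos equal to 'https' (case-insensitive)."""
    counts = Counter(p.lower() for p in protos)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * counts["https"] / total


def https_share_by_agent(records):
    """Map each user agent to its HTTPS percentage.

    records is an iterable of (user_agent, proto) pairs.
    """
    by_agent = defaultdict(list)
    for agent, proto in records:
        by_agent[agent].append(proto)
    return {agent: https_share(protos) for agent, protos in by_agent.items()}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    records = [
        ("python-requests", "http"),
        ("python-requests", "http"),
        ("monitoring", "https"),
        ("browser", "http"),
    ]
    print(https_share(proto for _, proto in records))
    print(https_share_by_agent(records))
```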

Dzahn added a comment. Sep 14 2016, 6:42 PM

Can you filter those access logs down to labs entries only (208.80.155.128 - 208.80.155.255), and write the result to my home directory on some production server?

I only see the 10.64.x.x IPs of cp servers in the access log. it has the http_x_forwarded_proto but not the remote IP.

BBlack moved this task from Triage to TLS on the Traffic board. Sep 30 2016, 1:40 PM
BBlack added a comment. Oct 3 2016, 4:05 PM

Can you filter those access logs down to labs entries only (208.80.155.128 - 208.80.155.255), and write the result to my home directory on some production server?

I only see the 10.64.x.x IPs of cp servers in the access log. it has the http_x_forwarded_proto but not the remote IP.

We could perhaps enable apache logging of the X-Client-IP header to see through the caches for this.

We could perhaps enable apache logging of the X-Client-IP header to see through the caches for this.

Done in https://gerrit.wikimedia.org/r/#/c/318296/ (forgot to put Bug: link in there)

At a glance, it seems like the bulk of the query traffic comes from GCE and AWS, and the bulk of it is still not HTTPS.

BBlack closed this task as Declined. Dec 19 2016, 4:18 PM

We're going to leave this as-is and assume the eventstream replacement (which will be HTTPS-only from the get-go) will handle this for us.