
Deprecate HTTPS udp2log stream?
Closed, ResolvedPublic

Description

Our nginx TLS terminators currently carry a custom-written patch that adds udp2log logging support.

This is unmaintained, buggy (e.g. with regard to sequence number generation), needs porting to each newer nginx version, and requires us to ship custom packages.
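For context on the sequence-number bug class mentioned above: udp2log consumers expect each producer to prefix its lines with a monotonically increasing sequence number so that packet loss can be detected. A minimal illustrative sketch of the producer side follows — the class name, source tag, and field layout here are assumptions for illustration, not the actual patch's format:

```python
import socket

class Udp2logSender:
    """Illustrative udp2log-style producer (hypothetical field layout).

    Consumers detect loss by watching for gaps in the sequence numbers,
    which is why generating them incorrectly (the bug noted above)
    corrupts loss accounting downstream.
    """

    def __init__(self, host, port, source="nginx-tls"):
        self.addr = (host, port)
        self.source = source
        self.seq = 0  # must increase by exactly 1 per emitted line
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def format_line(self, payload):
        # "<source> <seq> <payload>" -- assumed layout for illustration
        line = "%s %d %s" % (self.source, self.seq, payload)
        self.seq += 1
        return line

    def send(self, payload):
        self.sock.sendto(self.format_line(payload).encode("utf-8"), self.addr)
```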

I remember hacking on it and fixing some bugs my second week at the Foundation, and people telling me to let it go as it was going away "soon". Almost three years have passed and we still have it, for reasons that still aren't clear to many people, including myself.

We're moving to a much newer nginx version (1.6.x, it looks like) very soon, so it'd be nice to either deprecate the stream entirely or, failing that, properly architect it and assign it a maintainer. This conversation is a blocker for a couple of quarterly goals for SRE, so we should figure this out very soon.

Event Timeline

faidon claimed this task.
faidon raised the priority of this task from to High.
faidon updated the task description. (Show Details)
faidon added subscribers: Aklapper, faidon, mark, BBlack.

On the re-architecting: I think (newer?) nginx versions can write logs to a pipe, so that might be a quick win without patching nginx (?)
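One way this could work, as a hypothetical sketch: nginx's `access_log` opens its target as a regular file, but it will write into a pre-created FIFO as long as a reader process keeps the pipe open. The FIFO path and the forwarder name below are assumptions for illustration, not existing tooling:

```nginx
# Hypothetical sketch -- not a tested configuration.
# Pre-create the FIFO and start a long-running reader first, e.g.:
#   mkfifo /var/run/nginx-udp2log.fifo
#   some-udp2log-forwarder < /var/run/nginx-udp2log.fifo &
http {
    access_log /var/run/nginx-udp2log.fifo combined;
}
```

Note that if the reader dies, writes to the FIFO will block or fail, so the forwarder would need supervision; that operational fragility is part of why a proper design discussion (or deprecation) is warranted.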

@Tnegrin confirmed that Analytics is not using this data (anymore), and it can be turned off.

(Let's turn off this data ASAP, so we still have a bit of time if someone suddenly starts screaming.)

Otto, Oliver, Erik -- This is a heads-up that these log messages are going away. My understanding was that we were using them to count HTTPS requests but we use a different mechanism on Hadoop.

-Toby

Seems sensible.

(Out of interest, as part of the ticket-that-must-not-be-named, do we want to also stop generating the sampled logs altogether? Are we using them for anything now?)

The sampled logs are used for about 15 monthly and quarterly reports, for which replacement is still in the 'someday, somehow by someone' phase.

http://stats.wikimedia.org/wikimedia/squids/SquidReportsCountriesLanguagesVisitsEdits.htm
http://stats.wikimedia.org/cgi-bin/search_portal.pl?search=breakdown+of+traffic

If you're talking about 1:1000 sampled text logs, these are immensely useful for day to day operations. But let's keep this on-topic, we can discuss this further in a separate task if you want.

Yep; Erik's answer is enough. My comment was merely an aside.

FYI, the pagecounts-raw files found at dumps.wikimedia.org/other/pagecounts-raw/ use the nginx logs. We are now recreating this data using Hive via varnishkafka. Christian and I have a rollout plan to get the newly generated files to dumps.wikimedia.org. This will require a small announcement to the analytics public list, as well as some hacking on the scripts that are currently used to copy these files from webstatscollector to dataset1001.

Are they? So are these just counting the X% of requests that come via HTTPS, where X is < 5 probably (and also a biased sample, as this is predominantly editors)?

So, in udp2log, the https request and the duplicated proxied http request both exist. That means that any given https request will have two entries in the logs. The webstatscollector code chooses to use the nginx log line and throws away the varnish log line. By removing nginx from the webrequest udp2log stream, webstatscollector will no longer count any https requests.
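The selection logic described above can be sketched as a small filter. This is illustrative only — the real webstatscollector keys on different fields, and the `source`/`via_nginx` names here are hypothetical — but it shows why dropping nginx from the stream makes HTTPS hits vanish: the varnish duplicates keep getting discarded, and the nginx lines that used to represent those requests no longer arrive.

```python
def keep_line(fields):
    """Decide whether one parsed udp2log line should be counted.

    `fields` is a hypothetical dict parsed from one log line.
    nginx lines (HTTPS, logged by the TLS terminator) are kept;
    varnish lines are dropped when they are the internal duplicate
    of an nginx hit, recognized here by an assumed proxy marker.
    """
    if fields["source"] == "nginx":
        return True
    return not fields.get("via_nginx", False)
```

With nginx removed from the stream, only the `via_nginx` varnish lines remain for HTTPS traffic, and this filter (still in place) throws them all away — zero HTTPS requests counted.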

Since I'm going on vacation next week, Christian and I are making deprecating webstatscollector priority number one this week.

gerritbot added a subscriber: gerritbot.

Change 187168 had a related patch set uploaded (by Ottomata):
Copy pagecounts-raw dataset from stat1002 hdfs-archive (data generated by Hive) to dumps.wikimedia.org

https://gerrit.wikimedia.org/r/187168

Patch-For-Review

Change 187168 merged by Ottomata:
Copy pagecounts-raw dataset from stat1002 hdfs-archive (data generated by Hive) to dumps.wikimedia.org

https://gerrit.wikimedia.org/r/187168

Update: the Hive generated pagecounts-raw data is now being copied every hour from HDFS to dumps.wikimedia.org.
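An hourly copy like this can be driven by a simple cron entry on the receiving host. The following crontab fragment is purely a hypothetical sketch — the hosts, rsync module, and paths are assumptions, not the actual deployed job:

```
# Hypothetical crontab fragment on the dumps host; real paths/hosts differ.
# Pull the Hive-generated files from the hdfs-archive mirror once an hour.
0 * * * *  rsync -a stat1002.eqiad.wmnet::hdfs-archive/pagecounts-raw/ /data/xmldatadumps/public/other/pagecounts-raw/
```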

The data is still being backfilled in Hadoop. Once all backfill jobs are done, I will decommission webstatscollector, and we should be able to turn off nginx udp2log. I'll check back on the jobs tomorrow and update.

The data is still being backfilled in hadoop.

Done.

Looking GOOD! Brandon, you may turn off nginx udp2log! :)

nginx udp2log is off and nginx configs have been reloaded: https://gerrit.wikimedia.org/r/#/c/186257/