@fgiunchedi mentioned in #wikimedia-perf that he thought he remembered there being a reason why submitting metrics via graphite wouldn't work.
Mon, Apr 16
Brandon - No further action for Performance on this. I'm assigning to you to close out or for further investigation, if any is needed.
Fri, Apr 13
isCompliant check has been moved to the backend, and is working as expected. Log messages look like this:
Wed, Apr 11
All affected data has been reprocessed, and the graphs look like they should.
Change in observed performance due to depooling of Singapore:
Mon, Apr 9
After re-processing, medians are lining up as before.
Fix is deployed, and I'm re-running the data that's in Kafka. The retained data starts at Monday, April 2, 2018 8:53:00 AM, meaning that we can reprocess going back just over 1 week. Unfortunately we'll have a window before that where the data is skewed due to oversampling from Asia, but it's not worth the time/energy to fix, IMO.
Sat, Apr 7
Woke up in the middle of the night knowing what was going on. The code was checking for the presence of "is_oversample" and discarding if true, but the property is actually named "isOversample". Was getting thrown off because we do that a little differently in navtiming.py.
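For illustration, here is a minimal sketch of that kind of mismatch. It is not the actual coal code: the flat event dicts, the `loadEventEnd` field, and both function names are assumptions made up for the example. Only the two property names come from the comment above.

```python
# Hypothetical sketch of the bug described above; event shape is assumed.

def buggy_is_oversample(event):
    # Looks up the wrong key, so the lookup always falls back to False
    # and no event is ever treated as an oversample.
    return bool(event.get("is_oversample", False))


def fixed_is_oversample(event):
    # Uses the property name the events actually carry.
    return bool(event.get("isOversample", False))


# Made-up sample events, for illustration only.
events = [
    {"isOversample": True, "loadEventEnd": 1200},
    {"isOversample": False, "loadEventEnd": 900},
    {"loadEventEnd": 1100},  # flag absent: treated as a normal sample
]

kept_buggy = [e for e in events if not buggy_is_oversample(e)]
kept_fixed = [e for e in events if not fixed_is_oversample(e)]

print(len(kept_buggy), len(kept_fixed))
```

Because a missing key quietly defaults to "not an oversample", the wrong name never raises an error; the oversampled events simply stay in the data and skew the medians.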
Fri, Apr 6
I've confirmed that the new code is giving the correct results, and the old code is not.
Thu, Apr 5
Server side is deployed. The extension change is merged, but I'm figuring that it'll just go out on the train next week.
Mon, Apr 2
Coal has now been running for 4 days, and appears to be performing as expected.
Thu, Mar 29
Interestingly, checking out records coming in via kafkacat, I'm seeing some that I wouldn't expect to be issues:
Wed, Mar 28
Mon, Mar 26
Wed, Mar 21
Rolled out around 5:30PM UTC today. It's the middle of the night in Asia so current traffic is low, but it's clearly working:
Tue, Mar 20
Mar 20 2018
Mar 15 2018
Mar 14 2018
If it helps, this repo just needs dpkg-buildpackage run at the root in order to generate the deb: https://github.com/marlier/python-typing
Mar 13 2018
@Ottomata - should be today, in testing over the weekend I found an issue, think it's fixed but I need to verify. Since the new code knows how to catch up, I think you can go ahead and turn off the crossloader, and when the new code starts running we'll start processing from where we left off.
Live as of 13:15UTC.
Mar 12 2018
http://python-etcd.readthedocs.io/en/latest/ seems like a reasonable python-etcd client, which would in turn allow us to switch based on the master datacenter parameter.
Mar 9 2018
Mar 5 2018
Works like a charm!
Mar 2 2018
Hrm, probably need (limited) sudo access. Specifically, being able to read/tail the logs that are written by systemd when it launches a new service, and potentially being able to start/stop the service "coal".
@Dzahn Yes please!
It would be helpful if the entire team had this access. It's only necessary for me for the moment.
Mar 1 2018
Feb 26 2018
Perf team talked about this a bit today, and we think it's fine to go ahead.