This is now live on the beta cluster, and it will be available on staging as soon as the staging-cache-* machines come online.
You can check it out on the wmflabs.org [[ https://graphite.wmflabs.org/dashboard/#availability | graphite dashboard ]].
Individual Graphs:
* [[ https://graphite.wmflabs.org//render?width=600&from=-8hours&until=now&height=400&target=cactiStyle%28alias%28averageSeries%28deployment-prep.*.availability.availability%29%2C%22deployment-cache-*%22%29%29&title=Cluster%20Average&graphOnly=false&vtitle=%25&yMin=96&yMax=100&_uniq=0.25077629713462357 | cluster average ]]
* [[ https://graphite.wmflabs.org/render?width=600&from=-8days&until=now&height=400&yMin=&target=cactiStyle(aliasByNode(deployment-prep.*.availability.availability%2C1))&yMax=100&title=Deployment-prep%20availability&_uniq=0.5533200549967812 | individual varnish nodes ]]
* [[ https://graphite.wmflabs.org/render/?width=1052&height=313&_salt=1429917276.356&from=-2days&logBase=&vtitle=errors%20per%20minute&lineMode=connected&connectedLimit=&target=stacked(sumSeries(nonNegativeDerivative(keepLastValue(deployment-prep.*.availability.5xx))))&hideLegend=true | Errors per minute ]]
So far we are averaging about 99.5% availability. That figure counts 404s as non-availability, so it is fairly good IMO.
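
If you would rather pull the numbers than look at the graphs, here is a rough sketch that queries the same cluster-average target through Graphite's render endpoint with `format=json`. The helper names (`render_url`, `mean_of_series`, `fetch_means`) are made up for illustration, and it assumes graphite.wmflabs.org is reachable from wherever you run it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Render endpoint on the same host as the graphs linked above.
GRAPHITE_RENDER = "https://graphite.wmflabs.org/render"

# Same series expression as the "cluster average" graph, minus the
# cactiStyle/alias wrappers that only affect the legend.
CLUSTER_AVERAGE = "averageSeries(deployment-prep.*.availability.availability)"


def render_url(target, from_="-8hours", until="now", fmt="json"):
    """Build a render URL for one target expression (hypothetical helper)."""
    query = urlencode(
        {"target": target, "from": from_, "until": until, "format": fmt}
    )
    return f"{GRAPHITE_RENDER}?{query}"


def mean_of_series(series):
    """Average the non-null values of one series from Graphite's JSON output.

    Each datapoint is a [value, timestamp] pair; value is None for gaps.
    """
    values = [value for value, _ts in series["datapoints"] if value is not None]
    return sum(values) / len(values) if values else None


def fetch_means(target=CLUSTER_AVERAGE):
    """Fetch the target and return {series name: mean value}. Needs network."""
    with urlopen(render_url(target)) as resp:
        return {s["target"]: mean_of_series(s) for s in json.load(resp)}
```

Calling `fetch_means()` should give one entry for the averaged series. Passing `aliasByNode(deployment-prep.*.availability.availability,1)` as the target instead should give one entry per varnish node, matching the second graph.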