
Create a "state of the cloud" monthly report
Open, Needs Triage, Public

Description

Idea from @chasemp in a labs-admin email thread:

I really don't want to call these target metrics or anything, but: "total instances in Cloud VPS", "total tools in Tools", "% on Grid", "% on k8s", "new LDAP user count", "new Tools user count", "nova-fullstack failures (or uptime?)". If this were run from cron and emailed to the labs-admin list before the meeting, that would be ideal, I guess.
We have a similar weekly or monthly stats generator for Phab (user count, issue count, etc.), and over time it creates a view of normal. It would show if we suddenly have 100 new LDAP users where we usually have 10, or if new Tools hasn't gone up in a few weeks, etc. These are hard-to-quantify baselines that I believe we will be happy to have put in front of our faces over time.

Event Timeline

bd808 created this task. Jul 25 2017, 4:59 PM
Restricted Application added a subscriber: Aklapper. Jul 25 2017, 4:59 PM
bd808 added a comment. Jul 25 2017, 5:02 PM

The trick to automating this is figuring out where we can gather the interesting metrics from, and possibly where we could store them for month-over-month comparisons.

Counting the number of active Kubernetes namespaces requires a privileged account and, I think, can only be done from tools-k8s-master at the moment. We could probably set up something that would figure this out daily and post it to graphite/prometheus, where it would be more accessible.
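A rough sketch of what such a daily job could look like, assuming Graphite's plaintext protocol (one "path value timestamp" line sent to port 2003). The metric path and the Graphite host in the usage comment are hypothetical placeholders, and counting every namespace is only an approximation of "active" namespaces:

```shell
#!/bin/bash
# Sketch only: metric path and Graphite host below are hypothetical placeholders.

# Format one data point in Graphite's plaintext protocol: "<path> <value> <epoch>".
# The timestamp defaults to the current time when omitted.
graphite_line() {
    local path="$1" value="$2" ts="${3:-$(date +%s)}"
    printf '%s %s %s\n' "$path" "$value" "$ts"
}

# Count all namespaces. Requires a kubectl that is already configured
# with an account privileged enough to list namespaces.
count_namespaces() {
    kubectl get namespaces --no-headers | wc -l
}

# Possible daily cron usage (host and metric path are assumptions):
#   graphite_line tools.k8s.namespaces "$(count_namespaces)" | nc -w 1 graphite.example.org 2003
```

Keeping the formatting separate from the counting means the same helper could be reused for the LDAP and login counts further down.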

bd808 added a comment. Jul 25 2017, 5:04 PM

The metric(s) that come out of T167556 (Define a metric to track OpenStack system availability) would probably be reasonable to include in this report.

bd808 added a comment. Jul 25 2017, 5:13 PM

LDAP account creations since date:

$ ldapsearch -xLLL -P 3 -E pr=40000/noprompt -o ldif-wrap=no -b"ou=people,dc=wikimedia,dc=org" '(&(objectClass=posixaccount)(createTimestamp>=20170701000000Z))' dn | grep dn: | wc -l
150

Tool account creations since date:

$ ldapsearch -xLLL -P 3 -E pr=40000/noprompt -o ldif-wrap=no -b"ou=people,ou=servicegroups,dc=wikimedia,dc=org" '(&(objectClass=posixaccount)(createTimestamp>=20170701000000Z))' dn | grep dn: | wc -l
10
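For a cron-driven report, the hard-coded date in the two queries above could be computed instead. A minimal sketch, reusing the same ldapsearch options; the function names are made up for illustration, and "since the start of the current month (UTC)" is an assumed reporting window:

```shell
#!/bin/bash
# Sketch only: function names are hypothetical; ldapsearch options are copied
# from the commands above.

# Build the LDAP filter for posixaccount entries created at or after a timestamp.
created_since_filter() {
    printf '(&(objectClass=posixaccount)(createTimestamp>=%s))\n' "$1"
}

# First instant of the current month (UTC) in LDAP generalized time format.
month_start() {
    date -u +%Y%m01000000Z
}

# count_created BASE_DN [SINCE]
# Prints the number of entries under BASE_DN created since SINCE
# (defaults to the start of the current month).
count_created() {
    local base="$1" since="${2:-$(month_start)}"
    ldapsearch -xLLL -P 3 -E pr=40000/noprompt -o ldif-wrap=no \
        -b "$base" "$(created_since_filter "$since")" dn | grep -c '^dn:'
}

# Possible usage:
#   count_created "ou=people,dc=wikimedia,dc=org"
#   count_created "ou=people,ou=servicegroups,dc=wikimedia,dc=org" 20170701000000Z
```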

I have often found the number of unique logins to a system per month to be a useful stat. (Drop the wc to see the distribution of logins per user.)

$ last | cut -f1 -d" " | head -n -2 | sort | uniq -c | sort -nr | wc -l
194
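If this ends up in a scheduled report, the pipeline could be wrapped in small functions that read `last` output on stdin. This is only a sketch with made-up function names. Unlike the one-liner above, it skips blank lines and the "wtmp begins" trailer by content rather than with head, and it also drops the "reboot" pseudo-user, which is an assumption about what should count as a login:

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Read `last` output on stdin and print "count user" pairs, busiest first.
# Skips blank lines, the "wtmp begins ..." trailer, and "reboot" entries.
logins_per_user() {
    awk 'NF && $1 != "reboot" && $1 != "wtmp" { print $1 }' |
        sort | uniq -c | sort -nr
}

# Number of distinct users seen in `last` output on stdin.
unique_logins() {
    logins_per_user | wc -l
}

# Possible usage:
#   last | unique_logins
#   last | logins_per_user | head
```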

It can also help to track the directions folks take after account activation. Combining this with the account creations:

$ ldapsearch -xLLL -P 3 -E pr=40000/noprompt -o ldif-wrap=no -b"ou=people,dc=wikimedia,dc=org" '(&(objectClass=posixaccount)(createTimestamp>=20170701000000Z))' uid | awk -F: '/^uid/ {print $2}' | xargs last | cut -f1 -d" " | head -n -2 | sort | uniq -c | sort -nr

Qgil added a subscriber: Qgil. Nov 2 2017, 9:38 PM
Harej added a subscriber: Harej. Jan 24 2018, 11:31 PM