Context:
Keren discovered that a lot of interesting insights can be gleaned from the logs, in particular regarding the age of the instance and its overall activity.
Assumption:
My assumption is that every single data point and edit comes with a log entry, and thus storing the complete raw log data wouldn't be feasible. The following solution is based on that assumption. Please correct me if I'm misunderstanding this.
Goal:
Gather insights about instance age and activity from the logs of instances.
Acceptance criteria:
- generate an age of instance metric based on the first log date
- capture the first X log entries and analyze for:
a) ratio of new content to edits
b) ratio of human edits to bot edits
- capture the most recent X log entries and analyze for:
a) ratio of new content to edits
b) ratio of human edits to bot edits
c) most recent bot activity by date
d) most recent human activity by date
- note any other interesting metadata available in the logs in comments on this ticket, for consideration as potential inclusions
X = a reasonable number of entries given the cost of storing this data. As a gut check, something like 25 to 100; ideally enough to have some statistical relevance.
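To make the criteria above concrete, here is a rough Python sketch of how the metrics could be computed from the two captured windows (first X and most recent X entries). It is only an illustration: the `LogEntry` shape, its field names (`timestamp`, `is_new_content`, `is_bot`), and the function names are assumptions, not the actual log format, and how the entries are fetched is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class LogEntry:
    # Hypothetical shape: the real log fields may look different.
    timestamp: datetime
    is_new_content: bool  # True = new content, False = edit of existing content
    is_bot: bool          # True = bot, False = human


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is 0."""
    return numerator / denominator if denominator else None


def summarize(entries):
    """Compute the ratios and latest-activity dates for one window of entries."""
    new = sum(e.is_new_content for e in entries)
    bots = sum(e.is_bot for e in entries)
    bot_times = [e.timestamp for e in entries if e.is_bot]
    human_times = [e.timestamp for e in entries if not e.is_bot]
    return {
        "new_to_edit_ratio": ratio(new, len(entries) - new),
        "human_to_bot_ratio": ratio(len(entries) - bots, bots),
        # Only required for the "most recent X" window in the criteria.
        "last_bot_activity": max(bot_times, default=None),
        "last_human_activity": max(human_times, default=None),
    }


def instance_metrics(first_entries, recent_entries, now=None):
    """Combine instance age with summaries of the first and most recent X entries."""
    now = now or datetime.now(timezone.utc)
    first_log = min((e.timestamp for e in first_entries), default=None)
    return {
        "age_days": (now - first_log).days if first_log else None,
        "first_x": summarize(first_entries),
        "recent_x": summarize(recent_entries),
    }


# Made-up entries purely to show the call; not real log data.
t0 = datetime(2020, 1, 1, tzinfo=timezone.utc)
demo = instance_metrics(
    first_entries=[
        LogEntry(t0, True, False),
        LogEntry(t0 + timedelta(days=1), False, True),
        LogEntry(t0 + timedelta(days=2), True, False),
    ],
    recent_entries=[
        LogEntry(t0 + timedelta(days=10), False, True),
        LogEntry(t0 + timedelta(days=11), False, False),
    ],
    now=t0 + timedelta(days=30),
)
```

Only the two small windows are passed in, in line with the assumption that the complete raw logs are too large to store. A ratio comes back as `None` when its denominator is zero (for example, a window with no bot edits), so that case would need a decision on how to store it.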