Once the PageIssues schema is live, we would like to ingest some of its data into Druid, so that it can be viewed in Superset.
Per a review of the draft schema guidelines and subsequent discussion with the AE team on IRC, this should be possible, with the following fields as dimensions:
- isAnon
- action
- issuesVersion
- issuesSeverity (This field is an array. As discussed on IRC, it should be flattened into a string; see T201873.)
- sectionNumbers (This field is also an array and should be flattened into a string in the same way.)
- editCountBucket
- namespaceId
The sole measure would be the number of actions (events), aggregated by (say) hour.
The following fields should be left out:
- pageTitle
- pageToken
- sessionToken
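
To make the request concrete, here is a rough Python sketch of the intended result, not the actual ingestion job. It assumes each event is a flat dict with an ISO 8601 timestamp in a field called `dt`, and it joins array values with `-`; the field name `dt`, the separator, and the helper names are all assumptions for illustration (the real flattening approach is whatever T201873 settles on).

```python
from collections import Counter
from datetime import datetime

# Fields kept as Druid dimensions, as listed above.
DIMENSIONS = [
    "isAnon", "action", "issuesVersion", "issuesSeverity",
    "sectionNumbers", "editCountBucket", "namespaceId",
]
# Array-valued fields that need to be flattened into a single string.
ARRAY_FIELDS = {"issuesSeverity", "sectionNumbers"}


def flatten(values, sep="-"):
    """Join array elements into one string (separator is an assumption)."""
    return sep.join(str(v) for v in values)


def to_row(event):
    """Build the (hour, dimensions...) key for one event.

    pageTitle, pageToken and sessionToken are never read, so they are dropped.
    """
    ts = datetime.fromisoformat(event["dt"])  # "dt" is an assumed field name
    hour = ts.replace(minute=0, second=0, microsecond=0).isoformat()
    dims = tuple(
        flatten(event[f]) if f in ARRAY_FIELDS else event[f]
        for f in DIMENSIONS
    )
    return (hour,) + dims


def aggregate(events):
    """Count events per hour and dimension combination (the sole measure)."""
    return Counter(to_row(e) for e in events)
```

With this shape, two events that differ only in pageTitle, pageToken or sessionToken and fall in the same hour collapse into one row with a count of 2.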