The Growth team is currently working on a mentor dashboard. We should decide how the mentor dashboard should be instrumented (if at all). While thinking about this, I realized that some information is already available from the default telemetry we use:
- https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest has information about which special page (if any) a request was for, so we'd be able to see how many times the page was opened
- the event infrastructure already logs API calls to the data lake, so we'd be able to see how people use the filtering API (theoretically at the parameter level)
The only issue is that those datasets are really big (especially the webrequest one), and I wasn't able to find a way to use anything smaller than the webrequest table (pageviews are refined into their own table, but this doesn't happen for special page views AFAICS). That means it might be more efficient in terms of cluster time to just create another schema and log data from the dashboard ourselves.
On the other hand, if the only thing we're interested in is how often the dashboard gets visited, it might be more efficient to generate a "special page pageviews" dataset from the already existing data, and use it for other "not so important" special pages we have (community configuration comes to mind), as creating a schema with a single event ("impression") doesn't sound like an ideal solution to me.
- Which data will we want from the mentor dashboard?
- How feasible would it be to create a table similar to pageviews, but for special pages?
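
To make the second question a bit more concrete, here is a rough sketch of the aggregation a "special page pageviews" table would need: drop requests that are not for a special page, then count per wiki, page and day. The record shape (`dt`, `wiki`, `special_page`) and the sample rows are made up for illustration and are not the real webrequest schema; a real job would presumably run on the cluster against the webrequest table rather than in plain Python.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional

# Hypothetical record shape: each request is a mapping with
#   "dt"           ISO 8601 timestamp string
#   "wiki"         wiki identifier
#   "special_page" name of the special page, or None for other requests
# These field names are assumptions, not the actual webrequest columns.
Request = Mapping[str, Optional[str]]


def special_page_views(requests: Iterable[Request]) -> Counter:
    """Count views per (wiki, special page, day), skipping other requests."""
    counts: Counter = Counter()
    for row in requests:
        page = row.get("special_page")
        if not page:
            continue  # not a special page request
        day = row["dt"][:10]  # keep only the YYYY-MM-DD part
        counts[(row["wiki"], page, day)] += 1
    return counts


# Made-up sample rows, only to show the shape of the result.
sample = [
    {"dt": "2021-06-01T10:00:00Z", "wiki": "cswiki", "special_page": "MentorDashboard"},
    {"dt": "2021-06-01T11:30:00Z", "wiki": "cswiki", "special_page": "MentorDashboard"},
    {"dt": "2021-06-01T12:00:00Z", "wiki": "cswiki", "special_page": None},
    {"dt": "2021-06-02T09:00:00Z", "wiki": "cswiki", "special_page": "CommunityConfiguration"},
]

views = special_page_views(sample)
for (wiki, page, day), n in sorted(views.items()):
    print(wiki, page, day, n)
```

The output would be one small row per wiki, page and day, so anything that only needs visit counts could read it instead of scanning webrequest each time.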