As part of analyzing the impact of archive.today on Wikipedia readers, following the enwiki blacklisting of that source and other community responses, we would like to better understand the rate of outbound traffic through citation links.
To do this, we plan to collect data, for a limited period, on which citation links logged-out readers click, across enwiki and other wikis. We'll update this ticket with the time period and the specific wikis.
Our current plan is to do this through client-side instrumentation (JS) rather than a server-side redirector, since the measurement doesn't need to be exact and shouldn't change what happens when a reader clicks a link.
Prior research
https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Citation_Usage
T171231: [Objective 11.1.2] Research on citation/external link usage
Proposed implementation
Use Prometheus instead of Event Logging, so that events are not associated with an individual user. We do not need or want to connect clicks with any individual session; we are only interested in aggregate data.
- Use StatsFactory to emit events to Prometheus, bucketed by domain
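As a rough illustration of the "bucketed by domain" idea, here is a minimal TypeScript sketch of the client-side logic. It is not the planned implementation: the metric name `citation_link_clicks_total`, the `EmitCounter` callback, and the helper names are all hypothetical, and the actual call that forwards counters to Prometheus is deliberately left abstract. The only label emitted is the domain, so nothing ties a click to a user or session.

```typescript
// Hypothetical emitter signature. The real call that forwards a counter
// to Prometheus is not specified in this ticket, so it is injected.
type EmitCounter = (metric: string, labels: Record<string, string>) => void;

// Hypothetical metric name, for illustration only.
const METRIC = "citation_link_clicks_total";

/**
 * Reduce a link target to a domain bucket.
 * Returns null when the href is not an absolute http(s) URL.
 */
function domainBucket(href: string): string | null {
  let url: URL;
  try {
    url = new URL(href);
  } catch {
    return null; // relative, protocol-relative, or malformed href
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return null;
  }
  // URL parsing already lowercases the hostname; dropping a leading
  // "www." is an assumption so that both variants share one bucket.
  return url.hostname.replace(/^www\./, "");
}

/** Emit one aggregate counter increment for a clicked citation link. */
function recordCitationClick(href: string, emit: EmitCounter): void {
  const domain = domainBucket(href);
  if (domain !== null) {
    // Only the domain is attached; no page, user, or session data.
    emit(METRIC, { domain });
  }
}
```

In the browser, `recordCitationClick` would be called from a click listener on citation links, passing the link's `href`. How citation links are selected and which function actually emits the counter are left out here because this ticket does not specify them.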