Problem
As part of T251602, the geolocation and AS number (ASN) of IP addresses need to be collected.
We (Anti-Harassment) originally planned to use external web services to fetch information about an IP address (T248525). While this mechanism is straightforward to implement, it prevents the product from querying logged actions by this data. It also prevents aggregation of any kind.
However, as @Reedy explained in T248525#6101785, the Wikimedia Foundation is already paying for and using MaxMind's proprietary dataset.
Proposed Solution
Instead of looking up information on IP addresses on-demand, it would be preferable in many ways to look up that information when a logged action (edit, etc.) is made and save the data into the database. This would be similar to the way CheckUser records User Agents.
This could be done either from within MediaWiki itself or by another service (such as whatever proxy / cache accepts incoming requests) that passes the data along as a request header.
Ideally, what gets saved is a value that can be localized: either the ASN and GeoNames ID, the converted Wikidata ID, or both.
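As a rough illustration of the idea, the sketch below looks up the ASN and GeoNames ID when an action is logged, stores them next to the action, and then aggregates on the stored values. It is written in Python purely for brevity and is not a proposed implementation. Everything named here is an assumption: the `FAKE_GEO_DATA` table stands in for the MaxMind dataset, the `logged_action` table and its columns are hypothetical, and the networks, ASNs, and GeoNames IDs are illustrative values only.

```python
import ipaddress
import sqlite3

# Hypothetical stand-in for the MaxMind dataset: network -> (ASN, GeoNames ID).
# The networks and ASNs come from ranges reserved for documentation;
# all values are illustrative only.
FAKE_GEO_DATA = {
    ipaddress.ip_network("192.0.2.0/24"): (64500, 2950159),
    ipaddress.ip_network("198.51.100.0/24"): (64501, 5391959),
}


def lookup(ip):
    """Return (asn, geoname_id) for an IP string, or (None, None) if unknown."""
    addr = ipaddress.ip_address(ip)
    for network, data in FAKE_GEO_DATA.items():
        if addr in network:
            return data
    return (None, None)


def make_db():
    """Create an in-memory database with a hypothetical logged_action table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE logged_action ("
        "action TEXT, ip TEXT, asn INTEGER, geoname_id INTEGER)"
    )
    return db


def record_action(db, action, ip):
    """Resolve the IP once, at log time, and save the result with the action."""
    asn, geoname_id = lookup(ip)
    db.execute(
        "INSERT INTO logged_action (action, ip, asn, geoname_id) "
        "VALUES (?, ?, ?, ?)",
        (action, ip, asn, geoname_id),
    )


def actions_per_asn(db):
    """Aggregate logged actions by ASN, which on-demand lookups cannot do."""
    return db.execute(
        "SELECT asn, COUNT(*) FROM logged_action GROUP BY asn ORDER BY asn"
    ).fetchall()
```

Because the identifiers are stored rather than display names, the localized label (for example, a city name) can be resolved at display time from the GeoNames or Wikidata ID.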
Data Retention
Data retention will be the same as it is for IP addresses now:
Not logged-in | Indefinitely |
Logged-in | 90 days in CheckUser |
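If the data were stored in its own table, the retention rule above could be enforced with a periodic purge. The following is a minimal sketch in Python, assuming a hypothetical `logged_action_geo` table with a `logged_in` flag and a `logged_at` Unix timestamp; none of these names come from CheckUser or MediaWiki.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# 90-day retention for logged-in users, per the table above.
RETENTION = timedelta(days=90)


def make_geo_db():
    """Create an in-memory database with a hypothetical logged_action_geo table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE logged_action_geo ("
        "asn INTEGER, geoname_id INTEGER, "
        "logged_in INTEGER, logged_at INTEGER)"
    )
    return db


def purge_expired(db, now):
    """Delete rows for logged-in users older than 90 days; return the count.

    Rows for users who were not logged in are kept indefinitely,
    so they are never matched by this query.
    """
    cutoff = int((now - RETENTION).timestamp())
    cursor = db.execute(
        "DELETE FROM logged_action_geo WHERE logged_in = 1 AND logged_at < ?",
        (cutoff,),
    )
    return cursor.rowcount
```

A scheduled job could call `purge_expired(db, datetime.now(timezone.utc))` at whatever interval fits the existing CheckUser cleanup.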
Questions