The Anti-Harassment team needs to access the MaxMind database to provide extended information about IPs as part of our work on IP masking. We have licensing permission from MaxMind to use the data in this way.
The MaxMind DB file is available on all Wikimedia application servers. We'll add a config variable to the IP Info extension that holds the filesystem path to the MaxMind DB file, and we'll use MaxMind's geoip2/geoip2 library to read and query it. What will the file path be in production?
Alternatively, we could create a microservice that IP Info could access over the local network.
The outstanding question is how we write code that integrates with that data in local environments and other testing areas. Should we mock the data structure with some fake data? Does the GeoIP Lite database use the same structure? Given its open license, we could more easily copy that data around. What other options are there for testing setups?
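To make the "mock the data structure" option concrete, here is a minimal sketch of an in-memory stand-in that code could be written against in local and test environments. It is in Python only to show the shape of the idea; IP Info itself would do the equivalent in PHP. Everything in it is an assumption: the class name `FakeGeoLookup`, the `lookup` method, and the record fields are hypothetical and are not confirmed to match the real MaxMind record structure. The addresses come from the ranges reserved for documentation.

```python
import ipaddress
from typing import Optional

# Hypothetical fake records. The field names are illustrative only and are
# not confirmed to match the real MaxMind record structure.
FAKE_RECORDS = {
    "192.0.2.0/24": {"country": "Testland", "city": "Example City"},
    "192.0.2.128/25": {"country": "Testland", "city": "Sample Town"},
    "2001:db8::/32": {"country": "Docland", "city": None},
}


class FakeGeoLookup:
    """In-memory stand-in for a file-backed IP database (hypothetical)."""

    def __init__(self, records: dict):
        # Parse the CIDR keys once so lookups only compare networks.
        self._networks = [
            (ipaddress.ip_network(cidr), record)
            for cidr, record in records.items()
        ]

    def lookup(self, ip: str) -> Optional[dict]:
        """Return the record for the most specific matching network, or None."""
        address = ipaddress.ip_address(ip)
        matches = [
            (network, record)
            for network, record in self._networks
            if network.version == address.version and address in network
        ]
        if not matches:
            return None
        # Prefer the longest prefix, i.e. the narrowest matching network.
        return max(matches, key=lambda match: match[0].prefixlen)[1]
```

If the extension only talks to the database through a small lookup interface like this, tests can inject the fake while production wires in the real reader. Whether the fake records should mirror the real structure exactly is still part of the open question above.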
Given that this is a nicely indexed flat-file database, I'm assuming the performance impact of a low-use feature is negligible. (At least to begin with, the feature will only be available to checkusers; it may become more widely available in the future.) Is that assumption correct?