
Fetch information about an IP from the MaxMind GeoIP2 Enterprise database
Closed, Resolved, Public

Description

The MaxMind GeoIP2 Enterprise database provides

…geolocation data such as country, region, state, city, ZIP/postal code, and additional intelligence such as confidence factors, ISP, domain, and connection type.

At the time of writing (16th August, 2021), IP Info reads from the separate City, ASN, ISP, and Connection-Type databases. The extension needs to be updated to read from the Enterprise database when it is available.

Notes

  • The Enterprise database is not available to everyone, so the existing code that reads the separate databases (i.e. MediaWiki\IPInfo\GeoIP2InfoRetriever) should not be removed

Event Timeline

ARamirez_WMF renamed this task from Fetch information about an IP from the MaxMind GeoIP2 Enterprise database to Investigate: Fetch information about an IP from the MaxMind GeoIP2 Enterprise database [8H]. Sep 1 2021, 4:49 PM

I looked into this a little this morning (UTC+1) as we'd discussed a couple of design patterns that we could use to model this.

Firstly, it turns out that the class we use to read the MaxMind databases, GeoIp2\Database\Reader, Just Works™ with the Enterprise database. It exposes a slightly different method, though, which returns something that is GeoIp2\Model\City-like but whose traits also contain the ASN, ISP, and connection type information that we display, e.g.

foo.php
require __DIR__ . '/vendor/autoload.php';

use GeoIp2\Database\Reader;

$reader = new Reader( __DIR__ . '/GeoIP2-Enterprise.mmdb' );

$record = $reader->enterprise( 'xxx.xxx.xxx.xxx' );

$coordinates = [ $record->location->longitude, $record->location->latitude ];
$asn = $record->traits->autonomousSystemNumber;
$organization = $record->traits->autonomousSystemOrganization;

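$locations = [];
$locations []= [ $record->city->geonameId, $record->city->name ];
// Append with array_merge(): `+=` is a union by key, so it would keep the
// existing key 0 and silently drop the first subdivision.
$locations = array_merge(
   $locations,
   array_map(
      fn ( $subdivision ) => [ $subdivision->geonameId, $subdivision->name ],
      $record->subdivisions
   )
);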

$isp = $record->traits->isp;
$connectionType = $record->traits->connectionType;

$isAnonymous = $record->traits->isAnonymous;
$isAnonymousVpn = $record->traits->isAnonymousVpn;
$isPublicProxy = $record->traits->isPublicProxy;
$isResidentialProxy = $record->traits->isResidentialProxy;
$isLegitimateProxy = $record->traits->isLegitimateProxy;
$isTorExitNode = $record->traits->isTorExitNode;
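A note on collecting the city and subdivisions into a single list: PHP's array `+` operator is a union by key, whereas array_merge() appends and renumbers numeric keys. The following is a minimal, standalone sketch of the difference; the sample IDs and names are made up for illustration and do not come from any database.

```php
<?php
// Made-up sample data standing in for the city and subdivision records.
$city = [ [ 1, 'City' ] ];
$subdivisions = [ [ 2, 'Region' ], [ 3, 'State' ] ];

// Union by key: key 0 already exists on the left-hand side, so
// [ 2, 'Region' ] is dropped and only key 1 is added.
$union = $city + $subdivisions;

// Append: numeric keys are renumbered, so every entry survives.
$merged = array_merge( $city, $subdivisions );
```

With the union, the first subdivision is lost; with array_merge() all three entries are kept in order.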

AIUI clients should be unaware of which database we're reading from. Given the above and this constraint, my suggestion would be to extend MediaWiki\IPInfo\InfoRetriever\GeoIp2InfoRetriever to specialise it:

IPInfo/src/InfoRetriever/GeoIp2EnterpriseInfoRetriever.php
namespace MediaWiki\IPInfo\InfoRetriever;

use GeoIp2\Database\Reader;
use MediaWiki\IPInfo\Info\Info;

class GeoIp2EnterpriseInfoRetriever extends GeoIp2InfoRetriever {
    public function retrieveFromIP( string $ip ): Info {
        $reader = new Reader( $this->options->get( 'IPInfoGeoIP2EnterprisePath' ) );

        return new Info(
            /* ... */
        );
    }
}

and to encapsulate the logic of picking which retriever class to instantiate in the 'IPInfoGeoIp2InfoRetriever' service factory function in IPInfo/src/ServiceWiring.php:

IPInfo/src/ServiceWiring.php
return [

/* ... */

    'IPInfoGeoIp2InfoRetriever' => function ( MediaWikiServices $services ): GeoIp2InfoRetriever {
        $config = $services->getMainConfig();

        // The GeoIP2 Enterprise database takes precedence
        if ( $config->has( 'IPInfoGeoIP2EnterprisePath' ) ) {
             /* ... */
        } else {
             /* ... */
        }
    },

/* ... */

];
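To make the precedence rule concrete, here is a rough, self-contained sketch of the selection logic outside of MediaWiki. The two classes are empty stand-ins for the real retrievers, and pickGeoIp2Retriever() and the plain config array are hypothetical; they only illustrate the branching and are not what the patch does.

```php
<?php
// Empty stand-ins for illustration; the real classes live in
// MediaWiki\IPInfo\InfoRetriever and take constructor arguments.
class GeoIp2InfoRetriever {
}

class GeoIp2EnterpriseInfoRetriever extends GeoIp2InfoRetriever {
}

/**
 * Hypothetical helper: choose a retriever from a plain config array.
 * Callers only ever see the base type, so they stay unaware of which
 * database is being read.
 */
function pickGeoIp2Retriever( array $config ): GeoIp2InfoRetriever {
    // The GeoIP2 Enterprise database takes precedence when a path is set.
    if ( !empty( $config['IPInfoGeoIP2EnterprisePath'] ) ) {
        return new GeoIp2EnterpriseInfoRetriever();
    }

    return new GeoIp2InfoRetriever();
}
```

One design choice in this sketch is to treat an empty or false value the same as a missing key, so that a wiki can disable the Enterprise path without unsetting it. Whether the real service should behave that way is an open question.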

Change 722826 had a related patch set uploaded (by STran; author: STran):

[mediawiki/extensions/IPInfo@master] [WIP] Implement enterprise db info retriver

https://gerrit.wikimedia.org/r/722826

@phuedx did most of the work already but I've uploaded a proof of concept patch to go along with it. Some notes:

  • IPInfoGeoIP2EnterprisePath refers to the folder, and the reader itself is responsible for supplying the file name*
  • I noticed that the Enterprise db didn't have proxy information. For that we'll need to use the Anonymous IP database, which we have access to. This is reflected in the patch.
  • As part of this task we should also 1. update the documentation and 2. make sure these files exist on the relevant servers

*I think it's outside the scope of this ticket, but we might want to refactor the non-enterprise reader to make IPInfoGeoIP2Path refer to only the folder as well instead of the folder + prefix
**We could also maybe make database names a config option instead of being hard-coded into the reader
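As a rough illustration of the folder-plus-file-name idea in the notes above, a sketch along these lines joins a configured folder with a hard-coded database name. geoIp2DatabasePath() and both constants are hypothetical names, and the file names are assumptions based on MaxMind's usual naming rather than values taken from the patch.

```php
<?php
// Assumed file names; the patch may hard-code different ones.
const ENTERPRISE_DB = 'GeoIP2-Enterprise.mmdb';
const ANONYMOUS_IP_DB = 'GeoIP2-Anonymous-IP.mmdb';

/**
 * Hypothetical helper: build the full path to a database file from the
 * configured folder and a file name.
 */
function geoIp2DatabasePath( string $folder, string $fileName ): string {
    // Strip trailing slashes so 'dir' and 'dir/' give the same result.
    return rtrim( $folder, '/' ) . '/' . $fileName;
}

// The folder here is a placeholder, not a real deployment path.
$enterprisePath = geoIp2DatabasePath( '/example/geoip/', ENTERPRISE_DB );
$anonymousIpPath = geoIp2DatabasePath( '/example/geoip', ANONYMOUS_IP_DB );
```

If database names became a config option, as the second footnote suggests, the constants would be replaced by configured values while the helper stayed the same.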

> As part of this task we should also 1. update the documentation and 2. make sure these files exist on the relevant servers

#2 is tracked in T288844: Update MaxMind GeoIP2 license key and product IDs for application servers.

> *I think it's outside the scope of this ticket, but we might want to refactor the non-enterprise reader to make IPInfoGeoIP2Path refer to only the folder as well instead of the folder + prefix

This is tracked in T289361: Rename $wgIPInfoGeoIP2Path or hard code the GeoLite2/GeoIP2- prefix [S]

> **We could also maybe make database names a config option instead of being hard-coded into the reader

Perhaps this could be discussed as part of T289361: Rename $wgIPInfoGeoIP2Path or hard code the GeoLite2/GeoIP2- prefix [S]?

STran renamed this task from Investigate: Fetch information about an IP from the MaxMind GeoIP2 Enterprise database [8H] to Fetch information about an IP from the MaxMind GeoIP2 Enterprise database. Sep 28 2021, 8:37 AM

Removing the investigate tag since there's a patch associated with this ticket.

Change 722826 merged by jenkins-bot:

[mediawiki/extensions/IPInfo@master] Implement enterprise db info retriver

https://gerrit.wikimedia.org/r/722826

Verified that IPInfo fetches data from the downloaded Enterprise database on my local instance.
Example screenshots are below:

Screen Shot 2021-10-06 at 3.53.38 PM.png (1×2 px, 438 KB)

Screen Shot 2021-10-06 at 4.33.40 PM.png (1×2 px, 712 KB)

Change 729999 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/IPInfo@master] services: Fix IPInfoGeoIP2InfoRetriever

https://gerrit.wikimedia.org/r/729999

Change 729999 merged by jenkins-bot:

[mediawiki/extensions/IPInfo@master] services: Fix IPInfoGeoIP2InfoRetriever

https://gerrit.wikimedia.org/r/729999