Update puppet configuration to use GeoLite2 (free) instead of GeoIP2-Enterprise data
Closed, ResolvedPublic

Description

Context

Ideal state: Extension:IPInfo shows baseline data from GeoLite2 and is supplemented by data from Spur (via iPoid-Service).

Background

In T361884: Remove $wgIPInfoGeoIP2EnterprisePath from production config, we removed the $wgIPInfoGeoIP2EnterprisePath from the operations/mediawiki-config repo, because we no longer want to use MaxMind Enterprise GeoIP2 data. The patch rOMWCbacf14164ad3: IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath also set $wgIPInfoGeoLite2Prefix = '/usr/share/GeoIP/GeoLite2-';.

However, looking at this directory on a deployment server, we see:

[kharlan@deploy1002 ~]$ ls -al /usr/share/GeoIP/ | grep GeoLite2-
-rw-r--r--   1 root root  31893928 Nov 30  2020 GeoLite2-City.mmdb

In addition to missing the -ASN and -Country files, the -City file was last downloaded in 2020 (on the deployment server; I'm not sure about app servers).

The step we missed was updating calls to operations/puppet's geoip module to specify that it should download GeoLite2 data and not GeoIP2-Enterprise or GeoIP2 data.

As a result, after deploying rOMWCbacf14164ad3: IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath, we started to receive reports from users (filed in an unrelated task, to further complicate matters) about not seeing data: T363118#9796877. That makes perfect sense as Country and ASN files are missing, and the City file was last updated four years ago. On May 30 we did a temporary revert of rOMWCbacf14164ad3: IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath via rOMWC429ebcfea028: Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath".
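
The manual check above (which GeoLite2 files exist under the configured prefix, and how old they are) can be sketched as a small script. This is only an illustration, not part of IPInfo or operations/puppet: the function name audit_geolite2, the 30-day threshold, and the assumption that file names are built as prefix + "ASN"/"City"/"Country" + ".mmdb" are all hypothetical choices made for the example.

```python
import time
from pathlib import Path

# Assumption: the three databases named in this task, appended to the prefix.
EXPECTED_SUFFIXES = ("ASN", "City", "Country")


def audit_geolite2(prefix, max_age_days=30, now=None):
    """Return (missing, stale) lists of database paths for the given prefix."""
    now = time.time() if now is None else now
    missing, stale = [], []
    for suffix in EXPECTED_SUFFIXES:
        path = Path(f"{prefix}{suffix}.mmdb")
        if not path.is_file():
            # The file was never downloaded to this host.
            missing.append(str(path))
        elif now - path.stat().st_mtime > max_age_days * 86400:
            # The file exists but has not been refreshed recently.
            stale.append(str(path))
    return missing, stale


if __name__ == "__main__":
    missing, stale = audit_geolite2("/usr/share/GeoIP/GeoLite2-")
    print("missing:", missing)
    print("stale:", stale)
```

Run against the state described above, a check like this would be expected to report the ASN and Country files as missing and the City file as stale.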

Proposal

  • Update the operations/puppet repo to download GeoLite2 data instead of GeoIP2-Enterprise data for IPInfo

Consequences

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change #1037528 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[operations/puppet@production] geoip: Use GeoLite2 instead of GeoIP2 Enterprise data

https://gerrit.wikimedia.org/r/1037528

Change #1037531 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[operations/puppet@production] geoip: Download GeoLite2 ASN file

https://gerrit.wikimedia.org/r/1037531

Is there a licensing reason we're trying to show GeoLite2 data rather than GeoIP2 data?

The GeoIP2 databases are up to date on our servers, so in theory could we update the path to point there, and make the accompanying changes to IPInfo (see below)? From our deploy server /usr/share/GeoIP/:

-rw-r--r--   1 root root 109001736 May 29 03:53 GeoIP2-City.mmdb
-rw-r--r--   1 root root   9018527 May 29 03:53 GeoIP2-Connection-Type.mmdb
-rw-r--r--   1 root root   6418250 May 29 03:53 GeoIP2-Country.mmdb
-rw-r--r--   1 root root  14723765 May 29 03:53 GeoIP2-ISP.mmdb

It looks as though we originally intended to show GeoIP2 data on production. IPInfo originally had a GeoIP2InfoRetriever class, but after adding the EnterpriseInfoRetriever, we changed it into the GeoLite2InfoRetriever in this patch: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/IPInfo/+/752257 and reduced its scope to handle only data available from GeoLite2.

kostajh renamed this task from Update puppet configuration to use GeoLite2 (free) instead of GeoIP2 (Enterprise) data to Update puppet configuration to use GeoLite2 (free) instead of GeoIP2-Enterprise data. May 31 2024, 12:57 PM
kostajh updated the task description.
kostajh added a subscriber: Dzahn.

I'm following up to get clarity on licensing issues in an internal chat.

For some reason I am getting a "There’s been a glitch…We’re not quite sure what went wrong. " kind of message from Slack when following this link.

There is a lot of history here. I might be able to help if you have questions about how it's set up.

Also see T302864, T303464, T228533 and T288844

and

https://gerrit.wikimedia.org/r/q/topic:%22geoip%22

We have 2 different classes that download different sets of databases. It can be configured which ones MediaWiki appservers get, and:

"modules/geoip/manifests/data/maxmind/ipinfo.pp:# The difference to geoip::data::maxmind is a different license"

^^

It's a private channel accessible to the TSP team.

Thanks, yes, I did a lot of reading last week :)

The patch should be ready for review. We are not continuing the Enterprise license, so we need to update ipinfo.pp to download GeoLite2 data.

Is there a licensing reason we're trying to show GeoLite2 data rather than GeoIP2 data?

GeoIP2's license says:

  • You may install and use multiple copies of the GeoIP Databases on multiple computers as long as the databases are accessible only by you and your employees.
  • GeoIP Data may not be stored in a way that is accessible or searchable by anyone other than you and your employees.
  • You may not share the GeoIP Data or GeoIP Databases with third parties. Examples of sharing the data include (1) displaying geolocation pairing information ("this IP address, XXX.XX.XX.XX, originates from New York City, NY, USA"); or (2) displaying the geolocation data in aggregate ("1000 IP addresses originated in New York City").

So it doesn't seem like that would work for the IPInfo use case.

Change #1038419 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/IPInfo@master] ServiceWiring: Prefer GeoLite2 configuration

https://gerrit.wikimedia.org/r/1038419

Change #1038419 abandoned by Kosta Harlan:

[mediawiki/extensions/IPInfo@master] ServiceWiring: Prefer GeoLite2 configuration

Reason:

Going with I0489439c6eceb96ef00e0f0ef16ceb168a0067c9 instead

https://gerrit.wikimedia.org/r/1038419

For compatibility reasons, and to avoid any kind of disruption, we decided to include all data files until the final code has been deployed. We will do some puppet code cleanup afterwards.

Change #1037528 merged by Effie Mouzeli:

[operations/puppet@production] [geoip::data::maxmind::ipinfo]: Use GeoLite2 instead of Enterprise data

https://gerrit.wikimedia.org/r/1037528

I think this is done now. I added a comment to T357753: Build next iteration of IPoid using OpenSearch/ElasticSearch as backend, suggesting that in the future we could consider loading the GeoLite2 data into the OpenSearch index that will host Spur data.

@jijiki @CDanis as a follow-up, could someone from SRE please confirm that the GeoLite2 files at /usr/share/GeoIPInfo/GeoLite2- have been successfully updated today?

Yeah, I've verified that all the GeoLite2-* files on all hosts match what's on puppetserver. P64647
(The two mw140x hosts listed there with "Name or service not known" are being reimaged with a different hostname right now in T351074)
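
A verification of this kind can be sketched as a checksum comparison. This is a hypothetical illustration and not the method used for P64647: the helper names geolite2_checksums and mismatches are made up for the example, and it only covers comparing two directory listings, not collecting them from remote hosts.

```python
import hashlib
from pathlib import Path


def geolite2_checksums(directory):
    """Map each GeoLite2-* file name in `directory` to its SHA-256 hex digest."""
    sums = {}
    for path in sorted(Path(directory).glob("GeoLite2-*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large .mmdb files are not loaded whole.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        sums[path.name] = digest.hexdigest()
    return sums


def mismatches(reference, candidate):
    """Return names that are absent from `candidate` or differ from `reference`."""
    return sorted(
        name for name, digest in reference.items()
        if candidate.get(name) != digest
    )
```

Here `reference` would be the checksums taken on the puppetserver and `candidate` those taken on an individual host; an empty result from mismatches means every reference file is present on the host with identical content.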

I am not sure if I should do any cleanup here without checking first, so this is just a comment with no action taken:

We have geoip_update_ipinfo.service failing on puppetmaster1001 (started by the timer of the same name):

Jul 10 04:30:01 puppetmaster1001 systemd[1]: Starting download geoip databases for the IPInfo extension from MaxMind...
Jul 10 04:30:02 puppetmaster1001 geoip_update_ipinfo[22062]: Received an unexpected HTTP status code of 403 from https://updates.maxmi
Jul 10 04:30:02 puppetmaster1001 geoip_update_ipinfo[22062]: Invalid product ID or subscription expired for GeoIP2-Enterprise
Jul 10 04:30:04 puppetmaster1001 systemd[1]: geoip_update_ipinfo.service: Main process exited, code=exited, status=1/FAILURE
Jul 10 04:30:04 puppetmaster1001 systemd[1]: geoip_update_ipinfo.service: Failed with result 'exit-code'.
Jul 10 04:30:04 puppetmaster1001 systemd[1]: Failed to start download geoip databases for the IPInfo extension from MaxMind.
sukhe@puppetmaster1001:~$ systemctl list-units --failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION                                                   
● geoip_update_ipinfo.service loaded failed failed download geoip databases for the IPInfo extension from MaxMind

That is indeed expected, because the subscription has expired. We should remove GeoIP2-Enterprise from the list of product IDs in volatile.pp and geoip.pp. @ssingh is this something you can do?

Change #1053390 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] puppetmaster/puppetserver: remove MaxMind db product GeoIP2-Enterprise

https://gerrit.wikimedia.org/r/1053390

Mentioned in SAL (#wikimedia-operations) [2024-07-10T22:53:31Z] <mutante> puppetmaster1001 - remove Enterprise product ID from MaxMind downloads. sudo systemctl start geoip_update_ipinfo - T366272

Change #1053390 merged by Dzahn:

[operations/puppet@production] puppetmaster/puppetserver: remove MaxMind db product GeoIP2-Enterprise

https://gerrit.wikimedia.org/r/1053390

[cumin1002:~] $ sudo cumin 'C:role::puppetmaster::frontend' 'systemctl list-units --state=failed'
..
[cumin1002:~] $ sudo cumin 'C:role::puppetmaster::backend' 'systemctl list-units --state=failed'
..
[cumin1002:~] $ sudo cumin 'C:role::puppetserver' 'systemctl list-units --state=failed'
..
(6) puppetserver[2001-2003].codfw.wmnet,puppetserver[1001-1003].eqiad.wmnet                                                                                                  
(2) puppetmaster2001.codfw.wmnet,puppetmaster1001.eqiad.wmnet                                                                                                                
(2) puppetmaster2002.codfw.wmnet,puppetmaster1003.eqiad.wmnet                                                                                                                
..
0 loaded units listed.

Started the failed unit on puppetmaster1001 and puppetserver1001.

No more failed units. Fixed.