
Registering on wikibase.cloud exposes internal IP rather than the registering user
Closed, Resolved | Public | 3 Estimated Story Points

Description

From @Addshore

I just noticed one thing before fully signing off on cloud and figured I should write it down so that it can be looked at.

I just registered an account on a new site, and the email confirmation that I got said "Someone, probably you, from IP address 10.108.1.4, ...".

This looks like an internal IP address to me, so something is misconfigured.

I believe this is what you'll need to look at
https://github.com/wbstack/mediawiki/blob/main/dist-persist/wbstack/src/Settings/ProductionCache.php#L3-L7
It probably needs to be configurable

This probably also needs checking on .dev as the range could be different.
Fixing this will, for example, make it harder for spam bots to spam, as the real IP address will actually be known to the tooling fighting spam!

AC

  • Registration email does not expose internal IP but the registering user IP

Event Timeline

Assuming the mentioned setting is the right one ($wgCdnServersNoPurge), the current configuration specifies the subnet '10.8.0.0/14'. I created new accounts on test wikis on staging and production; these were the IPs in the emails (neither falls within that subnet):

# configured
10.8.0.0/14

# from staging
10.112.4.4

# from prod
10.108.3.6

@Tarrow found a way to look up the used IP ranges for each cluster via the following command:
kubectl get ds kube-proxy -n kube-system -o=jsonpath="{.spec.template.spec.containers[0].command}" | grep -Po '\-\-cluster\-cidr=[^ ]*' | cut -d'=' -f2 | tr -d '"]'

results:
staging - 10.112.0.0/14
production - 10.108.0.0/14
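
The mismatch between the configured subnet and the cluster ranges can be checked with a few lines of Python. This is only a sketch using the standard `ipaddress` module and the addresses quoted in this task; the dictionary and function names are made up for illustration.

```python
import ipaddress

# Subnet currently configured for $wgCdnServersNoPurge (from this task).
configured = ipaddress.ip_network("10.8.0.0/14")

# Cluster CIDRs reported by the kubectl command above.
cluster_cidrs = {
    "staging": ipaddress.ip_network("10.112.0.0/14"),
    "production": ipaddress.ip_network("10.108.0.0/14"),
}

# IPs seen in the confirmation emails.
observed = {
    "staging": ipaddress.ip_address("10.112.4.4"),
    "production": ipaddress.ip_address("10.108.3.6"),
}


def coverage(env):
    """Return (in configured subnet, in that cluster's CIDR) for one environment."""
    ip = observed[env]
    return ip in configured, ip in cluster_cidrs[env]


for env in observed:
    in_configured, in_cluster = coverage(env)
    print(f"{env}: {observed[env]} configured={in_configured} cluster={in_cluster}")
```

Both observed addresses fall outside 10.8.0.0/14 but inside the CIDR of their own cluster, which is consistent with the setting needing a per-environment value.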

mediawiki PR which introduces pulling the setting from an env var:
https://github.com/wbstack/mediawiki/pull/262

chart PR using new mediawiki image (needs mw image name after the previous is merged and built):
https://github.com/wbstack/charts/pull/99

using the new chart on staging:
https://github.com/wmde/wbaas-deploy/pull/418

using the new chart on production:
https://github.com/wmde/wbaas-deploy/pull/419

Deployed https://github.com/wmde/wbaas-deploy/pull/418 to staging, but the email is still exposing the internal mediawiki IP.

We just reverted this deployment on production, as it seems to have caused issues with the queryservice-updater.

https://github.com/wmde/wbaas-deploy/commit/edc2b110017c328381186e9542e318a4caa0f90a

This fixed the problem, but we now need to re-think this ticket.

This is the kind of error we are experiencing with mw chart 0.10.6 (seen in logs of a mediawiki-137-fp-app-web after creating a new item on staging):

[error] [exception] [e3f5889f0814aba8363ea838] /wiki/Special:EntityData/Q5.ttl?flavor=dump&nocache=1656591583302   PHP Fatal Error from line 1325 of /var/www/html/w/includes/WebRequest.php: Uncaught MWException: Invalid IP given in XFF ', 10.112.1.12'. in /var/www/html/w/includes/WebRequest.php:1325
Stack trace:
#0 /var/www/html/w/includes/db/MWLBFactory.php(391): WebRequest->getIP()
#1 /var/www/html/w/includes/ServiceWiring.php(513): MWLBFactory::applyGlobalState(Object(Wikimedia\Rdbms\LBFactorySimple), Object(GlobalVarConfig), Object(BufferingStatsdDataFactory))
#2 /var/www/html/w/vendor/wikimedia/services/src/ServiceContainer.php(447): Wikimedia\Services\ServiceContainer::{closure}(Object(MediaWiki\MediaWikiServices))
#3 /var/www/html/w/vendor/wikimedia/services/src/ServiceContainer.php(416): Wikimedia\Services\ServiceContainer->createService('DBLoadBalancerF...')
#4 /var/www/html/w/includes/MediaWikiServices.php(279): Wikimedia\Services\ServiceContainer->getService('DBLoadBalancerF...')
#5 /var/www/html/w/includes/MediaWikiServices.php(881): MediaWiki\MediaWikiServices->getService('DBLoadBalancerF...')
#6 /var/www/html/w/includes/exception/MWExceptionHandler.php(136): MediaWiki\MediaWikiServ
10.112.1.142 - - [30/Jun/2022:12:19:43 +0000] "GET /wiki/Special:EntityData/Q5.ttl?flavor=dump&nocache=1656591583302 HTTP/1.1" 500 284 "-" "Wikidata Query Service Updater Bot"
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}

To me it looks like the Query Service updates can't happen because of the forwarded IP header.

Invalid IP given in XFF ', 10.112.1.12' almost looks like some kind of typo (the IP string starts with ", "), but peeking at the source suggests that this is only how the message is formatted for output, if I understood it correctly, so that's probably not it. Also, right there in the source there is a hint that this may be misconfigured. Just now I see another IP address in the stack trace, 10.112.1.142, and I wonder if the issue is that these two don't match, or rather that only one of them is in the XFF header?

I think I have found the root cause for this problem, which would indeed suggest a misconfiguration. If my understanding is correct, currently, if a query-service-updater pod makes a request to a mediawiki pod, the IP of the QS updater pod is wrongly treated as a proxy, and therefore the "real" IP is empty (because the updater is not a proxy; it is already the origin of the request). We probably don't see this behaviour locally as the pods are running on the same node.
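
To make that explanation concrete, here is a simplified Python model of how a client IP could be resolved from X-Forwarded-For when a whole cluster CIDR is trusted as proxies. It is not MediaWiki's actual code, only a sketch of the behaviour described in this thread; the function name, the exception class, the platform nginx address 10.112.0.5 and the public address 203.0.113.7 are all hypothetical.

```python
import ipaddress


class InvalidForwardedFor(Exception):
    """Raised when a trusted hop forwards an entry that is not an IP."""


def resolve_client_ip(remote_addr, xff_header, trusted_cidrs):
    """Walk the XFF chain from the nearest hop outwards (simplified model)."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_cidrs]

    def is_trusted_proxy(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in trusted)

    # Nearest hop first: the socket peer, then XFF entries right to left.
    chain = [remote_addr]
    if xff_header:
        chain += [part.strip() for part in reversed(xff_header.split(","))]

    client_ip = remote_addr
    for i, hop in enumerate(chain[:-1]):
        if not is_trusted_proxy(hop):
            break  # this hop is not a proxy, so treat it as the client
        forwarded = chain[i + 1]
        try:
            ipaddress.ip_address(forwarded)
        except ValueError:
            # A trusted proxy claims to forward for something that is not an IP.
            raise InvalidForwardedFor(f"Invalid IP given in XFF {xff_header!r}")
        client_ip = forwarded
    return client_ip
```

In this model, with the staging CIDR 10.112.0.0/14 trusted, a request carrying `, 10.112.1.12` fails because the updater pod's address is itself inside the trusted range and the entry "behind" it is empty. With only 10.8.0.0/14 trusted, the same request resolves to the internal peer address, which matches the IPs seen in the emails.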

I wonder if we ideally just want the IP/IP range of the nginx ingress in the $wgCdnServersNoPurge setting? Not sure how to do that without assigning it one statically but I could imagine this being the solution.

@Deniz_WMDE and I talked a bit about this earlier and concluded that the problem may well be due to the creation of a rather mangled-looking X-Forwarded-For (XFF) header. Going by the error log above, we suspect the header really does look like ", 10.112.1.12".

This is probably due to https://github.com/wmde/wbaas-deploy/blob/main/k8s/helmfile/env/local/platform-nginx.nginx.conf#L33 (proxy_set_header X-Forwarded-For "$http_x_forwarded_for, $realip_remote_addr";)

When a request is made "internally" straight from a pod in the cluster to the platform nginx (but skipping the ingress nginx) this $http_x_forwarded_for is empty.

We suspected that the right solution might be to more carefully build the XFF header. However to do this properly it would be nice to have a local reproduction which so far seems to be elusive.
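
The difference between the current concatenation and a more careful construction can be sketched in Python. This only models the string handling, not nginx itself; both function names are invented for illustration, and 203.0.113.7 is a documentation-range address standing in for a real user.

```python
def naive_xff(incoming_xff, peer_addr):
    """Mimic "$http_x_forwarded_for, $realip_remote_addr" string concatenation."""
    return f"{incoming_xff}, {peer_addr}"


def careful_xff(incoming_xff, peer_addr):
    """Append the peer address, dropping empty entries from the incoming header."""
    hops = [hop.strip() for hop in incoming_xff.split(",") if hop.strip()]
    hops.append(peer_addr)
    return ", ".join(hops)


# Internal request that skipped the ingress: no incoming XFF header.
print(repr(naive_xff("", "10.112.1.12")))
print(repr(careful_xff("", "10.112.1.12")))

# Request that came through the ingress with a client address already set.
print(repr(careful_xff("203.0.113.7", "10.112.1.12")))
```

With an empty incoming header the naive version yields the same leading-comma value that appears in the error message, while the careful version yields just the peer address.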

I found that the following PR gave me a local reproduction of this error: https://github.com/wmde/wbaas-deploy/pull/453

[error] [exception] [8edec348b992f34fcd68026d] /wiki/Special:EntityData/Q1.ttl?flavor=dump&nocache=1657043054938   PHP Fatal Error from line 1325 of /var/www/html/w/includes/WebRequest.php: Uncaught MWException: Invalid IP given in XFF ', 172.17.0.1'. in /var/www/html/w/includes/WebRequest.php:1325
Stack trace:
#0 /var/www/html/w/includes/db/MWLBFactory.php(391): WebRequest->getIP()
#1 /var/www/html/w/includes/ServiceWiring.php(513): MWLBFactory::applyGlobalState(Object(Wikimedia\Rdbms\LBFactorySimple), Object(GlobalVarConfig), Object(BufferingStatsdDataFactory))
#2 /var/www/html/w/vendor/wikimedia/services/src/ServiceContainer.php(447): Wikimedia\Services\ServiceContainer::{closure}(Object(MediaWiki\MediaWikiServices))
#3 /var/www/html/w/vendor/wikimedia/services/src/ServiceContainer.php(416): Wikimedia\Services\ServiceContainer->createService('DBLoadBalancerF...')
#4 /var/www/html/w/includes/MediaWikiServices.php(279): Wikimedia\Services\ServiceContainer->getService('DBLoadBalancerF...')
#5 /var/www/html/w/includes/MediaWikiServices.php(881): MediaWiki\MediaWikiServices->getService('DBLoadBalancerF...')
#6 /var/www/html/w/includes/exception/MWExceptionHandler.php(136): MediaWiki\MediaWikiServi
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}

I created a PR for local & staging to try this other nginx configuration variable out https://github.com/wmde/wbaas-deploy/pull/454

So to review how this is working we can do the following:

  • Check that the IP address that we see in password reset (and registration) emails is actually the user's IP address
  • Check that we are able to make an edit (e.g. create a new item) and then see it reflected in the QueryService
  • Check that there are no errors in the queryservice logs
  • Check that there are no errors in the mediawiki logs (like: https://phabricator.wikimedia.org/T309687#8052888)

Note that the most recent PR doesn't use the new mediawiki image. That image must also be deployed, otherwise we expect that the first bullet point won't hold, since the original change is no longer in place after it was reverted (https://github.com/wmde/wbaas-deploy/commit/edc2b110017c328381186e9542e318a4caa0f90a).

Checked the first 3 points on the list in the comment above and everything looks good. It works with no errors.
Bumping mediawiki on production and adding the fix for production: https://github.com/wmde/wbaas-deploy/pull/457/files

Tarrow added a subscriber: Evelien_WMDE.

@Evelien_WMDE since Leszek is now on holiday I think you could verify this. The procedure would be to use the password reset process on one of your wikibase.cloud wikis, confirm that the IP address shown in the email is actually yours (e.g. by looking on whatismyip.com or similar).

Tested it and it works beautifully, ticket can be moved to Done

@Evelien_WMDE: The project tag got archived and this open task has no other active project tags. Could you please either add an active project tag so this task can be found, or update the task status? Thanks a lot!