The wikimedia_nets Varnish ACL includes the 10.68.0.0/16 range used by the old nova-network OpenStack region, but it does not include the new 172.16.0.0/21 range that is used by the "eqiad1-r" OpenStack region that now hosts the majority of Cloud VPS traffic. This is also starting to effect Toolforge tools as they migrate from the old Ubuntu Trusty grid to the new Debian Stretch grid.
Kiwix runs Wikimedia wikis scrapers on 5 boxes of the Wikimedia cloud VPS:
We have been doing so since years but we have recently migrated our (mostly manual) system to a fully automated one called the Zimfarm. The Zimfarm is based on Docker and basically schedules and runs Docker containers to scrape the wikis. These containers run an instance of mwoffliner, a Mediawiki dedicated scraper which deals with the Mediawiki API to retrieve the necessary information and content.
Unfortunately, and for an unknown reason, it seems that we face now, a lot of HTTP 429 responses from the Wikimedia API. This basically stops us to scrape the wikis properly. I don't know why we have this problem now: on our side the software is the same and I assume nothing has changed in the configuration on Wikimedia side. Maybe this is the migration to the new VM OS and/or the new VPS datacenter we had to do in December?
Does someone has an explanation? a solution?
Our ticket: https://github.com/openzim/mwoffliner/issues/496