Executive summary
- There is an HTTP caching proxy that sits in front of docker-registry.wikimedia.org.
- The MediaWiki (MW) image build process may update existing tags so that they point to a new image.
- HTTP clients (such as docker) which query the list of tags and/or the manifest for a tag may receive out-of-date information due to the cache.
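To make the affected requests concrete, here is a minimal sketch that classifies registry request paths by whether a cached response can go stale. It assumes the path shapes of the Docker Registry HTTP API v2 (`/v2/<name>/tags/list`, `/v2/<name>/manifests/<reference>`, `/v2/<name>/blobs/<digest>`). The function name and the repository name `example/image` are hypothetical and used only for illustration.

```python
import re

# Path shapes assumed from the Docker Registry HTTP API v2.
_TAGS_LIST = re.compile(r"^/v2/(?P<name>.+)/tags/list$")
_MANIFEST = re.compile(r"^/v2/(?P<name>.+)/manifests/(?P<ref>[^/]+)$")
# A digest reference looks like "<algorithm>:<hex>"; tags cannot contain ":".
_DIGEST = re.compile(r"^[a-z0-9]+:[0-9a-f]{32,}$")


def is_mutable_registry_path(path: str) -> bool:
    """Return True if a cached response for this path can become out of date."""
    if _TAGS_LIST.match(path):
        # The tag list grows or changes whenever an image is pushed.
        return True
    manifest = _MANIFEST.match(path)
    if manifest:
        # A manifest fetched by digest never changes; one fetched by tag can.
        return not _DIGEST.match(manifest.group("ref"))
    # Everything else (for example blobs, which are addressed by digest)
    # is treated as immutable in this sketch.
    return False


if __name__ == "__main__":
    digest = "sha256:" + "a" * 64
    for p in (
        "/v2/example/image/tags/list",
        "/v2/example/image/manifests/production",
        "/v2/example/image/manifests/" + digest,
        "/v2/example/image/blobs/" + digest,
    ):
        print(p, "->", "mutable" if is_mutable_registry_path(p) else "immutable")
```

Only the first two kinds of request are a problem for a caching proxy; digest-addressed content can be cached indefinitely.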
This document is only concerned with commits to MediaWiki core, extensions, etc. that are merged into “train branches” (e.g. wmf/1.37.0-wmf.4), and with commits to operations/mediawiki-config.
Workflow:
A change to mediawiki is merged into a train branch (e.g. wmf/1.37.0-wmf.4):
- The merge triggers a single-version image build.
- The single-version image will be tagged with the train branch (e.g. wmf-1.37.0-wmf.4) and pushed to the registry, probably updating an existing tag.
- The multiversion MW image build process is triggered.
Multiversion MW image build process:
- Read wikiversions.json from operations/mediawiki-config.
- Copy in the contents of each unique single-version image mentioned in wikiversions.json.
- Push the constructed multiversion image to the registry with tag production, probably updating an existing tag.
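The first two steps above can be sketched as follows. This is an illustration under stated assumptions, not the actual build code: it assumes wikiversions.json maps wiki database names to values of the form `php-1.37.0-wmf.4`, and that the corresponding single-version image tag is `wmf-` followed by the version, as in the tag example earlier. The function name is hypothetical.

```python
import json
from typing import List


def single_version_tags(wikiversions_json: str) -> List[str]:
    """Return the unique single-version image tags referenced by wikiversions.json.

    Assumptions (not confirmed by this document): values look like
    "php-1.37.0-wmf.4" and the image tag is "wmf-" + "1.37.0-wmf.4".
    """
    wikiversions = json.loads(wikiversions_json)
    versions = set()
    for value in wikiversions.values():
        # Strip the assumed "php-" prefix to get the bare version string.
        version = value[len("php-"):] if value.startswith("php-") else value
        versions.add(version)
    # Many wikis share a version, so de-duplicate and sort for stable output.
    return sorted("wmf-" + version for version in versions)


if __name__ == "__main__":
    sample = json.dumps({
        "aawiki": "php-1.37.0-wmf.3",
        "dewiki": "php-1.37.0-wmf.4",
        "enwiki": "php-1.37.0-wmf.4",
    })
    print(single_version_tags(sample))
```

Each returned tag is then resolved against the registry, which is where the caching issue described in the next section comes in.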
Problem:
Suppose a multiversion image build containing wmf.3 and wmf.4 ran a few hours ago, and a change has now been backported to wmf.4. The new wmf.4 single-version image is built and pushed to the registry, and the multiversion build process then tries to copy the files from the image tagged wmf.4. Docker makes an HTTP request to the registry to check whether a new image needs to be downloaded. The HTTP proxy recognizes a URL it has seen before and returns the cached response, which still points to the old wmf.4 single-version image. The multiversion build process therefore proceeds with the out-of-date image, and the backport is missing from the result.
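The failure mode can be reproduced with a toy model. The sketch below is not the real registry or proxy: `Registry` and `CachingProxy` are hypothetical in-memory stand-ins, the digests are placeholders, and the cache here never expires, whereas a real proxy would normally apply some expiry policy.

```python
class Registry:
    """Toy registry: maps a tag to the digest of the image it points to."""

    def __init__(self):
        self._tags = {}

    def push(self, tag, digest):
        # Pushing an existing tag silently re-points it at the new image.
        self._tags[tag] = digest

    def resolve(self, tag):
        return self._tags[tag]


class CachingProxy:
    """Toy proxy: caches the first answer for each tag and never refreshes it."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._cache = {}

    def resolve(self, tag):
        if tag not in self._cache:
            self._cache[tag] = self._upstream.resolve(tag)
        return self._cache[tag]


registry = Registry()
proxy = CachingProxy(registry)
tag = "wmf-1.37.0-wmf.4"

registry.push(tag, "sha256:old")   # original single-version build
proxy.resolve(tag)                 # earlier multiversion build fills the cache

registry.push(tag, "sha256:new")   # backport: same tag, new image
stale = proxy.resolve(tag)         # what the multiversion build sees via the proxy
fresh = registry.resolve(tag)      # what the registry actually holds

print("via proxy:", stale)
print("direct:   ", fresh)
```

In this model the proxy keeps answering with the old digest after the tag has moved, which is the situation the multiversion build runs into.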
Possible solutions:
- Don’t cache certain accesses to the registry. It looks like https://gerrit.wikimedia.org/r/c/operations/puppet/+/691108 covers this option.
- Provide a way to invalidate the cache for tag and manifest URLs.
- Make all registry accesses through docker-registry.discovery.wmnet (suggested by @Joe) instead of docker-registry.wikimedia.org.
  - Credentials are required for read access to docker-registry.discovery.wmnet.
  - The problem of out-of-date information for offsite read-only access still remains.
- Change the image build process so that tags are never reused. The single-version build process would then need to communicate new tags to the multiversion build process, because asking the registry for the list of tags could return out-of-date cached information. This should be achievable if the build process is allowed to push and +2 a commit to operations/mediawiki-config.
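As an illustration of the last option, the sketch below derives a tag that is never reused and records it in a mapping that the multiversion build could read instead of querying the registry. The tag scheme (branch name, UTC timestamp, short commit hash), the function names, and the mapping structure are all assumptions made for this example; the document does not specify them.

```python
from datetime import datetime, timezone
from typing import Dict


def unique_tag(branch: str, commit_sha: str, now: datetime) -> str:
    """Build a never-reused tag, e.g. wmf-1.37.0-wmf.4-20210514T120000Z-01234567.

    Hypothetical scheme: train branch with "/" replaced by "-", then a UTC
    timestamp and the first 8 characters of the commit hash.
    """
    base = branch.replace("/", "-")
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return "{}-{}-{}".format(base, stamp, commit_sha[:8])


def record_tag(latest_tags: Dict[str, str], branch: str, tag: str) -> Dict[str, str]:
    """Return a copy of the branch -> tag mapping with the new tag recorded.

    In the proposal, this hand-off would travel through a commit to
    operations/mediawiki-config; a plain dict stands in for that here.
    """
    updated = dict(latest_tags)
    updated[branch] = tag
    return updated


if __name__ == "__main__":
    built_at = datetime(2021, 5, 14, 12, 0, 0, tzinfo=timezone.utc)
    tag = unique_tag("wmf/1.37.0-wmf.4", "0123456789abcdef", built_at)
    print(record_tag({}, "wmf/1.37.0-wmf.4", tag))
```

Because every build produces a tag the proxy has never seen, a cached response for an older tag cannot be mistaken for the new image.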