Big holes in the MediaWiki release archive
Open, Needs Triage, Public

Description

I've just been browsing https://releases.wikimedia.org/mediawiki/ and it appears that there are big holes in the history of MediaWiki.

For example, the 1.19 folder includes 1.19.9 to 1.19.24 but is missing 1.19.0 to 1.19.8, and possibly one or more beta releases. Similarly the 1.20 folder only contains release 1.20.8. (Note that I have only checked these two folders, so I don't know how much further this problem extends.)

Expected behaviour: The release archive contains an archive of all releases.

I am not sure if this is corruption (e.g. during server moves) or whether it was by deliberate user action, but these deleted versions should be restored and should remain publicly available.

See also T190363, which relates to properly tagging the releases in Git.

Event Timeline

I am not sure if this is corruption (e.g. during server moves) or whether it was by deliberate user action

Definitely not the latter.

but these deleted versions should be restored and should remain publicly available.

Indeed, we should find them.

I uploaded the following release tarballs from my personal archives, which mostly derive from a copy I made of the SourceForge files section in 2009. I retroactively designated the dated snapshots of 2003 as "1.0" for clarity when navigating the top-level directory. There was no other 1.0 and they immediately preceded 1.1 in the release notes. For files which already existed on releases.wikimedia.org, I confirmed that the MD5 hash was the same before removing them from the following list.

  • 1.0/mediawiki-20030829.tar.gz
  • 1.0/mediawiki-20031107.tar.gz
  • 1.0/mediawiki-20031117.tar.gz
  • 1.0/mediawiki-20031118.tar.gz
  • 1.1/mediawiki-1.1.0.tar.gz
  • 1.2/mediawiki-1.2.0rc1b.tar.gz
  • 1.2/mediawiki-1.2.0rc1.tar.gz
  • 1.2/mediawiki-1.2.0rc2.tar.gz
  • 1.2/mediawiki-1.2.0rc3.tar.gz
  • 1.2/mediawiki-1.2.0rc4.tar.gz
  • 1.2/mediawiki-1.2.0.tar.gz
  • 1.2/mediawiki-1.2.1.tar.gz
  • 1.2/mediawiki-1.2.2.tar.gz
  • 1.2/mediawiki-1.2.3.tar.gz
  • 1.2/mediawiki-1.2.4.tar.gz
  • 1.2/mediawiki-1.2.5.tar.gz
  • 1.2/mediawiki-1.2.6.tar.gz
  • 1.3/mediawiki-1.3.0beta2.tar.gz
  • 1.3/mediawiki-1.3.0beta3.tar.gz
  • 1.3/mediawiki-1.3.0beta4.tar.gz
  • 1.3/mediawiki-1.3.0beta5.tar.gz
  • 1.3/mediawiki-1.3.0beta6.tar.gz
  • 1.3/mediawiki-1.3.0.tar.gz
  • 1.3/mediawiki-1.3.10.tar.gz
  • 1.3/mediawiki-1.3.14.tar.gz
  • 1.3/mediawiki-1.3.16.tar.gz
  • 1.3/mediawiki-1.3.17.tar.gz
  • 1.3/mediawiki-1.3.1.tar.gz
  • 1.3/mediawiki-1.3.2.tar.gz
  • 1.3/mediawiki-1.3.3.tar.gz
  • 1.3/mediawiki-1.3.4.tar.gz
  • 1.3/mediawiki-1.3.5.tar.gz
  • 1.3/mediawiki-1.3.6.tar.gz
  • 1.3/mediawiki-1.3.7.tar.gz
  • 1.3/mediawiki-1.3.8.tar.gz
  • 1.3/mediawiki-1.3.9.tar.gz
  • 1.4/mediawiki-1.4.0.tar.gz
  • 1.4/mediawiki-1.4.10.tar.gz
  • 1.4/mediawiki-1.4.11.tar.gz
  • 1.4/mediawiki-1.4.1.tar.gz
  • 1.4/mediawiki-1.4.2.tar.gz
  • 1.4/mediawiki-1.4.3.tar.gz
  • 1.4/mediawiki-1.4.6.tar.gz
  • 1.4/mediawiki-1.4.8.tar.gz
  • 1.4/mediawiki-1.4beta1.tar.gz
  • 1.4/mediawiki-1.4beta2.tar.gz
  • 1.4/mediawiki-1.4beta3.tar.gz
  • 1.4/mediawiki-1.4beta4.tar.gz
  • 1.4/mediawiki-1.4beta5.tar.gz
  • 1.4/mediawiki-1.4beta6.tar.gz
  • 1.4/mediawiki-1.4rc1.tar.gz
  • 1.5/mediawiki-1.5.0.tar.gz
  • 1.5/mediawiki-1.5.1.tar.gz
  • 1.5/mediawiki-1.5.3.tar.gz
  • 1.5/mediawiki-1.5.4.tar.gz
  • 1.5/mediawiki-1.5.7.tar.gz
  • 1.5/mediawiki-1.5alpha1.tar.gz
  • 1.5/mediawiki-1.5alpha2.tar.gz
  • 1.5/mediawiki-1.5beta1.tar.gz
  • 1.5/mediawiki-1.5beta2.tar.gz
  • 1.5/mediawiki-1.5beta3.tar.gz
  • 1.5/mediawiki-1.5beta4.tar.gz
  • 1.5/mediawiki-1.5rc1.tar.gz
  • 1.5/mediawiki-1.5rc2.tar.gz
  • 1.5/mediawiki-1.5rc3.tar.gz
  • 1.5/mediawiki-1.5rc4.tar.gz
  • 1.6/mediawiki-1.6.0.tar.gz
  • 1.6/mediawiki-1.6.1.tar.gz
  • 1.6/mediawiki-1.6.2.tar.gz
  • 1.6/mediawiki-1.6.4.tar.gz
  • 1.6/mediawiki-1.6.5.tar.gz
  • 1.6/mediawiki-1.6.6.tar.gz
  • 1.6/mediawiki-1.6.7.tar.gz
  • 1.6/mediawiki-1.6.9.tar.gz
  • 1.7/mediawiki-1.7.0.tar.gz
  • 1.7/mediawiki-1.7.2.tar.gz
  • 1.8/mediawiki-1.8.0.tar.gz
  • 1.8/mediawiki-1.8.1.tar.gz
  • 1.8/mediawiki-1.8.3.tar.gz
  • 1.9/mediawiki-1.9.0rc1.tar.gz
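The MD5 check described before the list could be done roughly as follows. This is only a sketch, not the procedure actually used: the function names and the `published_md5` mapping (file name to digest of the copy already on the server) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_by_presence(local_files, published_md5):
    """Classify local tarballs against digests of already-published files.

    local_files: iterable of paths to local tarballs.
    published_md5: dict mapping file name -> hex MD5 of the published copy
                   (hypothetical input; how it is obtained is not shown).
    Returns (to_upload, identical, conflicting) as lists of file names.
    """
    to_upload, identical, conflicting = [], [], []
    for path in map(Path, local_files):
        expected = published_md5.get(path.name)
        if expected is None:
            # Not on the server yet: candidate for upload.
            to_upload.append(path.name)
        elif md5_of_file(path) == expected.lower():
            # Same bytes as the published copy: nothing to do.
            identical.append(path.name)
        else:
            # Same name, different content: needs a human to look at it.
            conflicting.append(path.name)
    return to_upload, identical, conflicting
```

Only the names in `to_upload` would end up in a list like the one above; anything in `conflicting` would need manual investigation before uploading.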

Mentioned in SAL (#wikimedia-operations) [2024-03-27T23:21:54Z] <TimStarling> on releases1003: uploaded 80 missing old MediaWiki releases T190369

Legoktm uploaded all MediaWiki tarballs from releases.wikimedia.org to the Internet Archive in 2018. I should be able to recover the remaining missing tarballs from there.

OK, well if they were already missing in 2014, I'm not going to find them in a 2018 archive.

It's unlikely the Internet Archive or any crawler would have files that were generated in September 2013 and reported missing in December 2013. Maybe community members would have them, but the right time to ask was December 2013.

The missing releases were generated between 2012-04-26 and 2013-09-03, all in the Git era which started on 2012-03-27. The missing releases are 1.19.0rc1 - 1.19.8 and 1.20.0 - 1.20.7. There were no release candidates or betas for 1.20. The releases were made by @Reedy, @csteipp and @MarkAHershberger.
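For anyone who wants to check other branches for similar holes, here is a minimal sketch that finds gaps in the point releases of a directory listing. The function name and the file name pattern are assumptions for illustration. It only looks at plain `mediawiki-X.Y.Z.tar.gz` names, so it cannot detect missing release candidates or betas, nor releases newer than the highest one present.

```python
import re

# Matches final point releases only, e.g. mediawiki-1.19.9.tar.gz
TARBALL_RE = re.compile(r"^mediawiki-(\d+)\.(\d+)\.(\d+)\.tar\.gz$")


def missing_point_releases(file_names):
    """Return {(major, minor): [missing patch numbers]} for each branch.

    A patch number counts as missing if it is lower than the highest
    patch number present for that branch and has no tarball itself.
    """
    seen = {}
    for name in file_names:
        match = TARBALL_RE.match(name)
        if match:
            major, minor, patch = map(int, match.groups())
            seen.setdefault((major, minor), set()).add(patch)
    gaps = {}
    for branch, patches in seen.items():
        absent = [p for p in range(max(patches) + 1) if p not in patches]
        if absent:
            gaps[branch] = absent
    return gaps


# The state described in the task: 1.19.9 to 1.19.24, and only 1.20.8.
listing = [f"mediawiki-1.19.{n}.tar.gz" for n in range(9, 25)]
listing.append("mediawiki-1.20.8.tar.gz")
gaps = missing_point_releases(listing)
```

With that listing, `gaps` reports patch numbers 0 to 8 for 1.19 and 0 to 7 for 1.20, matching the ranges above apart from 1.19.0rc1.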

We could regenerate them from git, although that's not as easy as you might think. My main concern is https://gerrit.wikimedia.org/r/c/mediawiki/tools/release/+/11720 by @Reedy which implies that local patches were applied after checking out the git tags.

Lots of tweaks were made to the release script in this period, and it won't be easy to figure out which releases used which version since it's likely that the release managers made local changes in order to get a release done, with the commits coming later.

Also, extensions were bundled in the tarballs at this time, and the release script got them from master as of the release date. Some work would be required to figure out what commit that would have been.
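As a rough illustration of that last step, `git rev-list` can report the newest commit on a branch's first-parent history that is older than a given date. This is a sketch under assumptions: the helper names are made up, and `--before` filters on committer date, which only approximates what master pointed to when the release manager ran the script (Gerrit merge timing and local changes are not accounted for).

```python
import subprocess


def rev_list_command(before, branch="master"):
    """Build the git command that finds the last commit before a date.

    --first-parent follows only the mainline of the branch, so commits
    that were written earlier but merged later are not picked up.
    """
    return [
        "git", "rev-list", "-n", "1", "--first-parent",
        f"--before={before}", branch,
    ]


def commit_as_of(repo_dir, before, branch="master"):
    """Return the commit hash, or None if no commit is old enough.

    repo_dir is a local clone of the extension (hypothetical path).
    """
    result = subprocess.run(
        rev_list_command(before, branch),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip() or None
```

For example, `commit_as_of("/path/to/extension-clone", "2012-04-26 00:00:00 +0000")` would give a candidate commit for an extension bundled on that date. The exact time of day each release was cut would still have to be guessed.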

If nobody has these files, then I think we should close this as declined.

@csteipp has left the foundation and his account here is disabled. I guess it is therefore not going to receive any notification.

releases.wikimedia.org got created in 2014 for *new web server for mw/mobile tarballs* (3e058c6d25af1fa08078bc08eef71ddc482b3f). I do remember setting up a Jenkins job that published an apk.

Release tarballs were stored on download.wikimedia.org (diff between 2007 and 2011). That was a CNAME to kaulen (d952375c83592c15971e7a1ca7d7f8224dbb13e6), which primarily hosted Bugzilla. That led me to its creation with T80222. Previously they were on dataset2, but there is no entry on that ticket tracking the file transfer :(

And from SAL:

2013-11-05
17:50 <mutante> deleting empty /srv/org/mediawiki on kaulen


I have found 1.19.0 on https://archive.synology.com/download/Package/MediaWiki . The package embeds a package.tgz, but that has extra files added to it, so the tarball is not the one we released at the time :/

Debian has had a MediaWiki package since 2006 (changelog), and it covered all the stable releases between 1.19.0 and 1.19.20. However, they were in the experimental distribution and did not make it to a stable release (with the exception of 1.19.20), and thus I guess the tarballs were not selected to be kept in the archive http://archive.debian.org/debian/pool/main/m/mediawiki/ .

Debian packages often have a copy of the upstream tarball in their SCM (using pristine-tar), but that only started with 1.27.0.

Eventually I found that the Fedora project has copies of the MediaWiki 1.19 tarballs, with GPG keys and md5sums, at https://src.fedoraproject.org/repo/pkgs/mediawiki119/ . There are ones for 1.20.2 to 1.20.5 in https://src.fedoraproject.org/repo/pkgs/mediawiki/

There are some more at http://scripts.virtualmin.com/

I have pushed some 1.19.8 material to https://people.wikimedia.org/~hashar/T190369/ . I extracted it from a mediawiki-1.19.8-1.scs.el7.centos.src.rpm package. I haven't verified its validity.

I found 1.20.6 on a random website. I think the only one we're missing is 1.20.7.

Mentioned in SAL (#wikimedia-operations) [2024-03-30T05:19:14Z] <TimStarling> on releases1003 uploaded mediawiki 1.19.0 - 1.19.8, 1.20.0 - 1.20.6 T190369

Thanks both for working on this!!

Legoktm uploaded all MediaWiki tarballs from releases.wikimedia.org to the Internet Archive in 2018. I should be able to recover the remaining missing tarballs from there.

OK, well if they were already missing in 2014, I'm not going to find them in a 2018 archive.

Yeah, I made the archive primarily in response to the missing tarballs to help with recovery if things ever went missing again.

Debian has had a MediaWiki package since 2006 (changelog), and it covered all the stable releases between 1.19.0 and 1.19.20. However, they were in the experimental distribution and did not make it to a stable release (with the exception of 1.19.20), and thus I guess the tarballs were not selected to be kept in the archive http://archive.debian.org/debian/pool/main/m/mediawiki/ .

Debian packages often have a copy of the upstream tarball in their SCM (using pristine-tar), but that only started with 1.27.0.

You can find even the experimental versions at https://snapshot.debian.org/package/mediawiki/ but note that the ones with +dfsg have been repacked and some files removed/added.

I found 1.20.6 on a random website. I think the only one we're missing is 1.20.7.

I did a Google search for the filename and found https://bugs.mageia.org/show_bug.cgi?id=11157, which links to https://advisories.mageia.org/MGASA-2013-0276.html. The referenced SRPM is still available at https://distrib-coffee.ipsl.jussieu.fr/pub/linux/Mageia-archive/distrib/3/SRPMS/core/updates/mediawiki-1.20.7-1.mga3.src.rpm.

Though I was unable to reproduce the exact tarball for reasons such as the filenames not having been sorted, the contents do appear to match the code from our Git repos. See P60395.
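A comparison that ignores member order, as described above, can be sketched like this. It is not the script behind P60395; the function names are made up for illustration, and it compares two tarballs (for example a regenerated one and a recovered one) by file name and content hash only, ignoring ordering, timestamps, ownership and permissions.

```python
import hashlib
import tarfile


def tarball_manifest(path):
    """Map each regular file in the tarball to the SHA-256 of its content."""
    manifest = {}
    # "r:*" lets tarfile detect the compression (gzip, bzip2, ...) itself.
    with tarfile.open(path, mode="r:*") as archive:
        for member in archive:
            if member.isfile():
                data = archive.extractfile(member).read()
                manifest[member.name] = hashlib.sha256(data).hexdigest()
    return manifest


def compare_manifests(ours, theirs):
    """Return (only_in_ours, only_in_theirs, differing) as sorted lists."""
    only_ours = sorted(set(ours) - set(theirs))
    only_theirs = sorted(set(theirs) - set(ours))
    differing = sorted(
        name for name in set(ours) & set(theirs) if ours[name] != theirs[name]
    )
    return only_ours, only_theirs, differing
```

Three empty lists mean the file contents are the same even though the tarballs themselves are not byte-identical, so their checksums and any GPG signatures would still differ.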

I don't know where you would find the core and i18n only tarballs, the patch files, and the GPG signatures though.