Analysis:
The UploadFromUrl class prepends the body of the redirect response to the actual file content. This corrupts the file and causes the MIME type to be detected as text/html (arguably correctly, since a client may indeed interpret the contents as HTML).
To fix this, the code responsible for saving the data chunks into a temporary file needs to be aware of the redirect and reset the file contents before receiving the actual file contents.
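The idea can be illustrated with a minimal, language-neutral sketch in Python. This is not the UploadFromUrl code: the `ChunkSaver` class, its `on_headers`/`on_chunk` callbacks, and the `simulate` helper are all hypothetical names, and the sketch assumes the HTTP client notifies the saver of each response's status code before delivering that response's body chunks.

```python
import tempfile

# Status codes treated as redirects in this sketch.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class ChunkSaver:
    """Hypothetical sink that writes response body chunks to a temporary file."""

    def __init__(self):
        self.tmp = tempfile.TemporaryFile()
        self.last_status = None

    def on_headers(self, status):
        # Assumed to be called once per response in the redirect chain.
        # If the previous response was a redirect, whatever it wrote is
        # not part of the file, so discard it before the next body arrives.
        if self.last_status in REDIRECT_STATUSES:
            self.tmp.seek(0)
            self.tmp.truncate()
        self.last_status = status

    def on_chunk(self, data):
        # Append one body chunk and report how many bytes were consumed.
        self.tmp.write(data)
        return len(data)

    def contents(self):
        self.tmp.seek(0)
        return self.tmp.read()


def simulate(responses):
    """Feed a list of (status, [chunks]) pairs through a ChunkSaver."""
    saver = ChunkSaver()
    for status, chunks in responses:
        saver.on_headers(status)
        for chunk in chunks:
            saver.on_chunk(chunk)
    return saver.contents()
```

Without the reset in `on_headers`, the HTML body of the redirect response would remain at the start of the temporary file, which is the corruption described above.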
Original report:
*.archives.gov is already in the whitelist. However, every file I attempt to upload from this domain gives the following error:
Cannot upload this file because Internet Explorer would detect it as "text/html", which is a disallowed and potentially dangerous file type.
There does not appear to be anything wrong with the files. I am guessing this is caused by the fact that archives.gov media is actually all stored in s3, and this is just an alias domain.
For example, the URL https://catalog.archives.gov/catalogmedia/lz/riverside/rg-075/561617/561617-3-07-01.jpg actually resolves to https://s3.amazonaws.com/NARAprodstorage/lz/riverside/rg-075/561617/561617-3-07-01.jpg
How can we correctly allow these files? Is this a bug in the domain whitelist, or can we resolve it by simply adding the s3 bucket path in some way? It seems like this would be s3.amazonaws.com/NARAprodstorage/, but none of the current whitelist examples are showing a URL path beyond the (sub)domain, so I'm not sure if that works?