We are seeing a re-occurrence of T96360, this time for PDF files instead of DjVu.
Most of the queries request the same files, each of which has ~13MB of img_metadata. Some of the requests come from search engine bots, but not all of them.
This morning (UTC), for example, there were ~12k queries on db1081 that requested those fields and saturated the 1Gb link.
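As a rough sanity check on those numbers, here is a back-of-the-envelope estimate of the data volume involved. It is only a sketch under assumptions that were not measured: every query returns the full ~13MB blob, MB means 10**6 bytes, and the link carries 1 Gbit/s with no protocol overhead.

```python
# Assumed inputs, taken from the rough figures above (not measured values).
QUERIES = 12_000               # ~12k queries
BLOB_BYTES = 13 * 10**6        # ~13MB of img_metadata per query
LINK_BITS_PER_SEC = 10**9      # 1Gb link, assumed to mean 1 Gbit/s


def transfer_estimate(queries, blob_bytes, link_bits_per_sec):
    """Return (total bytes sent, seconds needed at full line rate)."""
    total_bytes = queries * blob_bytes
    seconds = total_bytes * 8 / link_bits_per_sec
    return total_bytes, seconds


total_bytes, seconds = transfer_estimate(QUERIES, BLOB_BYTES, LINK_BITS_PER_SEC)
print(f"{total_bytes / 10**9:.0f} GB, about {seconds / 60:.0f} minutes at line rate")
# prints "156 GB, about 21 minutes at line rate"
```

Under these assumptions the queries alone would keep the link fully busy for roughly 20 minutes, which is consistent with the saturation seen on db1081.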
db1081 traffic graph:
Query sample:
SELECT /* LocalFile::loadExtraFromDB xxx.xxx.xxx.xxx */ img_metadata FROM `image` WHERE img_name = 'Catalog_of_Copyright_Entries_1977_Books_and_Pamphlets_Jan-June.pdf' AND img_timestamp = '20160426090826' LIMIT 1
The content of img_metadata is a PHP serialized array.
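For illustration, a PHP serialized array starts with a header of the form `a:<count>:{`, so the top-level element count can be read without parsing the whole blob. The sketch below is a minimal example of that; the helper name and the sample keys are hypothetical and not taken from the real img_metadata content.

```python
import re

# A PHP serialized array begins with "a:<element count>:{".
_ARRAY_HEADER = re.compile(rb"a:(\d+):\{")


def php_array_count(blob: bytes):
    """Return the top-level element count if blob starts like a
    PHP serialized array, otherwise None. Only the header is inspected."""
    match = _ARRAY_HEADER.match(blob)
    return int(match.group(1)) if match else None


# Hypothetical sample: what serialize(['Pages' => 3, 'Title' => 'Demo']) produces.
sample = b'a:2:{s:5:"Pages";i:3;s:5:"Title";s:4:"Demo";}'
print(php_array_count(sample))  # prints 2
```

Because only the first few bytes are examined, a check like this stays cheap even for a ~13MB value.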