When PdfHandler extracts metadata from a PDF with page size A4 and a page rotation of 90 degrees (landscape), the "Page rot" metadata is not saved to the database.
Occurred in PdfHandler on branch REL1_31.
Example output of /usr/bin/pdfinfo -enc UTF-8 -l 9999999 images/b/b5/Overview.pdf:
Title: Overview.vsd
Subject:
Keywords:
Author: auser
Creator: PDFCreator 2.4.1.13
Producer: PDFCreator 2.4.1.13
CreationDate: Thu Jan 2 12:08:27 2019 UTC
ModDate: Thu Jan 4 12:08:27 2019 UTC
Tagged: no
UserProperties: no
Suspects: no
Form: none
Syntax Warning: Invalid least number of objects reading page offset hints table
JavaScript: no
Pages: 1
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 1 rot: 90
File size: 123684 bytes
Optimized: yes
PDF version: 1.4
Stored information in images.image_meta (unserialized):
array (
  'Title' => 'Overview.vsd',
  'Subject' => '',
  'Keywords' => '',
  'Author' => 'auser',
  'Creator' => 'PDFCreator 2.4.1.13',
  'Producer' => 'PDFCreator 2.4.1.13',
  'CreationDate' => 'Thu Jan 2 12:08:27 2019',
  'ModDate' => 'Thu Jan 4 12:08:27 2019',
  'Tagged' => 'no',
  'Pages' => '1',
  'Encrypted' => 'no',
  'pages' =>
  array (
    1 =>
    array (
      'Page size' => '595 x 842 pts (A4)',
    ),
  ),
  'File size' => '123684 bytes',
  'Optimized' => 'yes',
  'PDF version' => '1.4',
  'mergedMetadata' =>
  array (
    'DateTime' => '2019:01:02 12:08:27',
    'DateTimeDigitized' => '2019:01:04 12:08:27',
    'Software' => 'PDFCreator 2.4.1.13',
    'ObjectName' =>
    array (
      'x-default' => 'Overview.vsd',
      '_type' => 'lang',
    ),
    'Artist' =>
    array (
      0 => 'auser',
      '_type' => 'ol',
    ),
    'ImageDescription' => '',
    'pdf-Producer' => 'PDFCreator 2.4.1.13',
    'pdf-Encrypted' => 'no',
    'pdf-PageSize' =>
    array (
      0 => '595 x 842 pts (A4)',
    ),
    'pdf-Version' => '1.4',
  ),
  'text' =>
  array (
    0 => 'Some text
',
    1 => '',
  ),
)
As you can see, ['pages'][1]['Page rot'] is missing.
This might be due to the post-processing of the extracted metadata: https://github.com/wikimedia/mediawiki-extensions-PdfHandler/blob/51185ca9cb1f3a76bea2054e48c5790802633779/includes/PdfImage.php#L291-L309
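
To make the expected result concrete, here is a minimal sketch of the structure I would expect for the per-page fields. It is written in Python for illustration only and is not the extension's PHP code; the function name parse_pdfinfo, the SAMPLE string (a subset of the output above) and the regular expression are my own assumptions about how the "Page N size" / "Page N rot" keys could be grouped.

```python
import re

# Subset of the pdfinfo output quoted in this report, one field per line.
SAMPLE = """\
Title: Overview.vsd
Pages: 1
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 1 rot: 90
File size: 123684 bytes
"""

# Assumed key pattern for per-page fields ("Page <n> size" / "Page <n> rot").
PAGE_KEY = re.compile(r"^Page +(\d+) (size|rot)$")


def parse_pdfinfo(dump):
    """Hypothetical helper: split 'Key: value' lines, nesting per-page keys."""
    data = {}
    for line in dump.splitlines():
        # Split on the first colon only, so values such as times stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        match = PAGE_KEY.match(key)
        if match:
            page = int(match.group(1))
            field = "Page size" if match.group(2) == "size" else "Page rot"
            data.setdefault("pages", {}).setdefault(page, {})[field] = value
        else:
            data[key] = value
    return data


meta = parse_pdfinfo(SAMPLE)
print(meta["pages"][1])
# prints {'Page size': '595 x 842 pts (A4)', 'Page rot': '90'}
```

With this kind of grouping, both 'Page size' and 'Page rot' end up under ['pages'][1], which is what I expected to find in the stored metadata.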