Invalid file error for Add image - OFFICE is not valid mime type
Closed, Declined · Public · PRODUCTION ERROR

Description

Error
normalized_message
* Invalid file Talk_by_Leonard_J._Brass_(News_Review)_(1948_January_7)_(IA_talkbyleonardjbr0000bras).pdf in article Richard_Archbold. Filtered because OFFICE is not valid mime type ( BITMAP, DRAWING )
* Invalid file (Journal)_March_6-November_22,_1933_(IA
Impact

Details

Request URL
https://pl.wikipedia.org/w/index.php?geclickid=*&genewcomertasktoken=*&gesuggestededit=*&getasktype=*&section=*&title=*&veaction=*

Event Timeline

Restricted Application added a subscriber: Aklapper.

https://commons.wikimedia.org/wiki/File:Talk_by_Leonard_J._Brass_(News_Review)_(1948_January_7)_(IA_talkbyleonardjbr0000bras).pdf

On commonswiki_p (quarry.wmcloud.org)

select * from image where img_name ='Talk_by_Leonard_J._Brass_(News_Review)_(1948_January_7)_(IA_talkbyleonardjbr0000bras).pdf';

gives

img_media_type: OFFICE
img_major_mime: application
img_minor_mime: pdf

As far as I can see, files stored with img_media_type: OFFICE do have valid MIME types.

select count(*) from image where img_media_type='OFFICE'; shows

Number of records per wiki:

| commons | plwiki | enwiki | zhwiki |
| --- | --- | --- | --- |
| 3957418 | 0 | 249 (application/pdf) | 2 (application/vnd.ms-excel) |
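
The per-MIME breakdown noted for enwiki and zhwiki can be obtained by grouping the same table on its MIME columns. Below is a minimal sketch of that grouping, run against an in-memory SQLite stand-in filled with made-up rows (not real wiki data). Only the column names come from the query above; the helper name `office_mime_breakdown` and the sample rows are assumptions for illustration.

```python
import sqlite3


def office_mime_breakdown(conn):
    """Count OFFICE rows per MIME type, largest group first."""
    return conn.execute(
        "SELECT img_major_mime || '/' || img_minor_mime AS mime, COUNT(*) AS n "
        "FROM image WHERE img_media_type = 'OFFICE' "
        "GROUP BY mime ORDER BY n DESC, mime"
    ).fetchall()


# Toy stand-in for the image table; the rows are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image ("
    "img_name TEXT, img_media_type TEXT, "
    "img_major_mime TEXT, img_minor_mime TEXT)"
)
conn.executemany(
    "INSERT INTO image VALUES (?, ?, ?, ?)",
    [
        ("a.pdf", "OFFICE", "application", "pdf"),
        ("b.pdf", "OFFICE", "application", "pdf"),
        ("c.xls", "OFFICE", "application", "vnd.ms-excel"),
        ("d.png", "BITMAP", "image", "png"),
    ],
)

breakdown = office_mime_breakdown(conn)
for mime, n in breakdown:
    print(f"{mime}: {n}")
```

On the MariaDB replicas behind Quarry the string concatenation would likely need `CONCAT(img_major_mime, '/', img_minor_mime)` instead of `||`, since `||` is a logical OR there by default.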

@Tgr, @kostajh - the error rate is extremely low (4 in 30 days), so the impact is really minor. But could it be an indication of a bigger issue?

Etonkovidova renamed this task from [QA task] plwiki Invalid file error for Add image to Invalid file error for Add image - OFFICE is not valid mime type. Mar 24 2023, 9:05 PM
Etonkovidova updated the task description.
img_media_type ENUM(
  'UNKNOWN', 'BITMAP', 'DRAWING', 'AUDIO',
  'VIDEO', 'MULTIMEDIA', 'OFFICE',
  'TEXT', 'EXECUTABLE', 'ARCHIVE',
  '3D'
) DEFAULT NULL,

All pdfs (application/pdf) are stored with media type OFFICE (includes/libs/mime/MimeMap.php)

I'm not sure what is producing that message, but it's probably some tool (inside VisualEditor?) for adding an image. It makes sense that a tool designed to embed images only accepts images. That said, we do use [[File: some pdf]] to insert thumbnails of PDFs. I think the tool expects every file to be an image, and that expectation breaks when it encounters a PDF.
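
The kind of check that would produce that message can be pictured roughly as follows. This is only a sketch under assumptions: `ALLOWED_MEDIA_TYPES` and `check_suggestion` are hypothetical names, not the actual tool's code. The allowed list (BITMAP, DRAWING) and the message wording are taken from the logged error.

```python
# Media types the hypothetical tool accepts, as listed in the logged error.
ALLOWED_MEDIA_TYPES = ("BITMAP", "DRAWING")


def check_suggestion(file_name, article, media_type):
    """Return None if the file is usable, else a message in the style of the log."""
    if media_type in ALLOWED_MEDIA_TYPES:
        return None
    return (
        f"Invalid file {file_name} in article {article}. "
        f"Filtered because {media_type} is not valid mime type "
        f"( {', '.join(ALLOWED_MEDIA_TYPES)} )"
    )


# A PDF is stored with media type OFFICE, so it is rejected.
pdf_result = check_suggestion("Example.pdf", "Some_article", "OFFICE")
# A bitmap image passes the check.
png_result = check_suggestion("Example.png", "Some_article", "BITMAP")
print(pdf_result)
```

Applying the same allow-list earlier, when suggestions are generated, would keep such files from reaching the tool at all.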

kostajh added subscribers: Cparle, KStoller-WMF.
It's from the AddImage plugin for VisualEditor (Image-Suggestions project).

@Cparle is it possible to filter out this media type from the image suggestions pipeline?

Should be no problem I think ... still I'm curious how this is coming from VisualEditor? Do you guys use that for adding the suggested images?

Yes, we use a plugin (AddImage). The tasks are shown on Special:Homepage via the hassuggestion:image search keyword. When the user clicks on the image, the AddImage plugin for VisualEditor uses the image suggestion API metadata to show a yes/no/unsure interface to the user:

image.png (1×1 px, 805 KB)

We should probably search for hassuggestion:image filetype:BITMAP|DRAWING. Video etc. suggestions are conceptually valid, our tool just doesn't handle them (and it's probably not worth the effort to try).

That wouldn't work, filetype refers to the page (which doesn't have a file type since we are talking about articles, but other kinds of pages could), not the image suggestion. Sorry, got myself confused.

Moving to Ready for Development, but low priority given the low frequency of this error.
Growth engineers: please move this out of the current sprint if you think this error isn't worth the time investment needed to solve it.

I don't think we can do anything about this on our side.

There have been no errors of this type, i.e. "Invalid file [...] Filtered because OFFICE is not valid mime type ( BITMAP, DRAWING )", for the last couple of months. Given that the error was rare when the task was filed and is no longer occurring, I'm closing the task as Declined.