
[S] Suggestions should only be images
Closed, Resolved, Public

Description

To prevent downstream bugs such as T332757: Invalid file error for Add image - OFFICE is not valid mime type, we should restrict the type of suggested files to images only.
The ideal solution would be to look up relevant MIME types, although we currently don't have such information available in our data pipeline outputs.

An alternative is to filter by file extension: instead of filtering out SVG files to avoid logos/icons, as in T331456: [S] Filter out all .svg files from section-level image suggestions, we should keep only image files.
Below is the set of all file extensions found in a previous Section-Level-Image-Suggestions run (2023-01-30 snapshot):
SVG, JPEG, JPg, Svg, Jpg, PNG, Jpeg, Gif, JpG, pdf, djvu, JPeG, GIF, xcf, png, Tif, gif, jPeG, jpeg, JPG, Png, svg, tif, webp, BMP, stl, bmp, tiff, jPg, WebP, jpg, TIF, webm, JpEg, jPG, TIFF
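
Since the extensions above come in many case variants, an allowlist check would need to be case-insensitive. A minimal sketch of that idea follows. The function name `has_allowed_suffix` and the contents of `ALLOWED_SUFFIXES` are assumptions made for illustration, not the list used in the actual change. In particular, leaving out `svg` assumes the earlier logo/icon filter should stay in effect.

```python
import os.path

# Hypothetical allowlist: which suffixes count as "images" is an assumption.
# "svg" is left out on the assumption that the earlier logo/icon filter stays.
ALLOWED_SUFFIXES = frozenset(
    {"jpg", "jpeg", "png", "gif", "tif", "tiff", "webp", "bmp"}
)


def has_allowed_suffix(file_name, allowed=ALLOWED_SUFFIXES):
    """Return True if the file name ends in an allowlisted extension.

    The comparison is case-insensitive, so "JPeG" and "jpeg" are treated alike.
    Names without an extension are rejected.
    """
    _, suffix = os.path.splitext(file_name)  # e.g. ".JPeG", or "" if none
    return suffix[1:].lower() in allowed


# Keep only the candidates that pass the allowlist.
candidates = ["Example.JPeG", "Report.pdf", "Model.stl", "Photo.png", "Logo.svg"]
kept = [name for name in candidates if has_allowed_suffix(name)]
```

With an allowlist, any extension that is not explicitly listed is dropped by default, so new non-image types appearing in later snapshots would not need a separate fix.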

Tasks

Details

| Title | Reference | Author | Source Branch | Dest Branch |
| Use suffix allowlist instead of denylist | repos/structured-data/image-suggestions!29 | mlitn | T334296 | main |

Event Timeline

MarkTraceur renamed this task from Suggestions should only be images to [S] Suggestions should only be images. May 3 2023, 4:29 PM
MarkTraceur subscribed.

n.b. in the estimation meeting we surmised it's probably only a matter of changing [[ https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/blob/T330773/image_suggestions/unillustratable.py#L243 | this ]] line of code to use the new types and inverting the logic. But there was a note of caution that it may be necessary to double-check the file extension list against what's supported/present.
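
One way to do the double-check mentioned above is to normalize the extensions observed in the 2023-01-30 snapshot and compare them against a candidate allowlist. The sketch below does that. `CANDIDATE_ALLOWLIST` is a hypothetical set chosen for illustration, not the list from the merge request.

```python
# Extensions as pasted in the task description (2023-01-30 snapshot).
OBSERVED_RAW = (
    "SVG, JPEG, JPg, Svg, Jpg, PNG, Jpeg, Gif, JpG, pdf, djvu, JPeG, GIF, xcf, "
    "png, Tif, gif, jPeG, jpeg, JPG, Png, svg, tif, webp, BMP, stl, bmp, tiff, "
    "jPg, WebP, jpg, TIF, webm, JpEg, jPG, TIFF"
).split(", ")

# Hypothetical allowlist (an assumption, not the one used in the actual change).
CANDIDATE_ALLOWLIST = {"jpg", "jpeg", "png", "gif", "tif", "tiff", "webp", "bmp"}

# Collapse case variants such as "JPeG" and "jpeg" into one entry.
observed = {ext.lower() for ext in OBSERVED_RAW}

# Observed extensions the allowlist would drop: review these by hand.
dropped = sorted(observed - CANDIDATE_ALLOWLIST)

# Allowlisted extensions never seen in the snapshot: possibly unnecessary entries.
unseen = sorted(CANDIDATE_ALLOWLIST - observed)

print("distinct observed:", len(observed))
print("dropped by allowlist:", dropped)
print("allowlisted but unseen:", unseen)
```

Under this hypothetical allowlist, the 36 pasted variants collapse to 14 distinct extensions, and the dropped ones are `djvu`, `pdf`, `stl`, `svg`, `webm`, and `xcf`.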
mfossati updated the task description.