
Scaled image is larger than original for palletted PNG (in terms of size)
Closed, DeclinedPublic

Description

Author: plugwash

Description:
The point of doing scaling on the server side is to reduce load times. However,
when the image is only being scaled by a small amount and the original is line
art, the scaled image sometimes gets substantially bigger. In the specific case
shown in the URL, the reduced image inlined in the page is nearly 4 times the
size of the original image.

It would seem sensible to check whether the version scaled on the server side
really is smaller than the original before deciding which version to send.
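
A minimal sketch of the check proposed above, in Python for illustration only (MediaWiki itself is PHP). The function name pick_file_to_send and the byte counts are hypothetical; the sketch only shows the comparison, not how the sizes are obtained or how the browser would then display the full-size file.

```python
def pick_file_to_send(original_bytes: int, thumb_bytes: int) -> str:
    """Return which file to serve.

    The scaled thumbnail is used only when it is strictly smaller
    (in bytes) than the original upload; otherwise the original wins.
    """
    if thumb_bytes < original_bytes:
        return "thumbnail"
    return "original"


# Hypothetical sizes: line art whose thumbnail came out larger.
print(pick_file_to_send(original_bytes=75_000, thumb_bytes=305_000))
# Hypothetical sizes: a photograph whose thumbnail is much smaller.
print(pick_file_to_send(original_bytes=2_000_000, thumb_bytes=40_000))
```

Serving the original in the first case shifts the scaling work to the browser, so this trades bytes against display quality.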


Version: 1.5.x
Severity: minor
URL: http://en.wikipedia.org/wiki/Image:Harta_Ocean_Indian_Quake.png

Details

Reference
bz1218

Revisions and Commits

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 21 2014, 8:06 PM
bzimport set Reference to bz1218.
bzimport added a subscriber: Unknown Object (MLST).

grendelkhan wrote:

There's also the issue of quality, however. Web browsers use some truly
grotesque scaling algorithms in the interest of speed; it is likely that
doing client-side image scaling will result in very, very ugly image
display. See
http://en.wikipedia.org/wiki/User:Grendelkhan/Scratch#Thumbs_stretch_small_images
for an example. The unscaled version is being sent, but it's being
scaled on the web browser end. On IE6 SP1, at least, it looks ucky.

This may or may not outweigh the size costs of computing and sending a
nicely antialiased version, but it's something to consider.

jeluf wrote:

Fixed in CVS HEAD.
If the thumbnail image in an image page is bigger (in bytes) than the original
image, the original one is shown. It will not be scaled down using <img
width=... height=...> but keeps its original size.

jeluf wrote:

fixed in 1.4, deployed

plugwash wrote:

there has been quite a discussion regarding this issue over at
http://commons.wikimedia.org/wiki/Commons:Village_pump#JPG_and_grayscale_PNG_images_rescaled.2C_but_not_bitonal_PNG

The REAL problem seems to be that ImageMagick converts EVERYTHING to truecolor
during any operation. This seems to be the main cause of bloat during the
conversion process, making the scaled images far bigger than they should be
and more likely to exceed the size of the original image in the first place.

andrew.archibald wrote:

I suggest two things:

  • Always resize images (This would resolve bug 1352 as well).
  • Use ImageMagick's "identify" to count the colors in an image (possibly only if
    the input image was indexed or grayscale); if it's 256 or less, convert the PNG
    to an indexed PNG (losslessly).

The first cures the user hassle; the second will probably avoid most of the
cases where the image actually gets bigger.
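
The second suggestion can be sketched roughly as follows. This is an illustration in pure Python on an in-memory list of RGB tuples, not a use of ImageMagick's "identify"; the function name to_indexed and the 256-entry limit as a parameter are assumptions for the example.

```python
def to_indexed(pixels, max_colors=256):
    """Losslessly convert a list of RGB tuples into (palette, indices).

    Returns None when the image uses more than max_colors distinct
    colours, in which case it would have to stay truecolor.
    """
    palette = []      # distinct colours in order of first appearance
    index_of = {}     # colour -> position in palette
    indices = []      # one palette index per pixel
    for px in pixels:
        if px not in index_of:
            if len(palette) == max_colors:
                return None  # one colour too many for the palette
            index_of[px] = len(palette)
            palette.append(px)
        indices.append(index_of[px])
    return palette, indices


pixels = [(0, 0, 0), (255, 255, 255), (0, 0, 0)]
palette, indices = to_indexed(pixels)
# Looking each index up in the palette restores the original pixels.
restored = [palette[i] for i in indices]
```

Because every pixel maps back to exactly its original colour, the conversion loses nothing; whether it saves bytes depends on the encoder.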

plugwash wrote:

  • Use ImageMagick's "identify" to count the colors in an image (possibly only if
    the input image was indexed or grayscale); if it's 256 or less, convert the PNG
    to an indexed PNG (losslessly)

I think this is unlikely to work.

Scaling by non-integer scale factors will produce an insane flood of colors,
which is pretty much only limited by the bit depth you allow the output to be.

andrew.archibald wrote:

I should first point out that the limitations discussed here apply also to the
usual thumbnailing process - a greyscale image yields a "truecolor" thumbnail.

Scaling to any ratio does indeed produce a flood of colors. This cannot and
should not be changed (the colors are there to improve image quality). So most
scaled pictures will stay truecolor. What are the problem cases then?

  • Start with a black-and-white (bitonal) image. Scale it; currently it turns
    into truecolor. This method would yield a (paletted) grayscale image. (I would
    guess that this is by far the most common case of image growth.)

  • Start with a bitonal image in two other colors. Currently turns into
    truecolor; this suggestion would make it paletted.

  • Start with an image with a handful of colors (a color line drawing, say, that
    has been inexplicably rendered with no antialiasing). Scaling needs a flood of
    colors to render an antialiased version (any time two colors border on one
    another, many colors between the two are needed to antialias it). This would
    stay truecolor.

  • Start with a grayscale image. Scaling currently converts this into a
    truecolor image; this proposal would convert it into a (paletted) grayscale image.
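
As a rough illustration of the bitonal case in the list above, the sketch below downscales a synthetic black-and-white image by an integer factor using plain block averaging. The filter, the test image and the name box_downscale are assumptions made for the example; the real thumbnailer (ImageMagick or GD) uses its own resampling. With a bitonal source and a factor of 2, each output pixel averages four source pixels, so at most five grey levels can occur, which fits easily in a palette.

```python
def box_downscale(gray, factor):
    """Downscale a 2-D list of grey levels (0-255) by an integer factor,
    averaging each factor x factor block (a simple box filter)."""
    height = len(gray) // factor
    width = len(gray[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [
                gray[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(round(sum(block) / len(block)))
        out.append(row)
    return out


# Synthetic bitonal image: a black diagonal stroke on a white background.
size = 16
bitonal = [[0 if abs(x - y) <= 1 else 255 for x in range(size)]
           for y in range(size)]

small = box_downscale(bitonal, 2)
levels = sorted({value for row in small for value in row})
print(len(levels), "distinct grey levels:", levels)
```

The scaled image is no longer bitonal, but its colours all lie on the grey axis and are few enough to be stored as paletted grayscale.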

What other alternatives do we have?

  • Scale only to integer ratios (or fractions with small numerator and
    denominator). This somewhat reduces the number of distinct colors that are
    introduced (if the simplest anti-aliasing scheme is used). For reasonable-sized
    images this probably yields scaled bitonal images with not more than nine
    colors; it also helps with images with not too many colors, especially if they
    don't meet very much. This would most naturally be combined with the previous
    suggestion, so that one need not assume ahead of time anything about the
    resulting colors. It might also improve image clarity (at the cost of size).
    It is of no use for thumbnails in pages.

  • Attempt to lossily reduce the number of colors in some images. Which images?
    Bitonal only? Some bitonal images (scans, for example) use stippling to produce
    shades of gray, which has the correct effect when scaled using the current
    algorithm. How many colors should be used? The answer will surely depend on
    the image and the scaling ratio. How much quality are we willing to lose, and
    how can it be measured? Histograms of the image pixels, perhaps, along with a
    human visual model. How much CPU time will it use on the servers? How much
    developer time will it take? How many images need this anyway?

Reclosing this, since problem as described was fixed as requested.

Open a separate enhancement request for more intelligent thumbnail formatting if desired.

jeluf wrote:

In 1.5alpha2, this behaviour can be observed again. => REOPEN.

http://upload.wikimedia.org/wikipedia/en/c/cd/OregonCity.png is 75 KByte, its
image page thumbnail in 1.5alpha2 is 305 KByte.

plugwash wrote:

Indeed, because the cure that was made (which was not the one I as the reporter
proposed) was worse than the original issue and so was backed out for 1.5.

The original solution I proposed was scaling down using the browser for such
cases, but that has problems too (mostly because browsers do a really shitty job
of scaling).

The real issue is that scaling in MediaWiki sucks for pretty much everything
except photographic JPEGs. See
http://bugzilla.wikimedia.org/show_bug.cgi?id=1757 for the real improvements
that are needed.

register wrote:

I found this bug while preparing to file one, so just adding a comment.
http://upload.wikimedia.org/wikipedia/en/thumb/2/27/Openmotif_screenshot.png/250px-Openmotif_screenshot.png
vs
http://upload.wikimedia.org/wikipedia/en/thumb/2/27/Openmotif_screenshot.png/712px-Openmotif_screenshot.png
vs
http://upload.wikimedia.org/wikipedia/en/2/27/Openmotif_screenshot.png

  1. The scaling algorithm wasn't smart enough to note it was "scaling" it by only
     a few pixels.

  2. In scaling the PNG it presumably caused some imperceptible slurring; as a
     result the original size of 12 KB was inflated almost 8 fold (!). Scaling
     would probably be better off just using JPEG.

  3. The 250 pixel version on the page should probably have been a JPEG, possibly
     with MediaWiki checking to see if the scaled image was larger than the original,
     and if so, discarding the scaled version and just using the original with image
     width/height set in CSS or IMG attributes. (comment #2 - sorry for repeating,
     just was applicable in this case as well)

  4. Additionally, if MediaWiki was clever, it'd note when simple line drawings
     or user interfaces or even text of math equations was being uploaded as RGBA
     when the number of colours in the image was no more than 256. In such a case,
     MediaWiki would be better off resaving as an 8-bit PNG with savings of 50% or
     more in image size. Also possibly reevaluating the need for a scaled version as
     mentioned in comment #2. In this case, doing that to the original image reduced
     the size to 6852 bytes with no data loss, making the "scaled" size of 96437 a
     ridiculous 14 fold increase.

register wrote:

And apologies, should have read all the comments more carefully, the other
issues with this image were also being discussed in comment #5, comment #6 and
others.

Well. Adding my vote for smarter scaling and not saving as PNG when scaling...

*** Bug 5211 has been marked as a duplicate of this bug. ***

Still an issue in 1.13alpha, this time it's animated gifs. http://upload.wikimedia.org/wikipedia/commons/thumb/6/62/Cicada_molting_animated-2.gif/250px-Cicada_molting_animated-2.gif is about 10% larger than [[Image:Cicada molting animated-2.gif]].

ayg wrote:

*** This bug has been marked as a duplicate of bug 234 ***

ayg wrote:

Actually, reopening, since bug 234 deals with two issues, and it seems to primarily deal with the transparency issue. Let's use this one for the size-increase issue.

ayg wrote:

*** Bug 14974 has been marked as a duplicate of this bug. ***

ayg wrote:

The comment from the duplicate may be useful.

(from bug 14974 comment #0)

The thumbnails are created by GD using the function imagecreatetruecolor(...)
to create a blank new image in which the thumbnail is created. So the
destination image is a true color image, which is therefore larger than the
source. When imagecreate(...) is used for the non-true-color images, these
thumbnails are much smaller.
The following code in media/Bitmap.php at about line 170 does this.

if ( imageistruecolor( $src_image ) ) {
        $dst_image = imagecreatetruecolor( $physicalWidth, $physicalHeight );
} else {
        $dst_image = imagecreate( $physicalWidth, $physicalHeight );
}

ImageMagick using the -depth 8 option also creates a full color image from the
source.

As has been pointed out above, more colors may be needed in the scaled version due to interpolation, so there's not necessarily an easy fix.

rene wrote:

(In reply to comment #18)

ImageMagick using the -depth 8 option also creates a full color image from the
source.

The -depth ImageMagick command line option may get an enhancement which could solve the problem, if I understand the following link correctly:
http://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=11013
I have not seen it yet in the latest version though.

plugwash wrote:

As has been pointed out above, more colors may be needed in the scaled version
due to interpolation, so there's not necessarily an easy fix.

Agreed, IMO the only proper fix to this and the other thumbnailer issues is to
give the editor more control over the thumbnailer output.

IMO only a human can judge how much color depth is really needed to give a
decent looking thumbnail of a particular image and whether it is sensible to
use lossy or lossless compression for that thumbnail.

The point of server-side scaling is to make images take up fewer pixels while still looking nice, not to take up fewer bytes. Inevitably, downscaling will sometimes cause images to get bigger on disk, unless you accept a loss in image quality. This is because downscaling works by moving information from fine spatial detail into fine colour detail, via antialiasing. Due to the nature of the PNG format, that colour detail is not as easily compressible as the spatial detail which gave rise to it, so the file size is bigger.

There are two special cases where a saving might be made:

The colours in the final image will be linear combinations of the colours in the source image. This allows for space savings in certain cases. If the colours in the source image are colinear when plotted in 3-d colour space, then the colours in the final image will be along the same line. For instance, if the source image is greyscale, then the destination image will be greyscale.

If there are very few colours in the source image, then the antialiased final image will have a limited number of colours also, due to the limited number of ways in which the colours can combine. Say there are 3 source colours. If you downscale by a factor of 2, then you might use a 2x2 grid in the source image to calculate each pixel in the destination image. There are then 3^4 = 81 possible 2x2 source image grids, and so at most 81 possible colours in the destination image. This allows you to map a paletted source image to a paletted destination image. However, this potential saving diminishes rapidly with increasing number of source colours, and with increasing scaling factor.
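
The count in the 3-colour example can be checked by brute force. The sketch below assumes plain averaging over a 2x2 block (real resampling filters weight pixels differently) and uses three arbitrary source colours; the function name is hypothetical. Since an average ignores the order of the four pixels, many of the 81 grids give the same result, so the number of distinct output colours is at most the number of multisets of 4 pixels drawn from 3 colours, which is 15.

```python
from itertools import product


def distinct_box_averages(colors, block_pixels=4):
    """Enumerate every ordered block of block_pixels pixels drawn from
    colors and count how many distinct averaged colours result."""
    grids = list(product(colors, repeat=block_pixels))
    # Every block has the same pixel count, so two blocks have equal
    # averages exactly when their per-channel sums are equal.
    sums = {
        tuple(sum(px[ch] for px in grid) for ch in range(3))
        for grid in grids
    }
    return len(grids), len(sums)


source_colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
grids, averages = distinct_box_averages(source_colors)
print(grids, "possible 2x2 grids,", averages, "distinct averaged colours")
```

Under these assumptions the destination palette is even smaller than the grid count suggests, although it still grows quickly as source colours or the scaling factor increase.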

I would argue that the initial report is naive, and that the bug should be closed as WONTFIX along the lines of comment 8: i.e. detection of the two special cases above can be filed separately as feature requests.

epriestley changed the task status from Declined to Resolved by committing Unknown Object (Diffusion Commit). Mar 4 2015, 8:21 AM
epriestley added a commit: Unknown Object (Diffusion Commit).
Aklapper changed the task status from Resolved to Declined. Mar 4 2015, 11:41 AM
Aklapper claimed this task.