Many imaging and image processing methods are evaluated by how well
the images they output resemble some given image. Examples include
image data compression, dithering algorithms, and flat-panel display
and printer design. In all of these cases, the human visual system is
the judge of image fidelity. Most of these methods use the mean
squared error (MSE) or root mean squared error (RMSE) between the two
images as a measure of visual distortion. These measures are popular
largely because of their analytical tractability, yet it has long
been recognized that MSE (or RMSE) is inaccurate in predicting
perceived distortion. This is illustrated by the following
paradoxical example.
The top two images on the right were created by adding different
types of distortion to the original image, which is shown below them.
The RMSE between each distorted image and the original was computed.
The RMSE is the square root of the average squared difference between
every pixel in the distorted image and its counterpart in the
original image.
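As a concrete illustration of this computation, the following is a
minimal sketch in Python with NumPy; the synthetic 8-bit test images
are placeholders, not the images from the example.

```python
import numpy as np

def rmse(original: np.ndarray, distorted: np.ndarray) -> float:
    """Root mean squared error between two equally sized grayscale images."""
    if original.shape != distorted.shape:
        raise ValueError("images must have the same dimensions")
    diff = original.astype(np.float64) - distorted.astype(np.float64)
    # MSE: average of the squared per-pixel differences.
    mse = np.mean(diff ** 2)
    # RMSE: square root of the MSE.
    return float(np.sqrt(mse))

# Placeholder example: a constant 8-bit image and a noisy copy of it.
rng = np.random.default_rng(0)
original = np.full((64, 64), 128, dtype=np.uint8)
noise = rng.integers(-10, 11, size=original.shape)
distorted = np.clip(original.astype(int) + noise, 0, 255).astype(np.uint8)
print(f"RMSE: {rmse(original, distorted):.2f}")
```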
The RMSE between the first distorted image and the original is 8.5,
while the RMSE between the second distorted image and the original is
9.0. Although the RMSE of the first image is lower than that of the
second, the distortion in the first image is more visible than the
distortion added to the second. Thus, RMSE is a poor indicator of
perceptual image fidelity.