The main task of any image compression technique is to store a small amount of data from which the original image can be reconstructed. Many image compression standards currently exist to perform this operation -- the most commonly used being the popular JPEG and JPEG 2000 standards. Both specify a codec that defines how an image is compressed into a stream of bytes and decompressed back into an image.
As popular as such standards are, Professor Joachim Weickert -- a member of the faculty of mathematics and computer science at Saarland University in Saarbrücken, Germany -- is not content with their performance. For many years, his team has been working to develop alternative approaches that perform better at higher compression rates.
At this year’s First European Machine Vision Forum -- a two-day series of seminars held on 8 and 9 September in Heidelberg, Germany -- Professor Weickert presented the results of his research to an audience of over one hundred engineers and scientists from many of the leading universities and engineering companies in Europe.
Professor Weickert began by reminding the audience that the JPEG compression algorithm operates by decomposing images into 8 x 8 pixel blocks. Within each block, a frequency analysis is performed using the Discrete Cosine Transform (DCT), and the resulting higher-frequency coefficients are quantized with coarser precision than the lower-frequency ones.
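As a rough illustration of that idea -- not code from the JPEG standard itself, whose quantization tables are standardized rather than the invented frequency-dependent steps used here -- the following Python sketch transforms a single 8 x 8 block with an orthonormal 2D DCT and quantizes higher frequencies more coarsely:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis: row k holds the k-th cosine basis vector.
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def jpeg_style_block_code(block, quality_step=16):
    # Forward 2D DCT of one 8x8 block, then uniform quantization whose
    # step grows with frequency index (coarser precision at high frequencies).
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T
    k = np.arange(block.shape[0])
    steps = quality_step * (1 + k[:, None] + k[None, :])
    quantized = np.round(coeffs / steps)
    # Decoder side: dequantize and invert the transform.
    recon = d.T @ (quantized * steps) @ d
    return quantized, recon
```

For smooth blocks most quantized coefficients become zero, which is what makes the subsequent entropy coding effective.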
At compression rates of 10:1, the algorithm performs well enough for an image to be compressed and then decompressed at reasonable quality. However, at higher compression rates such as 80:1, the quality deteriorates significantly.
To create a more effective way to compress images, Professor Weickert and his team have been studying the potential of in-painting techniques. Also referred to as image interpolation, in-painting uses sophisticated diffusion algorithms for image compression. In this technique, only a few of the original pixels in an image are stored, and a mathematical diffusion operator then paints in the values between those stored pixels to regenerate the image.
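The core of the diffusion idea can be sketched in a few lines of Python. This toy solver -- an illustration of the general approach, not the team's actual code -- clamps the stored pixels to their values and repeatedly replaces every unknown pixel with the average of its four neighbours, which converges to the steady state of homogeneous diffusion:

```python
import numpy as np

def inpaint_homogeneous(known_mask, known_values, n_iters=2000):
    """Reconstruct an image from a sparse pixel mask by homogeneous diffusion.

    A Jacobi-style solver for the Laplace equation: unknown pixels are
    repeatedly averaged with their 4-neighbourhood while the stored
    pixels stay clamped to their original values.
    """
    img = np.zeros_like(known_values, dtype=float)
    img[known_mask] = known_values[known_mask]
    for _ in range(n_iters):
        # Average of the 4-neighbourhood, with edge replication at borders.
        padded = np.pad(img, 1, mode="edge")
        avg = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        img = np.where(known_mask, known_values, avg)
    return img
```

Storing only the two outer columns of a horizontal gradient, for example, lets the solver recover the linear ramp in between.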
“As simple as that might sound, there are some important considerations that need to be addressed to develop an image compression methodology based on the idea. The first of these is how to determine which pixels should be chosen to perform the best reconstruction of the image. The second is how to ascertain the best diffusion process that should actually be used for the data in-painting. Lastly, a method of encoding needs to be chosen so that the selected pixels can be stored efficiently. To complicate matters further, all three issues are highly interrelated,” said Professor Weickert.
To highlight how optimum data might be chosen, Professor Weickert showed how a Laplacian mathematical operator could be used to highlight regions of rapid intensity change in a grey-scale image of Felix Klein, after which a dithering process could be applied to select a percentage of pixels proportional to their magnitude. The data contained within that mask can then be used to reconstruct the image by in-painting with a homogeneous diffusion technique.
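A simplified version of such a mask-selection step might look as follows. Here random sampling weighted by the Laplacian magnitude stands in for the error-diffusion dithering described in the talk, which distributes points more evenly:

```python
import numpy as np

def laplacian_mask(image, fraction=0.1, rng=None):
    """Pick a sparse pixel mask with density proportional to Laplacian magnitude.

    A 5-point discrete Laplacian highlights regions of rapid intensity
    change; pixels are then sampled with probability proportional to that
    magnitude, so edges receive more mask points than flat regions.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    padded = np.pad(image.astype(float), 1, mode="edge")
    lap = np.abs(padded[:-2, 1:-1] + padded[2:, 1:-1] +
                 padded[1:-1, :-2] + padded[1:-1, 2:] -
                 4 * padded[1:-1, 1:-1])
    weights = lap.ravel() + 1e-12            # avoid an all-zero distribution
    n_points = max(1, int(fraction * image.size))
    flat = rng.choice(image.size, size=n_points, replace=False,
                      p=weights / weights.sum())
    mask = np.zeros(image.shape, dtype=bool)
    mask.ravel()[flat] = True
    return mask
```

On an image containing a step edge, nearly all of the selected mask points cluster along the edge, exactly where the diffusion process needs them.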
The relatively simple technique enabled Professor Weickert to demonstrate how even an Intel Centrino Duo laptop could be used to capture an image, perform the encoding and then the reconstruction of the image in real-time.
Professor Weickert then discussed an even more promising alternative using discrete data optimization. In this approach, pixels that are deemed to be less important for image reconstruction are gradually removed from the image in a process known as probabilistic sparsification. The results can be improved further by tonal optimization, in which the grey scale values of the pixels stored in the mask are either increased or decreased. By selecting the data in this more rigorous fashion, the Mean-Squared Error -- a common means of measuring the quality of an image -- was reduced by a factor of 10-20 compared with the previous approach.
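A toy one-dimensional version of probabilistic sparsification conveys the idea (tonal optimization is omitted for brevity). In 1D, the steady state of homogeneous diffusion in-painting is exactly linear interpolation between the stored samples, which keeps the sketch short; the candidate counts and error measure here are illustrative assumptions:

```python
import numpy as np

def interp_reconstruct(mask_idx, signal):
    # 1D homogeneous diffusion in-painting: its steady state is exactly
    # linear interpolation between the stored samples.
    x = np.arange(len(signal))
    return np.interp(x, mask_idx, signal[mask_idx])

def probabilistic_sparsification(signal, keep=6, candidates=4, rng=None):
    """Toy 1D probabilistic sparsification.

    Start from all samples; in each step draw a few random removal
    candidates, test-remove each one, and permanently discard the one
    whose removal raises the mean-squared reconstruction error the least.
    """
    rng = np.random.default_rng(1) if rng is None else rng
    mask = list(range(len(signal)))
    while len(mask) > keep:
        inner = mask[1:-1]   # keep the endpoints so interpolation is defined
        trial = rng.choice(inner, size=min(candidates, len(inner)),
                           replace=False)
        best, best_err = None, np.inf
        for idx in trial:
            reduced = np.array([i for i in mask if i != idx])
            err = np.mean((interp_reconstruct(reduced, signal) - signal) ** 2)
            if err < best_err:
                best, best_err = idx, err
        mask.remove(best)
    return np.array(mask)
```

Run on a piecewise-linear signal, the procedure discards the redundant samples and retains the kink, the one point the reconstruction genuinely needs.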
Optimizing the process
Having described how to optimize the data in an image, Professor Weickert addressed the issue of optimizing the diffusion process itself. Having previously shown that in-painting could be achieved effectively using a homogeneous diffusion operator, he noted that the technique still required a considerable amount of data to represent the edges of features in an image. He then showed how it could be replaced by an edge-enhancing anisotropic diffusion (EED) operator. Originally designed for de-noising images, EED provides stronger in-painting along the edges in an image than across them. This EED operator, he added, was the most successful in-painting process that his researchers had found so far.
To highlight the advantages of the EED in-painting operator, Professor Weickert showed a simple test image containing three small dark circles with sections highlighted in white. In-painting with a homogeneous diffusion process recovered the image less satisfactorily than applying the EED operator. Indeed, the EED approach yielded an image that was almost perfect.
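The full EED operator steers a diffusion tensor along image edges. As a simpler scalar stand-in -- not Weickert's tensor-driven scheme -- the classic Perona-Malik diffusion already shows the key effect of suppressing smoothing across strong edges:

```python
import numpy as np

def perona_malik_step(img, dt=0.2, kappa=0.1):
    """One explicit step of Perona-Malik nonlinear diffusion.

    A scalar stand-in for tensor-driven EED: the diffusivity g shrinks
    where the gradient is large, so smoothing is suppressed across edges.
    (Full EED additionally steers the flux *along* edges via a tensor.)
    """
    def g(d):
        # Edge-stopping diffusivity: near 1 in flat areas, near 0 at edges.
        return 1.0 / (1.0 + (d / kappa) ** 2)

    p = np.pad(img, 1, mode="edge")
    # Differences to the four neighbours.
    dn = p[:-2, 1:-1] - img
    ds = p[2:, 1:-1] - img
    de = p[1:-1, 2:] - img
    dw = p[1:-1, :-2] - img
    return img + dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
```

Iterating this step on an image with a sharp step edge leaves the edge essentially intact, whereas homogeneous diffusion would quickly blur it away.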
Professor Weickert then turned to the issue of optimizing the encoding of the data, noting that storing a freely optimized data mask can be expensive, and that a more effective approach might be to use the EED technique and then find a slightly suboptimal mask that is cheaper to encode. Noting that the choice of mask points influences the quality of the in-painting results, he went on to discuss one effective scheme in which the mask points were restricted to an adaptive grid, allowing their locations to be encoded in a binary tree.
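One way such an adaptive grid can be encoded is with a tree of split flags. The quadtree sketch below is a hypothetical stand-in for the binary tree described in the talk: each cell emits a single bit (split or leaf), so the positions of sparsely clustered mask points cost far fewer bits than storing a dense mask:

```python
import numpy as np

def quadtree_bits(mask, x0, y0, w, h, bits):
    """Encode mask-point locations on an adaptive grid as split flags.

    Each cell emits one bit: 1 = the cell is subdivided further,
    0 = it is a leaf (uniform or a single pixel).  The resulting
    bitstream implicitly encodes where the mask points sit, so no
    raw coordinates need to be stored.
    """
    cell = mask[y0:y0 + h, x0:x0 + w]
    if (w <= 1 and h <= 1) or not cell.any() or cell.all():
        bits.append(0)          # leaf
        return
    bits.append(1)              # split into up to four sub-cells
    hw, hh = max(w // 2, 1), max(h // 2, 1)
    for sx, sy, sw, sh in [(x0, y0, hw, hh),
                           (x0 + hw, y0, w - hw, hh),
                           (x0, y0 + hh, hw, h - hh),
                           (x0 + hw, y0 + hh, w - hw, h - hh)]:
        if sw > 0 and sh > 0:
            quadtree_bits(mask, sx, sy, sw, sh, bits)
```

A single mask point in an 8 x 8 grid, for instance, is pinned down by a 13-bit tree instead of a 64-bit dense mask.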
To demonstrate the effectiveness of the idea, Professor Weickert compared the results of the optimized EED encoding technique with those of the JPEG 2000 standard -- which superseded the original discrete cosine transform-based JPEG standard with a wavelet-based encoding scheme. Here, he demonstrated the differences between the two by showing the recovered image of a woman’s face at high compression rates of around 160:1, 200:1 and 260:1. Using the JPEG 2000 standard, it was hardly possible to recognize the face of the woman at such rates. However, the EED process coupled with the adaptive grid masking process still enabled the face to be easily recognized.
Lastly, Professor Weickert described some extensions and applications of the new image compression techniques he had been working on. In one such extension, he described how color images could be compressed. To do so he highlighted how the RGB representation of an image of peppers could be transformed into Y, Cb and Cr space, and how the Y channel -- which represents the brightness in an image -- could be stored at higher precision than the other two channels and later used to guide the in-painting of the Cb and Cr chroma channels. The results clearly demonstrated how the approach outperformed the JPEG 2000 standard.
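The colour transform itself is standard. A sketch using the full-range ITU-R BT.601 conversion, with a hypothetical coarser quantization step for the chroma channels (the step sizes are illustrative assumptions, not values from the talk), might look like this:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    # Full-range ITU-R BT.601 conversion: Y carries the brightness,
    # while Cb and Cr carry chroma and tolerate coarser storage.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def quantize(channel, step):
    # A coarser step means fewer distinct stored values for that channel.
    return np.round(channel / step) * step

# Illustrative choice: brightness kept fine, chroma stored coarsely.
def compress_channels(rgb, y_step=4, chroma_step=16):
    y, cb, cr = rgb_to_ycbcr(rgb)
    return quantize(y, y_step), quantize(cb, chroma_step), quantize(cr, chroma_step)
```

Because the eye is far more sensitive to brightness than to chroma, the coarse Cb/Cr channels cost little perceived quality, and the finely stored Y channel can then guide their in-painting.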
He noted too that the new approach was not restricted to encoding 2D images. Diffusion-based encoding processes can easily be used to encode 3D images as well. Furthermore, by employing progressive mode coding, a coarse representation of the image data can be transmitted incrementally over a network. As more data is transmitted, the quality of the received image increases accordingly.
Professor Weickert believes that after the optimized EED encoding technique has been perfected, it may be submitted to a standards group such as the Joint Photographic Experts Group, who may consider establishing it as a standard. “In the meantime, many systems builders looking to effectively transfer images at high compression rates may be able to take advantage of the technique, which offers a novel way to compress data that can be tailored to specific needs and applications,” he said.
By Dave Wilson, Senior Editor, Novus Light Technologies Today