Single Image Super-Resolution – A Quantitative Comparison

DOI : 10.17577/IJERTV4IS050888


Lisha P P

Department of Computer Engineering, Model Engineering College, Thrikakkara, Kochi, Kerala.

Jayasree V K

Department of Electronics Engineering, Model Engineering College, Thrikakkara, Kochi, Kerala

Abstract: Super-resolution (SR) techniques generate a high resolution (HR) image from low resolution (LR) images. Since an HR image contains more information than an LR image, it is in high demand across all applications of image analysis. High resolution images improve the pictorial information available for both human and automatic machine perception. This paper presents a comparison of well-known super-resolution techniques.

The performance of the algorithms is evaluated by means of objective image quality criteria: Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM) and Maximum Difference (MD). From the analysis we found that the learning based algorithm using a sparse dictionary performs better than the interpolation based methods.

Keywords: High resolution; Interpolation; Low resolution; Sparse Dictionary; Super resolution.

  1. INTRODUCTION

    Resolution enhancement is one of the most desirable goals in image processing, because it gives a human observer more information. Super-resolution is used to obtain a high resolution image from observed low resolution images. Image resolution is defined as the smallest measurable detail in a visual representation. The resolution of a digital image can be expressed in several ways, such as spatial, temporal, spectral or radiometric resolution. In this paper special emphasis is given to spatial resolution.

    Spatial or pixel resolution is defined by the spacing of pixels in an image and is measured as the number of column pixels by the number of row pixels. Since the 1970s, CCD and CMOS image sensors have been widely used in digital cameras. The number of sensor elements in a camera determines its resolution: a camera with fewer sensor elements generates LR images. One way to increase spatial resolution is to decrease the pixel size of the sensor, but this produces shot noise because the amount of light available to each pixel decreases. Another is to increase the sensor chip size, but this increases capacitance, which is undesirable. A cost effective method for increasing spatial resolution is therefore required to overcome the limitations of sensor and lens manufacturing technology. Because software implementation costs less than hardware techniques, super-resolution techniques are becoming increasingly popular.

    HR images are always desirable in applications such as satellite imaging, sports photography, medical imaging, microscopy, computer vision, remote sensing, surveillance systems, and target detection and recognition. They are also useful in high resolution video, compression, astronomy, etc. The need to zoom into images to analyze visual information also increases the demand for super-resolution [1].

  2. CLASSIFICATION

    Super-resolution techniques are categorized by the domain employed, the number of images involved and the actual reconstruction method. In terms of the domain used, super-resolution techniques are subcategorized as spatial domain and frequency domain. Initial attempts at super-resolution were made in the frequency domain, typically recovering high frequency components by taking advantage of the shifting and aliasing properties of the Fourier transform. The major advantage of the frequency domain approach is that it is simple and has low computational overhead, but it is less flexible.

    According to the number of low resolution images involved, super-resolution can be performed in two ways: single image (frame) super-resolution and multi image (frame) super-resolution. Single image super-resolution generates an HR image from a single low resolution image.

    In multi image super-resolution, a high resolution image is generated from multiple low resolution images. The basic approach in multi frame SR is to combine the non-redundant information contained in multiple low resolution images to generate a high resolution image. However, this approach usually takes more computation time than single image super-resolution, and its quality degrades when the magnification factor is large or few input images are available.

    In terms of the method applied for image reconstruction, SR techniques can be classified as interpolation based, reconstruction based and learning (example) based. Interpolation based methods such as nearest neighbor interpolation, bilinear interpolation, bicubic interpolation and Lanczos interpolation are simple, but their visual effect is unsatisfactory and the resulting image is blurred [1,2,3].

    The advantage of the learning based approach is that it requires very few LR images compared with other techniques. It is also faster, more versatile and supports high magnification factors. In this paper, we compare different methods of single image super-resolution through a quantitative analysis of the performance of several interpolation methods (nearest neighbor, bilinear and bicubic interpolation) and a learning based super-resolution algorithm using a sparse dictionary [4].

    The rest of the paper is organized as follows. Section 3 presents related work. Section 4 describes super-resolution using sparse representation. Sections 5 and 6 present the results and the conclusion, respectively.
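
    To make the interpolation based methods concrete, the following is a minimal Python sketch of nearest neighbor and bilinear upscaling on a grayscale image stored as a list of rows. It is illustrative only and is not the MATLAB implementation used in this paper. The function names and the corner-aligned coordinate mapping in `bilinear` are assumptions, and library routines may use a different pixel-center convention.

```python
def nearest_neighbor(img, scale):
    """Upscale a 2-D grayscale image (list of rows) by an integer factor
    by copying the closest source pixel."""
    h, w = len(img), len(img[0])
    return [[img[r // scale][c // scale] for c in range(w * scale)]
            for r in range(h * scale)]


def bilinear(img, scale):
    """Upscale by blending the four source pixels around each output sample.
    Assumes a corner-aligned mapping between output and source grids."""
    h, w = len(img), len(img[0])
    out_h, out_w = h * scale, w * scale
    out = []
    for r in range(out_h):
        # Map the output row back to a (fractional) source row.
        y = r * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for c in range(out_w):
            x = c * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

    Nearest neighbor only replicates pixels, which produces blocky edges, while bilinear averages neighbors, which smooths edges. Neither can restore high frequency detail that is absent from the LR image, which is consistent with the blur noted above.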

  3. RELATED WORK

    Tsai and Huang [5] contributed to research on multi image SR in the frequency domain. They introduced a frequency domain approach for HR image reconstruction that exploits aliasing in the LR images, under the assumption that the LR images are noise free and not blurred. It relies mainly on three concepts related to the Fourier transform:

    1. Shifting property

    2. The relationship between the continuous Fourier transform and the discrete Fourier transform

    3. The HR image is assumed to be band-limited
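
    The shifting property in item 1 can be checked numerically. The sketch below is a small Python illustration on a 1-D signal, not part of the Tsai-Huang method itself; the helper names are hypothetical. It verifies that circularly shifting a signal by s samples multiplies its k-th DFT coefficient by exp(-2πi·k·s/N).

```python
import cmath


def dft(signal):
    """Direct O(N^2) discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def circular_shift(signal, shift):
    """Delay the signal by `shift` samples with wrap-around."""
    n = len(signal)
    return [signal[(t - shift) % n] for t in range(n)]


def apply_shift_in_frequency(spectrum, shift):
    """Apply the same delay as a linear phase ramp on the spectrum."""
    n = len(spectrum)
    return [spectrum[k] * cmath.exp(-2j * cmath.pi * k * shift / n)
            for k in range(n)]
```

    Because a relative shift between LR frames appears only as a known phase factor, the spectra of several shifted, aliased LR frames can be related to the spectrum of a single HR image.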

    The advantages of the Tsai-Huang approach are its theoretical simplicity and low computational complexity. It also reduces hardware complexity by enabling parallel implementation. The Tsai-Huang approach was later modified to incorporate additive noise and blurring effects.

    In frequency domain based SR algorithms, the wavelet transform is an alternative to the Fourier transform. The problem with wavelet based methods is that they are inefficient at implementing the convolution filters that model degradation. Therefore, a combination of the two transforms was implemented later.

    In the spatial domain, multi image SR subcategory, algorithms mostly concentrate on the aliasing artifacts that are present in the observed LR images. The representative methods in this subcategory include iterative back projection (IBP), projection onto convex sets (POCS), Maximum Likelihood (ML) and so on.

    1) Iterative back projection (IBP) based methods

    In this method, initially a guess for the HR targeted image is needed and then it is refined. Such a guess can be obtained by registering the LR images over an HR grid and then averaging them. This initial guess can be refined by using the simulated imaging model with a set of available LR observations. Then the error between the simulated LR images and the observed ones is obtained and back-projected to the coordinates of the HR image to improve the initial guess. In this method the back-projected error is the mean of the errors that each LR image causes[,7].

    This method was later improved by replacing the mean of the errors with the median to obtain a faster algorithm. The main drawback of these methods is that the iteration does not always converge to one of the possible solutions.
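
    The IBP steps described above can be sketched as follows. This is a simplified 1-D Python illustration under stated assumptions: each LR signal is a circularly shifted, block-averaged copy of the HR signal, the shifts are known exactly, and the HR length is a multiple of the factor. The function names are hypothetical and the imaging model is far simpler than a real camera model.

```python
def observe(hr, shift, factor):
    """Simulated imaging model: shift, then average blocks of `factor` samples."""
    n = len(hr)
    return [sum(hr[(k * factor + shift + j) % n] for j in range(factor)) / factor
            for k in range(n // factor)]


def back_project(lr, shift, factor, n):
    """Spread each LR sample back over the HR positions it was averaged from."""
    hr = [0.0] * n
    for k, value in enumerate(lr):
        for j in range(factor):
            hr[(k * factor + shift + j) % n] = value
    return hr


def residual(hr, observations, factor):
    """Sum of squared errors between observed and simulated LR signals."""
    return sum((o - s) ** 2
               for shift, lr in observations
               for o, s in zip(lr, observe(hr, shift, factor)))


def ibp(observations, factor, n, iterations=50):
    """observations: list of (shift, lr_signal) pairs; n: HR length."""
    m = len(observations)
    # Initial guess: register every LR signal on the HR grid and average them.
    projected = [back_project(lr, shift, factor, n) for shift, lr in observations]
    hr = [sum(col) / m for col in zip(*projected)]
    for _ in range(iterations):
        correction = [0.0] * n
        for shift, lr in observations:
            simulated = observe(hr, shift, factor)
            error = [o - s for o, s in zip(lr, simulated)]
            # Back-project the mean of the errors caused by each LR signal.
            for i, e in enumerate(back_project(error, shift, factor, n)):
                correction[i] += e / m
        hr = [h + c for h, c in zip(hr, correction)]
    return hr
```

    In this toy setting the refined estimate becomes consistent with the LR observations, but it need not equal the true HR signal, because several HR signals can produce the same observations.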

    2) Projection onto convex sets (POCS)

    In the POCS method each LR image imposes a priori knowledge on the final solution, and this a priori knowledge is expressed as a closed convex set. Different kinds of a priori knowledge can be used with the POCS method. The results need not be accurate if some of the LR images suffer from partial occlusion, but this can be improved by using a validity map to disable the projections that involve inaccurate information [7,8].
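
    As an illustration of alternating projections, the Python sketch below uses two convex sets chosen for simplicity: signals whose block averages match a single LR observation, and signals whose amplitudes lie in [0, 255]. Both sets and all function names are assumptions made for this example; practical POCS formulations use constraint sets derived from the actual imaging model.

```python
def project_data_consistency(hr, lr, factor):
    """Project onto {x : every block of `factor` samples averages to its LR sample}.
    The orthogonal projection adds the same offset to each sample in a block."""
    out = list(hr)
    for k, target in enumerate(lr):
        block = out[k * factor:(k + 1) * factor]
        offset = target - sum(block) / factor
        out[k * factor:(k + 1) * factor] = [v + offset for v in block]
    return out


def project_amplitude(hr, low=0.0, high=255.0):
    """Project onto the box {x : low <= x_i <= high} by clipping."""
    return [min(max(v, low), high) for v in hr]


def pocs(initial, lr, factor, iterations=100):
    """Alternate the two projections, starting from an initial HR guess."""
    hr = list(initial)
    for _ in range(iterations):
        hr = project_data_consistency(hr, lr, factor)
        hr = project_amplitude(hr)
    return hr
```

    When the sets have a common point, repeated projection moves the estimate toward it. Further a priori knowledge can be added as an extra projection step in the loop.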

    3) Maximum Likelihood (ML)

    The ML solution of an SR problem is sensitive to small disturbances, such as noise or errors in the estimation of the imaging parameters, and there might not be a unique solution. To deal with these problems, additional information is needed to constrain the solution. Such information can be a priori knowledge about the desired image. The a priori term can then favor a specific solution over others when the solution is not unique [7,8].
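
    In symbols, and assuming an observation model y_k = A_k x + n_k with independent Gaussian noise (the operators A_k, the weight λ and the prior R(x) are notation introduced here for illustration, not taken from the cited works), the ML estimate and its constrained form can be written as:

```latex
\hat{x}_{\mathrm{ML}}
  = \arg\max_{x} \; p(y_1,\dots,y_m \mid x)
  = \arg\min_{x} \sum_{k=1}^{m} \lVert y_k - A_k x \rVert_2^2 ,
\qquad
\hat{x}_{\mathrm{reg}}
  = \arg\min_{x} \sum_{k=1}^{m} \lVert y_k - A_k x \rVert_2^2 + \lambda \, R(x).
```

    The second expression adds the a priori term R(x), which selects one solution among those that fit the observations equally well.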

    4) Maximum a posteriori (MAP)

    MAP, proposed by Schultz and Stevenson, is a typical probabilistic method. In this method image super-resolution reconstruction is treated as a statistical estimation problem [7,8]. The accuracy of multiple image based SR algorithms depends heavily on the accuracy with which the motion between the LR observations is estimated, and this estimation becomes more unstable in real world applications where different objects in the same scene can have different and complex motions. In such situations, single image based SR algorithms are better. These algorithms are either reconstruction based (similar to multiple image based algorithms) or learning based.

    Learning based single image SR algorithms are also known as hallucination algorithms. They are based on statistical and machine learning approaches. They contain a training step in which the relationship between some HR examples (from a specific class such as face images, fingerprints, etc.) and their LR counterparts is learned. This learned knowledge is then incorporated into the a priori term of the reconstruction [9,10].

    Models that are commonly exploited in single image SR methods include image smoothness, geometric regularity of image structures, gradient profile priors, self-similarity of image patches within and across different scales in the same image and sparsity. Sparsity suggests that a high-frequency signal can be accurately recovered from its corresponding low-frequency representation.

    Priyadarshini D et al. [11] described different methods of single image super-resolution and multi image super-resolution. They discussed the effectiveness of the IBP method in computer aided tomography. The results of these methods can be improved if the noise detection module is made more accurate.

    A technical survey by Sung Cheol Park et al. [12] explains SR technology and provides an outline of the main SR approaches and related issues. The article begins by illustrating the need for super-resolution. It then discusses methods to improve resolution and the research on SR algorithms. An observation model relating the input LR image to the output HR image is formulated. The authors also emphasize the role of SR algorithms in compression systems.

    Jian Zhang et al. [13] described image super-resolution via dual-dictionary learning and sparse representation. In this method the high frequency (HF) content to be estimated is considered as a combination of two components: main high-frequency (MHF) and residual high-frequency (RHF). The authors proposed an image super-resolution method using dual-dictionary learning and sparse representation, which consists of main dictionary learning and residual dictionary learning, to recover MHF and RHF respectively.

    Zhiliang Zhu et al. [14] described fast single image super-resolution using self-example learning and sparse representation. They used the K-singular value decomposition (K-SVD) algorithm and a straightforward orthogonal matching pursuit algorithm.

    Kevin Su et al. [15] addressed the neighborhood issue. They discussed how to find the correlation between low resolution patches and their corresponding HR image patches, but the results of this method vary with the accuracy of both the feature extraction process and the reconstruction function.

  4. SUPER-RESOLUTION USING SPARSE REPRESENTATION [16,17,18, 19, 20, 21, 22]

    In the sparse coding representation of an image, the term basis refers to a set of images that capture some features, characteristics or properties of the original image. A linear combination of these basis images is used to represent an image, i.e.

    $$x = \sum_{i=1}^{n} s_i b_i$$

    where $x$ is the represented image, $b_i$ is the $i$-th basis image and $s_i$ is its corresponding coefficient in the linear combination. Sparse coding can represent images using only a few active coefficients. This makes sparse representations easy to interpret and manipulate, and facilitates efficient content-based image indexing and retrieval.

    Representing a signal involves the choice of a dictionary, which is the set of elementary signals or atoms used to decompose the signal. When the dictionary forms a basis, every signal is uniquely represented as a linear combination of the dictionary atoms. Over-complete dictionaries, in contrast, have more atoms than the dimension of the signal, which promises to represent a wider range of signal phenomena. Sparse coding allows the basis to be over-complete and requires the coefficients to be sparse. Learning a dictionary directly from data often leads to a better adaptation of the dictionary and has been successful in applications where pre-defined dictionaries are either unavailable or not applicable.

    The fundamental assumption of the sparse representation method is that an LR image may contain a large number of similar patches (sub-images) carrying the same information, both at the same scale and across different scales. Similar patches at the same scale are regarded as patches from different LR images, whereas those at different scales are treated as HR image patches [4]. Research on image statistics suggests that image patches can be well represented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary, and that the HR output can be generated from the coefficients of the sparse representation of the LR image.

    By jointly training two dictionaries for the low- and high-resolution image patches, we can enforce the similarity of the sparse representations of a low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore the sparse representation of a low resolution image patch can be applied to the high resolution image patch dictionary to generate a high resolution image patch.

    Let $D \in \mathbb{R}^{n \times K}$ be an over-complete dictionary of $K$ atoms ($K \gg n$). A signal $x \in \mathbb{R}^{n}$ can be represented as a sparse linear combination with respect to $D$. That is, the signal $x$ can be written as $x = D\alpha_0$, where $\alpha_0 \in \mathbb{R}^{K}$ is a vector with very few ($\ll n$) nonzero entries. In practice, we can write it as

    $$y = Lx = LD\alpha_0 \qquad (1)$$

    where $L \in \mathbb{R}^{k \times n}$ is a projection matrix. In the super-resolution context, $x$ is a high-resolution image (patch), while $y$ is its low-resolution counterpart (or features extracted from it). If the dictionary $D$ is over-complete, then the equation

    $$x = D\alpha \qquad (2)$$

    can have a sparsest solution $\alpha_0$, and it is unique. Any sufficiently sparse linear representation of a high-resolution image patch $x$ in terms of $D$ can be recovered almost perfectly from the low-resolution image patch.
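
    The idea of coding an LR patch against an LR dictionary and reusing the coefficients with an HR dictionary can be sketched in Python as follows. This is a toy illustration under several assumptions and is not the algorithm evaluated in this paper. The six HR atoms are hand-picked rather than learned, the LR dictionary is obtained by applying a block-averaging operator (playing the role of L) to the HR atoms instead of by joint training, and plain matching pursuit stands in for the sparse solver.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def downsample(patch, factor=2):
    """Toy projection L: average non-overlapping blocks of `factor` samples."""
    return [sum(patch[i:i + factor]) / factor
            for i in range(0, len(patch), factor)]


def matching_pursuit(y, atoms, max_atoms=2, tol=1e-9):
    """Greedy sparse coding: repeatedly pick the atom most correlated with
    the residual and subtract its contribution."""
    coeffs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(max_atoms):
        if math.sqrt(dot(residual, residual)) <= tol:
            break
        scores = [abs(dot(residual, d)) / math.sqrt(dot(d, d)) for d in atoms]
        j = scores.index(max(scores))
        c = dot(residual, atoms[j]) / dot(atoms[j], atoms[j])
        coeffs[j] += c
        residual = [r - c * a for r, a in zip(residual, atoms[j])]
    return coeffs


def reconstruct(coeffs, atoms):
    """Form the linear combination sum_j coeffs[j] * atoms[j]."""
    return [sum(c * d[i] for c, d in zip(coeffs, atoms))
            for i in range(len(atoms[0]))]


# Hypothetical over-complete HR dictionary: K = 6 atoms of dimension n = 4.
HR_ATOMS = [[1, 1, 1, 1], [1, 1, -1, -1], [1, 0, 0, 0],
            [0, 0, 0, 1], [1, 1, 3, 3], [3, 3, 1, 1]]
# Matching LR dictionary, here simply L applied to every HR atom.
LR_ATOMS = [downsample(d) for d in HR_ATOMS]


def super_resolve_patch(lr_patch):
    alpha = matching_pursuit(lr_patch, LR_ATOMS)  # sparse code of y
    return reconstruct(alpha, HR_ATOMS)           # x = D * alpha
```

    For example, the LR patch [2.0, 6.0] is coded with the single atom [1, 3] of the LR dictionary, and the same coefficient applied to the paired HR atom yields the HR patch [2.0, 2.0, 6.0, 6.0].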

    The real challenge of learning-based SR methods, however, lies in selecting proper training data and a proper learning model for SR of an unseen target image.

  5. RESULT AND DISCUSSION

    We implemented the interpolation algorithms and a learning based algorithm using a sparse dictionary in MATLAB. A comparative study based on PSNR, SSIM and MD was performed. These measures were computed for nearest neighbor, bilinear and bicubic interpolation and for the learning based algorithm. The experiments were conducted on 512 × 512 standard test images.

    The results are shown in Fig. 1 and the quantitative measures are tabulated in Table I, Table II and Table III. On every test image the learning based algorithm using a sparse dictionary attains higher PSNR and SSIM and a lower MD than the interpolation based methods.
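
    The paper does not list the formulas behind the three measures, so the Python sketch below uses common definitions as an assumption: PSNR computed from the mean squared error with a peak value of 255, MD as the largest absolute pixel difference, and a simplified SSIM computed from global image statistics with the usual constants (0.01·255)² and (0.03·255)². The standard SSIM averages the same expression over local windows, so values from this sketch are not directly comparable with those in Table II.

```python
import math


def _flatten(img):
    """Turn a list-of-rows image into a flat list of floats."""
    return [float(v) for row in img for v in row]


def psnr(ref, test, peak=255.0):
    """Peak signal to noise ratio; infinite for identical images."""
    a, b = _flatten(ref), _flatten(test)
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def max_difference(ref, test):
    """Largest absolute pixel-wise difference (lower is better)."""
    return max(abs(x - y) for x, y in zip(_flatten(ref), _flatten(test)))


def global_ssim(ref, test, peak=255.0):
    """SSIM expression evaluated once on whole-image statistics."""
    a, b = _flatten(ref), _flatten(test)
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
```

    Higher PSNR and SSIM indicate a reconstruction closer to the reference, whereas a lower MD indicates a smaller worst-case pixel error.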

    TABLE I. PERFORMANCE COMPARISON OF THE DIFFERENT APPROACHES BASED ON PSNR.

    Image     Nearest Neighbor   Bilinear   Bicubic   Learning-based
    Image1    45.65              45.12      44.79     49.34
    Image2    36.56              35.89      35.98     54.5
    Image3    45.9               45.62      45.2      48.65
    Image4    38.59              37.98      38.35     40.53

    TABLE II. PERFORMANCE COMPARISON OF DIFFERENT APPROACHES BASED ON SSIM

    Image     Nearest Neighbor   Bilinear   Bicubic   Learning-based
    Image1    0.97               0.97       0.97      0.99
    Image2    0.95               0.94       0.96      0.99
    Image3    0.97               0.97       0.98      0.99
    Image4    0.93               0.9        0.94      0.99

    TABLE III. PERFORMANCE COMPARISON OF DIFFERENT APPROACHES BASED ON MD

    Image     Nearest Neighbor   Bilinear   Bicubic   Learning-based
    Image1    43                 42         66        22.15
    Image2    48                 69         96        2.9
    Image3    60                 68         104       19.14
    Image4    52                 60         77        29.6

    Fig. 1. Resolution enhancement demonstrated on various test images: (i) original image, (ii) nearest neighbor, (iii) bilinear, (iv) bicubic, (v) learning based method.

  6. CONCLUSION

In this paper, we have studied several methods of single image super-resolution. Image quality assessment was performed for the interpolation based algorithms and for a learning based algorithm using a sparse dictionary. The distortion between two images was measured on the basis of their pixel-wise differences with Peak Signal to Noise Ratio (PSNR) and Maximum Difference (MD), and on the basis of structural similarity with the Structural Similarity Index (SSIM). From the results we found that the learning based SR algorithm using a sparse dictionary produces better results than the other methods.

REFERENCES

  1. Jiji C. V., Single Image Super-resolution, Thesis, IIT Bombay, 2007.

  2. Kamal Nasrollahi, Thomas B. Moeslund, Super-resolution: A comprehensive survey, Visual Analysis of People Laboratory, Aalborg University, Denmark. May 2014.

  3. Sdhimil Shijo and V K Gvindan, An Improved Approach to Super-resolution, International Journal of Engineering Research & Technology, ISSN: 2278-0181, Vol. 3 Issue 12, December 2014.

  4. Yang, J., Wright, J., Huang, T., Ma, Y., Image super-resolution via sparse representation, IEEE ICIP, 2010.

  5. R.Y. Tsai and T.S. Huang, Multiple frame image restoration and registration, in Advances in Computer Vision and Image Processing. Greenwich, CT: JAI Press Inc., 1984, pp. 317-339.

  6. S. Kim, N. Bose, and H.M. Valenzuela, Recursive reconstruction of High resolution Image from Noisy undersampled Multiframes, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, no. 6, pp. 1013-1027, 1990.

  7. Patel Shreyas A, Novel Iterative Back Projection Approach, Journal of Computer Engineering, Volume 11, Issue 1 (May. – Jun. 2013), pp. 65-69.

  8. Patel Shreyaskumar A, A Survey On Single Image Superresolution, International Journal of Computer Science and Information Technology & Security (IJCSITS), Vol. 2, No. 6, December 2012.

  9. Tomer Peleg, Michael Elad, A Statistical Prediction Model Based on Sparse Representations for Single Image Super-Resolution, IEEE Transactions on Image Processing, vol. 23, no. 6, June 2014.

  10. Min-Chun Yang and Yu-Chiang Frank Wang, A Self-Learning Approach to Single Image Super-Resolution, IEEE Transactions on Multimedia, vol. 15, no. 3, April 2013.

  11. Priyadarshini D Shinde, S.L. Nalbalwar, Image Super-resolution, International Journal of Science, Engineering and Technology Research, Volume 2, Issue 7, July 2013.

  12. Sung Cheol Park, Min Kyu Park, and Moon Gi Kang, Super-Resolution Image Reconstruction: A Technical Overview, IEEE Signal Processing Magazine, 2003.

  13. Jian Zhang, Chen Zhao, Ruiqin Xiong, Siwei Ma, and Debin Zhao, Image Super-resolution via Dual-dictionary Learning and Sparse Representation.

  14. Zhiliang Zhu, Fangda Guo, Hai Yu and Chen Chen, Fast Single Image Super-Resolution via Self-Example Learning and Sparse Representation.

  15. Kevin (Xu) Su, Qi Tian, Qing Xue, Nicu Sebe, and Jingsheng Ma, Neighborhood Issue in Single-frame Image Super-resolution, 2005 IEEE.

  16. Bc. Tomáš Lukeš, Super-Resolution Methods for Digital Image and Video Processing, Diploma thesis, January 2013.

  17. Shu Zhang, Example-based Super-resolution, Thesis, The Australian National University Engineering and Computer Science College, May 2013, Canberra, Australia.

  18. Sudarshan, R Venkatesh Babu, Super Resolution via Sparse Representation in l1 Framework, ICVGIP '12, December 16-19, 2012, Mumbai, India.

  19. Jie Xu, Cheng Deng, Xianglong Liu, and Jie Li, Image Super-resolution Based on Sparse Representation With Joint Constraints, ICIMCS '14, July 10-12, 2014.

  20. Jianchao Yang, Zhaowen Wang, Zhe Lin, Scott Cohen, and Thomas Huang, Coupled Dictionary Training for Image Super-resolution, IEEE Transactions on Image Processing, March 13, 2012.

  21. Ron Rubinstein, Alfred M. Bruckstein, Michael Elad, Dictionaries for Sparse Representation Modeling, Proceedings of the IEEE, Vol. 98, No. 6, June 2010.

  22. Michal Aharon, Michael Elad, and Alfred M. Bruckstein, On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them, Linear Algebra and its Applications, Elsevier.
