- Open Access
- Authors : P.Arulselvan, C.Karthik, M.Peer Mohamed
- Paper ID : IJERTV2IS90644
- Volume & Issue : Volume 02, Issue 09 (September 2013)
- Published (First Online): 19-09-2013
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
FPGA Implementation of Efficient Fast Convolution Architecture Based Discrete Wavelet Transform
P. Arulselvan, C. Karthik, M. Peer Mohamed
PG Scholars, Government College of Technology, Coimbatore-641 013
Abstract
This paper presents a VLSI design approach for an efficient, high-speed 1D discrete wavelet transform (DWT) that reduces the hardware complexity and shortens the critical path to the delay of a single multiplier. The hardware requirement is a major concern in DWT processing. The system is verified using (9,7) filter coefficients on a Xilinx Spartan-3E field-programmable gate array (FPGA) device without accessing any external memory. It is observed that approximating the constant multipliers in the DWT increases the speed and reduces the hardware required to compute the transform. As a result, the developed design has a shorter combinational path delay, consumes less power, and provides very high-speed processing.
Keywords- DWT, Fast Convolution, VLSI, FPGA
-
Introduction
The discrete wavelet transform (DWT) has gained wide popularity due to its excellent de-correlation property. Many modern image and video compression systems embody the DWT as the transform stage. It is widely recognized that the (9,7) filters are among the best filters for DWT-based image compression [1]. In fact, the JPEG2000 image coding standard employs the (9,7) filters as the default wavelet filters for lossy compression.
Lifting and convolution are the two computing approaches used to realize the discrete wavelet transform. While conventional lifting-based architectures require fewer arithmetic operations than the convolution-based approach, they sometimes have long critical paths. If Ta and Tm are the delays of the adder and multiplier, respectively, then the critical path of the lifting-based architecture for the (9,7) filter is (4×Tm + 8×Ta), while that of the convolution implementation is (Tm + 2×Ta) [2]. In addition, to preserve proper precision, the intermediate variables are wider in lifting-based computing, so the lifting multiplier and adder delays are themselves longer than the convolution ones. Hence convolution is the better method for reducing the delay in the computation of the DWT [5].
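The two critical-path expressions above can be compared numerically. The following is a minimal sketch; the delay values `T_M_NS` and `T_A_NS` are hypothetical numbers chosen only for illustration and are not measurements from this work.

```python
def critical_path_ns(n_mult, n_add, t_mult_ns, t_add_ns):
    """Delay of a path that passes through n_mult multipliers and n_add adders."""
    return n_mult * t_mult_ns + n_add * t_add_ns


# Hypothetical delays, assumed for illustration only.
T_M_NS = 4.0  # assumed multiplier delay Tm
T_A_NS = 1.0  # assumed adder delay Ta

lifting_ns = critical_path_ns(4, 8, T_M_NS, T_A_NS)      # 4*Tm + 8*Ta
convolution_ns = critical_path_ns(1, 2, T_M_NS, T_A_NS)  # Tm + 2*Ta
ratio = lifting_ns / convolution_ns

print(lifting_ns, convolution_ns, ratio)
```

Because 4×Tm + 8×Ta = 4×(Tm + 2×Ta), the ratio is 4 for any choice of Tm and Ta, before the wider lifting datapath is even taken into account.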
Conventionally, programmable DSP chips are used to implement DWT algorithms for low-rate applications, and VLSI application-specific integrated circuits (ASICs) are used for higher rates. FPGAs are programmable logic devices that provide enough logic resources to support a large, parallel, distributed architecture.
-
Discrete Wavelet Transform Features
The discrete wavelet transform is a mathematical tool that has aroused great interest in the field of image processing because of its useful features. Some of these characteristics are:
- It allows a multiresolution representation of the image in a natural way, because more wavelet sub-bands are used to progressively enlarge the low-frequency sub-bands.
- It supports analysis of the wavelet coefficients in both the space and frequency domains. The interpretation of the coefficients is therefore not constrained to their frequency behaviour, which allows better analysis for image vision and segmentation.
- For natural images, the DWT achieves high energy compaction in the lower-frequency sub-bands, which is extremely useful in applications such as image compression.
-
1D Discrete Wavelet Transform
The input discrete signal X(n) is filtered by a low-pass filter (h) and a high-pass filter (g) at each transform level. The two output streams are then sub-sampled by simply dropping the alternate output samples in each stream to produce the low-pass sub-band (YL) and the high-pass sub-band (YH). The associated equations can be written as (1). Figure 1 shows the signal analysis in the one-dimensional (1D) discrete wavelet transform.
$$Y_L(n) = \sum_{i} h(i)\, X(2n - i), \qquad Y_H(n) = \sum_{i} g(i)\, X(2n - i) \qquad (1)$$
For example, the 9/7-tap Daubechies DWT shown in Figure 3 consists of a 7-tap high-pass filter and a 9-tap low-pass filter.
Figure 1 One-Dimensional (1D) Discrete Wavelet Transform
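The filter-then-drop-alternate-samples step described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the coefficient values are the (9,7) analysis filters as commonly tabulated for JPEG2000 (the paper does not list its values), samples outside the signal are treated as zero, and the names `H97`, `G97`, `fir` and `dwt1d` are hypothetical.

```python
# (9,7) analysis filters as commonly tabulated for JPEG2000.
# Assumed here; the paper does not list its coefficient values.
H97 = [0.026748757411, -0.016864118443, -0.078223266529, 0.266864118443,
       0.602949018236,
       0.266864118443, -0.078223266529, -0.016864118443, 0.026748757411]
G97 = [0.091271763114, -0.057543526229, -0.591271763114,
       1.115087052457,
       -0.591271763114, -0.057543526229, 0.091271763114]


def fir(x, taps):
    """Full-rate FIR output y[n] = sum_i taps[i] * x[n - i], zero outside x."""
    out = []
    for n in range(len(x)):
        acc = 0.0
        for i, c in enumerate(taps):
            if 0 <= n - i < len(x):
                acc += c * x[n - i]
        out.append(acc)
    return out


def dwt1d(x, h=H97, g=G97):
    """One DWT level: filter with h and g, then keep every second sample."""
    yl = fir(x, h)[::2]  # low-pass sub-band YL
    yh = fir(x, g)[::2]  # high-pass sub-band YH
    return yl, yh


yl, yh = dwt1d([1.0] * 32)
print(yl[8], yh[8])
```

For a constant input, the low-pass sub-band reproduces the constant away from the boundary and the high-pass sub-band is close to zero, which is the energy-compaction behaviour described earlier. A practical codec would normally use symmetric boundary extension rather than the zero padding assumed here.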
-
2D Discrete Wavelet Transform
The basic idea of the 2D architecture is similar to that of the 1D architecture [3]. A 2D DWT (Figure 2) can be seen as a 1D wavelet transform along the rows followed by a 1D wavelet transform along the columns. The 2D DWT is therefore computed in a straightforward manner by inserting an array transposition between the two 1D DWTs.
Figure 2 Two-Dimensional (2D) Discrete Wavelet Transform
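The row pass, transposition and column pass can be sketched as follows. To keep the example short it uses a two-tap averaging/differencing filter pair as a stand-in for the (9,7) filters, and zero padding at the boundary; both are assumptions, and all function names are hypothetical.

```python
def dwt1d(x, h, g):
    """One 1D level: y[n] = sum_i f[i] * x[2n - i] for f in (h, g), zero-padded."""
    def band(taps):
        return [sum(c * x[2 * n - i] for i, c in enumerate(taps)
                    if 0 <= 2 * n - i < len(x))
                for n in range((len(x) + 1) // 2)]
    return band(h), band(g)


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dwt_rows(image, h, g):
    """Apply the 1D transform to every row; return (low, high) half-width images."""
    low, high = [], []
    for row in image:
        yl, yh = dwt1d(row, h, g)
        low.append(yl)
        high.append(yh)
    return low, high


def dwt2d(image, h, g):
    """Row pass, transpose, second row pass (the column pass), transpose back."""
    low, high = dwt_rows(image, h, g)
    ll, lh = dwt_rows(transpose(low), h, g)
    hl, hh = dwt_rows(transpose(high), h, g)
    return tuple(transpose(b) for b in (ll, lh, hl, hh))


# Two-tap averaging/differencing pair, a stand-in for the (9,7) filters.
H = [0.5, 0.5]
G = [0.5, -0.5]

ll, lh, hl, hh = dwt2d([[8.0] * 4 for _ in range(4)], H, G)
print(ll, hh)
```

Reusing the same `dwt_rows` routine for both passes mirrors the hardware idea in the text: one 1D engine serves both directions, and only the transposition step changes which dimension it sees.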
-
Existing 1D Discrete Wavelet Transform Architectures
-
Basic 1D-DWT Architecture for FIR Filter
A basic implementation of a 1D DWT uses the Daubechies biorthogonal wavelet coefficients. Two output bands are produced by applying two FIR filters to the input data samples: a low-pass filter using the h(x) coefficients produces the low-frequency data, and a high-pass filter using the g(x) coefficients produces the high-frequency data.
Figure 3 Basic 1D-DWT Architecture by (9,7) Daubechies Filter
-
Convolution Based 1D DWT Architecture
The convolution-based 1D DWT method reduces the architecture complexity by lowering the number of adders and multipliers relative to the basic (9,7) FIR filter architecture. The architecture computes the high-pass and low-pass sub-band outputs separately for high-speed processing [4].
Figure 4 Convolution based 1D DWT Architecture
-
Fast Convolution Based 1D Discrete Wavelet Transform
A digital filter generally comprises a plurality of multipliers, which occupy a large area and consume considerable power, and which therefore constrain a one-chip solution when the circuits are integrated. For this reason, efforts have been made to reduce the associated hardware complexity by simplifying the multipliers in convolution architectures [3-7].
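One common way to simplify a constant multiplier is to approximate the coefficient by a short sum of powers of two, so that the product needs only shifts and adds. The sketch below illustrates the idea; the 8-bit approximation of the (9,7) centre tap (commonly tabulated as about 0.602949) and the names used are assumptions for illustration, not the decomposition used in this work.

```python
def shift_add_multiply(x, shifts, frac_bits):
    """Multiply integer x by (sum of 2**s for s in shifts) / 2**frac_bits
    using only shifts and adds."""
    acc = 0
    for s in shifts:
        acc += x << s          # one shifted copy of x per power of two
    return acc >> frac_bits    # drop the fractional bits


# Hypothetical 8-bit approximation of 0.602949:
# 154 / 256 = 0.6015625, and 154 = 2**7 + 2**4 + 2**3 + 2**1.
CENTER_SHIFTS = (7, 4, 3, 1)
FRAC_BITS = 8

print(shift_add_multiply(1000, CENTER_SHIFTS, FRAC_BITS))
```

In hardware each shift is just wiring, so this coefficient costs three adders instead of a full multiplier, at the price of a small approximation error (0.6015625 instead of 0.602949).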
In the proposed architecture shown in Figure 5, registers are added to supply either the LPF or the HPF coefficients. This makes the LPF and HPF architecture suitable for a multiplier-less implementation, as the coefficients to be multiplied in the LPF and HPF are different [5].
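A shared datapath that is fed either the LPF or the HPF coefficient set can be sketched as below. The sketch additionally assumes that the filter is symmetric and of odd length, as the (9,7) filters are, so that samples sharing a coefficient are added before the multiply. The exact datapath of Figure 5 is not recoverable from the text, so this is an illustration of the general technique; the coefficient values and all names are assumptions.

```python
# Half of each symmetric filter, outer tap first and centre tap last.
# Values as commonly tabulated for the (9,7) filters (assumed).
LPF_HALF = [0.026748757411, -0.016864118443, -0.078223266529,
            0.266864118443, 0.602949018236]
HPF_HALF = [0.091271763114, -0.057543526229, -0.591271763114,
            1.115087052457]


def folded_fir_output(window, half_taps):
    """Output of an odd-length symmetric FIR filter using folded taps.

    window    : the 2*k + 1 input samples under the filter
    half_taps : taps[0..k], with taps[i] == taps[2*k - i] and taps[k] the centre
    """
    k = len(half_taps) - 1
    if len(window) != 2 * k + 1:
        raise ValueError("window length must be 2 * len(half_taps) - 1")
    acc = half_taps[k] * window[k]  # centre tap: one multiply
    for i in range(k):
        # add the two samples that share a coefficient, then multiply once
        acc += half_taps[i] * (window[i] + window[2 * k - i])
    return acc


def direct_fir_output(window, taps):
    """Reference: one multiply per tap."""
    return sum(c * s for c, s in zip(taps, window))


# The same routine serves both filters; only the coefficient set changes.
low = folded_fir_output([1.0] * 9, LPF_HALF)
high = folded_fir_output([1.0] * 7, HPF_HALF)
print(low, high)
```

With folding, the 9-tap low-pass filter needs 5 multiplications per output instead of 9, and the 7-tap high-pass filter needs 4 instead of 7.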
TABLE I. TIMING AND FREQUENCY ANALYSIS
Timing Details              Convolution Architecture    Proposed Architecture
Minimum period              10.765 ns                   4.343 ns
Max Frequency               92.894 MHz                  230.256 MHz
Combinational Path Delay    16.651 ns                   7.059 ns
(3)
(4)
Similarly, the high-pass filter coefficients present symmetry as follows,
TABLE II. DYNAMIC POWER ANALYSIS
On Chip    Convolution Architecture (mW)    Proposed Architecture (mW)
Clocks     1.75                             1.41
Logic      1.80                             1.10
Signals    2.29                             0.88
I/Os       0.63                             0.16
TOTAL      6.46                             3.56
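The figures in Tables I and II can be cross-checked with a few lines of arithmetic. The sketch below only uses values copied from the two tables; the variable and function names are hypothetical.

```python
# Values copied from Table I (minimum period) and Table II (total dynamic power).
period_ns = {"convolution": 10.765, "proposed": 4.343}
total_power_mw = {"convolution": 6.46, "proposed": 3.56}


def max_frequency_mhz(min_period_ns):
    """f_max = 1 / T_min; with T in nanoseconds the result in MHz is 1000 / T."""
    return 1000.0 / min_period_ns


speedup = period_ns["convolution"] / period_ns["proposed"]
power_saving = 1.0 - total_power_mw["proposed"] / total_power_mw["convolution"]

for name, t in period_ns.items():
    print(name, round(max_frequency_mhz(t), 3), "MHz")
print(round(speedup, 2), round(100 * power_saving, 1))
```

The reported maximum frequencies are the reciprocals of the minimum periods, the proposed architecture clocks roughly 2.5 times faster, and its total dynamic power is roughly 45% lower.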
-
Simulation Results
The MATLAB simulation results of the 1D discrete wavelet transform on 256×256 grayscale images of Baboon and Cameraman are illustrated in the figures below. Figure 7 shows the average and detail parts of the Baboon and Cameraman images after one-dimensional filtering. It shows the approximation (average) image together with the horizontal, vertical and diagonal details. Progressive transmission of images is one of the main advantages of the discrete wavelet transform.
Figure 7 MATLAB Output Images: (a) Input image (b) 1D-DWT
-
CONCLUSION
In this paper, we have proposed a parallel architecture for very high-speed computation of the DWT using the fast convolution method. The fast-convolution-based architecture produces one output in every clock cycle while reducing both the dynamic power and the critical path. In this approach, the system starts the column processing as soon as a sufficient number of rows have been filtered.
References
[1] Mallat, S.: A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 11, No. 7 (1989) 674-693
[2] Gnavi, S., Penna, B., Grangetto, M., Magli, E., Olmo, G.: Wavelet kernels on a DSP: A comparison between lifting and filter banks for image coding. Applied Signal Processing: Special Issue on Implementation of DSP and Communication Systems, Vol. 2002, No. 9 (2002) 981-989
[3] Acharya, T.: Architecture for Computing a Two-Dimensional Discrete Wavelet Transform. US Patent 6178269 (2001)
[4] Acharya, T., Chen, P.: VLSI Implementation of a DWT Architecture. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA (1998)
[5] B.-F. Wu and C.-F. Lin, An efficient architecture for JPEG2000 coprocessor, IEEE Trans. Consum. Electron., vol. 50, no. 4, pp. 1183-1189, Nov. 2004
[6] Andra, K., Chakrabarti, C., Acharya, T.: A VLSI Architecture for Lifting-Based Forward and Inverse Wavelet Transform. IEEE Transactions on Signal Processing, vol. 50, no. 4 (2002) 966-977
[7] Daubechies, I., Sweldens, W.: Factoring wavelet transforms into lifting schemes. The Journal of Fourier Analysis and Applications, vol. 4 (1998) 247-269
[8] Gaurav Tewari, Santu Sardar, K. A. Babu, High-Speed & Memory Efficient 2-D DWT on Xilinx Spartan3A DSP using scalable Polyphase Structure with DA for JPEG2000 Standard, IEEE, 2011
[9] Q. P. Huang, R. Z. Zhou, and Z. L. Hong, Low memory and low complexity VLSI implementation of JPEG2000 codec, IEEE Trans. Consum. Electron., vol. 50, no. 2, pp. 638-646, May 2004
[10] K. Z. Mei, N. N. Zheng, C. Huang, Y. Liu, and Q. Zeng, VLSI Design of a High-Speed and Area-Efficient JPEG2000 Encoder, IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 8, pp. 1065-1078, Aug. 2007.