Implementation of Reduced Area and Delay Integer DCT using Modified Transposition Buffer

Yeshwanth. E; Shruthi. G

doi:10.17577/IJERTV4IS030882

Volume 04, Issue 03 (March 2015)

Open Access
Authors : Yeshwanth. E, Shruthi. G
Paper ID : IJERTV4IS030882
Volume & Issue : Volume 04, Issue 03 (March 2015)
DOI : http://dx.doi.org/10.17577/IJERTV4IS030882
Published (First Online): 27-03-2015
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Yeshwanth. E

Department of Electronics and Communication Don Bosco Institute of Technology

Bangalore, India

Shruthi. G

Department of Electronics and Communication Don Bosco Institute of Technology

Bangalore, India

Abstract– This paper reduces the area and delay of the transposition buffer used in implementing the integer discrete cosine transform (DCT) for High Efficiency Video Coding (HEVC). A single proposed architecture can compute the 2D integer DCT for any of the lengths 4, 8, 16, and 32, with a throughput of up to 32 DCT coefficients per cycle irrespective of the transform size. The 2D transform is obtained using 1D DCT units and a transposition buffer. In this work, the transposition buffer is modified to reduce the area and delay of the architecture. The results show that the proposed architecture involves less area and delay than the earlier 2D DCT design built on the existing transposition buffer structure.

Keywords: Transposition buffer, HEVC, integer DCT.

INTRODUCTION

In recent years, systems that operate at high resolutions have experienced a significant demand for high dynamic range. Fields with such requirements include high-quality digital video in multimedia devices and video-over-Internet-protocol networks, geospatial remote sensing, traffic cameras, automatic surveillance, homeland security, the automotive industry, and multimedia wireless sensor networks. These applications need hardware capable of significant throughput within acceptable area and time complexity. The discrete cosine transform (DCT) is an essential mathematical tool in both image coding and video coding. The DCT has been shown to provide very good energy compaction for natural images, which can be modeled as first-order Markov signals.

Moreover, in many cases the DCT is a close approximation of the Karhunen-Loeve Transform (KLT), which has optimal decorrelation properties. Therefore, the two-dimensional (2-D) version of the 8-point DCT was used in many imaging standards such as JPEG, MPEG-1, MPEG-2, H.261, H.263, and H.264/AVC. Accordingly, new compression schemes such as High Efficiency Video Coding (HEVC) use DCT-like integer transforms operating on block sizes ranging from 4×4 to 32×32 pixels. The distinguishing characteristic of HEVC is that it requires about half the bit rate of H.264/AVC at approximately the same image quality, so it achieves very high compression performance. Hence HEVC is an effective coding scheme for high-resolution video applications. In terms of arithmetic operations, however, HEVC has significant computational complexity.

DCT: A DCT expresses a finite sequence of data points as a sum of cosine functions oscillating at different frequencies.

The use of cosine rather than sine functions is critical in these applications: for compression, it turns out that cosine functions are much more efficient.
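
As a small illustration of the definition above, the following Python sketch computes a floating-point orthonormal DCT-II and applies it to two 8-sample sequences. It is not the integer transform used in HEVC; the function name `dct_ii` and the test sequences are illustrative assumptions.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of a finite sequence x (plain floating point)."""
    n = len(x)
    out = []
    for k in range(n):
        # Scaling that makes the transform orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        out.append(scale * total)
    return out

# A constant block has all of its energy in the first (DC) coefficient.
flat = dct_ii([10.0] * 8)

# A smooth ramp concentrates most of its energy in the first few coefficients.
ramp = dct_ii([float(i) for i in range(8)])
```

Because this scaling is orthonormal, the sum of squared coefficients equals the sum of squared input samples, which makes it easy to see how much of the signal energy each coefficient carries.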
LITERATURE SURVEY

The authors of [1] introduce H.264/AVC, the then-newest video coding standard of the ITU-T. Its main goals are enhanced compression performance and a network-friendly video representation addressing conversational and non-conversational applications. The advantage of this paper is that it explains the standardization process. The disadvantage is that the desired video compression efficiency could not be achieved to the full extent.

The authors of [2] introduce a compression method that uses zeros in an efficient 8×8 sparse orthogonal transform matrix. The proposed transform matrix provides a 30% reduction in the number of operations, which makes it superior to the signed discrete cosine transform (SDCT) matrix and the DCT. The advantage of this paper is that the arithmetic operations are reduced. The disadvantage is that the complexity is increased.

The authors of [3] propose a transformation matrix that contains only zeros and ones. This algorithm is superior to the algorithms designed earlier. It handles both low- and high-compression scenarios efficiently with low computational complexity, and it exhibits a spectral behaviour close to that of the DCT. The method can outperform the BAS-2008 method at both high and low compression ratios, as measured using PSNR, UQI, and MSE.

The authors of [4] explain the 2D DCT, the generalized architecture of the DCT, and the procedure for obtaining the 2D DCT using the 1D DCT. That procedure uses a transposition buffer that has more area and delay than the proposed transposition buffer. Hence this work focuses on reducing the area-delay product.
METHODOLOGY

The proposed architecture uses the 4-point DCT as a basic building block to construct higher lengths. The 4-point DCT consists of an input adder unit (IAU), a shift-add unit (SAU), and an output adder unit (OAU) [4].
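
The three-stage structure can be sketched in Python as follows. The coefficients 64, 83 and 36 are those of the HEVC 4-point integer transform matrix; the particular split of work between the stages and the shift-add decompositions are assumptions made for illustration and may differ from the hardware in [4]. The names `dct4_shift_add` and `dct4_direct` are hypothetical.

```python
# HEVC 4-point integer DCT matrix (coefficients 64, 83, 36).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

def dct4_shift_add(x):
    """4-point integer DCT using only additions, subtractions and shifts."""
    x0, x1, x2, x3 = x
    # Input adder unit (IAU): butterfly sums and differences.
    a0, a1 = x0 + x3, x1 + x2
    b0, b1 = x0 - x3, x1 - x2

    # Shift-add unit (SAU): constant multiplications built from shifts.
    def m64(v):
        return v << 6                                  # 64 = 2^6

    def m83(v):
        return (v << 6) + (v << 4) + (v << 1) + v      # 83 = 64 + 16 + 2 + 1

    def m36(v):
        return (v << 5) + (v << 2)                     # 36 = 32 + 4

    # Output adder unit (OAU): combine the scaled terms.
    y0 = m64(a0) + m64(a1)
    y2 = m64(a0) - m64(a1)
    y1 = m83(b0) + m36(b1)
    y3 = m36(b0) - m83(b1)
    return [y0, y1, y2, y3]

def dct4_direct(x):
    """Reference: plain matrix-vector product with C4."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]
```

For any integer input, `dct4_shift_add` and `dct4_direct` return the same four coefficients, so the multiplier-free path can be checked against the matrix product.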


In the modified transposition buffer, the enable signals are generated automatically to reduce delay, and the AND gates are eliminated, which reduces the area.

Existing transform:

Here I refers to the input and O refers to the output. The 2-D structure consists of two 1-D DCT units and a transposition buffer (Figure 2(a)). The 4×4 transposition buffer is shown in Figure 2(b). It consists of 16 register cells arranged in four rows and four columns, so it can store 16 values. The output of the transposition buffer cells can be taken either row-wise or column-wise.
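
A behavioural sketch of such a 4×4 buffer is given below in Python. It models only the data movement (rows written under enable signals, columns read through multiplexers), not the gate-level circuit; the class name `EnableTransposeBuffer` and its methods are hypothetical.

```python
class EnableTransposeBuffer:
    """4x4 register array with externally driven row enables (behavioural)."""

    def __init__(self, n=4):
        self.n = n
        self.regs = [[0] * n for _ in range(n)]

    def write_row(self, enables, values):
        """Load `values` into every row whose enable bit is set."""
        for r, en in enumerate(enables):
            if en:
                self.regs[r] = list(values)

    def read_column(self, sel):
        """Output multiplexers: pick column `sel` from each row."""
        return [self.regs[r][sel] for r in range(self.n)]


buf = EnableTransposeBuffer()
block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
for r, row in enumerate(block):
    one_hot = [int(i == r) for i in range(4)]  # en0..en3 supplied from outside
    buf.write_row(one_hot, row)

# Reading column-wise returns the transpose of what was written row-wise.
columns = [buf.read_column(c) for c in range(4)]
```

In this model the caller must supply the correct enable pattern on every write, which corresponds to the manually driven enable signals of the existing buffer.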


The existing buffer is shown in Figure 2(b). It consists of 16 registers that store the DCT values for conversion into the 2D DCT. The existing design can be modified to reduce delay and area. The modified architecture eliminates the AND gates, which reduces the area, and inserts a counter, which removes the delay of supplying the enable inputs manually as in the existing buffer.

Figure 1: Proposed reusable generalized architecture. [Block labels: 1st stage AND gates, input mux assembly, input adder unit, shift-add unit, output adder unit, N/2-point reusable DCT units, output mux assemblies; inputs x(0) to x(N-1), with x(N/2) to x(N-1) labeled separately; outputs y(0), y(2), …, y(N-2) and y(1), y(3), …, y(N-1).]

[Figure 2(a) blocks: N-point integer DCT unit, N×N transposition buffer, N-point integer DCT unit.]

The N-point reusable generalized architecture uses an N/2-point DCT block to compute the N-point DCT. It also consists of an IAU, an SAU, and an OAU. Combinational circuits handle the selection of the N/2-point DCT block.

The input of size N is fed to the 1st stage AND gates, the input mux assembly, and the 2nd stage AND gates. After computation, the output of size N is taken from the N/2-point reusable DCT unit and the output mux assembly.
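
The reuse of an N/2-point block can be illustrated for N = 8 with the Python sketch below. It relies on the symmetry of the HEVC 8-point integer transform matrix: the even-indexed outputs are a 4-point DCT of the input sums, and the odd-indexed outputs come from the input differences. The AND gates and mux assemblies are not modelled, and the function names are hypothetical.

```python
# HEVC 8-point integer DCT matrix.
C8 = [
    [64,  64,  64,  64,  64,  64,  64,  64],
    [89,  75,  50,  18, -18, -50, -75, -89],
    [83,  36, -36, -83, -83, -36,  36,  83],
    [75, -18, -89, -50,  50,  89,  18, -75],
    [64, -64, -64,  64,  64, -64, -64,  64],
    [50, -89,  18,  75, -75, -18,  89, -50],
    [36, -83,  83, -36, -36,  83, -83,  36],
    [18, -50,  75, -89,  89, -75,  50, -18],
]

def matvec(m, v):
    """Plain integer matrix-vector product."""
    return [sum(c * x for c, x in zip(row, v)) for row in m]

def dct8_from_dct4(x):
    """8-point integer DCT computed through a 4-point DCT of the input sums."""
    n, half = 8, 4
    # Input adder stage: sums feed the even outputs, differences the odd ones.
    a = [x[i] + x[n - 1 - i] for i in range(half)]
    b = [x[i] - x[n - 1 - i] for i in range(half)]
    # Left half of the even rows of C8 is exactly the 4-point DCT matrix.
    c4 = [C8[2 * k][:half] for k in range(half)]
    # Left half of the odd rows handles the differences.
    m4 = [C8[2 * k + 1][:half] for k in range(half)]
    y = [0] * n
    y[0::2] = matvec(c4, a)   # y(0), y(2), ..., y(N-2)
    y[1::2] = matvec(m4, b)   # y(1), y(3), ..., y(N-1)
    return y
```

The same split can be applied again to the 4-point part, and upward to lengths 16 and 32, which is what makes one smaller block reusable for the larger transforms.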

The 2D DCT can be obtained with the help of the 1D DCT and a transposition buffer, as shown in Figure 2(a). The 2D DCT is used to compress images in both directions (i.e., rows and columns).
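
The row-column procedure can be sketched in Python for a 4×4 block as follows. The sketch uses the HEVC 4-point matrix, omits the intermediate scaling and rounding that a real codec applies between stages, and adds a final transpose only so that the result can be compared directly with the matrix product C·X·Cᵀ. All function names are hypothetical.

```python
# HEVC 4-point integer DCT matrix.
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

def dct1d(v):
    """1-D 4-point integer DCT of one row."""
    return [sum(c * x for c, x in zip(row, v)) for row in C4]

def transpose(m):
    """Behaviour of the transposition buffer: rows in, columns out."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain integer matrix product, used only as a reference."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2d(block):
    stage1 = [dct1d(row) for row in block]      # first 1-D DCT, row by row
    buffered = transpose(stage1)                # transposition buffer
    stage2 = [dct1d(row) for row in buffered]   # second 1-D DCT on former columns
    return transpose(stage2)                    # reorder to match C4 * X * C4^T

block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
coeffs = dct2d(block)
reference = matmul(matmul(C4, block), transpose(C4))
```

Here `coeffs` and `reference` hold the same values, which shows that two passes of the 1-D transform with a transposition in between are equivalent to the full 2-D transform.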

The existing transposition buffer is shown in Figure 2(b). It is modified in this work to reduce the overall area and delay, as shown in Figure 2(c). In the existing transposition buffer, the enable signals are supplied manually and many AND gates are used.

Figure 2(a): N × N-point 2-D integer DCT

Advantages and Applications:

Advantages

The architecture can be reused for all of the prescribed lengths.
The proposed integer DCT architecture mainly helps reduce area and delay.
The throughput is constant for lengths 4, 8, 16 and 32.
The reusable integer DCT architecture consumes less energy than a direct implementation of matrix multiplication.
The proposed transposition buffer mainly helps reduce area and delay.

Results:

Transpose memory:

Figure 2(b): Existing transposition buffer. [Diagram labels: inputs X0 to X3, enable signals en0 to en3, clock clk, a 4×4 array of registers, four multiplexers (Mux), outputs Y0 to Y3.]

Modified transform:

The modified transposition buffer is shown in Figure 3. It consists of a counter, multiplexers, and registers. The AND gates are eliminated in this structure, so the area is reduced compared to the existing structure. In the existing transposition buffer the enable signal is given manually, so the delay is high; in the modified transposition buffer the counter supplies the signal automatically, so the delay is reduced.

The counter used in the modified transform is an up counter that loads the values automatically, without waiting for manual enable signals. The counter is driven by the clock, and on every clock tick the register values are shifted. The multiplexers select their input lines according to the select lines connected to the counter.
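
The counter-driven behaviour can be sketched in Python as follows. This is a simplified behavioural model under stated assumptions: a free-running up counter modulo 2N, a load phase in which rows are shifted in, and a read phase in which the counter value selects a column. The real circuit may overlap loading and reading; the class name `CounterTransposeBuffer` and its `tick` method are hypothetical.

```python
class CounterTransposeBuffer:
    """Transposition buffer sequenced by an internal up counter (behavioural)."""

    def __init__(self, n=4):
        self.n = n
        self.count = 0                            # free-running up counter
        self.regs = [[0] * n for _ in range(n)]

    def tick(self, row_in=None):
        """One clock tick. Returns a column during the read phase, else None."""
        n = self.n
        out = None
        if self.count < n:
            # Load phase: stored rows shift along and the new row enters.
            # A row must be supplied on each of these ticks.
            self.regs = self.regs[1:] + [list(row_in)]
        else:
            # Read phase: the counter drives the multiplexer select lines.
            sel = self.count - n
            out = [self.regs[r][sel] for r in range(n)]
        # The counter advances and wraps on its own; no external enables.
        self.count = (self.count + 1) % (2 * n)
        return out


buf = CounterTransposeBuffer(4)
block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
for row in block:
    buf.tick(row)                                 # four load ticks
columns = [buf.tick() for _ in range(4)]          # four read ticks
```

The caller only supplies data and clock ticks; the sequencing that the existing buffer received through manual enable signals comes from the counter in this model.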

Software Requirement:

Verification Tool
- Modelsim 6.4c
Synthesis Tool
- Xilinx ISE 9.1

Proposed reusable architecture for N = 8:

Figure 3: Modified transposition buffer. [Diagram labels: inputs Xj,0 to Xj,7, rows of D registers, eight multiplexers (MUX), a signal labeled c, clock clk, and the output.]

Top module: Device Utilization Summary

Existing:

Modification:

The proposed method improves on the utilization efficiency of the existing method. It reduces the number of lookup tables, which in turn reduces the area-delay product, as shown in the results above.

REFERENCES:

[1]. T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, "Overview of the H.264/AVC Video Coding Standard."

[2]. S. Bouguezel, M. O. Ahmad, and M. N. S. Swamy, "Low Complexity 8×8 Transform for Image Compression."

[3]. R. J. Cintra and F. M. Bayer, "A DCT Approximation for Image Compression."

[4]. P. K. Meher, S. Y. Park, B. K. Mohanty, K. S. Lim, and C. Yeo, "Efficient Integer DCT Architectures for HEVC," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 1, January 2014.
