Design and Implementation of an Efficient Serial-Pipelined FFT Architecture for Through Wall Image Processing on FPGA

DOI : 10.17577/IJERTV3IS051838


Nivedita S Khanapur

Dept of ECE

M.S Engineering College Bangalore, India

Dr. Cyril Prasanna Raj P, Dean (R&D), Professor (ECE)

M.S Engineering College Bangalore, India

      Abstract: Through-wall image detection is a recent technique for detecting a subject behind a wall. Strong pulses such as UWB pulses, which spread over a wide frequency range, are used because they tend to penetrate the wall. To analyze movements in the frequency domain, an algorithm such as the FFT is used. The FFT is an efficient way of computing the DFT. The FFT faces difficulties with larger bandwidths. These can be overcome by overlapping two different techniques, which makes the analysis easier.

      Keywords: Fast Fourier Transform, Ultra Wide Band, Through Wall Image Detection.

      1. INTRODUCTION

        Through-wall imaging is a technique for detecting targets behind a wall. Ultra Wide Band (UWB) is regarded as the best-suited technology for through-wall detection. UWB pulses are very short and spread over a wide frequency range. They can penetrate walls and offer high range resolution. The degree of penetration also depends on the composition of the wall [5]. A person can be detected from slight movements such as breathing or arm swing. In particular, the center frequencies of life signals can be extracted with high accuracy. Detecting human presence has applications in security, in rescue during natural calamities such as earthquakes and fires, and in military operations. Current research focuses on detecting moving targets under strong clutter. Analyzing human characteristics in the time domain is difficult, so the signal is converted to the frequency domain using the FFT and converted back to the time domain using the inverse FFT. Conventional FFT implementations face trade-offs between speed and area. To address these, the FFT architecture used here overlaps two different techniques, which reduces the number of clock cycles, improves speed and reduces area.

      2. OVERLAPPING TECHNIQUE

        A. FFT

        Many transforms can convert a signal from the time domain to the frequency domain and vice versa. This paper uses the FFT, of which there are several kinds. Here the divide-and-conquer method is found to be the most convenient. The number of inputs is 16, i.e. N=16. An FFT can be performed using decimation in time or decimation in frequency. Both give the same results but differ in how they are computed. Here a Decimation in Time (DIT) Radix-2 FFT is used, in which the inputs are in bit-reversed order and the outputs are in normal order, as shown in Fig.1. An FFT can be performed in a serial, parallel or pipelined manner. Because there is a trade-off among area, speed and power, the choice among these methods depends on the performance required.

        Fig.1. 16- Point DIT- FFT

        B. Serial FFT

        As the name suggests, in a serial FFT the data arrive serially for computation. Only one butterfly structure is used, as shown in Fig.2. The FFT computation starts once the data are available. Data are easy to track in a serial FFT. The challenge with this method is speed: because only one butterfly structure is used, it is a slow technique. Previous stages remain idle while the next stage is computing, so many clock cycles are wasted.

        Fig.2. Butterfly Structure

        X0 and Y0 are taken as the two inputs, and two outputs are obtained at a time. In the butterfly structure, Y0 is multiplied by the twiddle factor; the two values are then added to give one output and subtracted to give the other.
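The butterfly and the bit-reversed input ordering described above can be illustrated in software. The following is a minimal Python sketch of a radix-2 DIT FFT, not the hardware design of this paper; the function names `bit_reverse`, `butterfly` and `fft_dit` are hypothetical and chosen only for illustration.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (for bits=4, index 1 maps to 8)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def butterfly(x0, y0, w):
    """Radix-2 DIT butterfly: scale y0 by the twiddle factor w, then add and subtract."""
    t = w * y0
    return x0 + t, x0 - t

def fft_dit(samples):
    """In-place style radix-2 DIT FFT: bit-reversed input order, normal output order."""
    n = len(samples)
    bits = n.bit_length() - 1
    assert 1 << bits == n, "length must be a power of two"
    # Reorder the inputs into bit-reversed order (X0, X8, X4, X12, ... for N=16).
    data = [samples[bit_reverse(i, bits)] for i in range(n)]
    size = 2
    while size <= n:              # one pass per butterfly stage (4 passes for N=16)
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)   # twiddle factor
                a, b = start + k, start + k + half
                data[a], data[b] = butterfly(data[a], data[b], w)
        size *= 2
    return data
```

For N=16 the loop runs four stages of eight butterflies each, and `bit_reverse` yields the input order 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15.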

        C. Pipelined FFT

          To overcome the speed limitation of the above method, pipelining can be used. Previous stages do not sit idle while the next stage is computing. The issue with this method is area: it requires more area to overcome the speed issue.

        D. Overlapping

        To overcome the above challenges of area and speed, an optimized overlapping technique is used, as shown in Fig 3. A novel architecture combining the serial and pipelined architectures is introduced. Clock cycles are not wasted, nor is more area consumed. This method is found to be the best optimized way of performing the FFT. Sixteen input samples are taken at a time. When the next 16 samples arrive, their processing overlaps that of the previous samples. No samples clash and no data are lost, which saves computation time. Here the technique is shown for only two sets of 16 samples. The samples keep coming, 16 at a time.

        Fig. 3. Overlapping structure

      3. MATHEMATICAL ANALYSIS

        The clock cycles required to compute the FFT can be reduced, as shown in Table.1 below. The first column gives the clock cycle number. The second column gives the input samples, followed by the four butterfly stages.

        A. Analysis

          • The first input samples are denoted as X0 to X15.

          | Clock Cycles | Input Stage | 1st Stage | 2nd Stage | 3rd Stage | 4th Stage |
          |---|---|---|---|---|---|
          | 1 | X0 | 0 | 0 | 0 | 0 |
          | 2 | X8 | 0 | 0 | 0 | 0 |
          | 3 | 0 | A1,A2 | 0 | 0 | 0 |
          | 4 | X4 | 0 | 0 | 0 | 0 |
          | 5 | X12 | 0 | 0 | 0 | 0 |
          | 6 | 0 | A3,A4 | 0 | 0 | 0 |
          | 7 | 0 | 0 | B1,B3 | 0 | 0 |
          | 8 | 0 | 0 | B2,B4 | 0 | 0 |
          | 9 | X2 | 0 | 0 | 0 | 0 |

          Table.1 Serial Implementation

            • The second input samples are denoted as Y0 to Y15.

            • Outputs of 1st stage are denoted as A1 to A32.

            • Outputs of 2nd stage are denoted as B1 to B32.

            • Outputs of 3rd stage are denoted as C1 to C32.

            • Outputs of 4th stage are denoted as D1 to D32.

        The tables do not show all clock cycles. Table.1 gives the idea of the serial implementation. The clock cycles in which the two techniques overlap, i.e. serial-pipelined operation, are shown in Table.2. A serial FFT of 16 samples takes 48 clock cycles. The next 16 samples take another 48 cycles, giving 96 clock cycles in total. With the serial-pipelined FFT, the clock cycles are reduced from 96 to 81, which shows that the computation is faster.
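The figure of 48 cycles is consistent with a schedule in which each clock cycle either loads one input sample or computes one butterfly: 16 loads plus 4 stages of 8 butterflies. The Python sketch below generates such a schedule recursively. The recursive ordering is an inference from the rows visible in Table.1, not a rule stated in this paper, and the name `serial_schedule` is hypothetical. The value 81 is taken from the text, not derived.

```python
def serial_schedule(n):
    """Per-cycle actions for one n-point block.

    'in' loads one input sample; 'A'..'D' is one butterfly in that stage.
    Assumed rule: finish both half-size blocks, then run n/2 butterflies.
    """
    if n == 1:
        return ["in"]
    half = serial_schedule(n // 2)
    stage = chr(ord("A") + n.bit_length() - 2)   # n=2 -> 'A', n=16 -> 'D'
    return half + half + [stage] * (n // 2)

schedule = serial_schedule(16)
cycles_per_block = len(schedule)          # cycles for one 16-sample block
serial_total = 2 * cycles_per_block       # two blocks processed back to back
overlapped_total = 81                     # serial-pipelined figure quoted in the text
saved = serial_total - overlapped_total

print(schedule[:9])
print(cycles_per_block, serial_total, saved)
```

The first nine actions are in, in, A, in, in, A, B, B, in, which matches the pattern of cycles 1 to 9 in Table.1. Under this model one block takes 48 cycles, two blocks take 96, and the overlapped design saves 15 cycles, about 15.6% of the serial total.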

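        Table.2 Pipelined-Serial Operation

        | Clock Cycles | Input Stage | 1st Stage | 2nd Stage | 3rd Stage | 4th Stage |
        |---|---|---|---|---|---|
        |  |  |  |  |  | D1,D9 |
        | 34 | Y0 | A15,A16 | 0 | 0 | 0 |
        | 35 | Y8 | 0 | B13,B15 | 0 | 0 |
        | 36 | 0 | A17,A18 | 0 | C9,C13 | 0 |
        | 37 | Y4 | 0 | 0 | 0 |  |
        | 38 | Y12 | 0 | 0 | 0 | D5,D13 |
        | 39 | 0 | A19,A20 | 0 | C11,C15 | 0 |
        | 40 | 0 | 0 | B17,B19 | 0 | D3,D11 |
        | 41 | 0 | 0 | B18,B20 | 0 | D7,D15 |
        | 42 | Y2 | 0 | B14,B16 | 0 | 0 |
        | 43 | Y10 | 0 | 0 | C10,C14 | 0 |
        | 44 | 0 | A21,A22 | 0 | 0 | D2,D10 |
        | 45 | Y6 | 0 | 0 | 0 | D6,D14 |
        | 46 | Y14 | 0 | 0 | C12,C16 | 0 |
        | 47 | 0 | A23,A24 | 0 | 0 | D4,D12 |
        | 48 | 0 | 0 | B21,B23 | 0 | D8,D16 |
        | 50 | 0 | 0 | 0 | C17,C21 | 0 |
        | 51 | 0 | 0 | 0 | C18,C22 | 0 |
        | 52 | 0 | 0 | B22,B24 | 0 | 0 |
        | 53 | 0 | 0 | 0 | C19,C23 | 0 |
        | 54 | Y1 | 0 | 0 | C20,C24 | 0 |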

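        | Clock Cycles | Input Stage | 1st Stage | 2nd Stage | 3rd Stage | 4th Stage |
        |---|---|---|---|---|---|
        | 70 | 0 | 0 | 0 | 0 | D17,D25 |
        | 71 | 0 | 0 | 0 | 0 | D18,D26 |
        | 72 | 0 | 0 | 0 | C26,C30 | 0 |
        | 73 | 0 | 0 | 0 | 0 | D19,D27 |
        | 74 | 0 | 0 | 0 | 0 | D20,D28 |
        | 75 | 0 | 0 | B30,B32 | 0 | 0 |
        | 76 | 0 | 0 | 0 | C27,C31 | 0 |
        | 77 | 0 | 0 | 0 | 0 | D21,D29 |
        | 78 | 0 | 0 | 0 | 0 | D22,D30 |
        | 79 | 0 | 0 | 0 | C28,C32 | 0 |
        | 80 | 0 | 0 | 0 | 0 | D23,D31 |
        | 81 | 0 | 0 | 0 | 0 | D24,D32 |

        Table.3 Outputs of second input samples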

        The clock cycles seen in Table.3 would otherwise be wasted; they can be made useful by implementing the pipelined operation.

      4. DESIGN AND IMPLEMENTATION

        A. Design Specification

          • Number of input samples, N = 16.

          • Clock cycles required to compute 32 samples = 81.

          • Number of bits = 8.

          • Clock period = 3.719 ns.

          • Operating frequency = 268.897 MHz.

            The number of clock cycles required to compute 32 samples, taken 16 at a time, is reduced to 81.
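The specification values can be cross-checked with simple arithmetic. The Python sketch below assumes that the processing time is the cycle count multiplied by the clock period, ignoring any I/O or latency overhead not described in this paper; the helper names are hypothetical.

```python
CLOCK_PERIOD_NS = 3.719      # clock period from the design specification
SERIAL_CYCLES = 96           # two 16-sample blocks at 48 cycles each
OVERLAPPED_CYCLES = 81       # serial-pipelined figure from the design specification

def frequency_mhz(period_ns):
    """Clock frequency in MHz for a period given in nanoseconds."""
    return 1e3 / period_ns

def duration_ns(cycles, period_ns=CLOCK_PERIOD_NS):
    """Assumed processing time: cycle count times clock period."""
    return cycles * period_ns

serial_ns = duration_ns(SERIAL_CYCLES)
overlapped_ns = duration_ns(OVERLAPPED_CYCLES)

print(f"frequency  : {frequency_mhz(CLOCK_PERIOD_NS):.3f} MHz")
print(f"serial     : {serial_ns:.3f} ns")
print(f"overlapped : {overlapped_ns:.3f} ns")
```

A period of 3.719 ns corresponds to roughly 268.89 MHz, which agrees with the stated 268.897 MHz to within the rounding of the period. Under the stated assumption, 32 samples take about 357.0 ns serially and about 301.2 ns with the overlapped design.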

        B. Top Reference model

          In through-wall imaging the aim is to detect the subject behind the wall. The UWB pulse penetrates the wall, hits the subject behind it and is received back at the receiver end. The received signal is not as strong as the transmitted one, so analyzing it directly is difficult. It can be analyzed more easily in the frequency domain. An optimized FFT processor, which helps in analyzing the target's presence, is therefore designed and placed at the receiver end, as shown in Fig.4.

          The speed and area issues are optimized using this model.

          Fig.4. Reference Model

          Penetration depends on parameters such as transmitting power and wall composition.

        C. Architecture

        Fig.5 shows the processor architecture. The data come in sequentially and are sent to a shift register. Addition and subtraction are performed using the butterfly structure, a multiplier block performs the multiplication by the twiddle factor, and the results are stored in a register. This block implements one stage and is repeated for each stage. In this paper the number of input samples is 16, so four such blocks are used, one per stage.
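To illustrate what the multiplier block needs, the Python sketch below lists the eight distinct twiddle factors of a 16-point FFT as signed 8-bit integers. The fixed-point scale of 64 (so that +1.0 fits in a signed 8-bit word) is an assumption made for this sketch; this paper states only that the number of bits is 8 and does not give the number format. The name `twiddle_rom` is hypothetical.

```python
import math

N = 16
SCALE = 64   # assumed fixed-point scale (2**6); not specified in the paper

def twiddle_rom(n=N, scale=SCALE):
    """Quantized twiddle factors W_n^k = exp(-2j*pi*k/n) for k = 0 .. n/2 - 1.

    Each entry is a (real, imag) pair of integers that fit in a signed 8-bit word.
    """
    rom = []
    for k in range(n // 2):
        angle = 2 * math.pi * k / n
        re = round(scale * math.cos(angle))
        im = round(-scale * math.sin(angle))
        rom.append((re, im))
    return rom

stages = N.bit_length() - 1      # log2(16) = 4 stage blocks
for k, (re, im) in enumerate(twiddle_rom()):
    print(f"W16^{k}: re={re:4d} im={im:4d}")
```

With this scale W16^0 is stored as (64, 0) and W16^4 as (0, -64). A different word format would change the stored values but not the number of entries.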

        Fig.5. Processor Architecture

      5. SIMULATION RESULTS

Fig.6 below shows the simulation results of the serial-pipelined FFT design. They are shown for N=8. The same results are obtained for N=16.

Fig.6. Simulation Result

ACKNOWLEDGMENT

This research work has been supported by my project supervisors Dr. Cyril Prasanna Raj P, Dean (R&D), Professor (ECE), Mr. Vinod Kumar B.L (Asst.Professor), Dept. of ECE. I would like to thank my supervisors, parents and friends for their support.

REFERENCES

  1. Jing Li, Zhaofa Zeng, Jiguang Sun, Fengshan L, "Through Wall detection of Human Beings movement by UWB Radar," Vol 9 no 6, November 2002.

  2. Chieh-Ping Lai, Ram M. Narayanan, "Ultrawideband Random Noise Radar design for Through Wall Surveillance," Vol 46 No 4, October 2010.

  3. Hong Wang, Ram M Narayanan and Zheng Ou Zhou, "Through Wall Imaging of Moving Targets Using UWB Random Noise Radar," Vol 8, 2009.

  4. Jun Hu, Zhenlong Yuan, Guofu Zhu and Zhimin Zhou, "Robust Detection of Moving Human Target via Impulse Through Wall Radar," 18 Oct 2013.

  5. Sukhvinder Singh, Qilian Liang, and Li Sheng, "Sense Through Wall Human Detection using UWB radar," 2013.

  6. A.G. Yarovoy, L.P. Ligthart, "UWB Radar for human Being Detection," IEEE A&E Systems Magazine, November 2006.
