Implementation of QR Decomposition for MIMO Detection

DOI : 10.17577/IJERTV4IS040480

Download Full-Text PDF Cite this Publication

Text Only Version

Implementation of QR Decomposition for MIMO Detection

Veena Vijayan,

SRM University

PG Scholar, Department of ECE, SRM Nagar, Kattankulathur, Kancheepuram

Tamilnadu 603203, India.

Mrs. K. Ferents Koni Jiavana,

SRM University

AP (OG), Department of ECE,

SRM Nagar, Kattankulathur, Kancheepuram Tamilnadu 603203, India.

AbstractThe exceptional growth of telecommunication industry demands new technologies which can efficiently meet the requirements such as spectral efficiency, reduced area and robustness. The main challenge faced by MIMO technology is complexity in detection of transmitted symbol at detector. The overhead associated with detection is reduced by usage of pre processing methods. QR decomposition is a most important one of these. Of the various QR decomposition methods available, in this paper we are considering givens rotation based QR decomposition. Also, another variant of givens rotation, i.e., column-wise givens rotation is also implemented. The performance of both methods are analyzed and compared.The simulation results are taken using ModelSim Altera 6.3, Xilinx

    1. and Cadence Encounter Tools.

      Index TermsMultiple Input Multiple Output (MIMO) systems; Pre-processing; QR decomposition; Givens rotation; Column-wise givens rotation.

      The QR decomposition of a matrix is the factorization of the particular matrix in to two matrices, namely orthogonal matrix Q and upper triangular matrix R. Of the various popular methods like Gram-Schmidt algorithm, Householder transformation, in this work we are focusing on Givens rotation based QR decomposition and a modified less complex version of it.

      This paper is structured as follows: section II gives brief description of MIMO systems. Section III gives basic idea of QR decomposition and section IV represents Givens rotation based QR decomposition implementation while section V explains the modified Givens rotation based QR decomposition which is named Column-Wise Givens rotation. Section VI deals with the results and comparison between both methods and with existing methods in literature. Finally section VII presents conclusion of the work.

      1. INTRODUCTION

        Electronic communication systems are becoming a crucial part of our day to day life. Mobile communication devices, laptops etc are among the in-evitable. The diversity of applications that can be performed by simple communication devices for texting, calling ranging up to multimedia streaming which demands high data rate of communication systems. As ensuring high quality of communicated data is also of as much crucial, it is essential to design transmitter – receiver systems with utmost care in order to reduce error rate and decoding complexity.

        Multiple-Input-Multiple Output (MIMO) systems have come into mainstream in order to tackle these difficult scenarios. It makes use of more than one antenna at transmitter and receiver which handle symbols in same frequency band. Although it have high spectral efficiency, it faces overhead of complexity of detection. The detection process at receiver side is accomplished by various detection algorithms such as sphere decoding, VBLAST and successive interference cancellation[1]. QR decomposition is an important operation to be done in order to make this detection procedure much simpler.

        Various modified versions of QR Decomposition exist and is being continuously researched based on diverse requirements. In this paper in the present works on QR decomposition, the results are presented in terms of architecture itself not considering the overall performance.

      2. MIMO SYSTEMS

        In radio, multiple-input and multiple-output, or MIMO is the use of multiple antennas at both the transmitter and receiver to improve communication performance. It is one of several forms of smart antenna technology. MIMO technology has attracted attention in wireless communications, because it offers significant increases in data throughput and link range without additional bandwidth or increased transmit power. It achieves this goal by spreading the same total transmit power over the antennas to achieve an array gain that improves the spectral MIMO to be efficient (more bits per second per hertz of bandwidth) and/or to achieve a diversity gain that improves the link reliability (reduced fading). Because of these properties, MIMO is an important part of modern wireless communication standards such as IEEE 802.11n (Wi-Fi), 4G, 3GPP Long Term Evolution, WiMAX and HSPA+.

        1. MIMO System model

          A MIMO system essentially consists of transmitter with M transmitter antennas, a flat fading channel, a receiver with N number of receiver antennas. In this paper, we are dealing with equal number of transmitter and receiver antennas [M= N]. The system can be basically represented by,

          = + (1)

          Where x [1 , 2 , 3 , ] are transmitted symbols in a constellation of modulation scheme, and received symbols are represented by y [1 , 2 , 3 , .. ]. Noise component of transmitted data is given by Gaussian noise vector n.

          iii.) The product of Q and R must be H i.e. H=QR.

          Thus, QR decomposition of a channel matrix is given by the equation

          H = Q R (2)

          After multiplyin

          g After multiplying y with , the signal model can be re written as;

          = +

          = +

          = +

          (3)

          Transmitter Receiver

          M antenna N antenna

          Fig.1. Linear MIMO system model

        2. Pre- processing in MIMO Detection

        A MIMO detector circuit is mainly comprised of symbol decoder followed by a pre-processor. The pre- processor aims at providing signal quality index and reduction of decoding complexity. The channel matrix H is often undergone QRD in almost all practical MIMO decoders [7]. In these cases the MIMO detector is assumed to have complete knowledge of channel. i.e., channel matrix is known. The corresponding computations on this channel matrix termed as pre-processing have to be done at rate of change of channel which is varying at a much lesser rate than at which actual transmitted data symbol take place.

        The MIMO Detection process is composed of large number of complex calculations. Thus by shifting the computational complexity to pre-processor has the advantage that processing is only done once and can be iteratively used while it may had to be re-computed for each detection between MIMO detector and channel decoder. Due to stringent throughput requirements, there may arise need for several MIMO detectors to run in parallel. So moving complexity from detector units to single QR decomposition unit is highly efficient.

      3. QR DECOMPOSITION

        QR decomposition (QRD) is one of the key algorithms employed in MIMO systems [3]. Since numerous MIMO detection methods require QR decomposition of the channel matrix as starting point, the QR decomposition of the matrix H is a mathematical operation that generates two matrices Q and R which have the following properties:

        Advantage of using upper triangular matrix R is that the transmitted data can be detected without inversion computations but with a back substitution only.

      4. GIVENS ROTATION BASED QR DECOMPOSITION.

        Of the most popular methods for QR decomposition such as householder transformation, Gram-Schmidt process and Givens rotation, Givens rotation is mostly favored as Householder transformation cannot be parallelized and Gram-Schmidt algorithm is numerically unstable [4]. On the other hand, the Givens rotation allows a parallel computational structure QR decompositions can also be computed with a series of Givens rotations. Each rotation zeros an element in the sub diagonal of the matrix, forming the R matrix. The concatenation of all the Givens rotations forms the orthogonal Q matrix. The Givens rotation procedure is useful in situations where only a relatively few off diagonal elements need to be zeroed, and is more easily parallelized than Householder transformations.

        Givens Generation of a 2 X 2 matrix

        The major features of our (GR-QRD) are as follows:

        1. GR-QRD has less number of multiplications than the GR implementations in the literature. Hence, it requires fewer resources to perform the decomposition.

        2. Apart from reduced computational complexity GR-QRD can exploit more fine grained and coarse grained parallelisms desirable in streaming application when compared to other approaches in the literature.

        3. Our GRA implementation of QRD-GR leads to lesser time to market by reducing the re-engineering efforts.

        2×2

        Consider a 2 X 2 matrix, = 11 12 then,

        21 22

        G = 11 12

        21 22

        i.) The columns of Q are orthogonal. This means the inner product of any two columns of Q is zero (

        . = ), and the inner product of a column with

        itself is one.

        ii.) R is an upper triangular matrix. This means that any element under the main diagonal is zero.

        112 + 212 1112+2122

        =

        112+212

        0 11221221

        112+212

        (4)

        In equation: 4 where = is Givens matrix,

        Dataflow graph for Givens generation

        s = 21

        112+212

        and c = 11

        112+212

        c and s are also called cosine term and sine term[4].

        The upper triangularization process of a 4×4 matrix follows the pattern as given in figure 2:

        Fig. 2. Order of annihilation of elements in GR-QRD

        In the process, the lower triangular elements i.e., the elements below diagonal is made zero one by one by calculating corresponding sine and cosine terms. The number of iterations needed is 6 for a 4×4 matrix. For an N × N matrix, it will be

        (1). The flow of process involved is shown in architecture

        2

        below:

        Fig.3. Architecture for processing elements in GR-QRD

        Fig.4. Givens generation dataflow graph for GR-QRD

        Dataflow graph for Givens rotation/ row updating

        Fig.5. Row rotation dataflow graph for GR-QRD

        The first graph corresponds to Givens Generation (GG), next graph corresponds to row 1 update and row 2 update. For one iteration, it requires 8 multiplications, 2 additions, 2 subs tractor, square root and divider one of each. Thus for in the process of converting a 4×4 matrix it needs 48 multiplications, 12 addition, 12 subtraction, 6 divider and 6 square root operations.

      5. COLUMN-WISE GIVENS ROTATION BASED QR DECOMPOSITION

        In this paper, we analyze Column-wise Givens Rotation (GR) based QRD (GR-QRD) where we reduce the computational complexity of GR and exploit higher degree of parallelism. This low complexity Column-wise GR (CGR) can annihilate multiple elements of a column of a matrix simultaneously. The algorithm is first realized on a Two- Dimensional (2D) systolic array.

        The pattern in which each column in matrix is nullified is as follows. As we can notice, the number of iterations required is only 3 instead of 6 in normal givens rotation for a 4 × 4 matrix.

        Fig.5. Order of annihilation of matrix elements in CGR-QRD

        The 2D systolic array architecture for Colum wise givens rotation depicts the involvement of lot of overlapping in computation and communications between the processing elements. The architecture is shown in following figure:

        Fig.6. 2D systolic architecture of PEs in CGR-QRD

        In column-wise givens rotation based QR decomposition we make use of two types of processing elements. They are

        1. Givens generation PE

        2. Row updating PE

Givens generation PEs are intended to produce necessary parameters for updating row elements of the matrix with each column elimination. Mean while row updating PEs do the work of updating row elements. The structure of these processing elements have to be designed according to what dimension matrix we are dealing with and for each column elimination the givens generation PE and row updating PE has o be reconfigured accordingly.

The Givens generation PE for eliminating column 1 of a 4 × 4 matrix is as shown below:

Fig.7.Givens generation PE for column 1 elimination

Givens generation PE has complicated calculations such as square root and division. This processing element is or eliminating elements 41, 31 21 . for that it uses all the 4 elements of first column. The number of multiplications needed is 12, then 3 additions, 3 square roots and 3 divisions. It produce sine and cosine terms as well as parameters as l1, k1, l2, K2, 1/p1, 1/p2, 1/p3 for updating of 4 rows. The givens row updating is then too implemented for updating row elements with the help of these generated parameters. Structure of a row updating PE is given in following figure:

Fig.8. Row updating PE for column 1 elimination

The row updating PE is relatively simpler in structure with only multipliers, adders and sub tractors. It utilizes all elements of first 2 columns and parameters from givens generation PE. It consists of 11 multiplications, 3 additions and 3 subtractions.

By one iteration of a givens generation PE and row updating PE annihilation of column 1 elements is completed. For column 2 elimination, i.e. 2 elements nullifying in case of 4 × 4 matrix, we need a given generation PE and row updating PE as before but only this will be much simpler as it deals with a matrix size of 3× 3, the first column being already processed. It uses modified elements from previous process.

Givens generation PE for nullifying column 2.

Fig.9. Givens generation PE for column 2 elimination

The row updating PE for updating matrix elements after column 2 elimination:

Fig.10. Row updating PE for column 2 elimination

The givens generation PE for column 2 elimination uses 8 multiplications, 2 additions, 2 square roots and 2 divisions while corresponding row updating PE has 8 multiplications, 2 additions and 2 subtractions. After eliminating 2 elements of column 2, one more element in column 3 needs to be eliminated. The givens generation and row updating PE structures almost resemble normal givens rotation process as it deals with a 2 × 2 matrix.

Given generation dataflow graph for column 3 elimination is shown below:

Fig.11. Givens generation PE for column 3 elimination

Fig.12. Row updating PE for column 3 elimination

The givens generation PE has 4 multiplications, 1 addition, 1 square root and 1 division while row updating PE 5 multiplications, 1 addition and 1 subtraction.

  1. PERFORMANCE COMPARISON

    As the dimension of matrix gets higher, the performance of column-wise givens rotation shows higher difference. It is shown in the table 1. Dimension of matrix is given by n.

    TABLE 1. Comparison of computational complexity

    The difference in latency associated with both methods can be best compared with help of this illustration.

    Fig.13. Number of iterations needed by GR-QRD and CGR-QRD for each matrix dimension

  2. CONCLUSION

In this paper, a modified method of givens rotation based QR decomposition- column wise givens rotation. In the algorithm, a matrix is being made upper triangular by annihilating all elements in a column in a single go. It achieves more parallelism than normal givens rotation because the row updating process concurrently operates with givens generation. This overlap of operation as in 2D systolic array architecture gives more efficiency in performance. It successfully reduces the overhead at receiver side by transferring part of the computation worklad of detector to QR decomposition section.

VI. REFERENCES

  1. Mahdi shabany, Ameer youssef, and glenn gulak, senior member, IEEE,high throughput 0.13- m cmos lattice reduction core supporting 880 mb/s detection, IEEE Transactions on very large scale integration (VLSI) systems, vol. 21, no. 5, may 2013.

  2. X. Wang and M. Leeser, A truly two-dimensional systolic array fpga implementation of qr decomposition, ACM Trans. Embed. Comput. Syst., vol. 9, no. 1, pp. 3:13:17, Oct. 2009. [Online]. Available: http://doi.acm.org/10.1145/1596532.1596535

  3. L. Ma, K. Dickson, J. McAllister, and J. Mccanny, Modified givens rotations and their application to matrix inversion, in Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, 31 2008-april 4 2008, pp. 1437 1440.

  4. Farhad Merchant, Anupam Chattopadhyay, Ganesh Garga, S K Nandy, Ranjani Narayan, and Nandhini Gopalan, Efficient QR Decomposition Using Low Complexity Column-wise Givens Rotation (CGR), 2014 27th International Conference on VLSI Design and 2014 13th International Conference on Embedded Systems, MPSoC Architectures, UMIC Research Centre, RWTH Aachen University, 52074 Aachen,

    Germany

  5. L. Bruderer, C. Studer, M. Wenk, D. Seethaler, and A. Burg, VLSI implementation of a low-complexity LLL lattice reduction algorithm for MIMO detection, in Proc. IEEE Int. Symp. Circuits Syst., May 2010,pp. 37453748.

  6. C. Liao and Y. Huang, Power-saving 4×4 lattice reduction processor for MIMO detection with redundancy checking, IEEE Trans. Circuit Syst. II, vol. 58, no. 2, pp. 9599, Feb. 2011

  7. Pravin W. Raut, S.L. Badjate, MIMO-Future Wireless Communication, in International Journal of Innovative Technology and Exploring Engineering (IJITEE)., Volume-2, Issue-5, April 2013, 2278-3075

  8. Bosco Leung, VLSI for Wireless Communication, Springer New York Dordrecht Heidelberg London, Second edition, 2011, ISBN 978- 1-4614-0985-4.

Leave a Reply