# A Reconfigurable, Power–Scalable Rake Receiver IP for W-CDMA

A. Bianco, A. Dassatti, M. Martina, A. Molino, F. Vacca \* CERCOM – Politecnico di Torino - Italy e-mail: maurizio.martina(andrea.molino,fabrizio.vacca)@polito.it

Abstract— In recent years the wireless market has experienced exponential growth. 2G systems are essentially voice-oriented: the main innovation expected from 3G systems is ubiquitous Internet and multimedia access. The transition from 2G to 3G provides both opportunities and challenges: one way to make this migration as smooth as possible relies on the employment of reconfigurable architectures. In this paper a reconfigurable Rake Receiver for W-CDMA is proposed. Very promising results from the physical implementation on a XCV300E have been obtained.

## I. INTRODUCTION

In recent years digital telecommunication techniques have achieved wide popularity, mainly due to the diffusion of cellular phones and wireless devices. The demand for more complex and complete services, such as high speed data transmission, multimedia streaming and ubiquitous infotainment, has pushed many research groups in the electronic field towards the study of new and efficient algorithms, codes and modulations. Besides, the possibility of transmitting not only voice but also data on a tetherless network has fostered the development of new technologies and new standards for cellular communications. In the near future second generation (2G) standards, such as GSM, will share the network and the existing infrastructures with third generation (3G) ones, such as UMTS. This transition can be particularly critical for those countries wherein the GSM system has gained a widespread diffusion. One of the main differences between 2G and 3G systems is the channel access technique: while GSM is mainly TDMA oriented, 3G wireless terminals will be strongly based on CDMA [1]. In this paper the implementation of a CDMA receiver is proposed, with particular attention to the Rake architecture. The whole design has been carried out on a XILINX Virtex-E FPGA (XCV300E) in order to achieve a high degree of flexibility. In section II some basic CDMA concepts are explained and a system view is given. In section III some details about the architecture are presented, while in section IV the results obtained after the place & route step are shown, both in terms of complexity and performance. Finally in section V some conclusions are drawn.

## II. CDMA OVERVIEW

### A. Basic principles

CDMA is a technique for accessing a shared channel by means of orthogonal codes [1]. The main concept is that every user spreads its information over a certain band. The band reserved for the communication  $(B_{chip})$  ought to be wider than the one required by the information itself  $(B_{bit})$ , in order to allow the spreading operation. From an implementation point of view the spreading operation can be performed by multiplying the original signal  $x_n$  (with band  $B_{bit}$ ) by the code  $w_n$  (with band  $B_{chip}$ ). The result is that the energy of x is spread over the band of w. The resulting signal will be referred to as  $t_n$ . If several orthogonal codes are available, more users can access the same channel, provided that each user spreads with a different code. The receiver can recover the information by multiplying the received stream by the same code employed by the transmitter. Even if CDMA principles seem very simple, from the implementation point of view many challenges have to be faced. Since the signal x has been spread by the code w, the resulting signal rate becomes very high. This implies that the receiver needs to perform the first operations in the decoding chain at a very high data rate. Usually the rate of x is called *bit-rate* while the rates of w and t are referred to as *chip-rate*. Another important aspect in a CDMA receiver is synchronisation. In fact, the CDMA receiver can recover the original signal only if it is perfectly synchronised with the transmitter [1]. To achieve synchronisation the most common strategy relies on the employment of a pilot sequence. Namely, the pilot sequence is a control bit-stream, used only to ensure that the receiver is synchronised and able to start the decoding operations properly.
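The spreading and de-spreading operations described above can be illustrated with a minimal sketch. It assumes two users, ±1 data symbols, ideal chip alignment, a noise-free channel and length-4 Walsh codes; the names `WALSH_4`, `spread` and `despread` and all data values are illustrative assumptions, not part of the proposed IP.

```python
# Illustrative sketch: two users share one channel with orthogonal codes.
# Assumptions: +/-1 symbols, perfect synchronisation, no noise.

WALSH_4 = [
    [+1, +1, +1, +1],
    [+1, -1, +1, -1],
    [+1, +1, -1, -1],
    [+1, -1, -1, +1],
]

def spread(bits, code):
    """Multiply each data symbol x_n by every chip of the code w_n."""
    return [b * c for b in bits for c in code]

def despread(chips, code):
    """Correlate each block of len(code) chips with the code, then take the sign."""
    n = len(code)
    out = []
    for i in range(0, len(chips), n):
        corr = sum(r * c for r, c in zip(chips[i:i + n], code))
        out.append(1 if corr >= 0 else -1)
    return out

user_a = [+1, -1, -1, +1]
user_b = [-1, -1, +1, +1]
t_a = spread(user_a, WALSH_4[1])            # chip-rate signal of user A
t_b = spread(user_b, WALSH_4[2])            # chip-rate signal of user B
channel = [a + b for a, b in zip(t_a, t_b)]  # both users transmit at once

recovered_a = despread(channel, WALSH_4[1])
recovered_b = despread(channel, WALSH_4[2])
```

Because the two codes are orthogonal, the correlation against one code cancels the contribution of the other user, so each receiver recovers its own symbols. Each symbol occupies four chips here, which is why the chip-rate is four times the bit-rate in this toy case.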

### B. Receiver related issues

In figure 1 a CDMA receiver block scheme is shown: after the antenna and the radio-frequency circuits the signal ought to be converted from the analog domain to the digital one by an ADC. The digital block that follows is the Digital Down Converter (DDC), whose main tasks are to filter the signal and to change the data rate [5]. In order to successfully carry out CDMA de-spreading the receiver needs to know the exact spreading code employed by the transmitter. The generation of the local de-spreading sequence can be accomplished just by knowing the transmitter network identifier. However the recovery of carrier synchronisation is not a trivial task. In the literature several synchronisation methods have been proposed [1], [6], [4]. Even after carrier synchronisation, the local de-spreading code generator could exhibit some frequency drift, mainly caused by Doppler shift or by the lack of a global, unique time reference. This CDMA disadvantage can be mitigated by a proper selection of the tracking method: after the local code has been synchronised with the incoming one, a tracking loop is used to keep the phase error bounded.

<sup>\*</sup>This work has been partially supported by CERCOM (Center for Multimedia Radio Communications), Politecnico di Torino, ITALY.

Fig. 1. CDMA receiver

Once the de-spreading operation is performed, the original signal should be recovered. However the transmitted signal could reach the receiver through different paths with different delays: this phenomenon is well known in the digital communication field under the name of multipath. Instead of trying to remove, or at least to conceal, multipath effects, an interesting solution lies in the employment of a *rake receiver*. The main idea behind the rake receiver is to use different de-spreading blocks, the so-called rake *fingers* (see figure 2), each of which uses a "phase-shifted" image of the local code. The phase delay can be estimated through periodic path estimation by means of the pilot sequence. For a given path, the received signal for the kth user can be written as:

$$\mathbf{r_n}(k) = \mathbf{t_n}(k) \, \alpha_l e^{j \phi_l}$$

where  $\mathbf{t_n}(k)$  is the transmitted signal while  $\alpha_l$  and  $\phi_l$  are two parameters able to describe the attenuation and the delay of the path. It has been demonstrated [1] that the path delay can be estimated very effectively through the average of  $N_p$  received chips. For the in-phase branch  $(r_n^I(k))$  the operation to carry out will be:

$$\alpha_l \cos(\phi_l) \propto \frac{1}{N_p} \sum_{i=1}^{N_p} r_{n_i}^I(k)$$

while for the quadrature one  $(r_n^Q(k))$ :

$$\alpha_l \sin\left(\phi_l\right) \propto \frac{1}{N_p} \sum_{i=1}^{N_p} r_{n_i}^Q(k)$$

Fig. 2. Rake finger of CDMA receiver
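The two averages can be sketched numerically as follows. This is only an illustration under stated assumptions: the pilot chips are taken as already multiplied by the known pilot code (so $t_n = +1$), the noise is Gaussian with an arbitrary standard deviation, and the path gain, phase and the function name `estimate_path` are hypothetical.

```python
import cmath
import math
import random

def estimate_path(pilot_rx, n_p):
    """Average the first n_p received pilot chips on the I and Q branches."""
    i_avg = sum(r.real for r in pilot_rx[:n_p]) / n_p   # ~ alpha * cos(phi)
    q_avg = sum(r.imag for r in pilot_rx[:n_p]) / n_p   # ~ alpha * sin(phi)
    return i_avg, q_avg

rng = random.Random(0)                  # fixed seed for repeatability
alpha, phi = 0.8, math.pi / 6           # hypothetical path attenuation and phase
n_p = 32                                # number of averaged chips
gain = alpha * cmath.exp(1j * phi)

# Received pilot chips: r_n = t_n * alpha * e^{j*phi} + noise, with t_n = +1 (assumed)
pilot_rx = [gain + complex(rng.gauss(0, 0.1), rng.gauss(0, 0.1))
            for _ in range(n_p)]

i_est, q_est = estimate_path(pilot_rx, n_p)
alpha_est = math.hypot(i_est, q_est)    # recovered attenuation
phi_est = math.atan2(q_est, i_est)      # recovered phase
```

Averaging over more chips reduces the noise on the estimate, at the price of a slower reaction to channel changes.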

The selection of the  $N_p$  parameter is crucial for the rake receiver: while too small a value of  $N_p$  can lead to a suboptimal or even erroneous path estimation, a larger one tends to reduce the "reactivity" of the system to user mobility. From an architectural point of view the rake receiver exhibits an interesting degree of regularity. In particular, starting from the architecture of a single finger, it is possible to design a "sliceable" receiver, wherein the number of fingers and their effective power status can be dynamically selected and reconfigured. It is important to state that in mobile wireless terminals the energy budget is limited, and battery life is one of the most critical aspects to be considered when designing a tetherless receiver. Moreover, in a CDMA scenario the number of fingers needed in a rake receiver strongly depends on the environment around the receiver itself. Many fingers can be required under particularly bad conditions: the receiver is moving, many users are communicating on the channel, or the terrain presents obstacles (mountains, skyscrapers, ...). On the other hand, when the receiver is not moving, few users are requesting the channel or the number of multipaths is negligible, a single finger may grant satisfactory results and a noteworthy energy saving. These considerations have to be taken into proper account in the design of a rake receiver for wireless mobile CDMA applications. Finally, after the synchronisation and the de-spreading sections, the channel decoding takes place (see figure 1). In particular, depending on the required Quality of Service, different channel coding techniques can be employed.
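The multi-finger idea discussed in this section can be sketched with a small noise-free model. Everything below is an assumption made for illustration: the length-7 code, the two-path channel, the perfectly known path delays and gains, and the weighting of each finger by the conjugate of its path gain (a maximal-ratio style choice, applied here after the accumulation, which is equivalent because the operations are linear). It is not the implementation of the proposed IP.

```python
# Minimal rake-combining sketch (noise-free, two paths); all values are assumptions.
CODE = [1, 1, 1, -1, -1, 1, -1]               # hypothetical spreading code
PATHS = [(0, 1.0 + 0.0j), (2, 0.3 - 0.5j)]    # (delay in chips, complex path gain)

def spread(bits):
    """Spread +/-1 symbols with CODE."""
    return [b * c for b in bits for c in CODE]

def multipath(tx):
    """Sum delayed and scaled copies of the transmitted chips."""
    max_delay = max(d for d, _ in PATHS)
    rx = [0j] * (len(tx) + max_delay)
    for delay, gain in PATHS:
        for n, chip in enumerate(tx):
            rx[n + delay] += gain * chip
    return rx

def finger(rx, delay, gain, n_symbols):
    """De-spread with the local code shifted by `delay`, weight by the path estimate."""
    out = []
    for s in range(n_symbols):
        start = s * len(CODE) + delay
        acc = sum(rx[start + i] * CODE[i] for i in range(len(CODE)))
        out.append((acc * gain.conjugate()).real)
    return out

def rake(rx, n_symbols, active_paths):
    """Add the outputs of the active fingers and decide on the sign."""
    fingers = [finger(rx, d, g, n_symbols) for d, g in active_paths]
    sums = [sum(vals) for vals in zip(*fingers)]
    return [1 if v >= 0 else -1 for v in sums]

bits = [1, -1, -1, 1, 1, -1]
rx = multipath(spread(bits))
decoded_all = rake(rx, len(bits), PATHS)       # both fingers turned on
decoded_one = rake(rx, len(bits), PATHS[:1])   # only the strongest finger on
```

In this benign, noise-free case a single finger already decodes correctly, which mirrors the remark that fewer fingers can suffice when conditions are good; with noise and more paths the additional fingers contribute signal energy that a single finger would discard.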

## III. ARCHITECTURE

As described in the previous sections, the rake is the block devoted to exploiting the multipath phenomenon to increase the received signal-to-noise ratio. From the architectural point of view, the proposed IP is made of two main regions (see figure 3): the darker represents the control-driven blocks, while the brighter represents the data-flow ones. As far as the latter are concerned, three main parts can be identified, namely: the rake receiver core (composed of a variable number of *fingers*), the adder and the programmable comparator.

Fig. 3. Rake receiver architecture

The first is the rake receiver core, based on a parametric number of fingers, one for each path to be taken into account. A finger is made of: a path estimator able to estimate the delay and the attenuation parameters in  $N_p$  cycles, a multiplier devoted to multiplying the data by the obtained path estimate, a de-spreading unit implemented with a combinational net and an accumulator (see figure 2).

The second is an adder, devoted to summing together the contributions available from the different fingers. This unit ought to be controlled by a proper control unit (CU); in fact the adder needs to know which of the finger outputs have to be added. To dynamically reduce the power dissipation, the CU is able to selectively turn on the adder, in order to load the results produced by the proper number of fingers. A priority encoder selects which finger is going to be served. The adder performs the operation and marks the finger as "served" with a clear signal; this is simply implemented with a combinational net and some set-reset flip-flops. When all the fingers are "served" the adder generates a data ready signal and is turned off by its CU.

The third is a programmable comparator devoted to deciding the current decoded value. This block has been described to have a programmable number of thresholds, loadable from a memory and changeable at run time. The comparator starts the comparison when it receives an enable signal, which is simply the data ready produced by the adder.

Moreover, to make the architecture more flexible and more power efficient, every finger can be turned off by a power controller. This unit creates a feedback in the architecture to reduce the power consumption. The basic idea is that this block, when turned on, is able to test the result produced by the comparator against eight confidence thresholds. When all the fingers are turned on, if the comparator generates a value with a high degree of confidence with respect to the confidence thresholds, the power controller tries to turn off a certain number of fingers, while taking care to preserve a satisfactory signal-to-noise ratio.

As far as the reconfigurability of the proposed IP is concerned, several parameters have been taken into account:

- incoming data width;
- $N_p$ ;
- rake adder width;
- multiplier's pipe depth;
- adder's pipe depth;
- maximum finger number;
- number of thresholds.

TABLE I RESULTS WITH 8 BITS DATA-PATH IMPLEMENTATION

| Component   | Freq(MHz) | # LUT      | # FF       |
|-------------|-----------|------------|------------|
| Finger      | 125.8     | 378 (6%)   | 239 (4%)   |
| Adder       | 173.0     | 48 (0%)    | 54 (0%)    |
| Comparator  | 137.2     | 57 (0%)    | 61 (0%)    |
| Adder Ctrl  | 181.2     | 49 (0%)    | 12 (0%)    |
| Power Ctrl  | 178.0     | 62 (1%)    | 38 (0%)    |
| Finger Ctrl | 123.9     | 42 (0%)    | 22 (0%)    |
| Rake        | 123.9     | 1770 (29%) | 1143 (19%) |
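The control flow of the adder, the comparator and the power controller described in this section can be modelled behaviourally as in the sketch below. It is a software illustration only, not the VHDL of the proposed IP: the function names, the numeric values, the use of a single confidence threshold instead of eight, and the policy of switching off one finger at a time are all simplifying assumptions.

```python
from bisect import bisect_right

def priority_encode(pending):
    """Return the index of the lowest-numbered pending finger, or None."""
    for idx, flag in enumerate(pending):
        if flag:
            return idx
    return None

def rake_adder(finger_outputs, enabled):
    """Serve the enabled fingers one at a time and accumulate their outputs."""
    pending = list(enabled)
    total = 0
    idx = priority_encode(pending)
    while idx is not None:
        total += finger_outputs[idx]
        pending[idx] = False          # finger marked as "served"
        idx = priority_encode(pending)
    return total                      # available once every finger is served

def comparator(value, thresholds):
    """Index of the decision region of `value` for a sorted threshold list."""
    return bisect_right(thresholds, value)

def power_controller(value, enabled, confidence, min_fingers=1):
    """Switch off the last active finger when |value| exceeds the confidence level."""
    enabled = list(enabled)
    active = [i for i, on in enumerate(enabled) if on]
    if abs(value) >= confidence and len(active) > min_fingers:
        enabled[active[-1]] = False
    return enabled

finger_outputs = [40, 25, 10, 5]      # hypothetical accumulator outputs
enabled = [True, True, True, True]    # all four fingers turned on
total = rake_adder(finger_outputs, enabled)
decision = comparator(total, [0])     # one threshold: binary decision
next_enabled = power_controller(total, enabled, confidence=60)
```

With these illustrative numbers the combined value is well above the confidence level, so the model switches one finger off for the next symbol, which is the kind of feedback the power controller is meant to provide.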

## IV. EXPERIMENTAL RESULTS

The architecture described in the previous section has been implemented in VHDL. The high degree of reconfigurability needed in an SDR approach has been achieved by resorting both to the VHDL generic construct and to dynamically changeable parameters. The number of blocks that can be instantiated in the design, and in particular the number of fingers employed to exploit the multipath effect, are among the most useful parameters for allowing the proposed IP to be reused in different scenarios. The variable number of blocks in the design has been obtained through the generate statement provided by VHDL. It is worth noticing that the optimal selection of the aforementioned parameters depends on the target application. As an example, given a set of UMTS parameters, the proposed IP can become a standard-compliant rake receiver simply by assigning the proper values. As a significant case study the number of fingers has been fixed to 4 and the data-path width has been investigated for different values. Starting from the resolution of off-the-shelf ADCs, the incoming data width has been set to 8, 12 and 16 bits. As far as the other parameters are concerned, the following values have been employed:

- $N_p = 32;$
- multiplier's pipe depth = 3;
- number of thresholds = 64.

In tables I, II, III the experimental results are presented.

TABLE II RESULTS WITH 12 BITS DATA-PATH IMPLEMENTATION

| Component   | Freq(MHz) | # LUT      | # FF       |
|-------------|-----------|------------|------------|
| Finger      | 134.8     | 456 (7%)   | 285 (5%)   |
| Adder       | 161.6     | 65 (1%)    | 72 (1%)    |
| Comparator  | 129.9     | 61 (0%)    | 65 (0%)    |
| Adder Ctrl  | 181.2     | 68 (1%)    | 14 (0%)    |
| Power Ctrl  | 178.0     | 62 (1%)    | 38 (0%)    |
| Finger Ctrl | 123.9     | 42 (0%)    | 22 (0%)    |
| Rake        | 123.9     | 2122 (35%) | 1351 (22%) |

TABLE III RESULTS WITH 16 BITS DATA-PATH IMPLEMENTATION

| Component   | Freq(MHz) | # LUT      | # FF       |
|-------------|-----------|------------|------------|
| Finger      | 124.0     | 587 (9%)   | 404 (7%)   |
| Adder       | 151.6     | 82 (1%)    | 94 (1%)    |
| Comparator  | 123.4     | 65 (1%)    | 69 (0%)    |
| Adder Ctrl  | 175.8     | 89 (1%)    | 16 (0%)    |
| Power Ctrl  | 178.0     | 62 (1%)    | 38 (0%)    |
| Finger Ctrl | 123.9     | 42 (0%)    | 22 (0%)    |
| Rake        | 123.4     | 2688 (44%) | 1855 (30%) |

Starting from the aforementioned parametric description, the whole rake receiver architecture has been tested and validated in the SYNOPSYS CoCentric System Studio environment. Logic synthesis has been carried out for a XILINX VirtexE 300 (XCV300E), resorting to Synplify Pro v7.0 by Synplicity. As an example, the whole rake receiver can reach an operating frequency of 123.4 MHz with a data width of 16 bits, at the expense of less than half the available FPGA area. The complete flow for the FPGA has been performed with the XILINX ISE tools, obtaining very satisfactory results both in terms of complexity and performance: in particular, for the 16 bits case the post place and route frequency is 107 MHz. It is worth noticing that a power estimate has been obtained with XILINX XPower: the proposed architecture shows a dynamic power consumption of 202 mW. This value is very interesting since it has been obtained with all four fingers turned on. In the average case a power consumption of 149 mW has been obtained, thanks to the effect of the power controller on the status of the fingers. In order to fully validate the proposed IP, also with the limited power budget typical of mobile systems, a comparison with an ASIC implementation has been planned.

The proposed architecture and the obtained results can be compared with some other works [7], [8] and [9]. It is worth noticing that in the literature little attention is paid to reconfigurability and power-scalability issues. In particular, in [7] a rake receiver architecture is presented and some results are given for a XILINX XC4028XL implementation. However, no results are given from the power consumption point of view and dynamic power management strategies are not applied. In [8] some interesting system considerations are given regarding the implementation of a Rake Receiver for W-CDMA on an ALTERA FLEX10K. However, few results from the FPGA implementation point of view are discussed and power consumption issues are not taken into account. Finally, in [9] very interesting results are provided for a Custom Computing Machine implementation. To the best of our knowledge this paper presents the first power-scalable Rake Receiver implementation on a modern FPGA device.
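As a quick check on the figures reported in this section (a worked ratio on the quoted values, not an additional measurement), the relative power saving of the average case with respect to the case with all four fingers on, and the ratio between the post place and route frequency and the synthesis estimate for the 16 bits case, are:

$$\frac{202 - 149}{202} \approx 0.26, \qquad \frac{107}{123.4} \approx 0.87$$

That is, the power controller lowers the estimated dynamic power by roughly 26% on average, while the placed and routed design runs at a clock frequency about 13% below the synthesis estimate.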

## V. CONCLUSIONS

In this paper a reconfigurable architecture for a rake receiver has been proposed. Particular care has been taken during the design flow to account for the CDMA environment, in order to ensure a high degree of flexibility and reconfigurability. Moreover, a dynamic power scheduling approach has been adopted to better suit the limited power budget available on next generation mobile phones.

## REFERENCES

- [1] Viterbi A.J., CDMA: principles of spread spectrum communications, Addison-Wesley Publishing Company, 1995.
- [2] Buracchini E., "The software radio concept," IEEE Communications Magazine, vol. 38, pp. 138-143, Sept. 2000.
- [3] Tuttlebee W.H.W., "Software-defined radio: facets of a developing technology," IEEE Personal Communications, vol. 6, pp. 38-44, Apr. 1999.
- [4] Cummings M. and Haruyama S., "FPGA in the Software Radio," IEEE Communications Magazine, vol. 37, pp. 108-112, Feb. 1999.
- [5] Hogenauer E.B., "An economical class of digital filters for decimation and interpolation," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 29, pp. 155–162, Apr. 1981.
- [6] Louveaux J., Vandendorpe L., Cuvelier L., and Pollet T., "An Early-Late Timing Recovery Scheme for Filter-Bank-Based Multicarrier Transmission," IEEE Transactions on Communications, vol. 48, pp. 1746-1754, Oct. 2000.
- [7] Leung O., Chi-Ying T., and Cheng R.S., "VLSI Implementation of Rake Receiver for IS-95 CDMA testbed using FPGA," in Proceedings of Asia and South Pacific Design Automation Conference, 2000, pp. 3-4.
- [8] Korah S.P. and McDonald S.A., "Towards the Implementation of a WCDMA AAA Receiver on an FPGA Software Radio Platform," in Vehicular Technology Conference, 2001, vol. 3, pp. 1917-1921.
- [9] Srikanteswara S., Neel J., Reed J.H., and Athanas P., "Soft radio implementations for 3G and future high data rate systems," in Global Telecommunications Conference, 2001, vol. 6, pp. 3370-3374.