

September 10, 1997



# **DSP CORE Generator**

Xilinx Inc., 2100 Logic Drive, San Jose, CA 95124. Phone: +1 408-559-7778. Fax: +1 408-559-7114. E-mail: dsp@xilinx.com. URL: www.xilinx.com

DFT technology developed by Rice Electronics.

### Features

- 2's complement, fixed-point arithmetic
- Real-valued input data (15 bit)
- Complex output data (16 bit ~86 dB available output SNR)
- Integrated Input Buffer (no external memory requirements)
- Transform size (N) = 32, 64, or 128
- Process real-time sampling rates >1 MHz (>12 MHz for N=32)
- Simplified interface (nominal support logic required)
- Synchronous design, optimized for XC4000E, EX, and XL families of FPGAs

# Applications

- Communications (high speed modems, transmultiplexers)
- Instrumentation (medical, scientific, test)
- Multi-media (signal compression/decompression)
- Military (radar, EW, ELINT, ESM)

#### Table 1. 16-bit DFT Family Parameters

# DFT Cores (Real Data In, Complex Data Out)

Product Specification – PRELIMINARY

# **General Description**

The 16-bit Discrete Fourier Transform (DFT) Cores are functionally complete elements. The designs present a simplified interface and require no external memory. The Cores target the Xilinx XC4000E, EX, and XL FPGA product series.

The Cores render the following transform for a real-valued input vector, $f(n)$:

$$F(j) = \sum_{n=0}^{N-1} f(n)e^{-2\pi i n \frac{j}{N}}$$

for n=0 to N-1, j=0 to (N/2-1)

where:

- F(j) = output (frequency domain) coefficients
- f(n) = input (time domain) sequence
- N = length of input sequence (transform size)

The Cores accept a real-valued input sequence f(n) and produce a complex output sequence F(j). Due to the real-valued nature of f(n), only the lower half of the set of F(j) is generated. The upper half of the F(j) is the complex conjugate of the lower half, and therefore represents redundant information.
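The conjugate symmetry described above can be checked numerically. The following is a minimal floating-point sketch of the transform definition, not a model of the Core's fixed-point arithmetic; the function names `dft_coefficient` and `dft_lower_half` and the 8-point test vector are illustrative assumptions.

```python
import cmath

def dft_coefficient(f, j):
    """Evaluate F(j) = sum over n of f(n) * exp(-2*pi*i*n*j/N) from the definition."""
    n_points = len(f)
    return sum(f[n] * cmath.rect(1.0, -2 * cmath.pi * n * j / n_points)
               for n in range(n_points))

def dft_lower_half(f):
    """Return F(j) for j = 0 .. N/2 - 1, the half of the spectrum the Cores output."""
    return [dft_coefficient(f, j) for j in range(len(f) // 2)]

# Hypothetical real-valued input vector (N = 8, chosen only to keep the example short).
f = [1, 2, 3, 4, 0, -1, -2, -3]
lower = dft_lower_half(f)

# For real f(n), each upper-half coefficient F(N - j) equals the conjugate of F(j).
for j in range(1, len(f) // 2):
    upper = dft_coefficient(f, len(f) - j)
    assert abs(upper - lower[j].conjugate()) < 1e-9
```

Because the upper half can always be recovered this way, a consumer of the Core output loses nothing by receiving only j = 0 to N/2 - 1.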

The Cores are optimized for small to medium size transforms (32 - 128 points). A 32-point transform can be executed in less than 5 microseconds (XC4000-3). General specifications of the Cores are listed in Table 1.

| Core Name | N = Size of Transform | P = Clock Periods <sup>1</sup> | Clock Speed <sup>2</sup> | Execution Time <sup>3</sup> | Core Size <sup>4</sup> |
|-----------|-----------------------|--------------------------------|--------------------------|-----------------------------|------------------------|
| 16b32pt   | 32 points             | 144                            | 59 MHz                   | 2.5 µs                      | 274 CLBs               |
| 16b64pt   | 64 points             | 1088                           | 59 MHz                   | 18.5 µs                     | 302 CLBs               |
| 16b128pt  | 128 points            | 4224                           | 59 MHz                   | 71.8 µs                     | 394 CLBs               |

1. P = number of clock periods for transform execution
2. Maximum clock speeds based on XC4000E-1 series
3. Execution Time = P/(Clock Speed in MHz)
4. XC4008 = 324 CLBs, XC40010 = 400 CLBs, XC4085 = 3,136 CLBs
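As a rough check of Table 1, the execution-time formula in footnote 3 can be evaluated directly. The sketch below uses only the N, P, and clock values listed in the table; the helper names are illustrative, and the sample-rate estimate assumes one N-point vector per execution period and ignores the time needed to load the Input Buffer. The computed times (about 2.44, 18.4, and 71.6 µs) are close to the listed figures; the small differences presumably reflect rounding of the listed clock speed.

```python
# Values copied from Table 1: core name -> (N, P).
CORES = {"16b32pt": (32, 144), "16b64pt": (64, 1088), "16b128pt": (128, 4224)}
CLOCK_MHZ = 59.0  # maximum clock speed listed in Table 1

def execution_time_us(p_clocks, clock_mhz=CLOCK_MHZ):
    """Footnote 3: Execution Time = P / (Clock Speed in MHz), in microseconds."""
    return p_clocks / clock_mhz

def max_sample_rate_mhz(n_points, p_clocks, clock_mhz=CLOCK_MHZ):
    """N samples per execution period; Input Buffer load time is ignored (assumption)."""
    return n_points / execution_time_us(p_clocks, clock_mhz)

for name, (n, p) in CORES.items():
    print(f"{name}: {execution_time_us(p):.2f} us, "
          f"about {max_sample_rate_mhz(n, p):.2f} MHz sample rate")
```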

## **Functional Description**

The 16-bit DFT Cores are functionally complete processors requiring minimal external control.

The Cores incorporate an Input Buffer, which is loaded with the vector f(n) to be processed. This constitutes the **Initialization** state of the Core.

After initialization, the Core can perform the DFT function with no external control. During this time, output coefficients F(j) are produced by the Core at a constant rate. This is termed the **Execution** state of the Core.

The DFT Cores possess physically separate input and output interfaces.

The input interface is used only during the **Initialization** state. Separate data and address busses are provided, allowing access to individual locations of the Input Buffer memory.

The output interface is active only during the **Execution** state. This interface provides physically separate busses for output coefficients, and index information. The index identifies the specific output coefficient F(j).

## Pinout

#### DATA INPUT

Prior to processing, an input vector f(n) is loaded into the Core's Input Buffer memory. Data is written to the Input Buffer using the 15-bit Data Input bus (IN[14:0]), the Address bus (ADR[x:0]), and a Write Enable (WREN).

#### DATA OUTPUT

Output consists of the 16-bit Output bus (FREQ[15:0]), the Index bus (INDEX[x:0]), and an output Synchronization signal (SYNC). Output coefficients F(j) are presented on the Output bus. The corresponding value of j appears on the Index bus.

The Index bus identifies the component (imaginary or real) and the coefficient number (j), associated with the data on the Output bus.

#### TIMING INPUTS

Timing inputs consist of a START signal, and a continuous clock (MULCK). A pre-defined number of clock pulses is required for execution of the transform (see Table 1).

Figure 1 illustrates the 16-bit DFT Core interface signals.



#### Figure 1. 16-bit DFT Core Interfaces

The format of the interface signals is summarized in Table 2.

#### **Table 2. Core Signal Pinout**

| Signal     | Signal Direction | Description |
|------------|------------------|-------------|
| START      | Input            | Logic level dictates operational state: 0=Initialization, 1=Execution |
| MULCK      | Input            | Continuous clock (see Table 1 for max. frequency) |
| WREN       | Input            | Logic level controls writing to Core Input Buffer memory: 1=Write, 0=No action. Active only when START is low |
| IN[14:0]   | Input            | One sign bit + 14 magnitude bits; 2's complement notation. Uni-directional input bus for f(n) |
| ADR[x:0]   | Input            | Consists of log<sub>2</sub>N bits, where N is transform size (e.g., 5 address lines for N=32). Specifies write address to Input Buffer memory |
| SYNC       | Output           | Positive pulse indicates presence of new output component on FREQ output bus (pulse width = 1 clock period) |
| FREQ[15:0] | Output           | One sign bit + 15 magnitude bits; 2's complement notation. Uni-directional output bus for F(j) |
| INDEX[x:0] | Output           | Consists of log<sub>2</sub>N bits, where N is transform size (e.g., 5 bits for N=32). Identifies F(j) value on FREQ bus. Index MSB indicates imaginary or real component: 0=Imaginary, 1=Real. Remainder of Index indicates j |

# **Timing and Control**

The 16-bit DFT Cores possess two operational states, as follows:

1) Initialization (input vector f(n) is loaded into Core)

2) Execution (output vector F(j) is produced by Core)

The START signal determines whether the Core is in the Initialization or Execution state. The Core requires a continuous clock input (MULCK) in both states. (Table 1 lists the maximum clock frequency.)

### Initialization (START=0)

#### (Input State)

When START is low, DFT processing is disabled and the Input Buffer can be initialized (loaded). Individual address locations are loaded by means of the Address bus, Input bus, and Write Enable (WREN).

The Core latches these inputs on the rising edge of MULCK. All inputs should be stable at least 20 ns prior to the rising edge of MULCK. When WREN is high and START is low, data is written to the specified address.

The input, f(n), must be loaded into the following Input Buffer address locations:

- f(0) → location 0
- f(1) → location 1
- f(2) → location 2
- ...
- f(N-1) → location (N-1)

When START is high, WREN has no effect on the Input Buffer.
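The load sequence described in this section can be sketched as a list of per-clock input values, for example to drive a simulation testbench. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and one write per rising MULCK edge is assumed (the text above specifies only that inputs are latched on the rising edge).

```python
def to_twos_complement(value, bits=15):
    """Encode a signed integer as a 2's complement bit pattern (default: IN[14:0])."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)

def load_cycles(samples):
    """Return one dict of input pin values per write, with f(n) going to address n."""
    if len(samples) not in (32, 64, 128):
        raise ValueError("transform size N must be 32, 64, or 128")
    return [
        {"START": 0, "WREN": 1, "ADR": n, "IN": to_twos_complement(sample)}
        for n, sample in enumerate(samples)
    ]

# Hypothetical 32-point input vector: a ramp from -16 to 15.
cycles = load_cycles(list(range(-16, 16)))
print(cycles[0])   # first write: address 0, IN = 0x7FF0 (the encoding of -16)
```

After the last write, START would be driven high to begin execution.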

### Execution (START=1)

#### (Output State)

Setting START high begins DFT processing. The number of clock periods (P) required to execute the DFT is given in Table 1.

Access to the Core internal memory is disabled during the execution phase.

During execution, the frequency coefficients F(j) are produced at the Output bus. Separate 16-bit values are generated for the real and imaginary components of F(j). The components are produced in a specific order, as defined in Table 3.

#### Table 3. DFT Output Order

| Output Bus                            | Index Bus LSBs | Index Bus MSB |
|---------------------------------------|----------------|---------------|
| F<sub>I</sub>(0) [First out]          | 0              | 0             |
| F<sub>I</sub>(0)                      | 0              | 0             |
| F<sub>R</sub>(0)                      | 0              | 1             |
| F<sub>R</sub>(0)                      | 0              | 1             |
| F<sub>I</sub>(1)                      | 1              | 0             |
| F<sub>I</sub>(N/2 - 1)                | N/2 - 1        | 0             |
| F<sub>R</sub>(1)                      | 1              | 1             |
| F<sub>R</sub>(N/2 - 1)                | N/2 - 1        | 1             |
| F<sub>I</sub>(2)                      | 2              | 0             |
| F<sub>I</sub>(N/2 - 2)                | N/2 - 2        | 0             |
| F<sub>R</sub>(2)                      | 2              | 1             |
| F<sub>R</sub>(N/2 - 2)                | N/2 - 2        | 1             |
| F<sub>I</sub>(3)                      | 3              | 0             |
| F<sub>I</sub>(N/2 - 3)                | N/2 - 3        | 0             |
| F<sub>R</sub>(3)                      | 3              | 1             |
| F<sub>R</sub>(N/2 - 3)                | N/2 - 3        | 1             |
| :                                     | :              | :             |
| F<sub>I</sub>(N/4)                    | N/4            | 0             |
| F<sub>I</sub>(N/2 - N/4)              | N/2 - N/4      | 0             |
| F<sub>R</sub>(N/4)                    | N/4            | 1             |
| F<sub>R</sub>(N/2 - N/4) [Last out]   | N/2 - N/4      | 1             |

Note: The 16-bit DFT core generates redundant information at the beginning and end of the transform, resulting in (N+4) total output values.
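The pattern in Table 3 can be written out programmatically. The sketch below assumes the rows of Table 3 are read top to bottom and that the elided rows continue the same four-value pattern for each k from 0 to N/4; the function name `output_order` is illustrative.

```python
def output_order(n_points):
    """List (component, j) pairs in Table 3 order; 'I' = imaginary, 'R' = real."""
    half = n_points // 2
    order = []
    for k in range(n_points // 4 + 1):
        # Table 3 pairs k with N/2 - k; for k = 0 the table lists index 0 again.
        mirror = (half - k) % half
        order += [("I", k), ("I", mirror), ("R", k), ("R", mirror)]
    return order

order = output_order(32)
print(len(order))     # N + 4 = 36 output values
print(order[4:8])     # second group: j = 1 paired with j = N/2 - 1 = 15
```

Under this reading, the first group (k = 0) and the last group (k = N/4) each contain every value twice, which accounts for the four redundant outputs mentioned in the note.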

The output components are produced at a constant rate, with P/(N+4) clock pulses between outputs (P=number of clocks for transformation, N=transform size).
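For the three Cores in Table 1, the spacing P/(N+4) works out to a whole number of clocks. A short sketch using only the Table 1 values (the dictionary and function names are illustrative):

```python
# Transform size N -> clock periods P, copied from Table 1.
CLOCK_PERIODS = {32: 144, 64: 1088, 128: 4224}

def clocks_between_outputs(n_points):
    """Clock pulses between successive output components: P / (N + 4)."""
    return CLOCK_PERIODS[n_points] / (n_points + 4)

for n in CLOCK_PERIODS:
    print(n, clocks_between_outputs(n))   # 4.0, 16.0, and 32.0 clocks respectively
```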

The Output Sync line produces a positive pulse for each new output component. The pulse is one clock cycle wide, and occurs one clock cycle after the appearance of a new component. Figure 2 illustrates output timing.

The Index bus identifies the component on the Output Bus. The Index MSB is low (0) for <u>imaginary</u> output and high (1) for <u>real</u> output. The remainder of the Index bus represents j, which ranges from 0 (DC) to (N/2-1).

An Index value of all zeros indicates no meaningful data is on the Output bus. This corresponds to the component $F_I(0)$, which is defined by the DFT equation as zero for a real-valued input sequence. Simulation output for this component may be either undefined (xxxx) or zero.
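Downstream logic or a testbench can split the Index bus into its two fields as described above. The following is a minimal sketch; the function name `decode_index` and the 'I'/'R' labels are illustrative assumptions, while the bit layout (MSB = component, remaining bits = j, log<sub>2</sub>N bits in total) follows Table 2.

```python
def decode_index(index, n_points):
    """Split an INDEX bus value into (component, j) for transform size N."""
    bits = n_points.bit_length() - 1          # log2(N) for N = 32, 64, or 128
    if not 0 <= index < (1 << bits):
        raise ValueError("index does not fit on the INDEX bus")
    is_real = (index >> (bits - 1)) & 1       # MSB: 0 = imaginary, 1 = real
    j = index & ((1 << (bits - 1)) - 1)       # remaining bits: coefficient number j
    return ("R" if is_real else "I", j)

print(decode_index(0b10011, 32))   # ('R', 3): real component of F(3)
print(decode_index(0b00000, 32))   # ('I', 0): all zeros, no meaningful data
```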



Note: All signal transitions occur on a rising clock edge.

#### Figure 2. Execution State – Output Timing

### **Ordering Information**

This macro comes standard with the Xilinx CORE Generator. For additional information contact your local Xilinx sales representative, or e-mail requests to dsp@xilinx.com.

For information on Rice Electronics, contact:

Rice Electronics PO Box 741 Florissant, MO 63032 Phone: +1 314-838-2942 Fax: +1 314-838-2942 E-mail: ricedsp@aol.com X8157