

## FFT Core (1024 Points)

September 10, 1997

Product Specification - PRELIMINARY



## **DSP CORE Generator**

Xilinx Inc.

2100 Logic Drive San Jose, CA 95124

Phone: +1 408-559-7778
Fax: +1 408-559-7114
E-mail: dsp@xilinx.com
URL: www.xilinx.com

FFT technology developed by Rice Electronics.

### **Features**

- 2's complement, fixed-point arithmetic
- Real-valued input data (15 bit)
- Complex output data (16 bit, ~86 dB available output SNR)
- Transform size (N) = 1024
- No programming required
- No "twiddle factor" memory required (internal to Core)
- Processes real-time sampling rates of ~2 MHz
- Simplified interface (nominal support logic required)
- Synchronous design, optimized for XC4000E, EX, and XL families of FPGAs

### **Applications**

- Communications (high speed modems, transmultiplexers)
- Instrumentation (medical, scientific, test)
- Multi-media (signal compression/decompression)
- Military (radar, EW, ELINT, ESM)

## **General Description**

The 1024-point Fast Fourier Transform (FFT) Core is a functionally complete processor. The design requires a 1024-point external data memory and nominal interface support. The Core targets the Xilinx XC4000E, EX, and XL FPGA product series.

The Core renders the following transform for a real-valued input vector, f(n):

$$F(j) = \sum_{n=0}^{N-1} f(n)e^{\frac{-2\pi i j n}{N}}$$

for j=0 to (N/2-1)

where:

F(j) = output (frequency domain) coefficients

f(n) = input (time domain) sequence

N = length of input sequence (transform size)

The Core accepts a real-valued input sequence f(n) and produces a complex output sequence F(j). Because f(n) is real-valued, only the lower half of the set of F(j) is generated. The upper half is the complex conjugate of the lower half, F(N-j) = F*(j), and therefore carries only redundant information.
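The conjugate symmetry described above can be checked numerically. The following is a minimal Python sketch that evaluates the transform sum directly, with n running from 0 to N-1. It is a floating-point reference model only, not the Core's fixed-point algorithm, and the function names `dft_bin` and `dft_lower_half` are illustrative.

```python
import cmath

def dft_bin(f, j):
    """Evaluate F(j) = sum over n of f(n) * exp(-2*pi*i*j*n/N) directly."""
    n_points = len(f)
    return sum(
        f[n] * cmath.exp(-2j * cmath.pi * j * n / n_points)
        for n in range(n_points)
    )

def dft_lower_half(f):
    """Return F(j) for j = 0 .. N/2 - 1, the half the Core generates."""
    return [dft_bin(f, j) for j in range(len(f) // 2)]

if __name__ == "__main__":
    # Short real-valued test vector (N = 8 keeps the direct sum cheap).
    f = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
    lower = dft_lower_half(f)
    for j in range(1, len(f) // 2):
        upper = dft_bin(f, len(f) - j)
        # For real input, F(N - j) equals the conjugate of F(j).
        print(j, abs(upper - lower[j].conjugate()) < 1e-9)
```

The direct sum costs O(N²) operations, so it is only practical as a cross-check on short vectors or on a few selected bins of a 1024-point transform.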

General specifications of the FFT Core are listed in Table 1.

## **Functional Description**

The 1024-point FFT Core is a functionally complete processor requiring minimal external control. Only an External Memory (1k word) is required for Core operation. The External Memory holds the input vector f(n) to be transformed.

The Core itself requires no initialization, and may be activated whenever valid data is present in External Memory. When in operation, the Core must have exclusive access to External Memory. The Core performs "read-only" accesses to the Memory (no write operations).

Table 1. 1024-Point FFT Parameters

| Core Name | N = Size of<br>Transform | P = Clock<br>Periods <sup>1</sup> | Clock Speed <sup>2</sup> | Execution Time <sup>3</sup> | Core Size <sup>4</sup> |
|-----------|--------------------------|-----------------------------------|--------------------------|-----------------------------|------------------------|
| 1024 FFT  | 1024 points              | 17408                             |                          |                             | 532                    |

- 1. P = number of clock periods for transform execution
- 2. Maximum clock speed based on XC4000E-3 series
- 3. Execution Time = P/(Clock Speed in MHz)
- 4. Approximately 70% utilization of F/G function generators for XC4013 device
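Note 3 can be worked through as a short calculation. In the Python sketch below, P = 17408 and N = 1024 are taken from Table 1, while the 33 MHz clock is an assumed, illustrative value and not a figure from Table 1. The function names are hypothetical.

```python
def execution_time_us(clock_periods, clock_mhz):
    """Execution time in microseconds: P / (clock speed in MHz)."""
    return clock_periods / clock_mhz

def sustained_sample_rate_mhz(n_points, clock_periods, clock_mhz):
    """Input samples consumed per microsecond when transforms run back to back."""
    return n_points / execution_time_us(clock_periods, clock_mhz)

P = 17408                  # clock periods per transform (Table 1)
N = 1024                   # transform size (Table 1)
ASSUMED_CLOCK_MHZ = 33.0   # assumption for illustration only

if __name__ == "__main__":
    t_us = execution_time_us(P, ASSUMED_CLOCK_MHZ)
    rate = sustained_sample_rate_mhz(N, P, ASSUMED_CLOCK_MHZ)
    print(f"execution time: {t_us:.1f} us, sample rate: {rate:.2f} MHz")
```

Because P/N = 17, the sustained sample rate in this model is simply the clock frequency divided by 17. An assumed clock in the low 30 MHz range therefore lands near the ~2 MHz quoted under Features.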

The 1024-point FFT Core requires no external storage of constants. "Twiddle-factors" are generated internally to the Core and require no user programming.

The Core possesses physically separate input and output interfaces.

The input interface has separate data and address buses for accessing External Memory. While External Memory consists of 1024 words, only 9 address bits are required from the Core. This is due to the simultaneous access of two memory words on every read cycle, as explained below.

The output interface presents output coefficients F(j) at a constant rate. This interface provides physically separate buses for output coefficients and index information. The index identifies the specific output coefficient F(j).

### **Pinout**

#### **DATA INPUT**

The input interface includes dual 15-bit unidirectional data buses to the Core (INHI[14:0], INLO[14:0]) and a 9-bit address bus (ADR[8:0]) from the Core. These buses support read-only operations from External Memory during FFT processing.

#### **DATA OUTPUT**

Output consists of the 16-bit unidirectional Output bus (OUT[15:0]), the Index bus (INDEX[9:0]), and an output Synchronization signal (SYNC). Output coefficients F(j) are presented on the Output bus. The corresponding value of j appears on the Index bus.

The Index bus identifies the component (imaginary or real) and the coefficient number (j), associated with the data on the Output bus.

#### **TIMING INPUTS**

Timing inputs consist of a START signal and a continuous clock (FFTCK). A pre-defined number of clock pulses is required for execution of the transform (see Table 1).

Figure 1 illustrates the 1024-point FFT Core interface signals. The format of the interface signals is summarized in Table 2.



Figure 1. 1024-point FFT Core Interface

**Table 2. Core Signal Pinout** 

| Signal | Signal Direction | Description |
|--------|------------------|-------------|
| START | Input | Logic level dictates operational state: 0=Reset State (Dormant), 1=Execution State |
| FFTCK | Input | Continuous clock (see Table 1 for max. frequency) |
| INHI[14:0]<br>INLO[14:0] | Input | One sign bit + 14 magnitude bits; 2's complement notation. Dual unidirectional input buses for f(n) |
| ADR[8:0] | Output | Unsigned 9-bit bus. Specifies read address to External Memory |
| RDCK | Output | Continuous clock from Core. Synchronizes access to External Memory. Derived from FFTCK (÷2) |
| OUT[15:0] | Output | One sign bit + 15 magnitude bits; 2's complement notation. Unidirectional output bus for F(j) |
| SYNC | Output | Positive pulse indicates presence of new output component (pulse width = one clock period) |
| INDEX[9:0] | Output | Unsigned 10-bit bus. Identifies F(j) value on OUT bus. Index MSB indicates imaginary or real component: 0=Imaginary, 1=Real. Remainder of Index indicates j |
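The 2's complement formats in Table 2 (15 bits on INHI/INLO, 16 bits on OUT) can be modeled in a test bench with a pair of helpers. This is a minimal Python sketch; the helper names are illustrative and are not part of the Core.

```python
def to_twos_complement(value, bits):
    """Encode a signed integer as an unsigned bus word of the given width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)

def from_twos_complement(word, bits):
    """Decode an unsigned bus word of the given width into a signed integer."""
    word &= (1 << bits) - 1
    if word & (1 << (bits - 1)):   # sign bit set
        return word - (1 << bits)
    return word

if __name__ == "__main__":
    # A 15-bit input sample and a 16-bit output word.
    print(hex(to_twos_complement(-1, 15)))
    print(from_twos_complement(0x8000, 16))
```

A 15-bit input word covers -16384 to +16383, and a 16-bit output word covers -32768 to +32767.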

## **Timing and Control**

When the START signal is low, the Core is "reset". This prepares the Core for execution of a new transform. START must go low for a minimum of one FFTCK period for reset to occur.

While START is low, the Core interfaces are inactive. FFT processing begins when START goes high.

### Input Interface

### Input Interface Timing

The input interface requires exclusive (uninterrupted) access to the External Memory during FFT processing. This interface consists of an address bus, dual input data buses, and a continuous clock (RDCK).

RDCK is produced by the Core and can be used to synchronize External Memory to the Core. RDCK is derived from the Core input clock (FFTCK) and is half the FFTCK frequency.

The address bus from the Core changes on the rising edge of RDCK. The data buses to the Core must be stable by the next rising edge of RDCK. As seen in Figure 2, the time allocated for memory access is one RDCK period (two FFTCK periods). This is equivalent to ~60 ns at maximum clock speed.

At the Core interface, the address and data buses terminate (respectively) at the output and input of FD type registers (XC4000 library primitives). Consequently, the Core contributes minimal logic delay in the memory access path. Accordingly, most of the RDCK period is available for delay through External Memory and associated I/O buffers.



Note 1: The address bus represents the index into the input sequence f(n). INLO and INHI (the input data buses) must respond with the associated data samples within one RDCK period.

Note 2: The continuous timing sequence is maintained for the duration of the FFT process.

Note 3: All signal transitions occur on a rising clock edge.

Figure 2. Input Interface Timing

#### **External Memory Organization**

The input buffer, f(n), must be accessible in two separate halves from External Memory. The two halves must be available simultaneously on the INLO and INHI data buses. This imposes the following organizational requirements on External Memory:

The lower half of f(n) must be available at the INLO bus from the following External Memory address locations:

```
f(0)--->location 0
f(1)--->location 1
f(2)--->location 2
•
•
•
f(N/2 - 1)--->location (N/2 - 1)
```

The upper half of f(n) must be available at the INHI bus from the following External Memory address locations:

Note: the common 9-bit ADR bus is used to simultaneously address both halves of External Memory.
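The two-bank organization can be sketched as a small memory model. The Python below assumes that the upper half uses the same location numbering as the lower half, that is, f(N/2 + a) sits at location a of the INHI bank. This is inferred from the shared 9-bit ADR bus and is not a listing from this specification. The function names are illustrative.

```python
N = 1024
HALF = N // 2   # 512 locations per bank, addressed by ADR[8:0]

def split_into_banks(f):
    """Split the input vector into the INLO bank and the INHI bank."""
    if len(f) != N:
        raise ValueError(f"expected {N} samples, got {len(f)}")
    lo_bank = list(f[:HALF])   # f(0) .. f(N/2 - 1)
    hi_bank = list(f[HALF:])   # f(N/2) .. f(N - 1), assumed at locations 0 .. N/2 - 1
    return lo_bank, hi_bank

def read_cycle(lo_bank, hi_bank, adr):
    """Model one read cycle: both banks see the same ADR and respond together."""
    if not 0 <= adr < HALF:
        raise ValueError("ADR is a 9-bit address (0..511)")
    return hi_bank[adr], lo_bank[adr]   # (INHI, INLO)

if __name__ == "__main__":
    samples = list(range(N))            # f(n) = n makes the mapping easy to see
    lo, hi = split_into_banks(samples)
    print(read_cycle(lo, hi, 0))
    print(read_cycle(lo, hi, HALF - 1))
```

In hardware this corresponds to two 512-word memories (or one memory of double width) sharing the ADR bus, so that both words are available within a single RDCK period.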

## **Output Interface**

During execution, the frequency coefficients F(j) are produced at the Output bus. Separate 16-bit values are generated for the real and imaginary components of F(j). The components are produced in a specific order, as defined in Table 3.

Asserting the START signal high begins the FFT process. After an initial latency period, output components are produced at a constant rate, with 16 clock pulses between outputs (Figure 3). Initial latency is ~1024 FFTCK periods.
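The output schedule described above can be approximated with a simple clock-count model. The Python sketch below treats the latency as exactly 1024 FFTCK periods, which is an assumption because the text gives it only as an approximate figure. The names are illustrative.

```python
LATENCY_CLOCKS = 1024      # "~1024" in the text; treated as exact here (assumption)
CLOCKS_PER_OUTPUT = 16     # spacing between successive output components

def approx_output_clock(k):
    """Approximate FFTCK count, after START goes high, of the k-th output (k = 0 first)."""
    return LATENCY_CLOCKS + CLOCKS_PER_OUTPUT * k

def approx_total_clocks(num_outputs):
    """Approximate FFTCK count needed to emit num_outputs components."""
    return LATENCY_CLOCKS + CLOCKS_PER_OUTPUT * num_outputs

if __name__ == "__main__":
    print(approx_output_clock(0))
    print(approx_total_clocks(1024))
```

For 1024 output components this model gives 1024 + 16 × 1024 = 17408 clocks, which agrees with P in Table 1. The result should be read as approximate, since the exact latency is not specified.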

Table 3. FFT Output Order

| Output Bus | Index Bus LSBs | Index Bus MSB |
|------------|----------------|---------------|
| F<sub>I</sub>(0) [First out] | 0 | 0 |
| F<sub>I</sub>(0) | 0 | 0 |
| F<sub>R</sub>(0) | 0 | 1 |
| F<sub>R</sub>(0) | 0 | 1 |
| ⋮ | ⋮ | ⋮ |

Note: The 16-bit FFT core generates redundant information at the beginning and end of the transform, resulting in (N+4) total output values.

The Output Sync line produces a positive pulse for each new output component. The pulse is one clock cycle wide, and occurs one clock cycle after the appearance of a new component. Figure 3 illustrates output timing.

The Index bus identifies the component on the Output Bus. The Index MSB is low (0) for <u>imaginary</u> output and high (1) for <u>real</u> output. The remainder of the Index bus represents j, which ranges from 0 (DC) to 511.

An Index value of all zeros indicates no meaningful data is on the Output bus. This corresponds to the component $F_I(0)$, which is defined by the FFT equation as zero for a real-valued input sequence. Simulation output for this component may be either undefined (xxxx) or zero.
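The Index decoding rules above can be sketched as a capture routine for a test bench. The Python below assumes the (INDEX, OUT) pairs have already been sampled on SYNC pulses. It treats an all-zero Index as "no meaningful data" and takes the imaginary part of F(0) as zero. Function names and the sample values are hypothetical.

```python
def decode_index(index):
    """Split INDEX[9:0] into (is_real, j): MSB selects the component, 9 LSBs give j."""
    index &= 0x3FF
    return bool(index >> 9), index & 0x1FF

def to_signed16(word):
    """Interpret a 16-bit OUT word as a 2's complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def assemble(samples):
    """Build {j: F(j)} from (INDEX, OUT) pairs; repeated components overwrite."""
    real, imag = {}, {}
    for index, out in samples:
        if index & 0x3FF == 0:
            continue                      # all-zero Index: no meaningful data
        is_real, j = decode_index(index)
        (real if is_real else imag)[j] = to_signed16(out)
    # F_I(0) is never stored, so it defaults to zero here.
    return {j: complex(real[j], imag.get(j, 0)) for j in sorted(real)}

if __name__ == "__main__":
    captured = [
        (0x000, 0xABCD), (0x000, 0x1234),   # F_I(0) slots, ignored
        (0x200, 100), (0x200, 100),         # F_R(0), emitted twice
        (0x001, 0xFFFF), (0x201, 5),        # F_I(1) = -1, F_R(1) = 5
    ]
    print(assemble(captured))
```

Overwriting on repeated Index values keeps the routine indifferent to the redundant components at the beginning and end of the output stream.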



Note 1: All signal transitions occur on a rising clock edge.

Note 2: The production of sequential F(j) is for illustrative purposes only.


Figure 3. Output Interface Timing

## **Ordering Information**

This macro comes standard with the Xilinx CORE Generator. For additional information contact your local Xilinx sales representative, or e-mail requests to dsp@xilinx.com.

For information on Rice Electronics, contact:

Rice Electronics

PO Box 741

Florissant, MO 63032

Phone: +1 314-838-2942
Fax: +1 314-838-2942
E-mail: ricedsp@aol.com