

Parallel Multipliers – Performance Optimized

April 20, 1998

**Product Specification** 



Xilinx Inc. 2100 Logic Drive San Jose, CA 95124

Phone: +1 408-559-7778
Fax: +1 408-559-7114
E-mail: coregen@xilinx.com
URL: www.xilinx.com

### **Features**

- High performance compact implementation
- Drop-in modules for the XC4000E, EX, and XL families
- Two variable operands, 2's complement arithmetic
- Signed data, full precision outputs
- Registered inputs and outputs
- Pre-designed modules with relative placement give predictable timing
- Fast place and route times
- Supported in Viewlogic and Foundation schematic entry and simulation tools
- VHDL and Verilog instantiation code supplied for HDL designs
- High performance and density guaranteed through Relationally Placed Macro (RPM) mapping and placement technology
- Available in Xilinx CORE Generator

## **General Description**

Xilinx LogiCORE™ high-speed parallel multipliers are predefined drop-in modules ideal for fast, real-time DSP applications or any application where multiplication speed, area efficiency, and design time are important. The multipliers use highly efficient algorithms, tuned and optimally implemented in the Xilinx XC4000E, EX, and XL series of FPGAs.

Two parallel operands can be input to the multiplier core every clock cycle. After an initial latency period, a new double-precision (2N-bit) output is available every clock cycle. For example, the 12x12 multiplier can produce a result about every 11 ns in an XC4000E-1.
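The throughput and latency behavior described above can be sketched with a small behavioral model. This is an illustration only, not the core's implementation: the class name `PipelinedMultiplier` and its `clock` method are hypothetical, and the model assumes that latency is counted as the number of clock edges between presenting an operand pair and seeing its product on the output.

```python
from collections import deque


class PipelinedMultiplier:
    """Hypothetical behavioral model: one product per clock after a fixed latency."""

    def __init__(self, latency):
        # One slot per pipeline stage; None marks an output that is not yet valid.
        self.pipe = deque([None] * latency, maxlen=latency)

    def clock(self, a, b):
        """Model one rising clock edge: accept (a, b), return the oldest result."""
        out = self.pipe[0]
        # Appending to a full deque with maxlen discards the oldest entry.
        self.pipe.append(a * b)
        return out


if __name__ == "__main__":
    mult = PipelinedMultiplier(latency=5)  # assumed 5-clock latency
    for cycle in range(8):
        print(cycle, mult.clock(cycle, -3))
```

With a latency of 5, the first five calls return `None` while the pipeline fills; from then on each call returns the product of the operands presented five calls earlier, so a new operand pair can be accepted on every clock.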

Two pre-defined modules are available: an 8x8 multiplier and a 12x12 multiplier. Table 1 gives the implementation statistics for these modules.

### **Pinout**

Signal names for the schematic symbol are shown in Figure 1 and described in Table 2.



Figure 1: Core Schematic Symbol

**Table 2: Core Signal Pinout** 

| Signal | Signal Direction | Description                                         |
|--------|------------------|-----------------------------------------------------|
| A      | Input            | Parallel Data In, N-bits wide                       |
| B      | Input            | Parallel Data In, N-bits wide                       |
| C      | Input            | Clock, processes data on the low-to-high transition |
| PROD   | Output           | Parallel Data Out, 2N-bits wide                     |
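As a rough illustration of the data formats in Table 2, the following sketch treats A and B as N-bit two's complement bit patterns and produces the 2N-bit pattern expected on PROD. The helper names (`to_signed`, `to_bus`, `multiply`) are hypothetical and are not part of the core or its supplied instantiation code; the sketch models arithmetic only, not clocking.

```python
def to_signed(value, bits):
    """Interpret a bus bit pattern as a two's complement integer."""
    value &= (1 << bits) - 1
    # If the sign bit is set, the pattern represents value - 2**bits.
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def to_bus(value, bits):
    """Encode a signed integer as the bit pattern carried on a bus of this width."""
    return value & ((1 << bits) - 1)


def multiply(a_bus, b_bus, n):
    """Return the 2N-bit PROD pattern for two N-bit two's complement operands."""
    return to_bus(to_signed(a_bus, n) * to_signed(b_bus, n), 2 * n)


if __name__ == "__main__":
    # -128 x -128 = 16384, the largest product of two 8-bit signed operands.
    print(hex(multiply(0x80, 0x80, 8)))
    # -1 x 2 = -2, shown as a 16-bit two's complement pattern.
    print(hex(multiply(0xFF, 0x02, 8)))
```

Because the largest-magnitude product of two N-bit signed operands, (-2^(N-1)) x (-2^(N-1)) = 2^(2N-2), still fits in a 2N-bit signed result, a full-precision output never overflows.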

Table 1: Parallel Multiplier Implementation Statistics

| Design<br>Name | Input Data     | Output<br>Data | Pin to Pin<br>Performance <sup>1</sup><br>XC4000E-1 | Latency | CLBs<br>Used | CLB Array Size    | Smallest<br>Device |
|----------------|----------------|----------------|-----------------------------------------------------|---------|--------------|-------------------|--------------------|
| M8x8           | 8 x 8 Signed   | 16 bit         | 97 MHz                                              | 4 clks  | 70           | 10 rows x 8 cols  | 4003E              |
| M12x12         | 12 x 12 Signed | 24 bit         | 89 MHz                                              | 5 clks  | 156          | 14 rows x 13 cols | 4005E              |

#### Note:

1. Based on XC4000E-1 advanced speed files.
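The figures in Table 1 can be turned into a few derived quantities with simple arithmetic. The sketch below assumes that the performance column can be read as a maximum clock rate and that latency is measured in periods of that clock; the dictionary `TABLE_1` and the function `derived` are hypothetical names used only for this illustration.

```python
# Figures copied from Table 1:
# (clock rate in MHz, latency in clocks, CLBs used, array rows, array cols)
TABLE_1 = {
    "M8x8": (97, 4, 70, 10, 8),
    "M12x12": (89, 5, 156, 14, 13),
}


def derived(name):
    """Compute clock period, latency in ns, and CLB array fill for one module."""
    mhz, latency_clks, clbs, rows, cols = TABLE_1[name]
    period_ns = 1000.0 / mhz  # 1 / f, with f in MHz, gives the period in ns
    return {
        "period_ns": period_ns,
        "latency_ns": latency_clks * period_ns,
        "array_clbs": rows * cols,
        "array_fill": clbs / (rows * cols),
    }


if __name__ == "__main__":
    for name in TABLE_1:
        d = derived(name)
        print(f"{name}: period {d['period_ns']:.1f} ns, "
              f"latency {d['latency_ns']:.1f} ns, "
              f"{d['array_fill']:.0%} of a {d['array_clbs']}-CLB array")
```

Under these assumptions, the 89 MHz figure for the M12x12 corresponds to a period of about 11.2 ns, and in both modules the CLBs used are fewer than the CLBs in the bounding array (70 of 80 and 156 of 182).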

## **Multiplier Trade-offs**

Two different implementations of parallel multipliers trade area for speed. The area-efficient designs consume about one-fourth fewer CLBs than the high-speed designs in the 4000E family.

The additional routing resources in the 4000EX and 4000XL families will increase the performance of the area-efficient designs. In addition, both structures will benefit from the overall performance increase derived from the 4000XL 0.35 micron process technology.



Figure 2: Trade-offs of Area Versus Speed Optimization for Multipliers

## **Ordering Information**

This macro comes free with the Xilinx CORE Generator. For additional information contact your local Xilinx sales representative, or e-mail requests to coregen@xilinx.com.