![](images/vir_sm.gif) |
Virtex-E
Extended Memory Product Backgrounder
Xilinx Deploys Copper on Network Switch FPGA Family
New Virtex-E Extended Memory Family Supports 160 Gbps Buffered
Crossbar Switches
Introduction
Bandwidth Challenges in Next Generation System Design
The rapid growth of networking, fueled by the Internet and corporate
intranets, demands more bandwidth and better management of traffic across
a growing number of hosts and links. Networking companies have
responded to these demands by providing specialized networking equipment
that can handle such rapid growth.
The need to manage these demands is complicated by the requirement to
adopt new protocols and emerging standards quickly and easily. To accommodate
these conflicting demands, system architects are demanding greater performance
and flexibility in the components they choose. At one time, most network
switch data path functions were performed in software, but now they are
primarily implemented in ASICs. However, rapid improvement in FPGA architecture
and performance, coupled with advanced semiconductor processes, now allows
system designers to use programmable logic to integrate more functions
on-chip, increase system performance, and complete their designs faster.
The Virtex-E Extended Memory (Virtex-EM) family is a new FPGA family
that provides a unique memory to logic ratio, enabling elegant system solutions
for high memory bandwidth applications. Applications such as network processing,
which previously could only be addressed by custom ASICs or ASSPs, can now
be addressed by FPGAs.
Based on the highly successful Virtex-E architecture that couples unparalleled
density and performance with unprecedented system level features, Virtex-EM
is the first production FPGA available in a 0.18 micron copper process.
The flexibility of the Virtex architecture has permitted the rapid development
and deployment of this unique family in response to application opportunities
recognized while designing with million gate FPGAs. Virtex-EM supports
all the system level features offered in Virtex-E, including eight fully
digital Delay-Locked Loops, a seamless interface to a multitude of voltage
and signal standards, distributed RAM, Block RAM, and support for differential
signaling standards such as LVPECL, Bus LVDS and LVDS.
The higher memory to logic cell ratio now allows the inclusion of over
one million bits of embedded dual-port block memory in the XCV812E. This
is ideally suited for applications such as 160 Gbps buffered crossbar switches
and buffer management in other high-end networking applications.
Virtex-EM Family
Offered in two densities, the Virtex-EM family is built on a 0.18 micron,
six-layer metal, copper process, and provides an unprecedented amount of
embedded memory on an FPGA.
Device | Logic Cells | Dual-Port Block Memory (Kbits) | Maximum Usable I/O |
XCV405E | 10,800 | 560 | 404 |
XCV812E | 21,168 | 1,120 | 556 |
Memory Bandwidth
True Dual-Port Embedded Block Memory for Highest Memory Bandwidth
Whether the memory is used as FIFOs to buffer data on and off chip, caches
for high speed parallel searches, or ATM packet buffers, system requirements
for memory grow much faster than requirements for logic. Xilinx pioneered
the use of embedded distributed memory in its XC4000 series FPGAs to allow the
configurable logic block to support logic or memory. The Virtex family
extended the memory offering with the addition of True Dual-Port Block
RAM, and the Virtex-E series enhanced the RAM content to include up to
832 Kbits of True Dual-Port block RAM. The Virtex-EM family provides a
further leap in internal memory, up to 560 Kbits in the XCV405E and 1,120
Kbits in the XCV812E, capable of 250 MHz performance.
Virtex-EM True Dual-Port Memory
[Diagram: Virtex-EM True Dual-Port Memory. See the PDF version of this document.]
The diagram above shows the implementation of True Dual-Port memory,
which provides support for simultaneous read and write operations. True
Dual-Port memory use includes:
o Bi-directional FIFOs with data width conversion
o Bi-directional RAM with data width conversion
o FIFOs for Data Width Conversions
o Partitioning of RAM into two single port RAMs
o ROM with two read ports for graphics applications
o Tables with x and y axis or sine/cosine lookups
Each True Dual-Port memory block supports 4 Kbits of memory. Each port
can be configured separately to support a variety of depth/width combinations.
Embedded memory can serve to buffer high bandwidth data as well as reduce
the internal processing speed by transparently converting from one data
width to another.
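
The data width conversion described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name DualPortBlock, its methods, and the bit ordering (bit i of a word lands at address * width + i) are hypothetical choices made for illustration, and the model ignores clocking and simultaneous access. The actual port-to-bit mapping should be taken from the device data sheet.

```python
BLOCK_BITS = 4096  # one block holds 4 Kbits, as stated above


class DualPortBlock:
    """Toy model of one 4 Kbit block with two independently sized ports."""

    def __init__(self, width_a, width_b):
        for width in (width_a, width_b):
            if BLOCK_BITS % width != 0:
                raise ValueError("port width must divide the block size")
        self.bits = [0] * BLOCK_BITS
        self.width = {"A": width_a, "B": width_b}

    def depth(self, port):
        # Depth and width trade off: depth * width is always 4096 bits.
        return BLOCK_BITS // self.width[port]

    def write(self, port, addr, value):
        w = self.width[port]
        if not 0 <= addr < self.depth(port):
            raise IndexError("address out of range for this port")
        for i in range(w):
            # Assumed ordering: bit i of the word is stored at addr * w + i.
            self.bits[addr * w + i] = (value >> i) & 1

    def read(self, port, addr):
        w = self.width[port]
        if not 0 <= addr < self.depth(port):
            raise IndexError("address out of range for this port")
        return sum(self.bits[addr * w + i] << i for i in range(w))


# Port A writes bytes; port B reads the same storage as 16-bit words.
ram = DualPortBlock(width_a=8, width_b=16)
ram.write("A", 0, 0x34)
ram.write("A", 1, 0x12)
print(hex(ram.read("B", 0)))  # 0x1234
```

Under these assumptions, two byte-wide writes on one port become a single 16-bit read on the other, so the logic on the wide side can run at half the word rate of the narrow side.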
N x N Buffered Crossbar Switch Design using Virtex-EM
Network Switch
Network systems are continually striving for higher bandwidth to support
ever-increasing data transmission requirements. Central to the operation
of a network system is the network switch itself, which is used to route
data from select input channels to output channels in a programmable manner.
For example, a buffered crossbar switch used in an OC-192 SONET application
may have 16 input and 16 output 8-bit channels, with 4 priority, or Quality
of Service (QOS), levels. In this case, incoming data transmissions are
buffered in a memory until the output channel is free to receive the data.
Because QOS requirements differ, one incoming data transmission may
need to be routed to the output channel at a higher priority than another,
such as a file download. The network switch is typically implemented with
an FPGA or ASIC and an external FIFO memory. In that arrangement, some lower
priority transmissions may be dropped, because the FPGA may be busy
processing higher priority transmissions held in the memory.
With the large on-chip memory of the Virtex-EM devices, it is now possible
to combine all the individual FIFO memories with the interconnection logic
on a single FPGA. This integration supports increased data bandwidth and
eliminates the possibility of dropped transmissions. For example, a single
XCV812E Virtex-EM device with 1,120 Kbits of block RAM will be able to
accommodate a full 16 x 16 byte-wide crossbar switch with 4 priority levels.
With each of the 16 channels specified at a 10 Gbps data rate, this level
of integration yields an aggregate bandwidth of 160 Gbps.
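
The 160 Gbps figure can be checked with one line of arithmetic. The sketch below assumes the aggregate is simply the number of channels multiplied by the per-channel rate; the function name is hypothetical.

```python
def aggregate_bandwidth_gbps(channels, rate_gbps):
    """Aggregate switch bandwidth, assuming every channel runs at the same rate."""
    return channels * rate_gbps


print(aggregate_bandwidth_gbps(16, 10))  # 160
```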
In Figure 1, an N x N buffered crossbar switch is organized as a matrix
that connects N input ports to N output ports with buffers at the crosspoints.
This scheme of placing buffers at the crosspoints instead of input buffering
has the advantage of no "dropped" packets or Head-Of-Line (HOL) blocking.
OC-192 16 x 16 Buffered Crossbar Switch
[Diagram: see the PDF version of this document.]
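
The crosspoint buffering scheme can be sketched as a small behavioral model. Everything here is an illustrative assumption and not the Xilinx design: the class name BufferedCrossbar, the fixed FIFO depth, the convention that priority 0 is served first, and the round-robin choice among inputs are all hypothetical.

```python
from collections import deque


class BufferedCrossbar:
    """Toy N x N switch with one FIFO per (input, output, priority)."""

    def __init__(self, n, priorities, depth):
        self.n = n
        self.priorities = priorities
        self.depth = depth  # capacity of each crosspoint FIFO, in cells
        self.fifo = {
            (i, o, p): deque()
            for i in range(n) for o in range(n) for p in range(priorities)
        }
        self.next_input = [0] * n  # round-robin pointer per output (assumed policy)

    def enqueue(self, src, dst, prio, cell):
        q = self.fifo[(src, dst, prio)]
        if len(q) >= self.depth:
            return False  # buffer full: the sender must hold the cell and retry
        q.append(cell)
        return True

    def dequeue(self, dst):
        """Return (src, prio, cell) for one output, or None if nothing is queued."""
        for prio in range(self.priorities):  # priority 0 is served first (assumed)
            for k in range(self.n):
                src = (self.next_input[dst] + k) % self.n
                q = self.fifo[(src, dst, prio)]
                if q:
                    self.next_input[dst] = (src + 1) % self.n
                    return src, prio, q.popleft()
        return None


sw = BufferedCrossbar(n=16, priorities=4, depth=8)
sw.enqueue(0, 1, 3, "bulk")    # waits for output 1
sw.enqueue(0, 2, 0, "urgent")  # arrives later on the same input
print(sw.dequeue(2))           # (0, 0, 'urgent')
print(len(sw.fifo))            # 1024
```

Because each cell waits in the buffer for its own output, the later cell for output 2 is not held up behind the earlier cell for output 1, which is the Head-Of-Line blocking that a single input queue would suffer.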
The number of buffers required for an N x N switch equals the product
of N x N x (# of priority or QOS levels). For example, a 16 x 16 switch,
connecting 16 8-bit input ports to 16 8-bit output ports with 4 priority
levels, requires 16 x 16 x 4 = 1,024 buffers. The amount of block RAM available
on the XCV812E to implement these buffers makes it a natural fit for Gigabit
per second network switches.
A fully populated 16 x 16 buffered crossbar switch with four priority
levels can be implemented in an XCV812E with four 128-byte high-speed true
dual-port RAM buffers at each of the 256 crosspoints (four buffers to support
the QOS level of 4). The optimum crosspoint buffer size and memory latency
are also important design considerations dictated by desired system bandwidth
and the number of priority levels that must be supported.
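
The buffer count and memory budget above can be worked through numerically. This is a sketch that assumes 1 Kbit = 1,024 bits and uses the 1,120 Kbit and 4 Kbit-per-block figures given earlier in this document; the function name is hypothetical.

```python
BLOCK_RAM_KBITS = 1120  # XCV812E dual-port block memory
BLOCK_BITS = 4096       # one True Dual-Port block (4 Kbits)


def crossbar_memory(n, priorities, buffer_bytes):
    """Return (number of buffers, total buffer storage in bits)."""
    buffers = n * n * priorities
    total_bits = buffers * buffer_bytes * 8
    return buffers, total_bits


buffers, total_bits = crossbar_memory(n=16, priorities=4, buffer_bytes=128)
available_bits = BLOCK_RAM_KBITS * 1024

print(buffers)                       # 1024
print(total_bits // 1024)            # 1024 (Kbits)
print(total_bits <= available_bits)  # True
print(4 * 128 * 8 == BLOCK_BITS)     # True
```

Under these assumptions the buffers need 1,024 of the 1,120 Kbits available, and the four 128-byte buffers at one crosspoint add up to exactly one 4 Kbit block.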
Additional Applications for Virtex-EM
Other applications that can benefit from the higher block
RAM integration of the Virtex-EM family include high-end video and image
processing. Designers of these applications are looking for ways to increase
the level of integration and performance while trying to keep up with new
standards. More demanding requirements for line buffering and filtering
necessitate refined digital signal processing (DSP) capabilities to improve
and manipulate images. These new DSP techniques require more complex computations
on higher resolution images and have created new challenges for system
designers. Sophisticated digital filtering also requires more data buffering,
and thus more memory. Along with this demand for improved imaging comes
the need for smaller systems and higher integration.
Virtex-EM devices can be used in video and image processing applications,
where the inherent flexibility of the architecture allows powerful data
path and digital signal processing (DSP) operations to be efficiently implemented.
Many other applications, including digital wireless applications, high-end
digital modems, and base stations, can benefit from additional block memory
used for FIFOs and buffers, which can greatly enhance overall system bandwidth.
High-end Video and Image Processing
Professional Video Broadcast
Advanced video applications continue to require sophisticated digital
signal processing capabilities to both improve image quality and reduce
data requirements. For example, professional sports programs require very
high levels of image quality, but are limited by the available broadcasting
bandwidth. As a result, new DSP techniques have recently been introduced
to combat this problem. As a general rule, these new techniques require
more complex computations on higher resolution images on a real-time basis.
The new MPEG-4 standard is being developed to replace the MPEG-2 standard,
where the underlying mathematics will use Wavelets rather than simpler
Discrete Cosine Transforms (DCT). These requirements create a demanding
challenge for system designers.
Traditionally, the video processing engine is implemented in an FPGA
or ASIC with an external line buffer memory that can store 8 pixel lines
of the image. In this case, the FPGA computes the DCT by operating on 8x8
blocks of the image (64 pixels, each with 8 to 24 bits of resolution) within
the memory space. This scenario limits the bandwidth of the system, because
every access to the memory must go off-chip.
With the new Virtex-EM family, it is now possible to combine the line
buffer memory and the DSP engine within the same chip, thereby eliminating
the bandwidth limitations of going off-chip. For example, the XCV812E is
capable of storing up to 4 frames of 16 pixel lines in the high-resolution
case of 1920 pixels/line. By using the large block RAM capacity of the
Virtex-EM device, both the video processing engine and line buffer memory
can be accommodated on the same device, which will greatly enhance the
overall system bandwidth.
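
The line buffer capacity claim can be sanity-checked with a short calculation. This sketch reads "4 frames of 16 pixel lines" as four sets of 16 lines at 1920 pixels per line, which is an interpretation. The bits-per-pixel values and the convention 1 Kbit = 1,024 bits are assumptions, and the function name is hypothetical.

```python
AVAILABLE_BITS = 1120 * 1024  # XCV812E block memory, assuming 1 Kbit = 1,024 bits


def line_buffer_bits(sets, lines, pixels_per_line, bits_per_pixel):
    """Storage needed for `sets` groups of `lines` video lines."""
    return sets * lines * pixels_per_line * bits_per_pixel


for bpp in (8, 10, 24):
    need = line_buffer_bits(4, 16, 1920, bpp)
    print(bpp, need // 1024, need <= AVAILABLE_BITS)
# 8 960 True
# 10 1200 False
# 24 2880 False
```

Under this reading, the stated capacity holds at 8 bits per pixel (960 of 1,120 Kbits); deeper pixels would require fewer lines or fewer sets.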
Virtex-EM on Copper Process
Xilinx and UMC are close partners with a history of technological
advances. Together they have produced a continuous succession of
advances, from the highest density FPGA in production, with
over 150 million transistors, to the joint development of copper interconnect
technology.
Virtex-EM is the first production FPGA family available with copper
interconnect technology. Virtex-EM is specifically designed to take advantage
of the inherent benefits of copper technology. In fact, the copper
process delivers critical performance benefits essential in high bandwidth,
internal switching applications.
The Virtex-EM products use copper metallization for the top two metal
layers. These layers supply power to the chip and route important input
and output signals. The benefits of copper interconnect are apparent
in improved performance (from reduced internal voltage drops), lower power
consumption, improved clock and output skew, and higher reliability. Future
Xilinx products will take advantage of copper for additional layers.
Completing the Solution with Packaging, Software, and Intellectual
Property
Both the XCV405E and XCV812E devices are offered in 1.27 mm pitch BGA560 ball
grid array packages, which allow the dissipation of several watts of power
for high bandwidth applications. To support high I/O requirements,
1.00 mm fine-pitch (FG) packages allow up to 404 I/Os on the XCV405E
and up to 556 I/Os on the XCV812E. The FG676 package, with a footprint
of only 27 mm by 27 mm, is available for the XCV405E, and the FG900, measuring
31 mm by 31 mm, is available for the XCV812E.
The XCV812E device is supported by the current releases of the Alliance
and Foundation Series software. The XCV405E will be supported in the Q2'00
software releases.
In the Xilinx tradition of making cores available at the time of silicon
and software availability, the Virtex-EM family is supported by the CIP_4
release of cores (http://www.xilinx.com/ipcenter/coregen/updates.htm), which
is currently available for download from the IP Center
(http://www.xilinx.com/ipcenter/index.htm).
All the cores can be installed for use via the Xilinx Core Generator System.
The CIP_4 cores release includes numerous Smart-IP cores that are parameterizable,
optimized, and predictable. The release includes generators for asynchronous
FIFOs, Block Memory Modules, Distributed Memory, Parallel Multipliers,
FIR Filters, FFTs, NCOs, Sine/Cosine LUT, and many more base level functions.
The popular configurable Reed Solomon Encoder and Decoder cores are also
available for the Virtex-EM family.
Xilinx is a registered trademark of Xilinx Inc. All XC-prefix
product designations, Virtex, Smart-IP, XPERTs, SelectRAM+, SelectI/O,
SelectI/O+, True Dual-Port, CORE Generator, LogiCORE, and REAL 64/66 PCI
are trademarks of Xilinx. Other brands or product names are trademarks
or registered trademarks of their respective owners.