Virtex-E Extended Memory Product Backgrounder
 

 

Xilinx Deploys Copper on Network Switch FPGA Family
New Virtex-E Extended Memory Family Supports 160 Gbps Buffered Crossbar Switches 

Introduction 

Bandwidth Challenges in Next Generation System Design 

The rapid growth of networking, fueled by the Internet and corporate Intranets, demands both more bandwidth and better management of traffic across a growing number of hosts and links. Networking companies have responded by providing specialized networking equipment that can handle this growth. 

Managing this growth is complicated by the requirement to adopt new protocols and emerging standards quickly and easily. To reconcile these conflicting demands, system architects are looking for greater performance and flexibility in the components they choose. At one time, most network switch data path functions were performed in software, but now they are primarily implemented in ASICs. However, rapid improvement in FPGA architecture and performance, coupled with advanced semiconductor processes, now allows system designers to use programmable logic to integrate more functions on-chip, increase system performance, and complete their designs faster. 

The Virtex-E Extended Memory (Virtex-EM) family is a new FPGA family that provides a unique memory to logic ratio, enabling elegant system solutions for high memory bandwidth applications. Applications such as network processing, which previously could only be addressed by custom ASICs or ASSPs, can now be addressed by FPGAs. 

Based on the highly successful Virtex-E architecture that couples unparalleled density and performance with unprecedented system level features, Virtex-EM is the first production FPGA available in a 0.18 micron copper process. The flexibility of the Virtex architecture has permitted the rapid development and deployment of this unique family in response to application opportunities recognized while designing with million gate FPGAs. Virtex-EM supports all the system level features offered in Virtex-E, including eight fully digital Delay-Locked Loops, a seamless interface to a multitude of voltage and signal standards, distributed RAM, Block RAM, and support for differential signaling standards such as LVPECL, Bus LVDS and LVDS.

The higher memory to logic cell ratio now allows the inclusion of over one million bits of embedded dual-port block memory in the XCV812E. This is ideally suited for applications such as 160 Gbps buffered crossbar switches and buffer management in other high-end networking applications.

Virtex-EM Family 

Offered in two densities, the Virtex-EM family is built on a 0.18 micron, six-layer metal, copper process, and provides an unprecedented amount of embedded memory on an FPGA. 
 
Device     Logic Cells   Dual-Port Block Memory (Kbits)   Maximum Usable I/O
XCV405E    10,800        560                              404
XCV812E    21,168        1,120                            556

Memory Bandwidth 
True Dual-Port Embedded Block Memory for Highest Memory Bandwidth 

Whether memory is used as FIFOs to buffer data on and off chip, caches for high speed parallel searches, or ATM packet buffers, the system requirement for more memory grows much faster than the requirement for more logic. Xilinx pioneered the use of embedded distributed memory in its XC4000 series FPGAs, allowing the configurable logic block to implement either logic or memory. The Virtex family extended the memory offering with the addition of True Dual-Port block RAM, and the Virtex-E series increased the RAM content to as much as 832 Kbits of True Dual-Port block RAM. The Virtex-EM family provides a further leap in internal memory, to 560 Kbits in the XCV405E and 1,120 Kbits in the XCV812E, capable of 250 MHz performance. 

Diagram

Virtex-EM True Dual-Port Memory

Diagram 


The diagram above shows the implementation of True Dual-Port memory, which supports simultaneous read and write operations. Uses of True Dual-Port memory include: 
o Bi-directional FIFOs with data width conversion 
o Bi-directional RAM with data width conversion 
o FIFOs for Data Width Conversions 
o Partitioning of RAM into two single port RAMs 
o ROM with two read ports for graphics applications 
o Tables with x and y axis or sine/cosine lookups 

Each True Dual-Port memory block supports 4 Kbits of memory. Each port can be configured separately to support a variety of depth/width combinations. Embedded memory can serve to buffer high bandwidth data as well as reduce the internal processing speed by transparently converting from one data width to another. 
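
The depth/width trade-off and the data width conversion described above can be sketched with a small behavioral model. This is an illustration only, not a description of the silicon: it assumes a block holds 4,096 bits (4 Kbits with 1 Kbit = 1,024 bits), that the selectable port widths are 1, 2, 4, 8 and 16 bits, and that narrow words pack into a wider word starting from the least significant bits. The names depth_for_width and DualPortBlock are hypothetical, and the width set and packing order should be confirmed against the device data sheet.

```python
BLOCK_BITS = 4096               # assumed: 4 Kbits per block, 1 Kbit = 1,024 bits
PORT_WIDTHS = (1, 2, 4, 8, 16)  # assumed set of selectable port widths


def depth_for_width(width):
    """Return how many words of `width` bits fit in one block."""
    if width not in PORT_WIDTHS:
        raise ValueError(f"unsupported port width: {width}")
    return BLOCK_BITS // width


class DualPortBlock:
    """Toy model of one block whose two ports may use different widths."""

    def __init__(self, width_a, width_b):
        self.widths = {"A": width_a, "B": width_b}
        for w in self.widths.values():
            depth_for_width(w)          # raises if the width is not allowed
        self.bits = [0] * BLOCK_BITS    # shared storage, one entry per bit

    def _check(self, port, addr):
        w = self.widths[port]
        if not 0 <= addr < depth_for_width(w):
            raise IndexError("address out of range for this port width")
        return w

    def write(self, port, addr, value):
        w = self._check(port, addr)
        for i in range(w):              # assumed packing: LSB first
            self.bits[addr * w + i] = (value >> i) & 1

    def read(self, port, addr):
        w = self._check(port, addr)
        return sum(self.bits[addr * w + i] << i for i in range(w))


if __name__ == "__main__":
    for w in PORT_WIDTHS:
        print(f"{depth_for_width(w):>5} words x {w} bits")
    ram = DualPortBlock(width_a=8, width_b=16)
    ram.write("A", 0, 0x34)             # two byte-wide writes on port A ...
    ram.write("A", 1, 0x12)
    print(hex(ram.read("B", 0)))        # ... read back as one 16-bit word on port B
```

In this model, a byte stream written on one port is consumed as 16-bit words on the other, so the reading side needs only half as many accesses per unit of data, which is the sense in which width conversion can reduce internal processing speed.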

N x N Buffered Crossbar Switch Design using Virtex-EM 

Network Switch
Network systems are continually striving for higher bandwidth to support ever-increasing data transmission requirements. Central to the operation of a network system is the network switch itself, which routes data from selected input channels to output channels in a programmable manner. For example, a buffered crossbar switch used in an OC-192 SONET application may have 16 input and 16 output 8-bit channels, with 4 priority, or Quality of Service (QOS), levels. In this case, incoming data transmissions are buffered in a memory until the output channel is free to receive the data. Because QOS requirements differ, some incoming transmissions may need to be routed to the output channel at a higher priority than, for example, a file download. The network switch is typically implemented with an FPGA or ASIC and an external FIFO memory. With this arrangement, some lower priority transmissions may be dropped while the FPGA is processing higher priority transmissions held in the memory. 

With the large on-chip memory of the Virtex-EM devices, it is now possible to combine all the individual FIFO memories with the interconnection logic on a single FPGA. This integration supports increased data bandwidth and eliminates the possibility of dropped transmissions. For example, a single XCV812E Virtex-EM device with 1,120 Kbits of block RAM will be able to accommodate a full 16 x 16 byte-wide crossbar switch with 4 priority levels. With each of the 16 channels specified at a 10 Gbps data rate, this level of integration yields an aggregate bandwidth of 16 x 10 Gbps = 160 Gbps. 

In Figure 1, an N x N buffered crossbar switch is organized as a matrix that connects N input ports to N output ports with buffers at the crosspoints. This scheme of placing buffers at the crosspoints instead of input buffering has the advantage of no "dropped" packets or Head-Of-Line (HOL) blocking. 

OC-192 16 x 16 Buffered Crossbar Switch

Diagram 


The number of buffers required for an N x N switch equals the product of N x N x (# of priority or QOS levels). For example, a 16 x 16 switch, connecting 16 8-bit input ports to 16 8-bit output ports with 4 priority levels, requires 16 x 16 x 4 = 1,024 buffers. The amount of block RAM available on the XCV812E to implement these buffers makes it a natural fit for Gigabit per second network switches. 
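
The crosspoint organization can be illustrated with a toy behavioral model that keeps one FIFO per (input, output, priority) combination. This is a sketch under stated assumptions, not the switch design itself: the class name BufferedCrossbar, the convention that priority 0 is the highest, and the fixed scan order across inputs are all hypothetical choices made for brevity. A real design would use a fair arbiter between inputs.

```python
from collections import deque


class BufferedCrossbar:
    """Toy N x N crossbar with one FIFO per (input, output, priority)."""

    def __init__(self, n_ports, n_priorities, buffer_bytes):
        self.n = n_ports
        self.q = n_priorities
        self.capacity = buffer_bytes
        # N x N x Q independent crosspoint buffers.
        self.buffers = {
            (i, o, p): deque()
            for i in range(n_ports)
            for o in range(n_ports)
            for p in range(n_priorities)
        }

    def buffer_count(self):
        return len(self.buffers)

    def enqueue(self, inp, out, priority, byte):
        """Store one byte at a crosspoint; return False if that buffer is full."""
        fifo = self.buffers[(inp, out, priority)]
        if len(fifo) >= self.capacity:
            return False
        fifo.append(byte)
        return True

    def dequeue(self, out):
        """Serve one byte for an output, highest priority (assumed 0) first.

        Returns (input, priority, byte), or None if nothing is waiting.
        """
        for p in range(self.q):
            for i in range(self.n):     # simplistic fixed scan, not a fair arbiter
                fifo = self.buffers[(i, out, p)]
                if fifo:
                    return i, p, fifo.popleft()
        return None


if __name__ == "__main__":
    xbar = BufferedCrossbar(n_ports=16, n_priorities=4, buffer_bytes=128)
    print(xbar.buffer_count())          # 16 x 16 x 4 = 1024
    xbar.enqueue(3, 7, 2, 0xAA)         # low priority traffic for output 7
    xbar.enqueue(3, 9, 2, 0xCC)         # same input, different output
    xbar.enqueue(5, 7, 0, 0xBB)         # high priority traffic for output 7
    print(xbar.dequeue(7))              # the priority 0 byte from input 5 goes first
    print(xbar.dequeue(9))              # output 9 is served independently of output 7
```

Because each input has a separate buffer for every output, data waiting for a busy output does not hold up data from the same input that is bound for a free output, which is how the model avoids Head-Of-Line blocking.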

A fully populated 16 x 16 buffered crossbar switch with four priority levels can be implemented in an XCV812E with four 128-byte high-speed true dual-port RAM buffers at each of the 256 crosspoints (one buffer for each of the four QOS levels). The optimum crosspoint buffer size and memory latency are also important design considerations, dictated by the desired system bandwidth and the number of priority levels that must be supported. 
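
The sizing figures quoted in this section can be checked with a short calculation. The sketch below is a back-of-the-envelope budget only: it assumes 1 Kbit = 1,024 bits, that buffers are packed into 4 Kbit blocks without spanning block boundaries, and that control logic needs no block RAM. The function name crossbar_budget and its parameters are hypothetical.

```python
KBIT = 1024  # assumed: 1 Kbit = 1,024 bits


def crossbar_budget(n_ports=16, n_priorities=4, buffer_bytes=128,
                    block_kbits=4, device_kbits=1120, port_gbps=10):
    """Estimate the block RAM budget of an N x N buffered crossbar."""
    buffers = n_ports * n_ports * n_priorities
    buffer_bits = buffer_bytes * 8
    block_bits = block_kbits * KBIT
    buffers_per_block = block_bits // buffer_bits
    if buffers_per_block == 0:
        raise ValueError("this sketch does not handle buffers larger than a block")
    blocks_needed = -(-buffers // buffers_per_block)    # ceiling division
    blocks_available = device_kbits * KBIT // block_bits
    return {
        "buffers": buffers,
        "kbits_needed": buffers * buffer_bits // KBIT,
        "blocks_needed": blocks_needed,
        "blocks_available": blocks_available,
        "fits": blocks_needed <= blocks_available,
        "aggregate_gbps": n_ports * port_gbps,
    }


if __name__ == "__main__":
    for name, value in crossbar_budget().items():
        print(f"{name:>17}: {value}")
```

Under these assumptions the 1,024 buffers need 1,024 Kbits, or 256 of the 280 blocks implied by 1,120 Kbits, so the four 128-byte buffers of one crosspoint occupy exactly one 4 Kbit block.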

Additional Applications for Virtex-EM 

Other applications that can benefit from the higher block RAM integration of the Virtex-EM family include high-end video and image processing. Designers of these applications are looking for ways to increase integration and performance while keeping up with new standards. More demanding requirements for line buffering and filtering necessitate refined digital signal processing (DSP) capabilities to improve and manipulate images. These new DSP techniques require more complex computations on higher resolution images and have created new challenges for system designers. Sophisticated digital filtering also requires more data buffering, and thus more memory. Along with this demand for improved imaging comes the need for smaller systems and higher integration. 

Virtex-EM devices can be used in video and image processing applications, where the inherent flexibility of the architecture allows powerful data path and DSP operations to be implemented efficiently. Many other applications can also benefit from additional block memory for FIFOs and buffers, which can greatly enhance overall system bandwidth. Examples include digital wireless applications, high-end digital modems, and base stations. 

High-end Video and Image Processing 

Professional Video Broadcast 
Advanced video applications continue to require sophisticated digital signal processing capabilities to both improve image quality and reduce data requirements. For example, professional sports programs require very high levels of image quality, but are limited by the available broadcasting bandwidth. As a result, new DSP techniques have recently been introduced to combat this problem. As a general rule, these new techniques require more complex computations on higher resolution images in real time. The new MPEG-4 standard is being developed to replace the MPEG-2 standard, where the underlying mathematics will use Wavelets rather than simpler Discrete Cosine Transforms (DCT). These requirements create a demanding challenge for system designers. 

Traditionally, the video processing engine is implemented in an FPGA or ASIC with an external line buffer memory that can store 8 pixel lines of the image. In this case, the FPGA computes the DCT by operating on 8x8 pixel blocks (64 pixels, each with 8 to 24 bits of resolution) within the memory space. This arrangement limits the bandwidth of the system, because every memory access must go off-chip. 

With the new Virtex-EM family, it is now possible to combine the line buffer memory and the DSP engine within the same chip, thereby eliminating the bandwidth limitations of going off-chip. For example, the XCV812E is capable of storing up to 4 frames of 16 pixel lines in the high-resolution case of 1920 pixels/line. By using the large block RAM capacity of the Virtex-EM device, both the video processing engine and line buffer memory can be accommodated on the same device, which will greatly enhance the overall system bandwidth. 
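
The line buffer capacity quoted above can be sanity-checked with simple arithmetic. The sketch below is an estimate under stated assumptions: 1 Kbit = 1,024 bits, the whole 1,120 Kbits of block RAM is available for pixel storage, and "4 frames of 16 pixel lines" is read as four separate 16-line buffers. The pixel depth is not stated for this example, so 8 bits per pixel is an assumption; the function names are hypothetical.

```python
KBIT = 1024  # assumed: 1 Kbit = 1,024 bits


def line_buffer_bits(pixels_per_line, lines, bits_per_pixel, copies=1):
    """Return the storage, in bits, needed for `copies` line buffers."""
    return pixels_per_line * lines * bits_per_pixel * copies


def fits_on_device(bits, device_kbits=1120):
    """Check a requirement against the block RAM of a device (XCV812E by default)."""
    return bits <= device_kbits * KBIT


if __name__ == "__main__":
    for bpp in (8, 24):
        need = line_buffer_bits(1920, 16, bpp, copies=4)
        print(f"{bpp:>2} bits/pixel: {need // KBIT} Kbits, "
              f"fits in 1,120 Kbits: {fits_on_device(need)}")
```

Under these assumptions the 8 bits per pixel case needs 960 Kbits and fits, while 24 bits per pixel would need 2,880 Kbits and would not, so the quoted capacity appears to correspond to the lower pixel depth.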

Virtex-EM on Copper Process 

Xilinx and UMC have been close partners with a history of technological advances. Together, they have produced a continuous succession of achievements, from the highest density FPGA in production, with over 150 million transistors, to the joint development of copper interconnect technology. 

Virtex-EM is the first production FPGA family to be available with copper interconnect technology. Virtex-EM is specifically designed to take advantage of the inherent benefits that copper technology allows. In fact, the copper process delivers critical performance benefits essential in high bandwidth, internal switching applications. 

The Virtex-EM products use copper metallization for the top two metal layers. These layers supply power to the chip and route important input and output signals. The benefits of using copper interconnect are apparent in the improved performance (reduced internal voltage drops), lower power consumption, improved clock and output skew, and higher reliability. Future Xilinx products will take advantage of copper for additional layers.

Completing the Solution with Packaging, Software, and Intellectual Property 

Both the XCV405E and XCV812E devices are offered in 1.27 mm pitch BGA560 ball grid array packages, which allow the dissipation of several watts of power for high bandwidth applications. To support high I/O requirements, 1.00 mm fine pitch (FG) packages allow up to 404 I/Os on the XCV405E and up to 556 I/Os on the XCV812E. The FG676 package, with a footprint of only 27 mm by 27 mm, is available for the XCV405E, and the FG900, measuring 31 mm by 31 mm, is available for the XCV812E. 

The XCV812E device is supported by the current releases of the Alliance and Foundation Series software. The XCV405E will be supported in the Q2'00 software releases. 

In the Xilinx tradition of making cores available at the time of silicon and software availability, the Virtex-EM family is supported with the CIP_4 release of cores (http://www.xilinx.com/ipcenter/coregen/updates.htm), currently available for download from the IP Center (http://www.xilinx.com/ipcenter/index.htm). All the cores can be installed for use via the Xilinx Core Generator System. The CIP_4 cores release includes numerous Smart_IP cores that are parameterizable, optimized, and predictable. The release includes generators for asynchronous FIFOs, Block Memory Modules, Distributed Memory, Parallel Multipliers, FIR Filters, FFTs, NCOs, Sine/Cosine LUT, and many more base level functions. The popular configurable Reed Solomon Encoder and Decoder cores are also available for the Virtex-EM family. 

Xilinx is a registered trademark of Xilinx Inc. All XC-prefix product designations, Virtex, Smart-IP, XPERTs, SelectRAM+, SelectI/O, SelectI/O+, True Dual-Port, CORE Generator, LogiCORE, and REAL 64/66 PCI are trademarks of Xilinx. Other brands or product names are trademarks or registered trademarks of their respective owners. 
