Virtex Series FPGAs
System Performance/Bandwidth
  
All material pertains to both Virtex and Virtex-E unless specifically noted in parentheses. 

Technical Briefs 
 
  Bandwidth (Virtex-E) 
  I/O Bandwidth Scalable to 200 Gbps 
  Eight High Performance DLLs 
  Maximizing Bandwidth Using Double Data Rate Memory 
  SelectI/O+ Technology 
  Differential Signaling: LVPECL, LVDS, Bus LVDS
  True Dual-Port Embedded Block Memory 
  Packaging, Software & Intellectual Property 
  System Timing (Virtex)  



Misc. Application Notes 

All Other Virtex Application Notes
Customer Testimonials

What Customers Think About DLLs

 

The Xilinx Virtex-E Family: Bandwidth Solution for Next Generation High Performance Systems 

With the introduction of the Virtex series, Xilinx redefined the field programmable gate array (FPGA). Unparalleled density and performance, coupled with a powerful set of system level features, enable designers to architect their systems with the Virtex capabilities at the core. 

As FPGAs are designed into system critical applications, the criteria for measuring an FPGA have grown from the traditional measurements of density and performance to include bandwidth capability. Bandwidth is the measure of how many actions can be accomplished in a given amount of time and is the key differentiating factor for many end products. Computing architectures measured in millions of instructions per second (MIPS) and data communication systems measured in gigabits per second highlight the importance of bandwidth. 

The new Virtex-E family of FPGAs is built on the highly successful Virtex architecture. Leveraging the 0.18-micron, six-layer metal technology, Virtex-E devices push the traditional FPGA measurements to two million gates, 804 I/Os and over 311 Megabits per second (Mbps) performance. Combined with system level features for clock management, multiple standard I/O, and embedded True Dual-Port memory, the Virtex-E family is designed to support the bandwidth requirements of next generation high performance systems. 
 
Device      Logic Cells   Dual-Port Block Memory (Kbits)   Maximum Usable I/O   I/O Bandwidth (Gbps)
XCV50E      1728          56                               176                  44
XCV100E     2700          80                               176                  44
XCV200E     5292          112                              284                  71
XCV100E     2700          40                               176                  44
XCV200E     5292          56                               284                  71
XCV300E     6912          128                              316                  79
XCV400E     10800         160                              404                  100
XCV600E     15552         288                              512                  127
XCV1000E    27648         384                              660                  164
XCV1600E    34992         576                              724                  180
XCV2000E    43200         640                              804                  200
XCV2600E    57132         736                              804                  200
XCV3200E    73008         832                              804                  200

The Virtex-E family ranges from the 50,000-gate XCV50E device to the industry's first FPGA to exceed three million gates, the XCV3200E device. 
 

I/O Bandwidth Scalable to 200 Gigabits per second (Gbps)

The I/O bandwidth is calculated by multiplying the 311 Mbps per-pin I/O rate by 80 percent of the maximum usable I/O count, where 80 percent is the assumed typical share of a device's I/Os used as data signals. A given design will very likely require multiple high bandwidth data ports, with the bandwidth distributed across those ports. For next generation systems, port bandwidth on the order of 10 Gbps, such as OC-192 data rates, is the leading edge. With the high bandwidth capability supported by the Virtex-E architecture, several 10 Gbps ports can be achieved within a single device. 
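
The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function and constant names are hypothetical and do not come from any Xilinx tool.

```python
# Illustrative sketch of the I/O bandwidth estimate described in the text.
# Names are hypothetical; only the 311 Mbps rate and the 80 percent
# data-signal assumption come from the brief.

PIN_RATE_MBPS = 311        # per-pin I/O rate quoted for Virtex-E
DATA_PIN_FRACTION = 0.80   # assumed share of I/Os used as data signals


def io_bandwidth_gbps(max_usable_io: int) -> float:
    """Estimate aggregate I/O bandwidth in Gbps for a device."""
    data_pins = max_usable_io * DATA_PIN_FRACTION
    return data_pins * PIN_RATE_MBPS / 1000.0


if __name__ == "__main__":
    for device, pins in [("XCV50E", 176), ("XCV1000E", 660), ("XCV2000E", 804)]:
        print(f"{device}: {io_bandwidth_gbps(pins):.1f} Gbps")
```

For the 804-I/O devices this gives roughly 200 Gbps, which matches the headline figure in the table.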

Bandwidth Enabling Technology  

Simply having input and output pins that can toggle at high frequencies is only one part of the complete solution for addressing next generation bandwidth requirements. Precise clock management that controls timing relationships between external clock and data must allow the device to interface with a variety of external components with some flexibility. Once the high bandwidth is captured at the device pins, capabilities for internal memory and logic must process the data at the required bandwidth. Furthermore, to communicate with external interfaces (such as with high-speed external memory or system backplanes) the device pins must address a variety of signal standards. The Virtex-E devices contain advanced system technology to support the bandwidth requirements throughout the system. The illustration below shows a block diagram of the Virtex-E bandwidth enabling technology including 100 percent digital delay lock loops (DLL), True Dual-Port embedded memory and SelectI/O+ technology to address these areas of high-bandwidth support. 

 

Virtex-E Architecture Overview 

  
 
 
Eight High Performance DLLs - Drop-in Bandwidth Optimization 

Supporting the highest bandwidth data rates between devices requires advanced clock management technology such as DLL. The DLL circuitry allows for very precise synchronization of external and internal clocks. Xilinx was the first to deliver DLLs in programmable logic by offering four 200 MHz DLLs in every Virtex device. The Virtex-E family takes this technology to the next level with devices containing eight DLLs capable of over 311 MHz. As a fully digital implementation, the Virtex and Virtex-E DLLs do not have the typical problems of an analog phase locked loop (PLL) including board isolation and decoupling of power and ground. Virtex-E DLLs provide precise clock edges through phase shifting, frequency multiplication, and frequency division. The precise duty cycle generation is critical for high performance applications (like Double Data Rate, or DDR) in which a slight shift in duty cycle can dramatically decrease overall system performance. 
 

 Bandwidth Critical Specifications of the Virtex-E DLL
 
Parameter                      Value
Maximum Output Frequency       320 MHz*
Maximum Output Jitter          100 ps
Output Frequency Duty Cycle    50% +/- 100 ps
* Based on Virtex-E -7 speed grade product
 

DLL Application Notes and Reference Designs


Application Note: Using the Virtex Delay-Locked Loop
Reference Design: Using the Virtex Delay-Locked Loop (Internet Link)

Maximizing Double Data Rate Memory Bandwidth with Virtex-E DLL  

A key technique for increasing the bandwidth of a particular data port is to have signals change on both edges of a clock, commonly referred to as the Double Data Rate technique. Memory suppliers have already started to support this technique to increase the memory bandwidth of their devices. At high frequencies, signal integrity limits the clock performance, which limits the bandwidth of the data. Bandwidth for the port is immediately doubled if the architecture can change data at each edge of a system clock. For this technique it is critical that the clock duty cycle be held very close to 50 percent. Because Virtex-E DLLs can generate clocks with a duty cycle guaranteed to be within 100 picoseconds (ps) of 50 percent, system designers can achieve the maximum memory bandwidth in a DDR application. The following diagram demonstrates how Virtex-E DLLs help achieve maximum bandwidth in a 266 MHz DDR application. For more details regarding DLL, please see the DLL Tech Topic.
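
To see why duty cycle matters for DDR, the sketch below computes the per-pin data rate and the shortest half period available to each data bit. It assumes a 133 MHz clock (giving 266 Mbps per pin) and reads the 100 ps specification as the maximum deviation of each half period from the ideal; both are assumptions made for illustration, and the function names are hypothetical.

```python
# Illustrative DDR timing arithmetic under stated assumptions:
#  - the clock frequency (133 MHz) is an assumed example value
#  - duty-cycle error is modeled as a +/- deviation of each half period


def ddr_rate_mbps(clock_mhz: float) -> float:
    """Data changes on both clock edges, so the per-pin rate doubles."""
    return 2 * clock_mhz


def min_half_period_ps(clock_mhz: float, duty_error_ps: float) -> float:
    """Shortest half period when the duty cycle is off by duty_error_ps."""
    period_ps = 1_000_000 / clock_mhz  # 1 / MHz = microseconds -> ps
    return period_ps / 2 - duty_error_ps


if __name__ == "__main__":
    print(ddr_rate_mbps(133), "Mbps per pin")
    print(round(min_half_period_ps(133, 100)), "ps worst-case bit window")
```

Under these assumptions the ideal half period is about 3.76 ns, so a 100 ps error costs under three percent of the bit window, whereas a looser duty cycle would eat directly into setup and hold margin.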
 

 Virtex-E Interfacing to a 266 MHz DDR SDRAM Memory 
  
 
SelectI/O+ Technology: Flexibility, High Bandwidth, and Superior Signal Integrity  

To meet these bandwidth requirements, electrical signals must travel across a printed circuit board at over 100 MHz, and standard TTL and CMOS signaling cannot keep pace. With the Virtex series, Xilinx pioneered the SelectI/O technology, designed to support 200 MHz I/O and to allow a single device to interface to any device without external converters. Virtex-E SelectI/O+ technology expands the performance and flexibility by supporting high performance I/O standards such as HSTL and SSTL at over 311 Mbps per pin. In addition, Virtex-E devices are the first programmable logic devices to interface directly to differential I/O standards including LVDS, Bus LVDS (BLVDS) and LVPECL. The Virtex-E family offers a hierarchy of differential support, including up to 36 I/O pairs for LVDS and LVPECL operating at 622 Mbps, and up to 344 differential pairs operating at over 311 Mbps. The 344 differential pairs at over 311 Mbps provide a maximum bandwidth of over 100 Gbps, which can be distributed over the three differential signal standards as needed. For the first time in a programmable device, system designers can leverage the high bandwidth and noise immunity characteristics of these standards. For more information, see  Table of Standards Supported 
 

Virtex-E Aggregate Bandwidth Summary

 

                             Number of Device I/O Pins
I/O Standard   Type           1          2          32        72        688        804
SSTL           Single Ended   311 Mbps   622 Mbps   10 Gbps   22 Gbps   214 Gbps   250 Gbps
HSTL           Single Ended   311 Mbps   622 Mbps   10 Gbps   22 Gbps   214 Gbps   250 Gbps
GTL+           Single Ended   311 Mbps   622 Mbps   10 Gbps   22 Gbps   214 Gbps   250 Gbps
LVDS           Differential   n/a        622 Mbps   10 Gbps   22 Gbps   107 Gbps   n/a
LVPECL         Differential   n/a        622 Mbps   10 Gbps   22 Gbps   107 Gbps   n/a
Bus LVDS       Differential   n/a        311 Mbps   5 Gbps    11 Gbps   107 Gbps   n/a
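
The entries in the summary above follow from the pin count, the per-signal rate, and whether a signal occupies one pin (single ended) or two (differential). The sketch below reproduces that arithmetic; the function name is hypothetical and the rates are the ones quoted in this brief.

```python
# Illustrative aggregate bandwidth arithmetic for the summary table.
# A differential signal uses a pair of pins, so N pins carry N // 2 signals.


def aggregate_mbps(pins: int, rate_mbps: int, differential: bool) -> int:
    """Aggregate bandwidth in Mbps for a group of device I/O pins."""
    signals = pins // 2 if differential else pins
    return signals * rate_mbps


if __name__ == "__main__":
    # Single-ended SSTL on all 804 pins at 311 Mbps per pin
    print(aggregate_mbps(804, 311, differential=False) / 1000, "Gbps")
    # 36 LVDS pairs (72 pins) at 622 Mbps per pair
    print(aggregate_mbps(72, 622, differential=True) / 1000, "Gbps")
    # 344 differential pairs (688 pins) at 311 Mbps per pair
    print(aggregate_mbps(688, 311, differential=True) / 1000, "Gbps")
```

The results (about 250, 22 and 107 Gbps) round to the values shown in the table.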

 

High-Performance Differential Signaling: LVPECL, LVDS, and Bus LVDS  

Typical aggregate bandwidth requirements for leading edge systems are exceeding 100 Gbps. Increasingly, leading systems designers are turning to differential signaling as the mechanism of choice for these requirements. Differential signaling enables high bandwidth while reducing power, increasing noise immunity, and decreasing EMI emissions. Virtex-E devices meet this emerging challenge with unprecedented capabilities and support for high-performance differential signaling. Virtex-E SelectI/O+ technology addresses the three leading industry-standard differential signaling standards: LVPECL, LVDS, and Bus LVDS (BLVDS). 

LVPECL I/O is widely used in 100+ MHz inter-chip signaling in high-speed data communications and instrumentation systems. Fiber-Optic Network Interfaces and gigahertz Analog-to-Digital Converters, for example, rely on LVPECL I/O to achieve gigabit per second bandwidth. All Virtex-E I/Os support LVPECL input, output, and I/O signaling. This unparalleled flexibility enables users to create interfaces to hundreds of industry-standard LVPECL devices. 

In addition to high-speed interfacing, LVPECL is the industry standard for transmission of precise, onboard clocks at frequencies in excess of 100 MHz. While traditional LVTTL clock sources are typically limited to 100 MHz and below (due to the fundamental signal integrity limits), LVPECL clock sources provide operation up to 400 MHz. As FPGA system clock frequencies exceed 100 MHz, LVPECL clocking becomes an essential requirement. The Virtex-E device supports high-performance LVPECL clock inputs for global and local clocking, with frequencies in excess of 300 MHz. In addition, through the use of its multiple DLLs coupled with SelectI/O+ technology, the Virtex-E devices enable zero-delay conversion of precise LVPECL clocks into virtually any required I/O standard. This facilitates the use of Virtex-E FPGAs as an integral part of high-performance board-level clock distribution strategies. 

In addition to LVPECL, the Virtex-E family has the industry’s first programmable devices to support Low-Voltage Differential Signaling (LVDS). LVDS exists in two commonly available variants, LVDS and Bus LVDS. LVDS is optimized for high-speed point-to-point links, while Bus LVDS is optimized for backplane applications employing Multi-Drop (One Transmitter, Multiple Receiver), and MultiPoint (Multiple Transmitters and Receivers) configurations. The Virtex-E device provides unparalleled support for both LVDS and Bus LVDS, with support on all devices and speed grades, and up to 688 pins (344 pairs) of LVDS and/or Bus LVDS capabilities on the largest device, providing an aggregate bandwidth in excess of 100 Gbps. The Virtex-E Bus LVDS I/Os are fully compatible with industry-standard Bus LVDS devices from National Semiconductor and other vendors. 
 

True Dual-Port Embedded Block Memory for Highest Internal Memory Bandwidth  

Whether memory is used as FIFOs to buffer data on and off chip, caches for high speed parallel searches, or ATM packet buffers, system requirements for memory grow much faster than requirements for logic. Xilinx pioneered embedded distributed memory (with its SelectRAM+ technology) in its XC4000 FPGAs, allowing the configurable logic block to support either logic or memory. With the Virtex series, this technology was enhanced to include up to 128 Kbits of True Dual-Port fast embedded block RAM. The Virtex-E family again provides a quantum leap in internal memory bandwidth by supporting up to 832 Kbits of True Dual-Port RAM (208 blocks of 4 Kbits each) capable of 250 MHz performance. 
 

Comparison of Virtex-E True Dual-Port Memory to a Two-Port Memory 
This diagram shows that in order to emulate most of the functionality of a dual-port memory, two-port memory architectures require twice the number of memory bits and multiplexing of address and data. This results in two-port memory at roughly half the bandwidth and efficiency of the Virtex-E True Dual-Port memory in any given configuration. 
 
Managing Bandwidth Using True Dual-Port Memory  

Each True Dual-Port memory block supports 4 Kbits of memory. Each port can be configured separately to support a variety of depth/width combinations. Embedded memory can serve to buffer high bandwidth data as well as reduce the internal processing speed by transparently converting from one data width to another. The diagram below demonstrates an OC-192 application example. A data port with OC-192 bandwidth comes in on 32 BLVDS pairs running at 311 Mbps per pair. Eight blocks of embedded RAM are used to buffer the data internally. The port taking data from the I/O register to the memory is configured as 1K deep by 32 bits wide. The port leading to the internal processing of the data is configured as 256 deep by 128 bits wide. Internal processing of the 128-bit data need only run at 78 MHz to keep up with the OC-192 bandwidth. An outgoing port would be configured similarly. 
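
The numbers in this OC-192 example can be checked with a short calculation. The sketch below is illustrative only; the helper names are hypothetical, and the 4 Kbit block size is taken as 4096 bits.

```python
# Illustrative check of the OC-192 buffering example.
# Eight 4 Kbit blocks are viewed through two ports of different widths.

BLOCK_BITS = 4 * 1024  # one True Dual-Port block, assumed to be 4096 bits


def port_depth(num_blocks: int, width_bits: int) -> int:
    """Number of words visible through a port of the given width."""
    return num_blocks * BLOCK_BITS // width_bits


def internal_clock_mhz(pairs: int, rate_mbps: float, width_bits: int) -> float:
    """Clock rate needed on the wide port to match the incoming bandwidth."""
    return pairs * rate_mbps / width_bits


if __name__ == "__main__":
    print(port_depth(8, 32), "words on the 32-bit I/O side")
    print(port_depth(8, 128), "words on the 128-bit processing side")
    print(internal_clock_mhz(32, 311, 128), "MHz internal rate")
```

The incoming 32 pairs at 311 Mbps total 9,952 Mbps, and spreading that across a 128-bit port gives 77.75 MHz, which the text rounds to 78 MHz.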
 

Buffering the High Bandwidth Data with Virtex-E True Dual-Port BlockRAM
 
Completing the Solution with Packaging, Software and Intellectual Property  

To support tens of gigabits per second of bandwidth through a device, the package must be capable of packing many high performance I/Os into limited board space. At the same time, the package may need to dissipate several watts of power for the highest bandwidth applications. Virtex-E devices continue the tradition of the industry's most reliable and flexible packaging. Mainstream plastic quad flat pack (PQFP) and 1.27 mm ball grid array (BGA) as well as leading edge 0.8 mm chip scale package (CSP) and 1.0 mm fine pitch BGA (FG) are supported across the family. For the fine pitch 1.0 mm BGA offering, the Virtex-E family introduces three new FG packages: 31 mm by 31 mm FG900; 35 mm by 35 mm FG1156; and a thermally enhanced 42.5 mm by 42.5 mm FG860. The Virtex-E family can now support up to 804 I/Os with a board real estate as small as 35 mm by 35 mm. These packages set new standards in I/Os per square inch as well as maximum bandwidth per square inch. 

Exploiting all the bandwidth capabilities of the three-million-gate XCV3200E device will likely require work distributed among multiple digital designers. Xilinx introduced Virtex-E support with version 2.1i software, which reduced the industry leading compile times by an additional 50 percent and delivered the Xilinx Internet Team Design tools to coordinate communication between designers, resulting in optimized design cycles. 

Smart-IP technology allows Xilinx to offer architecture independent IP that provides the best in predictability, flexibility and performance. The CORE Generator 2.1i tools support the Virtex-E devices at silicon availability. Popular high bandwidth cores including the Real-PCI 64/66 solution and the 32/33 Xilinx LogiCORE PCI solution support Virtex-E devices today. With the capabilities of the Virtex-E architecture coupled with the Xilinx LogiCORE, AllianceCORE and XPERTs programs, many high bandwidth cores are in development. For up-to-date information on the latest cores, visit the Xilinx IP Center at www.xilinx.com/ipcenter. 

Summary  

As its unprecedented adoption rate demonstrates, the Virtex series redefined the FPGA with a feature set that moved FPGAs from glue logic to the core of the system design. With significant performance and flexibility enhancements in clock management, SelectI/O+ technology, True Dual-Port block memory, and high performance differential signaling support, the Virtex-E family is well positioned to continue the success of the Virtex platform by enabling system designers to meet the bandwidth requirements of next generation high performance systems. By supporting overall bandwidth requirements previously addressed only by inflexible ASIC technology, Virtex-E devices will further increase the application space for programmable logic. 
 

System Timing Solution (Virtex)

Highlights 

The Virtex family solves clock skew issues by allowing internal and external clock synchronization.  With four Delay Locked Loops, your system's clock skew will be removed.  You can get clock-to-output delays of less than three nanoseconds and very substantial increases in system performance. 
 
 
Fully Digital Delay Locked Loops (DLL) 
  • Four independent DLL circuits for internal and external clock synchronization
  • 200+ MHz chip-to-chip communication
  • Less than 3 ns clock-to-output time across all devices
  • Clock doubling and clock division

  • 0°, 90°, 180°, and 270° phase clocks
Zero Delay Clock Management 
Multiple DLLs facilitate precise generation of zero-delay clocks both inside and outside the FPGA for highest chip-to-chip speeds. 
 
Overview of Delay-Locked Loops  

As FPGAs grow in size, quality on-chip clock distribution becomes increasingly important. Clock skew and clock delay impact device performance, and managing them with conventional clock trees becomes more difficult in large devices. The Virtex series of devices resolves this potential problem by providing four fully digital, dedicated on-chip Delay-Locked Loop (DLL) circuits, which provide zero propagation delay and zero clock skew between output clock signals distributed throughout the device. 

Each DLL can drive up to two global clock routing networks within the device. The global clock distribution network minimizes clock skews due to loading differences. By monitoring a sample of the DLL output clock, the DLL can compensate for the delay on the routing network, effectively eliminating the delay from the external input port to the individual clock loads within the device. 

In addition to providing zero clock skew and zero delay with respect to a user source clock, the DLL can provide multiple phases of the source clock. The DLL can also act as a clock doubler or it can divide the user source clock by up to 16. 

Clock multiplication gives you a number of design alternatives. For instance, a 50 MHz source clock doubled by the DLL can drive an FPGA design operating at 100 MHz. This technique can simplify board design because the clock path on the board no longer distributes such a high-speed signal. A multiplied clock also provides designers the option of time-domain-multiplexing, using one circuit twice per clock cycle, consuming less area than two copies of the same circuit. To increase the effective clock multiplication factor to four, you can connect two DLLs in series. 
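
The frequency relationships described above can be sketched as simple arithmetic. This is a simplified model for illustration, not a description of the DLL primitive itself; the function names are hypothetical, and the only limits taken from the text are the doubling function and division by up to 16.

```python
# Simplified, illustrative model of DLL clock doubling and division.
# Only the x2 doubler and the divide-by-up-to-16 limit come from the text.

MAX_DIVIDE = 16


def dll_double(f_in_mhz: float) -> float:
    """A single DLL used as a clock doubler."""
    return 2 * f_in_mhz


def dll_divide(f_in_mhz: float, divisor: float) -> float:
    """A single DLL used as a clock divider (divisor of at most 16)."""
    if not 1 <= divisor <= MAX_DIVIDE:
        raise ValueError("divisor must be between 1 and 16")
    return f_in_mhz / divisor


if __name__ == "__main__":
    board_clock = 50.0                       # MHz, example from the text
    print(dll_double(board_clock))           # one DLL: 100 MHz
    print(dll_double(dll_double(board_clock)))  # two DLLs in series: 200 MHz
    print(dll_divide(board_clock, 16))       # slowest divided clock
```

Chaining two doublers models the series connection mentioned above, giving an effective multiplication factor of four while the board still distributes only the 50 MHz source clock.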

The DLL can also act as a clock mirror. By driving the DLL output off-chip and then back in again, you can use the DLL to deskew a board level clock between multiple devices. 

To guarantee that the system clock is established before the device “wakes up,” the DLL can delay the completion of the device configuration process until after the DLL achieves lock. 

By taking advantage of the DLL to remove on-chip clock delay, you can greatly simplify and improve system level design involving high-fanout, high-performance clocks. 

Customer Comments:

"Virtex FPGAs have allowed us to implement our next generation digital TV broadcast systems in record time," said John Simmons, project manager, of NDS, a world leader in digital broadcasting solutions. "A key time saver was the availability of multiple DLLs that allowed us to synchronize a 74 MHz clock to more than 30 devices including multiple FPGAs, SDRAMs, and other components. Designing a no-skew clock system from scratch would take months. Xilinx delivered a ready-made solution to us with Virtex FPGAs."  
 

“We did an exhaustive search and found that only the Xilinx Virtex FPGAs could provide us with the performance and density necessary to add Gigabit capability to the Nebula switch family,” said John Peters, vice president of development at Performance Technologies.  “We are very impressed with the system level capabilities of the Virtex FPGAs, particularly the digital delay locked loops and support for multiple I/O standards.” 
 