Summaries
Computing Multidimensional
DFTs Using Xilinx FPGAs
This paper reports on a reconfigurable computing architecture
that takes advantage of the reduced computational requirements of the polynomial
transform method for computing 2-D DFTs. The FPGA architecture described
can process 24 images of 512 x 512 pixels per second.
The proposed system is 46% more area efficient than a row-column DFT processor
implemented using the same technology.
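For orientation, the row-column method used as the baseline above can be sketched in a few lines. This is only an illustration of the baseline decomposition (1-D DFTs over the rows, then over the columns), not the polynomial transform method of the paper; the function names are hypothetical and a direct O(N^2) DFT stands in for the FFT.

```python
import cmath

def dft(seq):
    # Direct 1-D DFT, used here only for clarity (an FFT would be used in practice).
    n = len(seq)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(seq))
            for k in range(n)]

def dft2_row_column(image):
    # Step 1: transform every row.
    rows = [dft(row) for row in image]
    # Step 2: transform every column of the row-transformed data.
    cols = [dft(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(row) for row in zip(*cols)]
```

An N x N image needs 2N one-dimensional transforms with this approach, which is the cost the polynomial transform method is reported to reduce.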
FPGA Interpolators
Using Polynomial Filters
A fractional delay (FD) filter is a device for performing bandlimited
interpolation between the samples of a time-series. The Farrow filter
is a multirate filter structure that offers the option of continuously
adjustable delay. This paper presents a derivation of the method
proposed by Farrow and demonstrates the performance and complexity of resampling
filters using his technique. The FPGA implementation of the Farrow
architecture is described. This implementation requires only 2.7%
of the logic resources of a conventional polyphase decomposition with the
same functionality. The paper also develops some important system
options made available to the designer as spin-offs of the derivation.
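A minimal sketch of the Farrow idea, assuming a cubic Lagrange interpolator (the summary does not say which polynomial the paper uses, and the function name is hypothetical): fixed sub-filters turn four neighbouring samples into polynomial coefficients, and the adjustable delay mu enters only through a Horner evaluation.

```python
def farrow_cubic(samples, mu):
    """Interpolate at fractional offset mu (0 <= mu < 1) past samples[1].

    samples holds four consecutive values taken at t = -1, 0, 1, 2.
    """
    ym1, y0, y1, y2 = samples
    # Fixed-coefficient sub-filters: one output per power of mu.
    c0 = y0
    c1 = -ym1 / 3 - y0 / 2 + y1 - y2 / 6
    c2 = ym1 / 2 - y0 + y1 / 2
    c3 = -ym1 / 6 + y0 / 2 - y1 / 2 + y2 / 6
    # Horner evaluation: the only place the delay value is used.
    return ((c3 * mu + c2) * mu + c1) * mu + c0
```

Because the sub-filter coefficients never change, only the three multiplications by mu depend on the requested delay, which is what makes the delay continuously adjustable without reloading a coefficient set.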
High-Performance
FPGA Filters Using Sigma-Delta Modulation Encoding
This paper investigates an architectural option for constructing
high sample-rate, narrow-band, single-rate and multi-rate filters using Xilinx
FPGA technology. The implementation provides significant savings
in device logic resources in comparison to other techniques that provide
the same functionality. The sigma-delta pre-processor is described
and its implementation using the XC4000 FPGAs is reported. The architecture
of the reduced precision filter is presented and its FPGA realization described.
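The summary does not give the modulator order or word lengths. As a rough sketch, assuming a first-order loop with a 1-bit quantizer (the function name is hypothetical), the pre-processor idea looks like this: the input is re-encoded as a +1/-1 stream whose local average tracks the input, so the following filter only has to handle very short data words.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator; inputs are assumed to lie in [-1, 1]."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous output.
        integrator += x - feedback
        # 1-bit quantizer.
        bit = 1 if integrator >= 0 else -1
        bits.append(bit)
        feedback = bit
    return bits
```

With 1-bit data, each tap multiplication in the downstream filter reduces to adding or subtracting a coefficient, which is one way to read the "reduced precision filter" mentioned above.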
Implementing Area Optimized
Narrow-Band FIR Filters Using Xilinx FPGAs
Minimum Multiplicative Complexity
Implementation of the 2-D DCT using Xilinx FPGAs
This paper investigates two options for the FPGA implementation
of a very high-performance 2-D discrete cosine transform (DCT) processor
for real-time applications. The paper provides an overview of the
DCT calculation using Distributed Arithmetic (DA) methods and describes
the FPGA implementation. Comparisons are made that show the polynomial
transform approach to require 67% of the logic resources of a DA processor
for equal throughputs.
Configurable
Logic for Digital Signal Processing
This paper provides an overview of FPGA DSP from both an applications
and implementation perspective. Examples of FPGA DSP in image processing
and digital communications will be used to illustrate the utility of FPGAs
for high-performance DSP. An overview of the discrete wavelet transform
(DWT) will be provided, and considerations for its efficient FPGA implementation
will be discussed. A novel technique for efficiently implementing
FPGA-based digital filters will be presented. The design of several
digital receiver functions as well as the use of run-time re-configuration
will be covered.
FPGA Implementation
of Adaptive Temporal Kalman Filter for Real Time Video Filtering
In this paper, an adaptive temporal filter is proposed that
lends itself to hardware implementation for real-time temporal processing
of image sequences. The proposed algorithm is based on adaptive Kalman
filtering which is relatively simple and effective in its performance.
Adaptation in this case is with respect to motion in the image sequence
as well as variation of noise statistics. An efficient hardware implementation
of this algorithm, based on FPGA technology, is proposed.
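As an illustration only, a per-pixel temporal Kalman filter can be sketched as below, assuming a scalar random-walk model for the pixel intensity. The parameters q (process noise variance) and r (measurement noise variance) and the function name are assumptions; the paper's adaptation rules for motion and noise statistics are not reproduced here.

```python
def temporal_kalman(pixel_series, q, r):
    """Filter one pixel's values across successive frames."""
    estimate = pixel_series[0]
    p = r  # initial error variance: trust the first frame as much as any measurement
    out = [estimate]
    for z in pixel_series[1:]:
        p = p + q                 # predict: uncertainty grows between frames
        k = p / (p + r)           # Kalman gain
        estimate = estimate + k * (z - estimate)
        p = (1 - k) * p           # update error variance
        out.append(estimate)
    return out
```

An adaptive variant would raise q where motion is detected, so the filter follows the new content, and lower it in static regions to smooth noise more heavily.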
FPGA Implementation
of a Nonlinear Two Dimensional Fuzzy Filter
In this paper a nonlinear fuzzy filter is proposed for hardware
implementation. The proposed fuzzy filter is tailored for implementation
in a Xilinx Virtex series FPGA for real-time image sequence (video)
restoration. Implementation details and recommendations for further
improvement are discussed. The result of a simulation example from the
proposed hardware implementation is also presented.
Issues on Medical
Image Enhancement
Using Matlab, sets of medical images have been evaluated with
various kernel shapes and sizes. The results point toward
a maximum kernel size of 15x15 for the Gaussian kernel in unsharp masking.
To implement the enhancement algorithm, a single-chip 15x15 image filter
has been realized in a Xilinx FPGA (with external line buffers).
The enhancement algorithm is partitioned into a low pass filter (LPF) and
image mixing cores. The Xilinx implementation exploits the small
size of constant coefficient multipliers (KCMs) and the fact that
both separable kernels are identical.
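The partition into a low pass filter and a mixing stage can be sketched as follows. This is a software illustration under assumed parameters (kernel size, sigma, mixing amount, clamped borders, and all function names are hypothetical), not the FPGA design: the same 1-D Gaussian kernel is applied along rows and then along columns, and the mixer adds the high-pass residue back to the image.

```python
import math

def gaussian_kernel(size, sigma):
    half = size // 2
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # normalized so flat regions are unchanged

def convolve_rows(image, kernel):
    half = len(kernel) // 2
    width = len(image[0])
    # Border pixels are clamped to the nearest valid column.
    return [[sum(k * row[min(max(x + j - half, 0), width - 1)]
                 for j, k in enumerate(kernel))
             for x in range(width)]
            for row in image]

def transpose(image):
    return [list(col) for col in zip(*image)]

def unsharp_mask(image, kernel, amount=1.0):
    # Low pass filter: identical 1-D kernel for the row pass and the column pass.
    blurred = transpose(convolve_rows(transpose(convolve_rows(image, kernel)), kernel))
    # Mixing: original plus a scaled copy of (original - low pass).
    return [[p + amount * (p - b) for p, b in zip(row, brow)]
            for row, brow in zip(image, blurred)]
```

A separable 15x15 kernel needs 15 + 15 multiplications per pixel instead of 225, and because both passes share one coefficient set, the same constant multipliers can serve both.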
Synplify Guide for Model Technology
- ModelSim
This application note describes how to successfully integrate
simulation into your design methodology. This integration not only
involves the simulation process but additionally uses test benches to run
the different types of simulations.
Xilinx/Exemplar Large Device Design
Methodology
This application note discusses methodology and optimization
settings for Leonardo, Alliance Series and ModelSim when targeting all
Xilinx devices. The intent of this app note is to present a single
methodology that works, not to exhaustively explore all the different options
in the tool sets.
Xilinx/Synplicity High Density Methodology
This application note is intended to assist designers who are
using Synplicity and Xilinx to design a high density FPGA (100,000 gates).
Phase I describes synthesis-specific techniques and Phase II describes
implementation-specific (place and route) techniques for optimizing a design
for speed.
Using Xilinx FPGAs
to Design Custom DSPs
This technical paper discusses techniques for optimizing digital signal
processing algorithms for FPGAs. FPGAs offer both price and performance
advantages over traditional off-the-shelf DSP solutions.
Using Programmable Logic to Accelerate
DSP Functions
This paper discusses the benefits of using programmable logic in Digital
Signal Processing (DSP) applications. Two case studies - a 16 tap, 8 bit
fixed point FIR filter and a 24 bit Viterbi decoder - demonstrate the advantages
of using programmable logic. The summary includes general information on
how to decide if programmable logic is best for your DSP application.
A Guide to Using Field Programmable
Gate Arrays (FPGAs) for Application Specific DSP Performance
FPGAs have become a competitive alternative for high performance DSP
applications, previously dominated by general purpose DSP and ASIC devices.
This paper describes the benefits of using an FPGA as a DSP Co-processor,
as well as a stand-alone DSP Engine. Two case studies, a Viterbi Decoder
Co-processor and a 16 Tap FIR Filter, are used to illustrate how the FPGA
can radically accelerate system performance and reduce component count
in a DSP application. Finally, different implementation techniques for
reducing hardware requirements and increasing performance are described
in detail.
Building High Performance FIR Filters
Using KCMs
Digital filters with sample rates above just
a few MHz are generally difficult and expensive to realize using standard
digital signal processors. At this point, distributed arithmetic
and parallel processing performed in a Xilinx FPGA become the ideal solution.
The reprogrammable aspect of FPGAs permits optimum use of the available
gates in the form of Constant (K) Coefficient Multipliers (KCMs), while
enabling the filter to be tuned or changed at any time. Filters employing
fully parallel KCMs are ideal for sample rates exceeding 27 MHz with the
example able to operate above 50 MHz. This paper describes the implementation
of a Finite Impulse Response filter using constant (k) coefficient multipliers
in the XC4000E.
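A constant coefficient multiplier can be modelled as table lookups plus one shifted addition. The sketch below assumes an unsigned 8-bit input split into two 4-bit nibbles, each addressing a 16-entry table of precomputed products; the split and the function names are illustrative assumptions rather than details taken from the paper.

```python
def build_kcm_table(k):
    # 16 precomputed products, one per possible 4-bit input value.
    return [k * n for n in range(16)]

def kcm_multiply(table, x):
    """Multiply an unsigned 8-bit value x by the constant baked into table."""
    low = x & 0xF
    high = (x >> 4) & 0xF
    # The high nibble carries a weight of 16, applied as a 4-bit left shift.
    return table[low] + (table[high] << 4)
```

Changing the filter coefficient then means rewriting the table contents, which fits the point above that the filter can be tuned or changed at any time.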
FPGAs and DSP
This paper introduces the Xilinx Field Programmable Gate Array (FPGA)
technology and helps you understand how FPGAs can be used for DSP system
implementation. You will find a comparison of the implementation of a simple
DSP function in both Programmable DSP and Gate Array technology. A brief
explanation of Gate Array technology is followed by a description of Xilinx
FPGA technology.
The Fastest FFT in the West
This paper notes that incorporating a large FFT in a single
FPGA, while noteworthy, may evoke a "so what" response, so its speed is
compared to that of the more standard single-chip DSP design. The paper
compares Xilinx FPGA performance with an exhaustive list of DSP devices.
The test benchmark, established in 1995, is the execution time of a 256
point FFT. The speed in the FPGA design is set by the computation time
of the radix 2 butterfly. For 16 bit data and a 50 MHz system clock the
computation time indicated is 320 ns.
The Fastest Filter in the West
This paper discusses the use of Distributed Arithmetic to build faster
FIR Filters in FPGAs and compares the performance to other off-the-shelf
devices, including the Harris HSP43881. The candidate for this speed challenge
is the symmetrical FIR filter; specifically, a programmable 8 tap filter
with 8 bits of both coefficient and data values. It is programmable in
the sense that its gate resources can be configured to do other tasks.
Our adversary is the fixed point "DSP" chip which, in single precision,
processes 16-bit words.
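One common way to exploit coefficient symmetry is to add the two samples that share a coefficient before multiplying, which halves the number of products. The sketch below shows that idea for an 8-tap filter; whether the paper's design does exactly this is an assumption, and the function name is hypothetical.

```python
def symmetric_fir(half_coeffs, window):
    """One output of a symmetric FIR filter.

    half_coeffs holds the first half of the coefficients; the second half
    is its mirror image. window holds the current input samples.
    """
    n = len(window)
    assert n == 2 * len(half_coeffs)
    # Pre-add the sample pair that shares each coefficient, then multiply once.
    return sum(c * (window[i] + window[n - 1 - i]) for i, c in enumerate(half_coeffs))
```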
The Role of Distributed Arithmetic in
FPGA-based Signal Processing
In this document the Distributed Arithmetic algorithm is derived and
examples are offered that illustrate its effectiveness in producing gate-efficient
designs. Distributed Arithmetic plays a key role in embedding DSP functions
in the Xilinx XC4000 family of FPGA devices.
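The core of Distributed Arithmetic can be sketched in software as follows, assuming two's complement inputs and a single lookup table addressed by one bit from every input (the function names and the word length parameter are illustrative). The table holds every possible sum of coefficients, so the inner product is formed bit by bit with shifts and additions and no multipliers.

```python
def build_da_table(coeffs):
    # Entry addr is the sum of the coefficients whose address bit is set.
    n = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << n)]

def da_inner_product(table, samples, bits):
    """Compute sum(c * x) for two's complement samples that fit in `bits` bits."""
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one table address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i
        partial = table[addr] << b
        # The sign bit of a two's complement word carries negative weight.
        if b == bits - 1:
            acc -= partial
        else:
            acc += partial
    return acc
```

The table grows as 2^n with the number of taps n, which is why larger filters are usually split into several smaller tables whose outputs are added.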
C-Cube CL550 and Xilinx XC3020A ISA-based
Motion-JPEG Codec
This design is the result of a collaborative effort between C-Cube
Microsystems, Auravision Corporation (Fremont, CA), Xilinx, and Ring Zero
Systems (San Mateo, CA). The design is a Motion-JPEG video codec for ISA
bus PC platforms based on the CL550 JPEG, which features a direct hardware
interface to the Auravision VxP500 Video Processor.
16 Tap, 8 Bit FIR Filter Application
Note
This application note describes the functionality and integration of
a 16 Tap, 8 Bit Finite Impulse Response (FIR) filter macro with predefined
coefficients (e.g. low pass) and a sample rate of 5.44 mega-samples per
second or 784 MIPS using an XC4000 device. The application note also describes
how to set the coefficients of the FIR Filter to meet the needs of other
applications.
Plug and Play ISA in Xilinx FPGAs
This Application Note describes a Plug and Play ISA interface reference
design using a Xilinx XC4003, or larger, FPGA device. This design implements
the features used in a majority of Plug and Play designs.
Dynamic Microcontroller in XC4000
An application note and design
files for a microcontroller with dynamic bus sizing. Uses Xilinx LogiBLOX.
VIEWlogic design files and a QBASIC-based assembler are available.
Pulse Width Modulation in Xilinx
An application note and design files for building a pulse width modulation
circuit in Xilinx programmable logic. Uses Xilinx LogiBLOX. VIEWlogic design
files are available.
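The application note's circuit is not described here. As a generic sketch, a pulse width modulator is often built from a free-running counter and a comparator, which the following model imitates for one period (the function name and parameters are hypothetical).

```python
def pwm_cycle(duty, resolution_bits):
    """Return one PWM period as a list of 0/1 output samples.

    The output is high while the counter is below the duty value.
    """
    period = 1 << resolution_bits
    return [1 if counter < duty else 0 for counter in range(period)]
```

With 8 bits of resolution, a duty value of 64 keeps the output high for 64 of 256 clock cycles, a 25% duty cycle.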
Synthesis and
Simulation Design Guide
This manual provides a general overview of designing FPGAs with HDLs.
It includes design hints for the novice HDL user, as well as for the experienced
user who is designing FPGAs for the first time. Written for the Xilinx
M1 development tools.
Synopsys (XSI)
Synthesis and Simulation Design Guide
A Synopsys-specific version of the generic Synthesis and Simulation
Design Guide. Written for the Xilinx M1 development tools.
Configuring FPGAs Over a Processor
Bus
This application note describes how to configure an SRAM-based FPGA
over a processor bus. It also illustrates the source code required to download
a configuration bitstream using an IBM PC as a host microprocessor. 'C'
source code is provided.
Useful in reconfigurable computing applications.