Upload
dangtruc
View
228
Download
6
Embed Size (px)
Citation preview
Reconfigurable ComputingReconfigurable Computing
ApplicationsApplications
Chapter 9Chapter 9
Prof. Dr.Prof. Dr.--Ing. Jürgen TeichIng. Jürgen TeichLehrstuhl für HardwareLehrstuhl für Hardware--SoftwareSoftware--CoCo--DesignDesign
Reconfigurable Computing
Overview Overview
Reconfigurable Computing 2
FPGAs have been used in the past mostly in
Rapid prototyping
Non-frequent reconfigurable systems
Hardware implementation, sometimes specific for the FPGA architectureThe most important application areas are:
Searching (text, genetic database, etc.)
Image processing
Mechanical control
Etc.
Searching Searching –– pattern matchingpattern matching
Reconfigurable Computing 3
Pattern matching is the basis of search engines
The purpose is to find and (count) the occurrence of a given pattern in a given text
Useful in:
DictionariesDocument collection indexingDocument filtering and classificationSpam avoidanceContent surveillance
Searching Searching –– pattern matching pattern matching –– sliding windowssliding windows
Reconfigurable Computing 4
Sliding windows (Cockscot & Foulk )
Keywords are kept in register. One character / ByteA set of comparators are used. One comparator / ByteHit signal is set whenever the text-segment matches the corresponding wordAdvantage:
Easy to replace old patternsDrawbacks:
Not flexible: Fixed length of registersRedundancy: more comparators than necessary for word with same prefix
Searching Searching –– pattern matching pattern matching -- sliding windowssliding windows
Reconfigurable Computing 5
Avoid redundancyUse only one comparator for common characters in different words
Data folding (Foulk)
Fold the data in the circuitConsider the bit-representation of each characterGenerate a comparator circuit for each character in the words to be searched for
Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit Bit
8-bit comparator
Bit Bit Bit Bit Bit Bit Bit Bit
01001110-Comparator
Searching Searching –– pattern matching pattern matching -- FSMFSM--BasedBased
Reconfigurable Computing 6
FSM-Based pattern matcherEach regular grammar can be recognized by an FSMIn pattern matching, the target words define the regular grammarThe target words are compiled in the automatonEach word defines a unique path from the start state to an end stateWhen scanning a text, the automaton changes its state with the appearance of charactersReaching a final state correspondsto the appearance of a wordRedundancy is avoided by implementing common prefix
FSM-Recognizer and correspondingstate transition table for the word conte
Searching Searching –– pattern matching pattern matching -- FSMFSM--BasedBased
Reconfigurable Computing 7
FSM-Based pattern matcherRAM-implementation
One RAM or ROM for storing the state transition tableOne state registerOne character registerA hit detectorThe Input character and the state register are used to determine the nextstateThe hit detector checks if the current state is equal to a hit state and sets a hit for the corresponding wordAdvantage:
Simple to implementDrawback:
Expensive in terms of flip flops
Char reg
RAMROM
State Reg Nextstate
Hit detect
Characterstream
RAM/ROM implementation of the word recognizer
Searching Searching –– pattern matching pattern matching -- FSMFSM--BasedBased
Reconfigurable Computing 8
FSM-Based pattern matcherOne-hot implementation
Each state is coded in one flip flopThe D-input of the flip flop is obtained by an AND of the output of the previous flip flop with the result of the comparatorThe comparator is character-specificOnly n FF are used to implement a word of length nAdvantage:
Low costReflects the structure of the grammar
Drawback: Not easy to buildRedundancy in the comparators
o n t e
c
Character-specific comparators
Searching Searching –– pattern matching pattern matching -- FSMFSM--BasedBased
Reconfigurable Computing 9
FSM-Based pattern matcherExploiting common prefix
For words with common prefix, only one common starting path corresponding to the length of the common prefix is used.Redundancy of comparators can be avoided by implementing only one comparator for each character.The result of the comparison will then be provided to all gates using them Words with common prefix
and the corresponding FSM
Searching Searching –– pattern matching pattern matching -- FSMFSM--BasedBased
Reconfigurable Computing 10
FSM-Based pattern matcherOptimized architecture
Implement the common prefix Redundancy of comparators is removed: Each character in the set is implemented in a position vector: pos(i) = 1 iff character i is detected Block diagram of the optimal
pattern matcher
Detailed structure of the optimal pattern matcher
Reconfigurable Computing 11
Searching Searching –– pattern matching pattern matching –– use of use of reconfigurationreconfiguration
Bit Bit Bit Bit Bit Bit Bit Bit
FSM-Based pattern matcherUse of reconfiguration
Replace the character comparatorsReplace the FSM for a set of words
Rec
onfig
urat
ion
New character comparator
New set of wordsReconfiguration
Reconfigurable Computing 12
Signal processing Signal processing –– distributed arithmetic distributed arithmetic --MotivationMotivation
Signal processing applications (FFT, Convolution, Filter algorithms) are characterized by MAC-intensive computations Signal processing functions are usually implemented on special processors
DSPsASICs
FPGAs provide the advantage of reconfigurability, but MAC-intensive applications are expensiveHowever, for MAC computations involving one constant vector, FPGAs present one of the best alternatives to DSPs
Signal processingSignal processing-- distributed arithmetic distributed arithmetic -- BasicsBasics
Reconfigurable Computing 13
iji XA ∗∑Because the Ai are constant, there exist 2n possible values for
We can pre-compute the possible values and store them in a LUT (DALUT) and retrieve them on demand at run-time
( ) ( )ijijj
iji XA2=2XA=XA=Z ∗∗∗∗∗ ∑∑∑∑
With the binary representation for Xi: j
iji 2X=X ∑
( )∑ ∗∗ ii XA=XA=ZA constant row vector, X column vectorSolution of the following equation:
( )ijij XA2=Z ∗∗∑∑ is the classical form of distributed arithmetic
FPGA Advantage: Computation is memory-based (use of LUTs)
Signal processingSignal processing-- distributed arithmetic distributed arithmetic -- BasicsBasics
Reconfigurable Computing 14
To better understand, we spread the DA equation
Z=[ ] 02( ) ( ) nn01n01n AX+AX ∗∗ −−.........................220110 AX+AX ∗∗
+ [ ] 12( ) ( ) nn11n11n AX+AX ∗∗ −−221111 AX+AX ∗∗ .........................
2W21W1 AX+AX ∗∗ ......................... ( ) ( ) nnW1nW1n AX+AX ∗∗ −− ] W2
. .........
.........
........
+ [
The bits of the variables will be used to address the memory andretrieve the required values in a bit-serial way.
The DA-datapath implementation is straightforward
Signal processingSignal processing-- distributed arithmetic distributed arithmetic -- DatapathDatapath
Reconfigurable Computing 15
( )1W1W1 XX − ................
......
1011XX( )1W2W2 XX − ................ 2021XX
( )1WnnWXX − ................ n0n1XX ...
1A0
2A
21 A+A3A
13 A+A23 A+A
123 A+A+A4A
Z+/-
DA-LUT AddressDA-LUT
Parallel bit-serial input j-shift
Signal processingSignal processing-- distributed arithmetic distributed arithmetic -- DatapathDatapath
Reconfigurable Computing 16
k-parallel( )1W1W1 XX − ................
......
1011 XX
( )1W2W2 XX − ................ 2021 XX
( )1WnnW XX − ................ n0n1 XX
DA-LUT 1 DA-LUT 2 DA-LUT k
ACC 1 ACC 2 ACCk
Z Adder tree
Signal processingSignal processing-- distributed arithmetic distributed arithmetic -- ExampleExample
Reconfigurable Computing 17
Recursive convolution of time domain simulation of opticalmultimode intra/system interconnects
( ) ( ) 35322415041n0n xf+xf+xfxf+tyf=ty ∗∗∗−∗∗ −
Recursive formula to be implemented on 3 intervals
Comparison of different implementations Virtex 2000E implementation on
the Celoxica RC1000-PP board
Signal processing Signal processing –– Fast Fourier Transform (FFT)Fast Fourier Transform (FFT)
Reconfigurable Computing 18
Fourier Series developed by the French Mathematician Joseph Fourier (1807)
Application of the initial idea in the field of heat diffusion
The advent of digital computation and “discovery” of Fast Fourier Transform (FFT) in the 50's revolutionized the field of signal processing
Practical processing and meaningful interpretation of signals with exceptional importance for human and industry
Medical monitors and scanners
Modern electronic communications
Image processing
etc.
Signal processing Signal processing –– FFT FFT -- BasicsBasics
Reconfigurable Computing 19
Fourier transform F(u) of a continuous function f(x):dxe)x(f)u(F )ux2j( π−
∞
∞−∫=
Given F(u), we can obtain f(x) by means of inverse Fourier transform
Extension in 2-D, F(u,v)
Corresponding inverse in 2-D
due)u(F)x(f )ux2j( π−∞
∞−∫=
due)v,u(F)y,x(f ))vyux(2j( +−∞
∞−∫= π
dxe)y,x(f)v,u(F ))vyux(2j( +−∞
∞−
∞
∞−∫∫= π
Signal processing Signal processing –– FFT FFT -- BasicsBasics
Reconfigurable Computing 20
Fourier transform of a discrete function of one variable F(u)
Given F(u), we can obtain, f(x) by means of inverse Fourier transform
1M,...,0u,1M,...,0x,e)x(fM1)u(F )M/ux2j(
1M
0x−=−== −
−
=∑ π
1M−1M,...,0u,1M,...,0x,e)u(F)x(f )M/ux2j(
0x−=−== −
=∑ π
The Brute force computation of the Fourier transform requires M2 multiplications and additions
The performance can be improved to (M log M) using the successive doubling method
Signal processing Signal processing –– FFT FFT -- BasicsBasics
Reconfigurable Computing 21
For notational convenience, we replace the previous equation with:
)M/2j(M
)ux(M
1M
0xeW,W)x(f
M1)u(F π−
−
=== ∑
We assume M to be of the form
With n and K being positive integers each, we have:
K22M n ==
]WW)1x2(fK1W)x2(f
K1[
21)u(F )u(
K2)ux(
K
1K
0x
))x2(u(K2
1K
0x++= ∑∑
−
=
−
=
)ux(K2
1M
0xW)x(f
K21)u(F ∑
−
==
Fodd(u)Feven(u)
Signal processing Signal processing –– FFT FFT -- BasicsBasics
Reconfigurable Computing 22
Since and , we have
]W)u(F)u(F[21)u(F u
K2oddeven +=
The FFT is a fast implementation of the Discrete Fourier Transform (DFT).
Based on a divide-and-conquer model,M log2M computations
]W)u(F)u(F[21)Ku(F u
K2oddeven −=+
uM
)Mu(M WW =+ u
M2)Mu(
M2 WW −=+
Signal processing Signal processing –– FFT FFT -- AlgorithmAlgorithm
Reconfigurable Computing 23
The N-point DFT computation can be divided into two N/2-point DFT computation. These N/2-point DFT computations can be divided into two N/4-point DFT computations, and so on.
There are log2(N) stages
After the division and the DFT computation, a merging process is performed, in which the transforms are reassembled. Merging process
Signal processing Signal processing –– FFT FFT –– Algorithm Algorithm –– butterfly unitbutterfly unit
Reconfigurable Computing 24
The butterfly unitThe reassembling is done by using the complex elements say, g and h from the previous stage. Current element = g - h*WN
k and g + h*WNk
The twiddle factorThe twiddle factor terms (of the WN
k = exp(j 2* π*k/N) must be availableThe real and complex parts of these factors are stored in a ROM.The factors correspond to a sine (imaginary part) and cosine (real part) functions, in a set of N/2 equally-spaced angles in an interval from 0 to (N-2)* π/N. Therefore, only a N/2-position memory is needed.
g
h * Wk
N-
+
Signal processing Signal processing –– FFT FFT –– FPGA implementationFPGA implementation
Reconfigurable Computing 25
16 points FFTPipelined READ, EXECUTE and OUTPUT stage
Read one complex input (32-bits) in every cycle16 point input is read in 16 cycles
Output one complex result (32-bits) in every cycle
EXECUTE stage also takes 16 cycles
Performance Latency: 17 cyclesThroughput: 1 transform per 16 cycles
READ EXEC OUTPUT
READ EXEC OUTPUT
Signal processing Signal processing –– FFT FFT –– FPGA implementationFPGA implementation
Reconfigurable Computing 26
Completing the execution stage in 16 cycles results in large fan-outs for the internal registers: very poor timing characteristics
By trial and error, divide the execution stage into 4 pipelined sub-stagesThat necessitates 5x internal storage, but 2560 bits of storage is no big deal for XC2V2000.Final performance :
Latency : 65 cyclesThroughput : 1 transform per 16 cycles (same as before)
R E1 E2 E3 E4 O
R E1 E2 E3 E4 O
Signal processing Signal processing –– FFT FFT –– FPGA implementationFPGA implementation
Reconfigurable Computing 27
Resource requirements:
1432 out of 21504 flip-flops (6%)1037 out of 21504 LUT (4%)65 out of 624 I/O pins (10%)
TimingMinimum clock period: 5.899ns (Maximum frequency: 169.520MHz)
Power estimation 440 mW
Image processing Image processing –– Basic operatorsBasic operators
Reconfigurable Computing 28
Image processing algorithms usually process an image (set of points with a given characteristics such as color, gray level, luminance, etc.) point by point. The resulting pixels depend only on the pixels in the original picture. A sequential processor needs quadratic run-time to process a complete image. By using parallelism, each pixel can be computed independently.Many image processing system are based on the following operators:
Median filteringBasic Morphological operationsConvolutionEdge detection
Algorithms are often based on the moving window operator
Image processing Image processing –– Moving windowMoving window
Reconfigurable Computing 29
The moving or sliding window algorithm usually processes one pixel of the image at a timeThe value of the pixel is changed by a function of a local pixel region covered by the windowThe operator moves over the image to cover all pixelsFor a pipelined implementation, all the pixel of the windows must be accessed at the same time for each clockFPGA implementation uses FIFO buffers
FIFO1
FIFO2
RAM
W11 W12 W13
W21 W22 W23
W31 W32 W33
Disposed
FIFO1
FIFO2
W11 W12 W13
W21 W22 W23
W31 W32 W33
DisposedW14 W15
W24 W25
W34 W35
W41 W42 W43
W51 W52 W53
W44 W45
W54 W55RAM
FIFO3
FIFO4
3x3 and 5x5 moving windows
Image processing Image processing –– Moving window Moving window -- FPGAFPGA
Reconfigurable Computing 30
FIFO ImplementationFIFOs are implemented using circular buffers constructed from Multi-ported RAMs (available in, e.g., Virtex FPGA)Indexes keep track of the front and tail items in the bufferBLOCK RAMs are readable and writable in one clock-cycle. This allows a throughput of one pixel per cycle.
3x3 windows2 buffers of size W-3 (W = image width) are usedThe two FIFO buffers must be full to access all the window pixels in one cycleIn each clock cycle, a pixel is read from the memory and placed into the bottom left cornerThe content of the window is shifted to the right with the right most member being added to the tail of the FIFOThe top right pixel is disposed after computation, since it is not used in the future computation
Image processing Image processing –– Median filteringMedian filtering
Reconfigurable Computing 31
BasicsAn impulse noise (or salt and pepper noise) in an image has a gray level with higher low different from the neighbor point.Linear filters have no ability to remove this type of noiseMedian filters share remarkable advantages on removing this type of noiseOften used in digital signal and image/video applications
ImplementationUse a sliding window of odd size (e.g., 3x3) over an imageAt each window position, the median of the sample values is taken to replace the value at the center of the windowHigh computational cost O(N log N) even using most efficient sorting algorithmsGeneral purpose processors are not a good solutions for real time implementation. This justifies the use of FPGAs.
Image processing Image processing –– Median filteringMedian filtering
Reconfigurable Computing 32
Sequential implementation(pseudo code)For x=1 to # rows
For y = 1 to # colsBuild Windows arraypixel(x,y) = Median(window array)
EndEnd
Complexity O(#rows X #cols x NlogN) (N=3)
Hardware sorting implementation
10 5 2014 3 1115 25 2
11
2 3 5 10 11 14 20 25
Median
Image processing Image processing –– Median filtering Median filtering -- resultresult
Reconfigurable Computing 33
Original image Filtered image
Image processing Image processing –– Basic Morphological OperatorsBasic Morphological Operators
Reconfigurable Computing 34
Morphology in image processing studies the appearance of objects. Useful for example in:
Skeletonization Edge detectionRestoration
ProcessingThe image is processed pixel-by-pixel using a structuring element (the sliding windows)The window may fit or not to the image
Most basic building blocks:Erosion (shrinks or erodes an object in the image)Dilation (grows the image)Operations like opening and closing of an image can be derived by performing erosion and dilation in different order
Image processing Image processing –– Basic Morphological OperatorsBasic Morphological Operators
Reconfigurable Computing 35
Erosion Replaces the center pixel in the sliding window by the smallest pixel value in the window arrayThe bright area of the image shrinks, or erodes
Dilation Replaces the center pixel in the sliding window by the greatest pixel value in the window arrayThe bright area of the image grows
AlgorithmSame as the medianInstead of selecting the median element,the minimum is selected for erosion and the maximum is selected for dilation
Image processing Image processing –– Median filtering Median filtering -- resultresult
Reconfigurable Computing 36
DilationOriginal image Erosion
Image processing Image processing –– Convolution Convolution -- BasicsBasics
Reconfigurable Computing 37
Convolution multiplies two arrays of numbers with different sizes and produces a third array of numbersIn image processing, convolution implements operators whose output pixels are computed as linear combinations of certain input pixels values.1-D Convolution
Formally convolution takes two input functions f(x) and g(x) andgenerates h(x) = f(x)*g(x) where g(x) is referred to as the filter:
2-D ConvolutionMost important in modern image processingA finite size window (convolution mask) is scanned over the imageThe output pixel value is the weighted sum of the input pixels within the windowThe weight is the value of the filter assigned to each pixel in the window
( ) ( ) ( )dττxgτf=xh −∫∞
∞−
Image processing Image processing –– Convolution Convolution -- BasicsBasics
Reconfigurable Computing 38
2-D Convolution
Mathematically represented by the following equation:
where x is the input image, h is the filter and y is the output image
Supports a virtual infinite variety of masks, each with its own feature
3x3 convolutions are most commonly used and operate only on a pixel and its directly adjacent neighbours
( ) ( ) ( )jni,mxji,h=nm,yheightimg
0=i
widthimg
0=j−−∑ ∑
− −
P1 P2 P3
P4 P5 P6
P7 P8 P9
W1 W2 W3
W4 W5 W6
W7 W8 W9
P*∑
∑9
1=ii
i9
1=ii
W
PW=P
Image processing Image processing –– Convolution Convolution –– Gaussian filtersGaussian filters
Reconfigurable Computing 39
Gaussian convolution filters1-D 2-D
The idea is to use the 2-D distribution as a point spread function. This is achieved by convolutionA discrete approximation of the Gauss function is required to perform the convolutionIn theory, Gauss distribution is zero anywhere. Therefore an infinite large convolution kernel may be requiredBut in practice, the convolution kernel is truncated as shown in the pictures
( ) 2
2
σ2x
eπσ21=xG
−
( )( )
2
22
σ2y+x
eπσ21=yx,G
−
21 31 21
31 48 31
21 31 21
1256
2 4 5
4 9 12
5 12 15
4 2
9 4
12 5
4 9 12
2 4 5
9 4
4 2
1115
3x3 Gaussian smooth filter 5x5 Gaussian smooth filter
1.4
Image processing Image processing –– Convolution Convolution –– Gaussian filtersGaussian filters
Reconfigurable Computing 40
ConvolutionOriginal image
Image processing Image processing –– Edge detection Edge detection -- BasicsBasics
Reconfigurable Computing 41
Edges Placed in image with strong intensity contrastOften occurs at image location representing boundaries
Edge detectionExtensively used in image segmentation, i.e., dividing an image into areas corresponding to different objectsRepresenting an image by its edges significantly reduces the amount of dataSince edges correspond to strong illumination gradients, the derivatives of the image are used to compute the edgesOperators often used are
Laplace operatorSoebel operatorCanny edge detection algorithm
Image processing Image processing –– Edge detection Edge detection -- operatorsoperators
Reconfigurable Computing 42
LaplaceGradient operatorIntensity difference are enhanced. The edges are more pronouncedFor each pixel, the gray value of its four neighbours (top, left, bottom, right) pixel value are subtracted from its own value
Soebel operatorCombination of two 1-D operators
One for detecting horizontal edgesOne for detecting vertical edges
-1
-1 4 -1
-1
Laplace operator
-1 1
-2 2
-1 1
Soebel-x
1 2 1
-1 -2 -1
Soebel-y
Image processing Image processing –– Convolution Convolution –– Gaussian filtersGaussian filters
Reconfigurable Computing 43
Edge detectionOriginal image
Image processing Image processing –– Use of reconfigurationUse of reconfiguration
Reconfigurable Computing 44
Intelligent image processing systemAccording to input image and other conditions,
Some operations are done to improve the image
Filtering (the correct filter is chosen)Smoothing Segmentation (Edge detection)Skeletonization
Some adjustments are done on the image input hardware
CalibrationFocussing
Everything is done while the system keeps runningFixed parts of the system will run continuouslyReconfigurable must be replaced at run-time
Mechanical Control Mechanical Control –– BasicsBasics
Reconfigurable Computing 45
Controller task is to influence the dynamic behavior of a plantInputs values for the plant depends on plant's outputs (Feedbacks)A plant is modeled as a linear time invariant (LTI)-SystemController is modeled as LTI-System Time discretization
Scaling to fix-pointk, k+1, k+2 …sample pointsT… sample periodtc… calculation time of controller
Plant
Controller
D/A-C
D/A-C
referencevalue
TTkk kk+1+1 kk+2+2
tt
ttcc
Mechanical Control Mechanical Control –– BasicsBasics
Reconfigurable Computing 46
kur
kyr
kkk
kk1kuDxCyuBxAxrrr
rrr
+=
+=+
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡==
...vMvM
vMz row2
row1v
v
rrcombine A,B,C,D to M,combine A,B,C,D to M,represent computation asrepresent computation asset of scalar products ofset of scalar products ofeach row of M witheach row of M with
u: Controller input, x: State, y: OutputA, B, C, D: Constant coefficient matrices
vr
Mechanical Control Mechanical Control –– DADA--ImplementationImplementation
Reconfigurable Computing 47
∑=
==n
1iiiaxxaz (1)
rr
∑−
=
−=1w
0j
jiji 2xx (2)
∑ ∑
∑ ∑
−
= =
−
=
−
=
−
=
=
1w
0j
n
1iiij
j
n
1i
1w
0j
jiji
ax2
2xaz (3)
Scalar product:const. vector and var. vector
xi as w-bit fix-point(here x just unsigned in [0,2[ )
replace (2) in (1),swap the sums
since xij is in {0,1}right sum can have just 2n values
pre-compute it and store it in a 2n x w ROM as LUT
20 2-1 2-2 2-3 2-(w-1)…
ar x
v
Mechanical Control Mechanical Control –– DADA--Implementation (Parallel)Implementation (Parallel)
Reconfigurable Computing 48
Coeffic ient Look - Up T able(DA LUT )
Inpu t R eg is te r
x 1> > c
x 2> > c
DA L UT1x [1..n],w-1
DA L UT2x[1..n],w-2
Data P ath ( DP )( c + 1 Input A dder )
Res ult ( z )
> > c
DA L UTcx n> > c x[1..n],w-c
.. .
...
......
w w w
w
n
n
> > 0> > c -1 > > c -2∑−
=
−=1w
0jj]n..1[
j )x(DALUT2z
c-Bit at a time Architecture
Mechanical Control Mechanical Control –– Multi controller systemMulti controller system
Reconfigurable Computing 49
Many controller modules optimized for different operating regimes
Controllers have different structures not only different coefficients
Supervisor observes plant and determines best controller module
Multiplexer switches controller outputs
Mechanical Control Mechanical Control –– Multi controller systemMulti controller system
Reconfigurable Computing 50
Periodic execution of task graphConditional branching to controller modules (CM)CMs implement various area/time trade-offs possible
tex
area
Reconfigurable Computing 51
Mechanical Control Mechanical Control –– Multi controller system Multi controller system --Use of reconfigurationUse of reconfiguration
One slot solution
Reconfigurable Computing 52
Mechanical Control Mechanical Control –– Multi controller system Multi controller system --Use of reconfigurationUse of reconfiguration
Two slots solution
Reconfigurable Computing 53
Mechanical Control Mechanical Control –– Multi controller system Multi controller system --Use of reconfigurationUse of reconfiguration
)(:timeexecution
CMCMex Aft =
)2/()(22
)2()2(min
1)2(
AftTtfAA
CMex
CMex
CM
==== −
)()()(
)1()1()1(min
1)1(
AgAfttTtfAA
CMrec
CMex
CMex
CM
+=+=== −
CM
CMCMreconfig
ArAgt
×≈= )(
: timereconfig.
Reconfigurable Computing 54
Mechanical Control Mechanical Control –– Multi controller system Multi controller system --FPGA Implementation FPGA Implementation -- inverse penduluminverse pendulum
Synthesis results of controller:
FPGA: Virtex 800Area: 1003 slices (ca. 10%) Clock Rate: > 70 Mhz
Computation Time: tc < 1 µs
The Raptor 2000 Board
Host Simulates PlantHost configurations
RAPTOR 2000Comm. ResourcesConfiguration Manager
FPGAController ModuleSupervisorCommunication
Controller characteristics:Dimensions: p=3, n=2, q=3Word-width: Input: 16 Bit, Intern: 32 Bit
Reconfigurable Computing 55
Mechanical Control Mechanical Control –– Multi controller system Multi controller system --FPGAFPGA--Implementation Implementation -- ArchitectureArchitecture
Two-slot implementationOne slot implementation
CM1
reconfiguring…
CM1
CM1
CM2
CM2
reconfiguring…
CM2
CM3
Reconfigurable Computing 56
Mechanical Control Mechanical Control –– Multi controller system Multi controller system --FPGAFPGA--Implementation Implementation -- ArchitectureArchitecture
ReconfigurableModule Slot
(Controller Module)
Fix Module Slot(Supervisor)
Bus
Mac
ros