Upload
rehan
View
56
Download
1
Embed Size (px)
DESCRIPTION
Algorithm Engineering „Parallele Algorithmen“. Stefan Edelkamp. Übersicht. Parallele Externe Suche Parallele Verspätete Duplikatselimination Parallele Expansion Verteilte Sortierung Parallele Strukturierte Duplikatselimination Disjunkte Duplikatserkennungsbereiche ”Schlöser” - PowerPoint PPT Presentation
Citation preview
Algorithm Engineering „Parallele Algorithmen“
Stefan Edelkamp
Übersicht Parallele Externe Suche Parallele Verspätete Duplikatselimination
Parallele ExpansionVerteilte Sortierung
Parallele Strukturierte DuplikatseliminationDisjunkte Duplikatserkennungsbereiche ”Schlöser”
Parallele AlgorithmenMatrix-MultiplikationList RankingEuler Tour
VerteilteSuche
Distributed setting provides more space. Experiments show that internal time dominates
I/O.
Exploiting Independence
Since each state in a Bucket is independent of the other –
they can be expanded in parallel.
Duplicates removal can be distributed on different processors.
Bulk (Streamed) transfers much better than single ones.
Distributed Queue for Parallel Best-First Search
P0
P1
P2
<15,34, 0, 100>
<g, h, start byte, size><15,34, 20, 100>TOP
<15,34, 40, 100>
<15,34, 60, 100>
Beware of the
Mutual Exclusio
n
Problem!!!
Multiple Processors - Multiple Disks Variant
Sorted buffers w.r.t the hash val
Sorted Files
P1 P2 P3 P4
Divide w.r.t the hash rangesSorted buffers from every processor
Sorted File
h0 ….. hk-1 hk ….. hl-1
ParallelExternal A*
Parallel External A*
Distributed Heuristic Evaluation Assume one child processor for each tile one master processor
B3B1 B2
B8
B4 B5 B6 B7
B9 B10 B11
B12 B13 B14 B15
B0B3B1 B2
B8
B4 B5 B6 B7
B9 B10 B11
B12 B13 B14 B15
B0
Distributed Pattern Database Search
Only pattern databases that include the client tile need to be loaded on the client
Because multiple tiles in pattern, from birds eye PDB loaded multiple times
In 15-Puzzle with corner and fringe PDB this saves RAM in the order of factor 2 on each machine, compared to loading all
In 36-Puzzle with 6-tile pattern databases this saves RAM in the order of factor 6 on each machine, compared to loading all
Extends to additive pattern databases
Distributed Heuristic Evaluation
Same bottleneck in external-memory search
Bottleneck: Duplicate detection Duplicate paths cause parallelization overhead
A
C D
BB
C DDDD
Internal memory External memoryvs.
fast
slow
A
Disjoint duplicate-detection scopes
B1B0 B4
B0 B3B1 B2
B8
B4 B5 B6 B7
B9 B10 B11
B12 B13 B14 B15
B0 B1
B4
B3B2
B7
B2B3 B7
B12
B8
B13 B15B14
B11B8B12 B13 B11B15 B14
Finding disjoint duplicate-detection scopes
B1B0 B4
0 00 0
0
0 0 0 0
0 0
1
0 0 0 0
0 1
1
02
1
B2B3 B7
0 1 0
B8B12 B13 B11B15 B141
2 2
01
2
2
2
2
1
2
2
2
2
2
0 1
1
1
0
1
0
2
3
3
2
B1B5B6
B4 B9
2
3
3
4
3
3
Implementation of Parallel SDD Hierarchical organization of hash tables
One hash table for each abstract nodeTop-level hash func. = state-space projection func.
Shared-memory managementMinimum memory-allocation size mMemory wasted is bounded by O(m#processors)
External-memory version I/O-efficient order of node expansions I/O-efficient replacement strategy
Benötigt nur ein Mutex “Schloss”
B3B1 B2
B8
B4 B5 B6 B7
B9 B10 B11
B12 B13 B14 B15
B0
ParallelleMatrix- Multiplication
ParalleleMatrixMultiplication
Exklusives Schreiben
ParalleleKopien
FazitMatrix Multiplication
Paralleles List Ranking
List Ranking
Erster Algorithmus
Prinzip
Komplexität
Verbesserungen
Strategie
Unabhängige Mengen
2-Färbung
Reduktion
Restauration
Beispiel
Variablen
Beispiel(ctd.)
PseudoCode
NächsterSchritt
Analyse
Backup
Algo
Algo
Speicher
Analyse
Ausblick:Randomisiertin O(n) whp?
Problememit DFS
IdeeEulerTour
ParallelDFS
DFSNummern
Allgemein
Allgemein
Allgemein
Beispiel
Ein Zyklusoder mehrere?
Korrektheit
Korrektheit
Beispiel
KonstruktionEulerTour
Fazit Euler Touren
GPU Architektur
Effektivität
Hierarchischer Speicher
Hash-based Partitioning
BFS
Kernel Functions