magistraleinformaticanetworking:spd:lezioni16.17

Differences

These are the differences between the selected revision and the current version of the page.

magistraleinformaticanetworking:spd:lezioni16.17 [26/05/2017 at 18:50 (7 years ago)]
Massimo Coppola
magistraleinformaticanetworking:spd:lezioni16.17 [05/08/2017 at 18:49 (7 years ago)] (current version)
Massimo Coppola [Journal] update to lesson list
Line 8:
  * 08/03/2017 **MPI Lab** Basic program structure. Examples with derived datatypes.
  * 13/03/2017 **MPI Lab** Implementing communication with assigned asynchronicity degree in MPI. Structured parallel programming in MPI, separation of concerns in practice. Structured parallel patterns in MPI and communicator handling.
  * <del>15/03/2017</del> 16/03/2017 **MPI** Farm skeleton implementation with MPI. MPI collectives with both computation and communication: Reduce (and variants) and Scan (and variants). Using MPI operators with Reduce and Scan. Defining custom user operators, issues and implementation of operator functions.
  * 20/03/2017 **MPI Lab** Asynchronous channel implementation, Farm skeleton implementation. Parallel code basic debugging.
  * <del>22/03/2017</del> postponed
Line 16:
  * 05/04/2017 **TBB** reduce (differences between “functional” and “imperative” forms); deterministic reduce; pipeline class and filter class (i.e. stages), strongly typed parallel_pipeline and make_filter template. TBB containers: similarities and differences with STL containers, multithreaded/sequential performance tradeoffs wrt software lockout, space and time overheads, relaxed/restricted semantics and feature drops, thread view consistency; container_range, extending containers to ranges, concurrent map and set templates: concurrent_hash, unordered, unordered_multi map; concurrent and unordered set.
  * 26/04/2017 **Intro to GPU-based computing** GPGPU and OpenCL. Development history of modern GPUs, graphic pipeline, HW/FW implementations, load imbalance related to the distribution of the graphic primitives executed, more “general purpose” and programmable core design; generic constraints and optimizations of the GPU approach; modern GPU architecture, memory optimization and constraints, memory spaces. GPGPU, and the transition to explicitly general-purpose programming languages for GPUs. Management of large sets of thread processors, the concept of command queue and concurrent execution of tasks; consequences on the constraints over synchronization of large computations split among several thread processors.
  * 03/05/2017 ** TBB Lab time ** -- Basics of TBB, Mandelbrot Set algorithm implementation.
  * 08/05/2017 ** TBB Lab time ** -- ** TBB ** thread local storage (TLS); TLS-based algorithms for array reduction and results accumulation (farm patterns, stream reductions).
  * 10/05/2017 Short project intro -- data stream analysis, time-based and sample-based stream models, window-based approaches.
  * 15/05/2017 ** OpenCL ** OpenCL intro and examples: framework goals (portability, widespread adoption), design concepts and programming abstractions: device/host interaction, context, kernel, command queues; execution model; memory spaces in OpenCL; C/C++ subset for kernels, kernel compilation, program objects, memory objects and kernel arguments, execution, kernel instances and workgroups, workgroup synchronization; portability and chances for load balancing: mapping OpenCL code onto both the GPU and the CPU; examples of vector types and vector operations; basic example of OpenCL program construction (vector addition).
  * 17/05/2017 Project intro -- introduction to available project topics. Stream mining with TBB / MPI / FastFlow; stream-based computations on GPU: stream computation of aggregate/accumulation functions in window- and pane-based models.
  * 19/05/2017 ** Lab Time ** -- K-means algorithm in MPI, porting to TBB \\ ** OpenCL ** OpenCL 2.2: OpenCL C and C++ (a static subset of the C++14 standard). Features missing wrt C++14. The SYCL model for single-source OpenCL / parallel C++ code. SPIR-V as a common code representation that allows integrating existing technology into a unified toolchain (LLVM-based compilers, GLSL, device drivers, format translators). Integration of OpenCL C++ within the SPIR-V SW ecosystem, convergence with Vulkan.
  * 22/05/2017 ** TBB Lab time ** K-Means algorithm development. -- **Parallel algorithms** A selection of parallel algorithms based on implicit tree expansion: combinatorial exploration algorithms (example: N-queens), Divide and Conquer, Branch and Bound optimization methods. Interaction among different parallel visit orders, computational grain size and available parallelism. Impact of inter-worker synchronization in the B&B case.
  * 25/05/2017 Parallel B&B, parallel D&C. Different parallelism exploitation at different tree levels. An example of a D&C algorithm in Data Mining: parallelisation options for the C4.5 algorithm mining classification trees. ** OpenCL ** -- Lab Time: OpenCL Linux installation.
  * 26/05/2017 ** OpenCL ** Lab Time: different implementations of 2D matrix multiplication algorithms (exploiting 2D and 1D work-item distributions with 0D and 1D work items, global and local memory, local synchronization among thread groups, access patterns).
  * 29/05/2017 The Flowshop problem as an example of a parallelizable B&B problem; parallel implementation choices with TBB/MPI.
  * 30/05/2017 Stream computation of aggregate measures: the General Incremental Sliding-Window Aggregation algorithm and its parallelization.
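The lessons of 22/05, 25/05 and 29/05 revolve around parallel search by implicit tree expansion. As a reference point, here is a minimal sequential sketch of the N-queens search tree mentioned above; function and variable names are illustrative, not taken from the course material. Each recursive call expands one node (a partial placement of queens), and in the parallel D&C/B&B variants discussed in class, the subtrees below a cutoff depth would become independent tasks handed to workers.

```cpp
#include <cstdint>

// Count N-queens solutions by implicit tree expansion: each call expands one
// node of the search tree. Bitmasks track attacked columns and diagonals, so
// pruned branches are never generated (names are illustrative).
static std::uint64_t expand(int n, int row,
                            std::uint32_t cols,
                            std::uint32_t diag1, std::uint32_t diag2) {
    if (row == n) return 1;                       // leaf: a complete placement
    std::uint64_t count = 0;
    for (int c = 0; c < n; ++c) {
        std::uint32_t col_bit = 1u << c;
        std::uint32_t d1_bit  = 1u << (row + c);          // "/" diagonal index
        std::uint32_t d2_bit  = 1u << (row - c + n - 1);  // "\" diagonal index
        if ((cols & col_bit) || (diag1 & d1_bit) || (diag2 & d2_bit))
            continue;                             // pruned branch of the tree
        count += expand(n, row + 1, cols | col_bit, diag1 | d1_bit, diag2 | d2_bit);
    }
    return count;
}

std::uint64_t nqueens(int n) { return expand(n, 0, 0, 0, 0); }
```

In a task-parallel version (e.g. with TBB tasks or an MPI farm), each call up to a fixed depth would spawn a task per child node, trading spawn overhead against available parallelism, which is exactly the grain-size interaction noted in the 22/05 entry.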
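The 30/05 lesson covers incremental sliding-window aggregation. As an illustrative sketch (not necessarily the exact algorithm presented in class), the classic two-stack technique computes any associative aggregate over a sliding window in amortized O(1) time per operation; here the operator is max and the class name is hypothetical.

```cpp
#include <stack>
#include <utility>
#include <algorithm>

// Incremental sliding-window max via the two-stack trick: each stack entry
// stores (value, aggregate of that value with everything below it), so the
// window aggregate is just the combination of the two stack tops.
class SlidingMax {
    std::stack<std::pair<int,int>> front_, back_;   // (value, running max)
    static void push(std::stack<std::pair<int,int>>& s, int v) {
        int agg = s.empty() ? v : std::max(v, s.top().second);
        s.push({v, agg});
    }
public:
    void insert(int v) { push(back_, v); }          // newest element enters back
    void evict() {                                  // oldest element leaves front
        if (front_.empty()) {                       // amortized flip: back -> front
            while (!back_.empty()) { push(front_, back_.top().first); back_.pop(); }
        }
        front_.pop();
    }
    int query() const {                             // window assumed non-empty
        if (front_.empty()) return back_.top().second;
        if (back_.empty())  return front_.top().second;
        return std::max(front_.top().second, back_.top().second);
    }
};
```

For example, sliding a window of size 3 over the stream 3, 1, 4, 1, 5 yields maxima 4 (over {3,1,4}), 4 (over {1,4,1}) and 5 (over {4,1,5}). The same skeleton works for sum, count or any associative combiner, which is what makes it a natural target for the pane-based parallelizations listed in the 17/05 entry.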
  
  
Line 42:
| 22/03 |  |  |   |
| 26/04 | {{ :magistraleinformaticanetworking:spd:2016:gpgpu_intro.pdf |GPU and GPGPU intro}} |  |   |
| 03/05 |  |  |   |
| 08/05 |  |  |   |
| 15/05 |  |  |   |
| 17/05 |  |  |   |
| 19/05 |  |  |   |
| 22/05 |  |  |   |
| 25/05 |  |  |   |
| 26/05 |  |  |   |
| 29/05 |  |  |   |
| 30/05 |  |  |   |
  
magistraleinformaticanetworking/spd/lezioni16.17.1495824632.txt.gz · Last modified: 26/05/2017 at 18:50 (7 years ago) by Massimo Coppola