The computation for each point is actually given by the mandelbrot function, which returns the number of iterations until the point diverges, or //MaxI// if the limit is exceeded.
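For reference, a minimal sketch of the escape-time computation (function and parameter names are illustrative, not necessarily those of the lab code):

<code cpp>
#include <complex>

// Escape-time test for one point c of the complex plane: returns the number
// of iterations after which |z| exceeds 2, or MaxI if the limit is reached
// (the point is then assumed to belong to the set).
int mandelbrot(std::complex<double> c, int MaxI) {
    std::complex<double> z = 0.0;
    int i = 0;
    while (i < MaxI && std::norm(z) <= 4.0) {   // norm(z) = |z|^2, avoids a sqrt
        z = z * z + c;
        ++i;
    }
    return i;
}
</code>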
Choose a plane region near the border of the Mandelbrot set, so that your computation includes both some points of the set and some points outside it.
* measure the execution time and the speedup as they vary with the iteration limit parameter;
* check the amount of parallelism that is achieved;
* check if the load is balanced;
* check if the computation length per task is balanced (e.g. measure the number of tasks that reach //MaxI//, or even better compute a cumulative histogram of the iteration counts);
* check if you need to tune the grain size in order to achieve a good speedup when //MaxI// is low;
* try manually choosing a different grain size (see the sketch below this list).
Can you dynamically optimize the grain according to the input parameters?
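A minimal sketch of manual grain-size tuning, assuming the mandelbrot() function above, a flat image buffer and dx/dy scale factors mapping pixels to the plane (all of these names are assumptions, not the lab's required interface); the explicit grain size together with simple_partitioner keeps TBB from choosing chunk sizes on its own:

<code cpp>
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
#include <tbb/partitioner.h>
#include <complex>
#include <cstddef>

// Rows are split into chunks of at most 'grain' rows; each chunk is one task.
void compute_rows(int *image, int W, int H,
                  double x0, double y0, double dx, double dy,
                  int MaxI, std::size_t grain) {
    tbb::parallel_for(
        tbb::blocked_range<int>(0, H, grain),          // explicit grain size
        [&](const tbb::blocked_range<int> &r) {
            for (int y = r.begin(); y != r.end(); ++y)
                for (int x = 0; x < W; ++x)
                    image[y * W + x] =
                        mandelbrot(std::complex<double>(x0 + x * dx, y0 + y * dy), MaxI);
        },
        tbb::simple_partitioner());                    // split down to the stated grain
}
</code>

simple_partitioner splits the range all the way down to the stated grain, whereas the default auto_partitioner picks chunk sizes by itself; comparing the two at low //MaxI// values quickly shows whether grain control matters.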
== Things to do ==
* decide how you will provide the sequential code
* decide if you want to set up two nested parallel_for loops or a single loop with a 2D range (it is useful to try both and compare; see the sketch after this list)
* decide how to perform a rough check of the load balance per task (histogram, percentage of lengthy tasks, ...)
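A minimal sketch of the single-loop variant over a 2D range, under the same assumptions as the previous sketch; it also keeps a rough load indicator by counting the points that reach //MaxI// (one possible "percentage of lengthy tasks" check):

<code cpp>
#include <tbb/parallel_for.h>
#include <tbb/blocked_range2d.h>
#include <atomic>
#include <complex>

// Single parallel_for over rows x columns; compare it against two nested
// parallel_for loops on 1D ranges.
long compute_2d(int *image, int W, int H,
                double x0, double y0, double dx, double dy, int MaxI) {
    std::atomic<long> lengthy(0);                       // points hitting MaxI
    tbb::parallel_for(
        tbb::blocked_range2d<int>(0, H, 0, W),          // rows, then columns
        [&](const tbb::blocked_range2d<int> &r) {
            long local = 0;
            for (int y = r.rows().begin(); y != r.rows().end(); ++y)
                for (int x = r.cols().begin(); x != r.cols().end(); ++x) {
                    int it = mandelbrot(std::complex<double>(x0 + x * dx, y0 + y * dy), MaxI);
                    image[y * W + x] = it;
                    if (it == MaxI) ++local;
                }
            lengthy += local;                           // one atomic update per task
        });
    return lengthy;                                     // divide by W*H for a percentage
}
</code>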
== Useful places around the Mandelbrot set == | == Useful places around the Mandelbrot set == | ||
| X = -0.7463 Y = 0.1102 | R = 0.005 |
=== Actual farm with Mandelbrot ===
Restructure the program to work with a stream of points (or stream of sets of points) that are generated and enter a parallel_do;
* an input parameter ...
* the input of the sequential function contains ...
You can compute on points in the stream and recycle them if the computation takes too long. Use the parallel_do methods to reinsert into the loop those points that are not completed, and let those that are completed flow out of the parallel_do loop.
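A minimal sketch of the recycling idea, assuming a hypothetical PointTask record holding the constant c, the current iterate z and the iteration count (all names here are assumptions); each pass advances a point by at most //step// iterations and re-feeds it through the parallel_do feeder if it is not finished:

<code cpp>
#include <tbb/parallel_do.h>
#include <algorithm>
#include <complex>
#include <vector>

struct PointTask {
    std::complex<double> c, z;   // point of the plane and current iterate
    int iter = 0;                // iterations performed so far
};

void run_stream(std::vector<PointTask> &tasks, int MaxI, int step) {
    tbb::parallel_do(tasks.begin(), tasks.end(),
        [=](PointTask t, tbb::parallel_do_feeder<PointTask> &feeder) {
            int stop = std::min(t.iter + step, MaxI);
            while (t.iter < stop && std::norm(t.z) <= 4.0) {
                t.z = t.z * t.z + t.c;
                ++t.iter;
            }
            if (t.iter < MaxI && std::norm(t.z) <= 4.0)
                feeder.add(t);          // not done yet: reinsert into the loop
            // otherwise the point is completed and flows out of the loop
        });
}
</code>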
* Basic use of parallel_do implies the task grain is always a single point. Design a data structure that can aggregate several points into a single parallel_do task, minimizing the data copying required by the structure (see the sketch after this list).
* examine two solutions for adding the aggregated tasks:
  - the new tasks are inserted from within the loop itself, requiring a specific kind of parallel_do body (one accepting a feeder argument);
  - the new tasks are generated outside of the loop; how can you manage the synchronization between the loop and the task generators in order to prevent the loop from exiting while new tasks are about to be added?
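A minimal sketch of the aggregated-task idea with in-loop insertion (solution 1), reusing the hypothetical PointTask from the previous sketch; each parallel_do task now carries a block of points, and the points still unfinished after a pass are gathered into a new block that the body re-feeds to the loop:

<code cpp>
#include <tbb/parallel_do.h>
#include <algorithm>
#include <complex>
#include <vector>

// PointTask is the record defined in the previous sketch.
struct PointBlock {
    std::vector<PointTask> pts;      // aggregates several points in one task
};

void run_blocks(std::vector<PointBlock> &blocks, int MaxI, int step) {
    tbb::parallel_do(blocks.begin(), blocks.end(),
        [=](PointBlock &b, tbb::parallel_do_feeder<PointBlock> &feeder) {
            PointBlock leftover;     // points that still need more iterations
            for (PointTask &t : b.pts) {
                int stop = std::min(t.iter + step, MaxI);
                while (t.iter < stop && std::norm(t.z) <= 4.0) {
                    t.z = t.z * t.z + t.c;
                    ++t.iter;
                }
                if (t.iter < MaxI && std::norm(t.z) <= 4.0)
                    leftover.pts.push_back(t);
            }
            if (!leftover.pts.empty())
                feeder.add(leftover);    // reinsert the still-active points as one task
        });
}
</code>

Taking the block by reference avoids one copy on entry; further copy reduction is part of the exercise. For solution 2, note that the feeder is only reachable from inside the body, so external generators need some extra scheme (an assumption, not necessarily the intended one: keep a sentinel task circulating in the loop that polls a concurrent queue filled by the generators until they signal completion).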
== Extensions ==
If you consider ...