1. Comment on the following sentence: "A GPU processor obtains higher performances for functional parallelism, once the higher granularity of the tasks favors the GPU architectural design."

False. Functional (task) parallelism means building a dependency graph of the tasks to be executed and then running in parallel those that do not depend on one another. This is not where the GPU excels: such tasks are coarse-grained and usually execute different code, whereas a GPU is designed for data parallelism, where the same operation is applied to a large set of values (SIMD).

  1. Consider the computation of two vectors, sCol and sLin, holding the sums of the elements of each column and each line (row), respectively, of a matrix A of dimension m×n.
    a) Implement a sequential solution and compute the computational complexity of the algorithm.

    b) Implement a parallel program, using MPI, for a distributed memory machine with P processors.
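For part a), a minimal sequential sketch is given below. It assumes the matrix is stored row-major in a flat array of `double`; the function name `row_col_sums` and its signature are illustrative choices, not part of the question. Every element is read once and added to one line sum and one column sum, so the running time is Θ(m·n) with Θ(m+n) extra storage for the two result vectors.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential sketch: A is an m x n matrix stored row-major (assumption).
 * sLin must have room for m values, sCol for n values. */
void row_col_sums(const double *A, size_t m, size_t n,
                  double *sLin, double *sCol)
{
    assert(A && sLin && sCol);

    /* Column sums are accumulated across rows, so clear them first. */
    for (size_t j = 0; j < n; j++)
        sCol[j] = 0.0;

    for (size_t i = 0; i < m; i++) {
        double line = 0.0;              /* running sum of line i */
        for (size_t j = 0; j < n; j++) {
            double v = A[i * n + j];
            line    += v;               /* contributes to sLin[i] */
            sCol[j] += v;               /* contributes to sCol[j] */
        }
        sLin[i] = line;
    }
}
```

For part b), one common decomposition (not the only one) is to distribute blocks of rows with `MPI_Scatter`, let each process compute its own `sLin` entries and a partial `sCol`, then combine the partial column sums with `MPI_Reduce` and collect the line sums with `MPI_Gather`.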

  2. In the context of OpenMP, explain what loop scheduling is and what the main differences between static and dynamic loop scheduling are. Give an example of appropriate usage of each scheduling method.
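Loop scheduling is the policy that maps loop iterations to threads. With `schedule(static)` the iterations are split into chunks assigned to threads before the loop starts, which has very low overhead and suits iterations of equal cost. With `schedule(dynamic)` threads request a new chunk whenever they finish one, which costs more synchronization but balances the load when iteration costs vary. The sketch below illustrates one plausible use of each; the functions `scale`, `is_prime` and `count_primes` and the chunk size 64 are illustrative assumptions.

```c
#include <assert.h>

/* Uniform work per iteration: a static schedule gives each thread an
 * equal, contiguous share with minimal scheduling overhead. */
void scale(double *x, int n, double a)
{
    assert(x || n == 0);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        x[i] *= a;
}

/* Trial division: the cost grows with k, so iterations are unbalanced. */
static int is_prime(int k)
{
    if (k < 2)
        return 0;
    for (int d = 2; (long)d * d <= k; d++)
        if (k % d == 0)
            return 0;
    return 1;
}

/* Irregular work per iteration: a dynamic schedule hands out chunks of
 * 64 iterations on demand, so no thread is stuck with all the large k. */
int count_primes(int limit)
{
    int total = 0;
    #pragma omp parallel for schedule(dynamic, 64) reduction(+:total)
    for (int k = 0; k <= limit; k++)
        total += is_prime(k);
    return total;
}
```

Compiled without OpenMP support the pragmas are ignored and both functions run sequentially with the same results.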

  3. Comment on the following statement: "A system scalability is determined by the speedup obtained for a high number of processors".
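    Yes, broadly. Scalability describes how the speedup (or efficiency) evolves as the number of processors grows, so a scalable system is one whose speedup stays close to linear at high processor counts.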

  4. Consider the operation of computing a histogram of an 8-bit image. Each pixel is a single byte. At the end of the program, the array count will hold the number of occurrences of each value in the range [0, 255].

byte image[n][n];
int count[256], i, j;

# pragma omp parallel for
for (i=0; i<n; i++) {
	for (j=0; j<n; j++) {
		count[image[i][j]]++;
	}
}

a) Explain the problems with the solution presented in the code.

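  1. Race condition on count. The update count[image[i][j]]++ is a non-atomic read-modify-write on a shared array, so if two threads update the same index at the same time, increments are lost and the result is incorrect.
  2. Race condition on j. Only the outer loop variable i is made private by the parallel for; j is declared outside the parallel region and is therefore shared, so threads overwrite each other's inner loop counter.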

b) Propose a correct and efficient solution.
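One possible answer is to give each thread a private histogram and merge the partial results once at the end, which avoids synchronizing on every pixel. The sketch below assumes the image is passed as a flat row-major array of `unsigned char` (standing in for the `byte` type of the original code); the function name `histogram` is an illustrative choice. With OpenMP 4.5 or later, an array reduction such as `reduction(+:count[:256])` would be an alternative way to express the same idea.

```c
#include <assert.h>
#include <string.h>

/* Histogram of an n x n 8-bit image stored row-major (assumption). */
void histogram(const unsigned char *image, int n, int count[256])
{
    assert(image && count);
    memset(count, 0, 256 * sizeof count[0]);

    #pragma omp parallel
    {
        int local[256] = {0};           /* private to each thread */

        /* Loop counters are declared inside the loops, so both are
         * private; each thread only touches its own local[]. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                local[image[i * n + j]]++;

        /* Merge once per thread: 256 additions under mutual exclusion
         * instead of one synchronized update per pixel. */
        #pragma omp critical
        {
            for (int v = 0; v < 256; v++)
                count[v] += local[v];
        }
    }
}
```

A simpler fix would be to keep the shared array and protect each increment with `#pragma omp atomic`, but that serializes updates to popular values and is usually slower on images with large uniform areas.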