Parallel Applications 2

Public Resources

 

No login is required.
If additional texts are provided, read them carefully; they may contain helpful information on the origins of the files, licence agreements, etc.

Internal Resources

 

Login is required to view all content and to download files in this section.
Do not enter your LDAP credentials. A common user name and password were set for all students at the beginning of the semester.

Lesson 1

Prerequisites

  • download the CUDA 10 project template with all additional libraries for further usage -> DOWNLOAD
  • knowledge of C++

 

Topics and Tasks

  • try to compile the project template
  • explore the project structure

Lesson 2

Prerequisites

  • download the project template with all additional libraries for further usage -> DOWNLOAD
  • knowledge of C++

 

Topics and Tasks

  • Allocate the HOST memory that will represent two M-dimensional vectors (A, B) and fill them with some values.
  • Allocate the DEVICE memory to be able to copy data from HOST.
  • Allocate the DEVICE memory to store an output M-dimensional vector C.
  • Create a kernel that sums the vectors element-wise such that C[i] = A[i] + B[i].
  • Allocate the HOST memory that will represent two sets of N M-dimensional vectors (A_0, ..., A_N-1 and B_0, ..., B_N-1) and fill them with some values.
  • Allocate the DEVICE memory to be able to copy data from HOST.
  • Allocate the DEVICE memory to store output M-dimensional vectors C_0, ..., C_N-1.
  • Create a kernel that sums all vector pairs such that C_0[i] = A_0[i] + B_0[i], ..., C_N-1[i] = A_N-1[i] + B_N-1[i].
  • THINK ABOUT THE VARIANTS OF YOUR SOLUTION, CONSIDER THE PROS AND CONS.
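
One possible variant of the tasks above can be sketched as follows. It is only an illustration, not the reference solution: it assumes that all N vectors are stored back to back in one flat float array (element i of vector k at index k * M + i), so the single-pair case is simply N = 1. The names addVectors and addOnDevice are hypothetical and do not come from the project template.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One thread per scalar element. With the flat layout assumed here,
// the kernel does not need to know N or M, only the total element count.
__global__ void addVectors(const float* a, const float* b, float* c, size_t count)
{
    size_t idx = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (idx < count)                 // guard the last, partially filled block
        c[idx] = a[idx] + b[idx];
}

// Hypothetical host helper: allocates DEVICE memory, copies A and B,
// launches the kernel and copies C back. Returns an empty vector on error.
std::vector<float> addOnDevice(const std::vector<float>& hA,
                               const std::vector<float>& hB)
{
    const size_t count = hA.size();
    if (count == 0 || hB.size() != count) return {};
    const size_t bytes = count * sizeof(float);

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    std::vector<float> hC(count);

    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess
           && cudaMalloc(&dB, bytes) == cudaSuccess
           && cudaMalloc(&dC, bytes) == cudaSuccess
           && cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const unsigned threads = 256;
        const unsigned blocks = static_cast<unsigned>((count + threads - 1) / threads);
        addVectors<<<blocks, threads>>>(dA, dB, dC, count);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dA);                    // cudaFree accepts a null pointer
    cudaFree(dB);
    cudaFree(dC);
    return ok ? hC : std::vector<float>{};
}
```

For example, N = 2 vectors of dimension M = 3 are passed as two flat arrays of six floats each. Other variants worth comparing are one kernel launch per vector pair, or a 2D grid with one grid row per pair; the flat layout needs a single launch and keeps the memory accesses coalesced.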

Lesson 3

Prerequisites

  • download the project template with all additional libraries for further usage -> DOWNLOAD
  • CUDA - memory allocation, page-locked memory

 

Topics and Tasks

  • Create a column matrix m[mRows,mCols] containing the numbers 0 1 2 3 ...
  • The data should be well aligned in the page-locked memory.
  • The matrix should be filled in CUDA kernel.
  • You must use pitched CUDA memory with appropriate alignment. Moreover, you must use a 2D grid of 2D blocks of size 8x8.
  • Increment the values of the matrix.
  • Finally, copy the matrix to HOST using cudaMemcpy2D function.
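
The steps above might be sketched as follows. This is a minimal illustration under stated assumptions, not the illustrative template offered below: the matrix holds floats, the elements are numbered row by row (value = row * mCols + col; if "column matrix" is meant as column-wise numbering, the formula becomes col * mRows + row), and the names fillMatrix, incrementMatrix and buildMatrix are hypothetical.

```cuda
#include <cuda_runtime.h>

// Returns a pointer to the start of the given row in pitched memory.
// The pitch is in bytes, so the arithmetic is done on a char pointer.
__device__ float* rowPointer(float* base, size_t pitch, int row)
{
    return reinterpret_cast<float*>(reinterpret_cast<char*>(base) + row * pitch);
}

__global__ void fillMatrix(float* m, size_t pitch, int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols)
        rowPointer(m, pitch, row)[col] = static_cast<float>(row * cols + col);
}

__global__ void incrementMatrix(float* m, size_t pitch, int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols)
        rowPointer(m, pitch, row)[col] += 1.0f;
}

// Hypothetical host helper. Returns a densely packed rows x cols matrix in
// page-locked HOST memory (release it with cudaFreeHost), or nullptr on error.
float* buildMatrix(int rows, int cols)
{
    float* dM = nullptr;
    float* hM = nullptr;
    size_t pitch = 0;
    const size_t rowBytes = cols * sizeof(float);

    bool ok = cudaMallocPitch(&dM, &pitch, rowBytes, rows) == cudaSuccess
           && cudaMallocHost(&hM, rowBytes * rows) == cudaSuccess;
    if (ok) {
        dim3 block(8, 8);                            // 2D blocks of size 8x8
        dim3 grid((cols + block.x - 1) / block.x,    // 2D grid covering the matrix
                  (rows + block.y - 1) / block.y);
        fillMatrix<<<grid, block>>>(dM, pitch, rows, cols);
        incrementMatrix<<<grid, block>>>(dM, pitch, rows, cols);
        // Destination pitch is rowBytes (packed), source pitch is the device pitch.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy2D(hM, rowBytes, dM, pitch, rowBytes, rows,
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dM);
    if (!ok && hM != nullptr) {
        cudaFreeHost(hM);
        hM = nullptr;
    }
    return hM;
}
```

With rows = 5 and cols = 10, the returned matrix should start at 1 and end at 50, because every value is incremented once after the fill.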

 

Help for students

  • download an illustrative template of the partial solution -> DOWNLOAD

Lesson 4

Prerequisites

  • download the project template with all additional libraries for further usage -> DOWNLOAD
  • download the runner template -> DOWNLOAD
  • CUDA - shared memory

 

Topics and Tasks

  • Let us have a simple particle system representing a set of positions of N rain drops in 3D space, where N >= 1M.
  • Create a suitable data representation of the mentioned set of rain drops.
  • Let us have a field of 256 wind power plants that give 256 movement vectors. Together, the movement vectors determine how the positions of all rain drops change in one second.
  • Create a kernel that simulates the falling of rain drops.
  • For the sake of simplicity, suppose that a single kernel call simulates one second in the simulated world.
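
A rough sketch of one way to approach the task above follows. The task does not say how the 256 movement vectors combine, so this sketch assumes that every drop is displaced by the plain sum of all 256 vectors in each simulated second. It also assumes that positions are stored as an array of float3 and that gravity is already part of the movement vectors. The names simulateSecond and simulateOnDevice are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr unsigned kWindCount = 256;   // number of movement vectors (from the task)

// Must be launched with blockDim.x == kWindCount: each thread loads one
// movement vector into shared memory, then the block reduces them to a sum.
__global__ void simulateSecond(float3* drops, size_t dropCount, const float3* wind)
{
    __shared__ float3 sWind[kWindCount];
    sWind[threadIdx.x] = wind[threadIdx.x];
    __syncthreads();

    // Tree reduction in shared memory; sWind[0] ends up holding the sum.
    for (unsigned stride = kWindCount / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            sWind[threadIdx.x].x += sWind[threadIdx.x + stride].x;
            sWind[threadIdx.x].y += sWind[threadIdx.x + stride].y;
            sWind[threadIdx.x].z += sWind[threadIdx.x + stride].z;
        }
        __syncthreads();
    }

    size_t idx = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (idx < dropCount) {
        float3 p = drops[idx];
        p.x += sWind[0].x;
        p.y += sWind[0].y;
        p.z += sWind[0].z;
        drops[idx] = p;
    }
}

// Hypothetical host helper: simulates one second and returns the new positions.
// Returns an empty vector on error.
std::vector<float3> simulateOnDevice(const std::vector<float3>& drops,
                                     const std::vector<float3>& wind)
{
    if (drops.empty() || wind.size() != kWindCount) return {};
    const size_t dropBytes = drops.size() * sizeof(float3);
    const size_t windBytes = kWindCount * sizeof(float3);

    float3 *dDrops = nullptr, *dWind = nullptr;
    std::vector<float3> result(drops.size());

    bool ok = cudaMalloc(&dDrops, dropBytes) == cudaSuccess
           && cudaMalloc(&dWind, windBytes) == cudaSuccess
           && cudaMemcpy(dDrops, drops.data(), dropBytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dWind, wind.data(), windBytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const unsigned blocks =
            static_cast<unsigned>((drops.size() + kWindCount - 1) / kWindCount);
        simulateSecond<<<blocks, kWindCount>>>(dDrops, drops.size(), dWind);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(result.data(), dDrops, dropBytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dDrops);
    cudaFree(dWind);
    return ok ? result : std::vector<float3>{};
}
```

Every block repeats the same reduction here, which is redundant but keeps the sketch to a single kernel and shows the shared-memory pattern. Computing the sum once in a separate kernel, or choosing a structure-of-arrays layout for the positions, are alternatives worth weighing.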

 

Help for students

  • download an illustrative template of the partial solution -> DOWNLOAD

Lesson 5

This lesson focuses on a discussion of the students' projects. In the remaining time, the following tasks should be solved.

 

Prerequisites

  • download the project template with all additional libraries for further usage -> DOWNLOAD
  • CUDA - constant memory

 

Topics and Tasks

  • Try to write simple code that allocates and sets a scalar value in the GPU constant memory.
  • Copy the data back to HOST and check the value.
  • Do the same with a custom structure and then with an array.
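
The three steps above could look roughly like the sketch below. Constant memory is declared statically with __constant__ rather than allocated at run time, and it is written and read from the HOST through cudaMemcpyToSymbol and cudaMemcpyFromSymbol. The structure Params, the symbol names and the helper roundTripConstants are hypothetical and serve only as an illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical custom structure used only for this example.
struct Params {
    int count;
    float scale;
};

// Symbols placed in GPU constant memory.
__constant__ int cScalar;
__constant__ Params cParams;
__constant__ float cArray[4];

// Writes a scalar, a structure and an array to constant memory, reads them
// back to the HOST and reports whether all values survived the round trip.
bool roundTripConstants()
{
    const int hScalar = 42;
    const Params hParams = {3, 0.5f};
    const float hArray[4] = {1.0f, 2.0f, 3.0f, 4.0f};

    int rScalar = 0;
    Params rParams = {0, 0.0f};
    float rArray[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    bool ok = cudaMemcpyToSymbol(cScalar, &hScalar, sizeof(hScalar)) == cudaSuccess
           && cudaMemcpyToSymbol(cParams, &hParams, sizeof(hParams)) == cudaSuccess
           && cudaMemcpyToSymbol(cArray, hArray, sizeof(hArray)) == cudaSuccess
           && cudaMemcpyFromSymbol(&rScalar, cScalar, sizeof(rScalar)) == cudaSuccess
           && cudaMemcpyFromSymbol(&rParams, cParams, sizeof(rParams)) == cudaSuccess
           && cudaMemcpyFromSymbol(rArray, cArray, sizeof(rArray)) == cudaSuccess;
    if (!ok) return false;

    // Check the values copied back from the DEVICE.
    if (rScalar != hScalar) return false;
    if (rParams.count != hParams.count || rParams.scale != hParams.scale) return false;
    for (int i = 0; i < 4; ++i)
        if (rArray[i] != hArray[i]) return false;
    return true;
}
```

Calling roundTripConstants() from main and printing the result is enough to check the round trip; a kernel can read the same symbols directly by name.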

Lesson 6

Prerequisites

  • download the project template with all additional libraries for further usage -> DOWNLOAD
  • runner 6 - DOWNLOAD
  • CUDA - texture memory

 

Topics and Tasks

  • Try to finish a given application. To do that, you have to implement all subtasks marked by TODO in the code.