CUDA introduction part 1

CUDA is a tool for parallelizing workloads across many cores, called CUDA cores. It is not the ideal computing platform for most applications (e.g. single-threaded applications). Ideally, you would use a GPGPU (general-purpose GPU) computing platform like CUDA for massively parallelizable applications. One example is computing the determinant of large $N \times N$ matrices.


Heterogeneous computing: offloading certain types of operations from the processor (CPU) to the GPU.

Warp: A group of 32 CUDA threads that a single streaming multiprocessor (SM) schedules and executes together.
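To make the grouping concrete, here is a minimal CPU-side sketch of how a thread's index within its block maps to a warp and to a position inside that warp. It relies on the fact that a block is split into warps of consecutive thread indices. The names `kWarpSize`, `warp_id` and `lane_id` are made up for this illustration and are not part of the CUDA API.

```cpp
// Host-only sketch (no GPU needed). All names here are hypothetical helpers.
constexpr int kWarpSize = 32;  // warp size stated in the text above

// Which warp of its block a thread belongs to: threads 0-31 form warp 0,
// threads 32-63 form warp 1, and so on.
constexpr int warp_id(int thread_idx) {
    return thread_idx / kWarpSize;
}

// Position of the thread inside its warp (often called the lane), 0 to 31.
constexpr int lane_id(int thread_idx) {
    return thread_idx % kWarpSize;
}
```

For example, `warp_id(45)` and `lane_id(45)` place thread 45 in the second warp of its block (warp 1) at lane 13.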

CUDA also defines special computing units called blocks and grids. These units are built from threads: a CUDA thread executes on a single CUDA core, and a group of threads is organized into one logical entity called a CUDA block. Thread and block are software terms, which correspond to the hardware CUDA core and streaming multiprocessor, respectively.

The CUDA grid, which is the set of threads launched by one kernel, is then a group of blocks that is executed on the device (the GPU). This structure lets threads within the same block communicate with each other.
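One way to picture the grid, block and thread hierarchy is to emulate a one-dimensional grid serially on the CPU. The sketch below mirrors the expression `blockIdx.x * blockDim.x + threadIdx.x` that kernels commonly use to give every thread a unique global index. The function names are hypothetical, and the nested loops only imitate the ordering; on a real GPU these threads run in parallel.

```cpp
#include <vector>

// Hypothetical helper mirroring the CUDA expression
//   blockIdx.x * blockDim.x + threadIdx.x
constexpr int global_thread_index(int block_idx, int block_dim, int thread_idx) {
    return block_idx * block_dim + thread_idx;
}

// Serial emulation of a 1D grid: visit every (block, thread) pair and record
// the global index that thread would compute.
std::vector<int> enumerate_grid(int num_blocks, int threads_per_block) {
    std::vector<int> ids;
    ids.reserve(static_cast<std::size_t>(num_blocks) * threads_per_block);
    for (int b = 0; b < num_blocks; ++b) {            // plays the role of blockIdx.x
        for (int t = 0; t < threads_per_block; ++t) { // plays the role of threadIdx.x
            ids.push_back(global_thread_index(b, threads_per_block, t));
        }
    }
    return ids;
}
```

Calling `enumerate_grid(2, 4)` yields the indices 0 through 7, one per emulated thread, with no gaps or duplicates.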


The triple chevron brackets <<<>>> specify the number of blocks and the number of threads per block to run the device code on.

some_device_function<<<number_of_blocks, number_of_threads>>>();
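When the data size is not a multiple of the block size, a common approach is to round the number of blocks up so that every element gets a thread. The sketch below shows that arithmetic on the CPU. `blocks_needed` and `surplus_threads` are hypothetical helper names, and a block size of 256 is only an assumed, typical choice.

```cpp
// Host-only sketch. Both helpers are hypothetical, not CUDA API functions.

// Smallest number of blocks such that number_of_blocks * number_of_threads >= n.
constexpr int blocks_needed(int n, int number_of_threads) {
    return (n + number_of_threads - 1) / number_of_threads;
}

// Threads launched beyond the n elements. A kernel launched this way has to
// guard its body with a bounds check such as `if (i < n)` so these threads do nothing.
constexpr int surplus_threads(int n, int number_of_threads) {
    return blocks_needed(n, number_of_threads) * number_of_threads - n;
}
```

For 1000 elements and 256 threads per block, this gives 4 blocks and 24 surplus threads.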

Notice the speed difference between using your CPU (Ryzen 3950X in my case) and a GPU with many more compute cores (GTX 780 here).

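// Kernel definition
#include <cstdio>
#define N 256 // number of elements; one block holds at most 1024 threads

__global__ void VecAdd(const float* A, const float* B, float* C){
	int i = threadIdx.x; // index of this thread within its block (0 to N - 1); the kernel runs as N threads
	C[i] = A[i] + B[i];  // each thread adds one pair of elements
}

int main(){
	float *A, *B, *C;
	// Managed (unified) memory is accessible from both the CPU and the GPU
	cudaMallocManaged(&A, N * sizeof(float));
	cudaMallocManaged(&B, N * sizeof(float));
	cudaMallocManaged(&C, N * sizeof(float));
	for (int i = 0; i < N; i++) { A[i] = 1.0f * i; B[i] = 2.0f * i; }

	VecAdd<<<1, N>>>(A, B, C);  // <<<>>> is the execution configuration syntax: 1 block of N threads
	cudaDeviceSynchronize();    // wait for the GPU to finish before reading C on the CPU

	printf("C[1] = %f\n", C[1]);
	cudaFree(A); cudaFree(B); cudaFree(C);
}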
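For checking the kernel's result, or as a CPU baseline for a speed comparison, a sequential version of the same addition can be handy. This is only a sketch: `vec_add_cpu` and `vec_add_cpu_self_check` are hypothetical names, the input values are made up, and it assumes a compiler in C++14 mode or later (needed for loops inside `constexpr` functions).

```cpp
#include <cstddef>

// Sequential CPU version of the vector addition: each loop iteration does the
// work that one GPU thread does in the kernel.
constexpr void vec_add_cpu(const float* A, const float* B, float* C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        C[i] = A[i] + B[i];
    }
}

// Small self-check with made-up inputs; returns true if every sum is as expected.
constexpr bool vec_add_cpu_self_check() {
    const float A[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    const float B[4] = {0.0f, 2.0f, 4.0f, 6.0f};
    float C[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    vec_add_cpu(A, B, C, 4);
    return C[0] == 0.0f && C[1] == 3.0f && C[2] == 6.0f && C[3] == 9.0f;
}
```

The GPU output can then be compared element by element against the array this function fills.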



  1. Learn CUDA Programming: A beginner's guide to GPU programming and parallel computing with CUDA 10.x and C/C++ by Jaegeun Han, Bharatkumar Sharma

  2. CUDA C++ Programming Guide

  3. Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk, Wen-mei W. Hwu


I recommend cudaeducation.com for video tutorials on implementing CUDA code.