CUDA study 2-2. Using Graphics Card Memory
Allocation of graphics card memory
cudaError_t cudaMalloc(void** devPtr, size_t count);
The cudaMalloc()
function functions to allocate memory space in the DRAM of the graphics card.
Enter a pointer to the memory to be allocated as the first argument, and enter the size of the memory to be allocated as the second argument.
cudaError_t
is a return value that indicates whether the function call was successful. If allocation is successful, ‘cudaSuccess
’ is returned, and if it fails, about 20 values are returned that can indicate the cause of failure.
Freeing up graphics card memory
cudaError_t cudaFree(void* devPtr);
The cudaFree()
function releases memory allocated to the DRAM of the graphics card. Pass the pointer to be released as an argument.
Copy data from PC to graphics card
cudaError_t cudaMemcpy(void* dst, const void* src, size_t count, cudaMemcpyHostToDevice);
The name of the function is ‘memcpy’, but its main purpose is to transfer data from PC memory to GPU memory or from GPU memory to PC memory.
The first argument is a pointer to the copy destination, and enters the pointer to the graphics memory allocated with the cudaMalloc()
function. The second argument is a pointer to the source to be copied. Enter the pointer to the memory allocated by the Malloc()
function on the PC. The third argument is the size to copy. The fourth argument enters the type of copy to be performed. When copying from a PC to a graphics card, use cudaMemcpyHostToDevice. The last argument can have the following values:
Type | Action |
---|---|
cudaMemcpyHostToHost |
Copy from PC memory to PC memory |
cudaMemcpyHostToDevice |
Copy from PC memory to graphics card |
cudaMemcpyDeviceToHost |
Copy from graphics card memory to PC |
cudaMemcpyDeviceToDevice |
Copy from graphics card memory to graphics card |
Copy data from graphics card to PC
cudaError_t cudaMemcpy(void* dst, const void* src, size_t count, cudaMemcpyDeviceToHost);
Example
#include <stdio.h>
int main()
{
int InputData[5] = {1, 2, 3, 4, 5};
int OutputData[5] = {0};
int* GraphicsCard_memory;
// Allocate graphics card memory
cudaMalloc((void**)&GraphicsCard_memory, 5*sizeof(int));
// Copy data from PC to graphics card
cudaMemcpy(GraphicsCard_memory, InputData, 5*sizeof(int), cudaMemcpyHostToDevice);
// Copy data from graphics card to PC
cudaMemcpy(OutputData, GraphicsCard_memory, 5*sizeof(int), cudaMemcpyDeviceToHost);
// Print result
for (int i = 0; i < 5; i++)
{
printf(" OutputData[%d] : %d\n", i, OutputData[i]);
}
// Freeing graphics card memory
cudaFree(GraphicsCard_memory);
return 0;
}