StarPU Handbook
Macros | |
#define | STARPU_USE_CUDA |
#define | STARPU_MAXCUDADEVS |
#define | STARPU_CUDA_REPORT_ERROR(status) |
#define | STARPU_CUBLAS_REPORT_ERROR(status) |
Functions | |
cudaStream_t | starpu_cuda_get_local_stream (void) |
struct cudaDeviceProp * | starpu_cuda_get_device_properties (unsigned workerid) |
void | starpu_cuda_report_error (const char *func, const char *file, int line, cudaError_t status) |
int | starpu_cuda_copy_async_sync (void *src_ptr, unsigned src_node, void *dst_ptr, unsigned dst_node, size_t ssize, cudaStream_t stream, enum cudaMemcpyKind kind) |
void | starpu_cuda_set_device (unsigned devid) |
void | starpu_cublas_init (void) |
void | starpu_cublas_shutdown (void) |
void | starpu_cublas_report_error (const char *func, const char *file, int line, cublasStatus status) |
#define STARPU_USE_CUDA
This macro is defined when StarPU has been installed with CUDA support. It should be used in your code to detect the availability of CUDA as shown in Full source code for the ’Scaling a Vector’ example.
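The compile-time guard described above can be sketched as follows. This is a minimal illustration, not code from the StarPU examples: `cuda_support_compiled_in` and `available_backends` are hypothetical helper names, and the `#include <starpu.h>` that a real application would need is left out so that the sketch compiles without StarPU installed.

```c
/* In a real application, <starpu.h> would be included first; according to the
 * text above, STARPU_USE_CUDA is defined when StarPU was installed with CUDA
 * support. The include is omitted here (assumption: no StarPU available). */

/* Hypothetical helper: returns 1 if CUDA code paths were compiled in. */
static int cuda_support_compiled_in(void)
{
#ifdef STARPU_USE_CUDA
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: names the implementations the application would
 * register, depending on the compile-time check above. */
static const char *available_backends(void)
{
    return cuda_support_compiled_in() ? "cpu+cuda" : "cpu";
}
```

Guarding CUDA-specific declarations and codelet fields with the same `#ifdef` lets a single source file build against StarPU installations with and without CUDA support.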
#define STARPU_MAXCUDADEVS
This macro defines the maximum number of CUDA devices that are supported by StarPU.
#define STARPU_CUDA_REPORT_ERROR(status)
Calls starpu_cuda_report_error(), passing the current function, file and line position.
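The forwarding pattern can be illustrated with a small stand-alone sketch. It is not StarPU's actual macro definition: `demo_report_error`, `DEMO_REPORT_ERROR` and `demo_failing_call` are hypothetical names, and the stand-in reporter writes into a buffer so the result can be inspected instead of printing a message.

```c
#include <stdio.h>

/* Hypothetical stand-in for starpu_cuda_report_error(): formats the call site
 * and the status code into buf. What the real function does with these
 * arguments is not specified here. */
static int demo_report_error(char *buf, size_t len, const char *func,
                             const char *file, int line, int status)
{
    return snprintf(buf, len, "%s (%s:%d): error %d", func, file, line, status);
}

/* Hypothetical macro in the spirit of STARPU_CUDA_REPORT_ERROR: the caller
 * passes only the status, and the call site is captured automatically. */
#define DEMO_REPORT_ERROR(buf, len, status) \
    demo_report_error((buf), (len), __func__, __FILE__, __LINE__, (status))

/* Example call site: the report names this function, file and line. */
static void demo_failing_call(char *buf, size_t len)
{
    DEMO_REPORT_ERROR(buf, len, 2);
}
```

With the real macro, a typical call site would check the `cudaError_t` returned by a CUDA runtime call and invoke `STARPU_CUDA_REPORT_ERROR(status)` only when it differs from `cudaSuccess`.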
#define STARPU_CUBLAS_REPORT_ERROR(status)
Calls starpu_cublas_report_error(), passing the current function, file and line position.
cudaStream_t starpu_cuda_get_local_stream (void)
This function gets the current worker’s CUDA stream. StarPU provides a stream for every CUDA device controlled by StarPU. This function is only provided for convenience so that programmers can easily use asynchronous operations within codelets without having to create a stream by hand. Note that the application is not forced to use the stream provided by starpu_cuda_get_local_stream() and may also create its own streams. Synchronizing with cudaThreadSynchronize() is allowed, but will reduce the likelihood of having all transfers overlapped.
struct cudaDeviceProp * starpu_cuda_get_device_properties (unsigned workerid)
This function returns a pointer to the device properties of worker workerid (assumed to be a CUDA worker).
void starpu_cuda_report_error (const char *func, const char *file, int line, cudaError_t status)
Report a CUDA error.
int starpu_cuda_copy_async_sync (void *src_ptr, unsigned src_node, void *dst_ptr, unsigned dst_node, size_t ssize, cudaStream_t stream, enum cudaMemcpyKind kind)
Copy ssize bytes from the pointer src_ptr on src_node to the pointer dst_ptr on dst_node. The function first tries to copy the data asynchronously (unless stream is NULL). If the asynchronous copy fails or if stream is NULL, it copies the data synchronously. The function returns -EAGAIN if the asynchronous launch was successful. It returns 0 if the synchronous copy was successful, or fails otherwise.
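The control flow and return values described above can be sketched without CUDA. This is an illustration under stated assumptions, not StarPU's implementation: `demo_copy_async_sync`, `demo_copy_fn`, `demo_copy_ok` and `demo_copy_fail` are hypothetical names, the two function pointers stand in for the asynchronous and synchronous CUDA copies, and the `-1` returned on a failed synchronous copy is a simplification (the documentation only says that the function fails in that case).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical copy backend: returns 0 on success, nonzero on failure. */
typedef int (*demo_copy_fn)(void *dst, const void *src, size_t size);

/* Sketch of the documented control flow:
 *  - stream_available == 0 plays the role of a NULL stream;
 *  - returns -EAGAIN when the asynchronous copy was launched;
 *  - returns 0 when the synchronous copy succeeded;
 *  - returns -1 when the synchronous copy failed (simplification). */
static int demo_copy_async_sync(void *dst, const void *src, size_t size,
                                int stream_available,
                                demo_copy_fn async_copy, demo_copy_fn sync_copy)
{
    if (stream_available && async_copy(dst, src, size) == 0)
        return -EAGAIN;   /* transfer launched; completion is still pending */

    if (sync_copy(dst, src, size) == 0)
        return 0;         /* data is already in place when the call returns */

    return -1;
}

/* Trivial backends used to exercise each branch. */
static int demo_copy_ok(void *dst, const void *src, size_t size)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < size; i++)
        d[i] = s[i];
    return 0;
}

static int demo_copy_fail(void *dst, const void *src, size_t size)
{
    (void)dst; (void)src; (void)size;
    return 1;
}
```

A caller should therefore treat -EAGAIN as "in progress" rather than as an error: the destination buffer may only be used once the stream has completed, whereas a return value of 0 means the data is already available.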
void starpu_cuda_set_device (unsigned devid)
Calls cudaGLSetGLDevice(devid) if devid is listed in the field starpu_conf::cuda_opengl_interoperability, and cudaSetDevice(devid) otherwise.
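The selection between the two calls can be sketched as a simple membership test. This is a hypothetical illustration, not StarPU code: `demo_choose_set_device` and the enum are invented names, the list of OpenGL-interoperable devices is passed as a plain array rather than read from starpu_conf, and the sketch assumes that devices in the list are the ones set up with cudaGLSetGLDevice().

```c
#include <stddef.h>

/* Hypothetical tags naming which CUDA call would be made. */
enum demo_set_device_call {
    DEMO_CUDA_SET_DEVICE,        /* stands for cudaSetDevice(devid) */
    DEMO_CUDA_GL_SET_GL_DEVICE   /* stands for cudaGLSetGLDevice(devid) */
};

/* Returns the call to use for devid: the OpenGL variant if devid appears in
 * gl_devices (assumption, see lead-in), the plain variant otherwise. */
static enum demo_set_device_call
demo_choose_set_device(unsigned devid, const unsigned *gl_devices,
                       size_t n_gl_devices)
{
    for (size_t i = 0; i < n_gl_devices; i++)
        if (gl_devices[i] == devid)
            return DEMO_CUDA_GL_SET_GL_DEVICE;
    return DEMO_CUDA_SET_DEVICE;
}
```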
void starpu_cublas_init (void)
This function initializes CUBLAS on every CUDA device controlled by StarPU. The CUBLAS library must be initialized prior to any CUBLAS call. This call blocks until CUBLAS has been properly initialized on every device.
void starpu_cublas_shutdown (void)
This function synchronously deinitializes the CUBLAS library on every CUDA device.
void starpu_cublas_report_error (const char *func, const char *file, int line, cublasStatus status)
Report a CUBLAS error.