CUDA runtime for data transfer and kernel execution.
There are multiple ways to transfer data from main memory to device memory. For the best performance, data transfer and kernel execution should also overlap. This module provides a collection of transfer method implementations and efficient iterators for executing GPU kernels.
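The overlap of transfer and execution can be illustrated with a CPU-only analogy. The sketch below is not this module's API and does not touch CUDA: the function name `pipelined_sum_of_squares`, the chunking scheme, and the use of a bounded channel as a stand-in for a staging buffer are all assumptions made for illustration. One thread plays the role of the transfer stage by copying chunks into staging buffers, while the calling thread plays the role of the kernel by processing each chunk that has already arrived.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical helper (not part of this module): sums the squares of `data`
/// while overlapping a simulated "transfer" stage with a simulated "execute" stage.
pub fn pipelined_sum_of_squares(data: &[u64], chunk_len: usize) -> u64 {
    assert!(chunk_len > 0, "chunk_len must be non-zero");

    // Capacity 1 means at most one chunk waits in the channel while another
    // is being processed, which loosely mimics double buffering.
    let (tx, rx) = sync_channel::<Vec<u64>>(1);

    thread::scope(|s| {
        // "Transfer" stage: copy each chunk into a freshly allocated staging
        // buffer. The copy stands in for a host-to-device transfer.
        s.spawn(move || {
            for chunk in data.chunks(chunk_len) {
                if tx.send(chunk.to_vec()).is_err() {
                    break; // receiver is gone, stop transferring
                }
            }
            // `tx` is dropped here, which closes the channel.
        });

        // "Execute" stage: process each staged chunk as soon as it arrives,
        // while the transfer thread is already copying the next one.
        let mut total = 0u64;
        for staged in rx {
            total += staged.iter().map(|x| x * x).sum::<u64>();
        }
        total
    })
}
```

Calling `pipelined_sum_of_squares(&[1, 2, 3, 4, 5], 2)` splits the input into three chunks and returns the same sum as a sequential loop would. On a GPU, the analogous pattern would typically use asynchronous copies and multiple streams instead of threads and channels; which mechanism applies depends on the chosen transfer strategy.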
Structs
CUDA iterator for two mutable inputs.
Timings of the CudaTransferStrategy phases.
CUDA iterator for two mutable unified memory inputs.
Timer based on CUDA events.
Enums
Specify the CUDA transfer strategy.
Traits
Conversion into a CUDA iterator.
Conversion into a CUDA iterator with a specified transfer strategy.