CUDA runtime for data transfer and kernel execution.

There are multiple methods for transferring data from main memory to device memory. For the best performance, data transfer should overlap with kernel execution. This module provides a collection of transfer method implementations and efficient iterators for executing GPU kernels.
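The overlap of transfer and execution can be illustrated without a GPU. The following is a minimal sketch in plain `std` Rust, not this module's API: a producer thread stands in for the host-to-device copy and the consuming loop stands in for the kernel launch. The function name `pipelined_sum_of_squares`, the chunking scheme, and the use of a bounded channel as a staging buffer are all assumptions made for illustration.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical illustration: sum the squares of `data`, processing it in
/// chunks of `chunk_len` elements. One thread stages ("transfers") chunks
/// while the calling thread processes ("executes") them concurrently.
pub fn pipelined_sum_of_squares(data: &[u64], chunk_len: usize) -> u64 {
    assert!(chunk_len > 0, "chunk_len must be non-zero");

    // Bounded channel with capacity 1: at most one staged chunk waits in the
    // channel while another chunk is being processed.
    let (tx, rx) = sync_channel::<Vec<u64>>(1);

    thread::scope(|s| {
        // "Transfer" stage: copy each chunk into a freshly allocated buffer.
        // The copy stands in for a host-to-device memory transfer.
        s.spawn(move || {
            for chunk in data.chunks(chunk_len) {
                if tx.send(chunk.to_vec()).is_err() {
                    break; // the receiver is gone, stop staging
                }
            }
            // `tx` is dropped here, which closes the channel.
        });

        // "Execute" stage: runs while the producer stages the next chunk.
        let mut total = 0u64;
        for staged in rx {
            total += staged.iter().map(|x| x * x).sum::<u64>();
        }
        total
    })
}
```

Calling `pipelined_sum_of_squares(&[1, 2, 3, 4, 5], 2)` processes three chunks and returns 55. On a real device the same structure would typically map to asynchronous copies and kernel launches on separate CUDA streams. The bounded channel plays the role of a fixed set of staging buffers.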

Structs

CUDA iterator for two mutable inputs.

Timings of the `CudaTransferStrategy` phases.

CUDA iterator for two mutable unified memory inputs.

Timer based on CUDA events.

Enums

Specify the CUDA transfer strategy.

Traits

Conversion into a CUDA iterator.

Conversion into a CUDA iterator with a specified transfer strategy.