Trait numa_gpu::runtime::cuda::IntoCudaIteratorWithStrategy

pub trait IntoCudaIteratorWithStrategy<'a> {
    type Iter;
    fn into_cuda_iter_with_strategy(
        &'a mut self, 
        strategy: CudaTransferStrategy, 
        chunk_len: usize, 
        cpu_memcpy_threads: usize, 
        cpu_affinity: &CpuAffinity
    ) -> Result<Self::Iter>;
}

Conversion into a CUDA iterator with a specified transfer strategy.

By implementing IntoCudaIteratorWithStrategy for a type, you define how the type is converted into an iterator capable of executing CUDA functions on a GPU.

The strategy parameter specifies which transfer strategy to use. See the CudaTransferStrategy documentation for details. The iterator must implement all strategies.

The chunk_len parameter specifies the granularity of each data transfer from main memory to device memory. The same granularity is used when passing input parameters to the GPU kernel.
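To illustrate the chunking granularity, the following is a minimal sketch that uses only the standard library and is not the crate's implementation. It assumes chunk_len counts elements (the implementation on tuples names this parameter gpu_morsel_bytes, so the actual unit may differ), and the helper names paired_chunk_lens and chunk_demo are hypothetical.

```rust
/// Hypothetical helper: splits two slices into paired chunks of at most
/// `chunk_len` elements and reports the length of each pair.
/// Panics if `chunk_len` is zero, because `chunks_mut` does.
pub fn paired_chunk_lens<R, S>(
    left: &mut [R],
    right: &mut [S],
    chunk_len: usize,
) -> Vec<(usize, usize)> {
    left.chunks_mut(chunk_len)
        .zip(right.chunks_mut(chunk_len))
        .map(|(l, r)| (l.len(), r.len()))
        .collect()
}

/// Ten elements with a granularity of four yield two full chunks and one
/// shorter tail chunk.
pub fn chunk_demo() -> Vec<(usize, usize)> {
    let mut keys = [0u32; 10];
    let mut payloads = [0u64; 10];
    paired_chunk_lens(&mut keys, &mut payloads, 4)
}
```

In this sketch the last chunk is simply shorter when the slice length is not a multiple of chunk_len; whether the crate handles the tail the same way is not stated here.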

Associated Types

type Iter

The type of the iterator to produce.

Required methods

fn into_cuda_iter_with_strategy(
    &'a mut self,
    strategy: CudaTransferStrategy,
    chunk_len: usize,
    cpu_memcpy_threads: usize,
    cpu_affinity: &CpuAffinity
) -> Result<Self::Iter>

Creates an iterator from a value.

See the module-level documentation for details.

Implementations on Foreign Types

impl<'i, 'r, 's, R, S> IntoCudaIteratorWithStrategy<'i> for (&'r mut [R], &'s mut [S]) where
    'r: 'i,
    's: 'i,
    R: Copy + DeviceCopy + Send + Sync + 'i,
    S: Copy + DeviceCopy + Send + Sync + 'i,

Converts a tuple of two mutable slices into a CUDA iterator.

The slices must be mutable rather than read-only because transfer strategies can be implemented as a parallel pipeline. For such parallel code to type-check, Rust requires exclusive access to the data. Holding a mutable reference guarantees exclusive access, because only a single mutable reference to a value can exist at any point in time.
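The role of exclusive access can be sketched with the standard library alone. The example below is an illustration, not the crate's pipeline: the function parallel_copy is hypothetical, and it spawns one scoped thread per chunk purely for demonstration. Because chunks_mut hands out disjoint mutable sub-slices, each thread owns its chunk exclusively and the compiler accepts the concurrent writes.

```rust
use std::thread;

/// Hypothetical helper: copies `src` into `dst` chunk by chunk, using one
/// scoped thread per chunk. Panics if the lengths differ or if
/// `chunk_len` is zero.
pub fn parallel_copy<T: Copy + Send>(src: &mut [T], dst: &mut [T], chunk_len: usize) {
    assert_eq!(src.len(), dst.len());
    thread::scope(|scope| {
        // Each iteration yields a pair of non-overlapping mutable chunks.
        for (s, d) in src.chunks_mut(chunk_len).zip(dst.chunks_mut(chunk_len)) {
            // Moving the exclusive borrows into the thread is allowed
            // because no other reference to these chunks can exist.
            scope.spawn(move || d.copy_from_slice(s));
        }
        // All threads are joined when the scope ends.
    });
}
```

A real pipeline would more likely reuse a fixed number of worker threads (compare the cpu_memcpy_threads parameter) rather than spawn one thread per chunk.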

The slices can have mutually distinct lifetimes. However, the resulting iterator must not outlive the shorter of the two slice lifetimes.
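The lifetime relationship can be sketched as follows. PairIter, make_iter, and lifetime_demo are hypothetical stand-ins (not CudaIterator2 or any crate API); only the bounds 'r: 'i and 's: 'i mirror the implementation's signature.

```rust
/// Hypothetical stand-in for an iterator that borrows two slices for `'i`.
pub struct PairIter<'i, R, S> {
    pub left: &'i mut [R],
    pub right: &'i mut [S],
}

/// The slices live for `'r` and `'s`; both must outlive the iterator's `'i`.
pub fn make_iter<'i, 'r, 's, R, S>(
    pair: &'i mut (&'r mut [R], &'s mut [S]),
) -> PairIter<'i, R, S>
where
    'r: 'i,
    's: 'i,
{
    // Reborrow each slice for the shorter lifetime `'i`.
    PairIter {
        left: &mut *pair.0,
        right: &mut *pair.1,
    }
}

pub fn lifetime_demo() -> usize {
    let mut long_lived = vec![1u32, 2, 3];
    let len;
    {
        // This buffer lives only for the inner block, so the two slices
        // have different lifetimes.
        let mut short_lived = vec![10u64, 20];
        let mut pair = (long_lived.as_mut_slice(), short_lived.as_mut_slice());
        let iter = make_iter(&mut pair);
        len = iter.left.len() + iter.right.len();
    }
    // The iterator is gone, so the longer-lived buffer is usable again.
    long_lived.push(4);
    len + long_lived.len()
}
```

Returning the iterator from the inner block would be rejected by the compiler, because it would outlive short_lived.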

This implementation should be used as a basis for future implementations for other tuples of slices (e.g., one slice, three slices, etc.).
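As a rough sketch of what a one-slice variant might look like, the example below uses a deliberately simplified stand-in trait. IntoChunkIter, into_chunk_iter, and double_in_chunks are hypothetical; the strategy, thread-count, and affinity parameters are omitted, the DeviceCopy bound is dropped, and the iterator is the standard library's ChunksMut rather than a CUDA iterator. Only the lifetime and bound structure follows the two-slice implementation.

```rust
use std::slice::ChunksMut;

/// Simplified stand-in for the trait's shape (assumption: no transfer
/// strategy, memcpy threads, or CPU affinity).
pub trait IntoChunkIter<'a> {
    type Iter;
    fn into_chunk_iter(&'a mut self, chunk_len: usize) -> Self::Iter;
}

/// One-slice variant, written as a one-element tuple to match the
/// tuple-based style of the two-slice implementation.
impl<'i, 'r, R> IntoChunkIter<'i> for (&'r mut [R],)
where
    'r: 'i,
    R: Copy + Send + Sync + 'i,
{
    type Iter = ChunksMut<'i, R>;

    fn into_chunk_iter(&'i mut self, chunk_len: usize) -> Self::Iter {
        // Reborrows the slice for `'i`; panics if `chunk_len` is zero.
        self.0.chunks_mut(chunk_len)
    }
}

/// Doubles every element chunk by chunk and returns the number of chunks.
pub fn double_in_chunks(data: &mut [u32], chunk_len: usize) -> usize {
    let mut one = (data,);
    let mut chunks = 0;
    for chunk in one.into_chunk_iter(chunk_len) {
        for x in chunk.iter_mut() {
            *x *= 2;
        }
        chunks += 1;
    }
    chunks
}
```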

type Iter = CudaIterator2<'i, R, S>

fn into_cuda_iter_with_strategy(
    &'i mut self,
    strategy: CudaTransferStrategy,
    gpu_morsel_bytes: usize,
    cpu_memcpy_threads: usize,
    cpu_affinity: &CpuAffinity
) -> Result<CudaIterator2<'i, R, S>>
