Mojo function
ldg
ldg[type: DType, //, width: Int = 1, *, alignment: Int = alignof[::AnyType,__mlir_type.!kgen.target]()](x: UnsafePointer[SIMD[type, 1]]) -> SIMD[type, width]
Load data from global memory through the non-coherent cache.
This function performs a global memory load through the GPU's non-coherent
(read-only) cache, equivalent to CUDA's __ldg intrinsic.
It is optimized for read-only data access patterns.
Note:
- Uses invariant loads, which indicate that the memory won't change during kernel execution.
- Particularly beneficial for read-only, texture-like access patterns.
- May improve performance on memory-bound kernels.
Parameters:
- type (DType): The data type to load (must be numeric).
- width (Int): The SIMD vector width for vectorized loads.
- alignment (Int): Memory alignment in bytes. Defaults to the natural alignment of the SIMD vector type.
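To illustrate how the parameters combine, here is a minimal sketch of a vectorized load. It assumes that ldg is importable from gpu.intrinsics and that UnsafePointer[Float32] is accepted as written in your Mojo version. The helper names load_four and load_four_explicit are hypothetical, and the pointer is assumed to be 16-byte aligned.

```mojo
from gpu.intrinsics import ldg  # assumed module path; not stated on this page
from memory import UnsafePointer


# Hypothetical helper: one read-only load of four consecutive float32 values.
# `type` is inferred from the pointer; only `width` is given explicitly.
fn load_four(src: UnsafePointer[Float32]) -> SIMD[DType.float32, 4]:
    return ldg[width=4](src)


# Same load with the keyword-only `alignment` parameter spelled out.
# 16 bytes is the size of four float32 values; this is assumed to match
# the default (natural) alignment of SIMD[DType.float32, 4].
fn load_four_explicit(src: UnsafePointer[Float32]) -> SIMD[DType.float32, 4]:
    return ldg[width=4, alignment=16](src)
```

Because alignment follows the `*` in the signature, it can only be passed by keyword.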
Args:
- x (UnsafePointer[SIMD[type, 1]]): Pointer to the global memory location to load from.
Returns:
SIMD vector containing the loaded data.
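As a rough usage sketch, the kernel below reads its input through ldg and writes a scaled copy. The kernel name scale_kernel and its arguments are hypothetical. The import paths (gpu, gpu.intrinsics) are assumptions, since this page does not state them. Host-side launch code is omitted because it depends on the Mojo version in use.

```mojo
from gpu import block_dim, block_idx, thread_idx
from gpu.intrinsics import ldg  # assumed module path; not stated on this page
from memory import UnsafePointer


# Hypothetical kernel: dst[i] = src[i] * factor for i in [0, n).
# `src` is only read during the kernel, so an invariant load applies.
fn scale_kernel(
    dst: UnsafePointer[Float32],
    src: UnsafePointer[Float32],
    factor: Float32,
    n: Int,
):
    var i = Int(block_idx.x * block_dim.x + thread_idx.x)
    if i < n:
        # With the default width of 1, ldg returns a single Float32.
        dst[i] = ldg(src + i) * factor
```

Only the input buffer goes through ldg here. The output buffer is written during the kernel, so it does not meet the "memory won't change" condition described in the note above.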