
Mojo struct

LayoutTensor

@register_passable(trivial) struct LayoutTensor[dtype: DType, layout: Layout, rank: Int = $1.rank(), /, *, address_space: AddressSpace = 0, element_layout: Layout = __init__[::Origin[{False}],::Origin[{False}]](IntTuple(1), IntTuple(1)), layout_bitwidth: Int = Int(bitwidthof[::DType,__mlir_type.!kgen.target]()), masked: Bool = False, alignment: Int = Int(alignof[::DType,__mlir_type.!kgen.target]())]

This is a Tensor type that has a specified memory layout and rank. The following example demonstrates a LayoutTensor of float32 with a row-major layout of shape (5, 4).

alias f32 = DType.float32
var tensor_5x4 = LayoutTensor[f32, Layout.row_major(5,4)].stack_allocation()

Parameters

  • dtype (DType): The data type of the underlying pointer.
  • layout (Layout): The memory layout of the Tensor.
  • rank (Int): The rank of the Tensor.
  • address_space (AddressSpace): The address space of the underlying pointer.
  • element_layout (Layout): The memory layout of each element in the Tensor.
  • layout_bitwidth (Int): The bitwidth of each dimension of runtime layout.
  • masked (Bool): If true, the tensor is masked and runtime layouts determine the shape.
  • alignment (Int): Alignment of the data pointer.

Aliases

  • index_type = _get_index_type(layout, address_space):
  • uint_type = SIMD[_get_unsigned_type(layout, address_space), 1]:
  • element_size = element_layout.size():
  • element_type = SIMD[dtype, element_layout.size()]:

Fields

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment]):
  • runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth]):
  • runtime_element_layout (RuntimeLayout[element_layout]):

Implemented traits

AnyType, CollectionElement, CollectionElementNew, Copyable, ExplicitlyCopyable, Movable, Stringable, UnknownDestructibility, Writable

Methods

__init__

@implicit __init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]) -> Self

Create a LayoutTensor with an UnsafePointer. Expects the layout to be fully static.

Args:

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
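
For example, a minimal sketch of wrapping heap memory in a LayoutTensor (the import paths and the use of UnsafePointer.alloc here are assumptions, not taken from this page):

from memory import UnsafePointer
from layout import Layout, LayoutTensor

# Allocate 20 float32 values and view them as a 5x4 row-major tensor.
var ptr = UnsafePointer[Float32].alloc(20)
var t = LayoutTensor[DType.float32, Layout.row_major(5, 4)](ptr)
# ... use t ...
ptr.free()

The tensor does not own the buffer; freeing the pointer remains the caller's responsibility.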

__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin], runtime_layout: RuntimeLayout[layout, bitwidth=bitwidth]) -> Self

Create a LayoutTensor with an UnsafePointer. Expects the element layout to be fully static.

Args:

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
  • runtime_layout (RuntimeLayout[layout, bitwidth=bitwidth]): The runtime layout of the LayoutTensor.

__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin], runtime_layout: RuntimeLayout[layout, bitwidth=layout_bitwidth], element_runtime_layout: RuntimeLayout[element_layout]) -> Self

Create a LayoutTensor with an UnsafePointer, a runtime layout of the Tensor, and the runtime layout of each element.

Args:

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
  • runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth]): The runtime layout of the LayoutTensor.
  • element_runtime_layout (RuntimeLayout[element_layout]): The runtime layout of each element.

__getitem__

__getitem__(self, *dims: Int) -> SIMD[dtype, element_layout.size()]

Get the element of the tensor at the specified index. Note that the number of indices must match the rank of the tensor.

Args:

  • *dims (Int): The indexes that specify which element to retrieve.
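
For example, a minimal indexing sketch for a rank-2 tensor (import path assumed, as in the example at the top of this page):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(5, 4)].stack_allocation().fill(0)
var value = t[2, 3]  # rank-2 tensor, so exactly two indices are required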

__setitem__

__setitem__(self, d0: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor with a specified index and value.

Args:

  • d0 (Int): The first dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.

__setitem__(self, d0: Int, d1: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor with a specified index and value.

Args:

  • d0 (Int): The first dimensional index.
  • d1 (Int): The second dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.

__setitem__(self, d0: Int, d1: Int, d2: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor with a specified index and value.

Args:

  • d0 (Int): The first dimensional index.
  • d1 (Int): The second dimensional index.
  • d2 (Int): The third dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.

__setitem__(self, d0: Int, d1: Int, d2: Int, d3: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor with a specified index and value.

Args:

  • d0 (Int): The first dimensional index.
  • d1 (Int): The second dimensional index.
  • d2 (Int): The third dimensional index.
  • d3 (Int): The fourth dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.
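
For example, a minimal sketch for a rank-2 tensor (same assumptions as the indexing example above):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(5, 4)].stack_allocation()
t[2, 3] = 42.0  # two indices select one element of a rank-2 tensor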

__add__

__add__(self, other: SIMD[dtype, 1]) -> Self

Add a scalar value to the LayoutTensor. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__add__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self

Perform an addition with another LayoutTensor and return the resulting tensor. Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to add.
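
A hedged sketch of both __add__ overloads (fill is used only to give the operands defined values; import path assumed):

from layout import Layout, LayoutTensor

var a = LayoutTensor[DType.float32, Layout.row_major(4, 4)].stack_allocation().fill(1)
var b = LayoutTensor[DType.float32, Layout.row_major(4, 4)].stack_allocation().fill(2)
var c = a + b     # elementwise sum of two same-shape tensors
var d = a + 10.0  # the scalar is broadcasted to every element

The other arithmetic operators (__sub__, __mul__, __truediv__ and their in-place variants) follow the same pattern.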

__sub__

__sub__(self, other: SIMD[dtype, 1]) -> Self

Subtract a scalar value from the LayoutTensor. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__sub__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self

Perform a subtraction with another LayoutTensor and return the resulting tensor. Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to subtract.

__mul__

__mul__(self, other: SIMD[dtype, 1]) -> Self

Multiply the LayoutTensor with a scalar value. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__mul__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self

Perform a multiplication with another LayoutTensor and return the resulting tensor.

Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to be multiplied with.

Returns:

The resulting tensor after multiplication.

__truediv__

__truediv__(self, other: SIMD[dtype, 1]) -> Self

Divide the LayoutTensor by a scalar value. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__truediv__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self

Perform an elementwise division with another LayoutTensor and return the resulting tensor. Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to divide by.

__iadd__

__iadd__(self, other: SIMD[dtype, 1])

Adds a scalar value to the LayoutTensor in place. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__iadd__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Perform an addition with another LayoutTensor in place. Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to add.

__isub__

__isub__(self, other: SIMD[dtype, 1])

Subtract a scalar value from the LayoutTensor in place. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__isub__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Subtracts the other tensor from the LayoutTensor in place. Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to subtract.

__imul__

__imul__(self, other: SIMD[dtype, 1])

Multiply the LayoutTensor with a scalar value in place. The scalar value will be broadcasted to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__imul__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Perform a multiplication with another LayoutTensor in place. Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to multiply with.

copy

copy(self) -> Self

Explicitly copy this LayoutTensor.

Returns:

A copy of the value.

bitcast

bitcast[new_type: DType, /, address_space: AddressSpace = address_space, element_layout: Layout = element_layout](self) -> LayoutTensor[new_type, layout, layout.rank(), address_space=address_space, element_layout=element_layout, masked=masked]

Bitcast the underlying pointer to a new data type.

Parameters:

  • new_type (DType): The new data type it is casting to.
  • address_space (AddressSpace): The address space of the returned LayoutTensor.
  • element_layout (Layout): The element layout of the returned LayoutTensor.
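
A hedged sketch of viewing the same memory under a different element type (no data is copied; import path assumed):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(4, 4)].stack_allocation().fill(0)
var bits = t.bitcast[DType.int32]()  # same buffer reinterpreted as int32 elements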

__elementwise_unary

__elementwise_unary[func: fn(SIMD[dtype, element_layout.size()]) capturing -> SIMD[dtype, element_layout.size()], inplace: Bool = False](self) -> Self

__elementwise_binary_with_broadcast

__elementwise_binary_with_broadcast[func: fn(SIMD[dtype, element_layout.size()], SIMD[dtype, element_layout.size()]) capturing -> SIMD[dtype, element_layout.size()], other_layout: Layout, inplace: Bool = False](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self

load

load[width: Int](self, m: Int, n: Int) -> SIMD[dtype, width]

Load a value from a specified location.

Parameters:

  • width (Int): The simd width of the returned value.

Args:

  • m (Int): The m dimension of the value.
  • n (Int): The n dimension of the value.

prefetch

prefetch(self, m: Int, n: Int)

Do software prefetching of a value from a specified location.

Args:

  • m (Int): The m dimension of the value.
  • n (Int): The n dimension of the value.

aligned_load

aligned_load[width: Int](self, m: Int, n: Int) -> SIMD[dtype, width]

Do a load with a specified alignment based on the dtype and simd width.

Parameters:

  • width (Int): The simd width of the returned value.

Args:

  • m (Int): The m dimension of the value.
  • n (Int): The n dimension of the value.

store

store[width: Int](self, m: Int, n: Int, val: SIMD[dtype, width])

Store a value to a specified location.

Parameters:

  • width (Int): The simd width of the stored value.

Args:

  • m (Int): The m dimensional index to the tensor.
  • n (Int): The n dimensional index to the tensor.
  • val (SIMD[dtype, width]): The value to be stored.
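
A hedged sketch combining store and load (import path assumed; the chosen SIMD width must fit within the row being accessed):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(4, 8)].stack_allocation().fill(0)
t.store[4](1, 0, SIMD[DType.float32, 4](1, 2, 3, 4))  # write 4 lanes starting at row 1, column 0
var v = t.load[4](1, 0)                               # read the same 4 lanes back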

aligned_store

aligned_store[width: Int](self, m: Int, n: Int, val: SIMD[dtype, width])

Do a store with a specified alignment based on the dtype and simd width.

Parameters:

  • width (Int): The simd width of the stored value.

Args:

  • m (Int): The m dimensional index to the tensor.
  • n (Int): The n dimensional index to the tensor.
  • val (SIMD[dtype, width]): The value to be stored.

stack_allocation

static stack_allocation[*, alignment: Int = alignment]() -> Self

Allocates stack memory for a LayoutTensor. Expects layout to be fully static.

Parameters:

  • alignment (Int): Alignment of the allocation. It must be a multiple of the tensor's alignment, which is the minimum required by the architecture, instructions, etc.
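
A hedged sketch of requesting an over-aligned stack buffer (the keyword-parameter call syntax here is an assumption):

from layout import Layout, LayoutTensor

# Request 64-byte alignment for the stack buffer backing the tensor.
var t = LayoutTensor[DType.float32, Layout.row_major(8, 8)].stack_allocation[alignment=64]()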

shape

static shape[idx: Int]() -> Int

Returns the shape of the tensor at the given index.

Parameters:

  • idx (Int): The index to the shape of the tensor.

stride

static stride[idx: Int]() -> Int

Returns the stride of the tensor at the given index.

Parameters:

  • idx (Int): The index to the stride of the tensor.
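
For a row-major (5, 4) layout, a sketch of the static values these methods are expected to return (import path assumed):

from layout import Layout, LayoutTensor

alias T = LayoutTensor[DType.float32, Layout.row_major(5, 4)]
alias rows = T.shape[0]()         # 5
alias cols = T.shape[1]()         # 4
alias row_stride = T.stride[0]()  # 4: consecutive rows are 4 elements apart
alias col_stride = T.stride[1]()  # 1: consecutive columns are adjacent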

dim

dim(self, idx: Int) -> Int

Returns the dimension of the tensor at the given index.

Args:

  • idx (Int): The index to the dimension of the tensor.

coalesce

coalesce(self) -> LayoutTensor[dtype, coalesce(layout, False), coalesce(layout, False).rank(), address_space=address_space, element_layout=element_layout]

Returns a LayoutTensor with a coalesced Layout.

tile

tile[*tile_sizes: Int](self, *tile_coords: Int) -> LayoutTensor[dtype, _compute_tile_layout[*::Int]().__getitem__(0), _compute_tile_layout[*::Int]().__getitem__(0).rank(), address_space=address_space, element_layout=element_layout, masked=masked if masked else _tile_is_masked[layout::layout::Layout,*::Int]()]

Tiles the layout and returns a tensor tile with the specified tile_sizes at specific tile coordinates.

Example:

Memory layout of the tensor:
[1 2 3 4]
[2 3 4 5]
[5 4 3 2]
[1 1 1 1]

tile[2, 2](1, 0) will give you
[5 4]
[1 1]

Parameters:

  • *tile_sizes (Int): The tile sizes of the returned LayoutTensor.

Args:

  • *tile_coords (Int): The tile coordinates. These refer to the coordinates of the tile within the tiled layout; see the example above.
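
A hedged code sketch of the example above (import path assumed; fill only gives the data defined values):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(4, 4)].stack_allocation().fill(0)
# Tile coordinates index whole tiles: (1, 0) selects the tile in the second
# tile row and first tile column, i.e. rows 2-3 and columns 0-1 of the original.
var lower_left = t.tile[2, 2](1, 0)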

tiled_iterator

tiled_iterator[*tile_sizes: Int, *, axis: Int = 0](self, *tile_coords: Int) -> LayoutTensorIter[dtype, _compute_tile_layout[*::Int]().__getitem__(0), address_space=address_space, axis=OptionalReg(axis), layout_bitwidth=layout_bitwidth, masked=masked if masked else _tile_is_masked[layout::layout::Layout,*::Int]()]

Returns the tiled iterator of the LayoutTensor.

Parameters:

  • *tile_sizes (Int): Tile sizes of each tile the iterator will iterate through.
  • axis (Int): Axis of the LayoutTensor the iterator will iterate through.

Args:

  • *tile_coords (Int): The tile coordinate that the iterator will point to.

split

split[count: Int, axis: Int = 0](self) -> StaticTuple[LayoutTensor[dtype, _compute_tile_layout[::Int,::Int]().__getitem__(0), _compute_tile_layout[::Int,::Int]().__getitem__(0).rank(), address_space=address_space, element_layout=element_layout], count]

Split the LayoutTensor along an axis and return a StaticTuple of LayoutTensors.

Parameters:

  • count (Int): The number of portions to split into.
  • axis (Int): The axis along which the split is applied.

split[axis: Int = 0, alignment: Int = 1](self, count: Int, idx: Int) -> LayoutTensor[dtype, layout.make_shape_unknown[::Int](), layout.make_shape_unknown[::Int]().rank(), address_space=address_space, element_layout=element_layout]

distribute

distribute[threads_layout: Layout, axis: OptionalReg[Int] = OptionalReg(None), swizzle: OptionalReg[Swizzle] = OptionalReg(None), submode_axis: OptionalReg[Int] = OptionalReg(None)](self, thread_id: UInt) -> LayoutTensor[dtype, _compute_distribute_layout[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]().__getitem__(1), _compute_distribute_layout[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]().__getitem__(1).rank(), address_space=address_space, element_layout=element_layout, masked=masked if masked else _distribute_is_masked[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]()]

Distribute tiled workload to threads.

If the axis is given, for example, using axis = 0 for 4 threads arranged as

TH_0 TH_2
TH_1 TH_3

the tensor is only distributed to the threads along axis = 0, i.e., threads 0 and 1. Threads 2 and 3 get the same tiles as threads 0 and 1, respectively. This is useful when threads load the same vectors from a row in the A matrix and some threads share the same vector.

vectorize

vectorize[*vector_shape: Int](self) -> LayoutTensor[dtype, coalesce(_compute_tile_layout[*::Int]().__getitem__(1), True), coalesce(_compute_tile_layout[*::Int]().__getitem__(1), True).rank(), address_space=address_space, element_layout=_divide_tiles[*::Int]().__getitem__(0), masked=masked]

__compute_slice_layout

static __compute_slice_layout(d0_slice: Slice, d1_slice: Slice) -> Layout

static __compute_slice_layout(slice_0: Slice, slice_1: Slice, slice_0_axis: Int, slice_1_axis: Int) -> Layout

static __compute_slice_layout(slice_0: Slice, slice_0_axis: Int) -> Layout

slice

slice[d0_slice: Slice, d1_slice: Slice](self) -> LayoutTensor[dtype, __compute_slice_layout(d0_slice, d1_slice), __compute_slice_layout(d0_slice, d1_slice).rank(), address_space=address_space, element_layout=element_layout]

slice[d0_slice: Slice, d1_slice: Slice, slice_indices: IndexList[2], __offset_dims: Int = rank.__sub__(2)](self, offsets: IndexList[__offset_dims]) -> LayoutTensor[dtype, __compute_slice_layout(d0_slice, d1_slice, slice_indices.__getitem__[::Indexer](0), slice_indices.__getitem__[::Indexer](1)), __compute_slice_layout(d0_slice, d1_slice, slice_indices.__getitem__[::Indexer](0), slice_indices.__getitem__[::Indexer](1)).rank(), address_space=address_space, element_layout=element_layout]

slice_1d

slice_1d[d0_slice: Slice, slice_indices: IndexList[1], __offset_dims: Int = rank.__sub__(1)](self, offsets: IndexList[__offset_dims]) -> LayoutTensor[dtype, __compute_slice_layout(d0_slice, slice_indices.__getitem__[::Indexer](0)), __compute_slice_layout(d0_slice, slice_indices.__getitem__[::Indexer](0)).rank(), address_space=address_space, element_layout=element_layout]

transpose

transpose[M: Int = shape[::Int](), N: Int = shape[::Int]()](self) -> LayoutTensor[dtype, composition(layout, __init__[::Origin[{False}],::Origin[{False}]](IntTuple(N, M), IntTuple(M, 1))), composition(layout, __init__[::Origin[{False}],::Origin[{False}]](IntTuple(N, M), IntTuple(M, 1))).rank(), address_space=address_space, element_layout=element_layout]
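
A hedged sketch (the default M and N parameters are taken from the static shape; only the view changes, no data is moved; import path assumed):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(5, 4)].stack_allocation().fill(0)
var t_T = t.transpose()  # views the same memory as a (4, 5) tensor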

reshape

reshape[dst_layout: Layout](self) -> LayoutTensor[dtype, dst_layout, dst_layout.rank(), address_space=address_space, element_layout=element_layout, masked=masked]
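
A hedged sketch of reinterpreting the same elements under a new static layout (import path assumed):

from layout import Layout, LayoutTensor

var t = LayoutTensor[DType.float32, Layout.row_major(4, 4)].stack_allocation().fill(0)
var r = t.reshape[Layout.row_major(2, 8)]()  # same 16 elements viewed as 2x8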

composition

composition[rhs_layout: Layout, dst_layout: Layout = composition(layout, $0)](self) -> LayoutTensor[dtype, dst_layout, dst_layout.rank(), address_space=address_space, element_layout=element_layout]

distance

distance[_uint_dtype: DType = uint32 if address_space.__eq__(3) else uint64](self, addr: UnsafePointer[SIMD[dtype, 1], address_space=address_space]) -> SIMD[_uint_dtype, 1]

Returns the distance from the input address.

distance[_layout: Layout, _uint_dtype: DType = _get_unsigned_type($0, address_space)](self, src: LayoutTensor[dtype, _layout, _layout.rank(), address_space=address_space]) -> SIMD[_uint_dtype, 1]

Returns the distance from the input address.

__get_element_idx

__get_element_idx[elem_i: Int](self) -> Int

copy_from

copy_from(self, other: LayoutTensor[dtype, layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])

copy_from_async

copy_from_async[is_masked: Bool = False, swizzle: OptionalReg[Swizzle] = OptionalReg(None), fill: Fill = 0, eviction_policy: CacheEviction = 0](self, src: LayoutTensor[dtype, layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src_idx_bound: SIMD[_get_index_type(layout, address_space), 1] = SIMD(0), base_offset: SIMD[_get_unsigned_type(layout, address_space), 1] = SIMD(0))

fill

fill(self, val: SIMD[dtype, 1]) -> Self
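
For example, a minimal sketch; because fill returns the tensor itself, it can be chained onto stack_allocation (import path assumed):

from layout import Layout, LayoutTensor

var zeros = LayoutTensor[DType.float32, Layout.row_major(5, 4)].stack_allocation().fill(0)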

__str__

__str__(self) -> String

write_to

write_to[W: Writer](self, mut writer: W)

Format a 2D tensor in 2D; otherwise, print all values in column-major coordinate order.