Mojo struct

Info

@register_passable struct Info

Comprehensive information about a GPU architecture.

This struct contains detailed specifications about GPU capabilities, including compute units, memory, thread organization, and performance characteristics.

Fields

name (StringSlice[StaticConstantOrigin]): The model name of the GPU.
vendor (Vendor): The vendor/manufacturer of the GPU (e.g., NVIDIA, AMD).
api (StringSlice[StaticConstantOrigin]): The graphics/compute API supported by the GPU (e.g., CUDA, ROCm).
arch_name (StringSlice[StaticConstantOrigin]): The architecture name of the GPU (e.g., sm_80, gfx942).
compile_options (StringSlice[StaticConstantOrigin]): Compiler options specific to this GPU architecture.
compute (SIMD[float32, 1]): Compute capability version number for NVIDIA GPUs.
version (StringSlice[StaticConstantOrigin]): Version string of the GPU architecture.
sm_count (Int): Number of streaming multiprocessors (SMs) on the GPU.
warp_size (Int): Number of threads in a warp/wavefront.
threads_per_sm (Int): Maximum number of threads per streaming multiprocessor.
threads_per_warp (Int): Number of threads that execute together in a warp/wavefront.
warps_per_multiprocessor (Int): Maximum number of warps that can be active on a multiprocessor.
threads_per_multiprocessor (Int): Maximum number of threads that can be active on a multiprocessor.
thread_blocks_per_multiprocessor (Int): Maximum number of thread blocks that can be active on a multiprocessor.
shared_memory_per_multiprocessor (Int): Size of shared memory available per multiprocessor in bytes.
register_file_size (Int): Total size of the register file per multiprocessor in bytes.
register_allocation_unit_size (Int): Minimum allocation size for registers in bytes.
allocation_granularity (StringSlice[StaticConstantOrigin]): Description of how resources are allocated on the GPU.
max_registers_per_thread (Int): Maximum number of registers that can be allocated to a single thread.
max_registers_per_block (Int): Maximum number of registers that can be allocated to a thread block.
max_blocks_per_multiprocessor (Int): Maximum number of blocks that can be scheduled on a multiprocessor.
shared_memory_allocation_unit_size (Int): Minimum allocation size for shared memory in bytes.
warp_allocation_granularity (Int): Granularity at which warps are allocated resources.
max_thread_block_size (Int): Maximum number of threads allowed in a thread block.

Implemented traits

AnyType, Stringable, UnknownDestructibility, Writable

Methods

`__lt__`

__lt__(self, other: Self) -> Bool

Compares if this GPU has lower compute capability than another.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if this GPU has lower compute capability, False otherwise.

`__le__`

__le__(self, other: Self) -> Bool

Compares if this GPU has lower or equal compute capability.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if this GPU has lower or equal compute capability.

`__eq__`

__eq__(self, other: Self) -> Bool

Checks if two GPU Info instances represent the same GPU model.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if both instances represent the same GPU model.

`__ne__`

__ne__(self, other: Self) -> Bool

Checks if two GPU Info instances represent different GPU models.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if instances represent different GPU models.
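As a rough illustration of the equality semantics described for `__eq__` and `__ne__`, the Python sketch below models two records that compare equal when they describe the same GPU model. It assumes that "same GPU model" is decided by the `name` field alone; the source does not state which fields are compared. `GpuInfoModel` is a hypothetical stand-in, not the Mojo struct, and the field values are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GpuInfoModel:
    """Hypothetical stand-in for Info; only two fields are modeled."""

    name: str
    # Assumption: only the model name takes part in equality,
    # so this field is excluded from the generated __eq__.
    sm_count: int = field(compare=False)


a = GpuInfoModel(name="A100", sm_count=108)
b = GpuInfoModel(name="A100", sm_count=108)
c = GpuInfoModel(name="H100", sm_count=132)

print(a == b)  # True: same model name
print(a != c)  # True: different model names
```

Under this assumption, `__ne__` is simply the negation of `__eq__`, which matches the two descriptions above.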

`__gt__`

__gt__(self, other: Self) -> Bool

Compares if this GPU has higher compute capability than another.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if this GPU has higher compute capability, False otherwise.

`__ge__`

__ge__(self, other: Self) -> Bool

Compares if this GPU has higher or equal compute capability.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if this GPU has higher or equal compute capability.
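The ordering operators (`__lt__`, `__le__`, `__gt__`, `__ge__`) are all described in terms of compute capability. The Python sketch below is a minimal model of that behavior, assuming the comparison uses only the `compute` field. `GpuInfoModel` and `meets_minimum` are hypothetical names introduced for illustration, not part of the Mojo API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuInfoModel:
    """Hypothetical stand-in for Info; orders records by compute capability."""

    name: str
    compute: float

    # Assumption: each ordering operator compares only the compute field.
    def __lt__(self, other: "GpuInfoModel") -> bool:
        return self.compute < other.compute

    def __le__(self, other: "GpuInfoModel") -> bool:
        return self.compute <= other.compute

    def __gt__(self, other: "GpuInfoModel") -> bool:
        return self.compute > other.compute

    def __ge__(self, other: "GpuInfoModel") -> bool:
        return self.compute >= other.compute


def meets_minimum(device: GpuInfoModel, required: GpuInfoModel) -> bool:
    """Hypothetical helper: is the device at least as capable as required?"""
    return device >= required


a100 = GpuInfoModel(name="A100", compute=8.0)
h100 = GpuInfoModel(name="H100", compute=9.0)

print(a100 < h100)                # True
print(meets_minimum(h100, a100))  # True
print(meets_minimum(a100, h100))  # False
```

A typical use of such an ordering is gating a kernel variant on a minimum architecture, as `meets_minimum` does here.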

`__is__`

__is__(self, other: Self) -> Bool

Identity comparison operator for GPU Info instances.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if both instances represent the same GPU model.

`__isnot__`

__isnot__(self, other: Self) -> Bool

Negative identity comparison operator for GPU Info instances.

Args:

other (Self): Another GPU Info instance to compare against.

Returns:

True if instances represent different GPU models.

`target`

target(self) -> target

Gets the MLIR target configuration for this GPU.

Returns:

MLIR target configuration for the GPU.

`from_target`

static from_target[target: target]() -> Self

Creates an Info instance from an MLIR target.

Parameters:

target (target): MLIR target configuration.

Returns:

GPU info corresponding to the target.

`from_name`

static from_name[name: StringSlice[StaticConstantOrigin]]() -> Self

Creates an Info instance from a GPU architecture name.

Parameters:

name (StringSlice[StaticConstantOrigin]): GPU architecture name (e.g., "sm_80", "gfx942").

Returns:

GPU info corresponding to the architecture name.

`occupancy`

occupancy(self, *, threads_per_block: Int, registers_per_thread: Int) -> SIMD[float64, 1]

Calculates the theoretical occupancy for a given thread and register configuration.

Occupancy represents the ratio of active warps to the maximum possible warps on a streaming multiprocessor.

Note: TODO (KERN-795): Add occupancy calculations based on shared memory usage and thread block size, and use the minimum value.

Args:

threads_per_block (Int): Number of threads in each block.
registers_per_thread (Int): Number of registers used by each thread.

Returns:

Occupancy as a ratio between 0.0 and 1.0.
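To make the ratio concrete, the Python sketch below models a register-limited occupancy estimate: it counts how many blocks fit on one multiprocessor given warp slots and registers, then divides the resulting active warps by the maximum. This is an illustrative model under stated assumptions, not the library's implementation. The function name, parameter names, and default limits are hypothetical, and the model ignores register allocation granularity and shared memory.

```python
def occupancy_model(
    *,
    threads_per_block: int,
    registers_per_thread: int,
    threads_per_warp: int = 32,                 # illustrative default
    warps_per_multiprocessor: int = 64,         # illustrative default
    registers_per_multiprocessor: int = 65536,  # illustrative default
    max_blocks_per_multiprocessor: int = 32,    # illustrative default
) -> float:
    """Estimate active warps / maximum warps for one multiprocessor."""
    if threads_per_block <= 0 or registers_per_thread <= 0:
        raise ValueError("thread and register counts must be positive")

    # A partially filled warp still occupies a whole warp slot.
    warps_per_block = (threads_per_block + threads_per_warp - 1) // threads_per_warp

    # Blocks that fit when only warp slots are considered.
    blocks_by_warps = warps_per_multiprocessor // warps_per_block

    # Blocks that fit when only the register file is considered.
    registers_per_block = registers_per_thread * warps_per_block * threads_per_warp
    blocks_by_registers = registers_per_multiprocessor // registers_per_block

    # The tightest limit decides how many blocks are resident.
    active_blocks = min(
        blocks_by_warps, blocks_by_registers, max_blocks_per_multiprocessor
    )
    return active_blocks * warps_per_block / warps_per_multiprocessor


print(occupancy_model(threads_per_block=256, registers_per_thread=32))  # 1.0
print(occupancy_model(threads_per_block=256, registers_per_thread=64))  # 0.5
```

With the illustrative defaults, doubling register use from 32 to 64 per thread halves the number of resident blocks, so the estimate drops from 1.0 to 0.5.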

`write_to`

write_to[W: Writer](self, mut writer: W)

Writes GPU information to a writer.

Outputs all GPU specifications and capabilities to the provided writer in a human-readable format.

Parameters:

W (Writer): The type of writer to use for output. Must implement the Writer trait.

Args:

writer (W): A Writer instance to output the GPU information.

`__str__`

__str__(self) -> String

Returns a string representation of the GPU information.

Converts all GPU specifications and capabilities to a human-readable string format.

Returns:

String containing all GPU information.
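The relationship between `write_to` and `__str__` can be sketched in Python as follows. The sketch assumes that the string conversion is built by writing into an in-memory buffer through the same method that serves any other writer; the source does not confirm this delegation. `GpuInfoModel` and its output layout are hypothetical, and only two fields are modeled.

```python
import io


class GpuInfoModel:
    """Hypothetical stand-in for Info with writer-based formatting."""

    def __init__(self, name: str, sm_count: int) -> None:
        self.name = name
        self.sm_count = sm_count

    def write_to(self, writer) -> None:
        # Any object with a write(str) method works as the writer here.
        writer.write(f"name: {self.name}\n")
        writer.write(f"sm_count: {self.sm_count}\n")

    def __str__(self) -> str:
        # Assumption: the string form reuses write_to with an in-memory buffer.
        buffer = io.StringIO()
        self.write_to(buffer)
        return buffer.getvalue()


info = GpuInfoModel(name="A100", sm_count=108)
print(str(info), end="")
```

Keeping the formatting logic in one writer-based method means the same output can go to a string, a file, or a stream without intermediate copies.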
