
Mojo struct


@register_passable struct Info

Comprehensive information about a GPU architecture.

This struct contains detailed specifications about GPU capabilities, including compute units, memory, thread organization, and performance characteristics.


Fields

  • name (StringLiteral): The model name of the GPU.
  • vendor (Vendor): The vendor/manufacturer of the GPU (e.g., NVIDIA, AMD).
  • api (StringLiteral): The graphics/compute API supported by the GPU (e.g., CUDA, ROCm).
  • arch_name (StringLiteral): The architecture name of the GPU (e.g., sm_80, gfx942).
  • compile_options (StringLiteral): Compiler options specific to this GPU architecture.
  • compute (SIMD[float32, 1]): Compute capability version number for NVIDIA GPUs.
  • version (StringLiteral): Version string of the GPU architecture.
  • sm_count (Int): Number of streaming multiprocessors (SMs) on the GPU.
  • warp_size (Int): Number of threads in a warp/wavefront.
  • threads_per_sm (Int): Maximum number of threads per streaming multiprocessor.
  • threads_per_warp (Int): Number of threads that execute together in a warp/wavefront.
  • warps_per_multiprocessor (Int): Maximum number of warps that can be active on a multiprocessor.
  • threads_per_multiprocessor (Int): Maximum number of threads that can be active on a multiprocessor.
  • thread_blocks_per_multiprocessor (Int): Maximum number of thread blocks that can be active on a multiprocessor.
  • shared_memory_per_multiprocessor (Int): Size of shared memory available per multiprocessor in bytes.
  • register_file_size (Int): Total size of the register file per multiprocessor in bytes.
  • register_allocation_unit_size (Int): Minimum allocation size for registers in bytes.
  • allocation_granularity (StringLiteral): Description of how resources are allocated on the GPU.
  • max_registers_per_thread (Int): Maximum number of registers that can be allocated to a single thread.
  • max_registers_per_block (Int): Maximum number of registers that can be allocated to a thread block.
  • max_blocks_per_multiprocessor (Int): Maximum number of blocks that can be scheduled on a multiprocessor.
  • shared_memory_allocation_unit_size (Int): Minimum allocation size for shared memory in bytes.
  • warp_allocation_granularity (Int): Granularity at which warps are allocated resources.
  • max_thread_block_size (Int): Maximum number of threads allowed in a thread block.
  • flops (Flops): Floating-point operations per second capabilities for different precisions.

Implemented traits

AnyType, Copyable, ExplicitlyCopyable, Movable, UnknownDestructibility, Writable



Methods

__lt__(self, other: Self) -> Bool

Checks whether this GPU has a lower compute capability than another.


  • other (Self): Another GPU Info instance to compare against.


Returns True if this GPU has a lower compute capability than other, False otherwise.


__le__(self, other: Self) -> Bool

Checks whether this GPU has a lower or equal compute capability than another.


  • other (Self): Another GPU Info instance to compare against.


Returns True if this GPU has a lower or equal compute capability than other, False otherwise.


__eq__(self, other: Self) -> Bool

Checks if two GPU Info instances represent the same GPU model.


  • other (Self): Another GPU Info instance to compare against.


Returns True if both instances represent the same GPU model, False otherwise.


__ne__(self, other: Self) -> Bool

Checks if two GPU Info instances represent different GPU models.


  • other (Self): Another GPU Info instance to compare against.


Returns True if the instances represent different GPU models, False otherwise.


__gt__(self, other: Self) -> Bool

Checks whether this GPU has a higher compute capability than another.


  • other (Self): Another GPU Info instance to compare against.


Returns True if this GPU has a higher compute capability than other, False otherwise.


__ge__(self, other: Self) -> Bool

Checks whether this GPU has a higher or equal compute capability than another.


  • other (Self): Another GPU Info instance to compare against.


Returns True if this GPU has a higher or equal compute capability than other, False otherwise.


__is__(self, other: Self) -> Bool

Identity comparison operator for GPU Info instances.


  • other (Self): Another GPU Info instance to compare against.


Returns True if both instances represent the same GPU model, False otherwise.


__isnot__(self, other: Self) -> Bool

Negative identity comparison operator for GPU Info instances.


  • other (Self): Another GPU Info instance to compare against.


Returns True if the instances represent different GPU models, False otherwise.
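
The comparison methods above use two different keys: the ordering operators compare compute capability, while equality and identity compare the GPU model. The following Python sketch models that behaviour, assuming the methods follow their descriptions literally. `InfoModel` and the GPU names are hypothetical stand-ins, not the Mojo implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class InfoModel:
    """Hypothetical stand-in for Info with only the fields the comparisons read."""

    name: str       # model name, the key for equality
    compute: float  # compute capability, the key for ordering

    # Equality follows the GPU model name.
    def __eq__(self, other):
        return self.name == other.name

    def __ne__(self, other):
        return self.name != other.name

    def __hash__(self):
        return hash(self.name)

    # Ordering follows the compute capability.
    def __lt__(self, other):
        return self.compute < other.compute

    def __le__(self, other):
        return self.compute <= other.compute

    def __gt__(self, other):
        return self.compute > other.compute

    def __ge__(self, other):
        return self.compute >= other.compute


older = InfoModel("gpu_a", 8.0)    # hypothetical model names
newer = InfoModel("gpu_b", 9.0)
sibling = InfoModel("gpu_c", 8.0)  # different model, same compute capability

print(older < newer)                          # True
print(older <= sibling and older >= sibling)  # True
print(older == sibling)                       # False
```

Under this reading, two different models with the same compute capability satisfy both `<=` and `>=` without being equal, so code that needs "same GPU" should use `==` rather than infer it from the ordering operators.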


target[index_bit_width: Int = 64](self) -> target

Gets the MLIR target configuration for this GPU.


  • index_bit_width (Int): The bit width for indices (default: 64).


Returns the MLIR target configuration for the GPU.


static from_target[target: target]() -> Self

Creates an Info instance from an MLIR target.


  • target (target): MLIR target configuration.


Returns the GPU info corresponding to the target.


static from_name[name: StringLiteral]() -> Self

Creates an Info instance from a GPU architecture name.


  • name (StringLiteral): GPU architecture name (e.g., "sm_80", "gfx942").


Returns the GPU info corresponding to the architecture name.


occupancy(self, *, threads_per_block: Int, registers_per_thread: Int) -> SIMD[float64, 1]

Calculates the theoretical occupancy for a given thread and register configuration.

Occupancy represents the ratio of active warps to the maximum possible warps on a streaming multiprocessor.

Note: TODO (KERN-795): Add occupancy calculations based on shared memory usage and thread block size, and take the minimum value.


  • threads_per_block (Int): Number of threads in each block.
  • registers_per_thread (Int): Number of registers used by each thread.


Returns the occupancy as a ratio between 0.0 and 1.0.
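
To make the ratio concrete, here is a minimal Python sketch of a register-limited occupancy estimate. It is a common formulation, not the library's implementation. The function name, parameter names, and default hardware limits are illustrative assumptions, and the register file is counted in registers rather than bytes.

```python
def round_up(value: int, unit: int) -> int:
    """Round value up to the next multiple of unit."""
    return -(-value // unit) * unit


def theoretical_occupancy(
    *,
    threads_per_block: int,
    registers_per_thread: int,
    threads_per_warp: int = 32,      # assumed warp size
    max_warps_per_sm: int = 64,      # assumed warp slots per multiprocessor
    registers_per_sm: int = 65536,   # assumed register file, in registers
    register_alloc_unit: int = 256,  # assumed per-warp allocation unit
) -> float:
    """Estimate active warps / maximum warps on one multiprocessor."""
    if threads_per_block <= 0 or registers_per_thread <= 0:
        raise ValueError("threads_per_block and registers_per_thread must be positive")

    # A block occupies whole warps, so round the thread count up.
    warps_per_block = -(-threads_per_block // threads_per_warp)

    # Registers are allocated per warp in multiples of the allocation unit.
    regs_per_warp = round_up(registers_per_thread * threads_per_warp, register_alloc_unit)
    regs_per_block = regs_per_warp * warps_per_block

    # Resident blocks are capped by the register file and by the warp slots.
    blocks_by_registers = registers_per_sm // regs_per_block
    blocks_by_warps = max_warps_per_sm // warps_per_block

    active_warps = min(blocks_by_registers, blocks_by_warps) * warps_per_block
    return active_warps / max_warps_per_sm


print(theoretical_occupancy(threads_per_block=256, registers_per_thread=32))  # 1.0
print(theoretical_occupancy(threads_per_block=256, registers_per_thread=64))  # 0.5
```

With these assumed limits, doubling the registers per thread from 32 to 64 halves the number of blocks that fit in the register file, so the estimated occupancy drops from 1.0 to 0.5.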


write_to[W: Writer](self, mut writer: W)

Writes GPU information to a writer.

Outputs all GPU specifications and capabilities to the provided writer in a human-readable format.


  • W (Writer): The type of the writer, given as a compile-time parameter. Must implement the Writer trait.


  • writer (W): The Writer instance that receives the GPU information.


__str__(self) -> String

Returns a string representation of the GPU information.

Converts all GPU specifications and capabilities to a human-readable string format.


Returns a String containing all GPU information.