Mojo struct

ContinuousBatchingKVCacheCollection

struct ContinuousBatchingKVCacheCollection[type_: DType, kv_params_: KVCacheStaticParams, assert_write_mode: Int = 0]

This is a "view" of the cache for the given sequences in the batch.

This object does not own the underlying buffers in k_cache and v_cache; it borrows them from the BlockWrappers in our KVCacheManager. It does own the Pointer[NDBuffer[type, 3]] and the valid_lengths buffer.
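The borrowing relationship described above can be illustrated with a small conceptual model. This is a Python sketch, not Mojo and not the library's API: `CacheView`, `block_for`, and `owner_blocks` are hypothetical names, and the assumption that `lookup_table` maps a batch index to a block index is a guess for illustration that this page does not confirm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheView:
    """Hypothetical stand-in for a non-owning view over cache blocks."""

    blocks: List[List[float]]  # borrowed: the same objects the owner holds
    cache_lengths: List[int]   # cached token count per sequence in the batch
    lookup_table: List[int]    # ASSUMED: batch index -> index into blocks

    def cache_length(self, bs_idx: int) -> int:
        # Number of tokens already cached for the given batch entry.
        return self.cache_lengths[bs_idx]

    def block_for(self, bs_idx: int) -> List[float]:
        # Resolve a batch entry to its block without copying the block.
        return self.blocks[self.lookup_table[bs_idx]]


# The "manager" side owns the block storage.
owner_blocks = [[0.0] * 4 for _ in range(3)]

# A view for a batch of two sequences stored in blocks 2 and 0.
view = CacheView(blocks=owner_blocks, cache_lengths=[3, 1], lookup_table=[2, 0])

# Writing through the view changes the owner's storage: nothing was copied.
view.block_for(0)[0] = 1.5
```

In this model the view stays cheap to construct and copy because it holds only references plus small per-batch metadata, which is the property the "view" wording above suggests.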

Fields

  • cache_lengths (NDBuffer[uint32, 1, MutableAnyOrigin]):
  • lookup_table (NDBuffer[uint32, 1, MutableAnyOrigin]):
  • blocks (NDBuffer[type_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()]):
  • max_seq_length (SIMD[uint32, 1]):
  • max_cache_length (SIMD[uint32, 1]):
  • kv_cache_dynamic_shape (IndexList[4]):
  • kv_cache_dynamic_strides (IndexList[4]):

Implemented traits

AnyType, Copyable, KVCollectionT, Movable, UnknownDestructibility

Aliases

blocks_shape

alias blocks_shape = DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size))

blocks_stride

alias blocks_stride = _strides_from_shape[::DimList,::Int]()

blocks_type

alias blocks_type = NDBuffer[type_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()]

CacheType

alias CacheType = ContinuousBatchingKVCache[type_, kv_params_, assert_write_mode]

kv_params

alias kv_params = kv_params_

name_str

alias name_str = "continuous_batching"

type

alias type = type_

Methods

__init__

__init__(out self, blocks: NDBuffer[type_, 6, MutableAnyOrigin], cache_lengths: NDBuffer[uint32, 1, MutableAnyOrigin], lookup_table: NDBuffer[uint32, 1, MutableAnyOrigin], max_seq_length: SIMD[uint32, 1], max_cache_length: SIMD[uint32, 1])

copy

copy(self) -> Self

Explicitly construct a copy of self.

Returns:

A copy of this value.

get_key_cache

get_key_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[type_, kv_params_, assert_write_mode]

get_value_cache

get_value_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[type_, kv_params_, assert_write_mode]

cache_length

cache_length(self, bs_idx: Int) -> Int