Python module

manager

KVCacheInputs

class max.nn.kv_cache.manager.KVCacheInputs

Base class for the tensor inputs associated with a KV cache.

It is meant to be subclassed by concrete KV cache input types. For example, here’s a derived class for a text KV cache manager:

@dataclass
class RaggedKVCacheInputs(KVCacheInputs):
    blocks: Tensor
    cache_lengths: Tensor
    lookup_table: Tensor
    max_lengths: Tensor
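
The subclassing pattern above can be tried without the library installed. The following is a minimal sketch under stated assumptions: `Tensor` and `KVCacheInputs` are local stand-ins rather than the real types from max, the field values are arbitrary placeholders, and `is_kv_cache_input` is a hypothetical helper added for illustration.

```python
from dataclasses import dataclass

# Stand-in for the library's Tensor type (assumption: a plain list is
# enough to illustrate the structure).
Tensor = list


@dataclass
class KVCacheInputs:
    """Stand-in for the base class documented above."""


@dataclass
class RaggedKVCacheInputs(KVCacheInputs):
    blocks: Tensor
    cache_lengths: Tensor
    lookup_table: Tensor
    max_lengths: Tensor


def is_kv_cache_input(obj: object) -> bool:
    # Hypothetical helper: code written against the base class accepts
    # any derived input type.
    return isinstance(obj, KVCacheInputs)


# Placeholder values only; real inputs would be tensors.
ragged = RaggedKVCacheInputs(
    blocks=[0.0, 0.0],
    cache_lengths=[3],
    lookup_table=[0],
    max_lengths=[8],
)
```

Because the derived class inherits from the base, code that only needs "some KV cache inputs" can type against `KVCacheInputs` and still receive a `RaggedKVCacheInputs` instance.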

KVCacheInputsSequence

class max.nn.kv_cache.manager.KVCacheInputsSequence(kv_cache_inputs)

KVCacheInputsSequence is a sequence of KVCacheInputs.

It is primarily used in our multistep execution to represent batched KVCacheInputs.

Parameters:

kv_cache_inputs (Sequence[KVCacheInputs])

kv_cache_inputs

kv_cache_inputs: Sequence[KVCacheInputs]
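
To illustrate the container shape, here is a minimal sketch that wraps several input objects in a single sequence holder. All classes are local stand-ins, not the library's own; `StepInputs` and its `step` field are hypothetical, and treating each entry as one step of a multistep run is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class KVCacheInputs:
    """Stand-in for the documented base class."""


@dataclass
class StepInputs(KVCacheInputs):
    """Hypothetical derived type used only for this illustration."""

    step: int


@dataclass
class KVCacheInputsSequence:
    """Stand-in mirroring the single documented field."""

    kv_cache_inputs: Sequence[KVCacheInputs]


# Assumption: one entry per step of a multistep run.
seq = KVCacheInputsSequence(
    kv_cache_inputs=[StepInputs(step=i) for i in range(3)]
)

# Consumers iterate over the wrapped inputs in order.
steps = [inp.step for inp in seq.kv_cache_inputs]
```

Holding the inputs behind one field lets a caller pass a whole batch of input objects as a single argument while keeping each element individually accessible.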

RaggedKVCacheInputs

class max.nn.kv_cache.manager.RaggedKVCacheInputs(blocks, cache_lengths, lookup_table, max_lengths)

RaggedKVCacheInputs holds the KV cache inputs used together with ragged tensors.

Parameters:

blocks (Tensor)
cache_lengths (Tensor)
lookup_table (Tensor)
max_lengths (Tensor)

blocks

blocks: Tensor

cache_lengths

cache_lengths: Tensor

lookup_table

lookup_table: Tensor

max_lengths

max_lengths: Tensor
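
The constructor signature lists the four fields in a fixed order, so positional construction follows that order. A minimal sketch, assuming a local stand-in dataclass and a list-based `Tensor` placeholder rather than the library's real types:

```python
from dataclasses import dataclass, fields

# Stand-in for the library's Tensor type (assumption).
Tensor = list


@dataclass
class RaggedKVCacheInputs:
    """Stand-in with the same four fields, in the documented order."""

    blocks: Tensor
    cache_lengths: Tensor
    lookup_table: Tensor
    max_lengths: Tensor


# Positional arguments map to fields in declaration order.
# The values are arbitrary placeholders.
inputs = RaggedKVCacheInputs([0.0], [3], [0], [8])

# Introspect the field names in declaration order.
field_names = [f.name for f in fields(inputs)]
```

Keyword arguments are usually the safer choice here, since all four fields share one type and a swapped positional argument would not be caught by a type checker.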