Python module
manager
KVCacheInputs
class max.nn.kv_cache.manager.KVCacheInputs
A base class that holds KV cache related (Tensor) inputs.
It is meant to be subclassed by concrete KV cache input types. For example, here’s a derived class for a text KV cache manager:
@dataclass
class RaggedKVCacheInputs(KVCacheInputs):
    blocks: Tensor
    cache_lengths: Tensor
    lookup_table: Tensor
    max_lengths: Tensor
KVCacheInputsSequence
class max.nn.kv_cache.manager.KVCacheInputsSequence(kv_cache_inputs)
KVCacheInputsSequence is a sequence of KVCacheInputs.
It is primarily used in our multistep execution to represent batched KVCacheInputs.
Parameters:
kv_cache_inputs (Sequence[KVCacheInputs])
kv_cache_inputs
kv_cache_inputs: Sequence[KVCacheInputs]
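To make the container relationship concrete, here is a minimal sketch that does not import MAX. The classes below are local stand-ins that only mirror the documented kv_cache_inputs field; StepInputs and its step field are hypothetical and exist purely for illustration. How the real class is used during multistep execution is not shown here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class KVCacheInputs:
    """Local stand-in for the documented base class (not the MAX class)."""


@dataclass
class StepInputs(KVCacheInputs):
    """Hypothetical subclass; `step` is an illustrative field only."""

    step: int


@dataclass
class KVCacheInputsSequence:
    """Local stand-in mirroring the documented `kv_cache_inputs` field."""

    kv_cache_inputs: Sequence[KVCacheInputs]


# Wrap one KVCacheInputs object per step in a single container.
batched = KVCacheInputsSequence(
    kv_cache_inputs=[StepInputs(step=0), StepInputs(step=1)]
)

# Read the wrapped inputs back in order.
steps = [item.step for item in batched.kv_cache_inputs]
```

The wrapper adds no behavior of its own in this sketch; it only groups several inputs objects so they can be passed around as one value.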
RaggedKVCacheInputs
class max.nn.kv_cache.manager.RaggedKVCacheInputs(blocks, cache_lengths, lookup_table, max_lengths)
RaggedKVCacheInputs is a class that holds the inputs for the KV cache when used together with ragged tensors.
blocks
blocks: Tensor
cache_lengths
cache_lengths: Tensor
lookup_table
lookup_table: Tensor
max_lengths
max_lengths: Tensor
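The constructor signature above takes the four fields positionally, in the order blocks, cache_lengths, lookup_table, max_lengths. The sketch below illustrates that ordering with local stand-in classes rather than the MAX types: Tensor is aliased to Any, and the list values are arbitrary placeholders, not meaningful cache contents.

```python
from dataclasses import dataclass, fields
from typing import Any

# Assumption: stand-in for the real Tensor type so the sketch runs without MAX.
Tensor = Any


@dataclass
class KVCacheInputs:
    """Local stand-in for the documented base class (not the MAX class)."""


@dataclass
class RaggedKVCacheInputs(KVCacheInputs):
    """Local stand-in with the four documented fields, in documented order."""

    blocks: Tensor
    cache_lengths: Tensor
    lookup_table: Tensor
    max_lengths: Tensor


# Placeholder values only; real inputs would be tensors.
inputs = RaggedKVCacheInputs([[0.0]], [3], [[0]], [[1, 3]])

# Dataclass field order matches the positional constructor order.
field_names = [f.name for f in fields(inputs)]
```

Passing the fields by keyword instead (for example blocks=..., cache_lengths=...) avoids depending on the positional order.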