Python module
cache_params
KVCacheParams
class max.pipelines.kv_cache.cache_params.KVCacheParams(dtype: DType, n_kv_heads: int, head_dim: int, enable_prefix_caching: bool = False, cache_strategy: KVCacheStrategy = KVCacheStrategy.CONTINUOUS, n_devices: int = 1)
dtype_shorthand
property dtype_shorthand*: str*
The textual representation in shorthand of the dtype.
static_cache_shape
property static_cache_shape*: tuple[str, str, str, str, str]*
KVCacheStrategy
class max.pipelines.kv_cache.cache_params.KVCacheStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
CONTINUOUS
CONTINUOUS = 'continuous'
NAIVE
NAIVE = 'naive'
PAGED
PAGED = 'paged'
uses_opaque()
uses_opaque() → bool
Was this page helpful?
Thank you! We'll create more content like this.
Thank you for helping us improve!