Python module
attention_without_mask
An opaque, KV-cache-optimized vanilla attention mechanism, with mask variants provided inside the kernel.
AttentionWithoutMask
class max.pipelines.nn.attention.attention_without_mask.AttentionWithoutMask(n_heads: int, kv_params: max.pipelines.kv_cache.cache_params.KVCacheParams, layer_idx: max.graph.value.TensorValue, wqkv: max.graph.value.TensorValue, wo: max.pipelines.nn.linear.Linear, mask_variant: max.pipelines.nn.kernels.MHAMaskVariant)
mask_variant
mask_variant: MHAMaskVariant
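The idea of selecting the mask inside the kernel, rather than having callers build and pass a mask tensor, can be illustrated with a minimal sketch. This is not the MAX implementation; the `MaskVariant` enum and `attention` function below are hypothetical stand-ins for `MHAMaskVariant` and the fused kernel, written in plain NumPy:

```python
# Conceptual sketch (not the MAX kernel): attention where the mask is chosen
# by an enum and constructed inside the function, so callers never supply one.
from enum import Enum

import numpy as np


class MaskVariant(Enum):  # hypothetical stand-in for MHAMaskVariant
    NULL_MASK = 0
    CAUSAL_MASK = 1


def attention(q, k, v, mask_variant=MaskVariant.CAUSAL_MASK):
    """Scaled dot-product attention; the mask is built internally."""
    seq_len, head_dim = q.shape[-2], q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(head_dim)
    if mask_variant is MaskVariant.CAUSAL_MASK:
        # Forbid attending to future positions: -inf above the diagonal.
        causal = np.triu(np.full((seq_len, seq_len), -np.inf), k=1)
        scores = scores + causal
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `CAUSAL_MASK`, the first query position can only attend to the first key, so its output row equals the first value row; with `NULL_MASK`, every position attends to the full sequence.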