Mojo module
types
This module contains the types for the key-value cache APIs.
The module includes structs implementing several different types of KV caches.
This module defines two traits that define the roles of the different structs:
- KVCacheT: Defines the interface for a single (key or value) cache.
- KVCollectionT: Defines the interface for a pair of caches (keys and values).
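The split between the two roles can be pictured with a small conceptual sketch. It is written in Python rather than Mojo, and every name in it (`ToyCache`, `ToyCollection`, `get_key_cache`, `get_value_cache`, `append`, `length`) is hypothetical and chosen only for illustration. It does not reproduce the actual trait methods of this module.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Plays the single-cache role: holds either keys or values for one layer."""
    entries: list = field(default_factory=list)

    def append(self, vector):
        self.entries.append(vector)

    def length(self):
        return len(self.entries)


@dataclass
class ToyCollection:
    """Plays the collection role: pairs a key cache with a value cache per layer."""
    num_layers: int
    keys: list = field(init=False)
    values: list = field(init=False)

    def __post_init__(self):
        # One independent key cache and one value cache for every layer.
        self.keys = [ToyCache() for _ in range(self.num_layers)]
        self.values = [ToyCache() for _ in range(self.num_layers)]

    def get_key_cache(self, layer_idx):
        return self.keys[layer_idx]

    def get_value_cache(self, layer_idx):
        return self.values[layer_idx]


collection = ToyCollection(num_layers=2)
collection.get_key_cache(0).append([0.1, 0.2])
collection.get_value_cache(0).append([0.3, 0.4])
```

In this sketch, code that only needs keys or only needs values works against the single-cache object, while code that manages a whole model holds the collection and asks it for the cache of a given layer.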
Structs
- ContinuousBatchingKVCache: Wrapper for the ContinuousKVCache of a given layer in the transformer model.
- ContinuousBatchingKVCacheCollection: This is a "view" of the cache for the given sequences in the batch.
- KVCacheStaticParams
- PagedKVCache: The PagedKVCache is a wrapper around the KVCache blocks for a given layer. It is used to access the KVCache blocks for PagedAttention.
- PagedKVCacheCollection
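As a rough illustration of what accessing cache blocks means in a paged scheme, the following Python sketch maps a token position to a block and a slot through a per-sequence lookup table. It shows the general paging idea only. The class `ToyPagedCache`, the constant `PAGE_SIZE`, and all method names are assumptions made for this example and do not describe the memory layout or API of PagedKVCache.

```python
PAGE_SIZE = 4  # tokens per block; an arbitrary value chosen for this sketch


class ToyPagedCache:
    def __init__(self, num_blocks):
        # Physical storage: num_blocks blocks, each holding PAGE_SIZE token slots.
        self.blocks = [[None] * PAGE_SIZE for _ in range(num_blocks)]
        # Per-sequence lookup table: logical block number -> physical block id.
        self.block_tables = {}

    def assign(self, seq_id, physical_block_ids):
        self.block_tables[seq_id] = list(physical_block_ids)

    def _locate(self, seq_id, token_idx):
        # Split the token position into a logical block number and a slot offset.
        logical_block, offset = divmod(token_idx, PAGE_SIZE)
        return self.block_tables[seq_id][logical_block], offset

    def store(self, seq_id, token_idx, value):
        block_id, offset = self._locate(seq_id, token_idx)
        self.blocks[block_id][offset] = value

    def load(self, seq_id, token_idx):
        block_id, offset = self._locate(seq_id, token_idx)
        return self.blocks[block_id][offset]


cache = ToyPagedCache(num_blocks=8)
cache.assign(seq_id=0, physical_block_ids=[5, 2])  # blocks need not be contiguous
cache.store(seq_id=0, token_idx=6, value="k6")     # lands in block 2, slot 2
```

Because the sequence only refers to blocks through its table, blocks can be handed out on demand as the sequence grows, without reserving one large contiguous region up front.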
Traits
- KVCacheT: Trait for different KVCache types and implementations.
- KVCollectionT: Trait for a pair of caches (keys and values).