Python module
hf_utils
Utilities for interacting with HuggingFace Files/Repos.
HuggingFaceFile
class max.pipelines.hf_utils.HuggingFaceFile(repo_id: str, filename: str, revision: str | None = None)
A simple object for tracking Hugging Face model metadata. The repo_id will frequently be used to load a tokenizer, whereas the filename is used to download model weights.
download()
Download the file and return the file path where the data is saved locally.
exists()
exists() → bool
filename
filename: str
repo_id
repo_id: str
revision
size()
HuggingFaceRepo
class max.pipelines.hf_utils.HuggingFaceRepo(repo_id: str, revision: str = 'main', trust_remote_code: bool = False, repo_type: RepoType | None = None)
A class for interacting with HuggingFace Repos.
download()
download(filename: str, force_download: bool = False) → Path
encoding_for_file()
file_exists()
files_for_encoding()
files_for_encoding(encoding: SupportedEncoding, weights_format: WeightsFormat | None = None, alternate_encoding: SupportedEncoding | None = None) → dict[max.graph.weights.format.WeightsFormat, list[pathlib.Path]]
formats_available
property formats_available: list[max.graph.weights.format.WeightsFormat]
info
property info: ModelInfo
repo_id
repo_id: str
The HuggingFace repo ID. Despite the name, repo_id can be either a remote HuggingFace repo identifier or a local path.
repo_type
repo_type: RepoType | None = None
The type of repo. This is inferred from the repo_id.
revision
revision: str = 'main'
The revision to use for the repo.
size_of()
supported_encodings
property supported_encodings: list[max.pipelines.config_enums.SupportedEncoding]
trust_remote_code
trust_remote_code: bool = False
Whether to trust remote code.
weight_files
property weight_files: dict[max.graph.weights.format.WeightsFormat, list[str]]
download_weight_files()
max.pipelines.hf_utils.download_weight_files(huggingface_model_id: str, filenames: list[str], revision: str | None = None, force_download: bool = False, max_workers: int = 8) → list[pathlib.Path]
Given a HuggingFace model ID and a list of filenames, download the weight files and return the list of local paths.
Parameters:
- huggingface_model_id – The HuggingFace model identifier, e.g. modularai/llama-3.1.
- filenames – A list of file paths relative to the root of the HuggingFace repo. If files provided are available locally, download is skipped, and the local files are used.
- revision – The HuggingFace revision to use. If provided, the local cache is checked first without contacting HuggingFace, which saves a network call.
- force_download – Whether to force the files to be redownloaded, even if they are already available in the local cache or at a provided path.
- max_workers – The number of worker threads used to download files concurrently.
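The interaction between the local-file check, force_download, and max_workers can be illustrated with a small stand-alone sketch. This is not the library's implementation: fetch_weight_files, fetch_one, and local_dir are hypothetical names introduced here, and the use of ThreadPoolExecutor and of input-order results is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def fetch_weight_files(
    filenames: list[str],
    fetch_one: Callable[[str], Path],  # hypothetical: downloads one file, returns its local path
    local_dir: Path,                   # hypothetical: directory where already-present files live
    force_download: bool = False,
    max_workers: int = 8,
) -> list[Path]:
    """Resolve each filename to a local path, downloading only when needed."""

    def resolve(name: str) -> Path:
        local = local_dir / name
        # Skip the download when the file is already present,
        # unless the caller explicitly forces a re-download.
        if local.exists() and not force_download:
            return local
        return fetch_one(name)

    # Run up to max_workers resolutions at once; map() returns
    # results in the same order as the input filenames.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(resolve, filenames))
```

In this sketch, files that are already present never reach fetch_one, so a fully cached model costs no network traffic unless force_download is set.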
repo_exists_with_retry()
max.pipelines.hf_utils.repo_exists_with_retry(repo_id: str, revision: str) → bool
Wrapper around huggingface_hub.revision_exists with retry logic. Uses exponential backoff with 25% jitter, starting at 1s and doubling each retry.
We use revision_exists here instead of repo_exists because repo_exists does not accept a revision parameter.
See huggingface_hub.revision_exists for details.
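The backoff schedule described above (start at 1s, double each retry, 25% jitter) can be sketched as follows. This is a minimal illustration, not the function's actual code: backoff_delays and call_with_retry are hypothetical helpers, the jitter is assumed to be symmetric (plus or minus 25%), and the attempt limit and the set of retried exceptions are also assumptions.

```python
import random
import time
from typing import Callable, Optional


def backoff_delays(
    max_attempts: int,
    base: float = 1.0,
    jitter: float = 0.25,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Return the sleep before each retry: base * 2**i, scaled by up to +/- jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_attempts - 1):
        nominal = base * (2 ** attempt)  # 1s, 2s, 4s, ...
        delays.append(nominal * (1 + rng.uniform(-jitter, jitter)))
    return delays


def call_with_retry(
    check: Callable[[], bool],
    max_attempts: int = 5,  # assumption: the source does not state a limit
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check(), retrying on any exception with exponential backoff."""
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        try:
            return check()
        except Exception:
            # Give up after the final attempt; otherwise wait and retry.
            if attempt == max_attempts - 1:
                raise
            sleep(delays[attempt])
    raise RuntimeError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the sketch easy to exercise without real waiting; in practice, check would wrap a call such as huggingface_hub.revision_exists for the repo_id and revision of interest.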