vllm.tokenizers.hf ¶
HfTokenizer module-attribute ¶
HfTokenizer: TypeAlias = (
    PreTrainedTokenizer | PreTrainedTokenizerFast
)
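The alias can be used anywhere the slow and fast Hugging Face tokenizer classes are interchangeable. A minimal sketch, assuming the alias is importable from this module; count_tokens is a hypothetical helper, not part of vLLM:

from vllm.tokenizers.hf import HfTokenizer

def count_tokens(tokenizer: HfTokenizer, text: str) -> int:
    # Hypothetical helper: both tokenizer classes are callable and return a
    # BatchEncoding whose `input_ids` holds the encoded token ids.
    return len(tokenizer(text).input_ids)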
CachedHfTokenizer ¶
Bases: TokenizerLike
from_pretrained classmethod ¶
from_pretrained(
    path_or_repo_id: str | Path,
    *args,
    trust_remote_code: bool = False,
    revision: str | None = None,
    download_dir: str | None = None,
    **kwargs,
) -> HfTokenizer
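A minimal usage sketch, assuming CachedHfTokenizer is importable from this module; the repo id is illustrative and can be replaced by any Hugging Face repo id or local path:

from vllm.tokenizers.hf import CachedHfTokenizer

# Illustrative repo id; a local directory containing tokenizer files also works.
tokenizer = CachedHfTokenizer.from_pretrained(
    "facebook/opt-125m",
    trust_remote_code=False,  # enable only for repositories you trust
    revision=None,            # optionally pin a branch, tag, or commit
)
token_ids = tokenizer("Hello, world!").input_ids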
get_cached_tokenizer ¶
get_cached_tokenizer(tokenizer: HfTokenizer) -> HfTokenizer
By default, transformers recomputes several tokenizer properties each time they are accessed, leading to a significant slowdown. This proxy caches those properties for faster access.
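A minimal usage sketch, assuming the function is importable from this module; the repo id is illustrative:

from transformers import AutoTokenizer

from vllm.tokenizers.hf import get_cached_tokenizer

# Illustrative repo id; any transformers tokenizer can be wrapped.
hf_tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
cached = get_cached_tokenizer(hf_tokenizer)

# The wrapped tokenizer behaves like the original, but repeated lookups of
# properties such as the special tokens no longer trigger recomputation.
special_tokens = cached.all_special_tokens
vocab_size = len(cached)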