matx.text.wordpiece_tokenizer module¶
- class matx.text.wordpiece_tokenizer.WordPieceTokenizer(vocab_path: str, lookup_id: bool = True, unk_token: Any = '[UNK]', subwords_prefix: str = '##', skip_empty: bool = True, max_bytes_per_token: int = 100)[source]¶
Bases: object
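The autodoc entry above exposes only the constructor signature, so as a rough illustration here is a sketch of the greedy longest-match-first WordPiece algorithm that a tokenizer with these parameters performs. This is a hypothetical pure-Python sketch, not the matx implementation: the `wordpiece_tokenize` function and `demo_vocab` are illustrative names, and the handling of `unk_token`, `subwords_prefix`, and `max_bytes_per_token` is assumed from the parameter names.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]",
                       subwords_prefix="##", max_bytes_per_token=100):
    """Greedy longest-match-first WordPiece split of a single word (sketch)."""
    # Assumption: words longer than max_bytes_per_token bytes collapse to the
    # unknown token, mirroring the constructor's max_bytes_per_token parameter.
    if len(word.encode("utf-8")) > max_bytes_per_token:
        return [unk_token]
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, shrinking until a match.
        end = len(word)
        match = None
        while start < end:
            piece = word[start:end]
            if start > 0:  # non-initial pieces carry the subwords_prefix
                piece = subwords_prefix + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:  # no vocabulary entry covers this position
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces

# Tiny hypothetical vocabulary; real use would load it from vocab_path.
demo_vocab = {"un": 0, "##aff": 1, "##able": 2}
```

With `lookup_id=True` the real tokenizer presumably maps each emitted piece to its integer id via the vocabulary (e.g. `[demo_vocab[p] for p in pieces]`); with `lookup_id=False` it would return the piece strings themselves.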
- class matx.text.wordpiece_tokenizer.WordPieceTokenizerImpl(vocab_path: str, lookup_id: bool = True, unk_token: Any = '[UNK]', subwords_prefix: str = '##', skip_empty: bool = True, max_bytes_per_token: int = 100)[source]¶
Bases: object