Tokenizer

public interface Tokenizer

Tokenizes a string into multiple tokens.

Functions

public List<List<String>> batchSplit(List<String> texts)

Splits a batch of texts in a single call, returning one token list per input text. This is more efficient for native tokenizers, e.g. HuggingFaceTokenizer.

public abstract List<String> split(String text)
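
The two signatures above can be illustrated with a minimal sketch. It is not the library's implementation: the nested `Tokenizer` interface is a local stand-in that mirrors the documented signatures, the default body of `batchSplit` (calling `split` once per text) is an assumption, and `WhitespaceTokenizer` is a hypothetical implementor invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TokenizerSketch {

    // Local stand-in mirroring the documented signatures (assumption:
    // batchSplit falls back to calling split once per input text).
    interface Tokenizer {
        List<String> split(String text);

        default List<List<String>> batchSplit(List<String> texts) {
            List<List<String>> result = new ArrayList<>();
            for (String text : texts) {
                result.add(split(text));
            }
            return result;
        }
    }

    // Hypothetical implementor: splits on runs of whitespace.
    static class WhitespaceTokenizer implements Tokenizer {
        @Override
        public List<String> split(String text) {
            String trimmed = text.trim();
            if (trimmed.isEmpty()) {
                return new ArrayList<>(); // no tokens in blank input
            }
            return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
        }
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = new WhitespaceTokenizer();
        // prints [hello, big, world]
        System.out.println(tokenizer.split("hello big world"));
        // prints [[a, b], [c, d, e]]
        System.out.println(tokenizer.batchSplit(Arrays.asList("a b", "c d e")));
    }
}
```

A native tokenizer would presumably override `batchSplit` to hand the whole list to its backend in one call rather than looping, which is where the efficiency gain described above would come from.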
