A sentence tokenizer that uses simple heuristics to split text and returns each sentence as a single token.
A more efficient approach for native tokenizers, e.g. HuggingFaceTokenizer.
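
To make the sentence tokenizer description concrete, here is a minimal sketch of what "simple heuristics" could look like. It is not the library's actual implementation: the function name `split_sentences` and the boundary rule (terminal punctuation followed by whitespace) are assumptions chosen for illustration only.

```python
import re
from typing import List

# Assumed heuristic (hypothetical, not taken from the source):
# a sentence ends at '.', '!' or '?' when followed by whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Return each sentence in `text` as one token (a plain string)."""
    stripped = text.strip()
    if not stripped:
        return []
    # Split on the whitespace that follows terminal punctuation,
    # keeping the punctuation attached to its sentence.
    return [s for s in _SENTENCE_BOUNDARY.split(stripped) if s]


if __name__ == "__main__":
    for sentence in split_sentences("Hello world. How are you? Fine!"):
        print(sentence)
```

A rule this simple mis-splits abbreviations (for example, "Dr. Smith arrived." is cut after "Dr."), which is the usual trade-off of heuristic sentence splitting against a trained segmenter.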