nlp
Welcome to londogard-nlp-toolkits, com.londogard:nlp!
This project was created to make NLP tools more accessible in the JVM world.
It includes a range of features, such as:
Embeddings (Word & Sentence)
Tokenizers (Word, Char & Subword)
Stopwords, Word Frequencies & Stemming
Vectorizers & Encoders (TF-IDF, BM25, OneHot, ...)
Classifiers (Naïve Bayes, Logistic Regression w/ SGD & Transformers including HuggingFace)
Token Classifiers (Hidden Markov Models & Transformers including HuggingFace)
Keyword Extraction
Packages
The word frequencies are taken from wordfreq.py, a library by LuminosoInsight, and are hosted directly on GitHub. The object looks as follows: