Package com.londogard.nlp.wordfreq
The word frequencies are taken from wordfreq.py, a library by LuminosoInsight, and are hosted directly on GitHub. The object looks as follows:
object WordFrequencies {
fun getAllWordFrequenciesOrNull(language: LanguageSupport, size: WordFrequencySize = WordFrequencySize.Largest): Map<String, Float>?
fun wordFrequency(word: String, language: LanguageSupport, minimum: Float = 0f, size: WordFrequencySize): Float // Throws if language does not support wordfreq
fun wordFrequencyOrNull(word: String, language: LanguageSupport, minimum: Float = 0f, size: WordFrequencySize): Float?
}
This object has an internal cache that keeps the previously loaded language. Use WordFrequencies.getAllWordFrequenciesOrNull to retrieve the word frequencies and work with them yourself as a Map<String, Float>.
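As a rough illustration of working with the returned Map<String, Float> directly, here is a minimal sketch. It does not call the library: sampleFrequencies is a hand-written stand-in with made-up values, and frequencyOrMinimum and topWords are hypothetical helpers, not part of WordFrequencies. The minimum fallback mirrors how wordfreq.py treats its minimum argument and is only assumed to match this object's behaviour.

```kotlin
// Stand-in for the Map<String, Float> that
// WordFrequencies.getAllWordFrequenciesOrNull(...) would return.
// The values are made up for illustration only.
val sampleFrequencies: Map<String, Float> = mapOf(
    "the" to 0.05f,
    "cat" to 0.0001f,
    "sat" to 0.00005f
)

// Hypothetical helper: look up a word and fall back to `minimum` when the
// word is missing or rarer than `minimum` (assumed semantics, see lead-in).
fun frequencyOrMinimum(
    frequencies: Map<String, Float>,
    word: String,
    minimum: Float = 0f
): Float = maxOf(frequencies[word] ?: 0f, minimum)

// Hypothetical helper: the `n` most frequent words, most frequent first.
fun topWords(frequencies: Map<String, Float>, n: Int): List<String> =
    frequencies.entries
        .sortedByDescending { it.value }
        .take(n)
        .map { it.key }

fun main() {
    println(topWords(sampleFrequencies, 2))
    println(frequencyOrMinimum(sampleFrequencies, "dog", minimum = 1e-6f))
}
```

In real use, the stand-in map would be replaced by the result of getAllWordFrequenciesOrNull, with a null check for languages that have no word frequency list.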
Methods to retrieve zipfFrequencies also exist.
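For readers unfamiliar with the Zipf scale, the sketch below shows the conversion as wordfreq.py defines it: the base-10 logarithm of a word's occurrences per billion words, which equals log10(frequency) + 9. The function name frequencyToZipf is hypothetical, and it is an assumption that the Zipf methods in this package follow the same definition.

```kotlin
import kotlin.math.log10

// Hypothetical helper: convert a plain frequency (a proportion between 0 and 1)
// to the Zipf scale, i.e. log10 of occurrences per billion words.
fun frequencyToZipf(frequency: Float): Float {
    require(frequency > 0f) { "frequency must be positive" }
    return log10(frequency) + 9f
}

fun main() {
    println(frequencyToZipf(0.001f)) // once per thousand words, roughly Zipf 6
    println(frequencyToZipf(1e-6f))  // once per million words, roughly Zipf 3
}
```

The Zipf scale keeps values in a small, human-readable range, which is often easier to compare than raw proportions.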
See samples in wordfreq.ipynb
Types
Returns the frequency of a word for a given language. This component builds upon the great work of wordfreq.py by LuminosoInsight.
Some languages have a larger word frequency list available. This utility automates choosing it for you.