Package com.londogard.nlp.wordfreq

The word frequencies are taken from wordfreq.py, a library by LuminosoInsight, and are hosted directly on GitHub. The object looks as follows:

object WordFrequencies {
    fun getAllWordFrequenciesOrNull(language: LanguageSupport, size: WordFrequencySize = WordFrequencySize.Largest): Map<String, Float>?

    // Throws if the language does not support wordfreq
    fun wordFrequency(word: String, language: LanguageSupport, minimum: Float = 0f, size: WordFrequencySize): Float
    fun wordFrequencyOrNull(word: String, language: LanguageSupport, minimum: Float = 0f, size: WordFrequencySize): Float?
}

This object has an internal cache that keeps the previously loaded language. Use WordFrequencies.getAllWordFrequenciesOrNull to retrieve the word frequencies and work with them yourself as a Map<String, Float>.
Methods to retrieve Zipf frequencies also exist.
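
The following is a minimal, self-contained sketch of working with such a Map<String, Float> directly. It does not call the library: sampleFrequencies is a hypothetical stand-in with made-up values, and frequencyOf and zipfOf are illustrative helper names, not part of this package. The Zipf conversion follows the definition used by wordfreq.py (base-10 logarithm of occurrences per billion words); whether this package treats the minimum or rounds the result in the same way is an assumption.

```kotlin
import kotlin.math.log10

// Hypothetical stand-in for the Map<String, Float> returned by
// WordFrequencies.getAllWordFrequenciesOrNull; the values are made up.
val sampleFrequencies: Map<String, Float> = mapOf(
    "the" to 0.05f,
    "cat" to 0.0001f
)

// Look up a word; unknown words count as 0, and the result never drops below `minimum`.
// (Assumed semantics, modelled on the `minimum` parameter of wordfreq.py.)
fun frequencyOf(word: String, frequencies: Map<String, Float>, minimum: Float = 0f): Float =
    maxOf(frequencies[word] ?: 0f, minimum)

// Zipf scale: log10 of occurrences per billion words, i.e. log10(f) + 9.
// Returning 0.0 for a non-positive frequency is a choice made for this sketch.
fun zipfOf(frequency: Float): Double =
    if (frequency <= 0f) 0.0 else log10(frequency.toDouble()) + 9.0

fun main() {
    val cat = frequencyOf("cat", sampleFrequencies)
    println("cat: frequency=$cat zipf=${zipfOf(cat)}")

    // An unknown word falls back to the supplied minimum.
    val unknown = frequencyOf("zyzzyva", sampleFrequencies, minimum = 1e-8f)
    println("zyzzyva: frequency=$unknown zipf=${zipfOf(unknown)}")
}
```

On the Zipf scale a word occurring once per million words scores about 6 and one occurring once per billion scores about 0, which is often easier to read than raw frequencies.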

See samples in wordfreq.ipynb

Types

public class WordFrequencies

Returns word frequency based on a language. This component builds upon the great work of wordfreq.py by LuminosoInsight.

public enum WordFrequencySize extends Enum<WordFrequencySize>

Some languages may have a larger word frequency list available. This utility automates that behaviour for you.