WordEmbeddings

public final class WordEmbeddings implements Embeddings

Ordinary Word Embeddings.

Constructors

public WordEmbeddings(Path filePath, Integer dimensions, Character delimiter)

Types

public class Companion

Functions

public final List<Pair<String, Float>> analogy(String w1, String w2, String w3, Integer N)

Find the N closest terms in the vocab to the analogy: w1 is to w2 as w3 is to ?
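The analogy query (w1 is to w2 as w3 is to ?) can be illustrated with a small stand-alone sketch. It assumes the common word2vec offset rule, target = vec(w2) - vec(w1) + vec(w3), followed by a cosine-similarity search; the source does not state which rule the library uses. The class `AnalogySketch`, the plain `float[]` vectors, and the `List<String>` return type are hypothetical simplifications, not the library's API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; not the library's implementation.
final class AnalogySketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Assumed offset rule: target = vec(w2) - vec(w1) + vec(w3).
    // All three query words are assumed to be present in emb.
    static List<String> analogy(Map<String, float[]> emb,
                                String w1, String w2, String w3, int n) {
        float[] a = emb.get(w1), b = emb.get(w2), c = emb.get(w3);
        float[] target = new float[a.length];
        for (int i = 0; i < target.length; i++) {
            target[i] = b[i] - a[i] + c[i];
        }
        // The query words themselves are excluded from the candidates.
        List<String> candidates = new ArrayList<>();
        for (String w : emb.keySet()) {
            if (!w.equals(w1) && !w.equals(w2) && !w.equals(w3)) {
                candidates.add(w);
            }
        }
        // Sort by descending cosine similarity to the target vector.
        candidates.sort(Comparator.comparingDouble(
                (String w) -> -cosine(emb.get(w), target)));
        return candidates.subList(0, Math.min(n, candidates.size()));
    }
}
```

With toy vectors for "man", "king", "woman", "queen" and an unrelated word, calling `AnalogySketch.analogy(emb, "man", "king", "woman", 1)` would be expected to rank "queen" first, provided the vectors encode the relation as a consistent offset.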

public Boolean contains(String word)

Check if the word has an embedding.

public Float cosineDistance(String w1, String w2)
public final List<Pair<String, Float>> distance(List<String> input, Integer N)

Find the N closest terms in the vocab to the input word(s).

public Float euclideanDistance(String w1, String w2)
public Character getDelimiter()
public Integer getDimensions()
public Map<String, NDArray<Float, D1>> getEmbeddings()

Mapping from each vocabulary word to its vector in the embedding space.

public Path getFilePath()
public Set<String> getVocabulary()
public final List<Pair<String, Float>> nearestNeighbours(NDArray<Float, D1> vector, Set<String> inSet, Set<String> outSet, Integer N)

Find the N closest terms in the vocab to the given vector, using only words from the in-set (if defined) and excluding all words from the out-set (if non-empty). Both sets can be supplied together, but doing so rarely makes sense.
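The in-set and out-set filtering can be sketched as follows. This is a minimal stand-alone illustration, not the library's code: it assumes cosine similarity as the closeness measure, treats a `null` in-set as "not defined", and uses plain `float[]` vectors and `Map.Entry` pairs in place of `NDArray` and `Pair`. The class name `NearestNeighboursSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; not the library's implementation.
final class NearestNeighboursSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Assumption: inSet == null means no in-set is defined.
    static List<Map.Entry<String, Double>> nearest(Map<String, float[]> emb,
                                                   float[] vector,
                                                   Set<String> inSet,
                                                   Set<String> outSet,
                                                   int n) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, float[]> e : emb.entrySet()) {
            String w = e.getKey();
            if (inSet != null && !inSet.contains(w)) continue; // keep in-set words only
            if (outSet.contains(w)) continue;                  // drop out-set words
            scored.add(Map.entry(w, cosine(e.getValue(), vector)));
        }
        // Highest similarity first.
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return scored.subList(0, Math.min(n, scored.size()));
    }
}
```

A typical use of the out-set is to exclude the query words themselves from the results, while the in-set restricts the search to a candidate list.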

public final List<Pair<String, Float>> rank(String word, Set<String> set)

Rank a set of words by their respective distance to some central term.
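Ranking a set of words around a central term might look like the sketch below. The source does not say which distance is used, so Euclidean distance is assumed here purely for illustration, with the closest word first; words without an embedding are skipped, which is also an assumption. `RankSketch` and the `float[]` vectors are hypothetical stand-ins for the library's types.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; not the library's implementation.
final class RankSketch {

    // Euclidean distance between two vectors of equal length.
    static double euclidean(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Orders the words in `set` by ascending distance to `word`.
    // The central word is assumed to be present in emb.
    static List<String> rank(Map<String, float[]> emb, String word, Set<String> set) {
        float[] centre = emb.get(word);
        List<String> ranked = new ArrayList<>();
        for (String w : set) {
            if (emb.containsKey(w)) { // assumption: unknown words are skipped
                ranked.add(w);
            }
        }
        ranked.sort(Comparator.comparingDouble(
                (String w) -> euclidean(emb.get(w), centre)));
        return ranked;
    }
}
```

Swapping `euclidean` for a cosine-based measure changes only the comparator; the filtering and sorting steps stay the same.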

public List<NDArray<Float, D1>> traverseVectors(List<String> words)
public List<NDArray<Float, D1>> traverseVectorsOrNull(List<String> words)
public NDArray<Float, D1> vector(String word)

Fetch the embedding for the given word, if one exists.

Properties

private final Character delimiter
private final Integer dimensions
private final Map<String, NDArray<Float, D1>> embeddings

Mapping from each vocabulary word to its vector in the embedding space.

private final Path filePath
private final Set<String> vocabulary