Distributional hypothesis as isomorphism between word-word co-occurrence and analogical parallelograms

doi:10.1371/journal.pone.0312151

Table 1.

List of distributional vector space models in the present paper.

Fig 1.

Four-term analogy performance of the distributional models.

Fig 2.

Analogy performance of the vocabulary-restricted word2vec models with negative sampling.

To manipulate vocabulary size, artificial corpora were generated from the co-occurrence statistics and used for training in place of the original corpus.
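One way to picture generating an artificial corpus from co-occurrence statistics is to sample (word, context) pairs with probability proportional to their counts after dropping words outside the restricted vocabulary. The sketch below is only an illustration under that assumption; the function name `sample_artificial_corpus`, the counts, and the sampling scheme are hypothetical and are not taken from the paper.

```python
import random

def sample_artificial_corpus(cooc, vocab, n_pairs, seed=0):
    """Draw (word, context) pairs with probability proportional to their counts.

    cooc  : dict mapping (word, context) -> co-occurrence count
    vocab : set of words to keep; pairs touching any other word are dropped
    """
    rng = random.Random(seed)
    kept = [(pair, count) for pair, count in cooc.items()
            if pair[0] in vocab and pair[1] in vocab and count > 0]
    if not kept:
        raise ValueError("no co-occurrence pairs remain after restricting the vocabulary")
    pairs = [pair for pair, _ in kept]
    weights = [count for _, count in kept]
    return rng.choices(pairs, weights=weights, k=n_pairs)

# Hypothetical counts, for illustration only.
cooc = {
    ("king", "live"): 3,
    ("queen", "live"): 3,
    ("king", "castle"): 2,
    ("man", "house"): 4,
}
restricted_vocab = {"king", "queen", "live", "castle"}
corpus = sample_artificial_corpus(cooc, restricted_vocab, n_pairs=5)
```

Shrinking `restricted_vocab` changes the vocabulary size without touching the underlying counts.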

Fig 3.

Learning curve and analogy performance of a word2vec model trained with back-propagation.

Fig 4.

Analogy performance of a word2vec model trained with back-propagation.

Fig 5.

A hidden Markov model generating the 24 sentences in the toy corpus.

Every hidden state other than the verb states generates its own word with probability 1. For example, the state “king” generates the word “king”. In contrast, the two hidden states for each verb generate the same word. For example, both states “live (R)” and “live (C)” generate the word “live”. “BOS”: Beginning of Sentence, “EOS”: End of Sentence.
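The structure described in this caption can be sketched as a small generator with deterministic emissions and two verb states that share one surface word. Only the states “king”, “live (R)”, “live (C)”, “BOS” and “EOS” come from the caption. The other states, the transition probabilities, and the pairing of each verb state with a subject are hypothetical, and this toy model does not reproduce the 24 sentences of the paper's corpus.

```python
import random

# Deterministic emissions: each hidden state emits exactly one word.
# The two verb states emit the same surface word, as in the caption.
EMIT = {
    "king": "king",
    "farmer": "farmer",      # hypothetical state
    "live (R)": "live",
    "live (C)": "live",
    "castle": "castle",      # hypothetical state
    "village": "village",    # hypothetical state
}

# Hypothetical transition probabilities between hidden states.
TRANS = {
    "BOS": {"king": 0.5, "farmer": 0.5},
    "king": {"live (R)": 1.0},
    "farmer": {"live (C)": 1.0},
    "live (R)": {"castle": 1.0},
    "live (C)": {"village": 1.0},
    "castle": {"EOS": 1.0},
    "village": {"EOS": 1.0},
}

def generate_sentence(rng):
    """Walk the hidden states from BOS to EOS and collect the emitted words."""
    state, words = "BOS", []
    while True:
        successors = TRANS[state]
        state = rng.choices(list(successors), weights=list(successors.values()), k=1)[0]
        if state == "EOS":
            return words
        words.append(EMIT[state])

rng = random.Random(0)
sentences = [generate_sentence(rng) for _ in range(4)]
```

Because both verb states emit “live”, the word alone does not reveal which hidden state produced it; only the surrounding words do.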

Fig 6.

The co-occurrence matrix generated with the uniform sentence probability distribution.

The number in each cell is the relative frequency of the corresponding co-occurrence.
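As a rough illustration of how relative co-occurrence frequencies can be obtained when every sentence is equally likely, the sketch below counts ordered word pairs that share a sentence and normalizes the counts to sum to 1. The two-sentence corpus, the whole-sentence co-occurrence window, and normalization over the entire matrix are assumptions made for this example, not details confirmed by the paper.

```python
from collections import Counter
from itertools import permutations

def cooccurrence_relative_frequency(sentences):
    """Count ordered word pairs that share a sentence, then normalize to sum to 1."""
    counts = Counter()
    for sentence in sentences:
        # Each sentence is counted once, i.e. a uniform sentence distribution.
        for i, j in permutations(range(len(sentence)), 2):
            counts[(sentence[i], sentence[j])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

# Hypothetical two-sentence corpus, for illustration only.
sentences = [["king", "live", "castle"], ["queen", "live", "castle"]]
freq = cooccurrence_relative_frequency(sentences)
```

A non-uniform sentence distribution could be handled by adding a per-sentence weight instead of 1 to each count.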

Fig 7.

(Non-)parallelepipeds embedded in the co-occurrence matrix of (a) a uniform toy corpus, (b) with an orthogonal basis, and (c) a non-uniform toy corpus. (d) Parallelepiped embedded in the co-occurrence matrix of a natural corpus.
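Each face of a parallelepiped is a parallelogram, and four vectors a, b, c, d form an analogical parallelogram when a - b = c - d. The sketch below tests that condition on row vectors. The vectors are made-up numbers chosen for illustration, not values from the paper's co-occurrence matrices, and the helper name `is_parallelogram` is hypothetical.

```python
def is_parallelogram(a, b, c, d, tol=1e-9):
    """Return True if a - b equals c - d componentwise, within tolerance tol."""
    if not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("all four vectors must have the same length")
    return all(abs((x - y) - (z - w)) <= tol for x, y, z, w in zip(a, b, c, d))

# Hypothetical row vectors, for illustration only.
king, queen = [2, 1, 0], [2, 0, 1]
man, woman = [1, 1, 0], [1, 0, 1]

holds = is_parallelogram(king, queen, man, woman)
# Perturbing one vertex breaks the parallelogram.
broken = is_parallelogram(king, queen, man, [1, 0, 2])
```

With real data the equality is rarely exact, so a looser `tol` or a cosine-based comparison of the two difference vectors may be more practical.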

Fig 8.

The co-occurrence matrix generated with an arbitrary sentence probability distribution.

The number in each cell is a probability. The row vectors are illustrated in a low-dimensional subspace.
