Cross-Language Distributions of High Frequency and Phonetically Similar Cognates

doi:10.1371/journal.pone.0063006

Table 1.

Intralingual and interlingual language similarities in terms of semantics (S), orthography (O), and phonology (P).

Table 2.

Calculations of relative cognate frequency distinguish closely related language pairs from less closely related language pairs.

Figure 1.

Orthographic similarity distributions across translation equivalents.

The data points represent the normalized Levenshtein distances binned into 18 equal parts on the obtained similarity scale. The solid line uses locally weighted scatter plot smoothing and spline interpolation over the bins. Note the logarithmic scale on the y axis.
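
The binning described in this legend can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the distance is assumed to be normalized by the length of the longer string, and the bins are assumed to span the interval from 0 to 1, whereas the legend only says the bins cover "the obtained similarity scale".

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Assumption: normalize by the longer string, so 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def bin_counts(scores, n_bins=18):
    """Count scores in n_bins equal-width bins over [0, 1] (assumed range)."""
    counts = [0] * n_bins
    for score in scores:
        index = min(int(score * n_bins), n_bins - 1)  # 1.0 goes in the last bin
        counts[index] += 1
    return counts
```

Applying `normalized_similarity` to each pair of translation equivalents and passing the scores to `bin_counts` would yield one count per bin, which is the kind of data point the figure plots before smoothing.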

Figure 2.

Phonetic similarity distributions across translation equivalents.

See the legend of Figure 1 for a description.

Figure 3.

Similarity distributions of cognates for closely related language pairs.

German – Dutch and Italian – Spanish are coded with dashed lines. The solid lines use locally weighted scatter plot smoothing and spline interpolation over 18 bins.

Figure 4.

Similarity distributions of cognates for less closely related language pairs.

French – English and French – Dutch are coded with dashed lines. See the legend of Figure 3 for a description.

Table 3.

Correlations between orthographic and phonetic similarity measures of cognates.

Figure 5.

Relative cognate frequency predicts degree of genetic relatedness between languages.

Average frequencies are shown for both languages in each language pair. The straight line represents the result of a linear discriminant analysis between two classes: more closely related and less closely related language pairs.
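
To illustrate how a linear discriminant analysis produces a straight separating line in a two-dimensional plot like this one, here is a minimal two-class sketch. It is not the authors' analysis: the function names are hypothetical, equal class priors and a pooled scatter matrix are assumed, and the input points would be supplied by the reader (for example, one pair of average frequencies per language pair).

```python
def fit_lda_2d(class_a, class_b):
    """Fit a two-class linear discriminant on 2-D points.

    Returns (weights, bias); weights[0]*x + weights[1]*y + bias = 0 is the
    separating line. Assumes equal priors and a pooled scatter matrix.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    mean_a, mean_b = mean(class_a), mean(class_b)

    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for points, (mx, my) in ((class_a, mean_a), (class_b, mean_b)):
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2

    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("scatter matrix is singular")

    # weights = inverse(scatter) @ (mean_b - mean_a)
    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    weights = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

    # The line passes through the midpoint between the class means.
    mid_x, mid_y = (mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2
    bias = -(weights[0] * mid_x + weights[1] * mid_y)
    return weights, bias


def classify(point, weights, bias):
    """Return 'b' if the point falls on class_b's side of the line, else 'a'."""
    score = weights[0] * point[0] + weights[1] * point[1] + bias
    return "b" if score > 0 else "a"
```

With the fitted `weights` and `bias`, the boundary is the set of points where the score equals zero, which is the kind of straight line the figure describes.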

Figure 6.

Comparisons of cognate to translation frequency distributions for six closely related language pairs.

The x axes show cognate frequencies per million words. The y axes show the numbers of cognates observed. The frequency distributions of translation equivalents are plotted with dotted lines. The blue lines code for the L1, the red lines code for the L2. The order of languages in the subtitles indicates which language is the L1 and which language is the L2. The cognate frequencies are binned into 14 equal parts on the word frequency scale. The lines use locally weighted scatter plot smoothing over the bins. Note the logarithmic scales.

Figure 7.

Comparisons of cognate and translation frequency distributions for six less closely related language pairs.

See the legend of Figure 6 for a description.

Table 4.

Classification rates based on subjective measurements.
