The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research

doi:10.1371/journal.pone.0144016

Fig 1.

Distribution of the NILC corpus by source of origin.

Distribution of the textual genres and written materials that make up the original NILC source [38].

Table 1.

Numbers of word tokens, word types, and lemmas by grammatical category before and after data processing for the Brazilian Portuguese Lexicon.

Table 2.

Column numbers, names, and descriptions in the Brazilian Portuguese Lexicon.

Table 3.

Conventions used in the grammatical category and grammatical information columns of the search engines and results of the Brazilian Portuguese Lexicon.

Fig 2.

The simple and the complex input search engines of the Brazilian Portuguese Lexicon.

Simple search accepts a list of words as input; complex search lets the user specify search criteria.

Table 4.

Symbols used as wildcards in the search engines in the Brazilian Portuguese Lexicon.

Fig 3.

Results of the complex search shown in the Fig 2 example.

The top-left panel presents the general search information, the top-right panel provides basic statistics, and the bottom panel displays the search results.

Fig 4.

General distributions of the Brazilian Portuguese Lexicon corpus.

a) number of words by grammatical category; b) number of words by number of letters for each grammatical category; c) log10 frequency by word rank; and d) Zipf’s law (i.e., log10 frequency by log10 rank) for each grammatical category.
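The rank-frequency relation plotted in panels c and d can be illustrated with a minimal sketch. The token list, the function names `rank_frequency` and `zipf_slope`, and the least-squares fit are assumptions made for illustration and are not taken from the article.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Return (log10 rank, log10 frequency) pairs, most frequent word first."""
    counts = Counter(tokens)
    ordered = sorted(counts.values(), reverse=True)
    return [(math.log10(rank), math.log10(freq))
            for rank, freq in enumerate(ordered, start=1)]

def zipf_slope(points):
    """Least-squares slope of log10 frequency regressed on log10 rank."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

# Hypothetical toy sample, not data from the lexicon.
tokens = ["de"] * 8 + ["a"] * 4 + ["casa"] * 2 + ["gato"]
points = rank_frequency(tokens)
slope = zipf_slope(points)
print(points)
print(slope)
```

Under Zipf's law the points fall close to a straight line with a negative slope in log-log space, which is why panel d uses log10 on both axes.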

Fig 5.

General interactions between variables in the Brazilian Portuguese Lexicon.

a) log10 number of words by log10 orthographic neighborhood for each grammatical category; b) log10 number of words by OLD20 for each grammatical category; c) mean orthographic neighborhood by number of letters for each grammatical category; and d) mean OLD20 by the number of letters for each grammatical category.
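The two neighborhood measures in this figure can be sketched as follows, assuming the common definitions: orthographic neighborhood as the count of same-length words that differ by exactly one letter, and OLD20 as the mean Levenshtein distance to the 20 closest words. The toy lexicon and the function names are hypothetical, and the exact procedure used for the lexicon may differ.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def orthographic_neighbors(word, lexicon):
    """Same-length words that differ from `word` in exactly one position."""
    return [w for w in lexicon
            if len(w) == len(word)
            and sum(x != y for x, y in zip(w, word)) == 1]

def old_n(word, lexicon, n=20):
    """Mean Levenshtein distance from `word` to its n closest other words.

    If the lexicon holds fewer than n other words, all of them are used.
    """
    distances = sorted(levenshtein(word, w) for w in lexicon if w != word)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no other words")
    return sum(closest) / len(closest)

# Hypothetical toy lexicon; n is reduced to 3 because the list is tiny.
lexicon = ["gato", "pato", "mato", "gata", "casa", "rato", "gatos"]
neighbors = orthographic_neighbors("gato", lexicon)
old3 = old_n("gato", lexicon, n=3)
print(neighbors, old3)
```

Unlike the neighbor count, the distance-based measure also credits words of different length ("gatos"), which helps it discriminate among long words that have few or no one-letter neighbors.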

Table 5.

General means, with standard deviations in parentheses, by grammatical category.

Fig 6.

Correlations between the different current Brazilian Portuguese corpora.

LexPorBR: Brazilian Portuguese Lexicon; SubtlexBR: SUBTLEX-PT-BR [15]; WlBlog, WlTwitter, and WlNews: the three Worldlex (Portuguese Brazil) corpora [16]. Correlations were calculated on Zipf-scale frequencies [12]. Pearson correlations appear above the diagonal, histograms of each corpus distribution on the diagonal, and bivariate scatter plots with loess smooth fits and ellipses below the diagonal [42].
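As a rough illustration of the comparison in this figure, the sketch below converts raw counts to the Zipf scale, taken here in its basic form as log10 of the frequency per million words plus 3, and then computes a Pearson correlation over the words two corpora share. The counts, the corpus sizes, and the absence of any smoothing are assumptions; the article's exact computation is not given in this caption.

```python
import math

def zipf_value(count, corpus_tokens):
    """Zipf value: log10 of the frequency per million words, plus 3."""
    per_million = count / (corpus_tokens / 1_000_000)
    return math.log10(per_million) + 3

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical raw counts and corpus sizes, not values from any real corpus.
corpus_a = {"casa": 500, "gato": 50, "rato": 5}
corpus_b = {"casa": 900, "gato": 120, "rato": 8, "livro": 40}
size_a, size_b = 1_000_000, 2_000_000

# Only words present in both corpora can be compared.
shared = sorted(set(corpus_a) & set(corpus_b))
zipf_a = [zipf_value(corpus_a[w], size_a) for w in shared]
zipf_b = [zipf_value(corpus_b[w], size_b) for w in shared]
r = pearson(zipf_a, zipf_b)
print(shared, r)
```

Working on the Zipf scale rather than raw counts puts corpora of different sizes on a common logarithmic footing before they are correlated.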

Table 6.

Relative percentage (%) of word types contained in the LexPorBR, SubtlexBR [15], and Worldlex (Portuguese Brazil) [16] corpora.

Each cell gives the percentage of word types from the left (row) corpus that are also contained in the head (column) corpus.
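The containment percentage described here can be sketched as a set overlap. The word lists and the function name `percent_contained` are hypothetical and serve only to show that the measure is asymmetric.

```python
def percent_contained(left_types, head_types):
    """Percentage of the left corpus's word types also found in the head corpus."""
    left = set(left_types)
    head = set(head_types)
    return 100.0 * len(left & head) / len(left)

# Hypothetical word types, not entries from any of the corpora.
left_corpus = ["casa", "gato", "rato", "pato"]
head_corpus = ["casa", "gato", "livro"]

left_in_head = percent_contained(left_corpus, head_corpus)
head_in_left = percent_contained(head_corpus, left_corpus)
print(left_in_head, head_in_left)
```

Swapping the two corpora changes the denominator, so a table of these percentages is not symmetric around its diagonal.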

Table 7.

Words overestimated and underestimated by the Brazilian Portuguese Lexicon relative to SUBTLEX-PT-BR [15] and Worldlex (Portuguese Brazil) [16].

The number in parentheses is how many of the most frequent words were checked to obtain the 10 words shown in each list; the Zipf scale range of the words found is given under each column head.
