A mechanism for the cortical computation of hierarchical linguistic structure

doi:10.1371/journal.pbio.2000663

Fig 1.

A DORA representation of the proposition waste(games, time) during processing that illustrates how time-based binding works.

For clarity, we use different shapes to represent units in different layers: ovals for Proposition nodes ("P-units")/sentences, rectangles for Role-filler binding nodes ("RB units")/phrases, triangles and large circles for Propositional Object (PO) units/words and argument roles, and small circles for semantic units/features. The abbreviations "wtr." and "wtd." denote the waster and wasted roles in the proposition waste(games, time), respectively. In the model, these units are simply nodes in different layers of the network. Darker shading marks units that are firing at a given time step (panels a–d); each time step here corresponds to 250 ms (4 Hz). Please see page 18 for a detailed discussion of P-units, RB units, and PO units.
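The time-based binding described in this caption can be illustrated with a small sketch. It is not taken from the DORA code base: the firing order (each role immediately followed by its filler), the RB unit labels, and the helper name `firing_schedule` are assumptions made for illustration. Only the 250 ms step and the proposition waste(games, time) come from the caption.

```python
STEP_MS = 250  # one PO unit fires per time step, i.e. a 4 Hz rate (from the caption)

# Hypothetical layout of waste(games, time): one P-unit, two RB units,
# and two PO units (role, then filler) per RB unit. The ordering is an assumption.
proposition = {
    "waste(games, time)": {
        "waster+games": ["waster", "games"],
        "wasted+time": ["wasted", "time"],
    }
}

def firing_schedule(prop, step_ms=STEP_MS):
    """Return (onset_ms, P-unit, RB unit, PO unit) rows, one per time step."""
    schedule = []
    onset = 0
    for p_unit, rb_units in prop.items():
        for rb_unit, po_units in rb_units.items():
            for po_unit in po_units:
                schedule.append((onset, p_unit, rb_unit, po_unit))
                onset += step_ms
    return schedule

schedule = firing_schedule(proposition)
for onset, p_unit, rb_unit, po_unit in schedule:
    print(f"{onset:4d} ms  P={p_unit}  RB={rb_unit}  PO={po_unit}")
```

In this sketch the P-unit stays active across all four steps, each RB unit across two, and each PO unit for one, so a role and its filler are bound by firing in adjacent time steps rather than by a dedicated conjunctive unit.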

Fig 2.

A representation of the sentence "Dry fur rubs skin" in Learning and Inference with Schemas and Analogies (LISA; [21])/DORAese predicate calculus.

We use different shapes to represent units in different layers (ovals for P-units/sentences, rectangles for RB units/phrases, triangles and large circles for PO units/words, and small circles for semantic units/features) for the purpose of clarity. In the model, these units are simply nodes in different layers of the network.

Fig 3.

Grammatical sentences: DORA network power spectrum compared to human cortical oscillations.

The solid line represents cortical power while participants listened to four-syllable/word sentences played over 1 s in Ding et al. [6]. Power increases are evident at 1 Hz (sentence duration), 2 Hz (phrase duration), and 4 Hz (word duration). The dashed line depicts firing in DORA while processing the same sentences used in Ding et al. Units in DORA fire for the duration of the sentence, at intervals of half the length of the sentence, and at intervals of a quarter of the length of the sentence. Data from the simulation and the code to run it are available at https://osf.io/eb2vp/ and https://github.com/AlexDoumas/dingetal_sent.
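To make the link between firing intervals and spectral peaks concrete, the sketch below builds toy activation traces and locates the dominant frequency of each. It is an illustration only, not the analysis from the linked repository: the 0/1 square-wave traces, the 32 Hz sampling rate, the 8 s duration, and the function names are all assumptions.

```python
import cmath

def square_trace(freq_hz, fs=32, seconds=8):
    """Toy 0/1 activation (assumed shape): on for the first half of every cycle."""
    period = fs // freq_hz  # samples per cycle
    return [1.0 if (n % period) < period // 2 else 0.0 for n in range(fs * seconds)]

def power_spectrum(signal, fs):
    """Return {frequency_hz: power} for positive frequencies, using a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC component
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(centered))
        spectrum[k * fs / n] = abs(coeff) ** 2
    return spectrum

def peak_frequency(signal, fs):
    """Frequency (Hz) carrying the most power in the trace."""
    spectrum = power_spectrum(signal, fs)
    return max(spectrum, key=spectrum.get)

FS = 32
# One cycle per sentence (1 s), per phrase (0.5 s), and per word (0.25 s).
layers = {"sentence": 1, "phrase": 2, "word": 4}
peaks = {name: peak_frequency(square_trace(f, FS), FS) for name, f in layers.items()}
print(peaks)
```

A unit that cycles once per sentence, phrase, or word concentrates its power at 1, 2, or 4 Hz respectively, which is the pattern the caption describes for the network as a whole.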

Fig 4.

DORA network power spectrum plot of the Word List, Phrases, and Jabberwocky conditions.

In the Word List condition, firing increased only at 4 Hz, corresponding to the firing of nodes coding for words. The lack of firing at other frequencies indicates that no hierarchical representations were processed in the Word List condition. In the Phrases condition, power increased at 2 Hz and 4 Hz, indicating that units coding words and units coding phrases were active during the processing of this condition. No sentence units were active. In the Jabberwocky condition, power increased at 1, 2, and 4 Hz, similar to the pattern seen for grammatical sentences, indicating that hierarchical representations were indeed activated. See Fig 5 for a comparison of activation across the propositions in the model's long-term memory between Jabberwocky and Grammatical sentences. Data from the simulations and the code to run them are available at https://osf.io/eb2vp/ and https://github.com/AlexDoumas/dingetal_sent.

Fig 5.

Left Panel: Grammatical.

Plot of active propositions in memory across trials in the Grammatical condition. The x-axis shows P-units; the y-axis shows processing iterations of the model, equivalent to trials or instances of processing a sentence during the simulation. Darker colour indicates more activation of existing propositional role-filler binding combinations in memory. Grammatical sentences resulted in stronger activation of extant propositions than Jabberwocky sentences did. Right Panel: Jabberwocky. The Jabberwocky condition did not activate as many single existing propositions in the model's memory as the Grammatical condition did. Instead, activation was spread more broadly across memory, even though both conditions produced similar oscillations in DORA. Data from the simulations and the code to run them are available at https://osf.io/eb2vp/ and https://github.com/AlexDoumas/dingetal_sent.

Fig 6.

Proportion of units (range: 0–0.2) active above threshold (0.7) in the recurrent layer of the RNNs.

Note: the total number of units in the recurrent layer was n = 50 for all conditions except Phrases, for which n = 30. On average, between 5 and 9 of the 50 units in the hidden layer were active at 4 Hz. There was no evidence of activity at 1 Hz or 2 Hz, and hence no evidence for coding or tracking of linguistic structures in the RNN. Data from the simulations and the code to run the simulations are available at https://osf.io/eb2vp/ and https://github.com/AlexDoumas/dingetal_sent.
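The measure plotted in this figure can be computed as in the following sketch. The threshold of 0.7 and the layer size of 50 come from the caption; the activation values and the function name `proportion_active` are made up for illustration and are not output from the RNN simulations.

```python
def proportion_active(activations, threshold=0.7):
    """Fraction of units whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("activations must not be empty")
    return sum(1 for a in activations if a > threshold) / len(activations)

# Hypothetical recurrent-layer state at one time step: 7 of 50 units strongly active.
hidden_state = [0.9] * 7 + [0.1] * 43
proportion = proportion_active(hidden_state)
print(f"{proportion:.2f}")
```

With 7 of 50 units above threshold the proportion is 0.14, which falls inside the 0 to 0.2 range shown on the figure's axis.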

Fig 7.

Initial state of the network before learning.

The model assumes the existence of objects and features and initially must learn the relationships between feature sets and objects. After learning, the model's internal representations are in a predicate calculus (see Figs 1 and 2).

Fig 8.

Graphical depiction of banks of units in DORA containing represented propositional structures.

The comparison process that is crucial for learning occurs across the driver and recipient (see [19] for details).

Fig 9.

Architecture of the three-layer RNN.

x, s, and o represent vectors of activation of units in the input, recurrent, and output layers, respectively. U, V, and W represent input, output, and recurrent weight matrices, respectively.
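One forward step through an architecture of this shape can be sketched as follows, using the caption's symbols. The tanh and softmax nonlinearities, the random weight initialisation, and the input and output sizes of 4 are assumptions, since the caption does not specify them; the recurrent layer size of 50 matches the value reported for Fig 6.

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def rnn_step(x, s_prev, U, W, V):
    """s = tanh(U x + W s_prev); o = softmax(V s). Nonlinearities are assumed."""
    pre = [a + b for a, b in zip(matvec(U, x), matvec(W, s_prev))]
    s = [math.tanh(p) for p in pre]
    o = softmax(matvec(V, s))
    return s, o

def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights (illustrative initialisation only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_in, n_hidden, n_out = 4, 50, 4   # n_in and n_out are hypothetical sizes
U = random_matrix(n_hidden, n_in, rng)      # input -> recurrent
W = random_matrix(n_hidden, n_hidden, rng)  # recurrent -> recurrent
V = random_matrix(n_out, n_hidden, rng)     # recurrent -> output

s = [0.0] * n_hidden
one_hot_words = ([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1])
for x in one_hot_words:  # a four-word sequence, one word per step
    s, o = rnn_step(x, s, U, W, V)
```

The recurrent state `s` is the only place where information about earlier words can persist, which is why the recurrent layer is the one inspected for low-frequency activity.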
