A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language

Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together in a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working-memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, the meaning of words or the role of the different word classes, solely by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on the literature on early language assessment, at the level of about a 4-year-old child, and produced 521 output sentences, expressing a broad range of language-processing functionalities.


1 Type of connections and notations

Figure 1 represents the notations used in the following diagrams. Plain (one-dimensional) SSMs are represented by rectangles with a solid line. Generally, all neurons in the same SSM have the same bias, which is reported in the bottom-right corner of the rectangle.
In two-dimensional SSMs (2D SSMs), the neurons are arranged in a two-dimensional array. 2D SSMs are represented by rectangles with a dashed line. In the case of fixed-weight links, the weight is indicated next to the type of connection. Variable-weight links, which are updated through the discrete Hebbian learning rule (DHL rule), are indicated by the word "hebbian".
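For concreteness, the following minimal Python sketch shows one way the DHL rule could be realized (an illustrative reading of the rule as stated in the text, not the authors' code; the array shapes and names are ours):

```python
import numpy as np

def dhl_update(weights, pre, post):
    """Discrete Hebbian learning (DHL) rule as described in the text:
    for every active postsynaptic neuron, each incoming link weight is
    saturated to +1 if the presynaptic node is on, and to -1 otherwise.
    weights[i, j] is the link from presynaptic node j to neuron i."""
    for i in np.flatnonzero(post):              # usually a single winner
        weights[i, :] = np.where(pre > 0, 1.0, -1.0)
    return weights

# toy usage: one winner neuron learns a 6-bit input pattern
w = np.random.uniform(-0.1, 0.1, size=(4, 6))
x = np.array([1, 0, 1, 1, 0, 0])
y = np.array([0, 1, 0, 0])                      # neuron 1 is the winner
dhl_update(w, x, y)
print(w[1])                                     # -> [ 1. -1.  1.  1. -1. -1.]
```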

2 Input phrase acquisition

Figure 2 represents a schematic diagram of the architecture used for input-phrase acquisition. When a sentence is written in the terminal or read from a file, the interface submits its words one by one to the system. Each word is converted to a binary pattern, based on its ASCII representation, and submitted to the system input W. The input nodes are fully connected to the input-word buffer (IW), and the link weights are initialized randomly. IW is updated using the winner-take-all (WTA) rule: the neuron with the highest activation state (winner neuron) is switched to level one, while all other neurons of IW are switched to zero. The links from the input nodes to the winner neuron of IW are updated through the DHL rule: if the input node signal is one, the link weight is saturated to its maximum value (+1), otherwise it is saturated to its minimum value (-1). This ensures that if the same word is submitted again to the system, the winner neuron will be the same.
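A minimal sketch of this acquisition step is given below, combining the WTA update of IW with the DHL update of its incoming links. The 8-character ASCII packing is an assumption for illustration, since the text only states that the encoding is binary and ASCII-based:

```python
import numpy as np

def word_to_pattern(word):
    """Hypothetical encoding: pack the ASCII codes of up to 8 characters
    into a 64-bit binary pattern (the text only says the pattern is
    binary and ASCII-based)."""
    bits = []
    for ch in word.ljust(8)[:8]:
        bits.extend(int(b) for b in format(ord(ch), "08b"))
    return np.array(bits, dtype=float)

def acquire_word(word, w_iw):
    """WTA + DHL acquisition into the input-word buffer IW: the most
    responsive neuron wins, and its incoming weights are saturated to
    +/-1 so the same word always selects the same neuron."""
    x = word_to_pattern(word)
    winner = int(np.argmax(w_iw @ x))            # winner-take-all
    w_iw[winner] = np.where(x > 0, 1.0, -1.0)    # DHL saturation
    return winner

rng = np.random.default_rng(0)
w_iw = rng.uniform(-0.1, 0.1, size=(100, 64))    # random initial weights
n = acquire_word("cat", w_iw)
assert acquire_word("cat", w_iw) == n            # stable mapping
```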
PhI (phrase index) is an SSM that represents the position of the current word in the phrase: the neuron of PhI corresponding to the position of the word in the phrase is in a high-level state, while all the others are in a low-level state. The words of the input phrase are submitted to the system by loading them, one by one, in the word buffer, and increasing the phrase index from 1 to the number of words in the phrase. The system itself initializes the phrase index at the beginning of a phrase acquisition, and increases it after the acquisition of each word, as will be discussed in Sect. 9.
This structure is suitable for a broad range of problems in adaptive behavior, not only language understanding. In general, a "word" can be defined as a specific input pattern. The system can associate a key to each word received as input and generate a unique pattern corresponding to the couple (key, word). A "phrase" is a set of couples (key, word), temporarily stored in the system. The key can be any pattern, not necessarily one representing an integer number; however, in the SSM approach a single neuron or a small number of neurons should be active for any key pattern. In the case of natural language, the "phrase index" is a key that represents the position of each word in a phrase.
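Stated in conventional data-structure terms (purely for illustration; the model itself stores these couples in neural buffers), a phrase is simply:

```python
# A "phrase" as a set of (key, word) couples; in natural language the key
# is the position of the word in the phrase.
phrase = {(1, "the"), (2, "cat"), (3, "sleeps")}

# Looking up the word associated with a key mirrors the word-from-index
# extraction of Sect. 4.
assert next(w for k, w in phrase if k == 2) == "cat"
```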
The SSM InI (input index) in Fig. 2 is single-connected to the phrase index PhI and fully connected to the gatekeeper neuron InFlag (Input Flag). The neuron InFlag is used to control the acquisition: when it is on, a word can be acquired in the input phrase buffer. The bias of InI is set in such a way that if InFlag is on, then InI is simply a copy of the phrase index PhI, otherwise it is blank (all neurons off).
InPhFL (input-phrase front layer) is a two-dimensional SSM (2D SSM) having a number of rows equal to the size of InI and a number of columns equal to the size of the input-word buffer IW. Each row of InPhFL is single-connected to IW, while each column of InPhFL is single-connected to InI. Therefore, the neuron (i, j) in row i, column j of InPhFL is connected to neuron i of InI and to neuron j of IW. The link weights and the bias of InPhFL are set in such a way that the neuron (i, j) will be on only if both input neurons are on. In this way a couple (phrase-index, word) is mapped to the neuron of InPhFL located in the row i corresponding to the phrase index and in the column j corresponding to the word-mapping neuron index. The input-phrase front layer is single-connected to the input-phrase buffer (InPhB). The input-phrase buffer is also single-connected to itself (self connection). In this way, it can store all the words of a phrase and keep them stored until it is cleared by a flush signal.

3 Copy of the input phrase to the working-phrase buffer

Figure 3 shows how the input phrase is copied from the input-phrase buffer (InPhB) to the working-phrase buffer (WkPhB). The copy is triggered by the gatekeeper neuron WkFlag, which is fully connected to the working-phrase front layer (WkPhFL) and to the current working phrase (CurrWkPh). The links from WkFlag to CurrWkPh have a negative weight with a very large absolute value, BW (big weight). When WkFlag is on, CurrWkPh is cleared. At the same time, the content of InPhB is copied to WkPhFL, which is then copied to the working-phrase buffer, where it is stored thanks to the links between WkPhB and CurrWkPh.
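The following sketch illustrates both mechanisms with binary threshold neurons (our simplified formalization: unit weights and a bias of -1.5 for the coincidence detection, and a large negative weight BW for the gatekeeper clearing):

```python
import numpy as np

def step(x):
    """Binary threshold unit: on iff its net input is positive."""
    return (x > 0).astype(float)

# Coincidence detection in a front layer: neuron (i, j) receives weight-1
# links from index neuron i and word neuron j; with bias -1.5 it fires
# only when BOTH inputs are on (1 + 1 - 1.5 > 0).
idx = np.array([0, 1, 0.])          # phrase index: position 2
word = np.array([0, 0, 1, 0.])      # word mapped to neuron 3
front = step(np.add.outer(idx, word) - 1.5)
assert front[1, 2] == 1 and front.sum() == 1

# Gatekeeper clearing: a link with a very large negative weight (BW) from
# the flag neuron drives a buffer below threshold regardless of its other
# inputs, as for the WkFlag -> CurrWkPh links.
BW = 1e3
wk_flag = 1.0
buffer_state = step(front.ravel() + 0.5 - BW * wk_flag)
assert buffer_state.sum() == 0      # CurrWkPh cleared while WkFlag is on
```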

4 Extraction of a word from the working-phrase buffer

Figure 4 represents the architecture used to extract from the working-phrase buffer the word corresponding to the phrase index PhI. WkWfI (working-phrase word-from-index) is a 2D SSM, single-connected to the working-phrase buffer WkPhB and with each column single-connected to PhI. The bias is set in such a way that only the word corresponding to the phrase index PhI is copied from WkPhB to WkWfI. This word is then copied to the current-word buffer CW.
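A compact sketch of this masked readout, under the same threshold-neuron simplification used above (the column-wise collapse into CW is our reading of Fig. 4):

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)

def word_from_index(wk_phb, phi, bias=-1.5):
    """Extract from the working-phrase buffer the word stored in the row
    selected by the phrase index PhI: neuron (i, j) of WkWfI sums the
    buffer state (i, j) and the index signal i, so only the selected row
    passes the threshold; collapsing the rows then yields the pattern
    for the current-word buffer CW."""
    wk_wfi = step(wk_phb + phi[:, None] + bias)
    return step(wk_wfi.sum(axis=0) - 0.5)

# buffer holding word 3 at position 1 and word 1 at position 2
wk_phb = np.zeros((3, 4))
wk_phb[0, 2] = 1.0
wk_phb[1, 0] = 1.0
phi = np.array([0.0, 1.0, 0.0])        # PhI selects position 2
print(word_from_index(wk_phb, phi))    # -> [1. 0. 0. 0.]
```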

5 Copy of a word from the current-word buffer to the word-group buffer

Figure 5 shows how the word in CW is copied to the word-group buffer WGB. This mechanism, together with the one described in the previous section, can be used to extract a group of words from the working-phrase buffer and to store it in the word-group buffer. The architecture is quite similar to that shown in Fig. 2. When the two gatekeeper neurons GetFlag and WGFlag (word-group flag) are on, the word in CW is copied to WGCW (word-group current word) and the index WGI (word-group index) is copied to WGIFL (word-group-index front layer). The 2D SSM WGFL (word-group front layer) is connected to WGCW and to WGIFL in such a way that the couple (index, word) is mapped to the neuron of WGFL located in the row i corresponding to the word-group index and in the column j corresponding to the word. The content of WGFL is then copied to the word-group buffer (WGB), where it is stored thanks to the self links, until the word-group buffer is cleared by a flush signal.

6 Copy of the word group to the output buffer

Figure 6 shows the architecture that is used to send the current word group to the output. When the gatekeeper neuron OutFlag is in a high-level state, the content of the word-group buffer is copied to OutPhFL (output-phrase front layer) and from there to OutPhB (output-phrase buffer), where it is stored through the self connections until OutPhB is cleared by a flush signal.
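The storage-until-flush behavior shared by WGB and OutPhB can be sketched as a simple self-excitatory latch (illustrative parameters):

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)

def buffer_update(state, front_layer, flush, bias=-0.5, bw=1e3):
    """One update of a storage buffer (e.g. WGB or OutPhB): each neuron
    receives its own previous state through a self link, the front-layer
    input, and a large negative weight from the flush neuron, so the
    content is latched until a flush signal clears it."""
    return step(state + front_layer - bw * flush + bias)

state = np.zeros(5)
state = buffer_update(state, np.array([0, 1, 0, 0, 1.]), flush=0)  # load
state = buffer_update(state, np.zeros(5), flush=0)                 # held
assert state.tolist() == [0, 1, 0, 0, 1]
state = buffer_update(state, np.zeros(5), flush=1)                 # cleared
assert state.sum() == 0
```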

7 Memorization of a phrase

Figure 7 represents the architecture used to memorize a phrase permanently. MemPh (memorized phrase) is an SSM used to map all phrases memorized by the system. Every time a new input phrase is acquired by the system, it is mapped to a neuron of MemPh. The neuron update mechanism will be described in Sect. 10. The memorization of a new phrase is triggered by the gatekeeper neuron NewMemPhFlag. When this flag is on, the phrase index in MemPh is copied to RemPhIL (remembered-phrase input layer) and then to RemPh (remembered phrase). RemPh is an associative SSM. The build process is controlled by NewMemPhFlag: when this neuron is on, the link weights from the active neuron of RemPh to WkPhB are updated through the discrete Hebbian learning rule. In this way, the association between the phrase index in MemPh and the phrase is stored permanently in the link weights.

8 Memorization and retrieval of the association between a word group and a phrase
The association between the word group in WGB and the phrase in WkPhB is memorized permanently by the system using the architecture shown in Fig. 8. RemPhfWG is an associative SSM, with input links fully connected to WGB and output links fully connected to RemPh. The build process is controlled by the gatekeeper neuron BuildAs: when this flag is on, a neuron of RemPhfWG is switched to a high-level state through the winner-take-all (WTA) rule, and both the weights of the input links from WGB to this neuron and the weights of the output links from this neuron to RemPh are updated through the discrete Hebbian learning rule (DHL rule). In this way, the association between the current content of WGB and the current content of RemPh is permanently memorized by the system.
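The following sketch covers this build step together with the retrieve step described in the next paragraph (a functional simplification: one neuron per stored association, selected here by WTA over randomly initialized weights):

```python
import numpy as np

class AssociativeSSM:
    """Minimal sketch of an associative SSM such as RemPhfWG: the build
    step binds an input pattern and an output pattern to one neuron
    (WTA + DHL); the retrieve step selects the stored neuron that best
    matches the input and forces its output pattern."""

    def __init__(self, n_neurons, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.uniform(-0.1, 0.1, (n_neurons, n_in))
        self.w_out = np.zeros((n_out, n_neurons))

    def build(self, x, y):
        winner = int(np.argmax(self.w_in @ x))          # WTA
        self.w_in[winner] = np.where(x > 0, 1, -1)      # DHL, input links
        self.w_out[:, winner] = np.where(y > 0, 1, -1)  # DHL, output links
        return winner

    def retrieve(self, x):
        winner = int(np.argmax(self.w_in @ x))          # best match
        return (self.w_out[:, winner] > 0).astype(float)  # forcing links

mem = AssociativeSSM(50, n_in=6, n_out=6)
wg = np.array([1, 0, 1, 0, 0, 0.])     # word group
ph = np.array([0, 1, 0, 0, 1, 0.])     # associated phrase-index pattern
mem.build(wg, ph)
assert mem.retrieve(wg).tolist() == ph.tolist()
```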
The retrieve process is triggered by the gatekeeper neuron RetrAs. The word group in WGB is sent as input to RemPhfWG. The neurons whose link weights match the word group will have the highest activation state, and a single winner is selected among them through the WTA rule. Through its forcing output links, the winner neuron sets the activation state of RemPh, whose forcing links in turn restore the associated phrase in WkPhB (Sect. 7).

9 Phrase index and word-group index update

Figure 9 represents the architecture used to update the phrase index. This process is controlled by the two gatekeeper neurons CurrPhIFlag (current phrase-index flag) and NextPhIFlag (next phrase-index flag). These neurons are mutually exclusive: when one is in a high-level state, the other one must necessarily be in a low-level state. When CurrPhIFlag is on, the current phrase index is copied from PhI to CurrPhI, and then it is copied back to PhI. In this way, PhI can hold the current phrase index. On the other hand, when NextPhIFlag is on, the current phrase index is copied from PhI to NextPhI. The connection from NextPhI to PhI is a single connection displaced by one neuron, i.e. each neuron of NextPhI is connected to the neuron of PhI located in the following position. In this way, when NextPhIFlag is on, PhI will be updated to the next phrase index. If neither CurrPhI nor NextPhI is selected, PhI will be cleared. The links from NextPhIFlag and NextPhI to the first neuron of PhI are used to initialize the phrase index in PhI to 1 when PhI is initially empty and NextPhI is selected.
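The displaced-by-one connection amounts to a shift of the index pattern; a minimal sketch, with the initialization link modeled as a condition on an empty PhI:

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)

def next_phrase_index(phi, bias=-0.5):
    """NextPhI -> PhI connection displaced by one neuron: neuron i of
    NextPhI excites neuron i+1 of PhI, so the active index moves one
    position forward; an extra link to the first neuron (effective only
    when PhI is empty) initializes the index to 1."""
    shifted = np.roll(phi, 1)
    shifted[0] = 1.0 if phi.sum() == 0 else 0.0   # initialization link
    return step(shifted + bias)

phi = np.zeros(5)
phi = next_phrase_index(phi)       # index initialized to 1
phi = next_phrase_index(phi)       # index advanced to 2
print(np.flatnonzero(phi))         # [1], i.e. the second neuron is on
```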
The word-group index is updated using the same type of architecture, as shown in Fig. 10. The update is controlled by the two gatekeeper neurons CurrWGIFlag (current word-group-index flag) and NextWGIFlag (next word-group-index flag). When CurrWGIFlag is on, WGI holds the current word-group index. On the other hand, when NextWGIFlag is on, WGI is updated to the next word-group index. If neither CurrWGI nor NextWGI is selected, WGI will be cleared.

10 Memorized phrase and remembered phrase index update

The architecture used to update the index for phrase memorization, shown in Fig. 11, is quite similar to that used for the phrase-index update described previously. The update is controlled by the gatekeeper neuron NewMemPhFlag. When NewMemPhFlag is off, MemPh holds the current memorized-phrase index. On the other hand, when NewMemPhFlag is on, MemPh is updated to the next memorized-phrase index.
The system is also able to retrieve the memorized phrases sequentially. After a memorized phrase is remembered, for instance through the association mechanism described previously, the system can retrieve the next memorized phrase through the architecture represented in Fig. 12. When the gatekeeper neuron CurrRemPhFlag (current remembered-phrase flag) is on, the system holds the memorized-phrase index stored in RemPh. On the other hand, when the gatekeeper neuron NextRemPhFlag is on, RemPh will be updated to the next memorized-phrase index.

11 Equal-words vectors
The system is able to recognize whether a phrase buffer contains words that are equal to the word currently stored in CW, or to the words stored in the word-group buffer WGB. The equality between words is represented by binary vectors. For instance, the equality between the words in the input-phrase buffer InPhB and the word in CW is represented by the state of the neurons in the input-equal-words (InEqW) SSM. If the i-th word of the input-phrase buffer is equal to the word in CW, then the i-th neuron of InEqW will be on, otherwise it will be off. Figure 13 represents the architecture used to produce the equal-words vector InEqW.
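Since each row of a phrase buffer holds a one-hot word code, the equal-words vector reduces to a thresholded match between each row and CW, as in this sketch:

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)

def equal_words_vector(ph_buffer, cw, bias=-0.5):
    """Equal-words vector (e.g. InEqW): neuron i receives the row-i
    activation of the phrase buffer in the column selected by the
    current word CW, and turns on only when the two words coincide."""
    return step(ph_buffer @ cw + bias)

buf = np.zeros((4, 6))
buf[0, 2] = buf[1, 5] = buf[2, 2] = 1.0   # words at positions 1-3
cw = np.zeros(6)
cw[2] = 1.0                               # current word: word neuron 3
print(equal_words_vector(buf, cw))        # -> [1. 0. 1. 0.]
```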
In a similar way, the words in the phrase buffers are compared against the words in the word-group buffer. The current version of the system considers only the first four words of the word-group buffer.
This simplification is justified by the consideration that the word-group buffer should not contain too many words. For instance, the equality between the words in the input-phrase buffer InPhB and the first word in WGB is represented by the state of the neurons in the input-equal-word-group-word-1 (InEqWGW1) SSM. Figures 13 through 17 illustrate the architectures used to produce the equal-words vectors for the input-phrase buffer, the working-phrase buffer, the word-group buffer and the output buffer.

12 Previous working phrase, previous word group and previous equal-words vectors

Before extracting a word group from the working phrase, the system clears the current content of the word-group buffer. Similarly, before retrieving a phrase through the association mechanism, the system clears the current content of the working-phrase buffer. The learning ability of the system can be improved by preserving the information stored in the word-group buffer and in the working-phrase buffer before clearing them. This information is copied to dedicated SSMs, i.e. a previous-working-phrase buffer, a previous-word-group buffer and previous-equal-words vectors. Figure 18 shows the architecture used to copy the word group from the word-group buffer to the previous-word-group buffer before flushing it. The intermediate SSM PrevWGFL (previous-word-group front layer) is used to decouple the previous word group from the current word group, and to allow the copy only when a flush signal is sent to the word group. It is important to note that PrevWGFL must be updated before WGB, so that the content of WGB is copied before it is flushed. For the same reason, PrevWG must be updated before PrevWGFL.
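This update ordering, together with the flush sequencing described in the next paragraph, can be sketched as one synchronous step per call (our simplified gating):

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)

def system_update(prev_wg, prev_wgfl, wgb, flush_wg, bw=1e3):
    """One synchronous step honoring the ordering described in the text:
    PrevWG is updated before PrevWGFL, and PrevWGFL before WGB, so the
    front layer captures the word group on the very step it is flushed."""
    prev_wg = step(prev_wg + prev_wgfl - 0.5)          # latch the copy
    prev_wgfl = step(wgb - bw * (1 - flush_wg) - 0.5)  # open only on flush
    wgb = step(wgb - bw * flush_wg - 0.5)              # cleared on flush
    return prev_wg, prev_wgfl, wgb

wgb = np.array([0, 1, 1, 0.])                          # current word group
prev_wg = prev_wgfl = np.zeros(4)
prev_wg, prev_wgfl, wgb = system_update(prev_wg, prev_wgfl, wgb, flush_wg=1)
prev_wg, prev_wgfl, wgb = system_update(prev_wg, prev_wgfl, wgb, flush_wg=0)
assert wgb.sum() == 0 and prev_wg.tolist() == [0, 1, 1, 0]
```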
When a FlushWG signal is sent, the content of WGB is copied to PrevWGFL and PrevWG is cleared. In the next update, when the FlushWG signal is off, the content of PrevWGFL is copied to PrevWG, where it is stored thanks to the self links.

13 Set and retrieve the starting phrase of a context

The system is able to mark the starting phrase of a context and to retrieve it from any other phrase of the same context. Figure 22 shows how a memorized phrase is marked as starting phrase. When the gatekeeper neuron StartPhFlag is switched on, the memorized-phrase index is copied to StartPhIL (starting-phrase input layer) and then to StartPh (starting phrase). On the other hand, when StartPhFlag is off, StartPh holds its value thanks to the bidirectional connections to CurrStartPh (current starting phrase).
The system should be able to retrieve the starting phrase from any phrase of the same context. In order to do this, each memorized phrase must be associated to the corresponding starting phrase.

Figure 23 shows how this association is built. When a new phrase is memorized, the gatekeeper neuron NewMemPhFlag is on; the index of the starting phrase of the current context is copied from StartPh to StartWkPhIL (starting-working-phrase input layer) and then to StartWkPh (starting working phrase). During this stage RemPh holds the memorized-phrase index, because NewMemPhFlag is on, as shown in Fig. 7. The output links from the active neuron of RemPh to StartWkPh are updated by the discrete Hebbian learning rule. In this way the association between the current phrase and the starting phrase of the current context is stored permanently in the links.

When the gatekeeper neuron RetrRemPh is on, StartWkPh is retrieved thanks to the forcing links of RemPh, as shown in Fig. 23. On the other hand, when the action flag CurrStartWkPhFlag is on, StartWkPh holds its value thanks to the bidirectional connections to CurrStartWkPh (current starting working phrase).

Figure 24 illustrates how the starting phrase of the working-phrase context is retrieved. The retrieval is triggered by the gatekeeper neuron RemStartPhFlag (remember-starting-phrase flag). When this flag is on, the content of StartWkPh is copied to RemStartPh (remembered starting phrase) and then to RemPh. The starting phrase is thus retrieved in the working-phrase buffer thanks to the forcing output links from RemPh to WkPhB.

14 The goal stack
When the working phrase and/or the word group are associated to some important task that cannot be performed immediately, they can be inserted at the top of a goal stack through an architecture that is similar to that described in Sect. 3. This process is controlled by the gatekeeper neuron SetGoalFlag, as shown in Figs. 25 and 26. When this neuron is on, the working phrase and the word group are copied to the goal phrase and to the goal word group, respectively. The goal stack can hold simultaneously up to 10 phrases and word groups. The insertions and the extractions follow the LIFO (last in, first out) order. The architecture used for storing more phrases and word groups is similar to that described in Sect. 9. A goal index, stored in the SSM GoalI, can be increased or decreased through the gatekeeper neurons NextGoalIFlag and PrevGoalIFlag, respectively, as shown in Fig. 28.
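Functionally, the goal stack behaves as in the following sketch (the neural implementation uses the index architecture of Sect. 9 rather than explicit list operations):

```python
class GoalStack:
    """Functional sketch of the goal stack: up to 10 (phrase, word group)
    couples addressed by a goal index that NextGoalIFlag increases on a
    push and PrevGoalIFlag decreases on a drop (LIFO order)."""

    def __init__(self, depth=10):
        self.slots = [None] * depth
        self.goal_i = 0                        # GoalI: current stack level

    def push_goal(self, phrase, word_group):   # SetGoalFlag + NextGoalIFlag
        self.slots[self.goal_i] = (phrase, word_group)
        self.goal_i += 1

    def drop_goal(self):                       # DROP_GL + PrevGoalIFlag
        self.goal_i -= 1
        return self.slots[self.goal_i]

stack = GoalStack()
stack.push_goal("where is the ball", "ball")
stack.push_goal("what color is it", "color")
assert stack.drop_goal()[1] == "color"         # last in, first out
```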

15 Goal equal-words vectors
The system is able to recognize whether the working-phrase or the goal-phrase buffer contains words that are equal to the words stored in the goal word group GoalWG. Figures 30 and 31 illustrate the architectures used to produce the equal-words vectors, which are analogous to those described in Sect. 11.

16 Action and gatekeeper neurons

The way gatekeeper neurons are used in the architecture to control the system operations was described in the previous sections. Below is the list of the acquisition, elaboration and reward actions, with a short description of their effect.
Acquisition actions:
• FLUSH: clears the content of all phrase and word-group buffers;
• ACQ_W: acquires a word from the word-map front layer to the input-phrase buffer;
• NEXT_AS_W: copies a word from the working-phrase buffer to the word-group buffer and points to the next word;
• BUILD_AS: stores the association between the current word group in WGB and the current working phrase;
• MEM_PH: memorizes the current working phrase;
• SET_START_PH: sets the current working phrase as the starting phrase of the context.

Elaboration actions:
• NULL_ACT: null action;
• FLUSH_WG: clears the content of the word-group buffer;
• W_FROM_WK: initializes the phrase index (PhI) to zero, to prepare the extraction of words from the working-phrase buffer;
• W_FROM_IN: initializes the phrase index (PhI) to zero and copies the input phrase to the working-phrase buffer;
• NEXT_W: skips a word of the working-phrase buffer;
• GET_W: copies a word from the working-phrase buffer to the word-group buffer;
• RETR_AS: retrieves a phrase associated to the word group by the association mechanism;
• FLUSH_OUT: clears the content of the output buffer;
• WG_OUT: copies the content of the word-group buffer to the output buffer;
• GET_NEXT_PH: retrieves sequentially phrases belonging to the same context;
• GET_START_PH: retrieves the starting phrase of the same context as the working phrase;
• CONTINUE: labels the end of a state-action sequence that can receive a partial reward;
• DONE: labels the end of a state-action sequence that can receive a (conclusive) reward;
• PUSH_GL: copies the working phrase to the goal stack;
• DROP_GL: deletes a phrase from the goal stack;
• GET_GL_PH: copies a phrase from the goal stack to the working-phrase buffer;
• SNT_OUT: sends to the output the next part of the working phrase, after the current word, and all subsequent phrases of the same context until the end of the context itself.
Reward actions:
• NULL_RWD: null action;
• STORE_ST_A: stores the current state and action in the state-action memory and increases the state-action index;
• START_ST_A: resets the state-action index to the beginning of the state-action sequence;
• RETR_ST_A: retrieves the state and action from the state-action memory and increases the state-action index;
• RWD_ST_A: rewards the last action by a change in the link weights of the state-action association SSM;
• CHANGE_ST_A: stores the current state and action in the state-action memory without increasing the state-action index;
• GET_ST_A: retrieves the state and action from the state-action memory without increasing the state-action index;
• RETR_EL_A: uses the state-action association SSM to find the best action associated to the current state;
• STORE_SAI: stores the current state-action index in a buffer;
• RETR_SAI: retrieves the state-action index from the buffer where it was stored.

17 Memorization of the state-action sequence

When an exploration phase leads to a target output, the human interlocutor can use the interface to trigger a reward signal, which puts the system in the reward operating mode. In this mode the system retrieves the state-action sequence that yielded the reward, and memorizes the association between each state of the sequence and the corresponding action. The system can memorize up to 300 states and actions, which are indexed by the SSM StActI (state-action index). The architecture used to reset and to increase StActI, which is illustrated in Fig. 33, is analogous to that described in Sect. 9.
The state-action sequence is memorized and retrieved through the associative SSM StActMem, as shown in Fig. 34. When the gatekeeper neuron BuildStAct is on, the association between the index StActI and the state and action is memorized as a change in the output link weights of StActMem through the DHL rule. On the other hand, when the action flag neuron RetrStAct is on, the state and action corresponding to the index StActI are retrieved.
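Functionally, StActI and StActMem implement an indexed sequence memory, as in this sketch (list-based for illustration; the model stores the pairs in link weights):

```python
class StateActionMemory:
    """Functional sketch of StActMem: binds the index StActI to
    (state, action) pairs so a rewarded sequence can be replayed from
    its beginning (capacity of 300 as stated in the text)."""

    def __init__(self, capacity=300):
        self.mem = [None] * capacity
        self.st_act_i = 0                  # StActI: state-action index

    def store(self, state, action):        # STORE_ST_A
        self.mem[self.st_act_i] = (state, action)
        self.st_act_i += 1

    def start(self):                       # START_ST_A
        self.st_act_i = 0

    def retrieve(self):                    # RETR_ST_A
        pair = self.mem[self.st_act_i]
        self.st_act_i += 1
        return pair

mem = StateActionMemory()
mem.store("state-1", "GET_W")
mem.store("state-2", "WG_OUT")
mem.start()                                # replay from the beginning
assert mem.retrieve() == ("state-1", "GET_W")
```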

18 State-Action association
As discussed in the previous section, during the reward phase the state-action sequence that yielded the reward is retrieved. The association between each state of the sequence and the corresponding action is memorized through the associative SSM ElActfSt. This process is controlled by the gatekeeper neuron BuildElActfSt, as shown in Fig. 35. When this neuron is on, the system state is mapped to a previously unused neuron of ElActfSt through the WTA rule, and both the links from the state to the winner neuron of ElActfSt and the links from this neuron to ElAct are updated through the DHL rule.

During the exploitation phase, ElActfSt associates an elaboration action to the system state. This process is triggered by the gatekeeper neuron RetrElActfSt. ElActfSt is updated through the k-winner-take-all (kWTA) rule, while ElAct is updated through the (single-winner) WTA rule. In this way, ElAct selects a single action, the one that is most represented among the outputs of the k winner neurons of ElActfSt.
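A sketch of this exploitation step: kWTA selects the k neurons of ElActfSt that best match the state, their output links pool votes over actions, and ElAct keeps the single most supported one (k and the training loop below are illustrative):

```python
import numpy as np

def select_action(w_state, w_action, state, k=5):
    """Exploitation step for ElActfSt: kWTA picks the k neurons whose
    input links best match the current state; their output links vote,
    and ElAct keeps the single most supported action (one-winner WTA)."""
    activation = w_state @ state
    winners = np.argsort(activation)[-k:]        # kWTA rule
    votes = w_action[:, winners].sum(axis=1)     # pooled output links
    return int(np.argmax(votes))                 # WTA over ElAct

rng = np.random.default_rng(1)
w_state = rng.uniform(-0.1, 0.1, (40, 12))       # 40 neurons, 12 state bits
w_action = np.zeros((6, 40))                     # 6 elaboration actions
state = (rng.random(12) > 0.5).astype(float)
for n in range(5):                               # pretend 5 rewarded builds
    w_state[n] = np.where(state > 0, 1.0, -1.0)  # DHL on input links
    w_action[3, n] = 1.0                         # all bound to action 3
assert select_action(w_state, w_action, state) == 3
```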
19 Operating-modes subnetworks

The high-level pseudocode, the low-level pseudocode and the block diagram of each of the five operating modes are shown below.

Acquisition
High-level pseudocode: