Application of Hierarchical Dissociated Neural Network in Closed-Loop Hybrid System Integrating Biological and Mechanical Intelligence

Neural networks are considered the origin of intelligence in organisms. In this paper, a new design of an intelligent system merging biological intelligence with artificial intelligence was created. It was based on a neural controller bidirectionally connected to an actual mobile robot to implement a novel vehicle. Two types of experimental preparations were utilized as the neural controller including ‘random’ and ‘4Q’ (cultured neurons artificially divided into four interconnected parts) neural network. Compared to the random cultures, the ‘4Q’ cultures presented absolutely different activities, and the robot controlled by the ‘4Q’ network presented better capabilities in search tasks. Our results showed that neural cultures could be successfully employed to control an artificial agent; the robot performed better and better with the stimulus because of the short-term plasticity. A new framework is provided to investigate the bidirectional biological-artificial interface and develop new strategies for a future intelligent system using these simplified model systems.


Introduction
Although artificial intelligence has been investigated since 1960s, many problems associated with recognition, cognition, evolution, learning and memory still remain unsolved. Lack of plastic developmental phase was one of the reasons cited. However, this process has been discovered in the biological nervous system [1][2][3][4], with indications of positive performance [5][6][7]. The plastic developmental phase can be observed within a wide range of different temporal [8][9][10][11][12] and spatial [13][14][15] scales in biological systems. Therefore, differential plasticity has resulted in the development of different biological-mechanical hybrid systems in order to create more powerful and effective intelligence systems addressing this issue.
In the recent past, several different hybrid systems have been developed, consisting of human brains [16][17][18], living animal nervous systems [19][20][21], and dissociated neural cultures [22][23][24][25][26] coupled to different robot systems. We focused our attention on robot systems coupled to the dissociated neural cultures because not only can robots be perfectly controlled to execute the task, but also the principles of information transfer within the biological neural network can be easily observed in these simple preparations.
Previous neuro-robot studies have shown that dissociated neurons were generally cultured by a polydimethylsiloxane (PDMS) ring to grow them into an isolated random network, which has always limited the abilities to process the complex information input. Several types of dissociated neural networks, which were cultured with a unique PDMS shape in order to develop a multilevel structure to address this limitation [27], were discussed in previous works. These neural cultures were employed to study persistent activities of the neural network [28], connectivity among neurons [29] and construct a new 'logic' device [30]. These primeval investigations suggested a potential for hierarchical structure of the neural cultures to make decisions under complex conditions using the neuro-robot system.
The use of neuro-robot hybrid systems is limited. Although, the modular neural network with two parts interconnected has been employed to control a virtual robot in previous works [31], performance of the idealized robot persisted at low levels (many hits). The differences embodied in the robot's performance (number of hits) between random and modular cultures remained insignificant. In Warwick's work [22], the neural networks were connected to the actual robot to execute an 'avoiding wall' task, which was accomplished at a low performance level (one information input, 67% turning accuracy, and chance level 50%). Other works [24] utilizing the actual robot did not allow the robot execute any task, and resulted in 42.86% correct turning in both directions (chance level 25%). Earlier works, including the Porter group [25], used a hybrid neural-algorithmic controller rather than a neural-centered type.
In order to develop a more effective neural controller that merged the biological intelligence with the mechanical intelligence, first we built a closed-loop hybrid robot system by interfacing a differential mobile robot with a population of dissociated neurons that was cultured on Micro Electrode Arrays (MEA). Subsequently, a PDMS template was developed to culture the '4Q' hierarchical neural network, which contained a more complex hierarchical construction than previous networks. It resulted in a significant improvement in robot control compared with a random neural network. The differences embodied in the neural networks' functions and robot's performance between random and '4Q' cultures were significant and will be detailed in this paper, respectively. The robot system was able to complete the searching object's task via the control of the dissociated neural network, and it eventually achieved percentages of correct turning at over 80% in both directions (chance level 25%) within the '4Q' cultures. Surprisingly, without tetanus training, the performance of the robot system improved with the stimulus period in the experiments, providing possible evidence of plasticity change in the neural assemblies following feedback from an external environment.

Ethics Statement
This study was carried out in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals. The animal experimental facilities were approved by the Anhui Provincial Department of Science and Technology [Approval ID: SYXK (Anhui) 2005-004]. All protocols involving animal work were approved by the Ethics Committee of Animal Experiments of the University of Science and Technology of China (Permit Number: USTCACUC1201051).
All surgery was performed under anesthesia. The adult pregnant rats were anesthetized with 99% carbon dioxide. After death by cervical dislocation, we removed the embryo from the uterus of the pregnant rat. The neurons were directly extracted from the hippocampus of embryo. All efforts were made to minimize animal suffering.

Culture preparation
Low dense cultures of embryonic rat hippocampus were prepared as previously described [32] with minor modifications. Briefly, hippocampi were removed from embryonic day 17-18 (E17-18) rats and were treated with trypsin for 15 min at 37°C, followed by washing and gentle trituration. Dissociated cells were plated on MEA chips at a density of 50-80 cells per mm 2 . MEA chips were coated with poly-L-lysine to facilitate cell adherence. Neurons growing near the electrodes formed a relatively isolated island using PDMS rings. The culture medium comprised DMEM (BioWhittaker) supplemented with 10% heat-inactivated bovine calf serum (HyClone), 10% Ham's F-12 with glutamine (Bio-Whittaker), 50 units ml penicillin streptomycin (Sigma), and 1×B-27 supplement (Invitrogen Gibco). Twenty-four hours after plating, onethird of the culture medium was replaced with a fresh medium supplemented with 20 mM KCl. Cytosine arabinoside (Sigma) was added to the culture dish (final concentration, 5 μM) around 7-10 days in vitro (div) to prevent overgrowth of glial cells. Cultures of 14 to 20 days were used in the experiments.

Neuro-robot architecture
The overall closed-loop system consisted of two parts including the main PC and slave PC components (Fig 1). Framework of the neuro-robot system. Two parts are included in this hybrid system:(i) Main PC components, employed to run the developed software tools, comprising the stimulator, the MEA coupled to live neurons, the amplification, and the data acquisition card; (ii) Slave PC, hosting the robot control software to command the robot system. The main PC communicates with the slave PC by a mean of network port running the TCP/IP protocol.
The hardware of main PC included the MEA device, where electrical signal was recorded by a MEA1060-Inv-BC amplification system (Multichannel Systems, MCS, Reutlingen, Germany). The home-made stimulating hardware (illustrated in Fig 2A) provided electrical stimulation using the arbitrary waveform to 60 electrodes on the MEA in real time. It was a workstation directly linked to a DAQ data acquisition card for conducting computationally expensive neural data analysis. A pair-pulse electrical stimulation with an interval time of 50 milliseconds and an amplitude ±230 mv (negative first, duration was 1.5 milliseconds) was provided to neural network to serve as coding information input.
The software of the main PC component was modified from the toolbox written by Daniel Wagenaar (http://www.danielwagenaar.net/res/software/meabench/) and it ran on the directly linked workstation in a Linux operating system. This modified software, which retained all the functions of the toolbox, also added new functions to control all the parameters of the stimulation and performed the required data processing, such as communication with the robot control system, the implementation of the coding/decoding, spike detection, artifact filter, and short-term plasticity schemes.
The hardware of the slave PC consisted of a separate workstation running the robot control interface along with a robot system that included a two-wheel differential robot and a camera. Software installed in the separate workstation was programmed into a GUI interface to control the robot. As illustrated in Fig 2B, this GUI monitored the position of the robot and objects in real time, controlled the motion of the robot manually, and displayed the data exchanged between the robot and the cultured neurons. Various components of architecture were communicated via TCP/IP sockets. This design allowed distributed processing of the computational loads via multiple machines throughout the internal network.
'4Q' culture An SU-8 mold was utilized to create a mask in order to culture the '4Q' neural network hierarchy. The SU-8 mold was fabricated by UV lithography (Fig 3A). It consisted of a two-layered SU-8 photoresist (MicroChem). The first layer SU-8 3005 photoresist was spin-coated on the cleaned Si wafer to a thickness of 10μm. The coated wafer was baked on the hot plate at a temperature of 95°C for 2min (soft bake). The SU-8 3005 photoresist was exposed by UV light under the photo mask containing a microchannel. After UV exposure, the coated wafer was baked on the hot plate at a temperature of 95°C for 2min (after exposure bake). Next the exposed SU-8 photoresist was developed by a SU-8 developer. After finishing the first layer, SU-8 2150 was spin-coated as a 150-μm-thick second layer and soft baked at 95°C for 30min. The chamber photo mask was aligned to the microchannel patterned on the wafer followed by UV exposure, and baked again at 95°C for 15min. Finally, the SU-8 2150 wafer was created by the SU-8 developer.
The mixture of PDMS pre-polymer and curing agent (10: 1 (w/w), Dow Corning Corp) was poured on the SU-8 mold. PDMS was cast on the SU-8 mold and cured on the hot plate at 80°C for 40min. After curing, the PDMS device was removed from the SU-8 mold and attached to the MEA platform (Fig 3B and 3C). Subsequently, low density cultures of embryonic rat hippocampus were prepared to produce the dissociated '4Q' neural network (Fig 4A and 4B).

Experiment protocol
The typical experimental protocol included the following steps: 1. The dissociated neural cultures were transferred from the culture medium to the extracellular solution (ESC).
2. The I/O of areas was selected by supplying a stimulus to a set of electrodes.
3. The robot was run for 15 minutes.
4. After resting for 2 to 4min, some of the robot parameters were optimized. During the first step, the spontaneous activities of the neural network were reduced by changing the culture medium to the ECS. During the second step of the experiment, the stimulus described above was delivered to each of the selected candidate electrodes every 30 seconds until at least two electrodes were identified to evoke the network firing. Empirically, two or more electrodes would be selected as the 'input areas' based on the difference in activities evoked by them, respectively. Areas that responded the most to stimulation from the corresponding input area were selected as the 'output areas', implying that one electrode/input area corresponded to one 'output areas'. Every time one 'output areas' was confirmed, the remaining areas of neural network were defined as the corresponding 'other areas'.

The step 3 and step 4 were repeated at least twice
During step 3 and 4, the robot commanded by neurons moved freely from any corner of the arena to the objects in the middle of the arena. At least a 20-second interval was ensured between two adjacent stimulations due to the existence of the refractory period of the neurons. The trial was not stopped if the robot reached the objects, however, the robot would be resettled to the start position and the trial continued. Each trial ended after a 15-minute operation. This step was repeated at least twice in view of the hypothesis that the neural network dynamically modulated its activities to respond stably to external input along with the data constantly delivered to the neural network [33]. We hoped the hypothesis would improve the robot's performance. Finally, all the trajectories of the robot's movements and activities of the dissociated cultures were recorded in the hard disk.

Statistical analysis
We used a total of 14 different cultures, ranging from 12 to 20 DIV (days in vitro), including 6 random hippocampal cultures. The other 8 experiments used the hippocampal cultures, divided into four interconnected quadrants by a confinement device, as described above.
The Kolmogorov-Smirnov test (K-S test or KS test) was employed to assess the significant differences among diverse experimental conditions. The K-S test was used to evaluate the differences between two groups of independent samples. This non-parametric test was performed with a MATLAB program and P values of < 0.05 were considered significant.

Dissociated neural network with a hierarchical structure
As illustrated in Fig 4A and 4B, the neurons were cultured on the preparative MEA ( Fig 3C) to develop neural networks named '4Q' cultures, with a hierarchical structure. The functional activities could be obtained by the MEA system from 14 DIV. Spontaneous activities were always observed synchronously throughout the network since the 4 quadrants were interconnected with the microchannels on the mask (Fig 4C). However, the refined hierarchy of these cultures was activated under stimulation. This discovery will be further described in detailed in next section.

Network dynamics in the two neural networks
An electric stimulation with an amplitude of ±230 millivolts was applied to the random and '4Q'cultures to evoke the activities of these neural networks. The neuron burst firing was also observed without any stimulation and lasted several seconds throughout the neural network. In both types of cultures, all the electrodes surrounding the growing neurons recorded their activities in the spontaneous condition of neural network (top panels of Fig 5A and 5B). During the spontaneous activity, differences in firing based on the hierarchical structure were easily observed. Most of the neurons in the neural network fired during every burst firing in random . In A and B, the activity of 59 electrodes is displayed. Each small vertical bar represents a spike. Each line represents an electrode. C. Box-plot of the evoked spikes' firing rate in the 'output areas' and the 'other areas' compartment with respect to stimulating electrodes in the random cultures. In the random network, No statistical differences were noted with 229 electrical stimulations in 6 cultures. D. Box-plot of the evoked spike firing rate in the 'output areas' and the 'other areas' compartment with respect to stimulating electrodes in the '4Q' cultures. In the '4Q' network, the evoked spike firing rate is statistically higher for the 'output areas' compartment; 257 electrical stimulations in 6 cultures. E. Box-plot of the percentage of redundant electrodes in the two stimulations (total of 20 datasets was calculated. Ten of those datasets were acquired from 6 random cultures, and others were from 8 '4Q' cultures). Box range: percentile 25-75; box whiskers: percentile 5-95; line: median. Two sample Kolmogorov-Smirnov Z test for nonparametric test, significant differences are indicated by asterisks, and the significance level = *P < 0.05. cultures, but often only part of neurons in the neural network fired in '4Q' cultures. Most of the electrodes recording the spontaneous activities showed a response to the stimulation in the random culture, but only some of these electrodes responded in the '4Q' cultures because of the hierarchical cultures (bottom panels of Fig 5A and 5B). As illustrated in Fig 6, this response specificity was increasingly apparent in the '4Q' cultures than the random cultures. The network in the '4Q'culture employed various neural circuits to respond to different stimuli, but nearly the same circuit was used to respond to different stimuli in the random cultures.
To further assess the differences in response to the stimulation in these two kinds of culture, we calculated a ratio, by dividing the number of electrodes responding simultaneously to different stimuli with the number of total electrodes responding to the stimuli in one culture. A smaller ratio suggested a greater probability of different responses evoked by the neural network to different stimuli. We obtained the ratios from six random cultures and eight '4Q' cultures (Fig 5E), and a total of 10 ratios from the random cultures were compared to 10 ratios from the '4Q'cultures to evaluate their differences. As shown in Fig 5E, ratios of the '4Q' cultures were significantly smaller than those of the random cultures (P < 0.05, Kolmogorov-Smirnov Z test). It must be emphasized that the response specificity of the '4Q' cultures was stronger than that of the random cultures.
The response specificity was presented more easily in the '4Q' cultures than in the random culture. The evoked firing rates in the corresponding 'output areas' and 'other areas' were calculated, respectively, and were also compared with each other in both types of cultures. As illustrated in Fig 5D, the firing rate of the 'output areas' was significantly larger than the one of the 'other areas' in the '4Q' culture (P < 0.05, Kolmogorov-Smirnov Z test). However, this result was not observed in the random cultures ( Fig 5C). As previously mentioned, compared to previous random cultures, we demonstrated that the '4Q'cultures had completely different properties in signal propagation. This difference resulted in a significant improvement in the performance of the neuro-robot hybrid systems.

Comparison of the performance of robot systems with different neural controllers
A closed-loop robot control was implemented in both types of cultures. All the parameters associated with robot movement and activities of the neural network have been recorded experimentally. Consistent with the results above (Figs 5 and 6), the evoked activities showed a strong overlap regardless of the position of the stimulating electrodes in the random cultures. The percentages represented by the number of evoked electrodes divided by recording electrodes in the corresponding 'output areas' were 58%±22.3% for the random cultures and 62% ±20.1% for the '4Q' cultures. These percentages indicated that the movement of the robot was controlled under the correct evoked activities of the neural network. The results comparing the performance of the robot controlled by the random and '4Q' cultures are illustrated in Fig 7C  and 7D. Time spent on finding the objects and correct turning percentage was employed to evaluate the performance of the robot systems. As shown in Fig 7C and 7D, the time spent on finding the objects was less (Fig 7D) and the number of correct turns of the robot controlled by the '4Q' culture ( Fig 7C) was larger than in the random culture. These results further suggested that the hybrid system with the hierarchical cultures yielded a better performance. As  Fig 7A and 7B, the hybrid system in the closed-loop robot controlled by the '4Q'culture could reach the objects within 400 seconds and performed 7 turns, which were all correctly executed by the robot. This result demonstrated that the performance of the neurorobot system has been powerfully improved by the neural network constructed with a complex hierarchy.
Plasticity of the biological neural network in the neuro-robotic system Although the robot reach the objects under the control of the '4Q' cultures during the first phase, errors still occurred and intermittently the robot took more time to reach the objects, especially in the random cultures. To address this issue, a strategy previously described in the experimental protocol was employed by providing the plasticity of the biological intelligence in the dissociated culture to the hybrid system. As illustrated in Fig 8A, the performance of the robot controlled by the '4Q' culture was improved when the experimental process was employed. In this example, time spent on finding the objects changed from 360 seconds, 316 seconds, to 279 seconds, and the correct turning ratios during this time were 67%, 75%, and 100%, respectively. The same performance of the robot was further assessed in Fig 8B and 8C. The graphs in Fig 8B showed the changes in time spent finding the objects in our hybrid system. However, the expected reduction due to the plasticity change was only observed in the '4Q' cultures. A statistically significant difference was occurred between the first 15 minutes and the last 15 minutes (P < 0.05, Kolmogorov-Smirnov Z test). Another parameter evaluating the performance of the robot is illustrated in Fig 8C. The obvious increase of the correct ratios in two directions along with the experimental process was observed in the '4Q' cultures, but in only one in the random cultures. Interestingly, the correct ratios in two directions were counted, suggesting that the '4Q' cultures simultaneously modulated themselves to adapt to two different information inputs (In fact, we obtained data, where the '4Q' cultures only dynamically adapted to one of the two inputs like the random cultures). These preliminary analyses of changes in our hybrid system showed that the dissociated neural networks were able to dynamically 'recognize' the different stimulation inputs to strengthen the corresponding connections that were employed to transmit information by the '4Q' culture. Statistically significant difference in the correct turning ratio was observed between the first 15 minutes and the last 15 minutes (P < 0.05, Kolmogorov-Smirnov Z test). The increase in correct turning ratio in both directions (along with reduced time) was present in some random cultures, but the fact that more cultures changed in one or none of the directions impacted the final statistical results. Therefore, despite the occurrence of plasticity in the random cultures, it was not adequate to further improve the performance of the robotic system due to inherent limitation of the structure of the random cultures.

Discussion
We have provided evidence that the employment of hierarchical cultures creates different neural circles and greatly reduces the appearance of network-wide synchronized bursts (Figs 5 and 6), resulting in significant improvement in the robot's performance (Fig 7). Interestingly, the hierarchical network contributed to the plasticity of neural network even without tetanus training (Fig 8). These results prompted us to increase the hierarchy of the network in an attempt to enable the dissociated cultures to process more complex behavioral information.
In our experiment, the robot's movement was controlled by cultures via electrical stimulation that lasted about one hour. These short one-hour simulations were ideal for improving the robot's performance by inducing plasticity change of the neural network. Based on the hierarchical culture of dissociated neurons, it is possible to stimulate initially during the beginning of neural network formation to functionally define the neurons, but not physically distinct subpopulations neurons. These stimulations will be used in future experiments within the same culture.
Though the plasticity developmental phase was eventually employed in our hybrid system, it took nearly one hour for a satisfying performance. In previous studies, a training strategy utilizing the tetanic and spatiotemporal stimulations was used and demonstrated effective improvements of the robot's performance [23,26,31]. Therefore, in order to effectively evoke the plasticity of the cultures, a more productive learning scheme needs to be designed. The spiketiming dependent plasticity (STDP) learning mechanism detailed in [34] is a promising candidate. The application of this training phase will provide a useful method to investigate the mechanism of synaptic transmission and receptors involved in the process of adaptation and learning depending on specific stimulation protocols.
Our long-term research objective is to exploit neural plasticity in more complex hybrid systems to perform closed-loop experiments that measure the effective use of biological, computational and learning principles even with relatively simple neural preparations. The neuro-robot framework is also a valid tool for the study of mechanisms of neural coding, neural computation, self-organization and reinforced learning. This kind of architecture is a promising foundation that can be employed in the design of neural prosthetics, and can create fundamentally different types of artificial or hybrid intelligence.

Conclusion
In this paper, issues related to bi-directional closed-loop interface involving cultures with more complex hierarchy than previously reported were introduced for the first time. Statistical comparisons obtained from the '4Q' and random cultures were calculated to highlight the advantages of modular cultures and neural assemblies that are relevant to the performance of the actual artificial agent. The self-adaptive plasticity of the 'biological brain' in a robot system was observed and employed in our hybrid system. Finally, we successfully connected both types of dissociated neural networks originating in the hippocampus of embryonic rats to a mobile robot system bidirectionally. However, we found that the robot system controlled by the hierarchical neural network was more likely to adapt to the plasticity change evoked by stimulation, and was able to execute its task of finding the object at a higher performance level.
Supporting Information S1 Text. A copy of this manuscript showing the changes. (DOC)