Basic reversal-learning capacity in flies suggests rudiments of complex cognition

The most basic models of learning are reinforcement learning models (for instance, classical and operant conditioning) that posit a constant learning rate; however, many animals change their learning rates with experience. This process is sometimes studied by reversing an existing association between cues and rewards and measuring the rate of relearning. Augmented reversal-learning, where learning rates increase with practice, can be an important component of behavioral flexibility and may provide insight into higher cognition. Previous studies of reversal-learning in Drosophila have not measured learning rates, but have tended to focus on gross deficits in reversal-learning, measured as the ratio of two timepoints. These studies have uncovered a diversity of mechanisms underlying reversal-learning, but natural genetic variation in this trait has yet to be assessed. We conducted a reversal-learning regime on a diverse panel of Drosophila melanogaster genotypes. We found highly significant genetic variation in their baseline ability to learn. We also found a consistent and strong (1.3×) increase in their learning speed with reversal. We found no evidence, however, of genetic variation in their ability to increase their learning rates with experience. This may suggest that Drosophila have a hitherto unrecognized ability to integrate acquired information and improve their decision making, but that their mechanisms for doing so are under strong constraints.

One way to estimate learning rate is to calculate the midpoint of the learning curve, x̄ (Eq 4). To capture the departure of reversal-learning from the baseline RW model, we can estimate the change in x̄ in subsequent, reversed, association tasks. If x̄ does not change significantly from one reversal to another, the underlying model of learning cannot be distinguished from the basic RW model. If x̄ increases between learning events, we may infer that the learning rate has decreased, and that there is some kind of interference between earlier and later associations. If x̄ decreases between learning events, we may infer that the learning rate has increased; that is, that animals are learning to be better learners.
For simplicity, we will denote the difference between the initial belief, a_0, and the maximum belief, ū, as λ = ū − a_0. The midpoint occurs where w_x = λ/2. Solving for x̄ gives Eq 2. Note that the term λ cancels out; thus the midpoint is independent of the difference between the lower and upper asymptotes, and depends solely on k, i.e. the salience and associability of the cue. This property makes the midpoint a very convenient metric of learning rate.
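The cancellation of λ can be checked numerically with a minimal sketch. The sketch assumes, purely for illustration, that the learning curve follows a continuous exponential approach to the asymptote, w(x) = a_0 + λ(1 − e^(−kx)); this functional form and the function names `belief`, `midpoint` and `rate_from_midpoint` are assumptions and are not taken from the equations of this paper. Under that assumed form the midpoint is ln(2)/k, whatever the values of a_0 and ū.

```python
import math


def belief(x, k, a0, u_max):
    """Belief after x units of training.

    ASSUMPTION: exponential approach from a0 toward u_max at rate k;
    the paper's own RW equation may be written differently.
    """
    lam = u_max - a0  # lambda: distance between the two asymptotes
    return a0 + lam * (1.0 - math.exp(-k * x))


def midpoint(k):
    """Training time at which half of lambda has been covered (assumed form)."""
    return math.log(2.0) / k


def rate_from_midpoint(x_mid):
    """Invert the relationship: a smaller midpoint implies a larger rate k."""
    return math.log(2.0) / x_mid


# The midpoint depends on k only: changing the asymptotes leaves it unchanged.
k = 0.5
for a0, u_max in [(0.0, 1.0), (0.2, 0.9), (-1.0, 3.0)]:
    halfway_value = a0 + (u_max - a0) / 2.0
    gap = abs(belief(midpoint(k), k, a0, u_max) - halfway_value)
    print(f"a0={a0}, u_max={u_max}, gap at midpoint={gap:.2e}")
```

In this sketch a decrease in the midpoint between learning events corresponds to an increase in k, which matches the interpretation given above.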
If we solve for k, to try and understand the relationship between learning rate and midpoint, we see:

Midpoint estimation
The midpoints were estimated as the halfway point between the first local minimum and the global maximum of the loess curve (Fig A1). Visual inspection showed that these estimates fit well relative to the distribution of the data. We estimated 241 learning midpoints in total and examined all replicates for outliers; for illustration, we show 8 graphs drawn at random from all the time periods and all the trials. In all cases, the points fall within the bounds of the graph, in a region where fly preference is rapidly moving to the maximum. These heuristic estimates were far more robust than maximum likelihood estimates obtained using nonlinear regression. Attempts to estimate midpoints using sigmoid regression (for instance, using the package nplr) were fragile, with high rates of non-convergence, even on aggregated data, and unreasonable estimates ranging from -5 to 45 log-minutes. Across all trials, within each genotype, the estimated value of x̄, in log-minutes, demonstrates a great deal of inter-genomic variation (Fig. A2), as well as distinctly lower values for reversal learning. Analysis of the significance of genotype, period and day effects is presented in the Results.
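The heuristic can be sketched as follows. This is an illustrative reading, not the code used for the analysis: it assumes the curve has already been smoothed (for instance by loess), and it interprets the "halfway point" as the time at which the smoothed curve first crosses the value halfway between the first local minimum and the global maximum, using linear interpolation between grid points. The function name `heuristic_midpoint` and its arguments are hypothetical.

```python
def heuristic_midpoint(times, smoothed):
    """Estimate the learning midpoint from an already-smoothed preference curve.

    ASSUMPTIONS (not from the paper): `smoothed` holds fitted values on the
    grid `times`; the midpoint is the first time, after the first local
    minimum, at which the curve reaches the value halfway between that
    minimum and the global maximum.
    """
    if len(times) != len(smoothed) or len(smoothed) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")

    # First interior local minimum; fall back to the first point if none exists.
    i_min = 0
    for i in range(1, len(smoothed) - 1):
        if smoothed[i] <= smoothed[i - 1] and smoothed[i] < smoothed[i + 1]:
            i_min = i
            break

    # Index of the global maximum (first occurrence if there are ties).
    i_max = max(range(len(smoothed)), key=lambda i: smoothed[i])
    if i_max <= i_min:
        raise ValueError("global maximum does not follow the first local minimum")

    half = 0.5 * (smoothed[i_min] + smoothed[i_max])

    # Walk up the rising limb and interpolate linearly at the first crossing.
    for j in range(i_min + 1, i_max + 1):
        if smoothed[j] >= half:
            t0, t1 = times[j - 1], times[j]
            y0, y1 = smoothed[j - 1], smoothed[j]
            return t0 + (half - y0) * (t1 - t0) / (y1 - y0)
    raise ValueError("no crossing found")  # not reachable when i_max > i_min


# Toy curve: an initial dip followed by a rise in preference.
toy_times = [0, 1, 2, 3, 4, 5]
toy_curve = [0.5, 0.3, 0.2, 0.4, 0.8, 0.9]
print(heuristic_midpoint(toy_times, toy_curve))
```

Because the estimate only requires a minimum, a maximum and one crossing, it cannot fail to converge in the way an iterative sigmoid fit can, although it does depend on the degree of smoothing applied beforehand.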

Experimental design, and cross information
The Cosmopolitan recurrent F1 genotypes that we used (here designated A, B, and C) are genotypes we have used previously [1]. They were constructed as the following maternal×paternal crosses, where the numbers are the accession numbers of the Drosophila Genetic Reference Panel: RAL-360×RAL-335, RAL-732×RAL-774, and RAL-486×RAL-380. The Caribbean genotypes (here designated X, Y, Z) were similarly constructed by crosses between inbred lines collected in the Caribbean [2], kindly provided by R. Yukilevich. Shown are the construction of the F1 genotypes from inbred lines, the composition of each trial, and the 3 stages of each trial, as well as the number of replicate individuals within trials, trials within genotype, and total trials conducted.