Efficient Olfactory Coding in the Pheromone Receptor Neuron of a Moth

The concept of coding efficiency holds that sensory neurons are adapted, through both evolutionary and developmental processes, to the statistical characteristics of their natural stimulus. Encouraged by the successful invocation of this principle to predict how neurons encode natural auditory and visual stimuli, we attempted its application to olfactory neurons. The pheromone receptor neuron of the male moth Antheraea polyphemus, for which quantitative properties of both the natural stimulus and the reception processes are available, was selected. We predicted several characteristics that the pheromone plume should possess under the hypothesis that the receptors perform optimally, i.e., transfer as much information on the stimulus per unit time as possible. Our results demonstrate that the statistical characteristics of the predicted stimulus, e.g., the probability distribution function of the stimulus concentration, the spectral density function of the stimulation course, and the intermittency, are in good agreement with those measured experimentally in the field. These results should stimulate further quantitative studies on the evolutionary adaptation of olfactory nervous systems to odorant plumes and on the plume characteristics that are most informative for the ‘sniffer’. Both aspects are relevant to the design of olfactory sensors for odour-tracking robots.


Introduction
According to the 'efficient-coding hypothesis' [1], sensory neurons are adapted to the statistical properties of the signals to which they are exposed. Because not all signals are equally likely, sensory systems should best encode those signals that occur most frequently. This idea was first tested by Laughlin [2] in a pioneering study of first-order interneurons in the insect compound eye, the large monopolar cells, which code for contrast fluctuations. He showed that the response function of these graded-potential cells, measured by intracellular recording, approximates the cumulative probability distribution function of contrast levels measured with a photodiode in the fly's natural habitat.
With a nonlinear stimulus-response function, the neuron encodes an equal change in stimulus intensity differently depending on the actual concentration (Figure 1A). The key question is, how should a neuron weigh its input so as to transfer as much information as possible? Information theory [13,14] provides the solution. In the simplest scenario (with no other constraints on the response range), the inputs should be encoded so that all responses are used with the same frequency [2]. The optimal stimulus statistics are then given by the stimulus probability distribution (Figure 1B), which is obtained directly from the stimulus-response curve. This simple solution, however, does not hold in the case of olfaction because of the large differences in reaction time at different stimulus concentrations. This is a major difference with respect to Laughlin's approach, in which all response states were assumed to be equiprobable.
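The equalization argument can be illustrated with a minimal sketch. It assumes a hypothetical saturating (Hill-type) stimulus-response function; the function `response` and its parameters `s_half`, `r_max` and `s_max` are illustrative choices, not properties of the neuron studied here. The optimal stimulus CDF is taken as the normalized response function, and the sketch checks that stimulus bins of equal probability map onto equally spaced response states.

```python
def response(s, s_half=1.0, r_max=1.0):
    """Hypothetical saturating stimulus-response function (an assumption)."""
    return r_max * s / (s + s_half)

def optimal_cdf(s, s_max):
    """Optimal stimulus CDF: the response normalized to [0, 1] on [0, s_max]."""
    return response(s) / response(s_max)

def quantile(p, s_max, tol=1e-12):
    """Invert the optimal CDF by bisection (the response is monotonic)."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if optimal_cdf(mid, s_max) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

s_max = 10.0      # largest stimulus considered (assumed)
n_states = 5      # number of discrete response states (assumed)

# Stimulus boundaries that split the optimal distribution into equal-probability bins
edges = [quantile(k / n_states, s_max) for k in range(n_states + 1)]

# Response increment spanned by each bin: equal when the stimulus is optimal
steps = [response(b) - response(a) for a, b in zip(edges, edges[1:])]
```

Because the response saturates, the equal-probability bins become wider at high stimulus values, which is the discrete picture behind the statement that the optimal stimulus distribution follows from the stimulus-response curve.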
In this paper, we paralleled Laughlin's approach [2], adapting his method to suit the specificity of olfaction. We chose a well studied olfactory receptor neuron, the pheromone receptor neuron of male moths, to investigate its adaptation to the natural signal it processes, the sexual pheromone emitted by conspecific females. To our knowledge this neuron and its stimulus provide the only example in olfaction for which enough data are available on the odorant plume and the neuron transduction mechanisms to make a quantitative comparison possible between the predicted optimum signal and the natural signal.
Flying male moths rely on the detection of pheromone molecules released by immobile conspecific females for mating. Atmospheric turbulence causes strong mixing of the air and creates a wide spectrum of spatio-temporal variations in the pheromonal signal (Figure 2). The largest eddies are hundreds of metres in size and may take minutes to pass a fixed point, while the smallest spatial variations are less than a millimetre in size and last for milliseconds only [15,16]. Due to inhomogeneous mixing, very high concentrations of pheromone can be found over a wide range of distances from the source, though their frequency decreases with distance [15]. Because of its complicated and inhomogeneous structure, the description of the plume must rely on statistical methods, notably the histogram of the fluctuations in pheromone concentration [15-19]. These fluctuations are essential for the insect to locate the source of the stimulus. Experiments in wind tunnels showed that moths would not fly upwind in a uniform cloud of pheromone [20-22]. Characteristics like the frequency and intensity of the intermittent stimulation play a key role in maintaining the proper direction of flight [23].
The goal of this paper is to present arguments specifying in which sense the perireception and reception processes occurring in pheromone olfactory receptor neurons (ORNs) can be considered as optimally adapted to their natural stimulus. Although, in the light of previous studies on similar sensory neurons, the ORN may be considered a priori as adapted to the pheromone plume, the exact nature of this adaptation and its proof are more challenging questions. Despite widespread agreement that environmental statistics must influence neural processing [24], precise quantification of the link has proved difficult to obtain [8]. The main aim of this paper was therefore to identify the specific characteristics to which the pheromone ORN is adapted and to provide quantitative evidence for this adaptation. We proceeded in two steps. First, using the statistical theory of information, we predicted the characteristics of the optimal pheromonal signal that the ORN is best capable of encoding, based on the properties of the initial steps of signal transduction. Second, we compared these theoretically derived properties with the statistical characteristics most often determined in experimental measurements, i.e., the probability distribution function of the fluctuations in pheromone concentration, the spectral density function of the stimulation course and the intermittency of the odorant signal.

Author Summary
Efficient coding is an overarching principle, well tested in visual and auditory neurobiology, which states that sensory neurons are adapted to the statistical characteristics of their natural stimulus: in brief, neurons best process those stimuli that occur most frequently. To assess its validity in olfaction, we examine the pheromone communication of moths, in which males locate their female mates by the pheromone they release. We determine the characteristics of the pheromone plume which are best detected by the male reception system. We show that they are in agreement with plume measurements in the field, so providing quantitative evidence that this system also obeys the efficient coding principle. Exploring the quantitative relationship between the properties of biological sensory systems and their natural environment should lead not only to a better understanding of neural functions and evolutionary processes, but also to improvements in the design of artificial sensory systems.

Figure 1. The amount of transferred information is limited by the finite range of possible response states. Due to the non-linearity of the stimulus-response function, each response state encodes different relative changes in stimulus intensity. (B) Corresponding probability density function (pdf). Maximum information is transferred if all response states are used equally, i.e., if the area under the stimulus pdf is equal for each response state, as shown. In the limit of vanishingly small response states, the optimal stimulus CDF corresponds to the (normalized) stimulus-response function (adapted from [2]).

Figure 2. … [43]. Though the average pheromone concentration in the air decreases with distance, high pheromone concentrations can be found relatively far from the source due to the imperfect mixing of odorant with air. The signal detected by both moving and stationary detectors is therefore always intermittent, consisting of pulses of relatively undiluted pheromone. doi:10.1371/journal.pcbi.1000053.g002

Model of Pheromone Reception
Pheromone components are detected by specialized ORNs located in the male antenna. We considered a specific ORN type of the moth Antheraea polyphemus detecting (E,Z)-6,11-hexadecadienyl acetate, the major component of the sexual pheromone in this species, for which a wealth of precise information is available (reviewed in [25]). The pheromone molecules are adsorbed on the cuticle, diffuse inside the sensory hair to the neuron membrane and are thought to be enzymatically deactivated [25] and then degraded. The initial cell response is triggered by the binding of the pheromone molecules to the receptor molecules borne by the dendritic membrane and the ensuing receptor activation. A cascade of events follows, amplifying this initial response and finally leading to the generation of a train of action potentials conveyed to the brain. The pheromone concentration at each instant determines the ORN response. However, the extreme temporal variability of pheromone concentration in plumes prevents a full description of stimulus-response relationships by direct electrophysiological measurements. For this reason we based our study on a model of perireception and reception processes describing how any stimulus (concentration of pheromone in the air) is transformed into the receptor response (concentration of activated receptors). This model, based on extensive biochemical, radiochemical and electrophysiological experiments, was developed by Kaissling and coworkers [25,26]. It involves a network of chemical reactions (reactions 1-3), which includes (1) the translocation of the ligand from the air (input pheromone signal L_air) to the hair lumen (L); (2) the reversible binding of L to receptor R and the reversible change of the complex R_L to an activated state R* (output signal); (3) the reversible binding of L to a deactivating enzyme N and its deactivation to product P, which is no longer able to interact with the receptor.
The evolution of the system 1-3 in time, given the external signal L_air, is fully described by five first-order ordinary differential equations (Equations 4-8) and two conservation equations (Equations 9 and 10). Equations 9 and 10 follow from the fact that the total concentration of the receptor molecules, R_tot = R + R_L + R*, as well as the total concentration of the deactivating enzyme, N_tot = N + N_L, do not change over time. We assume that at t = 0 the concentrations L, R_L, R*, N_L and P are zero. The parameter values, derived from extensive experimental investigations, are given in Table 1.
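For orientation, the verbal description of reactions 1-3 and of the conservation relations can be written out as follows. This is a sketch under stated assumptions rather than a reproduction of Equations 4-10: the rate-constant labels (k_I, k_3, k_{-3}, k_4, k_{-4}, k_5, k_{-5}, k_6) are assumed names, L_air is treated as an externally imposed input, and the enzyme N is assumed to be released when the product P is formed, as the conservation of N_tot requires.

```latex
% Reaction network (1)-(3); rate-constant names are assumed labels
\begin{align*}
L_{\mathrm{air}} &\xrightarrow{\;k_I\;} L, \\
L + R &\underset{k_{-3}}{\overset{k_3}{\rightleftharpoons}} R_L
        \underset{k_{-4}}{\overset{k_4}{\rightleftharpoons}} R^{*}, \\
L + N &\underset{k_{-5}}{\overset{k_5}{\rightleftharpoons}} N_L
        \xrightarrow{\;k_6\;} P + N .
\end{align*}

% Mass-action kinetics for the five dynamic variables L, R_L, R^*, N_L, P
\begin{align*}
\frac{dL}{dt}     &= k_I L_{\mathrm{air}} - k_3 L R + k_{-3} R_L - k_5 L N + k_{-5} N_L, \\
\frac{dR_L}{dt}   &= k_3 L R - (k_{-3} + k_4) R_L + k_{-4} R^{*}, \\
\frac{dR^{*}}{dt} &= k_4 R_L - k_{-4} R^{*}, \\
\frac{dN_L}{dt}   &= k_5 L N - (k_{-5} + k_6) N_L, \\
\frac{dP}{dt}     &= k_6 N_L, \\
R &= R_{\mathrm{tot}} - R_L - R^{*}, \qquad N = N_{\mathrm{tot}} - N_L .
\end{align*}
```

Setting the time derivatives of R_L and R* to zero at saturating ligand concentration gives the upper bound R* ≤ R_tot k_4/(k_4 + k_{-4}), which is the kind of finite response range discussed in the next section.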

Basic Stimulus-Response Properties
The efficiency of information transfer in the system 1-3 depends critically on its stimulus-response relationship under single and repeated stimulus pulses. To transfer as much information as possible, the response states must be optimally utilized. The actual amount of information transferred is limited by biological constraints. In the system studied, information transfer from L_air (stimulus) to R* (response) is subject to three main limitations.
First, it is limited by the finite number of receptor molecules per neuron, which places an upper bound on the range of responses. Whatever the pheromone concentration (height of the step), the concentration of activated receptors cannot exceed R*_max = 0.24 μM at any time [26].
Second, temporal details in the stimulus course shorter than a certain lower limit Δt cannot be analyzed by the system. The smallest period of stimulation of the model studied here is 0.4 s [26,27], in agreement with experimental measurements [28,29]. With smaller periods, i.e., at higher frequencies, the amplitude of the oscillations of R* becomes too small to be effective. Therefore we set Δt = 0.4 s. Two successive pheromone pulses separated by a time shorter than Δt cannot be distinguished.
Third, information transfer in time is also limited by the response duration, which depends on the deactivation rate of the activated receptors. The time course of R* in response to stimulations of different heights L_air and limited duration (0.4 s) is shown in the inset of Figure 3A. The concentration of activated receptors rises at first, reaches R*_Δ at the end of the stimulus pulse, i.e., R*_Δ = R*(t = Δt), and finally decreases. We consider R*_Δ as the "response" of the system and, for the sake of simplicity, omit the index Δ in the following. The duration of the falling phase (receptor deactivation) gets progressively longer for higher pheromone concentrations. This deactivation typically takes much longer than the time resolution parameter Δt. The falling phase is often described by the half-fall time, τ(R*), which is the time required for R*(t) to decrease from R* to R*/2. The relationship between R* and τ(R*) is shown in Figure 3A. A unique value of R* corresponds to each value of L_air, which defines the stimulus-response curve (Figure 3B). The fact that the deactivation of activated receptors is relatively slow suggests that the reception system cannot encode a long sequence of pheromone pulses in arbitrarily quick succession. This observation plays an important role in the definition of the optimal stimulus course.
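The response to a single pulse and its half-fall time can be explored numerically with a minimal sketch such as the one below. It integrates mass-action kinetics for the reaction network described in the text with a fixed-step Runge-Kutta scheme. Everything specific in it is an assumption made for illustration: the rate-constant names, the numerical values in `PARAMS` (placeholders, not the Table 1 values), the square-pulse stimulus, and the operational definition of the half-fall time as the time after the end of the pulse at which R* first drops to half of its end-of-pulse value.

```python
# Assumed, illustrative parameter values (NOT taken from Table 1).
# Concentrations in uM, time in s.
PARAMS = {"kI": 29000.0, "k3": 0.209, "km3": 7.9, "k4": 16.8, "km4": 98.0,
          "k5": 4.0, "km5": 98.0, "k6": 29.7, "Rtot": 1.64, "Ntot": 1.0}

def rhs(y, L_air, p=PARAMS):
    """Mass-action right-hand side for the state (L, R_L, R*, N_L)."""
    L, RL, Rs, NL = y
    R = p["Rtot"] - RL - Rs                      # free receptors (conservation)
    N = p["Ntot"] - NL                           # free enzyme (conservation)
    bind_R = p["k3"] * L * R - p["km3"] * RL     # net binding to receptor
    bind_N = p["k5"] * L * N - p["km5"] * NL     # net binding to enzyme
    activate = p["k4"] * RL - p["km4"] * Rs      # net receptor activation
    return (p["kI"] * L_air - bind_R - bind_N,
            bind_R - activate,
            activate,
            bind_N - p["k6"] * NL)

def rk4_step(y, L_air, dt):
    """One classical fourth-order Runge-Kutta step with constant input L_air."""
    def shift(a, b, h):
        return tuple(ai + h * bi for ai, bi in zip(a, b))
    k1 = rhs(y, L_air)
    k2 = rhs(shift(y, k1, dt / 2), L_air)
    k3 = rhs(shift(y, k2, dt / 2), L_air)
    k4 = rhs(shift(y, k3, dt), L_air)
    return tuple(yi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

def pulse_response(L_air, pulse=0.4, dt=1e-3, t_max=30.0):
    """Return (R* at the end of a square pulse, half-fall time after the pulse)."""
    y = (0.0, 0.0, 0.0, 0.0)                     # all concentrations zero at t = 0
    for _ in range(round(pulse / dt)):
        y = rk4_step(y, L_air, dt)
    r_end = y[2]
    t = 0.0
    while y[2] > r_end / 2 and t < t_max:        # stimulus switched off
        y = rk4_step(y, 0.0, dt)
        t += dt
    return r_end, t

# Hypothetical pulse heights (uM) spanning four orders of magnitude
results = {dose: pulse_response(dose) for dose in (1e-6, 1e-4, 1e-2)}
for dose, (r_end, tau) in results.items():
    print(f"L_air = {dose:g} uM: R* = {r_end:.3g} uM, half-fall time = {tau:.2f} s")
```

Repeating the calculation over a fine grid of pulse heights would trace out a stimulus-response curve and a half-fall-time curve of the kind shown in Figure 3, under the assumed parameters. The fixed step `dt` is adequate only for moderate pulse heights; much larger inputs would call for a smaller step or a stiff solver.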

Optimal Stimulus Course
In the simplest scenario (with no other constraints on the response range and stimulus-independent additive noise), the inputs should be encoded so that all responses are used with the same frequency [2,30]. The optimal stimulus is thus described by its probability distribution function, which is obtained directly from the stimulus-response curve. Here, however, due to the large differences in reaction times at different stimulus concentrations, the response values R* from 0 to 0.24 μM cannot all be considered as equally "usable" (the long falling phases decrease the efficacy of the information transfer). Therefore, the longer the half-fall time of a given response R* (i.e., the greater the concentration R*), the less frequent it must be. The particular form of the optimal response cumulative probability distribution function (CDF), F_R(R*), which was determined by maximizing the information transferred and minimizing the average half-fall time (see Methods), is shown in Figure 3C. Then, based on the three factors mentioned (stimulus-response curve, Figure 3B; time resolution Δt = 0.4 s; and optimal response probability distribution, Figure 3C), an optimum stimulus course in time can be predicted as explained in the Methods section.
Examples of predicted temporal fluctuations in pheromone concentration are shown in Figure 4 at various time scales and compared to experimental observations. Even though the time resolution of the system studied here is only 0.4 s, it seems sufficient to capture the main bursts of pheromone (see the 10 s sample in Figure 4A). The comparison can be made more precise by describing statistically the heights and occurrences in time of the pulses.

Predicted Temporal Pattern of Pulses
Concerning temporal aspects, the bursts of non-zero signal do not occur at periodic intervals but appear randomly. An important descriptor of the temporal structure is the intermittency [15,16], which is the fraction of total time when the signal is present. The intermittency of the predicted optimal stimulus is 20%, which is in relatively good agreement with experimental data. It has been shown using various types of ion detectors [17,19] as well as electroantennogram responses [17,31], that the natural signal is always present less than 50% of the total time, and usually smaller values are found. The average intermittency values reported are 10-20% [15] and 10-40% [16,17], depending on the experimental conditions, such as the detector size or the global meandering of the plume (see Discussion).
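As a concrete reading of this definition, the intermittency of a sampled record is the fraction of samples in which the signal is present. The sketch below assumes equally spaced samples and a simple threshold criterion for "present"; the record values and the threshold are hypothetical.

```python
def intermittency(signal, threshold=0.0):
    """Fraction of equally spaced samples in which the signal exceeds threshold."""
    if not signal:
        raise ValueError("empty signal")
    return sum(1 for c in signal if c > threshold) / len(signal)

# Hypothetical record sampled every 0.4 s: mostly zero, with a few bursts
record = [0, 0, 0.8, 0, 0, 0, 0.1, 2.3, 0, 0]

gamma = intermittency(record)                          # any non-zero sample counts
gamma_strict = intermittency(record, threshold=0.5)    # with a noise-rejection threshold
```

In this toy record three of the ten samples are non-zero, and only two exceed the threshold of 0.5, so the measured intermittency depends on how "signal present" is defined.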

Predicted Concentrations of Pheromone Pulses
Concerning pulse height, the overall character of the predicted stimulus course is that pulses of high concentration are much rarer than those of low concentration. This feature of the predicted stimulus can be best quantified by the CDF, P(L_air), of the stimulus. The shape of the CDF is one of the most important properties for comparing theoretical predictions to experimental measurements because it describes the relative distribution of odorant concentrations throughout the plume. In fact, because measuring pheromone concentration in the field is not presently feasible [17], pheromone molecules must be replaced by measurable tracers. Relative quantities are valid for both pheromones and tracers (see Discussion), and they are the only quantities known experimentally for pheromone plumes. So, although our model predicts absolute values of L_air, we cannot compare them to actual measurements.
Given the definition of the optimal stimulus, the function P(L_air) can be directly computed (see Methods). Figure 5 shows a comparison between experimentally measured (A) and predicted (B) concentration CDFs. The optimal pheromone concentration CDF (Figure 5B, solid line) is not known in analytical form but it can be well approximated by an exponential CDF (Figure 5C, dashed line). The differences between the predicted shape and a true exponential can be considered as non-significant; the main one is that very high values of L_air are predicted to be less frequent than in the exponential model. The exponential CDF is in agreement with the experimental CDF (Figure 5A) [18,19,32,33] and holds especially well for observations closer to the source (less than 100 m). Although the precise form of the CDF varies with distance from the plume centerline [19] and may be affected by the measurement technique, the shape is always highly skewed.
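A comparison with an exponential CDF can be sketched as follows. The sketch assumes one particular parametrization, namely a point mass at zero concentration (the signal being absent a fraction 1 - gamma of the time) combined with exponentially distributed non-zero concentrations of mean mu; the function names, the moment-based fit and the synthetic data are all illustrative assumptions, not the procedure used in the Methods.

```python
import bisect
import math
import random

def exp_cdf_with_gaps(c, gamma, mu):
    """P(C <= c): zero signal with probability 1 - gamma, otherwise exponential(mean mu)."""
    return 1.0 - gamma * math.exp(-c / mu) if c >= 0 else 0.0

def fit(samples):
    """Moment estimates: gamma = fraction of non-zero samples, mu = their mean."""
    nonzero = [c for c in samples if c > 0]
    return len(nonzero) / len(samples), sum(nonzero) / len(nonzero)

def max_cdf_gap(samples, gamma, mu, n_grid=200):
    """Largest |empirical CDF - model CDF| on a regular concentration grid."""
    xs = sorted(samples)
    grid = [xs[-1] * k / n_grid for k in range(n_grid + 1)]
    return max(abs(bisect.bisect_right(xs, c) / len(xs)
                   - exp_cdf_with_gaps(c, gamma, mu)) for c in grid)

# Synthetic record (assumption): signal present 20% of the time, unit mean when present
rng = random.Random(0)
samples = [rng.expovariate(1.0) if rng.random() < 0.2 else 0.0 for _ in range(20000)]

gamma_hat, mu_hat = fit(samples)
gap = max_cdf_gap(samples, gamma_hat, mu_hat)
```

Dividing the concentrations by their mean before fitting would make the comparison dimensionless, which is the form in which predicted and measured distributions can be compared.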
Other predicted relative quantities (peak-to-mean ratios, dimensionless concentrations L_air/⟨L_air⟩) were compared with their experimental counterparts. The results, summarized in Table 2, show that the predicted statistical properties of the stimulus are not contradicted by the experimental observations.

Figure 3. … The functions R*(L_air) and F_R(R*) were used for calculating the optimal stimulus probability distribution (shown in Figure 5B). doi:10.1371/journal.pcbi.1000053.g003

Spectral Density Functions of the Stimulus Course
Spectral density functions of the concentration time course, which analyze the contribution of various frequencies to the overall stimulus course, characterize other properties of the plume which are independent of the nature of the odorant (pheromone or ion source) [19,33]. Furthermore, the spectral density function offers a point of view different from that of the concentration probability distribution.
Several spectral density functions, shown in Figure 6, were calculated from the predicted optimal pheromone stimulation (see Methods). The spectral shapes seem to be almost flat from 0.02 Hz to 0.2 Hz, with a decreasing slope close to -2/3 above 0.2 Hz. The same slope of -2/3, which is theoretically predicted by the inertial subrange theory [19], was reported in the spectral densities obtained from measurements close to the source (less than 100 m), in the range 0.1 Hz (or 0.5 Hz, depending on records) to 1 Hz [19,33], although the precise range may depend on the technique of measurement.
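A spectral slope of this kind can be estimated from a sampled stimulus course by computing a periodogram and fitting a straight line in log-log coordinates over the frequency band of interest. The sketch below is a generic illustration under stated assumptions (direct DFT, one common normalization of the periodogram, ordinary least squares); it is not the estimator described in the Methods, and the test signals are synthetic.

```python
import cmath
import math

def periodogram(x, dt):
    """One-sided periodogram by direct DFT (O(n^2), adequate for short records)."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]                   # remove the mean (zero-frequency term)
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(xc))
        freqs.append(k / (n * dt))
        power.append(abs(s) ** 2 * dt / n)
    return freqs, power

def loglog_slope(freqs, power, f_lo, f_hi):
    """Least-squares slope of log10(power) against log10(frequency) in [f_lo, f_hi]."""
    pts = [(math.log10(f), math.log10(p))
           for f, p in zip(freqs, power) if f_lo <= f <= f_hi and p > 0]
    n = len(pts)
    mx = sum(a for a, _ in pts) / n
    my = sum(b for _, b in pts) / n
    sxx = sum((a - mx) ** 2 for a, _ in pts)
    sxy = sum((a - mx) * (b - my) for a, b in pts)
    return sxy / sxx

# Check 1: a sinusoid at 0.25 Hz sampled every 0.4 s peaks at the right frequency
dt = 0.4
x = [math.sin(2 * math.pi * 0.25 * i * dt) for i in range(100)]
freqs, power = periodogram(x, dt)
peak_freq = freqs[power.index(max(power))]

# Check 2: the slope fit recovers -2/3 from an exact power law
f_syn = [0.2 * 1.2 ** k for k in range(10)]
p_syn = [f ** (-2.0 / 3.0) for f in f_syn]
slope = loglog_slope(f_syn, p_syn, 0.2, 2.0)
```

With a sampling interval of 0.4 s the highest resolvable frequency is 1.25 Hz, so a slope above 0.2 Hz can only be fitted over less than one decade; averaging periodograms over several records would reduce the scatter of the estimate.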

Discussion
The goals of this study were to determine to what extent early olfactory transduction in olfactory receptor neurons can be considered adapted (in the evolutionary sense) to odorant plumes and to specify the plume characteristics to which it is adapted. The formulation and resolution of this problem benefited from successful studies of efficient sensory coding undertaken in the fields of vision and audition. However, transposition from these sensory modalities to olfaction is not straightforward, which may explain in part why it has not been attempted earlier. Specificities of olfaction concern both the odorant plume and the sensory system.

Odor Plumes
In theory and in practice, the quantitative description of odor plumes and their spatiotemporal distribution is less straightforward than that of visual or auditory scenes. Contrary to light and sound, for which the physical description is essentially complete, the turbulent phenomena which underlie the plume characteristics are still an incompletely mastered domain of physics [34].
In Laughlin's classical experiment in vision, a single time-independent variable, the contrast level, was measured [2] and directly compared with experimental data. In olfaction, however, the odorant concentration (an analogue to the contrast level) is essentially time dependent, which results in a complex optimal stimulus course (Figure 4). Complexity and time dependence make a meaningful direct comparison impossible, not only between predictions and experimental records but also between different experimental records. Instead, the comparison must rely on global, statistical descriptors [15,17,19,33]. We identified five such descriptors of odor plumes that have actually been measured and are usable in the present context (see Table 2); they summarize the present knowledge on odor plumes.
Moreover, there are no easy-to-use instruments to measure odor plumes in the field, comparable to luxmeters and microphones. For example, the absolute pheromone concentration cannot be easily known in field experiments [17]. This explains why no experimental values were given for this descriptor in Table 2. In practice, only ratios of concentrations are presented because they are independent of the nature of the dispersed molecules. The pheromone is often substituted by an ion or a passive tracer (propylene, for example) whose concentration can be measured [15,17,19]. Because both pheromone and tracer compounds in the air are governed by the same physical laws, the relative (dimensionless) values are conserved, as confirmed by independent experiments with different sources [15-17,33]. More generally, this limitation explains why we compared only relative quantities (i.e., shape of probability distributions, spectral density functions, peak-to-mean ratios, dimensionless concentrations L_air/⟨L_air⟩ and intermittency values). Other limitations of plume measurements are discussed below.
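The reason relative quantities can be compared across pheromone and tracer is that they are unchanged when every concentration is multiplied by the same factor. The sketch below illustrates this with a hypothetical tracer record and an arbitrary scale factor; the numbers and the function name are assumptions made for the example.

```python
import math

def dimensionless_descriptors(record):
    """Peak-to-mean ratio, peak-to-standard-deviation ratio and C/<C> values."""
    n = len(record)
    mean = sum(record) / n
    sd = math.sqrt(sum((c - mean) ** 2 for c in record) / n)
    peak = max(record)
    return {"peak_to_mean": peak / mean,
            "peak_to_sd": peak / sd,
            "normalized": [c / mean for c in record]}

# Hypothetical tracer record (arbitrary units) and the same plume on another scale
tracer = [0.0, 0.0, 4.0, 0.0, 1.0, 0.0, 0.0, 3.0]
pheromone = [c * 2.5e-6 for c in tracer]

d_tracer = dimensionless_descriptors(tracer)
d_pheromone = dimensionless_descriptors(pheromone)
```

Both records give the same peak-to-mean ratio (4 in this toy example, since the mean of the tracer record is 1) and the same normalized concentrations, whereas their absolute values differ by the unknown scale factor.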

Model of Early Transduction
The essentially multidimensional and stochastic nature of the odor stimulus has a profound influence on the analysis of olfactory transduction system in its natural context, as undertaken here. Indeed to investigate the problems at hand, the kinetic responses of the system to a very large number of stimuli, varying in intensity, duration and temporal sequence must be known in order to simulate the diversity of stimuli encountered in a natural plume. This task is difficult, if not impossible, to manage in a purely experimental approach. However, this difficulty can be overcome with an exact dynamic model of the system because its response to the diverse conditions mentioned can be computed, provided it includes all initial steps from molecules in the air to the early neural response. This is the case of the perireception and reception stages of the moth pheromonal ORN and the reason why it was chosen in the present study. This choice brings about two questions, one about the validity of the model, the other on its position within a larger context.
The computational model employed has been thoroughly researched and improved over the last three decades [25,35-37]. It describes perireceptor and receptor events in the ORN cell type sensitive to the main pheromone component of the saturniid moth Antheraea polyphemus. At the time of writing it represents the most completely researched computational model of its kind, agreeing with extensive experimental data from various authors and a wide range of experimental techniques. This model is the best description presently available for early events in any ORN and it summarizes in a nutshell a wealth of dispersed knowledge. It is based on ordinary differential equations (Equations 4-8), following the law of mass action for chemical reactions, and is therefore purely deterministic. This approximation is acceptable when the concentrations of reactants are sufficiently far above single-molecule levels, so that the stochastic fluctuations can be neglected. In this paper, the concentration of R* is always well above that corresponding to one activated receptor molecule per neuron (approximately 10^-6.2 μM) because we do not investigate the effect of extremely small pheromone doses. The response of the system can then be considered as deterministic, in accordance with the efficient coding hypothesis [8].
The system studied here constitutes only a small part of the whole pheromonal system, although its role is absolutely essential and all other parts depend on it. First, in ORNs, post-receptor mechanisms modify the receptor signal, primarily by a large amplification factor and by sensory adaptation. Second, the ORN population includes cell types with different properties, e.g. the ORN type sensitive to the minor pheromone components can follow periodic pulses up to 10 Hz [29], a performance not yet accounted for in present models [27]. Third, in the brain antennal lobe, convergence of a large number of ORNs on a few projection neurons (PNs) provides another amplification and supports the ability of some PNs to follow periodic signals at 10 Hz or greater [38].

Figure 5. … [19]). The intermittency is included in the plots in the non-zero value of P(C) for zero concentrations (see Methods). The experimental data clearly follow the exponential CDF, except close to C = 0, which is caused by technical issues in the measurement process [19]. The relatively high value of measured intermittency (close to 47%) is caused mainly by initial data processing [19]. (B) CDF predicted by the pheromone reception model together with its best exponential fit; the scales correspond to panel (A). After correcting (see Methods) for the fact that the intermittency predicted by the pheromone reception model (20%) is lower than that measured in [19] (as explained in the Discussion), the predictions correspond well to the measured data in (A), except at very high values of L_air, which are less frequent than expected. Since this deviation is apparent only for events occurring with probability P < 0.01, it can be considered as non-significant. doi:10.1371/journal.pcbi.1000053.g005
Evolutionary adaptation of an integrated ORN response is difficult to study at the present time because no complete model of the ORN from receptors to the generation of the receptor potential and the ensuing spike train, is yet available, at least with the required degree of precision. The same argument holds a fortiori for higher order processes. Notwithstanding, the study of the early sensory events is not as restrictive as it may seem because any incoming odor signal must be first transduced in the population of membrane receptors. No information can be extracted by the post-receptor transduction system which has not been encoded by the receptors in the first place. For this reason it is essential to investigate the nature of the adaptation of the initial events (pheromone interaction with receptors) to the pheromone signal.

Determination of the Optimal Stimulus
Different response states of the pheromone reception system have different efficacies from the coding point of view: the "high" states, with large concentrations of activated receptors, take much more time to deactivate than the "low" states, so that for some time after its exposure to a large concentration of pheromone the system is "dazzled". This means that in the optimal stimulus the low pheromone concentrations must be more frequent than the high ones. This is a difference with respect to the classical problem where the efficacy of all response states at transferring information is considered the same, as in the vision of contrasts for example. The problem to solve is to find the right balance between two conflicting demands: to use all response states (including the high ones) and to react rapidly (the short transient responses must be as frequent as possible), i.e., to maximize the information transferred per unit time.
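The nature of this trade-off can be illustrated with the classical discrete analogue of the problem: a noiseless channel whose states have unequal durations. This is a simplified stand-in chosen for illustration, not the continuous formulation of the Methods; the state durations below are hypothetical. For states with durations tau_i, the probabilities that maximize entropy per average duration are p_i = exp(-lam * tau_i), where lam is fixed by the normalization of the p_i and equals the maximal information rate in nats per unit time.

```python
import math

def optimal_state_probabilities(durations, tol=1e-12):
    """Maximize H(p) / sum(p_i * tau_i) for states with durations tau_i > 0.

    Requires at least two states. Returns (lam, p), where p_i = exp(-lam * tau_i)
    and lam solves sum_i exp(-lam * tau_i) = 1.
    """
    def total(lam):
        return sum(math.exp(-lam * t) for t in durations)

    lo, hi = 0.0, 1.0
    while total(hi) > 1.0:           # bracket the root; total() decreases with lam
        hi *= 2.0
    while hi - lo > tol:             # bisection
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, [math.exp(-lam * t) for t in durations]

# Hypothetical "low" to "high" response states with increasing half-fall times (s)
durations = [0.5, 1.0, 2.0, 4.0]
lam, probs = optimal_state_probabilities(durations)

entropy = -sum(p * math.log(p) for p in probs)              # nats
mean_duration = sum(p * t for p, t in zip(probs, durations))
rate = entropy / mean_duration                              # equals lam at the optimum
```

All states are used, but the slower a state, the less often it should occur, which is the qualitative bias towards "low" response states described above. The mean duration is not fixed in advance; it results from the optimization.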
The solution to this optimization problem is provided by information theory as detailed in the Methods section. The optimal balance derives from Equation 19, which relates the average half-fall time and the maximum response entropy distribution. The key factor to consider in the optimization is the average half-fall time, which characterizes globally the "swiftness" of the system: a smaller average half-fall time means a faster stimulation rate. In other words, the average half-fall time characterizes the bias towards "low" response states. Simultaneously, the condition of maximum response entropy guarantees that the temporal dynamics of the system is as varied as possible and that during the course of stimulation every possible response state is used (with appropriate frequency). By taking into account only the average half-fall time, and not the precise sequence of its individual values, we therefore do not neglect or limit the temporal dynamics of receptor molecule activation. It is important to note that the average half-fall time is not a free parameter of the problem; it is not set a priori: its optimal value follows from the optimization procedure (Equation 20). The resulting optimal response CDF is highly biased towards low response states, as expected (see Figure 3C).

Table 2. Comparison of statistical characteristics of optimal and actual plumes.
Characteristics (a)             Predicted Values (b)     Experimental Values
Concentration CDF (Figure 5)    Exponential              Exponential [18,19,32]
Spectra (Figure 5)              Approx. flat to 0. …     …

Figure 6. … Figure 4, bottom panels), for different initial random seeds. Calculated spectral shapes are usually almost flat from 0.02 Hz to 0.2 Hz, although exceptions are sometimes observed at lower frequencies, which are also found in experimental data [19]. Above 0.2 Hz there is a decreasing slope close to -2/3. A flat spectrum up to 0.2 Hz and a true -2/3 slope beyond are shown for comparison (thick line). Spectra from experimental measurements (not shown) on a propylene plume obtained close to the source are reported to exhibit a similar flat region followed by a -2/3 slope [19]. doi:10.1371/journal.pcbi.1000053.g006

Nature of the System Adaptation
The main achievement of the present investigation was to predict the characteristics of the stimulus optimally processed by the receptor system based on its biochemical characteristics and an information theoretic approach. The predicted optimal plume was shown to be close to the actual plumes for a series of characteristics, namely intermittency, peak/mean ratio and peak/standard deviation ratio of pheromone pulses, probability distribution of dimensionless pheromone concentration and spectral density function of pheromone concentration (Table 2, Figures 4-6). The correspondence between the predictions and measurements is very good for the last two characteristics (probability distributions) and fair for the first three (numerical values).
These differences in the precision of the predictions may be interpreted by taking into account technical factors. Increasing the noise rejection threshold leads to a decrease of the measured intermittency [15,19], while increasing the detector size or averaging the signal over longer time windows has the opposite effect [39]. So, for example, the small size of olfactory sensilla with respect to detectors may explain in part why, in Figure 4B, the predicted intermittency seems lower than that in the corresponding experimental record sample, and also why the peak-to-mean ratio and peak-to-standard deviation ratio are relatively higher. The immobility of the measurement devices, in contrast with the active movements of the moths, is another significant factor. For example, long pauses (of the order of minutes) of zero signal are missing in the prediction but visible in the longest available field record (350 s, Figure 4C). They are caused simply by the plume being blown away from the immobile field detector. First, this loss of signal is clearly an extraneous effect, which cannot be included in our optimal signal predictions and therefore cannot be seen in our results. Second, the moth is not subjected to this extraneous effect, or at least not to the same extent, because, in case of signal loss, it actively seeks the pheromone plume, whereas the fixed detector must passively wait for its return. This difference in mobility may substantially affect the intermittency values, but does not affect the shape of probability distributions (see Methods), hence the better quality of the fits in the latter case. In conclusion, the results obtained suggest that the perireceptor and receptor system investigated here is evolutionarily adapted to the pheromone plumes.
Even if one considers that the pheromone olfactory system must be a priori adapted to the average characteristics of the pheromone plumes, it does not logically follow that the system studied is itself necessarily well adapted. Indeed, it is conceivable that the global adaptation results mainly not from perireception and reception processes but from other downstream intra- and intercellular processes involved in higher signal processing. The respective importance of the former and latter processes in global adaptation cannot be decided a priori. Therefore, the relatively close correspondence between predicted and observed plume characteristics presented here is not trivial. It suggests that the adaptation at the level of receptors is already substantial, and consequently that the global adaptation is not predominantly the result of postreceptor mechanisms involving amplification, sensory adaptation, convergence of different ORN types in the antennal lobes, etc. The role of these mechanisms in the global adaptation of the animal remains to be established, as does the relative importance of the various components of the olfactory system (receptor population, ORN as a whole, population of pheromonal ORNs in the antenna, projection neurons in the antennal lobes, etc.). The response characteristics of these other subsystems, e.g. their various temporal resolutions, will also have to be interpreted, perhaps in relation to the change of plume characteristics with distance to the source and to other factors yet to be identified.

Optimal Response Probability Distribution Function
As mentioned in the Results section, information transfer in the pheromone reception system is limited by the finite response range, $(0, R^*_{\max})$, and by the deactivation rate of the activated receptors for each concentration value $R^*$. This deactivation rate is described by the half-fall time $\tau(R^*)$. The optimal performance of the system is thus reached by a trade-off between two conflicting demands: to employ the full response range (maximum information) vs. to employ only the "fastest" responses (minimum average half-fall time). In other words, we need to maximize the information transferred per average half-fall time. In the following we provide the mathematical framework that enabled us to find the probability distribution function over the response states $R^*$ that realizes this trade-off.
Information transferred. The information transferred by the pheromone reception system in a selected time window $(t, t+\Delta t)$ is described by the relation between all possible stimulus values, $L_{\mathrm{air}}$, and the corresponding response values, $R^*$. This relation is explicitly quantified by the mutual information, $I(L_{\mathrm{air}}; R^*)$ (see [13] for details),

$$I(L_{\mathrm{air}}; R^*) = H(R^*) - H(R^* \mid L_{\mathrm{air}}), \tag{11}$$

where $H(R^*)$ is the entropy of the response probability distribution function and the conditional entropy $H(R^* \mid L_{\mathrm{air}})$ measures the uncertainty in the output given the input, or equivalently, the amount of noise in the information transduction [13,14]. The model of pheromone reception employed here is deterministic and therefore $H(R^* \mid L_{\mathrm{air}}) = 0$. Thus maximizing the mutual information corresponds to maximizing the response entropy $H(R^*)$. (Note that in the usual setting of signal-independent and additive noise the term $H(R^* \mid L_{\mathrm{air}})$ is constant, and then maximization of $I(L_{\mathrm{air}}; R^*)$ again corresponds to maximization of $H(R^*)$.) The available response range, $(0, R^*_{\max})$, is naturally discrete, since the response is made up of individual receptor molecules. The expression of $H(R^*)$ is ([13], p.14)

$$H(R^*) = -\sum_{R^*} p(R^*) \log_2 p(R^*), \tag{12}$$

where $p(R^*)$ is the probability of having $R^*$ (expressed as a number of molecules). (In the following we use base-2 logarithms only so that all information-related quantities are expressed in the usual units of bits.) The value of $R^*$ corresponding to one activated receptor molecule per neuron is approximately $\Delta R^* = 10^{-6.2}$ μM [26], which gives a total of $N = R^*_{\max}/\Delta R^* = 380374$ different response states. Since $N$ is so large, the impractical Equation 12 can be replaced by a continuous approximation based on the differential entropy, $h(R^*)$, defined as ([13], p.243)

$$h(R^*) = -\int_0^{R^*_{\max}} f(R^*) \log_2 f(R^*)\, \mathrm{d}R^*, \tag{13}$$

where $f(R^*)$ is the response probability density function.
An approximate relation between $H(R^*)$ and $h(R^*)$ is given in ([13], p.248)

$$H(R^*) \approx h(R^*) - \log_2 \Delta R^*. \tag{14}$$

In the present case the approximation is excellent because the discretization step $\Delta R^*$ is very small compared to the whole response range ($R^*_{\max} = 0.24$ μM). From relation 14 the mutual information 11 can be expressed in terms of differential entropy,

$$I(L_{\mathrm{air}}; R^*) \approx h(R^*) - \log_2 \Delta R^*. \tag{15}$$

Maximizing the information transferred is thus achieved by maximizing the differential entropy $h(R^*)$. The advantage of employing differential entropy is that it lends itself to an elegant approach for entropy maximization in terms of integrals.

Information optimization. We adopt the standard procedure for maximizing the differential entropy of a continuous probability distribution constrained by a known function $\tau(R^*)$. "Constraining" means that the average value $\langle\tau\rangle$ of $\tau(R^*)$ is under our control (see [13], p.409),

$$\langle\tau\rangle = \int_0^{R^*_{\max}} \tau(R^*)\, f(R^*)\, \mathrm{d}R^*. \tag{16}$$

The task is to find a probability density function, $f_R(R^*)$, which (i) maximizes the value of $h(R^*)$ (Equation 13) and (ii) is such that the average $\langle\tau\rangle$ (Equation 19) taken over $f_R(R^*)$ is equal to the value we set. The well-known solution to this problem (see [13], p.410 or [40] for its derivation) is

$$f_R(R^*) = \frac{1}{Z(\lambda)} \exp\left[-\lambda\, \tau(R^*)\right], \tag{17}$$

where

$$Z(\lambda) = \int_0^{R^*_{\max}} \exp\left[-\lambda\, \tau(R^*)\right] \mathrm{d}R^*. \tag{18}$$

The solution depends on a new parameter, $\lambda$, called the Lagrange multiplier. In the standard setting of maximum-entropy problems ([13] p.409 or [8,40]) the mean value of the constraint function, $\langle\tau\rangle$, is known a priori. The value of $\lambda$ is then determined by substituting $f(R^*) = f_R(R^*)$ in Equation 16, so that the following equation between $\langle\tau\rangle$ and $\lambda$ holds:

$$\langle\tau\rangle = \frac{1}{Z(\lambda)} \int_0^{R^*_{\max}} \tau(R^*) \exp\left[-\lambda\, \tau(R^*)\right] \mathrm{d}R^*. \tag{19}$$

In the case of pheromone reception, however, the value of $\langle\tau\rangle$, and consequently of $\lambda$, is unknown. The value of $\lambda$ must be determined by finding a compromise between maximum information transferred (Equation 15) and minimum average half-fall time (Equation 19). This compromise is made explicit by a simple requirement,

$$\frac{I(L_{\mathrm{air}}; R^*)}{\langle\tau\rangle} \rightarrow \max. \tag{20}$$

In other words, we maximize the information transfer per half-fall time.
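The size of the discretization step can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the original analysis: it takes the two values quoted in the text ($R^*_{\max} = 0.24$ μM and $\Delta R^* = 10^{-6.2}$ μM), counts the response states, and compares the discrete entropy of a uniform distribution over those states with the value obtained from the differential entropy through relation 14. The variable names are ours.

```python
import math

R_MAX = 0.24            # upper end of the response range (uM), value quoted in the text
DELTA_R = 10 ** -6.2    # concentration equivalent of one activated receptor (uM)

# Number of distinguishable response states, N = R_max / Delta_R
n_states = R_MAX / DELTA_R

# Discrete entropy (bits) of a uniform distribution over N states
H_uniform = math.log2(n_states)

# Differential entropy (bits) of the uniform density on (0, R_max)
h_uniform = math.log2(R_MAX)

# Relation 14: H is approximately h - log2(Delta_R)
H_from_h = h_uniform - math.log2(DELTA_R)

print(round(n_states), H_uniform, H_from_h)
```

For the uniform case the two entropies coincide, and the resulting value of roughly 18.5 bits is the upper bound on the information that the response range can carry.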
Application to pheromone reception. In order to simplify practical calculations we substitute $f(R^*) = f_R(R^*)$ into the definition of differential entropy 13, so that Equation 15 reduces to

$$I(L_{\mathrm{air}}; R^*) \approx \frac{\lambda \langle\tau\rangle + \ln Z(\lambda)}{\ln 2} - \log_2 \Delta R^*. \tag{21}$$

Now we have all the necessary information to calculate (i) the mutual information $I(L_{\mathrm{air}}; R^*)$ from Equation 21 (shown in Figure 7A), (ii) the mean half-fall time from Equation 19 (Figure 7B) and (iii) their ratio from Equation 20 (Figure 7C) as functions of the Lagrange multiplier $\lambda$. Figure 7A shows that the mutual information is maximized (18.5 bits) for $\lambda = 0$, which corresponds to the uniform probability distribution function over the whole response range. Generally, since $\tau(R^*)$ is a monotonically increasing function of $R^*$, the optimal probability density function $f_R(R^*)$ (Equation 17) is either monotonically increasing ($\lambda < 0$), monotonically decreasing ($\lambda > 0$), or constant ($\lambda = 0$). The multiplier $\lambda$ thus decides whether $f_R(R^*)$ puts more weight on the "slow" response states ($\lambda < 0$) or on the "fast" response states ($\lambda > 0$). These observations are confirmed in Figure 7B, where the mean half-fall time decreases monotonically with increasing $\lambda$. Figure 7C shows the information transferred per average half-fall time, i.e., it shows the compromise between the "slowness" or "reactivity" of the system and the transferred information. Clearly, there cannot be a maximum for $\lambda < 0$, where the system is both "slow" and below its information capacity (note the sharp decrease of mutual information in Figure 7A for $\lambda < 0$). The optimal balance between reactivity and information transfer is reached for $\lambda \approx 6$ at 8 bits/s. By substituting $\lambda = 6$ into formula 17 we obtain the desired optimal response probability density function, $f_R(R^*)$, which maximizes the information transfer per average half-fall time.
The corresponding CDF $F_R(R^*)$, shown in Figure 3C, is given by

$$F_R(R^*) = \int_0^{R^*} f_R(x)\, \mathrm{d}x. \tag{22}$$

The maximum of information transferred per average half-fall time (Figure 7C) is not sharply defined: a transfer of 7-8 bits/s persists for values of $\lambda$ greater than the optimal value. At the same time, both the mutual information (Figure 7A) and the average half-fall time (Figure 7B) decrease slowly in the corresponding region, indicating that the shape of the optimal response probability distribution changes slowly with respect to $\lambda$. Indeed, as we verified numerically, varying $\lambda$ within reasonable limits (so that the information transferred stays close to 8 bits/s) has no impact on the results presented in this work.
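The dependence of the three quantities on $\lambda$ can be explored numerically along the following lines. This is a minimal sketch under stated assumptions, not the computation behind Figure 7: the half-fall time function `tau` is a hypothetical linear stand-in (the real $\tau(R^*)$ follows from the reception model and is not reproduced here), the grid size and the scanned range of $\lambda$ are arbitrary, and all function and variable names are ours. Only $R^*_{\max}$ and $\Delta R^*$ are taken from the text.

```python
import numpy as np

R_MAX = 0.24          # upper end of the response range (uM), from the text
DELTA_R = 10 ** -6.2  # one activated receptor per neuron (uM), from the text

def tau(r):
    """HYPOTHETICAL half-fall time (s); any monotonically increasing function would do."""
    return 1.0 + 20.0 * r

# Midpoint grid over (0, R_MAX) for numerical integration
M = 20000
dr = R_MAX / M
r = (np.arange(M) + 0.5) * dr

def sweep(lambdas):
    """Return mutual information (bits) and mean half-fall time (s) for each lambda."""
    info, mean_tau = [], []
    for lam in lambdas:
        w = np.exp(-lam * tau(r))                    # unnormalised density
        Z = w.sum() * dr                             # normalisation Z(lambda)
        f = w / Z                                    # maximum-entropy density f_R
        t_avg = (tau(r) * f).sum() * dr              # mean half-fall time
        h = (lam * t_avg + np.log(Z)) / np.log(2.0)  # differential entropy in bits
        info.append(h - np.log2(DELTA_R))            # mutual information in bits
        mean_tau.append(t_avg)
    return np.array(info), np.array(mean_tau)

lambdas = np.linspace(-2.0, 30.0, 321)
info, mean_tau = sweep(lambdas)
rate = info / mean_tau                               # information per half-fall time
best_lambda = lambdas[np.argmax(rate)]
print(best_lambda, rate.max())
```

With any increasing `tau`, the sketch reproduces the qualitative picture described above: the information peaks at $\lambda = 0$, the mean half-fall time falls as $\lambda$ grows, and their ratio has an interior maximum at some $\lambda > 0$. The location and height of that maximum depend entirely on the assumed `tau` and should not be compared with the reported values.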

Optimal Stimulus Course
The optimal stimulus course in time was calculated as follows. First, at time $t_0 = 0$ a value $p_0$ is drawn at random from a uniform probability distribution function over the range [0,1]. The concentration $R^*_0$ corresponding to probability $p_0$ is obtained by solving the equation

$$F_R(R^*_0) = p_0, \tag{23}$$

where $F_R(R^*)$ is the optimal CDF given by formula 22 (Figure 3C). The predicted optimal concentration $L_{\mathrm{air},0}$ for a pheromone pulse of duration $\Delta t = 0.4$ s which corresponds to $R^*_0$ is obtained by solving the equation

$$R^*(L_{\mathrm{air},0}) = R^*_0, \tag{24}$$

where $R^*(L_{\mathrm{air}})$ is the stimulus-response function (Figure 3B). The value $L_{\mathrm{air},0}$ is plotted at $t_0$ (Figure 4). Second, the concentration $L_{\mathrm{air},1}$ and time of appearance $t_1$ of the next pulse are determined. Time $t_1$ follows from the falling phase of activated receptors: optimality requires that no pheromone pulse appears before $R^*$ returns to its resting level. In practice, the resting level is considered to be reached when $R^*$ falls below 0.01 μM (less than 5% of the coding range). The concentration $L_{\mathrm{air},1}$ of the pulse at $t_1$ is determined in the same way as for the pulse at $t_0$, by drawing a new random number $p_1$ from the uniform probability distribution function over [0,1]. The same process can be repeated as many times as needed to create an optimal pheromone pulse train of arbitrary length.
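The procedure just described is an instance of inverse-transform sampling followed by a waiting rule, and it can be sketched as below. The sketch is illustrative only and rests on several assumptions that are not part of the original model: `tau` is a hypothetical half-fall time function, `inverse_response` inverts a hypothetical saturating stimulus-response curve with an arbitrary half-saturation constant `K_HALF`, and `fall_time` assumes a simple exponential decay of $R^*$, whereas the actual falling phase is given by the reception kinetics. The values of $R^*_{\max}$, the resting threshold, the pulse duration and $\lambda = 6$ are those quoted in the text.

```python
import numpy as np

R_MAX = 0.24      # upper end of the response range (uM), from the text
REST = 0.01       # resting-level threshold (uM), from the text
PULSE = 0.4       # pulse duration (s), from the text
LAMBDA = 6.0      # optimal Lagrange multiplier reported in the text
K_HALF = 1e-3     # HYPOTHETICAL half-saturation concentration (uM)

def tau(r):
    """HYPOTHETICAL half-fall time (s), increasing with r."""
    return 1.0 + 4.0 * r

# Tabulate the optimal CDF F_R on a grid (cumulative trapezoidal integral)
r_grid = np.linspace(0.0, R_MAX, 5001)
w = np.exp(-LAMBDA * tau(r_grid))
cdf = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * np.diff(r_grid))))
cdf /= cdf[-1]

def inverse_response(r):
    """Invert the HYPOTHETICAL curve R* = R_MAX * L / (L + K_HALF) for L."""
    return K_HALF * r / (R_MAX - r)

def fall_time(r0):
    """HYPOTHETICAL exponential decay: time for R* to drop from r0 to REST."""
    return tau(r0) * np.log2(r0 / REST) if r0 > REST else 0.0

def pulse_train(n_pulses, seed=0):
    """Return pulse onset times (s) and pulse concentrations (uM)."""
    rng = np.random.default_rng(seed)
    t, times, conc = 0.0, [], []
    for _ in range(n_pulses):
        p = rng.random()                    # uniform draw on [0, 1)
        r0 = np.interp(p, cdf, r_grid)      # solve F_R(r0) = p
        times.append(t)
        conc.append(inverse_response(r0))   # solve R*(L) = r0
        t += PULSE + fall_time(r0)          # next pulse only after return to rest
    return np.array(times), np.array(conc)

times, conc = pulse_train(50)
print(times[:5], conc[:5])
```

Tabulating the CDF once and inverting it by interpolation avoids a root search for every pulse, which matters only when long trains are generated.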

Optimal Stimulus Probability Distribution Function
It is common in the literature on the statistical analysis of plumes [15,18,19] to define two types of mean concentrations. The total mean concentration, $\langle L_{\mathrm{air}}\rangle$, describes the "true" mean concentration obtained from the whole record of concentration fluctuations in time, i.e., including the parts where no signal was available. On the other hand, the conditional mean concentration, $\langle L_{\mathrm{air}}\rangle_{\mathrm{cond}}$, describes the mean concentration inside the plume, i.e., with zero concentrations excluded. The intermittency, $\gamma$, relates the two means as [19]

$$\langle L_{\mathrm{air}}\rangle = \gamma\, \langle L_{\mathrm{air}}\rangle_{\mathrm{cond}}. \tag{25}$$

(Analogously, the total variances and total standard deviations are calculated by also taking into account the parts where no signal is available [19].) By combining Equations 23 and 24 we may symbolically express the optimal CDF of the stimulus, $P(L_{\mathrm{air}})$, as

$$P(L_{\mathrm{air}}) = F_R\left(R^*(L_{\mathrm{air}})\right). \tag{26}$$

Though $P(L_{\mathrm{air}})$ cannot be expressed in a closed form, it can be well approximated by the exponential CDF

$$F_{\exp}(L_{\mathrm{air}}) = 1 - \exp\left(-\frac{L_{\mathrm{air}}}{\xi}\right), \tag{27}$$

where $\xi = (5.24 \pm 0.01) \times 10^{-4}$ μM is the value of $\langle L_{\mathrm{air}}\rangle_{\mathrm{cond}}$ estimated by least-squares fitting of $F_{\exp}(L_{\mathrm{air}})$ to $P(L_{\mathrm{air}})$. In order to compare concentration probability distribution functions from different measurements meaningfully, authors [19] plot the CDF for a dimensionless concentration $L_{\mathrm{air}}/\langle L_{\mathrm{air}}\rangle$; see Figure 5A (where $C/\langle C\rangle$ is used instead, since the data plotted were obtained using a propylene source, not pheromone). The scale of such plots is affected by intermittency due to the presence of the total mean in the ratio. Furthermore, information about intermittency is included explicitly in the plots by letting the probability $P(L_{\mathrm{air}} = 0)$ of zero concentration be

$$P(L_{\mathrm{air}} = 0) = 1 - \gamma. \tag{28}$$

Consequently the CDF $P(L_{\mathrm{air}})$ must be renormalized [19]. Intermittency affects only the dimensionless scale, $L_{\mathrm{air}}/\langle L_{\mathrm{air}}\rangle$, and the value of $P(L_{\mathrm{air}} = 0)$, but not the overall shape of the CDF [19].
Therefore we can use formulas 25 and 28 to compare our predictions with experimentally measured data by correcting for different intermittency values.
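The effect of intermittency on the dimensionless CDF can be illustrated with a short simulation. The sketch below is one plausible reading of the renormalization, stated here as an assumption: the in-plume concentrations follow the exponential approximation with conditional mean $\xi$, a fraction $1-\gamma$ of the record is exactly zero, and the total CDF is the mixture $(1-\gamma) + \gamma F_{\exp}$. Expressed in the dimensionless variable $x = L_{\mathrm{air}}/\langle L_{\mathrm{air}}\rangle$ this becomes $1 - \gamma \exp(-\gamma x)$. The intermittency value and the function names are hypothetical.

```python
import numpy as np

XI = 5.24e-4   # conditional mean concentration (uM), fitted value quoted in the text

def total_cdf(x, gamma):
    """CDF of x = L/<L> with a point mass 1 - gamma at zero (assumed mixture form)."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 0, 0.0, 1.0 - gamma * np.exp(-gamma * x))

rng = np.random.default_rng(0)
gamma = 0.4                      # HYPOTHETICAL intermittency, for illustration only
n = 200_000

inside = rng.random(n) < gamma   # True where the signal is present
conc = np.where(inside, rng.exponential(XI, n), 0.0)

total_mean = conc.mean()         # mean over the whole record, zeros included
cond_mean = conc[inside].mean()  # mean inside the plume only

# Empirical CDF of the dimensionless concentration at x = 1
empirical_at_1 = np.mean(conc / total_mean <= 1.0)
print(total_mean / cond_mean, empirical_at_1, float(total_cdf(1.0, gamma)))
```

The ratio of the two simulated means recovers $\gamma$ (Equation 25), and the empirical CDF starts at $1-\gamma$ (Equation 28), so two records with different intermittency can be brought to a common footing by rescaling the axis and shifting the zero-concentration mass.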

Spectral Density Function of the Stimulus Course
The optimal stimulus course is represented by pulses of different pheromone concentrations, $L_{\mathrm{air}}$, occurring in time intervals 0.4 s long. In order to calculate the spectral density function of such a stimulation course we sample the time axis with step $\Delta t = 0.4$ s. Thus we obtain a series of pheromone concentrations at these time points, $\{L_{\mathrm{air},j}\}$, $j = 1, \ldots, n$, where $n$ should be even. The discrete Fourier transform, $\varphi_k$, of $\{L_{\mathrm{air},j}\}$ is defined for $k = 1, \ldots, n$ as [41]

$$\varphi_k = \sum_{j=1}^{n} L_{\mathrm{air},j} \exp\left[-2\pi i\, \frac{(j-1)(k-1)}{n}\right], \tag{29}$$

where $i$ is the imaginary unit. The zero-frequency term is thus at position $k = 1$. The spectral density, $S(f)$, of the complete time course of the stimulus can be calculated for a total of $n/2+1$ values of frequency $f$ (given in Hz) [42]

$$S\!\left(\frac{m}{n\,\Delta t}\right) = 2\,\ldots \tag{30}$$

where $m = 0, 1, 2, \ldots, n/2-1, n/2$ and $f = m/(n\,\Delta t)$ are the frequency values. The function $P(f)$ is the Fourier transform of a pulse of unit height, 0.4 s long and starting at $t = 0$ [41], where $a = 2.5$ and $d = -0.5$. The function $P(f)$ appears in formula 30 because the whole stimulus course (such as shown in Figure 4, bottom panels) can be reconstructed by convolving the discrete series $\{L_{\mathrm{air},j}\}$ with such a pulse of unit height in the time domain [41].
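The two ingredients of the spectral calculation, the discrete Fourier transform of the pulse heights and the transform of a single rectangular pulse, can be checked numerically as sketched below. Several points are assumptions on our part: the pulse heights are hypothetical random draws, the closed form used in `pulse_transform` is the standard transform of a rectangular pulse under the convention $\int e^{-2\pi i f t}\,\mathrm{d}t$ rather than the parametrization with $a$ and $d$ used above, and the final product is computed only up to a normalization constant, which is not specified here. `numpy.fft.fft` uses the same sign convention as Equation 29, with zero-based indices.

```python
import numpy as np

DT = 0.4  # pulse duration and sampling step (s), from the text

def dft_explicit(x):
    """Equation 29 written out with zero-based indices j-1 and k-1."""
    n = len(x)
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) @ x

def pulse_transform(f, width=DT):
    """Fourier transform of a unit-height pulse on [0, width] (standard closed form)."""
    # np.sinc(x) is sin(pi x) / (pi x)
    return width * np.sinc(f * width) * np.exp(-1j * np.pi * f * width)

rng = np.random.default_rng(1)
n = 64                                    # number of pulses, chosen even
conc = rng.exponential(5.24e-4, n)        # HYPOTHETICAL pulse heights (uM)

phi = np.fft.fft(conc)                    # zero-frequency term at index 0
freqs = np.arange(n // 2 + 1) / (n * DT)  # f = m / (n * DT), m = 0 .. n/2

# Shape of the spectrum, up to an unspecified normalization constant
shape = np.abs(phi[: n // 2 + 1]) ** 2 * np.abs(pulse_transform(freqs)) ** 2

# Cross-check the closed form against direct numerical integration at f = 1 Hz
t = (np.arange(100_000) + 0.5) * (DT / 100_000)
numeric = np.sum(np.exp(-2j * np.pi * 1.0 * t)) * (DT / 100_000)
print(abs(numeric - pulse_transform(1.0)))
```

The factor contributed by the pulse vanishes at multiples of $1/\Delta t = 2.5$ Hz, which lie beyond the highest sampled frequency $1/(2\Delta t) = 1.25$ Hz, so within the sampled band it acts as a smooth low-pass envelope on the spectrum of the pulse heights.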