Human Wavelength Discrimination of Monochromatic Light Explained by Optimal Wavelength Decoding of Light of Unknown Intensity

We show that human ability to discriminate the wavelength of monochromatic light can be understood as maximum likelihood decoding of the cone absorptions, with a signal processing efficiency that is independent of the wavelength. This work is built on the framework of ideal observer analysis of visual discrimination used in many previous works. A distinctive aspect of our work is that we highlight a perceptual confound: observers should confuse a change in input light wavelength with a change in input intensity. Hence a simple ideal observer model which assumes that an observer has full knowledge of input intensity should over-estimate human ability in discriminating wavelengths of two inputs of unequal intensity. This confound also makes it difficult to consistently measure human ability in wavelength discrimination by asking observers to distinguish two input colors while matching their brightness. We argue that the best experimental method for reliable measurement of discrimination thresholds is the one of Pokorny and Smith, in which observers only need to distinguish two inputs, regardless of whether they differ in hue or brightness. We mathematically formulate wavelength discrimination under this wavelength-intensity confound and show a good agreement between our theoretical prediction and the behavioral data. Our analysis explains why the discrimination threshold varies with the input wavelength, and shows how sensitively the threshold depends on the relative densities of the three types of cones in the retina (and, in particular, predicts discrimination in dichromats). Our mathematical formulation and solution can be applied to general problems of sensory discrimination when there is a perceptual confound from other sensory feature dimensions.


Introduction
In a classical wavelength discrimination experiment, the observer views a bipartite field, one half filled with light of a standard wavelength and the other with light of a comparison wavelength. The wavelength of the comparison field is changed in small steps and the observer adjusts the radiance of the comparison field following each change in an attempt to make the two fields perceptually identical. Wavelength discrimination threshold is reached when the observer reports that the two fields always appear different, regardless of the radiance of the comparison [1]. This discrimination threshold in humans is a "w"-shaped function of the wavelength of the light: it has a central peak at around wavelength $\lambda = 540$ nanometers (nm), minima at $\lambda = 490$ and $\lambda = 580$ nm, and rises sharply for $\lambda > 650$ nm and for very short wavelengths [1]; similar results hold for the macaque monkey and presumably other old world primates [2].
This work aims to see if human monochromatic light discrimination thresholds can be understood as optimal decoding of the sensory input using the information available in the cones, regardless of the specific neural mechanisms involved. In particular, we derive and evaluate a photon noise limited ideal observer that performs wavelength discrimination based on the numbers of photons absorbed in the three classes of cone. It is well known that human performance does not approach that of a photon noise limited ideal observer [3,4,5,6], and thus our primary aim here is to determine how well the shape of the human wavelength discrimination function is explained by the ideal observer, regardless of its overall amplitude. If the shape were perfectly explained, then it would imply that the neural mechanisms following the cones are equally efficient for different wavelengths.
Wavelength discrimination of monochromatic lights is one of the visual tasks most suited to ideal observer analysis for the following reasons. Input sampling by the photoreceptors is among the best quantitatively understood processes along the visual processing pathway. In particular, the wavelength sensitivities of cones are known, and the stochastic nature of the cone absorption levels can be described by Poisson distributions. The discrimination task is simple because it involves purely chromatic discrimination, so the spatial and temporal aspects of the inputs can be ignored or absorbed by the scale for the total input intensity. Therefore, the total absorptions by the excited cones of each type provide sufficient statistics for analysing the decoding of the input stimulus and the uncertainty of that decoding.
Geisler [8] in particular used such an analysis to understand many human discrimination tasks based on cone responses. Among the tasks analyzed is our task of monochromatic light discrimination. His work and the current work are both based on the maximum likelihood method, which can be used to optimally estimate or discriminate sensory inputs from their evoked neural responses. These two methods are approximately equivalent in the principle of maximum likelihood discrimination of two stimuli. However, this previous work did not identify an important issue that is essential for fully understanding the behavioral data. This issue is that of a confound in perception of multiple sensory features: in particular, human observers can easily confuse an input color change with an input intensity change when monochromatic lights are the inputs. For example, a long wavelength input may appear darker when the input wavelength is increased while input intensity is held fixed. This confusion reduces human ability in hue discrimination when observers do not have full knowledge of input intensities. To fully account for the behavioral data, this confound should be formulated explicitly in the ideal observer analysis.
The current work presents an augmented formulation of the ideal observer analysis to address sensory discrimination under a perceptual confound, and applies it to wavelength discrimination behavior. The sensory input includes both sensory feature dimensions: one is the input wavelength dimension whose discrimination is of interest, and the other is the input intensity dimension which interferes or interacts with wavelength discrimination through the perceptual confound and the experimental methods used. Our mathematical formulation of this problem of sensory discrimination under perceptual confound is general. While it is applied specifically to the wavelength discrimination problem in this paper, it can also be applied elsewhere. It will enable us to identify experimental methods which can provide more reliable measurements of the discrimination performance. From our formulation, we derive how the threshold is related to the cones' wavelength sensitivities and the input light intensity, illustrate how sensitively the predictions depend on the relative densities of the three types of cones in the retina, and analyze why the discrimination threshold varies with the input wavelength in the ways observed. We show that the theoretical predictions from the ideal observer analysis, once augmented to accommodate the perceptual confound, give a better account of the behavioral data. Furthermore, we show how different sizes of stimuli used by different experiments may explain their different patterns of results. A preliminary report about this work has been presented elsewhere [9].

Methods
The spectral sensitivities of the cones
Let there be three types of cone, $a = L, M, S$, which are most sensitive to long, medium, and short wavelengths respectively (they are sometimes called red, green, and blue cones). They have tuning curves $f_a(\lambda)$, such that the average absorption of a single cone $a$ in response to a monochromatic light of intensity $I$ at wavelength $\lambda$ is $\bar r_a = I f_a(\lambda)$. If $n_a$ cones of type $a$ are excited by a uniform patch of light, then the essential quantities for determining input color, regardless of the spatial shape of the input patch, are the total responses from each of the three cone types. For the task of color discrimination, it is equivalent to view the $n_a$ cones of type $a$ collectively as a single giant cone with sensitivity $n_a f_a(\lambda)$: the total response of this giant cone is a sufficient statistic for the task (i.e., it carries all the information relevant to the task), so viewing individual cones separately does not provide any additional useful information. The all-important ratios $n_L f_L(\lambda) : n_M f_M(\lambda) : n_S f_S(\lambda)$ depend on both the relative densities and the relative sensitivities of the different cone types.
According to various experimental data on the responses from and light absorption by cones [10,11,12], $f_a(\lambda)$ for different cones should reach the same peak value, if one ignores the pre-receptor absorption by the ocular media. We denote this normalized spectral sensitivity as $\hat f_a(\lambda)$, and will call it the cone fundamental. However, pre-receptor absorption of the input lights by the ocular media makes $f_a(\lambda) = O(\lambda) \hat f_a(\lambda)$, where $O(\lambda) \le 1$ is the pre-receptor absorption factor (the fraction of light transmitted by the ocular media). Let $O_a = f_a(\lambda_a)$, where $\lambda_a$ is the wavelength where $f_a(\lambda)$ peaks; then $f_a(\lambda)/O_a$ should correspond to the behaviorally measured (normalized) cone fundamental, and for notational simplicity we still denote it as $\hat f_a(\lambda)$, and thus $f_a(\lambda) = O_a \hat f_a(\lambda)$. Meanwhile, assuming that $O(\lambda)$ does not change as quickly as $\hat f_a(\lambda)$ with $\lambda$ near $\lambda_a$, then $O_a \approx 10^{-OD(\lambda_a)}$, where $OD(\lambda_a)$ is the optical density of the pre-receptor ocular media at wavelength $\lambda_a$.
In our analysis later, we will include the cone density factor $n_a$ and use the notation $f_a(\lambda) = n_a O_a \hat f_a(\lambda)$. Furthermore, we normalize $f_a(\lambda)$ such that $\max_\lambda \sum_a f_a(\lambda) = 1$. Given these normalizations, the total photon absorptions of the cones will also scale with the size of the input light field (which determines the total number of cones for each cone type) and with the observers' effective integration time. These scale factors will be absorbed into the input intensity parameter $I$, which also scales with the input radiance. We will see later that, given $f_a(\lambda)$, the shape of the curve relating the discrimination threshold to wavelength is completely determined by the optimal decoding, and the parameter $I$ merely scales the threshold.
As our illustrative starting point, we approximate $n_L : n_M : n_S = 6 : 3 : 1$ and $O_L : O_M : O_S = 1 : 1 : 0.2$. These numerical values arise from the following considerations. Firstly, various sources suggest that S cones are almost absent within 0.3 deg from the center of the fovea, but their contribution to the total cone density rises to a peak of 15% around 1 deg from the center [13] and approaches 7-10% in the periphery [13,14]. Meanwhile, the Pokorny and Smith data [1] were from experiments using a centrally viewed 3° disc containing the bipartite field of color inputs. We combine this information to assume that the S cones contribute 10% to all cones excited by the Pokorny and Smith stimuli. Secondly, various sources suggest that L cones are about twice as numerous as the M cones [14]; we hence assume that L and M cones contribute 60% and 30%, respectively, of all the cones excited by the stimuli. This gives us $n_L : n_M : n_S = 6 : 3 : 1$. Thirdly, the optical density of the pre-receptor ocular media is almost constant in the medium and long wavelength region, giving $O_L : O_M \approx 1$, but rises with decreasing $\lambda$ by 0.7 log units when $\lambda = \lambda_S \approx 440$ nm [14], giving $O_L : O_S \approx 10^{0.7} \approx 5$. Additionally, although the cone fundamentals $\hat f_a(\lambda)$ from various literature sources are similar, we use those from Smith and Pokorny [15] (obtained from the CVRL website (http://www.cvrl.org) by Andrew Stockman), since we will be fitting their wavelength discrimination data [1]. Combining the considerations above gives $f_a(\lambda)$ as shown in Fig. 1. It turns out that these $f_a(\lambda)$'s are not far from those by Vos and Walraven [16], who made $\sum_a f_a(\lambda) = V(\lambda)$, where $V(\lambda)$ is the luminous efficiency function, a measure of the visual effectiveness of lights at different wavelengths for luminosity, normalized such that the maximum value of $V(\lambda)$ is 1, i.e., $\max_\lambda V(\lambda) = 1$. The biggest discrepancy between the two sets of $f_a(\lambda)$'s is that the S cone contribution is weaker in Vos and Walraven's composition [16] than in ours. This is not too surprising: although the relative contributions by different cone types to luminosity perception are not necessarily the same as their relative contributions to color perception, they should be related or quite close to each other, except that S cones may contribute to the luminosity perception less than suggested by their density [17].
Our analysis and conclusions do not depend sensitively on our actual approximation for $f_a(\lambda)$. We will later explore how our results vary quantitatively when we use other choices for the ratio $f_L(\lambda) : f_M(\lambda) : f_S(\lambda)$. This ratio depends on cone densities and the optical density of the pre-receptor ocular media, which both vary substantially between observers (e.g., by up to one log unit in optical density [14]). This ratio also depends on the cone spectral sensitivities, which do not vary as substantially between observers, although different literature sources provide slightly different quantitative values for them.
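The composition $f_a(\lambda) = n_a O_a \hat f_a(\lambda)$ with the normalization $\max_\lambda \sum_a f_a(\lambda) = 1$ can be sketched in a few lines of Python. This is only an illustration: the Gaussian `fundamental` below is a hypothetical stand-in for the tabulated Smith and Pokorny cone fundamentals (its peak wavelengths and width are assumptions), and the function names are invented for this sketch.

```python
import math

# ASSUMPTION: Gaussian stand-ins for the cone fundamentals; real fundamentals are tabulated data.
PEAKS = {"L": 565.0, "M": 535.0, "S": 440.0}   # assumed peak wavelengths (nm)
WIDTH = 40.0                                    # assumed common bandwidth (nm)

def fundamental(a, lam):
    """Hypothetical normalized cone fundamental with peak value 1."""
    return math.exp(-0.5 * ((lam - PEAKS[a]) / WIDTH) ** 2)

def make_sensitivities(n, O, lams):
    """Return f(a, lam) = n_a * O_a * fundamental_a(lam), scaled so that
    the maximum over lams of sum_a f(a, lam) equals 1."""
    raw_total = [sum(n[a] * O[a] * fundamental(a, lam) for a in PEAKS) for lam in lams]
    scale = 1.0 / max(raw_total)

    def f(a, lam):
        return scale * n[a] * O[a] * fundamental(a, lam)

    return f

lams = [400.0 + k for k in range(301)]           # 400..700 nm in 1 nm steps
n = {"L": 6.0, "M": 3.0, "S": 1.0}               # relative cone numbers from the text
O = {"L": 1.0, "M": 1.0, "S": 0.2}               # relative pre-receptor factors from the text
f = make_sensitivities(n, O, lams)
peak_total = max(sum(f(a, lam) for a in PEAKS) for lam in lams)
print(peak_total)
```

Because the overall scale is absorbed into the intensity parameter $I$, only the ratios between the three weighted curves matter for the shape of the predicted threshold.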

Stochastic cone absorptions in response to monochromatic light
In this paper, we only consider monochromatic inputs. Hence, we describe our input stimulus by $s = (\lambda, I)$, a vector of two parameters, $\lambda$ and $I$, for the wavelength and intensity of the input light. The actual cone absorption $r_a$ for cone $a$ is stochastic, following a Poisson distribution with mean $\bar r_a = I f_a(\lambda)$:
$$P(r_a|s) = \frac{(\bar r_a)^{r_a}}{r_a!} \exp(-\bar r_a) = \frac{(I f_a(\lambda))^{r_a}}{r_a!} \exp(-I f_a(\lambda)). \tag{1}$$
Sometimes we also call $r_a$ the response of the cone to the input light. The population response $r \equiv (r_L, r_M, r_S)$ has the probability
$$P(r|s) = \prod_a P(r_a|s) = \left[\prod_a \frac{(I f_a(\lambda))^{r_a}}{r_a!}\right] \exp\left[-I \sum_a f_a(\lambda)\right]. \tag{2}$$
Fig. 1 shows how an input of a particular wavelength could give rise to many possible responses in the three dimensional space $r = (r_L, r_M, r_S)$ near the mean response $\bar r = (\bar r_L, \bar r_M, \bar r_S)$.
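As a rough numerical companion to equations (1) and (2), the log of the population probability can be evaluated directly. The sketch below assumes hypothetical Gaussian tuning curves (the weights, peaks, and width in `CONES` and `WIDTH` are illustrative assumptions, not the paper's cone fundamentals) and sets the counts to the rounded mean absorptions rather than drawing Poisson noise.

```python
import math

# ASSUMPTION: hypothetical weighted tuning curves f_a(lam), Gaussian in wavelength.
CONES = {"L": (0.60, 565.0), "M": (0.30, 535.0), "S": (0.02, 440.0)}  # (weight, peak in nm)
WIDTH = 40.0                                                           # assumed bandwidth (nm)

def f(a, lam):
    weight, peak = CONES[a]
    return weight * math.exp(-0.5 * ((lam - peak) / WIDTH) ** 2)

def log_likelihood(r, lam, I):
    """ln P(r|s) for s = (lam, I): sum of independent Poisson log-probabilities."""
    total = 0.0
    for a, count in r.items():
        mean = I * f(a, lam)                       # mean absorption I * f_a(lam)
        total += count * math.log(mean) - mean - math.lgamma(count + 1)  # lgamma(k+1) = ln k!
    return total

I_true, lam_true = 3210.0, 550.0
# Counts set to the rounded means (no noise is drawn in this sketch).
r = {a: round(I_true * f(a, lam_true)) for a in CONES}
ll_true = log_likelihood(r, lam_true, I_true)
ll_shifted = log_likelihood(r, lam_true + 10.0, I_true)
print(ll_true, ll_shifted)
```

With these assumed curves, the log probability of the responses is higher under the wavelength that generated them than under a wavelength 10 nm away.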

Maximum likelihood decoding
Given the responses $r$, one can decode the input stimulus $s = (\lambda, I)$ from the conditional probability $P(s|r)$ (of $s$ given $r$) by finding the $s$ that makes $P(s|r)$ maximum or large. So the most likely input to evoke $r$ is the one that maximizes $P(s|r)$. By Bayes's formula, we have $P(s|r) = P(r|s)P(s)/P(r)$, where $P(s)$ is the prior probability of input $s$ and $P(r) = \int P(r|s)P(s)\,ds$. When the prior probability $P(s)$ is constant, so that it does not favour one $s$ over another, then $P(s|r)$ varies with $s$ only through $P(r|s)$, i.e., $P(s|r) \propto P(r|s)$. Therefore, the input $s$ for responses $r$ can be found by maximizing $P(r|s)$. As $P(r|s)$, viewed as a function of $s$, is called the likelihood, decoding by maximizing $P(r|s)$ is called maximum likelihood decoding. We will use this method to understand wavelength discrimination.

Decoding for input wavelength when input intensity is known and fixed
When input intensity $I$ is known and fixed, knowing the response $r$ enables us to estimate the input wavelength $\lambda$ using maximum likelihood decoding. We call this the simple model of optimal input wavelength estimation, in the sense that we are not considering the variation of $I$ (as in the experimental procedure of Pokorny and Smith [1]) in decoding. With a flat prior expectation that $\lambda$ could be any value (within the visible light spectrum), the best estimate $\hat\lambda$ for the input $\lambda$ is the one that maximizes the probability $P(r|s)$ or, equivalently, its natural logarithm, $\ln P(r|s)$, which we call the log likelihood.

Figure 1. A monochromatic input of wavelength $\lambda$ evokes response $r = (r_L, r_M, r_S)$ from the three cones, L, M, and S. Due to input noise, there is a range of possible responses $r$ from this input. If the mean response to a monochromatic input of nearby wavelength $\lambda + \delta\lambda$ is one of the typical responses within this range of responses $r$ to input $\lambda$, then it will be difficult to perceptually distinguish the input $\lambda$ from input $\lambda + \delta\lambda$. doi:10.1371/journal.pone.0019248.g001

The best estimate $\hat\lambda$ is the value of $\lambda$ satisfying
$$\frac{\partial \ln P(r|s)}{\partial \lambda} = \sum_a \left[\frac{r_a}{f_a(\lambda)} - I\right] f'_a(\lambda) = 0,$$
where $f'_a(\lambda) \equiv df_a(\lambda)/d\lambda$. In a special case, if $r_a = I f_a(\lambda)$ under input $s = (\lambda, I)$ for all three cones (i.e., the response of each cone type is exactly equal to the mean absorption), then $\hat\lambda = \lambda$ is the value satisfying the above equation. In general, there is no $\hat\lambda$ to make $r_a = I f_a(\hat\lambda)$ exactly for all three cones simultaneously, but one can still find a $\hat\lambda$ to satisfy the equation above. In any case, given an input wavelength $\lambda$, different responses $r$ will lead to different estimates $\hat\lambda(r)$; most of them will be near to but not equal to the actual input wavelength $\lambda$. So if two different input wavelengths $\lambda_1$ and $\lambda_2$ are similar enough, the estimated wavelengths $\hat\lambda_1$ and $\hat\lambda_2$ may appear to be drawn from the same probability distribution. In such a case, these two input wavelengths would appear perceptually indiscriminable, or within the discrimination threshold; see Fig. 1.
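Maximum likelihood decoding of the wavelength at a known intensity can be illustrated by a brute-force search over a wavelength grid. This is a minimal sketch, not the procedure used for the figures: the Gaussian tuning curves are hypothetical stand-ins for the cone fundamentals, and `decode_wavelength` is a name introduced here for illustration.

```python
import math

# ASSUMPTION: hypothetical weighted Gaussian tuning curves, for illustration only.
CONES = {"L": (0.60, 565.0), "M": (0.30, 535.0), "S": (0.02, 440.0)}  # (weight, peak in nm)
WIDTH = 40.0

def f(a, lam):
    weight, peak = CONES[a]
    return weight * math.exp(-0.5 * ((lam - peak) / WIDTH) ** 2)

def decode_wavelength(r, I, lam_grid):
    """Grid-search maximum likelihood estimate of wavelength with intensity I known."""
    def ll(lam):
        # The ln(r_a!) term is dropped because it does not depend on lam.
        return sum(r[a] * math.log(I * f(a, lam)) - I * f(a, lam) for a in r)
    return max(lam_grid, key=ll)

lam_grid = [400.0 + 0.1 * k for k in range(3001)]        # 400..700 nm in 0.1 nm steps
I_known = 3210.0
r = {a: round(I_known * f(a, 550.0)) for a in CONES}     # rounded mean responses to 550 nm
lam_hat = decode_wavelength(r, I_known, lam_grid)
print(lam_hat)
```

Feeding noisy responses instead of rounded means would scatter the estimates around the true wavelength, which is the spread that the discrimination threshold characterizes.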
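With strong enough responses $r$ (effectively, responses collected from enough cones and sufficiently many captured photons), it is known that the variance of the maximum likelihood estimates $\hat\lambda(r)$ for a given input $\lambda$ should approach [18]
$$[\sigma(\lambda)]^2 = \frac{1}{I_F(\lambda)},$$
where $I_F(\lambda)$ is the Fisher information, defined as
$$I_F(\lambda) \equiv -\left\langle \frac{\partial^2 \ln P(r|s)}{\partial \lambda^2} \right\rangle,$$
where $\langle x \rangle$ denotes the average $\int dr\, P(r|s)\, x$ of $x$ over $P(r|s)$. Since
$$\frac{\partial^2 \ln P(r|s)}{\partial \lambda^2} = \sum_a r_a \left[\frac{f''_a(\lambda)}{f_a(\lambda)} - \frac{(f'_a(\lambda))^2}{f_a^2(\lambda)}\right] - I \sum_a f''_a(\lambda)$$
and $\langle r_a \rangle = I f_a(\lambda)$, we have
$$I_F(\lambda) = I \sum_a \frac{(f'_a(\lambda))^2}{f_a(\lambda)}.$$
As $\sigma^2 = 1/I_F(\lambda)$, a larger Fisher information gives a smaller estimation error $\sigma$. This estimation error can be expressed as
$$\sigma(\lambda) = \frac{\bar\sigma(\lambda)}{\sqrt{I}}, \qquad \bar\sigma(\lambda) \equiv \left[\sum_a \frac{(f'_a(\lambda))^2}{f_a(\lambda)}\right]^{-1/2},$$
in which $\bar\sigma(\lambda)$ does not depend on intensity $I$. The estimation error $\sigma(\lambda)$ is identified here as the discrimination threshold, as it characterizes the uncertainty of the perceived wavelength. Fig. 2 shows this threshold $\sigma(\lambda)$ as a function of $\lambda$, together with the experimentally observed threshold $\sigma_{data}(\lambda)$ from Pokorny and Smith [1]. Let $\sigma_{data}(\lambda)$ and $\Delta\sigma_{data}(\lambda)$ be the mean and the standard deviation of the wavelength discrimination thresholds of the four observers in Pokorny and Smith [1]. The input intensity $I = 3210$ in Fig. 2 is chosen as the one that minimizes the average square difference
$$\chi^2 \equiv \frac{1}{N} \sum_\lambda \frac{[\sigma(\lambda) - \sigma_{data}(\lambda)]^2}{[\Delta\sigma_{data}(\lambda)]^2}, \tag{11}$$
where $N$ is the number of data points. The $I$ that minimizes $\chi^2$ is the one that gives $\partial\chi^2/\partial I = 0$, leading to
$$I = \left[\frac{\sum_\lambda \bar\sigma(\lambda)^2 / [\Delta\sigma_{data}(\lambda)]^2}{\sum_\lambda \bar\sigma(\lambda)\,\sigma_{data}(\lambda) / [\Delta\sigma_{data}(\lambda)]^2}\right]^2. \tag{12}$$
One can see that the model prediction greatly underestimates the threshold for long wavelengths $\lambda \ge 620$ nm. Also, the peak location near 550 nm is not quite right. This best fit gives $\chi = 1.419$, indicating that for most data points, the model predicts a threshold which departs from the data by more than a standard deviation of the data point.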
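The simple-model threshold and the one-parameter fit of $I$ can be sketched as follows. The code is illustrative only: the tuning curves are hypothetical Gaussians, the "data" are synthetic values generated from the model itself (not the Pokorny and Smith measurements), and the closed form in `best_intensity` is the one obtained by setting the derivative of the weighted squared error with respect to $I$ to zero when $\sigma(\lambda) = \bar\sigma(\lambda)/\sqrt{I}$.

```python
import math

# ASSUMPTION: hypothetical weighted Gaussian tuning curves, for illustration only.
CONES = {"L": (0.60, 565.0), "M": (0.30, 535.0), "S": (0.02, 440.0)}  # (weight, peak in nm)
WIDTH = 40.0

def f(a, lam):
    weight, peak = CONES[a]
    return weight * math.exp(-0.5 * ((lam - peak) / WIDTH) ** 2)

def df(a, lam, h=0.01):
    """Central-difference estimate of f'_a(lam)."""
    return (f(a, lam + h) - f(a, lam - h)) / (2.0 * h)

def sigma_bar_simple(lam):
    """Intensity-independent factor [sum_a f'_a^2 / f_a]^(-1/2) of the simple model."""
    return 1.0 / math.sqrt(sum(df(a, lam) ** 2 / f(a, lam) for a in CONES))

def best_intensity(sig_bar, sig_data, dsig):
    """I minimizing sum_k [(sig_bar_k / sqrt(I) - sig_data_k) / dsig_k]^2."""
    num = sum(sb * sb / d ** 2 for sb, d in zip(sig_bar, dsig))
    den = sum(sb * sd / d ** 2 for sb, sd, d in zip(sig_bar, sig_data, dsig))
    return (num / den) ** 2

def chi(I, sig_bar, sig_data, dsig):
    """Root of the average squared error, in units of the data standard deviation."""
    terms = [((sb / math.sqrt(I) - sd) / d) ** 2 for sb, sd, d in zip(sig_bar, sig_data, dsig)]
    return math.sqrt(sum(terms) / len(terms))

lams = [450.0 + 20.0 * k for k in range(10)]              # 450..630 nm
sig_bar = [sigma_bar_simple(lam) for lam in lams]
# SYNTHETIC data generated from the model with I = 3210, only to exercise the fit.
sig_data = [sb / math.sqrt(3210.0) for sb in sig_bar]
dsig = [0.1 * sd for sd in sig_data]
I_fit = best_intensity(sig_bar, sig_data, dsig)
print(I_fit, chi(I_fit, sig_bar, sig_data, dsig))
```

Because the synthetic data come from the same model, the fit recovers the generating intensity; with real data the residual $\chi$ measures how well the predicted shape matches the measurements.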
The poor fit of the simple model arises because of the following. In Pokorny and Smith's experiment, observers adjusted the intensity $I$ of the comparison input field with wavelength $\lambda + \delta\lambda$ to make it look as perceptually indistinguishable as possible from the standard input field, which has input wavelength $\lambda$. This adjustment makes the comparison and standard input fields look indistinguishable until $\delta\lambda$ is too large, and the wavelength discrimination threshold is defined as the $\delta\lambda$ at which this matching between the two fields starts to become impossible, so that the comparison field is perceptually discriminable from the standard field no matter how observers adjust the intensity $I$. If the observers somehow had full knowledge of the intensities $I$ in both fields, they should in principle still be able to decode and thus discriminate the wavelength to roughly the same accuracy as predicted by the simple model when the intensity is held fixed and identical in the two fields. The predictions overestimate human accuracy because one should not assume that the observers know the intensities $I$, which also have to be decoded from the same sensory stimuli used to decode the wavelength. To explain the experimental data, our model should let $I$ be unknown and changeable rather than known and fixed. We call this the full model (rather than the simple model) of optimal wavelength estimation, and this model is explained next.

Figure 2. Wavelength discrimination assuming input intensity $I$ is fixed and known during color matching. It is by maximum likelihood decoding of the cone responses $r$ using the simple model. The solid curve plots the discrimination threshold $\sigma(\lambda) = [I_F(\lambda)]^{-1/2}$ as a function of $\lambda$ from the model. The data points with error bars are the mean $\sigma_{data}$ and the standard deviation $\Delta\sigma_{data}$ of the discrimination thresholds of the four observers of Pokorny and Smith [1]. In fitting the model to the data, $I$ is chosen such that the quantity $\sum_\lambda [\sigma(\lambda) - \sigma_{data}(\lambda)]^2 / [\Delta\sigma_{data}(\lambda)]^2$ is minimized.

Sensory discrimination under perceptual confound: wavelength discrimination when input intensity is not fixed
Wavelength discrimination when input intensity is not fixed is just one example of a general problem of sensory discrimination under perceptual confound: sensory discrimination along one sensory feature dimension when neural responses are also affected by feature changes in another feature dimension. In the wavelength discrimination case, the two feature dimensions are input light wavelength $\lambda$ and input intensity $I$. Here, we formulate this problem in general, and it will be clear that our result in equation (20) is general and not specific to our example of monochromatic wavelength discrimination. Meanwhile, we will use our wavelength discrimination problem as an example to illustrate this general result.
Let the sensory input be $s = (s_1, s_2)$, where $s_1$ and $s_2$ are feature values in the two feature dimensions, e.g., $s_1 = \lambda$ and $s_2 = I$. Let $r$ be the neural responses evoked by $s$ with probability $P(r|s)$. The maximum likelihood estimate $\hat s$ of $s$ from $r$ can be arrived at by finding the solution to
$$\frac{\partial \ln P(r|s)}{\partial s_1} = 0, \tag{13}$$
$$\frac{\partial \ln P(r|s)}{\partial s_2} = 0.$$
The estimation error is
$$\delta s \equiv \hat s - s = (\delta s_1, \delta s_2).$$
This error depends on the specific response $r$ in each trial. Over many trials, these two dimensional errors $(\delta s_1, \delta s_2)$ have a covariance, generalizing from the simple 1-dimensional case above, given by
$$\langle \delta s_i\, \delta s_j \rangle = \left(I_F^{-1}(s)\right)_{ij}, \tag{16}$$
where $I_F^{-1}(s)$ is the matrix inverse of the Fisher information matrix
$$\left(I_F(s)\right)_{ij} = -\left\langle \frac{\partial^2 \ln P(r|s)}{\partial s_i\, \partial s_j} \right\rangle.$$
We note that, when $s_1$ is $\lambda$ in our example, the matrix element $(I_F)_{11} = -\langle \partial^2 \ln P(r|s)/\partial\lambda^2 \rangle$ is exactly the Fisher information we had in our simple model of wavelength discrimination. Let $P(\hat s|s)$ be the probability of obtaining the maximum likelihood estimate $\hat s$ when the true input is $s$. Since the estimation error $\delta s = \hat s - s$ has the covariance structure in equation (16), we can approximate $P(\hat s|s)$ as
$$P(\hat s|s) \approx P(s|s) \exp\left[-\frac{1}{2} \sum_{ij} \left(I_F(s)\right)_{ij} \delta s_i\, \delta s_j\right].$$
Note that this approximation makes the error $\delta s$ have zero mean and gives the correct error covariance. Now the threshold to discriminate $s_1$ while $s_2$ is not fixed is the largest $\delta s_1$ value that can be obtained to maintain $P(\hat s|s) = P(s|s) \exp(-1/2)$, i.e., to give
$$\sum_{ij} \left(I_F(s)\right)_{ij} \delta s_i\, \delta s_j = 1. \tag{19}$$
Applying the above to the example of wavelength discrimination, the threshold for wavelength $\lambda$ discrimination while $I$ is not fixed is the largest $\delta s_1 = \delta\lambda$ value that can be obtained to maintain $(I_F)_{11}\, \delta\lambda^2 + (I_F)_{22}\, \delta I^2 + 2 (I_F)_{12}\, \delta\lambda\, \delta I = 1$, a particular example of equation (19). This is illustrated in Fig. 3, which shows the contour plot of the posterior probability $P(\hat\lambda, \hat I|s)$. This probability peaks at the origin $s = (\lambda, I)$ of the coordinates in this plot. As the deviation $\delta s = (\delta\lambda, \delta I)$ of $\hat s = (\hat\lambda, \hat I)$ from $s = (\lambda, I)$ increases, the probability $P(\hat\lambda, \hat I|s)$ decreases, as indicated by the contours of probabilities, with larger, darker contours indicating smaller probabilities.
When $\delta I = 0$, the largest $\delta\lambda$ to make $\sum_{ij} (I_F(s))_{ij}\, \delta s_i\, \delta s_j = 1$ is $\delta\lambda = [(I_F)_{11}]^{-1/2}$, the color discrimination threshold in the simple model, indicated by $\Delta\lambda_{\delta I = 0}$ in Fig. 3. If $\delta I \ne 0$, then the largest wavelength deviation $\delta\lambda = \Delta\lambda_{\delta I \ne 0}$ on the contour $P(\hat\lambda, \hat I|s) = P(s|s) \exp(-1/2)$ should be larger, as indicated in the figure. This condition of $\delta I \ne 0$ means that the decoding system assumes that $\hat I$ can be different from the default $I$, i.e., the intensity of the comparison field can be different from the intensity of the standard field in the color matching.
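The geometric statement above can be checked numerically for any positive definite matrix. The sketch below uses a made-up 2x2 matrix `A` in place of the Fisher information matrix (its entries are arbitrary assumptions), scans all directions to find the largest $\delta s_1$ on the contour of equation (19), and compares it with the square root of the (1,1) element of the inverse matrix, the value suggested by the covariance in equation (16).

```python
import math

def largest_ds1_on_contour(A, steps=200000):
    """Largest ds1 on the contour sum_ij A[i][j] ds_i ds_j = 1, found by scanning directions."""
    best = 0.0
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        c, s = math.cos(t), math.sin(t)
        q = A[0][0] * c * c + 2.0 * A[0][1] * c * s + A[1][1] * s * s  # quadratic form along (c, s)
        best = max(best, c / math.sqrt(q))   # ds1 at the point where this ray meets the contour
    return best

# ASSUMPTION: arbitrary positive definite matrix standing in for the Fisher information matrix.
A = [[4.0, 1.5], [1.5, 1.0]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
numeric = largest_ds1_on_contour(A)
from_inverse = math.sqrt(A[1][1] / det)      # sqrt of the (1,1) element of the inverse of A
fixed_s2 = 1.0 / math.sqrt(A[0][0])          # largest ds1 when ds2 is held at zero
print(numeric, from_inverse, fixed_s2)
```

The off-diagonal element is what makes the unconstrained extent exceed the fixed-$s_2$ extent; with a diagonal matrix the two coincide.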
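We can show (detailed derivation in the next subsection after equation (28)) that the discrimination threshold for feature $s_1$ when input feature $s_2$ is not fixed (e.g., wavelength discrimination threshold at wavelength $\lambda$ when input intensity $I$ is not fixed) is
$$\sigma(s_1) = \sqrt{\left(I_F^{-1}(s)\right)_{11}} = \sqrt{\frac{(I_F)_{22}}{\det(I_F)}}. \tag{20}$$
In particular, $I_F$ in our wavelength discrimination problem is
$$I_F = -\begin{pmatrix} \left\langle \dfrac{\partial^2 \ln P(r|s)}{\partial\lambda^2} \right\rangle & \left\langle \dfrac{\partial^2 \ln P(r|s)}{\partial\lambda\, \partial I} \right\rangle \\ \left\langle \dfrac{\partial^2 \ln P(r|s)}{\partial\lambda\, \partial I} \right\rangle & \left\langle \dfrac{\partial^2 \ln P(r|s)}{\partial I^2} \right\rangle \end{pmatrix}.$$
Since we have
$$\frac{\partial \ln P(r|s)}{\partial\lambda} = \sum_a r_a \frac{f'_a(\lambda)}{f_a(\lambda)} - I \sum_a f'_a(\lambda),$$
$$\frac{\partial^2 \ln P(r|s)}{\partial\lambda\, \partial I} = -\sum_a f'_a(\lambda),$$
and
$$\frac{\partial^2 \ln P(r|s)}{\partial I^2} = -\sum_a \frac{r_a}{I^2}, \tag{24}$$
then, given $\bar r_a = I f_a(\lambda)$, we have
$$(I_F)_{11} = I \sum_a \frac{(f'_a(\lambda))^2}{f_a(\lambda)}, \qquad (I_F)_{12} = (I_F)_{21} = \sum_a f'_a(\lambda), \qquad (I_F)_{22} = \frac{1}{I} \sum_a f_a(\lambda).$$
Plugging the above into equation (20), we have the wavelength discrimination threshold
$$\sigma(\lambda) = \frac{1}{\sqrt{I}} \left[\sum_a \frac{(f'_a(\lambda))^2}{f_a(\lambda)} - \frac{\left(\sum_a f'_a(\lambda)\right)^2}{\sum_a f_a(\lambda)}\right]^{-1/2}. \tag{28}$$
Again, this threshold can be written as $\sigma(\lambda) = \bar\sigma(\lambda)/\sqrt{I}$. This predicts precisely how the wavelength discrimination threshold should vary with wavelength $\lambda$, and that it should scale with $1/\sqrt{I}$ as in the simple model. Like the simple model, the full model only has one free parameter, $I$.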
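The two thresholds can be compared numerically from the three Fisher matrix elements. The following sketch assumes hypothetical Gaussian tuning curves (not the Smith and Pokorny fundamentals) and a finite-difference derivative; `thresholds` is a name introduced for illustration.

```python
import math

# ASSUMPTION: hypothetical weighted Gaussian tuning curves, for illustration only.
CONES = {"L": (0.60, 565.0), "M": (0.30, 535.0), "S": (0.02, 440.0)}  # (weight, peak in nm)
WIDTH = 40.0

def f(a, lam):
    weight, peak = CONES[a]
    return weight * math.exp(-0.5 * ((lam - peak) / WIDTH) ** 2)

def df(a, lam, h=0.01):
    """Central-difference estimate of f'_a(lam)."""
    return (f(a, lam + h) - f(a, lam - h)) / (2.0 * h)

def thresholds(lam, I):
    """Return (simple-model threshold, full-model threshold) at wavelength lam."""
    fs = [f(a, lam) for a in CONES]
    dfs = [df(a, lam) for a in CONES]
    i11 = I * sum(d * d / v for d, v in zip(dfs, fs))   # wavelength-wavelength element
    i12 = sum(dfs)                                      # wavelength-intensity element
    i22 = sum(fs) / I                                   # intensity-intensity element
    det = i11 * i22 - i12 * i12
    simple = 1.0 / math.sqrt(i11)      # intensity known and fixed
    full = math.sqrt(i22 / det)        # intensity unknown
    return simple, full

for lam in (450.0, 500.0, 550.0, 600.0, 650.0):
    simple, full = thresholds(lam, 3000.0)
    print(lam, round(simple, 3), round(full, 3))
```

In this sketch the full-model threshold is never below the simple-model one, and quadrupling $I$ halves both, in line with the $1/\sqrt{I}$ scaling.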

Mathematical proof of equation (20)
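For matrix $I_F$, let us denote its normalized eigenvectors as $V_1$ and $V_2$, with corresponding eigenvalues $F_1$ and $F_2$. Note that the two eigenvectors $V_1$ and $V_2$ are orthogonal to each other, since $I_F$ is a symmetric matrix, so any 2-dimensional vector $(\hat s - s) \equiv \delta s = (\delta s_1, \delta s_2)^T$ (where the superscript $T$ denotes transpose) can be expanded in their basis as $\delta s = v_1 V_1 + v_2 V_2$ with coefficients $v_1$ and $v_2$ respectively. Then $\sum_{i,j=1}^{2} (I_F)_{ij}\, \delta s_i\, \delta s_j = F_1 v_1^2 + F_2 v_2^2$ due to the invariance of this quantity to the bases used. Note that since $I_F$ is positive definite, $F_1 > 0$ and $F_2 > 0$. Taking wavelength discrimination as the example, with $\delta s = (\delta\lambda, \delta I)^T$, we consider the curve
$$F_1 v_1^2 + F_2 v_2^2 = 1,$$
and find the largest $\delta\lambda$ on this curve, and this largest $\delta\lambda$ should be the discrimination threshold. One can always find a parameter $\theta$ (see Fig. 3), such that the eigenvectors are
$$V_1 = (\cos\theta, \sin\theta)^T, \qquad V_2 = (-\sin\theta, \cos\theta)^T.$$
One notes that the dot products of $\delta s$ with the eigenvectors are
$$V_1^T \delta s = \delta\lambda \cos\theta + \delta I \sin\theta = v_1, \qquad V_2^T \delta s = -\delta\lambda \sin\theta + \delta I \cos\theta = v_2.$$
From these we can solve for $\delta\lambda$ in terms of $v_1$ and $v_2$ as
$$\delta\lambda = v_1 \cos\theta - v_2 \sin\theta.$$
The values of $v_1$ and $v_2$ on the curve that maximize $\delta\lambda$ are $v_1 = \cos\theta / (F_1\, \delta\lambda_{max})$ and $v_2 = -\sin\theta / (F_2\, \delta\lambda_{max})$, giving
$$\delta\lambda_{max} = \sqrt{\frac{\cos^2\theta}{F_1} + \frac{\sin^2\theta}{F_2}}.$$
Meanwhile, from $I_F = F_1 V_1 V_1^T + F_2 V_2 V_2^T$ we have, equating $(I_F)_{22}$ on the right hand side of the equation to that on the left hand side,
$$(I_F)_{22} = F_1 \sin^2\theta + F_2 \cos^2\theta.$$
Also noting that $F_1 F_2 = \det(I_F)$, the determinant of the $I_F$ matrix, we have
$$\delta\lambda_{max} = \sqrt{\frac{F_1 \sin^2\theta + F_2 \cos^2\theta}{F_1 F_2}} = \sqrt{\frac{(I_F)_{22}}{\det(I_F)}}.$$

Results
Figure 4 illustrates the full model's predicted threshold (in equation (28)) fitted to the data. It uses the optimal $I$, as in equation (12), such that the summed squared difference (as in equation (11)) between the predicted and observed thresholds is minimized. The fitting quality is much better than that by the simple model. In particular, with $\chi = 0.663$, the predicted threshold is within the standard deviation of experimental data for most data points. As in the data, the predicted threshold rises sharply as $\lambda$ approaches the ends of the spectrum.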
The wavelength-intensity confound and the divergence of threshold near the red and blue ends of the spectrum
The thresholds predicted by the full and simple models differ most towards the red and blue ends of the spectrum. This is because only one cone type can be substantially activated at the spectrum ends, making the system practically color blind, just like in scotopic vision when only the rods are active. For example, at the red end of the spectrum, when the M and S cones are almost silent, an increase in $\lambda$, i.e., $\delta\lambda > 0$, weakens the L cone response $r_L$, i.e., $\delta r_L < 0$. The simple model uses $\delta r_L$ for wavelength discrimination by attributing it to $\delta\lambda$ with the relationship $\delta \bar r_L = I f'_L(\lambda)\, \delta\lambda$. The full model, however, sees this $\delta r_L < 0$ as equally attributable to a reduced input $I$, i.e., $\delta I < 0$, with $\delta \bar r_L = f_L(\lambda)\, \delta I$, making it hard to distinguish whether the input gets redder or darker. This wavelength-intensity confound for the same $\delta r_L$ makes wavelength discrimination difficult. In the procedure of the Pokorny and Smith experiment [1], it means that an increase in $\lambda$ can be easily compensated by an increase in $I$, making the threshold large.
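The compensation described above can be made concrete with a single cone type. In this sketch the L cone tuning curve is a hypothetical Gaussian (the peak and width are assumptions), and `compensating_dI` solves $I f'_L(\lambda)\, \delta\lambda + f_L(\lambda)\, \delta I = 0$ for the first-order intensity change that cancels the effect of a wavelength change.

```python
import math

# ASSUMPTION: hypothetical Gaussian L-cone tuning curve, for illustration only.
PEAK_L, WIDTH = 565.0, 40.0   # nm

def f_L(lam):
    return math.exp(-0.5 * ((lam - PEAK_L) / WIDTH) ** 2)

def df_L(lam):
    """Analytic derivative of the Gaussian tuning curve."""
    return -(lam - PEAK_L) / WIDTH ** 2 * f_L(lam)

def compensating_dI(lam, I, dlam):
    """First-order intensity change cancelling the mean L response change caused by dlam."""
    return -I * df_L(lam) / f_L(lam) * dlam

lam, I, dlam = 650.0, 1000.0, 1.0          # red end of the spectrum, 1 nm increase
dI = compensating_dI(lam, I, dlam)
mean_before = I * f_L(lam)
mean_uncompensated = I * f_L(lam + dlam)
mean_compensated = (I + dI) * f_L(lam + dlam)
print(dI, mean_before, mean_uncompensated, mean_compensated)
```

When only this cone type responds, the compensated field produces nearly the same mean absorption as the standard field, so the two fields cannot be told apart from the cone signals.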
The wavelength-intensity confound is present generally even when all cone types are substantially activated. Let $\lambda_L$, $\lambda_M$, and $\lambda_S$, with $\lambda_L > \lambda_M > \lambda_S$, be the preferred wavelengths of the L, M, and S cones respectively. This confound is stronger when $\lambda > \lambda_L$ or $\lambda < \lambda_S$, where the predictions from the simple and full models differ most (see Fig. 4). In these wavelength regions, a change $\delta\lambda$ causes a response change $\delta r = [f'_L(\lambda), f'_M(\lambda), f'_S(\lambda)]\, I\, \delta\lambda$, which either simultaneously increases or simultaneously decreases the responses from all cone types, just like the response change $\delta r = [f_L(\lambda), f_M(\lambda), f_S(\lambda)]\, \delta I$ caused by an intensity change $\delta I$. Although a $\delta\lambda$ slightly changes the ratio $\bar r_L : \bar r_M : \bar r_S$ while a $\delta I$ does not, the difference between the $\delta r$ caused by $\delta I$ and the $\delta r$ caused by $\delta\lambda$ could be submerged under noise such that the two causes are perceptually indistinguishable.
This confound is weaker when $\lambda_S < \lambda < \lambda_L$, when a wavelength change $\delta\lambda$ will raise responses from some cone types while lowering responses from other cone types. In this case, a $\delta\lambda$ cannot be easily compensated for by a $\delta I$, which raises or lowers the responses from all cone types simultaneously. Hence, the simple and full models predict similar thresholds, particularly when $\lambda_M < \lambda \approx 560$ nm $< \lambda_L$ is in between the preferred wavelengths of the two most numerous cone types, L and M. For $\lambda \sim 520$ nm, the S cones are still insensitive, while both the L and M cones prefer larger $\lambda$, and the confound is again significant, causing a substantial difference in the predicted thresholds from the simple and the full models. This is because a $\delta\lambda$ increases or decreases the responses from the L and M cones simultaneously (while affecting the S cone response relatively little), and can be easily compensated for by a $\delta I$.
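One way to quantify the strength of the confound at each wavelength is the squared correlation $\rho^2 = (I_F)_{12}^2 / [(I_F)_{11} (I_F)_{22}]$ between the wavelength and intensity estimation errors, for which the ratio of the simple to the full threshold equals $\sqrt{1 - \rho^2}$. This index is introduced here only as an illustration and is not part of the original analysis; the tuning curves are again hypothetical Gaussians.

```python
import math

# ASSUMPTION: hypothetical weighted Gaussian tuning curves, for illustration only.
CONES = {"L": (0.60, 565.0), "M": (0.30, 535.0), "S": (0.02, 440.0)}  # (weight, peak in nm)
WIDTH = 40.0

def f(a, lam):
    weight, peak = CONES[a]
    return weight * math.exp(-0.5 * ((lam - peak) / WIDTH) ** 2)

def df(a, lam, h=0.01):
    """Central-difference estimate of f'_a(lam)."""
    return (f(a, lam + h) - f(a, lam - h)) / (2.0 * h)

def confound_index(lam):
    """rho^2 = I12^2 / (I11 * I22); the intensity I cancels out of this ratio."""
    fs = [f(a, lam) for a in CONES]
    dfs = [df(a, lam) for a in CONES]
    i11 = sum(d * d / v for d, v in zip(dfs, fs))
    i12 = sum(dfs)
    i22 = sum(fs)
    return i12 * i12 / (i11 * i22)

for lam in (520.0, 550.0, 650.0):
    print(lam, round(confound_index(lam), 3))
```

With these assumed curves the index is small between the M and L peaks, larger near 520 nm, and close to 1 beyond the L peak, mirroring the qualitative pattern described above.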

Implications of the wavelength-intensity confound on the experimental procedures and on the stability of the threshold measurements
The wavelength-intensity confound, especially when $\lambda \notin (\lambda_S, \lambda_L)$, means that there can be problems with some experimental methods used to measure wavelength discrimination threshold. In many such experiments (e.g., [19,20]), the procedure requires adjusting the intensity of the comparison field such that the brightnesses of the two fields match. The confound means that, when observers see a difference between the two fields, it is not easy to tell whether it is a brightness difference or a hue difference. This is a known difficulty noted in the accompanying discussions of Wright and Pitt's paper by fellow color vision scientists (pages 469-473 of [20]). Supposedly, the threshold is the smallest wavelength difference between the two fields when observers deem the two fields to differ in hue but not in brightness. However, whether the observers judge some perceptual difference to be a brightness or hue difference is likely to be dependent on the following factors: observers' internal criteria based on their expectations or biases, specific task instructions given by the experimenters, and perhaps even the visual environment around the experimental set up. These factors cannot be predicted straightforwardly from our optimal decoding theory, and could also cause variabilities between data from different observers and from different laboratories.

Figure 4. ... $I$ is chosen such that the quantity $\sum_\lambda [\sigma(\lambda) - \sigma_{data}(\lambda)]^2 / [\Delta\sigma_{data}(\lambda)]^2$ is minimized. The dashed curve shows the results from the simple model using this same input intensity $I$. The data points with error bars are the mean $\sigma_{data}$ and the standard deviation $\Delta\sigma_{data}$ of the discrimination thresholds of the four observers of Pokorny and Smith [1]. B: cone sensitivities plotted on a linear scale. doi:10.1371/journal.pone.0019248.g004
The procedure used by Pokorny and Smith [1] differs from the procedure above. They ask the observers to adjust the intensity of the comparison field until the two fields match in both hue and brightness, and the threshold is the smallest wavelength difference when this match is impossible by any intensity adjustment. This procedure does not require observers to decide whether any perceptual difference is due to brightness or hue, as they simply need to judge whether the two fields differ or not. This makes the threshold data more stable. Therefore, we do not intend to compare our theoretical prediction with data other than those by Pokorny and Smith [1].

The effect of the cone densities and pre-receptor light transmission on wavelength discrimination
It is clear from the analysis that the discrimination threshold depends on the relative sensitivities $f_a(\lambda)$ of the different cone types $a$. Since our $f_a(\lambda) = n_a O_a \hat{f}_a(\lambda)$ scales with the relative cone density $n_a$ and the relative pre-receptor transmission factor $O_a$ of each cone type $a$, both $n_a$ and $O_a$ should affect discrimination. Recall that the cone fundamentals $\hat{f}_a(\lambda)$ of all cone types $a$ have the same peak value $\max_\lambda \hat{f}_a(\lambda) = 1$, and that we impose the normalization $\max_\lambda \sum_a f_a(\lambda) = 1$. Writing $d_a = n_a O_a$ for the effective density of cone type $a$, the threshold $\sigma(\lambda)$ at any particular wavelength $\lambda$ scales roughly with $1/\sqrt{d_a}$ for the cone type $a$ that dominates at $\lambda$. For example, increasing the fraction $n_L$ of the L cones among all cones would relatively lower the discrimination threshold near the red end of the spectrum, and increasing light absorption by the pre-receptor ocular media in the short wavelength region would decrease $O_S$ and thus raise the threshold near the blue end of the spectrum.

Fig. 5A-C shows the predictions using the Smith and Pokorny cone fundamentals $\hat{f}_a(\lambda)$ [15] but with different settings for $d_L : d_M : d_S$. Fig. 5A is a replot of Fig. 4 with different scales on the axes. Its $d_L : d_M : d_S$ ratio is based on [13,14] (the Discussion derives $30 : 15 : 1$ from these sources). We note that its worst predictions are near wavelength $\lambda = 500$ nm, which is in the region where the S cone sensitivity $\hat{f}_S(\lambda)$ has a large slope $d\hat{f}_S/d\lambda$ and hence a high sensitivity to wavelength changes. Fig. 5B has $d_L : d_M : d_S = 1 : 1 : 1$, which can be seen as the situation in which all cone types have the same density and pre-receptor optical transmission. It raises the relative density of the S cones far above the physiological reality, and slightly raises the density of the M cones relative to the L cones. Consequently, it vastly over-estimates the discrimination sensitivities in the region $\lambda \sim 500$-$550$ nm, the domain of the S and M cone contributions. As a result, it gives $\chi = 1.289$, which is substantially worse than the $\chi = 0.663$ of Fig. 5A.

Fig. 5C has the ratio $d_L : d_M : d_S = 13 : 9 : 1$ that minimizes $\chi$, so that the predicted thresholds agree best with the experimental data. This ratio was obtained by exhaustively searching all integer values $1 \le d_L, d_M \le 120$ with $d_S = 1$ held fixed. With $\chi = 0.576$, almost all the data points are within one standard deviation of the predicted values. Compared with Fig. 5A, Fig. 5C raises the weight of the S cones (and, slightly, that of the M cones), but not as dramatically as Fig. 5B does. Hence, it corrects the worst predictions of Fig. 5A near $\lambda = 500$ nm without overshooting the correction. Fig. 5D-F show the best predicted thresholds, obtained as in Fig. 5C, for three other sets of cone fundamentals $\hat{f}_a(\lambda)$ from the literature: [21], [16], and [22] (see Andrew Stockman's webpage http://www.cvrl.org/). Compared with the predictions from the Smith and Pokorny cone fundamentals [15] in Fig. 5C, their best-fitting $d_L : d_M : d_S$ ratios are similar, and so are their goodness-of-fit values $\chi = 0.702$, $0.614$, and $0.628$, which are only slightly worse than the $\chi = 0.576$ of Fig. 5C. This finding is not surprising, as the cone fundamentals from the different sources are similar to each other. Meanwhile, it may not be a coincidence that the cone fundamentals of Smith and Pokorny [15] best fit the wavelength discrimination data obtained by the same authors. Different researchers probably have different research styles and experimental procedures, so data sets obtained in the same style are more likely to be consistent with each other.
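The exhaustive search over integer density ratios can be sketched as follows. This is only an illustrative sketch: the definition of `chi` (root-mean-square deviation in units of the data standard deviation) is an assumption, `toy_predict` is a hypothetical stand-in for the paper's threshold formula built from made-up Gaussian curves, and the refitting of the intensity $I$ for each candidate ratio is left out.

```python
import itertools
import numpy as np

def chi(predicted, data_mean, data_sd):
    """Goodness of fit: RMS deviation in units of the data SD (assumed definition)."""
    return np.sqrt(np.mean(((predicted - data_mean) / data_sd) ** 2))

def search_density_ratio(predict, data_mean, data_sd, d_max=120):
    """Try every integer pair 1 <= d_L, d_M <= d_max with d_S fixed at 1."""
    best_score, best_ratio = np.inf, None
    for d_L, d_M in itertools.product(range(1, d_max + 1), repeat=2):
        score = chi(predict(d_L, d_M, 1), data_mean, data_sd)
        if score < best_score:
            best_score, best_ratio = score, (d_L, d_M, 1)
    return best_score, best_ratio

# Hypothetical toy predictor: NOT the paper's model, only something to search over.
wavelengths = np.linspace(450.0, 650.0, 21)

def bump(peak, width=50.0):
    return np.exp(-((wavelengths - peak) ** 2) / (2.0 * width ** 2))

g_L, g_M, g_S = bump(570.0), bump(540.0), bump(440.0)

def toy_predict(d_L, d_M, d_S):
    # Threshold falls as 1/sqrt(weighted sensitivity), mimicking the 1/sqrt(d_a) scaling.
    return 1.0 / np.sqrt(d_L * g_L + d_M * g_M + d_S * g_S)

# Synthetic "data" generated from a known ratio, so the search has a known answer.
data_mean = toy_predict(13, 9, 1)
data_sd = 0.1 * data_mean
score, ratio = search_density_ratio(toy_predict, data_mean, data_sd, d_max=20)
print(ratio, score)
```

Fixing $d_S = 1$ removes the overall scale from the search, so only the two ratios $d_L/d_S$ and $d_M/d_S$ are explored; with the full range up to 120 this amounts to 14,400 evaluations of the predictor.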

The importance of the S cone minority
Experimental data on wavelength discrimination for $\lambda < 440$ nm are scarce and highly variable. This may be caused by many factors, including the large inter-subject variability (e.g., in cone densities and in the optical density of the ocular media) in that wavelength region, the difficulty of delivering stimuli in the short wavelength region, where light absorption by the ocular media is strong [14], and, as discussed above, the wavelength-intensity confound, which makes some experimental procedures problematic in that region. However, Bedford and Wyszecki [19] reported that, as the threshold rises with decreasing $\lambda$ below 500 nm, it dips again around 410-430 nm before rising sharply. Wright and Pitt reported in 1934 [20] a much shallower dip at a slightly longer wavelength, $\lambda \approx 445$ nm. As we argued, a perceptual confound between wavelength and intensity for $\lambda < \lambda_S \approx 440$ nm, where $\lambda_S$ is the wavelength most preferred by the S cones, should make the threshold rise continuously with decreasing $\lambda$ as all three cone types become less and less sensitive. So it may seem puzzling how this dip could arise, given that our full model shows a continuous rise of the threshold as $\lambda$ decreases. Bedford and Wyszecki [19] acknowledged and discussed that the presence of this dip was experimentally controversial. In fact, a dip in the very long wavelength region was also seen in earlier studies, was then invalidated by later studies [20], and is no longer seen in modern data [19,1].
We suggest that the extra dip near $\lambda \approx 440$ nm may be a side effect of an extra peak in threshold at $\lambda \approx 460$ nm, caused by too few S (blue) cones being involved in some experiments. We note that Bedford and Wyszecki [19] used bipartite input fields that were $1^\circ$ or smaller, which is smaller than the $3^\circ$ input field used by Pokorny and Smith [1]. As the density of S cones drops drastically to zero within $1^\circ$ of the center of the fovea [14], fewer S cones are involved when the centrally viewed color matching fields are smaller than $1^\circ$. (Note that observers in Bedford and Wyszecki's experiment [19] used free viewing for their task. We consider such free viewing in this attention-demanding task to be central viewing, since gaze follows attention mandatorily in free viewing [23].) If there are no S cones, wavelength discrimination relies on the L and M cones only. A close examination of the L and M cone spectral sensitivities reveals that, in a small region around $\lambda \approx 460$ nm, $f_L(\lambda) = c f_M(\lambda)$ with a scale factor $c$ that is almost constant within that region. This means that, as $\lambda$ changes in that region, the responses of the L and M cones co-vary almost completely (except for noise), so that they act together as if they were a single cone type rather than two. This makes the L+M dichromatic system almost color blind in that local wavelength region, and consequently the discrimination threshold shoots up. This covariance of the two cone types has a signature quantity that is approximately zero in that region, from which a Degree of Co-variance between the two cone types can be defined. Mathematically, the $2 \times 2$ Fisher information matrix $I_F$ drops to rank 1 when the $f_a(\lambda)$ of the two cone types scale with each other, so that the two-dimensional wavelength-intensity input space collapses into one dimension, with the two redundant cone types acting as one. Fig. 6 illustrates how this Degree of Co-variance between the L and M cones shoots up near $\lambda \approx 460$ nm, giving a peak in threshold around that wavelength when there are too few S cones.
The exact location of the peak depends slightly on the cone fundamentals $\hat{f}_a(\lambda)$ used, whether those of [15] or others, but the difference is small. This rise in threshold around $\lambda \approx 460$ nm can be prevented by having sufficiently many S cones, which removes the collapse of dimensionality. The dramatically worse discriminability at $\lambda \sim 460$ nm with smaller color matching fields or in tritanopic dichromats (who lack S cones) has been observed in previous studies [24,25].
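The rank collapse described above can be checked numerically. The sketch below assumes Poisson-distributed cone absorptions with mean $I f_a(\lambda)$ (an assumption made here for illustration, not a restatement of the paper's equations), and the sensitivity values and slopes are made-up numbers rather than real cone fundamentals. The function names are hypothetical.

```python
import numpy as np

def fisher_wavelength_intensity(f, df, intensity=1.0):
    """2x2 Fisher matrix over (wavelength, intensity) for Poisson absorptions I*f_a.

    f  : sensitivities f_a at the wavelength of interest
    df : their slopes df_a/dlambda at that wavelength
    """
    f = np.asarray(f, dtype=float)
    df = np.asarray(df, dtype=float)
    F_ll = intensity * np.sum(df ** 2 / f)   # wavelength-wavelength term
    F_lI = np.sum(df)                        # wavelength-intensity cross term
    F_II = np.sum(f) / intensity             # intensity-intensity term
    return np.array([[F_ll, F_lI], [F_lI, F_II]])

def determinant(F):
    return F[0, 0] * F[1, 1] - F[0, 1] ** 2

# Made-up local values: L is an exact scaled copy of M, as near 460 nm in the text.
c = 1.5
f_M, df_M = 0.2, 0.01
f_L, df_L = c * f_M, c * df_M

# L and M only: the two cone types are redundant, so the matrix is singular.
F_LM = fisher_wavelength_intensity([f_L, f_M], [df_L, df_M])

# Adding a third cone type whose sensitivity does not scale with the others
# (here with an opposite slope) restores the second dimension.
f_S, df_S = 0.05, -0.004
F_LMS = fisher_wavelength_intensity([f_L, f_M, f_S], [df_L, df_M, df_S])

print("det with L and M only:", determinant(F_LM))
print("det with L, M and S  :", determinant(F_LMS))
```

With only the two proportional cone types the determinant vanishes up to rounding error, so wavelength and intensity changes cannot be told apart; the third cone type makes the determinant clearly positive again.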

Discussion
Our maximum likelihood decoding model can explain the experimental data reasonably well. This is based on adjusting a single free parameter, I, which characterizes the net effect of the radiance of the input light, the effective integration time (within the observer's visual system), and the total area of the input field, etc. Although we did not compute overall quantum efficiencies (i.e., the ratio between the number of photons needed by the ideal and human observers for the same task; [7]), they are undoubtedly quite low (typically they are less than 0.1, [3,4,5]). Nonetheless, the good agreement between the model and data shows that, for wavelength discrimination, the efficiency of human color processing mechanisms is nearly constant over the spectrum (i.e., information is extracted with equal efficiency at all wavelengths).
The best fit between the data [1,15] and the theoretical prediction suggests that the ratio between the effective densities of the different cone types is $d_L : d_M : d_S = 13 : 9 : 1$. Here, the effective density $d_a = n_a O_a$ of each cone type $a$ is the actual cone density $n_a$ diluted by the pre-receptor optical transmission factor $O_a < 1$. Meanwhile, evidence suggests that on average $n_L : n_M : n_S = 6 : 3 : 1$ and $O_L : O_M : O_S = 1 : 1 : 0.2$ [14,13], giving $d_L : d_M : d_S = 30 : 15 : 1$. Since variability in human optical density can give up to a factor of 10 difference in $O_a$, and a difference in human $n_L/n_M$ by a factor of 3 seems not unusual [14], our best-fitting $d_L : d_M : d_S = 13 : 9 : 1$ can be seen as within the range of variability of the human quantities.
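The arithmetic behind the $30 : 15 : 1$ figure, and its comparison with the fitted ratio, can be written out explicitly. This only restates the numbers quoted above; the two summary ratios in the last two lines are our own illustrative choice of comparison.

```latex
\begin{align*}
d_L : d_M : d_S &= n_L O_L : n_M O_M : n_S O_S \\
  &= 6 \times 1 : 3 \times 1 : 1 \times 0.2 \;=\; 6 : 3 : 0.2 \;=\; 30 : 15 : 1, \\
\text{physiological estimate } (30 : 15 : 1):\quad & d_L/d_M = 2, \qquad (d_L + d_M)/d_S = 45, \\
\text{best fit } (13 : 9 : 1):\quad & d_L/d_M \approx 1.44, \qquad (d_L + d_M)/d_S = 22.
\end{align*}
```

The two settings thus differ by factors of roughly 1.4 and 2 in these summary ratios, which is smaller than the factor-of-3 and factor-of-10 variability ranges cited above.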
We analyzed the probable causes of the differences in results across color matching experiments, and how the results could sensitively depend on the experimental procedures and stimulus parameters. It is expected and straightforward to conclude that discrimination threshold should be smaller when color matching is done without adjusting the matching field intensity. Furthermore, we identify that different sizes of the centrally viewed matching fields may cause different findings regarding whether or not there is a dip in discrimination threshold below 450 nm, or a peak around 460 nm. This peak and the resulting dip in particular may arise from small, foveally viewed, fields such that fewer blue cones are excited by the inputs. We also point out that the brightness-hue confound can make some experimental procedures give more accurate and stable results than others. In particular, the procedure used by Pokorny and Smith [1], in which subjects only need to judge whether the two fields differ, is better than other matching procedures in which subjects need to match the brightness of the two fields before judging whether they differ in hue.
The factors responsible for the low overall quantum efficiency of wavelength and other simple discriminations are unknown, but presumably they include photoreceptor inefficiencies, limits in the spatial and temporal integration (by the post-receptor neural mechanisms) of the photoreceptor responses, and neural noise. Any of these factors would tend to reduce overall quantum efficiency while preserving constant relative efficiency [4,6].
Figure 6. Illustration of how reducing the density of S cones should create a threshold peak near $\lambda \approx 460$ nm. Because the spectral sensitivities of the L and M cones co-vary as $\lambda$ varies near $\lambda \approx 460$ nm, they act as if they were a single cone type around that $\lambda$. As the threshold eventually increases when $\lambda$ approaches 400 nm, this local threshold peak at 460 nm creates a threshold dip between 400 and 460 nm. doi:10.1371/journal.pone.0019248.g006

Our method can easily be applied to predict wavelength discrimination by dichromats. Fig. 7 shows that, compared with trichromats, protanopes and deuteranopes should have much larger thresholds in the long wavelength region, and tritanopes should have much larger thresholds in the short wavelength region. These predictions suggest that, in trichromats, wavelength discrimination is mediated by the protanopic/deuteranopic system at short wavelengths and by the tritanopic system at long wavelengths. These theoretical predictions are in line with known observations [25]. They are intuitively expected, since color discrimination in the long wavelength region requires the combined activation of both L and M cones, while the S cones are essential for short wavelength discrimination because the L and M cones are both only weakly active and co-vary considerably in that wavelength region. These qualitative predictions are insensitive to the actual cone densities used in our formula. The results are obtained by assuming that the number of M cones in a protanope, or of L cones in a deuteranope, equals the total number of L and M cones in a trichromat (as suggested by data from [26]), and that the missing S cones in tritanopes are replaced proportionally by additional L and M cones, so that the total number of cones is conserved. The predictions then follow from equation (28), with all summations over three cone types replaced by the corresponding summations over two cone types.
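The two replacement rules for the cone densities can be made concrete with a short sketch. The function name and interface are hypothetical; the trichromat densities $[6, 3, 1]$ are the values listed in the Fig. 7 caption, and the rules are the two assumptions stated above.

```python
def dichromat_densities(trichromat, missing):
    """Relative (L, M, S) cone densities for a dichromat lacking one cone type.

    trichromat : (n_L, n_M, n_S) of the reference trichromat
    missing    : "L" (protanope), "M" (deuteranope) or "S" (tritanope)
    """
    n = dict(zip("LMS", trichromat))
    if missing in ("L", "M"):
        # The remaining long/middle cone type takes over the whole L+M population.
        remaining = "M" if missing == "L" else "L"
        n[remaining] = n["L"] + n["M"]
        n[missing] = 0.0
    elif missing == "S":
        # Missing S cones are replaced proportionally by L and M cones,
        # so that the total number of cones is conserved.
        total = n["L"] + n["M"] + n["S"]
        scale = total / (n["L"] + n["M"])
        n["L"], n["M"], n["S"] = n["L"] * scale, n["M"] * scale, 0.0
    else:
        raise ValueError("missing must be 'L', 'M' or 'S'")
    return (n["L"], n["M"], n["S"])

trichromat = (6, 3, 1)
for label, missing in [("protanope", "L"), ("deuteranope", "M"), ("tritanope", "S")]:
    print(label, [round(x, 1) for x in dichromat_densities(trichromat, missing)])
```

Rounded to one decimal place, the three outputs match the $[n_1, n_2, n_3]$ values given for the dichromats in the Fig. 7 caption.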
One caveat is that the large predicted thresholds, especially for the dichromats, should be taken as only qualitatively rather than quantitatively trustworthy. This is because our Fisher information formulation of discriminability is based on discriminating two stimuli that are very close to each other, such that a Taylor expansion of the log likelihood ratio is a suitable approximation. This approximation breaks down when the two stimuli are very different from each other, i.e., when the discrimination threshold is too large. This issue has been raised by a previous work on tritanopia [27].
Figure 7. Theoretical predictions of wavelength discrimination by dichromats compared with that by trichromats. All curves are obtained with the input intensity fixed at $I = 1$, using $f_a(\lambda) = n_a d_a \hat{f}_a(\lambda)$, in which $\hat{f}_a(\lambda)$ is normalized by $\max_\lambda \hat{f}_a(\lambda) = 1$, while the $f_a(\lambda)$ are no longer normalized by $\max_\lambda \sum_a f_a(\lambda) = 1$. The values $[n_1, n_2, n_3]$ are $[0, 9, 1]$, $[9, 0, 1]$, $[6.7, 3.3, 0]$, and $[6, 3, 1]$ for protanopes, deuteranopes, tritanopes, and trichromats, respectively. doi:10.1371/journal.pone.0019248.g007

Our formulation of an ideal observer analysis of sensory feature discrimination under a perceptual confound is general, and can be used for sensory discriminations beyond the example case of this paper. More specifically, let a sensory world contain two feature dimensions, whose feature values are denoted by $s_1$ and $s_2$ respectively, so that $s = (s_1, s_2)$. Suppose we have an experiment to find the minimum difference in feature $s_1$ needed to distinguish a comparison input from a standard input, regardless of the value of feature $s_2$ in the comparison input, analogous to the method of Pokorny and Smith [1]. Let $r$ be the population neural response to the sensory input $s$, with probability $P(r|s)$. One can derive the Fisher information matrix $I_F$, with elements $(I_F)_{ij}$, as in equation (17); then equation (20) gives the discrimination threshold in $s_1$ while feature $s_2$ may present a perceptual confound.
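As a rough numerical illustration of this general recipe, the sketch below assumes independent Poisson-distributed responses with mean tuning $\mu_a(s_1, s_2)$ (our assumption; the noise model behind equation (17) is not restated here) and uses the standard Cramér-Rao expressions for a parameter estimated with and without a nuisance dimension, which may differ in detail from equation (20). The function names, the Gaussian tuning curves, and the finite-difference step are hypothetical choices, and thresholds are given only up to a criterion-dependent constant.

```python
import numpy as np

def fisher_matrix_numeric(mean_response, s, eps=1e-5):
    """2x2 Fisher matrix for independent Poisson responses with means mean_response(s)."""
    s = np.asarray(s, dtype=float)
    mu = mean_response(s)
    grads = []
    for i in range(2):
        step = np.zeros(2)
        step[i] = eps
        # Central finite difference for d mu_a / d s_i.
        grads.append((mean_response(s + step) - mean_response(s - step)) / (2 * eps))
    grads = np.array(grads)                    # shape (2, number of channels)
    # Poisson noise: (I_F)_ij = sum_a (d mu_a/d s_i)(d mu_a/d s_j) / mu_a
    return (grads / mu) @ grads.T

def thresholds_in_s1(F):
    """Threshold in s1 (up to a constant) when s2 is known versus unknown."""
    known = 1.0 / np.sqrt(F[0, 0])
    unknown = np.sqrt(np.linalg.inv(F)[0, 0])
    return known, unknown

# Hypothetical example: s1 = wavelength (nm), s2 = intensity, three Gaussian channels.
peaks = np.array([440.0, 540.0, 570.0])
width = 50.0

def toy_mean_response(s):
    wavelength, intensity = s
    return intensity * np.exp(-((wavelength - peaks) ** 2) / (2.0 * width ** 2))

F = fisher_matrix_numeric(toy_mean_response, (600.0, 1000.0))
known, unknown = thresholds_in_s1(F)
print("threshold, s2 known  :", known)
print("threshold, s2 unknown:", unknown)
```

The threshold with $s_2$ unknown can never be smaller than the one with $s_2$ known, because $[(I_F)^{-1}]_{11} \ge 1/(I_F)_{11}$ for any positive definite $I_F$. This mirrors the point made for wavelength and intensity: a model that assumes full knowledge of the confounding feature over-estimates discrimination ability.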