Optimizing Population Variability to Maximize Benefit

Variability is inherent in any population, whether the population comprises humans, plants, biological cells, or manufactured parts. Is the variability beneficial, detrimental, or inconsequential? This question is of fundamental importance in manufacturing, agriculture, and bioengineering, and it has no simple categorical answer because research shows that variability in a population can have both beneficial and detrimental effects. Here we ask whether there is a level of variability that maximizes benefit to the population as a whole. We answer this question using a model composed of a population of individuals who independently make binary decisions; individuals vary in making a yes-or-no decision, and the aggregated effect of these decisions on the population is quantified by a benefit function (e.g., the accuracy of a measurement made with binary rulers, or the aggregate income of a town of farmers). We show that an optimal variance exists for maximizing the population benefit function; this optimal variance quantifies what is often called the "right mix" of individuals in a population.


4. Compute the next $f_{i+1}$ using (S4).

Remarks: Monotonicity of $f(L)$ is needed for the inverse function $f^{-1}(x)$ to exist. $Q(L, s)$ is monotonic in $L$ because $Q$ is the integral of a pdf, which is everywhere non-negative.

General lemmas
Lemma 1 (Dominance). If $f_i < f(L)$ then $f_{i+1} > f_i$.
Proof: The proof hinges on the monotonicity of $f(L)$ and $Q(L, s)$. The monotonicity of $f(L)$ tells us that if $f_i = f(\lambda_i) < f(L)$ then $\lambda_i = f^{-1}(f_i) < L$.
From (S3) we get $f_i = Q(\lambda_i, s_i)$ and from (S4) we get $f_{i+1} = Q(L, s_i)$. Suppose to the contrary that $f_{i+1} \le f_i$; then $f_i = Q(\lambda_i, s_i) \ge Q(L, s_i) = f_{i+1}$. Because $Q$ is monotonic in $L$, it follows that $\lambda_i \ge L$. However, because of the monotonicity of $f(L)$, it follows [statement (S5)] that $f(\lambda_i) = f_i \ge f(L)$, which contradicts our assumption. Therefore $f_{i+1} > f_i$, as claimed. $\square$ The proof of the companion statement (if $f_i > f(L)$ then $f_{i+1} < f_i$) is similar.
Lemma 2 (Uniqueness). $f(L)$ is the unique accumulation point.

Proof: Suppose there is a $\bar f(L)$ which is a limit of the sequence generated by the Sloppy Algorithm, and assume without loss of generality that $\bar f(L) < f(L)$. From (S4) we have $\bar f(L) = Q(L, s)$, and from (S3) we have $\bar f(L) = Q(f^{-1}(\bar f(L)), s)$. Equating these two expressions gives

$$Q(L, s) = Q\big(f^{-1}(\bar f(L)), s\big).$$

But if $\bar f(L) < f(L)$, then by statement (S5) it follows that $f^{-1}(\bar f(L)) = \bar L < f^{-1}(f(L)) = L$.
Because $Q$ is monotonic in $L$, $Q(L, s) > Q\big(f^{-1}(\bar f(L)), s\big)$, contradicting the equality. Therefore $\bar f(L) \ge f(L)$.
We can argue similarly that $\bar f(L) \le f(L)$; therefore $\bar f(L) = f(L)$, proving uniqueness. $\square$

Remark: We proved Lemmas 1 and 2 without specifying $f$ or $\varphi$, so they hold generally.
Convergence of the Sloppy Algorithm when $\varphi = N$ and $f(L) = L$

The Sloppy Algorithm will not, in general, converge monotonically for an arbitrary pair $\big(f(L), \varphi\big)$. In the case where $f(L) = L$ and $\varphi$ is the normal distribution, the Sloppy Algorithm converges monotonically everywhere. The proof given below can be adapted to study convergence of the Sloppy Algorithm for any $\big(f(L), \varphi\big)$ pair.
Theorem 1 (Convergence Theorem). Let $f(L) = L$ on $L \in [0, 1]$. Let $\varphi$ be the normal distribution $N(L; 0, s)$ with mean zero and standard deviation $s$. $Q(L, s)$ is defined by (S2), where $\Gamma$ is the range from $1/2 - L$ to $\infty$. Then $\{f_i\}$, defined by recursion rules (S3) and (S4), converges monotonically to $f(L)$ everywhere.

Note that in this case $f_i = \lambda_i$.

Proof:
The key to the proof, and to understanding whether the sequence $\{f_i\}$ converges monotonically, is the shape of the $s^*(L)$ curve, where $s^*(L)$ solves the fixed-point problem $f(L) = Q(L, s^*(L))$.
For the pair $\big(f(L), \varphi\big) = \big(L,\, N(L; 0, s)\big)$,

$$Q(L, s) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{\sqrt{2}\,(2L-1)}{4s}\right)\right],$$

and $s^*(L)$ is found by solving (S6) with $f(L) = L$:

$$s^*(L) = \frac{\sqrt{2}\,(2L-1)}{4\,\operatorname{erf}^{-1}(2L-1)},$$

which is shown in Figure A. Note that as $L \to 1/2$, $\operatorname{erf}^{-1}(2L-1) \to \frac{\sqrt{\pi}}{2}(2L-1)$, so $s^*(1/2) = 1/\sqrt{2\pi} \approx 0.4$. We now have the pieces needed to complete the proof.

Case 1A: Suppose $\lambda_i$ and $L$ lie on the same side of $1/2$, say $\lambda_i < L < 1/2$ (Figure A). From Lemma 1 we know that $\lambda_{i+1} > \lambda_i$, but we do not yet know whether $\lambda_{i+1}$ is greater than or less than $L$. Define $z$ as

$$z(s) = \frac{\sqrt{2}\,(2L-1)}{4s}. \qquad (S10)$$

Recall that $\lambda_{i+1} = Q(L, s_i) = Q(z_i)$ (step (S4) in the Sloppy Algorithm). Because $s_i < s^*(L)$ (see Figure A) and because $L < 1/2$ (which makes the numerator $< 0$), it follows that $z_i < z^*(L)$. Because $Q$ is strictly monotonically increasing (SMI) in $z$, it follows that $\lambda_{i+1} = Q(z_i) < Q(z^*(L)) = L$. Thus $\{\lambda_i\}$ is an SMI sequence bounded above by $L$, so by the monotone convergence theorem (MCT) the sequence converges to some $\bar\lambda$. However, Lemma 2 tells us that $\bar\lambda = L$.
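As a numerical sanity check on the closed forms above, the sketch below (Python, standard library only; `erfinv` is implemented by bisection because `math` has no inverse error function, and all names are illustrative) verifies that $s^*(L)$ solves the fixed point $Q(L, s^*(L)) = L$ and that $s^*(1/2) = 1/\sqrt{2\pi}$:

```python
import math

def Q(L, s):
    # Q(L, s) = ∫_{1/2-L}^∞ N(x; 0, s) dx = ½[1 + erf(√2(2L-1)/(4s))]
    return 0.5 * (1.0 + math.erf(math.sqrt(2.0) * (2.0 * L - 1.0) / (4.0 * s)))

def erfinv(y):
    # erf is strictly increasing, so invert it by bisection on [-6, 6]
    lo, hi = -6.0, 6.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def s_star(L):
    # s*(L) = √2(2L-1) / (4 erf⁻¹(2L-1)), with the limit 1/√(2π) at L = 1/2
    z = 2.0 * L - 1.0
    if abs(z) < 1e-9:
        return 1.0 / math.sqrt(2.0 * math.pi)
    return math.sqrt(2.0) * z / (4.0 * erfinv(z))

# the fixed point Q(L, s*(L)) = L holds across the unit interval
for L in [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]:
    assert abs(Q(L, s_star(L)) - L) < 1e-9
# the limiting value at L = 1/2
assert abs(s_star(0.5) - 1.0 / math.sqrt(2.0 * math.pi)) < 1e-6
```

The same helper functions are reused in the simulations below.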
Case 1A′: The cases where $L < \lambda_i < 1/2$, $1/2 < L < \lambda_i$, and $1/2 < \lambda_i < L$ can be handled in the same way to show that $\{\lambda_i\}$ is SM increasing (decreasing) and bounded above (below) by $L$.
It remains to consider $\lambda_i$ and $L$ on opposite sides of $1/2$, say $\lambda_i < 1/2 < L$; suppose that $\lambda_k \le 1/2$ for all $k > i$. From Lemma 1 we know that $\lambda_{i+1} > \lambda_i$. Therefore $\{\lambda_k\}$ is an SMI sequence bounded by $1/2$, so by the MCT $\lambda_k \to \bar\lambda$. However, Lemma 2 demands that $\bar\lambda = L$; therefore, contrary to our assumption, there must be some $k$ where $\lambda_k > 1/2$. Beyond this $k$, the situation is identical to Case 1A or 1A′.
Because $i$ was arbitrary, it follows that any sequence $\{\lambda_i\}$ generated by the Sloppy Algorithm converges monotonically to $L$. $\square$

Figure B shows the monotonic convergence of the Sloppy Algorithm. Each colored path represents a different $L$. Note that for $L > 1/2$, $\lambda_{i+1} > \lambda_i$, while the opposite is true for $L < 1/2$.
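The convergence behavior shown in Figure B can be reproduced with a short simulation, sketched below under the theorem's assumptions ($f(L) = L$ and $\varphi = N(0, s)$, so $\lambda_i = f_i$ and step (S3) reduces to $s_i = s^*(\lambda_i)$); helper names are illustrative:

```python
import math

def Q(L, s):
    # ½[1 + erf(√2(2L-1)/(4s))], the closed form of (S2) for φ = N(0, s)
    return 0.5 * (1.0 + math.erf(math.sqrt(2.0) * (2.0 * L - 1.0) / (4.0 * s)))

def erfinv(y):
    lo, hi = -6.0, 6.0              # bisection: erf is strictly increasing
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def s_star(lam):
    # step (S3) with f(L) = L: choose s_i so that Q(λ_i, s_i) = λ_i
    z = 2.0 * lam - 1.0
    if abs(z) < 1e-9:
        return 1.0 / math.sqrt(2.0 * math.pi)
    return math.sqrt(2.0) * z / (4.0 * erfinv(z))

def sloppy_path(lam0, L, iters=50):
    path = [lam0]
    for _ in range(iters):
        path.append(Q(L, s_star(path[-1])))   # step (S4): λ_{i+1} = Q(L, s_i)
    return path

path_up = sloppy_path(0.05, 0.3)    # λ_0 < L < ½: SMI, bounded above by L (Case 1A)
assert abs(path_up[-1] - 0.3) < 1e-6
assert path_up[0] < path_up[1] < path_up[2] < 0.3

path_down = sloppy_path(0.95, 0.7)  # λ_0 > L > ½: SM decreasing, bounded below by L
assert abs(path_down[-1] - 0.7) < 1e-6
assert path_down[0] > path_down[1] > path_down[2] > 0.7
```

Both paths settle on $L$, increasing from below for $\lambda_0 < L$ and decreasing from above for $\lambda_0 > L$, matching the colored paths of Figure B.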
The Sloppy Algorithm when $\varphi = LN$ and $f(L)$ is sigmoidal

Substituting $\varphi = LN$ into (S2) and integrating between $0$ and $\ell$ gives

$$Q(\ell, m, s) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{\ln\ell - m}{\sqrt{2}\,s}\right)\right]. \qquad (S12)$$

$Q(\ell, m, s) = f(\ell)$ is satisfied when $s(\ell)$ is given by

$$s(\ell) = \frac{\ln\ell - m}{\sqrt{2}\,\operatorname{erf}^{-1}\!\big(2f(\ell) - 1\big)}. \qquad (S13)$$

Then $m$ must be

$$m = \ln\bar\ell. \qquad (S14)$$

The plot of $s(\ell)$ is shown in Figure C.
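The lognormal relations above can be checked numerically. The sketch below uses an illustrative logistic sigmoid for $f$ (this particular $f$ is an assumption for the check only, not the function used in the main text) and verifies that with $m = \ln\bar\ell$ the lognormal CDF equals $1/2$ at $\bar\ell$, and that $s(\ell)$ obtained by inverting $Q(\ell, m, s) = f(\ell)$ is positive and consistent:

```python
import math

def Q_LN(ell, m, s):
    # CDF of the lognormal: ∫_0^ℓ LN(x; m, s) dx = ½[1 + erf((ln ℓ - m)/(√2 s))]
    return 0.5 * (1.0 + math.erf((math.log(ell) - m) / (math.sqrt(2.0) * s)))

def erfinv(y):
    lo, hi = -6.0, 6.0              # bisection: erf is strictly increasing
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

ell_bar = 0.5
m = math.log(ell_bar)               # m = ln ℓ̄

def f(ell):
    # illustrative sigmoid with f(ℓ̄) = ½ -- an assumption for this check only
    return 1.0 / (1.0 + math.exp(-8.0 * (ell - ell_bar)))

def s_of_ell(ell):
    # solve Q(ℓ, m, s) = f(ℓ) for s
    return (math.log(ell) - m) / (math.sqrt(2.0) * erfinv(2.0 * f(ell) - 1.0))

for ell in [0.2, 0.35, 0.65, 0.8]:
    assert s_of_ell(ell) > 0.0                               # s(ℓ) is a valid spread
    assert abs(Q_LN(ell, m, s_of_ell(ell)) - f(ell)) < 1e-9  # Q matches f at ℓ
assert abs(Q_LN(ell_bar, m, 0.3) - 0.5) < 1e-12  # for any s, the CDF is ½ at ℓ = ℓ̄
```

Note that the sign works out on both sides of $\bar\ell$: for $\ell < \bar\ell$ both $\ln\ell - m$ and $\operatorname{erf}^{-1}(2f(\ell)-1)$ are negative, so $s(\ell) > 0$.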
Case 2A: $\lambda_i < L < \bar\ell$. From Lemma 1, $f_i < f_{i+1}$, but we do not know whether $f_{i+1}$ is greater than or less than $f(L)$. By definition $f_{i+1} = Q(L, s_i)$ and $Q(L, s^*) = f(L)$. From Figure C we see that $s_i > s^*(L)$. Let

$$z(s) = \frac{\ln L - m}{\sqrt{2}\,s}.$$

Because $L < \bar\ell$, the numerator is $< 0$, so $z_i > z^*$. Therefore $f_{i+1} = Q(z_i) > Q(z^*) = f(L)$, giving $f_i < f(L) < f_{i+1}$, and we do not get monotonic convergence.
Case 2B: $\bar\ell < L < \lambda_i$. By Lemma 5, $f_{i+1} < f_i$, but we do not know whether $f_{i+1}$ is greater than or less than $f(L)$. From Figure C we see that $s^* < s_i$. Therefore $z^* > z_i$ (now the numerator is $> 0$), and therefore $f_{i+1} = Q(L, s_i) < Q(L, s^*) = f(L)$, giving $f_{i+1} < f(L) < f_i$; again the convergence is not monotonic.

Remarks: Using the same kinds of arguments, we can show that for the cubic function $f(L) = L - (\gamma/2)\,L\,(L - 1/2)\,(L - 1)$ (for $\gamma \in [0, 4]$) and $\varphi = N(0, s)$, the Sloppy Algorithm converges monotonically everywhere. However, when $\varphi = LN(\ln(1/2), s)$, the Sloppy Algorithm converges monotonically only for $L > 1/2$; convergence is oscillatory for $L < 1/2$.

$s^*(L)$ solves the fixed-point problem $Q(L, s^*) = L$. Writing $z = 2L - 1$, the fixed-point problem becomes

$$\operatorname{erf}\!\left(\frac{\sqrt{2}\,z}{4 s^*}\right) = z.$$
Because $L \in [0, 1]$, $z \in [-1, 1]$. We use the approximation $\operatorname{erf}(x) \approx x$ for $x \in [-1, 1]$; with it, the fixed-point equation reduces to $\sqrt{2}\,z/(4 s_m) = z$, which is solved by the single value $s_m = \sqrt{2}/4 \approx 0.35$, independent of $L$. Because this approximation holds over all $L \in [0, 1]$, it follows that $s_m$ almost solves the fixed-point problem for all $L$; that is, $Q(L, s_m) \approx L$.
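A quick check of the magic-number argument (a sketch; under the $\operatorname{erf}(x) \approx x$ linearization the magic number is $s_m = \sqrt{2}/4$): the fixed-point residual $|Q(L, s_m) - L| = \tfrac12|\operatorname{erf}(z) - z|$ stays small over all of $[0, 1]$, with the worst case $\tfrac12|\operatorname{erf}(1) - 1| \approx 0.079$ at the endpoints.

```python
import math

s_m = math.sqrt(2.0) / 4.0  # erf(x) ≈ x turns erf(√2 z/(4 s_m)) = z into √2/(4 s_m) = 1

def Q(L, s):
    # ½[1 + erf(√2(2L-1)/(4s))], the closed form of (S2) for φ = N(0, s)
    return 0.5 * (1.0 + math.erf(math.sqrt(2.0) * (2.0 * L - 1.0) / (4.0 * s)))

# Q(L, s_m) ≈ L pointwise over the whole unit interval
residuals = [abs(Q(k / 100.0, s_m) - k / 100.0) for k in range(101)]
worst = max(residuals)
assert worst < 0.08                      # ½|erf(1) - 1| ≈ 0.0786 at L = 0 and L = 1
assert abs(Q(0.5, s_m) - 0.5) < 1e-12    # exact at L = 1/2
```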

Condition for having a useful magic number
When the benefit $B(L, \nu)$ is defined as in eqns. (9) and (11) of the main text, $B(L, \nu)$ is maximized when $\nu = f(L)/L$. $s^*(L)$ solves the fixed-point problem $Q(L, s^*(L)) = f(L)$, so $B$ is always maximized when $s^*(L)$ is used. For arbitrary $s$, $Q(L, s) \ne f(L)$, so $\nu(s) = Q(L, s)$ will not maximize $B$. We would like to replace the continuum $s^*(L)$ with a single magic number $s_m$ that almost maximizes $B(L, \nu)$ for all $L$.
Clearly, the closer $Q(L, s_m)$ approximates $f(L)$ for all $L \in [0, 1]$, the closer $B(L, \nu)$ will be to its maximum value. In other words, $Q(L, s_m)$ should "look like" $f(L)$ in the sense that $Q(L, s_m)$ is close to $f(L)$ everywhere. The natural metric for this is the maximum norm,

$$d\big(Q(L, s_m), f(L)\big) = \max_{L \in [0,1]} \big|Q(L, s_m) - f(L)\big|. \qquad (S17)$$

If $d\big(Q(L, s_m), f(L)\big)$ is small, then $B(L, \nu(s_m)) \approx B(L, \nu(s^*))$, meaning the benefit is nearly maximized.
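As an illustration of the maximum-norm criterion (a sketch for the case $f(L) = L$ and $\varphi = N$, evaluated on a grid), the distance is small for the magic number $s_m = \sqrt{2}/4$ and much larger for an arbitrarily chosen $s$:

```python
import math

def Q(L, s):
    # ½[1 + erf(√2(2L-1)/(4s))], the closed form of (S2) for φ = N(0, s)
    return 0.5 * (1.0 + math.erf(math.sqrt(2.0) * (2.0 * L - 1.0) / (4.0 * s)))

def d(s, f, n=1001):
    # maximum-norm distance between Q(·, s) and f, approximated on a grid over [0, 1]
    grid = [k / (n - 1) for k in range(n)]
    return max(abs(Q(L, s) - f(L)) for L in grid)

f = lambda L: L                 # the identity pair treated in the section above
s_m = math.sqrt(2.0) / 4.0      # the magic number for this pair
assert d(s_m, f) < 0.08         # Q(·, s_m) "looks like" f everywhere
assert d(1.0, f) > 0.3          # an arbitrary s gives a much poorer match
```

With the distance small, the benefit achieved with the single value $s_m$ stays close to the benefit achieved with the full continuum $s^*(L)$.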
The reason the normal distribution gave such poor performance when $f(L)$ was sigmoidal, while the lognormal distribution gave excellent performance, is that $Q_N(L, s_m)$ (the integral in (S2) when $\varphi = N$) does not look sigmoidal, whereas $Q_{LN}(L, s_m)$ ($\varphi = LN$) looks remarkably sigmoidal.

Determining the grayscale level using the Sloppy Algorithm
Algorithms that work well in a computer simulation can fail miserably outside of one. We tested whether the Sloppy Algorithm would work when a human was part of the iteration loop. The problem was to see whether a person could use the Sloppy Algorithm to determine the absolute, as opposed to relative, magnitude of a quantity. Examples of this kind of task are determining the brightness of a variable star by eye (http://www.aavso.org/) or estimating the weight of an ox [1].
The specific task was to determine the grayscale value of an image. The image, a square displayed on the computer monitor, had a grayscale value between 0 and 255. Next to the test square, a comparison square of equal size was shown, whose gray level was randomly chosen from a normal distribution with a mean level of 128 and standard deviation $s_i$. The person had to decide whether the test square was brighter or dimmer than the comparison square. After making $N$ comparisons, $\hat\lambda_i$ was calculated from $\hat\lambda_i = n/N$, where $n$ was the number of times that the test square was judged brighter than the comparison square and $i$ is the iteration number. This is step [4] (eqn. (S4)) in the Sloppy Algorithm. Based on $\hat\lambda_i$, a new $s_i$ that solved eqn. (S3) was determined.

Fig. E shows results from three tests (from 2 subjects). The dashed lines mark the correct gray level $L$. The initial $s$ value was set to $1 \times \Delta$ (filled circle) or $0.1 \times \Delta$ (filled square), where $\Delta = 256$ is the range of possible gray values. These initial $s$ values were chosen so that the first estimate $\hat\lambda_1$ would be far from $L$ (the program "knew" the value of $L$ but the person did not), thereby allowing us to see how the estimates converged to $L$. The convergence to the correct grayscale value is similar to that seen in Figure 2 in the main text, except that the convergence is nonmonotonic. Nonmonotonicity arises from the finite number of decisions ($N$) that were made; simulations show that the convergence becomes monotonic as $N \to \infty$. We used $N = 150$ to get good estimates of $\hat\lambda_i$, but making such a large number of decisions ($150$ decisions $\times$ $6$ iterations $= 900$ decisions) is tiring. Therefore, we tested whether setting $s_0$ to the magic number would hasten the convergence.
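The effect of finite $N$ can be reproduced in a simulation (a sketch with illustrative parameters and seed, working in gray levels normalized by $\Delta$ so that the comparison mean of 128 becomes $1/2$): each iteration draws $N$ noisy binary comparisons, forms $\hat\lambda_i = n/N$, and solves (S3) for the next $s_i$. The clamp keeping $\hat\lambda_i$ away from 0 and 1 is a practical guard added here, not part of the original protocol.

```python
import math, random

def erfinv(y):
    lo, hi = -6.0, 6.0              # bisection: erf is strictly increasing
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def s_from_lambda(lam):
    # step (S3) with f(L) = L: the s that makes Q(λ, s) = λ
    z = 2.0 * lam - 1.0
    if abs(z) < 1e-9:
        return 1.0 / math.sqrt(2.0 * math.pi)
    return math.sqrt(2.0) * z / (4.0 * erfinv(z))

def run(L, s0, N=150, iters=6, seed=0):
    rng = random.Random(seed)
    s, estimates = s0, []
    for _ in range(iters):
        # N binary judgments: "is the test square brighter than the comparison?"
        n = sum(L + rng.gauss(0.0, s) > 0.5 for _ in range(N))
        lam = min(max(n / N, 1.0 / N), 1.0 - 1.0 / N)  # keep λ̂ away from 0 and 1
        estimates.append(lam)
        s = s_from_lambda(lam)
    return estimates

est = run(L=70.0 / 256.0, s0=1.0)   # test square at gray level 70, s0 = 1×Δ
final = sum(est[-3:]) / 3.0         # average the last few noisy iterations
assert abs(final - 70.0 / 256.0) < 0.1
```

With $N = 150$ the estimates wander around $L$ by a few percent from iteration to iteration, reproducing the nonmonotonic convergence seen in Fig. E; increasing $N$ shrinks the wander.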
The results (open triangle) show that in this case, even on the first iteration, the estimate (71.4) is already close to the correct value (70). Note that in this experiment one observer makes $N$ decisions, whereas in the main text each of $N$ rulers makes one decision; the two approaches are mathematically equivalent.

Sloppy rulers, when combined with the Sloppy Algorithm, can make accurate, high-resolution measurements even though each sloppy ruler has the lowest possible resolution. Dithering is a technique that can also improve measurement resolution [2] and has long been used to reduce quantization errors in analog-to-digital conversion [3]. Noise is essential in both dithering and sloppy rulers.
However, sloppy rulers and dithering are different mathematically and in their arenas of application. In dithering the output signal is the average of both positive and negative excursions over many quantized states (256 states in an 8-bit analog-to-digital converter) centered around the input signal. By contrast, sloppy rulers average over only two states, zero and one.
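The two averaging schemes can be contrasted in a few lines (a sketch with illustrative signal values and seed): dithering averages the output of a many-level quantizer and recovers the input for any sufficiently large noise, while a sloppy ruler averages 0/1 decisions, whose mean is $Q(L, s)$ rather than $L$ itself.

```python
import math, random

rng = random.Random(1)
M = 200_000

# Dithering: average of round(x + u), u ~ U(-½, ½), recovers x
x = 100.4                           # input in units of one quantization step
dithered = sum(round(x + rng.uniform(-0.5, 0.5)) for _ in range(M)) / M
assert abs(dithered - x) < 0.01

# Sloppy ruler: average of binary decisions 1{L + noise > ½}; the mean is Q(L, s),
# which equals L only for the particular value s = s*(L)
L, s = 0.7, math.sqrt(2.0) / 4.0    # s_m: close to, but not exactly, s*(0.7)
binary_mean = sum(L + rng.gauss(0.0, s) > 0.5 for _ in range(M)) / M
Q = 0.5 * (1.0 + math.erf(math.sqrt(2.0) * (2.0 * L - 1.0) / (4.0 * s)))
assert abs(binary_mean - Q) < 0.01  # the average converges to Q(L, s) ...
assert abs(Q - L) > 0.005           # ... which here is slightly biased away from L
```

The first average is unbiased for any input, whereas the second converges to $Q(L, s)$ and is accurate only when $s$ is tuned to the input, which is precisely the job of the Sloppy Algorithm.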
This difference in which quantities are averaged is important in determining what the optimal noise level $s$ should be to get accurate measurements. For dithering, any $s$ larger than half of the quantization step size will produce an accurate output [2]. For sloppy rulers, there is a unique $s$ for each input value $L$ that produces an accurate output, which the Sloppy Algorithm finds. Choosing $s$ arbitrarily produces the estimates of $L$ shown in Fig. 3A of the main text; the $y$-axis is the estimate of $L$, and only by happenstance does the estimate match the true value of $L$.
The Sloppy Algorithm and dithering are useful in different systems. Sloppy rulers represent a wide class of systems that make binary decisions; such systems include yes-or-no voting in politics, all-or-none protein expression in cells, and the choice of which crops to plant. Dithering, on the other hand, is useful when there are many signal levels, as in analog-to-digital converters and in smoothing out pixelation in images.