Correlation Functions Quantify Super-Resolution Images and Estimate Apparent Clustering Due to Over-Counting

We present an analytical method using correlation functions to quantify clustering in super-resolution fluorescence localization images and electron microscopy images of static surfaces in two dimensions. We use this method to quantify how over-counting of labeled molecules contributes to apparent self-clustering and to calculate the effective lateral resolution of an image. This treatment applies to distributions of proteins and lipids in cell membranes, where there is significant interest in using electron microscopy and super-resolution fluorescence localization techniques to probe membrane heterogeneity. When images are quantified using pair auto-correlation functions, the magnitude of apparent clustering arising from over-counting varies inversely with the surface density of labeled molecules and does not depend on the number of times an average molecule is counted. In contrast, we demonstrate that over-counting does not give rise to apparent co-clustering in double label experiments when pair cross-correlation functions are measured. We apply our analytical method to quantify the distribution of the IgE receptor (FcεRI) on the plasma membranes of chemically fixed RBL-2H3 mast cells from images acquired using stochastic optical reconstruction microscopy (STORM/dSTORM) and scanning electron microscopy (SEM). We find that apparent clustering of FcεRI-bound IgE is dominated by over-counting labels on individual complexes when IgE is directly conjugated to organic fluorophores. We verify this observation by measuring pair cross-correlation functions between two distinguishably labeled pools of IgE-FcεRI on the cell surface using both imaging methods. After correcting for over-counting, we observe weak but significant self-clustering of IgE-FcεRI in fluorescence localization measurements, and no residual self-clustering as detected with SEM. 
We also apply this method to quantify IgE-FcεRI redistribution after deliberate clustering by crosslinking with two distinct trivalent ligands of defined architectures, and we evaluate contributions from both over-counting of labels and redistribution of proteins.


Introduction
Recent advances in super-resolution imaging have enabled imaging of cellular structures at close to molecular length scales using light microscopy [1,2,3,4,5]. In conventional fluorescence microscopy, the average distance between fluorescently labeled molecules is typically very small compared to the width of the point spread function (PSF) of the microscope (~250 nm). In this limit, the fluorescence character of individual labeled molecules does not contribute significantly to the final image, since many individual labeled molecules are averaged within the PSF of the measurement. Super-resolution fluorescence imaging and localization techniques can improve lateral resolution by an order of magnitude. In this limit, the average distance between neighboring labeled molecules can be close to the resolution of the measurement, and the finite size of individual labeled molecules as well as the finite size of the measurement resolution can significantly impact the resulting images. For example, undersampling of super-resolution images can lead to lower effective resolution by some measures, as discussed in previous work [6,7,8].
In this study, we explicitly assess how inadvertent over-sampling of individual labeled molecules can lead to the erroneous appearance of self-clustering. The situation can arise in both super-resolution localization images of fluorescently labeled proteins and in electron microscopic images of gold labeled proteins. When not considered explicitly, this apparent self-clustering could be incorrectly interpreted as self-clustering of labeled proteins. This is an important consideration since correctly determining the organization of membrane components is vital for deciphering how membrane organization is linked to cellular functions.
Over-counting of labels in nano-scale resolution imaging techniques is a common but under-appreciated problem. Over-counting can occur, for example, when target proteins are labeled with primary and secondary antibodies or when antibodies are conjugated to multiple fluorophores. It can also occur when the same fluorophore is counted two or more times because it cycles reversibly between activated and dark states. In all of these cases, over-counting can lead to the artifactual appearance of self-clustering over distances that correspond to the effective resolution of the measurement. In this study we first describe a method to quantify the distribution of labeled molecules in images, and we then develop a simple model to predict the magnitude of apparent clustering arising from over-counting. We show how this formalism applies to deliberate over-counting and thereby provides a useful measure of the effective average lateral resolution of a reconstructed super-resolution fluorescence localization image. We use this analytical approach to quantify high resolution images of the high affinity IgE receptor (FcεRI) on the surface of RBL-2H3 mast cells obtained using both stochastic optical reconstruction microscopy (STORM/dSTORM) and scanning electron microscopy (SEM). We also apply the method to an example of IgE-FcεRI complexes that are deliberately clustered on the cell surface by crosslinking with defined trivalent ligands. In this case, the observed clustering contains contributions from the redistributed proteins in addition to the inherent over-counting of multiple labels. Our approach can also be applied to other types of high resolution imaging methods, including transmission electron microscopy (TEM), and has recently been applied to quantify images obtained using photoactivated light microscopy (PALM/fPALM) [9].

Results and Discussion
Pair auto-correlation functions quantify over-counting
Pair correlation functions quantify organization in heterogeneous systems and are easily applied to super-resolution localization data. The pair auto-correlation function, $g(r)$, that reports the increased probability of finding a second localized signal a distance $r$ away from a given localized signal, is efficiently calculated using Fast Fourier Transforms, and can account for complex boundary shapes without additional assumptions. Detailed methods used to calculate correlation functions are described in Materials and Methods, and a Matlab function to calculate $g(r)$ from images is supplied in File S1.
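The FFT-based calculation can be sketched in a few lines. The block below is not the Matlab function from File S1; it is a minimal Python/NumPy illustration under stated assumptions: the image is a 2D array of signal counts per pixel, the region of interest is given as a 0/1 mask, and the function name `pair_autocorrelation` and its parameters are hypothetical. Boundary shape is handled by dividing by the autocorrelation of the mask.

```python
import numpy as np

def pair_autocorrelation(image, mask=None, r_max=50):
    """Radially averaged g(r) of a 2D image of localized signals.

    image : 2D array of signal counts per pixel
    mask  : 2D array of 0/1 selecting the region to analyse (default: all pixels)
    r_max : largest radius, in pixels (must be smaller than both image dimensions)
    """
    image = np.asarray(image, dtype=float)
    mask = np.ones_like(image) if mask is None else np.asarray(mask, dtype=float)
    masked = image * mask
    density = masked.sum() / mask.sum()          # average signals per pixel

    # Zero-pad by r_max so circular FFT correlations do not wrap around.
    shape = (image.shape[0] + r_max, image.shape[1] + r_max)
    power_img = np.abs(np.fft.rfft2(masked, s=shape)) ** 2
    power_mask = np.abs(np.fft.rfft2(mask, s=shape)) ** 2
    corr_img = np.fft.fftshift(np.fft.irfft2(power_img, s=shape))
    corr_mask = np.fft.fftshift(np.fft.irfft2(power_mask, s=shape))

    # Keep only displacements within +/- r_max of zero.
    cy, cx = shape[0] // 2, shape[1] // 2
    win = (slice(cy - r_max, cy + r_max + 1), slice(cx - r_max, cx + r_max + 1))
    # Dividing by the mask autocorrelation corrects for the boundary shape.
    g2d = corr_img[win] / (density ** 2 * corr_mask[win])

    # Average over angles by binning displacements to the nearest integer radius.
    yy, xx = np.indices(g2d.shape)
    bins = np.rint(np.hypot(yy - r_max, xx - r_max)).astype(int)
    keep = bins <= r_max
    total = np.bincount(bins[keep], weights=g2d[keep], minlength=r_max + 1)
    count = np.bincount(bins[keep], minlength=r_max + 1)
    return np.arange(r_max + 1), total / count

# Example: signals scattered at random, with no over-counting.
rng = np.random.default_rng(0)
random_image = rng.poisson(0.5, size=(256, 256))
radius, g = pair_autocorrelation(random_image, r_max=20)
```

For this random image, `g[1:]` should scatter around 1, while `g[0]` is much larger because the zero-radius bin holds the self-pair (delta function) contribution that is normally excluded from plots.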
If an ensemble of molecules is distributed on a two dimensional surface with centers at positions $\vec{r}$ described by the density function $\rho(\vec{r})$ and an average density $\langle \rho(\vec{r}) \rangle = \rho$, the associated pair auto-correlation function of molecular centers is

$$g(\vec{r}) = \frac{\langle \rho(\vec{R})\, \rho(\vec{R} - \vec{r}) \rangle}{\rho^{2}},$$

where the average is over all positions $\vec{R}$ in the image. In this definition, $g(\vec{r}) = 1$ represents a random distribution. Often it can be assumed that $g(\vec{r})$ is symmetric to rotations, and it is averaged over angles to obtain $g(r)$. At $r = 0$, $g(r)$ contains a delta function, $\delta(r)$, with magnitude of $1/\rho$. Correlation functions are plotted for $r > 0$, as $g(r = 0)$ is a trivial contribution. However, if $g(r)$ is calculated from an image obtained from a measurement with finite resolution in the presence of over-counting, the measured correlation function will contain a remnant of this delta function at nonzero radius:

$$g_{\mathrm{meas}}(\vec{r}) = \frac{1}{\rho}\, \delta(\vec{r}) * g_{\mathrm{PSF}}(\vec{r}) + g(\vec{r} > 0) * g_{\mathrm{PSF}}(\vec{r}), \qquad (1)$$

where $g_{\mathrm{PSF}}(r)$ is the correlation function of the average PSF of the measurement, $g(r > 0)$ represents the correlation function for the distribution of labeled molecules, and $*$ denotes a two dimensional convolution. The convolution acts to smear $\delta(r)$ to finite radius. A detailed derivation of the above equation is included in Materials and Methods, and a discussion of some important caveats is included later in this section. If we assume a Gaussian-shaped form of the PSF with a standard deviation of $\sigma$, the normalized $\mathrm{PSF}(r) = \exp\{-r^{2}/2\sigma^{2}\}/(2\pi\sigma^{2})$ and $g_{\mathrm{PSF}}(r) = \exp\{-r^{2}/4\sigma^{2}\}/(4\pi\sigma^{2})$. In this case, $g_{\mathrm{meas}}(r)$ becomes:

$$g_{\mathrm{meas}}(r) = \frac{1}{4\pi\sigma^{2}\rho} \exp\{-r^{2}/4\sigma^{2}\} + g(r > 0) * g_{\mathrm{PSF}}(r). \qquad (2)$$

The first term of $g_{\mathrm{meas}}(r)$ arises from over-counting of labeled molecules with finite resolution and is inversely proportional to the average density of labeled molecules ($\rho$). The second term describes the distribution of labeled molecules within the resolution limits imposed by the average PSF and is independent of the density of labeled molecules. This is graphically depicted in Figure 1 for the example of labeled molecules partitioned either randomly or into circular domains. In the special case of a random distribution of labeled molecules, $g(r > 0) = 1$ and

$$g_{\mathrm{meas}}(r) = \frac{1}{4\pi\sigma^{2}\rho} \exp\{-r^{2}/4\sigma^{2}\} + 1.$$

For comparison, another methodology commonly used to quantify heterogeneity in labeled membrane systems is the modified Ripley's K function, denoted $(L(r) - r)/r$. $L(r)$ is related to the average number of signals within a radius $r$ of a given particle [10], which is the integral of $2\pi r\, g(r)$. As a result, Ripley's methods are not well suited to quantify images that are subject to over-counting, since over-counting at short distances is propagated to long distances through the integration. By contrast, the correlation function is not much affected by over-counting when evaluated at distances larger than the width of the PSF, as demonstrated by comparison of Figures 1C and 1E. The mathematical relationship between $g(r)$ and $(L(r) - r)/r$ used to generate the curves in Figure 1E is presented in Materials and Methods.
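A short numerical sketch makes the contrast between the two measures concrete for randomly distributed molecules. It assumes a Gaussian PSF, so that the over-counting term is exp(−r²/4σ²)/(4πσ²ρ), and it assumes the standard definitions K(r) = ∫ 2πr′ g(r′) dr′ and L(r) = √(K(r)/π), which may differ in detail from the relationship given in Materials and Methods. The function names and the parameter values (σ = 10 nm, ρ = 200 μm⁻²) are illustrative choices, not values taken from Figure 1.

```python
import numpy as np

def overcounting_term(r, sigma, rho):
    """Gaussian-PSF over-counting term: exp(-r^2 / 4 sigma^2) / (4 pi sigma^2 rho)."""
    return np.exp(-r ** 2 / (4 * sigma ** 2)) / (4 * np.pi * sigma ** 2 * rho)

def modified_ripley(r, g):
    """(L(r) - r) / r from g(r) sampled on a grid r that starts at 0.

    Uses K(r) = integral_0^r 2 pi r' g(r') dr' and L(r) = sqrt(K(r) / pi).
    Returns values for r[1:], since the ratio is undefined at r = 0.
    """
    integrand = 2 * np.pi * r * g
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)   # trapezoid rule
    K = np.cumsum(steps)
    return (np.sqrt(K / np.pi) - r[1:]) / r[1:]

sigma = 10.0            # nm, assumed PSF width
rho = 200e-6            # labeled molecules per nm^2 (200 per square micron)
r = np.linspace(0.0, 200.0, 2001)                     # nm, 0.1 nm steps
g_random = 1.0 + overcounting_term(r, sigma, rho)     # random distribution plus over-counting
ripley = modified_ripley(r, g_random)

print(g_random[0] - 1)        # amplitude at r = 0, about 3.98
print(g_random[1000] - 1)     # r = 100 nm: essentially zero
print(ripley[999])            # r = 100 nm: still about 0.08
```

Halving `rho` doubles the short-range amplitude of g(r), while the number of counts per molecule never enters. At r = 100 nm the correlation function has returned to 1, but the modified Ripley's function remains positive because the integral retains the short-range excess.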

Some considerations when estimating the magnitude of apparent clustering
The estimates of apparent clustering due to over-counting that are presented in the first terms of Eqns. 1 and 2 are valid only when over-counting occurs via a random process. More rigorously, this applies when the number of times a given labeled molecule is sampled is well approximated by a Poisson distribution. This is expected to be the case for the majority of high-resolution measurements that are subject to over-counting, such as stochastic blinking of fluorophores in STORM/dSTORM measurements and reversible switching of fluorescent proteins in some PALM/fPALM measurements. This case should also apply when over-counting occurs through conjugation of multiple organic fluorophores to proteins or ligands, or when proteins are labeled with primary and secondary antibodies. As has been documented previously by others, these equations also hold in diffraction limited images in the limit where an ensemble of photons samples the PSF of each observed fluorophore, and similar properties of measured correlation functions have been exploited to extract the oligomerization state of labeled molecules [11].
Our estimates of clustering will not be accurate if over-counting is not randomly distributed over all labeled molecules. The first terms of Eqns. 1 and 2 will over-estimate apparent clustering from over-counting for cases where labeled molecules are sampled less frequently than expected from a Poisson distribution. This would occur, for example, when detection of a signal from a labeled molecule decreases the probability that the same labeled molecule will be detected additional times. This occurs in super-resolution fluorescence localization measurements if there is a significant probability of bleaching a fluorophore after it is activated. If, in fact, imaging is conducted in a manner that ensures that all labeled molecules are counted at most once, then measured correlations are due only to clustering of labeled molecules, and over-counting is not a problem. This is the ideal case for PALM/fPALM measurements if every activated fluorophore is irreversibly bleached after being counted, or for EM measurements if a labeling strategy is employed that ensures at most a single gold particle label per target protein. We note that several recent studies have demonstrated that some popular 'irreversible' PALM/fPALM probes show reversible blinking under some imaging conditions [9,12,13]. Our estimates of clustering will also not be accurate if there is significant noise in the image. Noise in the form of incorrectly identified signals or nonspecific labeling would act to decrease the magnitude of all correlations.
The first terms of Eqns. 1 and 2 will underestimate the magnitude of apparent clustering when labeled molecules are sampled more frequently than expected from a Poisson distribution. This would occur, for example, when the act of counting a signal from a labeled molecule increases the probability that additional signals will be detected from the same labeled molecule. This condition occurs in super-resolution fluorescence localization measurements if activated probes are counted once for each frame in which they are imaged, including cases when the same signal remains activated in multiple sequential image frames. A rigorous derivation demonstrating how deviations from a Poisson distribution quantitatively alter the magnitude of the over-counting term can be found in Materials and Methods.
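The three regimes described above can be illustrated with a small calculation. The sketch assumes that every signal from a molecule is displaced independently by the PSF, in which case same-molecule pairs scale the over-counting term by ⟨n(n−1)⟩/⟨n⟩², where n is the number of times a molecule is counted. This factor is our own shorthand for illustration, not the derivation from Materials and Methods, and the function name and the three count distributions are hypothetical examples.

```python
import numpy as np

def overcounting_factor(counts):
    """<n(n-1)> / <n>^2 for the number of times each labeled molecule is counted.

    1 reproduces the Poisson estimate; below 1 the Poisson estimate is too
    large; above 1 it is too small.
    """
    counts = np.asarray(counts, dtype=float)
    return np.mean(counts * (counts - 1)) / np.mean(counts) ** 2

rng = np.random.default_rng(1)
n_molecules = 200_000

poisson_counts = rng.poisson(3.0, n_molecules)          # random over-counting
at_most_once = rng.binomial(1, 0.6, n_molecules)        # each molecule counted 0 or 1 times
bursty = rng.geometric(0.25, n_molecules) - 1           # broader than Poisson, same mean of 3

print(overcounting_factor(poisson_counts))   # close to 1
print(overcounting_factor(at_most_once))     # exactly 0: no apparent clustering
print(overcounting_factor(bursty))           # close to 2
```

For Poisson counts the factor is 1 regardless of the mean, consistent with the statement that apparent clustering does not depend on how many times an average molecule is counted.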

Deliberate over-counting quantifies effective resolution
Deliberately over-counting probes is useful for isolating the over-counting term in Eqn. 1 and thereby directly measuring the effective average PSF of the measurement. An example of this approach is shown in Figure 2 for the case of a reconstructed super-resolution fluorescence localization image of labeled IgE-FcεRI on the RBL cell surface. We isolate the autocorrelation of the average PSF of the measurement, $g_{\mathrm{PSF}}(r)$, by first tabulating correlation functions from two images reconstructed from the same set of localized single molecule centers (signals). The first image is shown in Figure 2A and is reconstructed from intentionally over-counted signals (i.e. where signals localized in the same position in sequential frames are counted independently), whereas the second image shown in Figure 2B is reconstructed from signals where over-counting is avoided by grouping signals that occur within some small distance in sequential observations. Subtracting $g_{\mathrm{meas}}(r)$ of the grouped image from $g_{\mathrm{meas}}(r)$ of the intentionally over-counted image results in a curve that is proportional to $g_{\mathrm{PSF}}(r)$, as the second term of Eqn. 1 is independent of the number of times a labeled molecule is counted. This is shown in Figure 2C. Note that in this example, both the raw and grouped measured correlation functions do not go to 1 at the largest radii shown in Figure 2C (r = 120 nm). This is because, for demonstration purposes, the entire image was used to calculate the measured correlation function and the majority of the image intensity is localized within the cell that extends for many microns, leading to long range contributions to $g_{\mathrm{meas}}(r > 0)$. These contributions are not present in $g_{\mathrm{PSF}}(r)$. All remaining correlation functions presented in subsequent figures are tabulated using only contiguous regions of the cell membrane, as described in Materials and Methods.

Figure 1 (caption fragment). Correlation functions calculated from B for structures as indicated. Red (green) signals are sampled at random from red (green) PSF areas with OCR = 1, as described in A. $g(r)$ for red centers and gray domains are equivalent within error, but $g(r)$ for red signals shows additional clustering at short $r$, in agreement with Eqn 1. Green signals are also clustered at short $r$ as described by Eqn 2, while $g(r)$ for green centers is random within error. (D) Simulated $g(r)$ for labeled red molecules partitioned into the gray domains as in B but with different average surface densities ($\rho$). Apparent clustering at short $r$ decreases as $\rho$ is increased, but long range correlations are unchanged, consistent with Eqn 1. (E) Modified Ripley's functions, $(L(r) - r)/r$, calculated from clustered red centers are slightly lower than but resemble functions calculated for red signals at large $r$. As expected, modified Ripley's functions for randomly distributed green centers do not show significant clustering over any radius. In contrast, functions calculated from green signals show significant apparent clustering over large distances. doi:10.1371/journal.pone.0031457.g001
In an ideal experiment, the range of $g_{\mathrm{PSF}}(r)$ will be simply related to the average localization precision of acquired signals. In many cases, this calculated $g_{\mathrm{PSF}}(r)$ will be broader than the average localization precision extracted from fitting single fluorophores because it also contains contributions from limitations that are not explicitly accounted for in the experiment. Such factors could include incomplete correction for stage drift, finite mobility of labeled molecules [14], or inadvertent grouping of distinct fluorophores. This method will not produce accurate effective resolutions if sequential occurrences of the same fluorophore are not appropriately grouped (e.g. if the grouping radius is too small), if immobilized probes are incorrectly localized due to orientation effects on fluorescence emission [15], or if artifacts that reduce resolution occur on time-scales much longer than the lifetime of activated fluorophores.
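Once the difference between the raw and grouped correlation functions has been tabulated, the effective PSF width can be read off with a simple fit. The sketch below assumes the difference curve follows A exp(−r²/4σ²), the Gaussian form quoted in the Figure 2 caption, and fits it by linear regression of ln(g) on r². The function name is hypothetical, and the input is synthetic, noise-free data generated from the values reported for Figure 2C (σ = 9.6 nm, A = 4.9) rather than the measured curve.

```python
import numpy as np

def fit_gaussian_psf(r, g_diff):
    """Fit g_diff(r) = A exp(-r^2 / (4 sigma^2)); returns (A, sigma).

    ln g_diff is linear in r^2 with slope -1 / (4 sigma^2) and intercept ln A.
    Only points with g_diff > 0 are used, because of the logarithm.
    """
    r = np.asarray(r, dtype=float)
    g_diff = np.asarray(g_diff, dtype=float)
    ok = g_diff > 0
    slope, intercept = np.polyfit(r[ok] ** 2, np.log(g_diff[ok]), 1)
    sigma = np.sqrt(-1.0 / (4.0 * slope))
    return np.exp(intercept), sigma

# Synthetic stand-in for g_raw(r) - g_group(r), radii in nm.
r = np.arange(1.0, 40.0, 2.0)
g_raw_minus_group = 4.9 * np.exp(-r ** 2 / (4 * 9.6 ** 2))

amplitude, sigma = fit_gaussian_psf(r, g_raw_minus_group)
print(amplitude, sigma)
```

With real data, the log transform gives extra weight to noisy points in the tail of the curve, so restricting the fit to radii where the difference is well above the noise, or using a weighted nonlinear fit, may be preferable.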

Pair correlation functions quantify heterogeneity
For cases in which measured correlation functions contain contributions that cannot be attributed to over-counting, such as when $g_{\mathrm{meas}}(r)$ exceeds 1 at radii beyond $\sigma$, the residual correlations can be attributed to clustering of labeled molecules. Much information can be extracted to discern the underlying structural distribution by monitoring both the shape and the magnitude of the correlation function. For example, the number of labeled molecules that are clustered together on average is given by [equation missing], and the effective potential of mean force (PMF) between labeled molecules is given by $\mathrm{PMF}(r) = -k_{B}T \ln\{g(r)\}$ [16]. The shape of the correlation function also sheds light on the physical basis that governs heterogeneity [17]. Three examples of different simulated particle distributions are shown in Figure 3A, and their calculated correlation functions shown in Figure 3B have distinct features that can be used to distinguish the organizing principles giving rise to these distributions. Simulations of particles placed within a series of circular domains produce correlation functions that are damped oscillations, where the frequency of the oscillations corresponds to the average domain size, and the decay length quantifies correlations between neighboring domains [18]. By contrast, simulations of particles distributed in fluctuations produce correlation functions that decay as exponentials [19]. Both micro-emulsion (circles) and fluctuation models have been proposed as physical mechanisms that could produce small and subtle heterogeneity in resting cell plasma membranes [20,21], and, in principle, the shapes of correlation functions can be used to distinguish these different models.
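The PMF relation and the two model shapes can be written down directly. The sketch below uses the functional forms quoted in the Figure 3 caption (a damped cosine for the micro-emulsion and a power law times an exponential for fluctuations). The function names and the parameter values are hypothetical and chosen only to show the qualitative difference between the two shapes.

```python
import numpy as np

def pmf_in_kT(g):
    """Potential of mean force in units of k_B T: PMF(r) = -ln g(r)."""
    return -np.log(np.asarray(g, dtype=float))

def g_micro_emulsion(r, A, alpha, r0):
    """Damped cosine: 1 + A exp(-r / alpha) cos(pi r / (2 r0))."""
    return 1 + A * np.exp(-r / alpha) * np.cos(np.pi * r / (2 * r0))

def g_fluctuations(r, A, xi):
    """Critical-fluctuation form: 1 + A r^(-1/4) exp(-r / xi), for r > 0."""
    return 1 + A * r ** -0.25 * np.exp(-r / xi)

r = np.linspace(1.0, 300.0, 300)                              # nm
g_circles = g_micro_emulsion(r, A=1.0, alpha=100.0, r0=50.0)  # illustrative parameters
g_critical = g_fluctuations(r, A=1.0, xi=100.0)               # illustrative parameters

print(g_circles.min() < 1)     # True: dips below 1 beyond the domain radius
print(g_critical.min() > 1)    # True: decays toward 1 from above
print(pmf_in_kT(1.25))         # about -0.22 k_B T, a weak attraction
```

In the damped cosine form, g(r) first crosses 1 at r = r0, so the position of the first crossing gives a direct reading of the average domain radius.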

Over-counting in super-resolution fluorescence localization images
We apply this correlation analysis to two types of super-resolution data obtained with labeled IgE specifically bound to the high affinity FcεRI receptor on RBL-2H3 mast cells. Figure 4A shows a reconstructed super-resolution fluorescence localization image of Alexa-647 fluorophores conjugated directly to IgE on the ventral (bottom) surface of a chemically fixed cell. In these measurements, the majority of probes are forced into a reversible dark state in the presence of bright light, a reducing environment, and basic pH [4,5]. This enables imaging and localization of a sparse subset of fluorophores at any given time. Probes stochastically switch between bright and dark states, and high resolution images are reconstructed from samples imaged over time, as described in Materials and Methods.

Figure 2 (caption fragment). [...] shown in A is reconstructed from raw data where each localized signal is counted independently. In B, intentional over-counting arising from probes remaining activated for multiple sequential frames is removed by grouping localized signals found at the same location within a small radius in sequential raw images. Grouping methods are described in Materials and Methods, and several locations which differ between the grouped and raw images are highlighted with green squares in the zoomed images. (C) Correlation functions are calculated from both the raw image to obtain $g_{\mathrm{raw}}(r)$ and from the grouped image to obtain $g_{\mathrm{group}}(r)$. The correlation function of the raw image contains more apparent clustering at short radii than the measured correlation function of the grouped image because there are additional contributions in the raw image from intentional over-counting. Subtracting $g_{\mathrm{group}}(r)$ from $g_{\mathrm{raw}}(r)$ results in a curve that is proportional to the correlation function of the effective point spread function, $g_{\mathrm{PSF}}(r)$. This is a measure of the effective resolution of the measurement. In this example, the black points are fit assuming a Gaussian PSF, $g_{\mathrm{PSF}}(r) = A \exp\{-r^{2}/4\sigma^{2}\}$, where $\sigma$ is determined to be 9.6 nm and A = 4.9 is a constant related to the average number of times each probe was deliberately over-counted. In A and B, images on the left are filtered with a Gaussian PSF with standard deviation of 75 nm and zoomed images on the right are filtered with a Gaussian PSF with standard deviation of 10 nm for display purposes. doi:10.1371/journal.pone.0031457.g002
Correlation functions derived from images of localized single molecules from cells labeled with Alexa-647 conjugated IgE show significant auto-correlations at short distances and weak correlations that extend to longer distances, as shown in Figure 4B. We fit this measured correlation function to Eqn. 1 by approximating $g(r > 0) * g_{\mathrm{PSF}}(r)$ as a single exponential given by $1 + A \exp\{-r/\xi\}$, where A is the amplitude and $\xi$ describes the size of the structure. The best fit value for the average surface density ($\rho$) of labeled IgE is $\rho = 200 \pm 6\ \mu\mathrm{m}^{-2}$, which is in good agreement with previous studies [22]. The short range autocorrelation (red curve) arises from over-counting as confirmed by cross-correlation analysis (see below). The long range autocorrelation (green curve) can be fit to obtain an amplitude of $A = 0.25 \pm 0.03$ and a range of $\xi = 95 \pm 8$ nm.
Strong evidence that the large correlations at short radii arise from over-counting labels on single IgE-FcεRI complexes and not from self-clustering of proteins is provided by measurements of cross-correlation functions calculated from two-color images (Figure 4C,D). Similar to auto-correlation, the cross-correlation function, $c(r)$, quantifies the increased probability of finding a signal a distance $r$ away from a given signal of a different type. Unlike the auto-correlation function, the cross-correlation function does not contain a delta function at r = 0, and therefore it is not affected by over-counting, even when an experiment is conducted with finite resolution. A detailed derivation of this statement is included in Materials and Methods. In the two-color experiment, we created two separate pools of FcεRI on the cell surface by preincubating cells with a mixture of IgE labeled with either the fluorophore Alexa647 or the fluorophore Alexa532 prior to fixation. Importantly, by this scheme, both species of fluorophore cannot label the same FcεRI protein because only a single IgE antibody binds to each FcεRI protein [23]. After cell fixation, each color channel was imaged sequentially. Final reconstructed images of the different color channels are merged with the aid of fiducial markers for accurate alignment (Figure 4C).

Figure 3 (caption fragment). The top and bottom panels under each heading in A display the same particle distributions, while the bottom panels in A show both the particles and the template for demonstration purposes. Correlation functions are tabulated from a large number of simulations resembling the ones shown in the top panels (A). The correlation functions in B are fit to two different functional forms to account for distinct features in the curves. $g(r)$ for the two circle distributions have a well defined dip below $g(r) = 1$, and are fit to a damped cosine function: $g(r) = 1 + A \times \exp(-r/\alpha) \times \cos(\pi r / 2 r_{o})$, where A is an amplitude, $\alpha$ is a measure of the coherence length between circles, and $r_{o}$ is the average circle radius. This is the predicted functional form for a correlation function of a micro-emulsion [18]. The correlation function for the fluctuation model does not dip below $g(r) = 1$ and is fit to the predicted form for critical systems: $g(r) = 1 + A \times r^{-1/4} \times \exp(-r/\xi)$. From this example, it is apparent that both the shape and range of the correlation function can reveal significant information regarding the underlying structure that gives rise to the heterogeneity. Also, when correlation functions are fit to the appropriate model, they accurately reproduce the radii of the circle distributions and the correlation length of the fluctuating distribution shown in part A. doi:10.1371/journal.pone.0031457.g003
Measured cross-correlation functions lack the large correlations at short distances that dominate auto-correlation functions tabulated from single color images (Figure 4B), but they retain the weak correlations at larger radii (Figure 4D). This measurement confirms that large clustering at short radii arises from over-counting IgE-FcεRI complexes in auto-correlated, single-label experiments. Fitting measured cross-correlation functions to an exponential function $c(r) = 1 + A \exp\{-r/\xi\}$ yields an amplitude of $A = 0.26 \pm 0.02$ and a range of $\xi = 89 \pm 6$ nm. Both parameters are in good agreement with those extracted from fitting the autocorrelation function in Figure 4B after isolating contributions from over-counting as described above.
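A small simulation illustrates why the cross-correlation function is insensitive to over-counting. The sketch assumes two independent pools of randomly placed molecules, each counted a Poisson-distributed number of times with Gaussian localization error, and it analyses the whole image without a mask. The function names, image size, densities and PSF width are illustrative assumptions, not the experimental analysis code.

```python
import numpy as np

def radial_correlation(img_a, img_b, r_max=20):
    """Radially averaged correlation of two equal-sized images (whole image analysed).

    Passing the same image twice gives the auto-correlation g(r).
    """
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    shape = (a.shape[0] + r_max, a.shape[1] + r_max)    # zero-pad against wrap-around
    fa, fb = np.fft.rfft2(a, s=shape), np.fft.rfft2(b, s=shape)
    fm = np.fft.rfft2(np.ones_like(a), s=shape)
    corr = np.fft.fftshift(np.fft.irfft2(fa * np.conj(fb), s=shape))
    norm = np.fft.fftshift(np.fft.irfft2(np.abs(fm) ** 2, s=shape))
    cy, cx = shape[0] // 2, shape[1] // 2
    win = (slice(cy - r_max, cy + r_max + 1), slice(cx - r_max, cx + r_max + 1))
    c2d = corr[win] / (a.mean() * b.mean() * norm[win])
    yy, xx = np.indices(c2d.shape)
    bins = np.rint(np.hypot(yy - r_max, xx - r_max)).astype(int)
    keep = bins <= r_max
    total = np.bincount(bins[keep], weights=c2d[keep], minlength=r_max + 1)
    return total / np.bincount(bins[keep], minlength=r_max + 1)

def simulate_signals(rng, n_molecules, size, mean_counts, sigma):
    """Image of over-counted signals from randomly placed molecules."""
    centers = rng.uniform(0, size, (n_molecules, 2))
    counts = rng.poisson(mean_counts, n_molecules)       # times each molecule is counted
    points = np.repeat(centers, counts, axis=0)
    points = points + rng.normal(0, sigma, points.shape)  # localization error
    image, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                 bins=size, range=[[0, size], [0, size]])
    return image

rng = np.random.default_rng(2)
pool_a = simulate_signals(rng, 2000, 256, mean_counts=3.0, sigma=2.0)
pool_b = simulate_signals(rng, 2000, 256, mean_counts=3.0, sigma=2.0)

g_auto = radial_correlation(pool_a, pool_a)
c_cross = radial_correlation(pool_a, pool_b)
print(g_auto[1], c_cross[1:6].mean())
```

The auto-correlation at short radii is well above 1 even though the molecules are randomly placed, whereas the cross-correlation between the two pools stays close to 1.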
The magnitude of measured cross-correlation functions suggests that IgE-FcεRI clustering arises from a thermally driven mechanism, since $\mathrm{PMF}(r) = -k_{B}T \ln\{g(r)\}$ indicates that the potential of mean force is on the order of $1\,k_{B}T$. The shape of the measured cross-correlation function is well fit to an exponential and does not appear to drop below $g(r) = 1$. This is consistent with an irregular structure that more closely resembles the image of fluctuations than the images of circles in Figure 3. These measured auto-correlation and cross-correlation functions are consistent with our recent theoretical predictions of critical fluctuations in plasma membranes at physiological temperatures [20,24], although it is equally possible that weak correlations arise from other mechanisms such as undulating membrane topology or interactions with the glass substrate.

Over-counting in scanning electron microscopy images
This correlation analysis can also be applied to scanning electron microscopy (SEM) images where target proteins are labeled with primary antibodies followed by secondary antibodies conjugated to gold particles as described in Materials and Methods. Figure 5 shows a flat section of the top surface of an RBL-2H3 cell with IgE-FcεRI complexes that are immuno-labeled with 10 nm gold particles. This labeling scheme allows for multiple gold particles to decorate individual target proteins, and the correlation function detects clustering over short distances (Figure 5B). In this experiment, the PSF is governed by the finite size of labeling antibodies and gold particles and not by the precision of localizing the gold particle centers. Measured correlation functions tabulated from images of gold particle centers show depletion at very short radii, $g_{\mathrm{meas}}(r < 15\ \mathrm{nm}) < 1$, because the gold particles cannot pack closer than their hard sphere radius. Fitting the measured auto-correlation function to either Eqn. 1 or 2 yields $\sigma = 13 \pm 0.5$ nm and $\rho = 157 \pm 5\ \mu\mathrm{m}^{-2}$. This surface density is comparable to, but somewhat lower than, that calculated from our fluorescence measurements, and is still within expected values [22]. It is possible that this extracted surface density of IgE-FcεRI underestimates the actual surface density of complexes, since labeling of gold particles may not be well approximated by a Poisson distribution due to the large size of gold particle labels.
Direct evidence that apparent clustering of labeled IgE-FcεRI complexes is dominated by contributions from over-counting is provided by double-label SEM experiments, where distinguishable but functionally identical pools of IgE-FcεRI are labeled with differently sized gold particles (Figure 5C). Just as in our double label fluorescence experiments, this measurement was conducted by first creating two separate pools of FcεRI on the cell surface by pre-incubating the cells with a mixture of IgE labeled with either the fluorophore Alexa488 or the fluorophore FITC prior to fixation. These were distinctively labeled with fluorophore-specific primary antibodies of different species followed by species-specific secondary antibodies conjugated to gold particles of different sizes (Figure 5C). By this scheme, small and large gold particles cannot bind to the same FcεRI protein. We find that cross-correlation functions tabulated between differently sized particles indicate random distributions within experimental error bounds (Figure 5D). This comparison shows that the appearance of clustering in single label images (Figure 5B) is dominated by over-counting individual target proteins.
Thus, unlike our super-resolution fluorescence localization measurements (Figure 4), we do not detect significant self-clustering over longer distances when we visualize gold labeled proteins using SEM. This could be because we selected morphologically flat regions of the cell surface for our SEM measurements (see Materials and Methods), while we could not independently measure surface topology in our fluorescence measurements. Another possible reason for the difference could be that receptors are organized differently on the top and bottom surfaces of the cell. SEM measurements were acquired from the top (dorsal) cell surface, while the fluorescence images were acquired from the bottom (ventral) cell surface.
Our analysis of both super-resolution fluorescence localization and SEM images yields results that differ from those of several previous studies which report that IgE-FcεRI complexes are tightly pre-clustered into small domains in unstimulated RBL-2H3 cells by electron microscopy [25,26,27]. Since similar strategies were used to label IgE-FcεRI in these studies, we expect that over-counting of IgE-FcεRI complexes was incorrectly identified as self-clustering of these target proteins. It is possible that previous reports of self-clustering of other membrane components visualized by electron microscopy can also be attributed to over-counting, since labeling schemes often require the use of multiple or polyclonal antibodies. This potential pitfall of electron microscopy labeling and imaging was noted in early work that contributed to the Fluid Mosaic Model of biological membranes [28].

Quantifying receptor clustering and over-counting in SEM images
Large-scale clustering of IgE-FcεRI is observed when cells are treated with a multivalent antigen that crosslinks multiple surface-bound IgE antibodies. Figure 6 shows reconstructed SEM micrographs of RBL cells treated for 10 minutes with trivalent dinitrophenyl (DNP) ligands. These architecturally defined ligands are based on a Y-shaped DNA scaffold with DNP groups conjugated to each of the three 5′ ends. The distance between DNP molecules is set by the number of bases in each of the complementary single strands that are annealed to form the double stranded Y-structure, and for Y16-DNP and Y46-DNP that distance is 5 ± 1 nm and 13 ± 2 nm, respectively [29]. Because the anti-DNP IgE used in these experiments contains two DNP binding sites, the trivalent Y-DNP ligands can cross-link IgE-FcεRI complexes into branched clusters.
Gold particles labeling IgE-FceRI from cells incubated for 10 min with Y16-DNP show clear extended clusters in reconstructed SEM images ( Figure 6A), and this structure is reflected in measured auto-correlation functions ( Figure 6B). Correlation functions from Y16-DNP treated cells are well fit by Eqn 1, assuming an exponential form of g(rw0) Ã g psf (r)~1z A exp {r=j f g, and extracted fit parameters are given in the caption to Figure 6. The average dimensions of the clusters (j = 3962 nm) is much larger than the width of the effective PSF (s = 1061 nm), and this provides confidence in the fit of both the long-range and short-range components of the data. However, the best fit value for surface density is r = 2764 mm 22 , which is significantly lower than our anticipated surface density of IgE-FceRI complexes and well below our measured gold surface density of 107 golds/mm 2 . It is likely that the peak at short radius also contains contributions from IgE-FceRI complexes organized into small oligomers as a result of exposure to crosslinking ligand. In this case, we can interpret the best fit surface density to represent the surface density of small oligomers. If we assume that the actual surface density of IgE-FceRI is well approximated by the surface density of gold labels, then we would conclude that IgE is organized into tetramers on average. It is also possible that the gold surface density over-estimates (or under-estimates) the IgE-FceRI surface density and complexes are organized into trimers (or pentamers) on average. Unfortunately, we do not explicitly Extended clusters are less apparent in reconstructed images of gold labeled IgE-FceRI complexes in cells incubated for 10 min with the larger Y46-DNP ligand ( Figure 6C). Auto-correlation functions tabulated from these images are shown in Figure 6D and can also be fit to Eqn. 2 assuming an exponential form of g(rw0) Ã g psf (r). 
In this example, extracted fit parameters cannot be determined with confidence because the size of extended structures (ξ = 11 ± 5 nm) is comparable to the extracted width of the PSF (σ = 13 ± 1 nm). We also find that the extracted surface density (ρ = 50 ± 23 μm⁻²) is much lower than the measured surface density of gold particles labeling IgE (148 μm⁻²), again suggesting the presence of small IgE-FcεRI oligomers on the cell surface. If the surface density of IgE-FcεRI complexes is well approximated by the surface density of gold particles, then we would conclude that receptor complexes are organized primarily as trimers. Unfortunately, we cannot draw quantitative conclusions since we do not have independent measurements of receptor surface density under these conditions. Our previous studies showed that Y46-DNP stimulates less cell activation than Y16-DNP, consistent with the lower amount of extended clustering of IgE-FcεRI with the former that is revealed in these images [29].
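The oligomer sizes quoted above follow from dividing the measured gold surface density by the best-fit surface density. The short sketch below reproduces that arithmetic and shows how the fit uncertainty propagates. The function name `oligomer_estimate` is hypothetical, and the calculation assumes, as the text does, that the gold surface density approximates the true IgE-FcεRI surface density.

```python
def oligomer_estimate(label_density, fit_density, fit_error=0.0):
    """Return (best, low, high) estimates of complexes per oligomer.

    Assumption (as in the text): the label density equals the true
    receptor density, so the ratio to the best-fit density gives the
    average number of complexes per small oligomer.
    Densities are in the same units (for example per square micron).
    """
    if fit_density - fit_error <= 0:
        raise ValueError("fit_density minus fit_error must be positive")
    best = label_density / fit_density
    low = label_density / (fit_density + fit_error)
    high = label_density / (fit_density - fit_error)
    return best, low, high


# Values quoted in the text: gold density, best-fit density, fit error.
y16 = oligomer_estimate(107.0, 27.0, 4.0)    # Y16-DNP treated cells
y46 = oligomer_estimate(148.0, 50.0, 23.0)   # Y46-DNP treated cells

for name, (best, low, high) in (("Y16-DNP", y16), ("Y46-DNP", y46)):
    print(f"{name}: {best:.1f} complexes per oligomer "
          f"(range {low:.1f} to {high:.1f})")
```

The much wider range for Y46-DNP reflects the larger relative error on its fitted surface density.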
In conclusion, we demonstrate that correlation functions provide an analytical tool to quantify heterogeneous distributions of labeled molecules in super-resolution experiments, even in the presence of over-counting that gives rise to the artifactual appearance of short-range clustering. We present an analytical method that predicts the magnitude of correlations arising from over-counting, and we describe a procedure to measure the apparent PSF of an image for cases when signals can be intentionally over-counted. We have validated this analysis methodology by quantifying the lateral distribution of IgE-FcεRI complexes on the surface of unstimulated RBL-2H3 cells imaged using super-resolution fluorescence localization and SEM. We detect weak clustering of IgE-FcεRI complexes when imaged on the ventral cell surface using TIRFM and super-resolution fluorescence localization methods, and these complexes appear randomly distributed when imaged on flat areas of the dorsal surface by SEM. Our interpretations of single-labeled IgE-FcεRI images are confirmed by direct measurements of cross-correlation functions in double label experiments using both imaging methods. We additionally quantify over-counting and long-range clustering in cells that have been stimulated using defined Y-DNP ligands and discuss the advantages and limitations of applying this correlation method to interpret clustered distributions of proteins. These examples emphasize the importance of explicitly considering over-counting when quantifying images of proteins in membranes, where the extent of heterogeneity may be small and subtle.

Super-resolution fluorescence localization imaging
Sample preparation. Rat Basophilic Leukemia (RBL-2H3) cells were cultured as described previously [30], then harvested using Trypsin-EDTA, and plated sparsely overnight at 37°C in glass-bottom MatTek dishes (Ashland, MA). The cells were sensitized with either A647-labeled IgE (1 μg/ml) (for single color experiments) or a mixture of A647-labeled IgE and A532-labeled IgE (1 μg/ml total) (for two color experiments) in HEPES buffered media for 1 to 2 hours at room temperature. Dishes containing cells were rinsed, incubated in media at 37°C for 5 minutes, rinsed again with warm PBS, and were then chemically fixed (4% paraformaldehyde, 0.1% glutaraldehyde in PBS) for 10 minutes at room temperature. Samples were then blocked with 2% fish gelatin, 2 mg/mL BSA in PBS for 10 minutes.

Figure 6 (caption fragment). Gold particles labeling IgE-FcεRI in Y46-DNP treated cells appear to be clustered into smaller structures, as reflected in the fit of the measured correlation function to Eqn 1, with extracted fit parameters: σ = 13 ± 1 nm, ρ = 50 ± 23 μm⁻², A = 13 ± 29, and ξ = 11 ± 5 nm, and the average surface density of gold particles labeling IgE is 148 golds/μm². Note that the errors associated with fit parameters are significantly larger in the case of Y46-DNP treated cells compared to Y16-DNP treated cells because the observed structure is of a size that is comparable to the effective PSF of the SEM measurement. doi:10.1371/journal.pone.0031457.g006
Imaging. Single label samples were imaged on an inverted microscope (Leica DM-IRB, Wetzlar, Germany) under through-objective TIRF illumination by a 100 mW 642 nm diode pumped solid state (DPSS) laser (Crystalaser, Reno, NV). Double label experiments were conducted on an inverted Olympus IX81-ZDC microscope with a cellTIRF module (Olympus America, Center Valley, PA) under through-objective TIRF illumination by either a 75 mW 642 nm DPSS laser (Coherent, Santa Clara, CA) or a 150 mW 532 nm DPSS laser (Cobolt, Stockholm, Sweden). In both cases, images were captured with an Andor iXon 897 EM-CCD camera (Belfast, UK) using custom image acquisition code written in Matlab (Mathworks, Natick, MA). To induce A647 or A532 photo-switching, cells were imaged in the presence of an oxygen-scavenging and reducing buffer containing 100 mM Tris, 10 mM NaCl, 10% w/w glucose, 500 μg/mL glucose-oxidase, 40 μg/mL catalase, and 1% β-mercaptoethanol at pH 8. Movies of A647 or A532 photo-switching were acquired at between 5 and 25 frames per second for at least 2500 frames and analyzed by localizing the centers of diffraction limited spots through least squares fitting a two dimensional Gaussian shape using the fminfunc() function in Matlab. An example image with fits is shown in Figure 7A-B. Localized centers were culled to exclude outliers in standard deviation and localization precision in an effort to remove contributions from multiple emitters and poorly fit diffraction limited spots. Culled events are not correlated in space, and statistics for a typical example are shown in Figure 7C. We find that the fit parameters width and localization precision of diffraction limited spots are normally distributed around expected values, while brightness follows a skewed distribution, as has been noted previously [32].
Localized centers were combined (grouped) in single label measurements when the same fluorophore was identified in sequential images at the same position within twice the maximum allowed localization precision of the population of fits. This grouping is done to minimize intentional over-counting of single fluorophores in single color experiments. No grouping was done in two color measurements. Reconstructed images are assembled by incrementing a pixel value once for each time that a localized signal is identified at that location. Correlation functions are tabulated from these unfiltered reconstructed images. For display purposes, reconstructed images are filtered with a Gaussian PSF as indicated in the figure captions.
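The grouping and reconstruction steps described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used for the measurements. The function names, the `(frame, x, y)` data layout, and the choice to compare each new localization with the running mean position of a group are assumptions made for the example. `max_dist` stands in for twice the maximum allowed localization precision.

```python
import math


def group_localizations(locs, max_dist):
    """Merge localizations found in consecutive frames within max_dist.

    locs: list of (frame, x, y) tuples. Returns one (x, y) mean per group.
    """
    groups = []  # each group tracks its last frame and running sums
    for frame, x, y in sorted(locs):
        match = None
        for g in groups:
            gx, gy = g["sx"] / g["n"], g["sy"] / g["n"]
            consecutive = frame - g["last"] == 1
            if consecutive and math.hypot(x - gx, y - gy) <= max_dist:
                match = g
                break
        if match is None:
            groups.append({"last": frame, "sx": x, "sy": y, "n": 1})
        else:
            match["last"] = frame
            match["sx"] += x
            match["sy"] += y
            match["n"] += 1
    return [(g["sx"] / g["n"], g["sy"] / g["n"]) for g in groups]


def reconstruct(points, pixel_size, shape):
    """Increment a pixel once for each localization that falls inside it."""
    ny, nx = shape
    image = [[0] * nx for _ in range(ny)]
    for x, y in points:
        col, row = int(x // pixel_size), int(y // pixel_size)
        if 0 <= row < ny and 0 <= col < nx:
            image[row][col] += 1
    return image


# Toy data (positions in nm): one fluorophore seen in frames 0, 1 and 2,
# plus two unrelated single events.
locs = [(0, 10.0, 10.0), (1, 12.0, 11.0), (2, 11.0, 9.0),
        (5, 80.0, 40.0), (6, 300.0, 300.0)]
grouped = group_localizations(locs, max_dist=20.0)
raw_image = reconstruct([(x, y) for _, x, y in locs], 50.0, (8, 8))
grouped_image = reconstruct(grouped, 50.0, (8, 8))
print(len(locs), "localizations grouped into", len(grouped), "events")
```

In this toy case the ungrouped image counts the first fluorophore three times in one pixel, while the grouped image counts it once.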

Scanning electron microscopy (SEM)
Sample Preparation. RBL-2H3 mast cells were grown overnight to ~50% confluency on 2 mm × 2 mm silicon chips at 37°C under standard cell culture conditions [33], and high affinity IgE receptors (FcεRI) were labeled with either A488-IgE (1 μg/mL) (for single label experiments) or a 1:1 mixture of A488-IgE and FITC-IgE (total 1 μg/mL) (for double label experiments) for 2-3 hr prior to the experiment. Cells were washed quickly in phosphate buffered saline (PBS), and immediately fixed in 4% (w/v) p-formaldehyde and 0.1% (w/v) glutaraldehyde for 10 min at room temperature in PBS. Fixed cell samples were washed in blocking solution (2 mg/mL BSA and 2% (v/v) fish gelatin in PBS) and labeled sequentially with primary antibodies and gold conjugated secondary antibodies in blocking solution. Incubations were 1 h at room temperature with wash steps in between. After labeling, the cell samples were further fixed in 4% p-formaldehyde and 1% glutaraldehyde for 5 min at room temperature, and then thoroughly washed in distilled water. Following dehydration through a series of graded ethanol washing steps, samples were critical point dried, mounted on round aluminum SEM stubs, and sputtered with carbon to prevent charging. For single label experiments the primary antibody was rabbit anti-Alexafluor 488 and the 10 nm gold conjugated secondary antibody was goat anti-rabbit IgG. For double label experiments, the primary antibodies were mouse anti-FITC and rabbit anti-Alexafluor 488, while the secondary antibodies were 5 nm gold-conjugated anti-rabbit IgG and 10 nm gold-conjugated anti-mouse IgG. Samples were labeled first with 10 nm and then 5 nm gold antibody conjugates.
Imaging. Mounted samples were imaged with a Schottky field emission Scanning Electron Microscope (LEO 1550) at 20 keV. The dorsal (top) surfaces of intact, adherent cells were imaged using secondary electron detection (SED) and backscattered detection (BSD) at high magnification. Flat membrane regions were selected for imaging. For imaging 10 nm gold particles, individual micrographs were obtained at 35 K magnification, and typical images cover 2.4 μm² of the cell surface. For imaging 5 nm gold particles and in double-label experiments with 10 and 5 nm gold particles, micrographs were obtained at 75 K-100 K magnification. Immuno-gold labeled protein distributions for ≥10 different cells and ≥2 individual experiments were obtained for all experimental conditions presented. Gold particle centers were localized by finding the weighted centroid of identified particles using automated image processing software written in Matlab. Correlation functions were tabulated from these binary images of gold centers. Reconstructed images are formed by convolving an image of the particle centers with a Gaussian shape with half-width given by the gold particle radius.
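As an illustration of the centroid step, the sketch below computes an intensity-weighted centroid for one identified particle and marks the pixel that contains it in a binary image. It is written in Python for illustration only; the function names and the `(x, y, intensity)` input layout are hypothetical, and the particle segmentation that precedes this step is not shown.

```python
def weighted_centroid(pixels):
    """Intensity-weighted centroid of one particle.

    pixels: list of (x, y, intensity) tuples belonging to the particle.
    """
    total = sum(w for _, _, w in pixels)
    if total <= 0:
        raise ValueError("particle has no positive intensity")
    cx = sum(x * w for x, _, w in pixels) / total
    cy = sum(y * w for _, y, w in pixels) / total
    return cx, cy


def binary_center_image(centers, shape):
    """Binary image with a 1 in every pixel that contains a center."""
    ny, nx = shape
    image = [[0] * nx for _ in range(ny)]
    for cx, cy in centers:
        col, row = int(cx), int(cy)
        if 0 <= row < ny and 0 <= col < nx:
            image[row][col] = 1
    return image


# Toy particle covering four pixels, brighter on its right-hand side.
particle = [(10, 20, 1.0), (11, 20, 3.0), (10, 21, 1.0), (11, 21, 3.0)]
center = weighted_centroid(particle)
centers_image = binary_center_image([center], (32, 32))
print("centroid:", center)
```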

Calculation of correlation functions
Pair auto-correlation functions were tabulated in Matlab using Fast Fourier Transforms (FFTs) as follows:

$$g(\vec{r}) = \frac{FFT^{-1}\left(|FFT(I)|^2\right)}{\rho^2 N(\vec{r})}$$

where $FFT^{-1}$ is an inverse Fast Fourier Transform, $\rho$ is the average surface density of the image, and $N(\vec{r})$ is a normalization that accounts for the finite size of the acquired image. In the case of super-resolution fluorescence localization measurements, I is the unfiltered reconstructed image of localized probes, generated as described above. For SEM measurements, I is a binary image of localized gold particle centers. In either case, the image I is padded with zeros in both directions out to a distance larger than the range of the desired correlation function (maximally the size of the original image) to avoid artifacts due to the periodic nature of FFT functions. The normalization factor $N(\vec{r})$ is the autocorrelation of a window function W that has the value of 1 inside the measurement area, and is also padded by an equal number of zeros:

$$N(\vec{r}) = FFT^{-1}\left(|FFT(W)|^2\right)$$
This normalization is essentially the total squared area over which the correlation function is calculated, accounting for the fact that there are fewer possible pairs separated by large distances due to the finite image size. When calculating correlation functions from reconstructed super-resolution fluorescence localization images, the cell interior was first masked, and this mask was then used as the window function W. The choice of the window function can impact the tabulated correlation function, and efforts were made to exclude regions of the cell periphery or regions with noticeable membrane topology. Under these conditions, the measured correlation functions do not depend strongly on the mask used.

Pair cross-correlation functions were computed using two images. In super-resolution fluorescence localization measurements, one image was reconstructed from localized Alexa 647 fluorophores ($I_1$), while the second image was reconstructed from localized Alexa 532 fluorophores ($I_2$). In SEM measurements, one image was reconstructed from the locations of 5 nm gold particle centers ($I_1$) and the second image was reconstructed from locations of 10 nm gold particle centers ($I_2$). The cross-correlation function was tabulated as:

$$c(\vec{r}) = Re\left\{\frac{FFT^{-1}\left(FFT(I_1) \times conj[FFT(I_2)]\right)}{\rho_1 \rho_2 N(\vec{r})}\right\}$$

Here conj[] indicates a complex conjugate, $\rho_1$ and $\rho_2$ are the average surface densities of images $I_1$ and $I_2$ respectively, and Re{} indicates the real part. This computation method of tabulating pair auto and cross-correlations is mathematically identical to brute force averaging methods. Correlation functions were angularly averaged by first converting to polar coordinates using the Matlab command cart2pol(), and then binning by radius. g(r) values are obtained by averaging $g(\vec{r})$ values that correspond to the assigned bins in radius. Errors in g(r) are dominated by counting statistics.
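For readers who want to check an FFT implementation, the brute force averaging mentioned above can be written directly. The sketch below is plain Python, not the Matlab function supplied as File S1. It sums products of pixel values at each displacement and divides by the product of the surface densities and the window autocorrelation N, which is the quantity a zero-padded FFT tabulates. The function names and the nested-list image layout are assumptions made for the example, and the window is taken to be binary.

```python
import math


def pair_correlation(img1, img2, window, max_lag):
    """Direct-sum pair correlation on a pixel grid.

    Returns {(dx, dy): value} with
    value = sum_x I1(x + d) I2(x) / (rho1 * rho2 * N(d)),
    where N(d) = sum_x W(x + d) W(x) is the window autocorrelation.
    Pass the same image twice to obtain an auto-correlation.
    """
    ny, nx = len(window), len(window[0])
    area = sum(sum(row) for row in window)  # pixels inside the window
    rho1 = sum(img1[y][x] * window[y][x]
               for y in range(ny) for x in range(nx)) / area
    rho2 = sum(img2[y][x] * window[y][x]
               for y in range(ny) for x in range(nx)) / area
    corr = {}
    for dy in range(-max_lag, max_lag + 1):
        for dx in range(-max_lag, max_lag + 1):
            num = 0
            norm = 0
            # Only pixel pairs that both lie inside the image contribute.
            for y in range(max(0, -dy), min(ny, ny - dy)):
                for x in range(max(0, -dx), min(nx, nx - dx)):
                    w = window[y][x] * window[y + dy][x + dx]
                    norm += w
                    num += w * img1[y + dy][x + dx] * img2[y][x]
            if norm > 0:
                corr[(dx, dy)] = num / (rho1 * rho2 * norm)
    return corr


def radial_average(corr, bin_width=1.0):
    """Average correlation values in radial bins, skipping zero lag."""
    sums, counts = {}, {}
    for (dx, dy), value in corr.items():
        r = math.hypot(dx, dy)
        if r == 0:
            continue  # zero lag carries the self-pair (delta function) term
        b = int(r // bin_width)
        sums[b] = sums.get(b, 0.0) + value
        counts[b] = counts.get(b, 0) + 1
    return [((b + 0.5) * bin_width, sums[b] / counts[b]) for b in sorted(sums)]


# Toy example: a 12 x 12 image containing two pairs of counts that are
# two pixels apart, plus one isolated count.
size = 12
window = [[1] * size for _ in range(size)]
image = [[0] * size for _ in range(size)]
for row, col in [(2, 3), (2, 5), (7, 8), (7, 10), (10, 1)]:
    image[row][col] += 1

g = pair_correlation(image, image, window, max_lag=4)
for radius, value in radial_average(g):
    print(f"r ~ {radius:.1f} px: g = {value:.2f}")
```

The direct sum scales poorly with image size, which is the practical reason to prefer FFTs for full-size images; the sketch is meant only as a small reference calculation.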

Calculation of modified Ripley's K functions
The statistical significance of clustering can also be determined using Ripley's K function, which measures the increased density of particles within a circle of radius r and is related to the pair correlation function through integration:

$$K(r) = 2\pi \int_0^{r} g(r')\, r'\, dr'$$

(see Figure 1E).

Derivation of equations to estimate over-counting in pair auto-correlation functions
Below, we provide a detailed mathematical derivation of the equations used to analyze pair auto-correlation functions throughout the Results and Discussion section. First, we describe how to calculate a pair auto-correlation function of a collection of point particles. We then expand this to describe how this correlation function is modified when point particles are replaced by molecules that are sampled stochastically with finite resolution. We then take an expectation value of this stochastic autocorrelation function to obtain the equations used in the main text.
Consider a set of N point-like molecules at positions $\vec{r}_i$ for $1 \le i \le N$ with average surface density $\rho = N/A$, where A is the total area. The density of molecules as a function of $\vec{r}$ is given by $\rho(\vec{r}) = \sum_i \delta(\vec{r} - \vec{r}_i)$, where $\delta(\vec{r} - \vec{r}_i)$ is a delta function at position $\vec{r}_i$. The exact correlation function of these molecules is given by:

$$g(\vec{r}) = \frac{1}{\rho^2 A}\int \rho(\vec{R})\,\rho(\vec{R} - \vec{r})\,d\vec{R} = \frac{A}{N^2}\sum_{i,j}\delta(\vec{r} - \vec{r}_i + \vec{r}_j) = \frac{1}{\rho}\delta(\vec{r}) + g(\vec{r} > 0)$$

where in the last step we have defined $g(\vec{r} > 0)$ as the correlation function with only those terms where $i \ne j$. Note that this correlation function is normalized to 1 at spatial infinity, as defined in previous sections.

Now consider stochastically building this correlation function by taking repeated measurements of individual molecule positions with finite resolution. Such a measurement is stochastic in two respects. First, measurements stochastically sample the normalized effective point spread function $PSF(\vec{r})$. More rigorously, a particle located at position $\vec{r}$ will be measured at $\vec{r}'$ with a probability given by $P(\vec{r}' | \vec{r}) = PSF(\vec{r}' - \vec{r})$. Second, the number of times that any given molecule is counted is itself stochastic. In this initial derivation we assume that individual measurements are uncorrelated, so that the number of times each molecule is sampled is governed by a Poisson distribution. When this assumption is valid, each measurement is taken independently from the distribution:

$$P_{meas}(\vec{r}' | [\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N]) = \frac{1}{N}\sum_i PSF(\vec{r}' - \vec{r}_i)$$

where N molecules are located at positions $\vec{r}_i$ as described above. After making M of these measurements, the average measurement density is given by $\rho_{meas} = M/A$ and we can construct a measured correlation function:

$$g_{meas}(\vec{r}) = \frac{A}{M^2}\sum_{k,l}\delta(\vec{r} - \vec{r}'_k + \vec{r}'_l)$$

In this equation, k and l sum over measurements, and not molecules. This $g_{meas}(\vec{r})$ is stochastic even for a fixed positioning of underlying molecules, but we can relate its expectation value $\langle g_{meas}(\vec{r}) \rangle$ to the bare correlation function, $g(\vec{r})$, by averaging over the possible measurements of particle positions.
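The measurement model just described can be simulated in a few lines. In the sketch below, each measurement picks one molecule uniformly at random and reports its position blurred by the PSF, so the number of times a given molecule is counted is approximately Poisson distributed when N is large. The Gaussian form of the PSF, the function name `sample_measurements`, and all numerical values are assumptions chosen for illustration.

```python
import random
from collections import Counter


def sample_measurements(molecules, n_meas, sigma, rng):
    """Draw n_meas measurements from P_meas = (1/N) sum_i PSF(r' - r_i).

    molecules: list of (x, y) true positions.
    sigma: standard deviation of the (assumed Gaussian) PSF per axis.
    Returns a list of (molecule_index, x_measured, y_measured).
    """
    measurements = []
    for _ in range(n_meas):
        idx = rng.randrange(len(molecules))  # each molecule equally likely
        x, y = molecules[idx]
        measurements.append((idx, rng.gauss(x, sigma), rng.gauss(y, sigma)))
    return measurements


rng = random.Random(0)
side = 1000.0   # side length of a square field in nm (illustrative value)
n_mol = 200     # N, number of molecules
n_meas = 600    # M, number of measurements
molecules = [(rng.uniform(0, side), rng.uniform(0, side))
             for _ in range(n_mol)]
measurements = sample_measurements(molecules, n_meas, sigma=10.0, rng=rng)

# Number of times each molecule was counted (molecules never seen are absent).
counts = Counter(idx for idx, _, _ in measurements)
print("mean counts per molecule:", n_meas / n_mol)
print("molecules never counted:", n_mol - len(counts))
```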
Using the above assumptions for the probability distribution of each measurement, we calculate the expected value of $g_{meas}(\vec{r})$ as follows:

$$\begin{aligned}
\langle g_{meas}(\vec{r}) \rangle &= \left\langle \frac{A}{M^2}\sum_{k,l}\delta(\vec{r} - \vec{r}'_k + \vec{r}'_l) \right\rangle = \frac{1}{\rho_{meas}}\delta(\vec{r}) + \frac{A}{M^2}\left\langle \sum_{k \ne l}\delta(\vec{r} - \vec{r}'_k + \vec{r}'_l) \right\rangle \\
&= \frac{1}{\rho_{meas}}\delta(\vec{r}) + A\int P_{meas}(\vec{R})\,P_{meas}(\vec{R} - \vec{r})\,d\vec{R}
\end{aligned}$$

In the first line we have separated out terms where $k = l$ and removed them from the expectation value. In the next line we note that each term appearing in the expectation value where $k \ne l$ is proportional to the correlation function of the probability distribution of a single measurement with itself. Properly this term should be multiplied by a pre-factor of $(M^2 - M)/M^2$ since we have removed terms where $k = l$, but we replace this with 1 in the limit where $M \gg 1$. If we re-write the probability distribution in terms of the actual molecule positions $\vec{r}_i$ in accordance with our form for $P_{meas}(\vec{r})$, this expression becomes:

$$\langle g_{meas}(\vec{r}) \rangle = \frac{1}{\rho_{meas}}\delta(\vec{r}) + \frac{A}{N^2}\sum_{i,j}\int PSF(\vec{R})\,PSF(\vec{r}_i - \vec{r}_j + \vec{R} - \vec{r})\,d\vec{R}$$

Using the definition of a convolution in two dimensions (denoted with a *) and defining $g_{PSF}(\vec{r})$ to be the correlation function of the point spread function with itself, $g_{PSF}(\vec{r}) \equiv \int PSF(\vec{R})\,PSF(\vec{R} - \vec{r})\,d\vec{R}$, the expectation value for the measured correlation function can be written as:

$$\langle g_{meas}(\vec{r}) \rangle = \frac{1}{\rho_{meas}}\delta(\vec{r}) + \frac{1}{\rho}\,g_{PSF}(\vec{r}) + g_{PSF}(\vec{r}) * g(\vec{r} > 0)$$

The only term in the above expression with a dependence on the density of measurements, $\rho_{meas}$, is the delta function centered at $\vec{r} = 0$, which arises from terms where $k = l$. This contribution is easily disregarded since it does not contribute to any values of $\langle g_{meas}(\vec{r} > 0) \rangle$. In contrast, we cannot easily distinguish the contribution that arises from duplicate measurements of the same molecule from measurements from distinct molecules. This happens for two reasons. First, we have no way of knowing whether two independent measurements ($k \ne l$) came from the same molecule ($i = j$). Second, the delta function that arises from including $i = j$ terms in $g(\vec{r})$ is spread over a PSF in $\langle g_{meas}(\vec{r}) \rangle$ so that it becomes $\frac{1}{\rho}\,g_{PSF}(\vec{r})$. This term extends to finite radius and can no longer be easily distinguished from terms coming from the convolution of the point-spread function with $g(\vec{r} > 0)$.
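To give a sense of scale for the over-counting term, the sketch below evaluates the expected measured correlation function for randomly distributed molecules, where $g(\vec{r} > 0) = 1$. It assumes a Gaussian PSF with standard deviation σ, for which the PSF autocorrelation is a normalized Gaussian of variance 2σ². The function names and the example values (σ = 10 nm, 100 molecules per μm²) are illustrative choices, not values taken from the measurements.

```python
import math


def g_psf(r, sigma):
    """Autocorrelation of a normalized 2D Gaussian PSF (std sigma)."""
    return math.exp(-r * r / (4.0 * sigma ** 2)) / (4.0 * math.pi * sigma ** 2)


def g_meas_random(r, sigma, rho):
    """Expected measured correlation for r > 0 when g(r > 0) = 1.

    rho is the surface density of molecules in inverse squared units of r
    and sigma. The number of measurements does not appear.
    """
    return 1.0 + g_psf(r, sigma) / rho


sigma_nm = 10.0
rho_per_nm2 = 100.0 * 1e-6   # 100 molecules per square micron in nm^-2

for r_nm in (0.0, 10.0, 20.0, 40.0, 80.0):
    value = g_meas_random(r_nm, sigma_nm, rho_per_nm2)
    print(f"r = {r_nm:5.1f} nm: <g_meas> = {value:.2f}")
```

Halving the surface density doubles the excess above 1, while counting each molecule more often changes nothing, in line with the expression derived above.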

Modifications for cases where sampling of labeled molecules is not well approximated by a Poisson distribution
In the following section, we briefly discuss how these derivations would have to be modified if our assumption that each measurement is independent fails. In general, given a distribution, $P_n$, for the number of times, n, that each individual molecule is measured over the course of an experiment, we expect to observe:

$$\langle g_{meas}(\vec{r} > 0) \rangle = \frac{\langle n^2 \rangle_{P_n} - \langle n \rangle_{P_n}}{\rho\,\langle n \rangle_{P_n}^2}\,g_{PSF}(\vec{r}) + g_{PSF}(\vec{r}) * g(\vec{r} > 0)$$

where $\langle\,\rangle_{P_n}$ denotes the expectation value under the probability distribution $P_n$. In a Poisson distribution $\langle n^2 \rangle_{P_n} - \langle n \rangle_{P_n} = \langle n \rangle_{P_n}^2$, so that this equation reduces to the case derived in the text where we assumed that each measurement is independent. For cases where a subset of labeled molecules is sampled more frequently than expected from a Poisson distribution, $\langle n^2 \rangle_{P_n} - \langle n \rangle_{P_n} > \langle n \rangle_{P_n}^2$, and the amplitude of the $g_{PSF}(\vec{r})$ term of the measured correlation function will be greater than expected based on the measured surface density of labeled molecules. In contrast, when labeled molecules are sampled less frequently than expected from a Poisson distribution, $\langle n^2 \rangle_{P_n} - \langle n \rangle_{P_n} < \langle n \rangle_{P_n}^2$, and the amplitude of the $g_{PSF}(\vec{r})$ term of the measured correlation function will be smaller than expected based on the measured surface density of labeled molecules. If each particle is measured exactly zero or one time then $\langle n^2 \rangle_{P_n} = \langle n \rangle_{P_n}$, and the measured correlation function becomes:

$$\langle g_{meas}(\vec{r} > 0) \rangle = g_{PSF}(\vec{r}) * g(\vec{r} > 0)$$

In this case, there is no longer any apparent clustering in $g_{meas}(\vec{r} > 0)$ due to over-counting.
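The pre-factor that multiplies $g_{PSF}(\vec{r})/\rho$ can be evaluated for any counting distribution. The sketch below computes $(\langle n^2 \rangle - \langle n \rangle)/\langle n \rangle^2$ for four cases: Poisson counting, counting each molecule at most once, counting every molecule exactly three times, and a geometric distribution. The geometric case is included only as an illustrative example of counting that is more bursty than Poisson. All function names are hypothetical.

```python
import math


def overcount_factor(pmf):
    """Return (<n^2> - <n>) / <n>^2 for a dict {n: probability}."""
    mean = sum(n * p for n, p in pmf.items())
    mean_sq = sum(n * n * p for n, p in pmf.items())
    if mean <= 0:
        raise ValueError("distribution must have a positive mean")
    return (mean_sq - mean) / mean ** 2


def poisson_pmf(mean, n_max=100):
    """Poisson probabilities for n = 0..n_max (tail truncated)."""
    pmf = {0: math.exp(-mean)}
    for n in range(1, n_max + 1):
        pmf[n] = pmf[n - 1] * mean / n
    return pmf


def geometric_pmf(p, n_max=200):
    """P(n) = p (1 - p)^n for n = 0..n_max (tail truncated)."""
    return {n: p * (1.0 - p) ** n for n in range(n_max + 1)}


cases = {
    "Poisson, mean 3": poisson_pmf(3.0),
    "zero or one count": {0: 0.7, 1: 0.3},
    "exactly three counts": {3: 1.0},
    "geometric, p = 0.5": geometric_pmf(0.5),
}
for name, pmf in cases.items():
    print(f"{name}: factor = {overcount_factor(pmf):.3f}")
```

A factor of 1 recovers the Poisson result, a factor of 0 removes the over-counting term entirely, and values above 1 indicate that the $g_{PSF}(\vec{r})$ amplitude exceeds what the surface density alone would predict.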

Modifications for measured cross-correlation functions
In this section, we briefly demonstrate important differences between measured pair auto-correlation functions and pair cross-correlation functions. An analogous calculation to the pair auto-correlation function described previously can be carried out for the pair cross-correlation function of two signals, $c(\vec{r})$. Given two distinguishable molecular types each located with centers at positions $\vec{r}_{1i}$ and $\vec{r}_{2j}$ with $1 \le i \le N_1$ and $1 \le j \le N_2$, the cross-correlation is defined by:

$$c(\vec{r}) = \frac{A}{N_1 N_2}\sum_{i,j}\delta(\vec{r} - \vec{r}_{1i} + \vec{r}_{2j}) = c(\vec{r} > 0)$$

Note that the last equality stresses that there is no delta function contribution at the origin ($\vec{r} = 0$). This is because i and j sum over different sets of distinguishable molecules, and therefore terms where $i = j$ do not represent cases where the same molecule is being detected by different signals. We note that this is only the case when a labeling scheme is employed that eliminates the possibility that two distinguishable probes label the same molecule. Carrying through an analogous calculation to the one previously described for $\langle g_{meas}(\vec{r}) \rangle$ yields:

$$\langle c_{meas}(\vec{r}) \rangle = c_{PSF}(\vec{r}) * c(\vec{r} > 0)$$

We use $c(\vec{r} > 0)$ rather than $c(\vec{r})$ to stress that there are no artifacts due to over-counting. The cross-correlation function of the distinguishable effective point spread functions is given by:

$$c_{PSF}(\vec{r}) \equiv \int PSF_1(\vec{R})\,PSF_2(\vec{R} - \vec{r})\,d\vec{R}$$

We note that $c_{PSF}(\vec{r})$ may differ from $g_{PSF}(\vec{r})$ for each individual effective point spread function.
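As a worked illustration of how $c_{PSF}(\vec{r})$ can differ from $g_{PSF}(\vec{r})$, suppose both effective point spread functions are Gaussian with widths $\sigma_1$ and $\sigma_2$. This Gaussian form is an assumption made for the example. The cross-correlation of two Gaussians is again a Gaussian whose variance is the sum of the two variances:

```latex
% Assumed Gaussian effective PSFs for the two channels (k = 1, 2)
PSF_k(\vec{r}) = \frac{1}{2\pi\sigma_k^2}\exp\left\{-\frac{r^2}{2\sigma_k^2}\right\}

% Cross-correlation of the two PSFs: variances add
c_{PSF}(\vec{r}) = \int PSF_1(\vec{R})\,PSF_2(\vec{R}-\vec{r})\,d\vec{R}
                 = \frac{1}{2\pi(\sigma_1^2+\sigma_2^2)}
                   \exp\left\{-\frac{r^2}{2(\sigma_1^2+\sigma_2^2)}\right\}

% Special case sigma_1 = sigma_2 = sigma recovers the auto-correlation
g_{PSF}(\vec{r}) = \frac{1}{4\pi\sigma^2}\exp\left\{-\frac{r^2}{4\sigma^2}\right\}
```

Under this assumption, $c_{PSF}(\vec{r})$ is broader than the $g_{PSF}(\vec{r})$ of the sharper channel and narrower than that of the blurrier channel, so the two channels need not share the same resolution for the cross-correlation to be interpreted.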

Supporting Information
File S1 A Matlab function to tabulate correlation functions from a two dimensional image. To use, rename file as get_autocorr.m and call within a Matlab function, script, or at the command line. This function has been used successfully in Matlab version 2010a. Further information on function usage can be found within the file. (M)