STED Nanoscopy with Time-Gated Detection: Theoretical and Experimental Aspects

In a stimulated emission depletion (STED) microscope the region in which fluorescence markers can emit spontaneously shrinks with continued STED beam action after a singular excitation event. This fact has been recently used to substantially improve the effective spatial resolution in STED nanoscopy using time-gated detection, pulsed excitation and continuous wave (CW) STED beams. We present a theoretical framework and experimental data that characterize the time evolution of the effective point-spread-function of a STED microscope and illustrate the physical basis, the benefits, and the limitations of time-gated detection both for CW and pulsed STED lasers. While gating hardly improves the effective resolution in the all-pulsed modality, in the CW-STED modality gating strongly suppresses low spatial frequencies in the image. Gated CW-STED nanoscopy is in essence limited (only) by the reduction of the signal that is associated with gating. Time-gated detection also reduces/suppresses the influence of local variations of the fluorescence lifetime on STED microscopy resolution.


Introduction
Far-field fluorescence microscopy is a powerful imaging tool for investigating (living) cells due to its non-invasive access to the cellular interior, the specific and sensitive detection of cellular features through fluorescence tagging, and the simple sample preparation. However, many features are too small to be discerned with standard light microscopy, whose spatial resolution is curtailed by diffraction to 200-350 nm [1].
Stimulated emission depletion (STED) microscopy [2,3] overcame the diffraction barrier and increased the spatial resolution of fluorescence microscopy for the first time by a large factor; in principle it can reach resolution at the molecular scale. For this purpose, STED microscopy (or nanoscopy) uses stimulated emission to inhibit fluorescence emission at predefined sample coordinates such that adjacent features emit sequentially in time. Similarly, other fluorescence inhibition processes may be used to overcome the diffraction barrier [4], such as the shelving into metastable dark states in ground-state depletion (GSD) nanoscopy [5,6] or the use of photoswitchable fluorescence markers in the generalized concept called RESOLFT [4,7,8]. The strategy of modulating the fluorescence emission of neighbouring features has also been exploited in more recent far-field fluorescence nanoscopy approaches [9,10,11,12,13] that perform on-off fluorescence switching molecule by molecule randomly in space. Meanwhile, STED nanoscopy has addressed many questions in biology [14,15,16,17] and its implementation has become simple [18,19,20]. STED currently also provides the fastest subdiffraction resolution recordings [21].
In a typical STED microscopy implementation, a laser beam inducing stimulated emission and featuring at least one zero-intensity point is overlaid with a regularly focused excitation beam. Thus, the STED beam inhibits fluorescence emission everywhere but at the zero-intensity points. A common design is a doughnut-shaped focal intensity pattern of the STED beam. If the intensity of the STED beam at the doughnut crest, $I_{STED}$, strongly exceeds the value $I_s$ at which half the fluorescence is suppressed, the effective fluorescence signal is confined to subdiffraction dimensions. Scanning the co-aligned excitation and STED beams through the sample yields the final subdiffraction resolution image, whereby the resolution can be adjusted by the intensity of the STED beam.
STED nanoscopy can be implemented with both continuous wave (CW) [18,22] and pulsed lasers [23]. The latter modality relies on synchronized trains of excitation and STED pulses with the pulses of the STED beam reaching the focal plane simultaneously or right after the excitation pulses, but within a fraction of the lifetime of the fluorescent state [23]. Pulses of the order of 0.1-1 ns suppress undesired polarization effects [24,25], jitter in pulse timing [26], multi-photon excitation [27], and photo-bleaching [28].
Although CW STED beams simplify the implementation of STED nanoscopy, the less efficient spatial confinement of fluorescence associated with this method is disadvantageous. Unlike in the pulsed mode, where all the photons (of the STED pulse) act shortly after the excitation event, in the CW implementation, the instantaneous STED intensity is typically lower, and so is the instantaneous probability, i.e. the rate, of stimulated de-excitation. A non-negligible part of the molecules still emits fluorescence because they have not been exposed to enough de-exciting photons. Such fluorescence is particularly prevalent right at the slopes of the zero-intensity point of the STED beam where the STED beam is weaker thus contributing to blur [26]. In other words, the suppression of fluorescence strongly depends on the number of STED photons to which the molecule is exposed while residing in the excited state.
It has been known that in a pulsed STED scheme the fluorescence photons should be detected right after the STED pulse has left [24,29]. This has also been shown in an experiment combining time-correlated single-photon counting and pulsed STED nanoscopy [30]. Likewise, the generalization of the STED principles to other optical transitions between two distinct states has shown that for constantly acting (CW) beams, the obtainable effective resolution scales with the duration of the action of the beam [4]. Thus, by applying pulsed excitation and time-gated detection, the residual fluorescence produced at the slopes of the CW STED beam can be suppressed by detecting fluorescence only from molecules that have been exposed to the beam for a duration $> T_g$ after excitation [31,32]. In fact, recent experiments have established gated CW-STED microscopy as a simple but powerful approach to observe the cellular nanoscale, including that of living cells [32]. However, despite its recent popularity, many physical aspects of this recording mode, as well as its benefits and limitations, have not been elucidated.
In this paper, we therefore develop a theoretical framework describing the time evolution of the effective point-spread-function (PSF) of a STED nanoscope, thus quantifying the resolution obtained by time-gated detection. Together with experimental data, this framework provides a comprehensive view of the performance of gated STED nanoscopy in both the all-pulsed (P-STED) and the CW-STED (pulsed excitation, CW STED) modalities, especially with respect to the choice of the time-gated detection window and the reduction of the signal-to-noise or signal-to-background ratio. While hardly any improvement is expected for time-gated P-STED, the time-gated detection not only increases the effective spatial resolution of CW-STED nanoscopy but also reduces adverse effects of fluorescence lifetime heterogeneities in the sample.

Sample Preparation
We used ~40 nm large fluorescent beads (Crimson beads, Invitrogen, Carlsbad, CA; excitation and emission maxima at 625 nm and 645 nm, respectively) and ~35 nm large fluorescent nano-diamonds (FNDs) [33] (Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, Taiwan; excitation and emission maxima at 560 nm and 700 nm, respectively) for the experimental characterization of the effective point-spread-function (PSF) of our STED nanoscope. A dilute dispersion of the fluorescent beads was prepared by drop casting a solution of the beads on a poly-L-lysine (Sigma, Saint Louis, MO) coated glass coverslip and mounting it with Mowiol (Sigma-Aldrich, Taufkirchen, Germany). The FND sample was prepared by spin-coating the particles in poly(vinyl-alcohol) (PVA) on a microscope cover glass.
The mammalian PtK2 cell line was grown as described previously [34]. Cells were seeded on standard glass coverslips to a confluence of about 80% and fixed with ice-cold methanol (−20 °C) for 4 min followed by an incubation in blocking buffer (PBS containing 1% BSA). Microtubules were stained using an immunofluorescence labelling protocol [35] involving a primary antibody (anti-β-tubulin mouse IgG (monoclonal), Sigma) and a secondary antibody (sheep anti-mouse IgG, Dianova, Hamburg, Germany) labelled with the organic dye ATTO647N (Atto-Tec, Siegen, Germany) or KK114 [36]. All antibodies were diluted in blocking buffer and incubated for 1 h each, followed by several washing steps in blocking buffer. Mounting was again performed with Mowiol.

STED Nanoscope
Our STED nanoscope setup [32] featured a 532 nm (PicoTA, PicoQuant, Berlin, Germany) and a 635 nm (LDH-D-C-635, PicoQuant) pulsed diode laser for excitation and a Ti:Sapphire laser (Mira900, Coherent, Santa Clara, CA) for STED, which was tuned to 740 or 760 nm and operated either in the CW or in the mode-locked pulsed mode with a repetition rate of 76 MHz. The STED light was guided through two glass rods and coupled into a 120 m long polarization-maintaining single-mode fiber (AMS Technology, München, Germany), which in the pulsed modality stretched the pulse width to ~250 ps. In the pulsed STED modality, the excitation diode lasers were synchronized to the STED laser by a home-built electronic delay unit. In the CW modality, the repetition rate of the pulsed excitation lasers was tuned to 40 or 80 MHz, depending on the application. The doughnut-like intensity distribution of the STED light was created by introducing a polymeric phase plate (RPC Photonics, Rochester, NY) applying a helical phase ramp of $\exp(i\varphi)$, with $0 < \varphi < 2\pi$, in the STED beam, which was then imaged into the back aperture of a 1.4 NA objective lens (HCX PL APO, 100×/1.40, oil, Leica, Wetzlar, Germany). Excitation and STED beams were aligned on the same optical axis using custom-made dichroic mirrors (AHF Analysentechnik, Tübingen, Germany). The fluorescence was detected through the same objective lens, filtered with appropriate bandpass filters to reject laser scattering and imaged onto a multimode optical fibre with an opening of about the size of an Airy disc of the imaged excitation PSF. The fibre was attached to a single-photon-counting module (id100-MMF50, id Quantique, Carouge, Switzerland) and connected to a time-correlated single-photon-counting board (SPC-730, Becker & Hickl GmbH, Berlin, Germany). The image acquisition was performed by scanning the sample with a 3D piezo stage (NanoMax TS 3-axis, Thorlabs GmbH Europe, Dachau, Germany).
The STED and confocal reference images were recorded simultaneously on a line-by-line basis by opening and closing a mechanical shutter in the STED beam.

Intensity and Power Measurements
Both for the STED and the excitation light we indicate the average power $P$ measured at the back aperture of the objective. Due to losses in the objective, the power at the sample is actually lower by 30% and 25% at 760 nm and 740 nm, respectively. The average STED intensity at the doughnut crest can be estimated by $I_{STED} = k P_{STED}/A_{STED}$, with $A_{STED}$ denoting the STED focal area of a nearly diffraction-limited light spot; $k = 0.3$ is a scaling factor correcting for the doughnut-shaped intensity distribution. We determined $A_{STED} \approx \pi (FWHM_{STED}/2)^2$ from the diameter $FWHM_{STED}$ of a regularly focused (nearly Gaussian) spot. The value of $FWHM_{STED} \approx 350$ nm was measured from a scattering gold bead of sub-diffraction diameter (80 nm gold colloid, En.GC80, BBinternational, Cardiff, UK) in a non-confocal mode. For the P-STED modality the transient (or peak) intensity during a rectangular pulse of duration $T_{STED}$ is given by $I^*_{STED} = I_{STED}/(f T_{STED})$, with $f$ being the repetition rate of the laser.
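The intensity estimate above is plain arithmetic and can be sketched in a few lines. The snippet below is only an illustration, not the authors' analysis code: the function names and the optional `transmission` argument are hypothetical, and the powers passed in are example inputs.

```python
import math

def sted_crest_intensity(p_sted_w, fwhm_m=350e-9, k=0.3, transmission=1.0):
    """Average STED intensity at the doughnut crest, I_STED = k * P_STED / A_STED, in W/m^2.

    transmission is a hypothetical knob for objective losses (1.0 = use P as given).
    """
    area = math.pi * (fwhm_m / 2.0) ** 2  # A_STED ~ pi * (FWHM_STED / 2)^2
    return k * p_sted_w * transmission / area

def transient_intensity(i_avg, rep_rate_hz, pulse_width_s):
    """Peak intensity of a rectangular pulse, I* = I / (f * T_STED)."""
    return i_avg / (rep_rate_hz * pulse_width_s)

W_PER_M2_TO_MW_PER_CM2 = 1e-10  # 1 MW/cm^2 = 1e10 W/m^2

i_cw = sted_crest_intensity(0.250)            # example: 250 mW average power
i_pulsed_avg = sted_crest_intensity(0.050)    # example: 50 mW average power
i_pulsed_peak = transient_intensity(i_pulsed_avg, 76e6, 250e-12)

print(f"CW crest intensity: {i_cw * W_PER_M2_TO_MW_PER_CM2:.0f} MW/cm^2")
print(f"Pulsed peak/average ratio: {i_pulsed_peak / i_pulsed_avg:.1f}")
```

Setting `transmission` to 0.70 or 0.75 would refer the intensity to the sample plane instead of the back aperture; which of the two the formula is meant to use is not stated explicitly in the text.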

Lifetime and Signal-to-noise Ratio Analysis
We performed fluorescence lifetime recordings and analysis using time-correlated single-photon-counting (TCSPC) and a maximum-likelihood estimation method with a Poissonian assumption of the error distribution [37]. The fitting to the experimental TCSPC data included a multi-exponential decay with $i$ components, $\sum_i a_i \exp(-t/\tau_i)$, and a convolution with the instrument response function. For each of the $i$ components, $\tau_i$ represents the decay time; $a_i$ is a photon-weighted amplitude. Consequently, each component $i$ contributes a fraction $c_i = a_i \tau_i / \sum_j (a_j \tau_j)$ to the fluorescence signal, and the mean decay time (intensity-weighted average lifetime) is given by $\langle\tau\rangle = \sum_i c_i \tau_i$. The instrument response function was measured on a purely scattering sample.
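As a small worked example of the intensity-weighted average lifetime defined above, the following sketch computes the fractions $c_i$ and the mean decay time from a set of amplitudes and decay times. The function name and the two-component input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def intensity_weighted_lifetime(amplitudes, lifetimes):
    """Return (fractions c_i, mean decay time <tau>) for a multi-exponential decay."""
    weights = [a * tau for a, tau in zip(amplitudes, lifetimes)]  # a_i * tau_i
    total = sum(weights)
    fractions = [w / total for w in weights]                      # c_i
    mean_tau = sum(c * tau for c, tau in zip(fractions, lifetimes))
    return fractions, mean_tau

# Two components with equal amplitudes and decay times of 1 ns and 3 ns.
fractions, mean_tau = intensity_weighted_lifetime([0.5, 0.5], [1.0, 3.0])
print(fractions, mean_tau)
```

With equal amplitudes the longer component still carries three quarters of the detected photons, which is why the intensity-weighted mean lies closer to the long decay time than the plain average would.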
To quantify the signal-to-noise ratio (SNR) and signal-to-background ratio (SBR) of the experimental images, we defined the peak SNR (PSNR) and peak SBR (PSBR). The PSNR and PSBR represent the SNR and SBR in the brightest part of the recorded images. With $g(i)$ giving the photon count rate recorded (gated or un-gated) at the pixel $i$ of an image (i.e. the number of counts detected per pixel dwell-time at pixel $i$), the PSNR is defined as [Equation (1)], where $f_b$ is the uncorrelated background count rate, $p_t$ the pixel dwell-time, and $\Delta T/T = (T - T_g)/T$ the time-gated fraction of the pulse period $T$. $f_b$ was directly estimated from the late time-bins of the TCSPC histogram (the histogram of the photon arrival times): a photon registered in a late time-bin has most likely been generated by an uncorrelated background source. Following the same notation we defined the PSBR as [Equation (2)].

Theory
The main equations governing the theory of time-gated detection for STED nanoscopy have been reported [31,32]. Starting from the temporal evolution of the fluorescence signal under stimulated emission, the spread of coordinates where fluorescence is allowed and hence registered, i.e. the effective point-spread-function (E-PSF) of a time-gated STED nanoscope, has been derived. However, the gain in resolution in a gated STED nanoscope over time has not been explicitly quantified yet.
To this end, we analyze the characteristics of the E-PSF as a function of the time of action of the CW-STED beam. We shall denote this time-dependent E-PSF as tE-PSF throughout the manuscript.

Fluorescence Signal Under Stimulated Emission
At first we derive the temporal dynamics of the fluorescence signal under stimulated emission. We make several assumptions: (i) The fluorescent marker is described by a simple two-level model consisting of a ground state $S_0$ and a first excited electronic state $S_1$; dark states and vibrational sub-states are neglected. (ii) Excitation from $S_0$ to $S_1$ by the STED light is neglected as well. (iii) The fluorophores are initially in their $S_1$ state due to a brief excitation pulse. We also assume that the time period $T = 1/f$ between two pulses is longer than the excited-state lifetime $\tau$ of the markers, i.e., all markers have relaxed to $S_0$ before the arrival of the next excitation pulse; hence, the conditions at the beginning of every excitation cycle are the same. (iv) Spontaneous $S_1 \rightarrow S_0$ de-excitation takes place with a rate constant $k_{S1} = 1/\tau$, and fluorescence photons are emitted with a quantum yield $q_{fl}$, i.e., with a rate $k_{fl} = k_{S1} q_{fl}$. (v) We develop our calculations from the pulsed STED modality assuming rectangular STED pulses with a temporal width $T_{STED}$, and generalize to the CW-STED case by setting $T_{STED} = T$. (vi) The rate of stimulated emission during the pulse is given by $k_{STED} = \sigma'_{STED} I^*_{STED}$, with $\sigma'_{STED} = \sigma_{STED} \lambda_{STED}/(hc)$ being the stimulated emission cross section divided by the photon energy ($\lambda_{STED}$ is the wavelength of the STED light, $hc = 1.99 \cdot 10^{-25}$ J m is the product of Planck's constant and the velocity of light) and $I^*_{STED} = I_{STED} T/T_{STED}$ the transient STED intensity. $I_{STED}$ denotes the time-averaged STED intensity derived from the directly measurable average power of the beam. (vii) We define a transient saturation intensity $I_s^*$ as the transient STED intensity at which $k_{S1} = k_{STED}$, i.e. $I_s^* = k_{S1}/\sigma'_{STED}$, yielding a transient suppression or saturation factor $\zeta^* = I^*_{STED}/I_s^* = k_{STED}/k_{S1}$. (viii) STED experiments usually apply a circular polarization of the STED light; therefore, we neglect orientation or rotation characteristics of the fluorophore, which can decrease the efficiency of stimulated emission [24,25].
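The fluorescence signal is proportional to the relative population $P_{S1}$ of the first excited state $S_1$, whose change over time $t$ can be expressed by the rate equation
$$\frac{dP_{S1}}{dt} = -k_{S1} P_{S1} - k_{STED} P_{S1}. \tag{3}$$
With $P_{S1}(0) = 1$, the fluorescence emission rate at a time $t$ after excitation is
$$F(\zeta^*, t) = \begin{cases} k_{fl} \exp(-k_{S1} t)\, \exp(-\zeta^* k_{S1} t), & 0 \le t \le T_{STED} \\ k_{fl} \exp(-k_{S1} t)\, \exp(-\zeta^* k_{S1} T_{STED}), & t > T_{STED}. \end{cases} \tag{4}$$
The first and second exponentials describe the spontaneous decay and the action of the STED beam, respectively.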
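The two-level model can be cross-checked numerically. The sketch below evaluates the closed-form solution of the rate equation for a rectangular STED pulse (STED rate $\zeta^*/\tau$ during $0 \le t \le T_{STED}$, zero afterwards) and compares it with a simple explicit Euler integration. Function names and parameter values are illustrative assumptions, not taken from the authors' code.

```python
import math

def emission_rate(t, tau, zeta_star, t_sted, q_fl=1.0):
    """Fluorescence emission rate at time t after excitation (two-level model).

    The STED beam acts with rate zeta_star / tau for 0 <= t <= t_sted and is off afterwards.
    """
    k_s1 = 1.0 / tau
    exposure = min(t, t_sted)  # time spent under the STED beam so far
    return q_fl * k_s1 * math.exp(-k_s1 * t) * math.exp(-zeta_star * k_s1 * exposure)

def euler_population(t_end, tau, zeta_star, t_sted, steps=200_000):
    """Integrate dP/dt = -(k_S1 + k_STED(t)) * P with explicit Euler, starting from P(0) = 1."""
    dt = t_end / steps
    p = 1.0
    for n in range(steps):
        t = n * dt
        k_sted = zeta_star / tau if t < t_sted else 0.0
        p -= (1.0 / tau + k_sted) * p * dt
    return p

tau, zeta_star, t_sted = 3.4, 4.8, 1.0  # ns; illustrative values
analytic = emission_rate(2.0, tau, zeta_star, t_sted) * tau  # equals P_S1(2 ns) for q_fl = 1
numeric = euler_population(2.0, tau, zeta_star, t_sted)
print(analytic, numeric)
```

For these inputs the exponent is $-(2.0 + 4.8 \cdot 1.0)/3.4 = -2$, so both numbers should be close to $e^{-2}$.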

Time Evolution of the E-PSF (tE-PSF)
In previous theoretical work on STED, the fluorescence signal was usually integrated over time and the temporal evolution discarded. Here we consider the temporal evolution of the fluorescence signal under stimulated emission, calculating the tE-PSF. At low excitation intensities (no saturation of the excited state) the spatial modulation of the probability to excite a fluorophore follows the excitation intensity profile. Hence, the tE-PSF of a STED microscope, $h(t,r)$, can be derived as the product of the excitation intensity profile $h_{exc}(r)$, the probability of fluorescence emission $F(\zeta^*(r),t)$ given in Equation (4), and the detection efficiency profile $h_{det}(r)$. For an analytical description of the tE-PSF we approximate the product $h_c = h_{exc} h_{det}$ by a Gaussian distribution with a full-width-at-half-maximum (FWHM) $d_c$ and an amplitude equalling unity, i.e., $h_c(r) = \exp(-4\ln 2\, r^2/d_c^2)$. In the vicinity of the zero-intensity point ($r = 0$), the STED intensity profile of the doughnut can be approximated by a parabola, $I_{STED}(r) \approx 4 I_{STED} a^2 r^2$, with the intensity $I_{STED}$ at the doughnut crest and a constant $a$ that depends on the shape of the doughnut minimum [38]. Using Equation (4), the tE-PSF for $0 \le t \le T_{STED}$ is
$$h(t,r) = \exp(-4\ln 2\, r^2/d_c^2)\, \exp(-t/\tau)\, \exp(-4 \zeta^* a^2 r^2 t/\tau), \tag{5}$$
where $\zeta^*$ denotes the transient saturation factor at the doughnut crest. For $t > T_{STED}$,
$$h(t,r) = \exp(-4\ln 2\, r^2/d_c^2)\, \exp(-t/\tau)\, \exp(-4 \zeta^* a^2 r^2 T_{STED}/\tau), \tag{6}$$
i.e. the spatial shape ($r$-dependence) of the tE-PSF no longer changes after the STED pulse. We normalized the tE-PSF to unity in the focal centre ($r = 0$) at $t = 0$. Examples of the simulated tE-PSF $h(t,r)$ for P- and CW-STED are shown in Figure 1. The tE-PSF is sharpened over the time of the STED action, accompanied by a decrease of the amplitude that accounts for the spontaneous decay (Equations (5) and (6)). Following Equations (5) and (6), the tE-PSF $h(t,r)$ can be approximated by a Gaussian with amplitude
$$h(t,0) = \exp(-t/\tau) \tag{7}$$
and a time-dependent FWHM
$$d(t) = \frac{d_c}{\sqrt{1 + d_c^2 a^2 \zeta^* \min(t, T_{STED})/(\tau \ln 2)}}. \tag{8}$$
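The sharpening of the tE-PSF over time can be illustrated with a short sketch. It assumes exactly the ingredients named above (Gaussian $h_c$ with FWHM $d_c$, parabolic doughnut $4 I_{STED} a^2 r^2$, two-level decay) and multiplies them out; the resulting profile is Gaussian with a FWHM that shrinks while the STED beam acts. The numerical values of $d_c$ and $a$ are hypothetical, and the function names are not from the original work.

```python
import math

def tepsf(t, r, d_c, a, zeta_star, tau, t_sted):
    """tE-PSF h(t, r), normalized to 1 at r = 0, t = 0 (Gaussian h_c, parabolic doughnut)."""
    exposure = min(t, t_sted)  # the STED beam only acts during the pulse
    confocal = math.exp(-4.0 * math.log(2.0) * r**2 / d_c**2)
    depletion = math.exp(-4.0 * zeta_star * a**2 * r**2 * exposure / tau)
    return confocal * math.exp(-t / tau) * depletion

def tepsf_fwhm(t, d_c, a, zeta_star, tau, t_sted):
    """FWHM of the Gaussian tE-PSF at time t."""
    exposure = min(t, t_sted)
    return d_c / math.sqrt(1.0 + d_c**2 * a**2 * zeta_star * exposure / (tau * math.log(2.0)))

# Hypothetical parameters: lengths in nm, times in ns; t_sted = 12.5 ns mimics CW-STED at 80 MHz.
d_c, a, tau, zeta_star, t_sted = 240.0, 2.0e-3, 3.4, 4.8, 12.5
times = (0.0, 1.0, 3.0, 6.0)
fwhms = [tepsf_fwhm(t, d_c, a, zeta_star, tau, t_sted) for t in times]
for t, d in zip(times, fwhms):
    print(f"t = {t:4.1f} ns  FWHM = {d:6.1f} nm  amplitude = {math.exp(-t / tau):.3f}")
```

The printed table shows the trade-off discussed in the text: later tE-PSFs are narrower but carry less signal.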

E-PSF for Gated Detection
Following the time evolution of the tE-PSF $h(t,r)$, the observation/detection spatial range is reduced by detecting fluorescence at a later point of time $t$, i.e., by performing a time-gated detection. Here, fluorescence is detected only after a time $t = T_g$ from the excitation pulse.
For the P-STED (i.e. all-pulsed) modality with time-gated detection (gP-STED) one usually chooses $T_g \ge T_{STED}$. In this case, the E-PSF is given by
$$h_{gP}(r) = \int_{T_g}^{T} h(t,r)\, dt \approx \tau \exp(-T_g/\tau)\, \exp(-4\ln 2\, r^2/d_c^2)\, \exp(-4 \zeta^* a^2 r^2 T_{STED}/\tau), \tag{9}$$
where for the right-hand side we have assumed $T \gg \tau$, i.e. $\exp(-T/\tau)$ negligible. By simple computations we obtain
$$FWHM_{gP} = \frac{d_c}{\sqrt{1 + d_c^2 a^2 \zeta}}, \tag{10}$$
where $\zeta = \zeta^* T_{STED}/(\tau \ln 2) = (T_{STED} I^*_{STED})\, \sigma'_{STED}/\ln 2$ is the saturation factor for P-STED. It is usually defined as $\zeta = I_{STED}/I_s$, with the saturation intensity $I_s$ being the average intensity at which half the recorded spontaneous emission is suppressed [38]. Obviously, $I_s$ and thus $\zeta$ depend on the spectroscopic properties of the molecules and the illumination timing. It is, however, important to note that $\zeta$ and thus the FWHM in gated P-STED depend only on the pulse energy ($T_{STED} I^*_{STED}$) and the cross section, not on the fluorescence lifetime. We further note that in the absence of a time gate, or in the unusual case of setting $T_g < T_{STED}$, i.e. for gate delays shorter than the pulse width of the STED laser, the expression becomes more complex. However, for lifetimes $\tau \gg T_{STED}$ (which is usually the case) the integral across the pulse duration, $T_g < t \le T_{STED}$, can be neglected in Equation (9) and, to a good approximation, Equation (10) remains valid. This means that, provided the pulse duration is short with respect to the fluorescence lifetime, the FWHM of classical P-STED also does not effectively depend on the lifetime of the fluorophore.
In the case of CW-STED with pulsed excitation and time-gated detection (gSTED) the E-PSF $h_{gCW}(r)$ is given by
$$h_{gCW}(r) = \int_{T_g}^{T} h(t,r)\, dt \approx \tau \exp(-T_g/\tau)\, \exp(-4\ln 2\, r^2/d_c^2)\, \frac{\exp(-4 \zeta^* a^2 r^2 T_g/\tau)}{1 + 4 \zeta^* a^2 r^2}, \tag{11}$$
where we have assumed $T \gg \tau$, i.e. $\exp(-T/\tau)$ negligible. In the absence of gating ($T_g = 0$), the E-PSF has the Lorentzian shape known for the original CW-STED implementation featuring CW excitation [26]. When a gated detection scheme is introduced ($T_g > 0$), the E-PSF consists of a Gaussian term (due to the suppression by the STED light before the detection) and a Lorentzian term (because the remaining excited molecules are then imaged under the same conditions as in the original CW-STED implementation). A good approximation for the FWHM of Equation (11) is found by replacing the Lorentzian term in Equation (11) with a Gaussian term with the same FWHM (see Supplementary Text S1). We then have
$$FWHM_{gCW}(T_g) \approx \frac{d_c}{\sqrt{1 + d_c^2 a^2 \zeta^* \left(1 + T_g/(\tau \ln 2)\right)}}. \tag{12}$$
Note that $FWHM_{gCW}(T_g = 0)$ is the FWHM of the Lorentzian E-PSF of the original CW-STED implementation. Unlike for the well-implemented gP-STED, the FWHM of the gCW-STED implementation depends on the fluorescence lifetime $\tau$. Despite the approximation made when deriving Equation (12), values of $FWHM_{gCW}(T_g)$ calculated using Equation (12) are similar to those obtained from a rigorous model, which calculates the intensity profiles of the excitation and STED intensities based on Fourier diffraction theory [39] (inset Figure 2B). A sharpening of the observation/detection area in the sample is observed with increasing time delay $T_g$ of the time-gated detection, concomitant with a decrease in signal (or amplitude). Notably, the reduction of the FWHM is accompanied by a strong reduction of the pedestal (or Lorentzian tail) of the E-PSF. We note that for very large $T_g$, Equation (12) slightly underestimates the FWHM of the E-PSF (inset Figure 2B), since for $\tau = 3.4$ ns and $T = 1/(80\ \mathrm{MHz})$ the fluorophores still have a non-negligible probability to be in the excited state before the next excitation pulse arrives, i.e. $h(T,0)$ is not completely zero as assumed.
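The quality of the Gaussian-for-Lorentzian replacement can be checked numerically. The sketch below assumes that the gated E-PSF is the time integral of the tE-PSF from $T_g$ onwards with $T \gg \tau$, a Gaussian confocal term of FWHM $d_c$ and a parabolic doughnut, which yields a Gaussian term times a Lorentzian term. It compares the FWHM of that profile (found by bisection) with the closed-form approximation. All parameter values and function names are hypothetical.

```python
import math

LN2 = math.log(2.0)

def gcw_epsf(r, t_g, d_c, a, zeta_star, tau):
    """Gated CW-STED E-PSF (Gaussian times Lorentzian), normalized to 1 at r = 0."""
    x = 4.0 * zeta_star * a**2 * r**2
    return math.exp(-4.0 * LN2 * r**2 / d_c**2) * math.exp(-x * t_g / tau) / (1.0 + x)

def gcw_fwhm_approx(t_g, d_c, a, zeta_star, tau):
    """Closed-form FWHM after replacing the Lorentzian by a Gaussian of equal FWHM."""
    return d_c / math.sqrt(1.0 + d_c**2 * a**2 * zeta_star * (1.0 + t_g / (tau * LN2)))

def gcw_fwhm_numeric(t_g, d_c, a, zeta_star, tau):
    """FWHM of gcw_epsf found by bisection on the half-maximum radius."""
    lo, hi = 0.0, d_c  # the profile decreases monotonically with r and is < 0.5 at r = d_c
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gcw_epsf(mid, t_g, d_c, a, zeta_star, tau) > 0.5:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo

# Hypothetical parameters: lengths in nm, times in ns.
d_c, a, zeta_star, tau = 240.0, 2.0e-3, 4.8, 3.4
results = []
for t_g in (0.0, 1.5, 3.0, 6.0):
    approx = gcw_fwhm_approx(t_g, d_c, a, zeta_star, tau)
    exact = gcw_fwhm_numeric(t_g, d_c, a, zeta_star, tau)
    results.append((t_g, approx, exact))
    print(f"T_g = {t_g:3.1f} ns  approx = {approx:6.1f} nm  numeric = {exact:6.1f} nm")
```

The approximation is exact in the two limits where either the Gaussian or the Lorentzian term dominates and deviates by a few percent in between; the deviation shrinks as $T_g$ grows because the Gaussian term gains weight.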
The E-PSF of the time-gated detection can also be regarded as a weighted sum of different Gaussian distributions with decreasing FWHM and decreasing weights, represented by the tE-PSF. Collecting the photons after a time delay $T_g$ from the excitation pulse, i.e., performing time-gated detection, removes the early tE-PSFs characterized by a larger FWHM and therefore improves the effective resolution at the expense of a loss in overall signal, as outlined in Figures 1 and 2.
For the pulsed STED implementation, our theoretical framework (Figure 2) reveals that collecting the photons immediately after the STED action ($T_g = T_{STED}$) produces the sharpest E-PSF, as expected. The use of a time delay $T_g$ larger than the STED pulse width $T_{STED}$ only reduces the brightness without further reducing the FWHM of the E-PSF. Furthermore, if the pulse width of the STED beam $T_{STED}$ is short compared to the excited-state lifetime $\tau$ ($T_{STED}/\tau \ll 1$), which is usually the case, the impact of time-gating is negligible. In our calculations, we have assumed the same average intensity at the doughnut crest $I_{STED}$ for the P-STED and the CW-STED implementation. The much larger transient intensity $I^*_{STED} = I_{STED} T/T_{STED}$ of the P-STED modality results in a much larger transient saturation factor, $\zeta^* = 200$ for the P-STED compared to $\zeta^* = 4.8$ for the CW-STED recordings, and thus by default in a much more confined E-PSF for the P-STED implementation. However, increasing the time delay $T_g$ of the gCW-STED recordings results in a convergence of the two E-PSFs (Figure 2B). This is not surprising, since the gCW-STED implementation can be viewed as a pulsed implementation whereby the virtual pulse is the exposure time during the gate.
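The two transient saturation factors quoted above are linked by the duty cycle of the pulsed beam, because the same average intensity is compressed into the pulse. As a worked check, assuming a pulse period $T = 1/(80\ \mathrm{MHz}) = 12.5$ ns and a pulse width $T_{STED} = 300$ ps (both values appear elsewhere in the text; that they were the ones used for this calculation is an assumption):

```latex
\zeta^*_{\mathrm{P}}
  = \zeta^*_{\mathrm{CW}} \, \frac{T}{T_{STED}}
  = 4.8 \times \frac{12.5\ \mathrm{ns}}{0.3\ \mathrm{ns}}
  \approx 200
```

In the CW case $T_{STED} = T$, so the transient and the time-averaged intensities coincide and no such factor appears.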

Optical Transfer Function (OTF)
The optical transfer function (OTF) is the Fourier transform of the E-PSF in space, meaning that large spatial frequencies (above the noise level) yield features with high spatial resolution. Figures 2C, D compare the OTFs and thus the spatial frequencies transmitted by the imaging modality for different time gates $T_g$. The increase in effective spatial resolution is obviously not realized by elevating the transmission of large frequencies per se. Rather, the transmission of lower frequencies is damped by the gating process, thereby increasing the relative contributions of the larger frequencies. We therefore refer to the improvement in image contrast as an increase in 'effective resolution'. Similarly to the E-PSF, the OTF can also be regarded as a weighted sum of different OTFs with increasing bandwidth but decreasing strengths, represented by temporal OTFs (tOTFs, which are the spatial Fourier transforms of the respective tE-PSFs). Introducing the gated detection removes the contributions of the early tOTFs, which are mainly characterized by low spatial frequencies.
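For a Gaussian approximation of the E-PSF the OTF is available in closed form, which makes the statement about relative frequency content easy to illustrate. The sketch below uses the standard Fourier transform of a Gaussian, normalized to 1 at zero frequency; the two FWHM values are hypothetical examples, and the real E-PSFs of gated CW-STED are only approximately Gaussian.

```python
import math

def gaussian_otf(k, fwhm):
    """Normalized OTF (Fourier transform) of a Gaussian PSF exp(-4 ln2 r^2 / fwhm^2).

    k is the spatial frequency in cycles per unit length; the value at k = 0 is 1.
    """
    return math.exp(-(math.pi * fwhm * k) ** 2 / (4.0 * math.log(2.0)))

k = 1.0 / 100.0  # cycles per nm, i.e. a 100 nm period
otf_ungated = gaussian_otf(k, 165.0)  # hypothetical FWHM without gating, in nm
otf_gated = gaussian_otf(k, 120.0)    # hypothetical FWHM with gating, in nm
print(f"relative transmission at 1/(100 nm): {otf_ungated:.2e} vs {otf_gated:.2e}")
```

Because each curve is normalized to its own zero-frequency value, the comparison shows only the relative weight of high frequencies. The absolute transmission, which decreases with gating, is not captured by this normalization.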

Signal-to-noise (SNR) and -Background (SBR) Ratio
As highlighted in Figures 1 and 2, the signal amplitude $h(t,0)$ of the non-normalized tE-PSF degrades with time. In terms of the OTF, the frequencies boosted by the gated detection might thus be masked by noise (Figure 2C, D). Following our previous considerations, a time delay $T_g$ decreases the fluorescence signal at the peak of the E-PSF, $h_{gCW/gP}(0)$, by a factor $\exp(-T_g/\tau)$, while any uncorrelated background signal is just reduced in proportion to the width $\Delta T = (T - T_g)$ of the detection gate. We defined uncorrelated background as background that is uncorrelated with the pulsed excitation. Important sources of such background are ambient light, fluorescence excited by the STED laser (Anti-Stokes excitation, AStEx) [40] and scattering from the STED beam. Following this definition, background from the excitation light or AStEx background in P-STED [40] is not uncorrelated.
Because all of our images are recorded with a photon counting module, we can assume shot noise as the major source of noise. The shot noise of fluorescence detection scales with the square root of the detected signal. Consequently, the signal-to-noise ratio (SNR) decreases with $T_g$ as [Equation (13)], where $b_u$ is the relative signal level of the uncorrelated background without time-gating. Obviously, the SNR decreases strongly for large gate delays $T_g \gg \tau$, simply because all fluorophores will have decayed by then. The proposed gated detection approach can also be limited by the reduction of the signal-to-background ratio (SBR) and thus by the reduction of contrast. In the case of uncorrelated background due to the aforementioned ambient light, STED beam scattering and AStEx fluorescence, we define a signal-to-background ratio (SBR), which decreases with $T_g$ as [Equation (14)]. As for the SNR, the SBR degrades for increasing $T_g$, and hence the improvement of the effective resolution has to be weighed against the reduced SBR and SNR.
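The trade-off can be made concrete with a minimal shot-noise model. The sketch below is an assumption-laden illustration and not necessarily identical to the authors' Equations (13) and (14): it takes the peak signal to fall as $\exp(-T_g/\tau)$ (valid for $T \gg \tau$), the uncorrelated background to scale with the gate width $(T - T_g)/T$, and the noise to be the Poisson noise of all detected counts. The function name and all numbers are hypothetical.

```python
import math

def gated_snr_sbr(t_g, tau, period, n0, b_u):
    """Peak SNR and SBR for a detection gate [t_g, period] under shot-noise-limited counting.

    n0  : detected signal counts without gating (assumes period >> tau)
    b_u : uncorrelated background relative to the signal without gating
    """
    signal = n0 * math.exp(-t_g / tau)               # fluorescence surviving the gate delay
    background = n0 * b_u * (period - t_g) / period  # scales with the gate width
    snr = signal / math.sqrt(signal + background)    # Poisson noise of all detected counts
    sbr = signal / background
    return snr, sbr

tau, period, n0, b_u = 3.4, 25.0, 100.0, 0.05  # ns, ns, counts, relative background
table = [(t_g, *gated_snr_sbr(t_g, tau, period, n0, b_u)) for t_g in (0.0, 1.5, 3.0, 6.0)]
for t_g, snr, sbr in table:
    print(f"T_g = {t_g:3.1f} ns  SNR = {snr:5.2f}  SBR = {sbr:5.2f}")
```

In this model the signal drops exponentially while the background drops only linearly with $T_g$, so both ratios fall as the gate is delayed.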
Uncorrelated background signal can certainly be reduced; for example, we have recently presented a digital lock-in method to remove the AStEx background [40]. This method can readily be introduced in g-STED nanoscopy as well. Furthermore, for experiments applying low repetition rates, $T \gg \tau$, where late detection will be dominated by the uncorrelated background, one may improve the SNR and SBR by detecting only until a time $T_{end} < T$ before the next pulse, i.e., by reducing the detection window to $\Delta T = T_{end} - T_g$.
On the other hand, gating significantly reduces correlated background contributions due to scattering of the pulsed excitation or pulsed STED light and due to unspecific background fluorescence of very short lifetime, since these contributions only appear for short times $t$. For example, gated detection has often been applied to increase the SNR/SBR of single-molecule detection experiments [41]. Therefore, it is usually helpful to at least set a time gate $T_g$ at the end of the excitation pulse.

gCW-STED Imaging: Comparison to Theory
Our experimental data fully confirm the theoretical considerations regarding the tE-PSF. We first imaged densely packed ~40 nm large fluorescent crimson beads and ~35 nm sized fluorescent nano-diamonds (FNDs) to experimentally demonstrate the characteristics of time-gated detection for CW-STED (Figure 3A, B). As expected, a clear rise in effective resolution is obtained by adding the STED light, which is further improved by introducing the gated detection, all in all allowing a much clearer separation of adjacent features. The gCW-STED images of the point-like objects (FNDs and beads) reveal their nominal size (35-40 nm), which indicates that we have reached a spatial resolution of 35 nm or below. These imaging conditions have been reached at comparatively low average CW-STED powers of $P_{STED} = 250$ mW for the FNDs and 230 mW for the crimson beads. Note that similar improvements in image resolution on the crimson beads have been obtained previously by the use of a similar STED laser system running in pulsed mode ($T = 1/(80\ \mathrm{MHz})$ and $T_{STED} \approx 300$ ps) with an average power $P_{STED} = 50$ mW, i.e., with ~10 times higher peak STED intensity $I^*_{STED}$ [38]. Notably, the effective resolution of the gCW-STED images continuously improves with increasing time delay $T_g$ (Supplementary Movie S1, Supplementary Movie S2) up to the limit imposed by the degradation of the SNR and SBR (Equations (13) and (14)).
The fluorescent beads are very bright objects and the uncorrelated backgrounds, including the dark-count background of the detector (a few Hz), are negligible in their images. Therefore, the SBR reduction with increasing time delay does not represent a substantial problem. The major source of degradation introduced by time gating is the increase of photon-counting noise; consequently, with an optimized pixel dwell-time, the images start degrading only for time delays $T_g > 6$ ns, which are long compared to the lifetime $\tau \sim 3$ ns of the fluorescent beads (Figure 3A and Supplementary Figure S1).
In contrast, the FNDs are less bright objects, and these samples show a relatively high level of uncorrelated background due to the scattering of the STED beam (6 kHz). Consequently, even if the SNR of the CW-STED image is high ($T_g = 0$, PSNR ~ 25), the SBR reduces faster and the images degrade already significantly for time delays $T_g > 18$ ns, which are not long relative to the FNDs' lifetime $\tau \sim 15$-20 ns (Figure 3B). Since this background is caused by scattering, increasing the pixel dwell-time will not improve the SBR.
We next checked our theoretical considerations on the performance of the gCW-STED nanoscope for the imaging of biological samples. Figure 3C shows a gCW-STED image of the microtubule network of fixed mammalian PtK2 cells immunostained with the organic dye ATTO647N. Structural details of this network could be visualized much better in the gCW-STED than in the CW-STED or confocal images. Supplementary Movie S3 shows the evolution of the effective resolution with increasing time delay $T_g$. A substantial improvement in effective resolution is already obtained at rather low values $T_g \sim 1.5$ ns. The quality of the gCW-STED images, however, degrades for time delays $T_g > 3$ ns, which are relatively short compared to the label's lifetime $\tau \sim 3$ ns (Supplementary Figure S2). This is mainly due to the uncorrelated background induced by the STED beam (AStEx), which for large $T_g$ dominates over the desired signal.
Importantly, the SBR reduction due to the AStEx or scattering signal induced by the STED laser can be compensated for by using a lock-in detection method able to subtract such uncorrelated background signals [40]. However, one has to keep in mind that the lock-in system is also limited by noise [40].

gCW-STED Imaging: Lifetime Dependence
In the usual gCW-STED image recording scheme, the STED intensity $I_{STED}$ is fixed and (apart from strong local optical bias due to polarization [42] or aberration effects) the optical parameters, namely $d_c$ and $a$, do not change during the recording. Assuming a constant cross section of stimulated emission $\sigma'_{STED}$ of the fluorescence label, the effective resolution, i.e. the FWHM of the observation volume, scales as $1/\sqrt{1 + T_g/(\tau \ln 2)}$, i.e., with the ratio $T_g/\tau$ (Equation (12)). Therefore, the time delay $T_g$ of the gated detection has to be adapted for fluorophores with different fluorescence lifetimes to reach similar effective resolution. For example, the time-correlated single-photon counting (TCSPC) data recorded for the single fluorescent beads showed a single exponential decay with an average lifetime $\tau \sim 3$ ns, while the FNDs showed multi-exponential decays with intensity-weighted average lifetimes $\langle\tau\rangle$ between 5 and 25 ns. For an optimized gCW-STED performance, we have therefore adjusted $T_g = 1.5$ ns and $T_g = 6$ ns for the beads and the FNDs, respectively (Figure 3).
On the other hand, τ depends very sensitively on the local molecular environment of the fluorophore and may thus vary over the sample. We have chosen the FNDs as an example to outline the implication of such variation on the performance of (g)CW-STED. As mentioned above and reported previously [33,43,44], the average lifetimes ⟨τ⟩ of the FNDs are heterogeneous, varying from 5 up to 25 ns (see also Supplementary Figure S3). With a constant STED intensity and a fixed gate T_g, the heterogeneity in lifetimes resulted in a variation of the E-PSF from one FND to the next. Figure 4 shows the correlation between the lifetime and the FWHM of the E-PSF determined from the gCW-STED images of different single isolated FNDs. Without gating, i.e. for CW-STED, shorter lifetimes result in a less confined E-PSF. Time gating reduces this dependence: while the FWHM of the E-PSFs of the pure CW-STED recordings decreases by a factor of two for lifetime values from 5 to 25 ns, this factor is only 1.2 for the gCW-STED images with T_g = 5 ns and even less for T_g = 10 ns. This directly follows from Equation (12), since the dependence on τ becomes very weak for gCW-STED recordings with T_g > τ. Consequently, at the expense of losing signal from short-lifetime emitters, gating reduces bias due to lifetime heterogeneities of the recorded fluorophores.
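The signal cost of a fixed gate for emitters of different lifetime can be estimated with a simple sketch. It assumes an emitter sitting at the zero of the STED intensity with a mono-exponential decay, so that the fraction of its photons arriving after the gate is exp(−T_g/τ). This is an idealization (the FNDs decay multi-exponentially), and `retained_fraction` is a hypothetical helper name.

```python
import math


def retained_fraction(t_gate_ns: float, tau_ns: float) -> float:
    """Fraction of photons detected after T_g for a mono-exponential decay.

    Assumption: undisturbed decay with lifetime tau (emitter at the STED zero).
    """
    if t_gate_ns < 0 or tau_ns <= 0:
        raise ValueError("need T_g >= 0 and tau > 0")
    return math.exp(-t_gate_ns / tau_ns)


# Lifetime extremes reported for the FNDs and the two gates used in the text.
for tau_ns in (5.0, 25.0):
    for t_gate_ns in (5.0, 10.0):
        kept = retained_fraction(t_gate_ns, tau_ns)
        print(f"tau = {tau_ns:4.1f} ns, T_g = {t_gate_ns:4.1f} ns: "
              f"{100 * kept:.0f}% of the signal retained")
```

Under these assumptions the short-lifetime emitters lose a much larger share of their photons than the long-lifetime ones for the same gate.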
We also observed a change in fluorescence lifetime for the dye ATTO647N when applied for immunolabeling. Attachment of several ATTO647N dyes to an antibody shortens the average fluorescence lifetime due to concentration-quenching of these labels [45]. Figure 5A shows representative fluorescence lifetime decays determined from our ATTO647N-antibody labelled microtubules (compare Figure 3C). During the first image scan, the lifetime of ATTO647N (averaged over all pixels) can only be described by a two-exponential decay with an intensity-weighted lifetime of ⟨τ⟩ = 1.9 ns and a significant (70%) contribution of a short lifetime component of τ = 0.3 ns. This is significantly different from the lifetime of ATTO647N in aqueous solution, where its excited state decays mono-exponentially with a lifetime of 3.4 ns. Interestingly, the lifetime of ATTO647N in the immunolabeled samples increased for subsequent image scans to a final value of τ = 3.3 ns with an increasingly mono-exponential decay. Most probably, the continuous photobleaching of an increasing number of the dye molecules with each scan reduces not only the total signal but also the concentration-quenching. While this change influences consecutive CW-STED recordings, gCW-STED recordings are less affected, as illustrated in Figure 5B. The effective resolution (expressed as the FWHM values determined from intensity line profiles across single microtubules) was rather low for the initial CW-STED recordings and increased with successive image scans. In contrast, the image scan number has hardly any influence on the effective resolution for gCW-STED. This directly follows from Equation (12), where for CW-STED (T_g = 0) an increase in τ immediately translates into a reduction of the FWHM and thus an improvement of the effective resolution; this change in FWHM or effective resolution with τ is significantly reduced for T_g > 0. Notably, with an increasing number of imaging runs the effective resolution of the CW-STED recordings approaches the effective resolution expected from theory (grey-dotted line).
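The intensity-weighted lifetimes quoted for the first two scans can be cross-checked from the two-exponential fit components listed in the caption of the lifetime-decay figure. The sketch below uses the standard definition ⟨τ⟩ = Σ a_i τ_i² / Σ a_i τ_i and assumes that the quoted percentages are the pre-exponential amplitude fractions a_i. That reading is an assumption; it is chosen because it approximately reproduces the quoted values (about 1.8 ns and 2.6 ns, compared with the quoted 1.9 ns and 2.6 ns). The function name is hypothetical.

```python
from typing import Sequence


def intensity_weighted_lifetime(amplitudes: Sequence[float],
                                lifetimes_ns: Sequence[float]) -> float:
    """Intensity-weighted mean lifetime of a multi-exponential decay.

    <tau> = sum(a_i * tau_i**2) / sum(a_i * tau_i), with a_i the
    pre-exponential amplitudes (assumed here to be the quoted percentages).
    """
    if len(amplitudes) != len(lifetimes_ns) or not amplitudes:
        raise ValueError("need one amplitude per lifetime component")
    numerator = sum(a * tau ** 2 for a, tau in zip(amplitudes, lifetimes_ns))
    denominator = sum(a * tau for a, tau in zip(amplitudes, lifetimes_ns))
    return numerator / denominator


# Fit components as quoted for the first and second image scan.
first_scan = intensity_weighted_lifetime([0.30, 0.70], [2.3, 0.3])
second_scan = intensity_weighted_lifetime([0.60, 0.40], [2.8, 0.35])

print(f"1st scan: <tau> = {first_scan:.2f} ns")
print(f"2nd scan: <tau> = {second_scan:.2f} ns")
```

A mono-exponential decay is the special case of a single component, for which the function returns the lifetime itself.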
We note that there are other ways to minimize concentration-quenching, for example by optimizing/minimizing the labelling degree (which, however, has to be weighed against a reduction of the overall brightness of the labelled structures) or by using dye labels that show less self-quenching (as for the dye KK114 shown in Figure 6A).

[Figure caption fragment, lifetime decays (compare Figure 5A)] ... Figure 3C. The data can be described by a two-exponential decay for the first two scans (1st: τ_1 = 2.3 ns (30%) and τ_2 = 0.3 ns (70%) with ⟨τ⟩ = 1.9 ns; 2nd: τ_1 = 2.8 ns (60%) and τ_2 = 0.35 ns (40%) with ⟨τ⟩ = 2.6 ns) and a mono-exponential decay for the 3rd and 4th scan (3. ...

gP-STED: Gating with Pulsed STED Lasers
For STED recordings using pulsed excitation and pulsed STED light (gP-STED), theory predicts an increase in image contrast for gated detection with a delay time that is smaller than or equal to the pulse width of the STED laser, T_g ≤ T_STED, but no further increase for T_g > T_STED (Figure 1). Therefore, it has been well accepted that in a pulsed STED scheme, the photons should ideally be detected right after the STED pulse [24,29,30]. However, in most cases the pulse width of the STED laser (≈100-300 ps) is shorter than the excited-state lifetime τ of the fluorescent markers (≈1-4 ns). Consequently, time-gated detection should hardly improve the image contrast in the pulsed STED modality. This is shown in the P-STED images of Figure 6A, where we imaged microtubules of fixed PtK2 cells, which were labeled with the dye KK114 [32]. The lifetime of KK114 in this sample was rather long (τ ≈ 3 ns), and the fluorescence emission of KK114 on the antibody was not quenched. Consequently, with a pulse width T_STED ≈ 300 ps the gated and non-gated images are indistinguishable. However, this is different for lifetimes τ that are in the range of the pulse width T_STED, as for our samples that were immunolabeled with the dye ATTO647N, which showed a strong component with a short fluorescence lifetime τ = 0.3 ns ≈ T_STED (compare Figure 5). As Figure 6B depicts and as expected from theory, gating in this case (and only in such cases) leads to an improvement in contrast. Note that in a well implemented gP-STED scheme, the fluorescence lifetime has no influence on the spatial resolution, as becomes obvious from Equation (10). Therefore, lifetime heterogeneities ideally do not influence the resolving power of gP-STED.
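A rough estimate illustrates why gating matters in the pulsed modality only when τ approaches T_STED. The sketch assumes an undisturbed mono-exponential decay that starts at the excitation pulse; this is a deliberate simplification that ignores the depletion dynamics and the finite excitation pulse width. Under that assumption, the fraction of spontaneous emission that falls within the STED pulse duration is 1 − exp(−T_STED/τ). The helper name `fraction_within_pulse` is hypothetical.

```python
import math


def fraction_within_pulse(t_sted_ns: float, tau_ns: float) -> float:
    """Fraction of an undisturbed mono-exponential decay emitted before T_STED.

    Assumption: no depletion and instantaneous excitation, so this is only an
    order-of-magnitude estimate of the photons a gate T_g = T_STED would reject.
    """
    if t_sted_ns < 0 or tau_ns <= 0:
        raise ValueError("need T_STED >= 0 and tau > 0")
    return 1.0 - math.exp(-t_sted_ns / tau_ns)


T_STED_NS = 0.3  # pulse width quoted in the text (about 300 ps)

# Unquenched KK114 (tau of about 3 ns) versus the short ATTO647N component (0.3 ns).
long_lived = fraction_within_pulse(T_STED_NS, 3.0)
short_lived = fraction_within_pulse(T_STED_NS, 0.3)

print(f"tau = 3.0 ns: {100 * long_lived:.0f}% of the emission within the pulse")
print(f"tau = 0.3 ns: {100 * short_lived:.0f}% of the emission within the pulse")
```

For the long-lived label only a small share of the photons arrives during the pulse, so removing it barely changes the image, whereas for the quenched component most of the emission overlaps with the pulse.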

Discussion and Conclusions
The efficiency of inhibiting the spontaneous emission of a fluorescent marker increases with the duration of the STED beam action, as long as this duration is within the range of the lifetime of the fluorescent state or shorter. Time-gated detection uses this fact to improve the effective resolution of STED nanoscopy. This improvement is most significant for the modality combining pulsed excitation with CW-STED lasers, while for an all-pulsed laser implementation (P-STED) the improvement becomes small if the STED pulse duration is much shorter than the excited-state lifetime of the fluorescent marker. We have shown that in some experimental cases, where the excited-state lifetime is shortened by inter- or intramolecular quenching, time-gated detection also improves the contrast of P-STED imaging.
STED nanoscopy, like all nanoscopy (superresolution) techniques, overcomes the diffraction resolution limit by shuffling the fluorescent marker between two distinguishable states. Any inhomogeneity of the transfer properties across the sample can therefore lead to a variation in the imaging performance for all superresolution techniques. In the case of STED, a spatial variation of the fluorophore's excited-state lifetime can generate variations in the effective resolution of the imaging modality. Time-gated detection largely reduces this bias for CW-STED, and it removes it completely for P-STED implementations. A downside of this solution is that the signal of short-lifetime fluorophores is notably reduced and may even be lost. As already discussed, the loss of signal is acceptable as long as the feature or molecule is identifiable and separable from its neighbours, i.e. as long as SNR and SBR do not degrade drastically.
Provided the required pulsed lasers and wavelengths are available, entirely pulsed (P-STED) systems currently remain the method of choice, especially if the fluorophore shows little photobleaching scaling with (higher orders of) the applied STED beam intensity. Yet, time-gated detection can alleviate the performance difference between the P- and CW-STED modalities remarkably well. We showed that in combination with gated detection, the moderate instantaneous light intensities realized with CW-STED sources can in many cases provide a resolution similar to that of pulsed systems. The main practical limitation of the gCW-STED implementation is the inherent loss of 'good' signal stemming from the location of the zero and the concomitant compromise in signal-to-noise and signal-to-background ratios. It should be noted that applying the gate does not increase the transmission of high spatial frequencies (which are already present in the conventional CW-STED image) but rather acts as a spatial frequency filter that selectively reduces the low spatial frequency contribution, thus boosting the relative strength of the high-resolution signal. Even so, for customary imaging parameters, time-gated detection greatly improves the effective resolution in CW-STED imaging and helps to reveal finer details in the sample.
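Applying a detection gate off-line to TCSPC data amounts to discarding every photon whose arrival time, measured from the excitation pulse, is shorter than T_g. The following minimal sketch illustrates this selection. The photon record layout (pixel row, pixel column, arrival time in ns) and the function name `gated_image` are hypothetical; real TCSPC file formats differ, and the photon list is made-up toy data.

```python
from typing import Iterable, List, Tuple

# Hypothetical photon record: (pixel row, pixel column, arrival time in ns
# measured from the excitation pulse).
Photon = Tuple[int, int, float]


def gated_image(photons: Iterable[Photon],
                shape: Tuple[int, int],
                t_gate_ns: float) -> List[List[int]]:
    """Build a photon-count image from photons arriving at or after T_g."""
    rows, cols = shape
    image = [[0] * cols for _ in range(rows)]
    for row, col, arrival_ns in photons:
        if arrival_ns >= t_gate_ns:  # keep only photons inside the gate
            image[row][col] += 1
    return image


# Toy data for a 2 x 2 pixel scan.
photons = [
    (0, 0, 0.2), (0, 0, 2.1),
    (0, 1, 0.4),
    (1, 0, 0.9),
    (1, 1, 1.6), (1, 1, 3.7),
]

ungated = gated_image(photons, (2, 2), t_gate_ns=0.0)
gated = gated_image(photons, (2, 2), t_gate_ns=1.5)

print("ungated:", ungated)
print("gated:  ", gated)
```

Because the arrival times are retained in the raw data, the same recording can be re-gated with any T_g after the measurement, which is how a series of images with increasing time delay can be generated from a single scan.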
In our gCW-STED and gP-STED implementation the images were generated off-line from the TCSPC image measurement by selecting only those photons recorded after a time delay T_g from the excitation pulse. However, fast electronic detection gates can be realized to obtain real-time images [32]. Simply disregarding the photons that arrive outside the gate is somewhat wasteful, because they too carry spatial information about the sample. We therefore anticipate a further improvement from combining TCSPC measurements with new methods of deconvolution that take into account the time-dependent E-PSF of a CW-STED microscope [46,47].

Supporting Information
Figure S3 Image heterogeneities of the 35-nm nanodiamonds. Correlative plot of value pairs of luminescence lifetime ⟨τ⟩ and peak signal-to-noise ratio (PSNR) (A) and full-width at half-maximum (FWHM) and PSNR (B) as determined from 14 single FNDs of the CW-STED (black) and gCW-STED recordings with T_g = 5 ns (red) and 10 ns (blue). The large variations and, on the other hand, the uncorrelated character of the value pairs demonstrate differences in the composition of each nanodiamond with respect to the number of nitrogen-vacancy (NV) centers and their charges or distances to the particle surface, or differences due to surface inhomogeneities and contaminations. (EPS)
Movie S1 gCW-STED image of fluorescent beads for increasing time delay T_g. Same description as Figure 3A. Each frame is normalized to its maximum and represents a different time delay T_g (upper left). (ZIP)
Movie S2 gCW-STED image of fluorescent nanodiamonds for increasing time delay T_g. Same description as Figure 3B. Each frame is normalized to its maximum and represents a different time delay T_g (upper left). (ZIP)
Movie S3 gCW-STED image of ATTO647N-immunolabelled microtubules for increasing time delay T_g. Same description as Figure 3C. Each frame is normalized to its maximum and represents a different time delay T_g (upper left). (ZIP)
Text S1 Full-width at half-maximum of the gCW-STED point spread function. (DOC)