Estimation of Parameters in the Two-Compartment Model for Exhaled Nitric Oxide

The fractional concentration of exhaled nitric oxide (FeNO) is a biomarker of airway inflammation that is being increasingly considered in clinical, occupational, and epidemiological applications ranging from asthma management to the detection of air pollution health effects. FeNO depends strongly on exhalation flow rate. This dependency has allowed for the development of mathematical models whose parameters quantify airway and alveolar compartment contributions to FeNO. Numerous methods have been proposed to estimate these parameters using FeNO measured at multiple flow rates. These methods—which allow for non-invasive assessment of localized airway inflammation—have the potential to provide important insights on inflammatory mechanisms. However, different estimation methods produce different results and a serious barrier to progress in this field is the lack of a single recommended method. With the goal of resolving this methodological problem, we have developed a unifying framework in which to present a comprehensive set of existing and novel statistical methods for estimating parameters in the simple two-compartment model. We compared statistical properties of the estimators in simulation studies and investigated model fit and parameter estimate sensitivity across methods using data from 1507 schoolchildren from the Southern California Children's Health Study, one of the largest multiple flow FeNO studies to date. We recommend a novel nonlinear least squares model with natural log transformation on both sides that produced estimators with good properties, satisfied model assumptions, and fit the Children's Health Study data well.


Introduction
The fractional concentration of exhaled nitric oxide (FeNO) is considered a biomarker for airway inflammation. FeNO is being increasingly studied in the clinical, occupational, and epidemiological literature [1,2,3]. Early on, it was discovered that FeNO is highly flow rate dependent [4,5], with lower FeNO at higher flows. Guidelines have been developed for the standardized measurement of FeNO at a single 50 ml/s exhalation flow rate [6]. At the relatively low flow rate of 50 ml/s, FeNO is primarily from proximal airway sources [7]. Higher flow FeNO provides more information about distal/alveolar sources, but is an imperfect proxy for alveolar NO concentration [8]. Several mathematical models have been developed that describe the physiology of NO in the lower respiratory tract [9,10,11] using parameters that quantify both proximal and distal NO contributions. A simple and widely used two-compartment model uses differential equations to relate FeNO at a constant flow rate ( _ V V ) to three NO parameters: maximum airway flux (J9 aw NO), airway tissue diffusing capacity (D aw NO), and alveolar NO concentration (C A NO) [12]. The closed form solution of this model is: Numerous methods have been developed to estimate twocompartment model NO parameters using data from FeNO measured at multiple flow rates [12,13]. Analysis of multiple flow FeNO offers a non-invasive method to quantify physiologic parameters that cannot otherwise be assessed. This has strong potential to inform on mechanisms of airway inflammation relevant to diseases affecting NO metabolism and the study of environmental exposure health effects [14]. However, there is no standardized protocol or method for estimating NO parameters, different methods produce different estimates [15], and the uncertainty in parameter estimation is often ignored. The lack of statistically validated, standardized methodology is delaying progress in this field because results from the growing number of studies using different methods may not be comparable.
Multiple flow analysis was originally developed in well-controlled, small-scale experiments (,30 participants) [16,17,18,19], although it is being increasingly translated to medium-sized studies (100-300 participants) [20,21,22,23,24,25,26,27]. The Southern California Children's Health Study (hereafter referred to as CHS) is a longitudinal cohort study on the long term effects of air pollution on children's respiratory health. After several years of collecting FeNO at 50 ml/s in the CHS [28,29,30,31,32,33,34,35], multiple flow FeNO data was collected in a pilot study [36] followed by full-scale collection from 1640 participants [37]. The CHS contains one of the largest multiple flow FeNO datasets to date, and these data motivate this paper.
The primary goal of this paper is to compare statistical methods for estimating NO parameters from the two-compartment model and to identify a method with good statistical properties that also quantifies uncertainty in NO parameter estimation. First, we developed a unified framework in which to present a comprehensive set of existing and novel statistical estimation methods. Then, we compared these estimation methods using simulation studies and CHS data. The secondary goal of this paper is to inform on the validity of comparing results across studies using different statistical estimation methods. We used the CHS data to assess the sensitivity of parameter estimates to the estimation method. Throughout this paper, we focused on the estimation of twocompartment model parameters. For comparison we also included two existing estimators from more complex mathematical models [10,11] although these estimators may not be suited to the healthy or mildly asthmatic children that compose the CHS study population.

Ethics Statement
The protocol for collection of multiple flow FeNO in the Southern California Children's Health Study was approved by the University of Southern California Health Sciences Campus Institutional Review Board. Written informed consent was obtained from a parent or guardian on behalf of each child participant.

Multiple flow FeNO data in the CHS
FeNO was collected at schools in 8 Southern California communities from March-June 2010, from 1640 middle-school students (ages [12][13][14][15] in the active CHS cohort, using EcoMedics analyzer systems in a protocol described previously [36,37]. Children were requested to perform 9 FeNO maneuvers, in the following order: 3 at the conventional 50 ml/s target flow rate (to ensure comparability with prior 50 ml/s FeNO data collected in the CHS), and 2 at each of the following target flows: 30, 100, and 300 ml/s (to balance the need for rich multiple flow data with the constraints on time and resources inherent to a large study of children). Additional maneuvers were permitted when the initial records showed technical problems or inconsistent FeNO readings, in the technician's judgment. Procedures conformed to standard guidelines [6], except that NO concentration was determined from the 3-second plateau interval with minimum coefficient of variation, rather than the first acceptable interval. There were a total of 16201 maneuvers with an acceptable 3 second plateau. We further screened these raw data to remove maneuvers with technical problems. For this paper, we considered only data from the 1507 participants (female: 832, male: 789) with at least one valid maneuver at each of the 4 target flow rates (13614 maneuvers in total) to ensure a clean comparison of estimation methods. Of the 133 children excluded, 101 had valid maneuvers at 3 flow rates. In substantive data analyses, it may not be necessary to exclude all these participants. Additional information on equipment, study protocol, and raw data screening is available in Supporting Information S1.

General estimation of NO parameters
Drawing inspiration from the excellent review of nonlinear models for repeated measurement data by Davidian and Giltinan [38], the various methods used to estimate two-compartment model NO parameters from multiple measurements of FeNO (indexed by j) at different flow rates, _ V V j can be represented using a regression model of the following general form: For each of the methods, the outcome Y j is either FeNO (FeNO j , referred to as the ''Pietropaoli (P)'' formulation [17]), NO output (FeNO j | _ V V j , referred to as the ''Tsoukias (T)'' formulation [16]), or a natural log transformation of FeNO (log(FeNO j ), in a novel formulation). Unexplained error due to local variability in the realization of the exhalation maneuver and instrument measurement error is represented by e j . The error is assumed to be normally distributed with mean 0 and variance s 2 that is constant across flow rates (i.e., homoscedastic). The mean function, f , depends on the flow rate, _ V V j , and the vector of NO parameters, a, and is a linear or quadratic approximation to the right-hand side of Equation 1, or a function of the right-hand side of Equation 1. Below, we present a set of existing and novel two-compartment model based estimators and two alternative estimation methods based on more complex mathematical models. We omit the subscripts j for simplicity. Detailed descriptions of the calculations for all methods and corresponding code for the freely available statistical software R [39] can be found in Supporting Information S1.

Linear approximation models
These models use a first order linear approximation to the exponential function in the two-compartment model (exp(-x) <1x). This approximation is valid when the ratio of D aw NO to _ V V is small (i.e., for adequately high flow rates for a given value of D aw NO). A further simplification comes from the assumption that C A NO is small relative to J9 aw NO/D aw NO [12]. Given these two assumptions, Equation 1 can be linearly approximated based on P and T formulations with: Both formulations are implemented using a simple linear regression model estimated via ordinary least squares. The resulting regression coefficient estimates (intercept:â a 0 and slope: a a 1 ) are interpreted as estimates of the corresponding NO parameters. The ''relatively small C A NO'' assumption excludes D aw NO from the model, which is necessary in order to express the remaining NO parameters as functions of the two regression coefficients. While it may be reasonable to assume that C A NO is small for many subjects [12], the practical implication of this assumption is an inconsistency (setting C A NO = 0 in only part of the expression) which has generated criticism [13]. For this reason, some researchers prefer not to make the assumption that C A NO is small. In this case, the parameter estimated from linear approximation models should be interpreted as J aw NO rather than J9 aw NO. J aw NO is the (non-maximal) flux of NO in the airway compartment. We make the assumption of small C A NO here so that we are estimating J9 aw NO and can compare estimates of this parameter across different estimation methods.
In the simulation study, we fit linear approximation models to all data and then to the subset of data from target flow rates .30 ml/s (50, 100, and 300 ml/s) since the linear approximation is most appropriate for higher flows. For the CHS data, we fit linear approximation models to data from target flow rates .30 ml/s (50, 100, and 300 ml/s). We avoided using only 100 and 300 ml/s flows because estimation of a linear model is unstable with such a small number of data points (n = 4, by design in the CHS and the simulation study). The linear approximation to Equation 1 is mathematically valid when the ratio of D aw NO to _ V V is small (e.g., #0.1). So, the approximation is valid when the flows are $50 ml/s and D aw NO #5 pl?s 21 ?ppb 21 (or, similarly, when the flows are $100 ml/s and D aw NO #10). Hence, using 50 ml/s rather than 100 ml/s as the lower bound of flows involves a more restrictive assumption about the upper bound of D aw NO in the study population.

Quadratic approximation models
These models use a second order quadratic approximation to the exponential function (exp(2x)<12x+x 2 /2). This approximation is valid when the ratio of D aw NO to _ V V is moderately small (say #0.33, e.g., for flow rates of $15 ml/s if D aw NO = 5). Due to the relaxed flow rate assumption, it is appropriate to apply the following quadratic models to the range of flow rates in the CHS.
No assumption about small C A NO is used in the following models: Both formulations are implemented using multiple linear regression with explanatory variables that are appropriate functions of the flow rate (quadP: inverse flow and inverse flow squared, and quadT: flow and inverse flow) and regression coefficients (intercept: a 0 and slopes: a 1 and a 2 ) estimated using ordinary least squares. Formulas to calculate NO parameter estimates and their approximate standard errors (derived using the Delta method) from standard multiple linear regression model output are included in Supporting Information S1. As presented, the quadratic approximation models are new but a simpler version of the quadP model (which assumed C A NO was small relative to J9 aw NO/D aw NO) was developed empirically in the CHS pilot study [36], where it was noted that the fit of the linP model could be improved by adding an inverse flow-squared term.

Nonlinear models
As proposed previously by Silkoff et al [18], nonlinear least squares can be used to estimate Equation 1 parameters directly using data from all flow rates: We propose a novel alternative in which we take a natural logtransformation of both sides: Nonlinear least squares is straightforward to implement using the Gauss-Newton algorithm [40] included in the nls function [41] in R. Nonlinear least squares can also be implemented in most major software packages (e.g., PROC NLIN in SAS, the nl command in Stata, NLR in SPSS, or using the solver tool in Microsoft Excel). The estimation algorithm does not guarantee convergence, but lack of convergence was rare in the CHS data. A key input is a set of starting values for each parameter to be estimated. We used quadP or quadT estimates as the starting values. Supporting Information S1 includes additional details on and code for implementing the nonLin and nonLinLog models in R and SAS.

Constrained models
C A NO represents the concentration of NO in the alveolar compartment, so negative C A NO estimates are not interpretable. We had initially considered a duplicate set of candidate regression models where C A NO was constrained to be $0.1 using the lower bound option in the nls function in R. When bounds are specified in nls, the function implements constrained optimization via the 'nl2sol' algorithm [41,42,43]. A lower bound of 0.1 was selected because the limit of detection for the analyzer equipment was 0.06 ppb [44]. In practice, we found that for datasets where estimated C A NO was $0.1, unconstrained and constrained models produced estimates of all NO parameters that were identical, up to rounding error. For datasets where estimated C A NO was ,0.1 in unconstrained models, the corresponding constrained model produced estimates of C A NO equal to 0.1 (the boundary), which slightly affected the estimates of J9 aw NO and D aw NO. Due to the similarity of results from unconstrained and constrained model, we present only a constrained version of the nonLinLog model, which we denote nonLinLogC. Supporting Information S1 includes code for implementing nonLinLogC models in R.

Hö gman and Meriläinen algorithm (HMA)
For the HMA, average FeNO values at low, medium, and high target flow rates (in the CHS: 30, 100, 300 ml/s) are input to an iterative algorithm based on a T formulation, third order approximation to the two-compartment model (similar to quadT, where quadT is a second order approximation), with starting values for C A NO and J9 aw NO obtained from a linT model fit to medium and high flow rate data [13,45]. The algorithm includes an internal data consistency check [13,45] that ensures the estimate of D aw NO is positive and that a fourth parameter, C aw NO, can be reasonably estimated. In the standard HMA implementation, multiple flow datasets that: (1) fail the consistency check or (2) produce negative estimates of C A NO are considered to not have parameter estimates. However, we found that imposing these 2 criteria produced biased parameter estimates in our simulation studies (using the flow rates available in the CHS) and resulted in 20.4% of CHS participants having no HMA parameter estimates (224 participants' datasets failed the first consistency check and 84 ''consistent'' datasets produced negative C A NO estimates (minimum: 23.53 ppb; median: 20.55 ppb)). Hence we did not impose these 2 criteria and rather considered all available HMA parameter estimates. R code to implement the HMA is provided in Supporting Information S1, but HMA can also be readily implemented using software bundled with the EcoMedics analyzer system. Because HMA is an iterative algorithm with no explicit regression model form, we did not report measures of model fit or measures of uncertainty for parameter estimates.

Refined deterministic models
Finally, we considered two methods to estimate parameters from two more complex mathematical models that account for: (a) back-diffusion of NO during exhalation from the higher concentration airway region to the lower concentration alveolar region and/or (b) increasing cross-sectional area in increasingly distal airways. The first method was developed by Condorelli et al to estimate NO parameters from a trumpet shaped axial diffusion model in two steps [10]. In this method-as tailored to the CHS multiple flow data [36]-linT models are first fit to data from 100 and 300 ml/s target flow rates to produce estimates of C A NO (linT.50) and J9 aw NO (linT.50) . Second, Condorelli parameter estimates are produced by applying adjustment factors (X = 740 ml/s and Y = 1.7) in the following equations [10,36]: The second method was developed by Kerckx et al [46] to estimate the alveolar NO concentration due to in-situ NO production (not from back-diffusion) by: where FeNO (50) is the mean concentration of exhaled NO at 50 ml/s, which we assessed using the ATS/ERS method [6] except that we allowed repeated measurements to differ by #15% rather than 10%. Parameters estimated by the Condorelli and Kerckx methods are not conceptualized as flow-independent. The Condorelli and Kerckx parameter estimates are not directly comparable to two compartment model parameter estimates since the simpler two-compartment model neglects back-diffusion.

Simulation study
We generated data assuming true NO parameter values similar to those previously reported for children [12,32]: C A NO = 2 ppb, J9 aw NO = 800 pl/s, and D aw NO = 5 pl?s 21 ?ppb 21 and assuming an error structure similar to that observed in the CHS. Generally, theoretical FeNO was calculated from a two-compartment model (Equation 1) with the NO parameter values above and then perturbed by adding random error. Specifically, each simulated dataset consisted of 2 FeNO values at each of the CHS target flow rates (30,50,100, and 300 ml/s). We generated random normal error with a standard deviation that decreased as a function of flow rate (3.1, 1.4, 0.8, and 0.5 ppb for 30, 50, 100, and 300 ml/s, respectively). These standard deviations were selected to approximate the within-subject standard deviations of FeNO at a given flow rate in the CHS data, as estimated by linear mixed models [47] with observed FeNO as the outcome and random intercepts for participants, fit separately for each target flow rate. Thus the simulation scenario was based on features observed in real multiple flow FeNO data. Under this scenario of unequal variance in FeNO across flow rates, we expected that methods that allowed for non-constant variance (linT, quadT, HMA, nonLinLog) would outperform methods that assumed constant variance (linP, quadP, nonLin). In the simulation study, we generated a total of 10,000 datasets. Then, we applied all candidate two-compartment model estimation methods to each dataset and recorded the estimated NO parameters to assess bias and the corresponding nominal 95% confidence intervals (CI) to assess coverage probability. Given the simulation sample size of 10,000 and the observed Monte Carlo (i.e., simulation-based) standard deviations, the Monte Carlo estimates of bias (sample mean bias across the 10,000 datasets) have standard errors of: #0.009 ppb for C A NO, #2.25 pl/s for J9 aw NO, and #0.2 pl?s 21 ?ppb 21 for D aw NO. Similarly, the Monte Carlo estimates of the coverage probabilities for the nominal 95% CI have standard errors of #0.005. We performed three additional sets of simulation studies to assess the sensitivity our results to the assumptions of: (1) the value of the parameter C A NO generating the data, (2) non-constant variance, and (3) the value of the minimum flow rate. In the first set of additional simulations, we generated datasets with C A NO = 1 and C A NO = 4, but holding the other parameters constant. Our general conclusions held across the 3 true values of C A NO, so we present results only for C A NO = 2 ppb. In the second additional simulation study, we added random normal error with 1 ppb standard deviation (constant across flow rates), similar to a previous simulation study [48], and found-as expected-that this scenario favored the more refined estimation methods that assume constant variance in FeNO (quadP, nonLin). However, this scenario clearly violated features of multiple flow FeNO data in the CHS, so we do not present the results here. In the third additional study, we generated datasets where the lowest target flow rate was 20 ml/s (with standard deviation of random error in FeNO equal to 4 ppb) rather than 30 ml/s (with standard deviation of random error in FeNO equal to 3.1 ppb). The results and general conclusions were similar to the study in which the lowest flow rate was 30 ml/s and so we do not present the results here.

Criteria for comparison
In the simulation study, NO parameters estimators were compared using 2 criteria: (1) empirical bias, the estimated value minus the true value and (2) 95% CI coverage, the proportion of the 10,000 samples for which the 95% CI contained the true value. For the CHS data, the fit of the candidate regression models to the data was assessed using adjusted R 2 since different numbers of parameters were estimated across the different methods. While some may prefer R 2 to adjusted R 2 , our findings were similar using R 2 so we presented only adjusted R 2 results. To evaluate the homoscedasticity (equal variance) assumption for the error, we calculated the standard deviation of standardized residuals for all CHS participant datasets at each target flow rate (standardization ensured the residuals were comparable across participant datasets). To evaluate the normality assumptions for the residuals, we calculated Shapiro-Wilk tests for normality [49] and reported the proportion of CHS datasets for which the null hypothesis of normality was rejected. Distributions of NO parameter estimates were displayed using boxplots where the ''whiskers'' extend to the most extreme data point less than 1.5 times the interquartile range beyond the first or third quartile. Finally, we assessed the correspondence between estimated NO parameters across models using Spearman's correlation (R). Given the number of CHS participants (N = 1507), the width of the 95% CI for the Spearman's correlations ranged from ,0.10 for R = 0.01 to ,0.002 for R = 0.99 [50]. The HMA, Condorelli, and Kerckx estimation methods all involved iterative or multiple step approaches without an explicit model of the form Y j~f ( _ V V j ,a)ze j , making it difficult to calculate model fit statistics or residuals. These methods have no standard asymptoticallyderived inference and it is impractical to implement resamplingbased inference for 10,000 simulations. So, for these three methods we only report the corresponding parameter estimate distributions and correlations. All statistical analysis and data simulation was performed using R [39]. Figure 1 shows the distribution of NO parameter estimates across the 10,000 simulated datasets. As shown in Table 1, quadratic, nonlinear, and HMA estimators had negligible bias for C A NO (absolute value of estimated bias ,0.05 ppb). Nonlinear model estimators had the smallest estimated bias for J9 aw NO (,10 pl/s) and D aw NO (,0.9 pl?s 21 ?ppb 21 ). Linear model estimators were biased (positive bias for C A NO and negative bias for J9 aw NO) with smaller bias for T formulation models than for P formulation models and smaller bias when fitting the models to data only from 50, 100, and 300 ml/s flows rather than all flows. When imposing the 2 consistency criteria, HMA estimators had larger bias (estimated bias of 20.30 ppb for C A NO, 71 pl/s for J9 aw NO, and 7.7 pl?s 21 ?ppb 21 for D aw NO). As shown in Figure 2, linP, quadP, and nonLin models produced conservative nominal 95% CI for C A NO, with estimated coverage probabilities ranging from 0.98 to 0.99. T formulation models produced anti-conservative nominal 95% CI for C A NO, with estimated coverage probabilities of 0.84 for linT, 0.88 for linT.30, and 0.91 for quadT. The only models that appeared to have appropriate 95% CI coverage probabilities for all 3 parameters were nonLinLog and nonLin-LogC (for both methods the estimated coverage probabilities were: 0.94 for C A NO, 0.95 for J9 aw NO, and 0.94 for D aw NO). Hence the nonLinLog and nonLinLogC models best satisfied the two

CHS data
After data screening, there were 6 to 12 valid FeNO maneuvers per participant. The geometric mean (and standard deviation) of FeNO was 23.6 (2.1), 15.5 (2.1), 9.2 (2.0) and 4.0 (1.9) ppb at the 30, 50, 100, and 300 ml/s target flow rates, respectively (Figure 3, left-hand panel). The actual mean flow rates achieved during the minimum CV plateau were closest to the target for lower flows (mean at each target was 29.1, 48.1, 96.7, and 286.9 ml/s, respectively). For a typical CHS participant, nonlinearity in FeNO as a function of flow was reduced through the application of P or T formulation data transformations (Figure 3, right-hand panel). Figure 4 shows that T formulation models displayed relatively poor fit to the CHS datasets (adjusted R 2 : median = 0.87 to 0.91 and 10 th percentile = 0.58 to 0.73) when compared to P formulation and nonlinear models (adjusted R 2 : median = 0.98 to 0.99 and 10 th percentile = 0.94 to 0.97). All the methods for which we could calculate residuals appeared to adequately satisfy the normality assumption (proportion of CHS datasets for which normality was rejected ranged from 0.021 (for linT.30) to 0.051 (for linT), which was similar to the Type I error rate of 0.05). Only the nonLinLog and nonLinLogC models had good fit and satisfied the homoscedasticity assumption ( Table 2). Figure 5 shows that most NO parameter estimates spanned a relatively narrow range of values, despite a considerable number of outlying values (the estimation method-specific interquartile ranges of estimates ranged from 1.1 to 1.5 ppb for C A NO, 641.5 to 1155.2 pl/s for J9 aw NO, and 11.6 to 19.8 pl?s 21 ?ppb 21 for D aw NO). The distributions of J9 aw NO estimates were very similar across two-compartment model estimators, while Condorelli estimates tended to be higher due to the adjustment upwards to account for back-diffusion.
In methods without a constraint on C A NO, negative C A NO estimates were less frequent in linear approximation methods and more frequent in methods adjusting for back-diffusion. There were 14 negative C A NO estimates for linP.30, 15 for linT.30, 87 for quadP, 67 for quadT, 145 for nonLin, 98 for nonLinLog, 85 for HMA (given that negative estimates were not excluded a priori),  534 for Condorelli, and 245 for Kerckx. An alternative implementation of the Condorelli method that adjusts for airway volume and which should be more appropriate for children (see Supporting Information S1) [10,36] produced 841 negative C A NO estimates, which was considerably more than the method presented here, although this method was developed for adults. The model with a constraint on C A NO, nonLinLogC, produced estimates of C A NO equal to the lower bound of 0.1 ppb for 119 CHS participants (113 of whom had an estimated C A NO of ,0.01 in the nonLinLog model).
The median values of estimated D aw NO were similar across estimation methods (quadP: 8.6 in pl?s 21 ?ppb 21 , quadT: 10.3, nonLin: 10.4, nonLinLog: 12.3, nonLinLogC: 12.0, HMA: 12.6). Recall that the linear approximation required for the linear models was valid for higher flow rate data and relatively low D aw NO. From the nonLinLog model, the estimated D aw NO was less than 5 pl?s 21 ?ppb 21 for only 417 CHS participants (27.7%) and less than 10 pl?s 21 ?ppb 21 for only 637 participants (42.2%). Hence, assuming the nonLinLog model produced appropriate estimates of D aw NO, the majority of the CHS study population would require FeNO to be measured at multiple target flows higher than 100 ml/s in order to apply a linear model where the linear approximation assumption was valid (e.g., target flows of 200, 250, 300 ml/s, but recall that the CHS only has data at 300 ml/s). Finally, all CHS participants had positive estimates of D aw NO from the HMA algorithm, even though we did not impose the internal data consistency criterion.
Next we assessed the sensitivity of NO parameter estimates to the estimation method. Comparing across two-compartment model estimation methods, estimates of C A NO tended to have lower correlation than estimates of J9 aw NO (minimum Spearman's correlation: 0.48 for C A NO and 0.91 for J9 aw NO) (Tables 3 and 4). Estimates of C A NO were more highly correlated within models of the same class (Spearman's correlation: 0.91 for linear models, 0.75 for quadratic models, and $0.78 for nonlinear models). Spearman's correlation between C A NO estimates from two established estimation methods, nonLin and HMA, was only moderate (0.54). For each of the 3 parameters, HMA estimates were most strongly correlated with quadT, nonLinLog, and nonLinLogC estimates (Tables 3-5), which could be expected because the HMA estimation algorithm is based on a third order approximation to a T formulation of the two-compartment model. For a given estimation method, participants' estimates of C A NO and J9 aw NO were approximately uncorrelated for quadratic and nonlinear models and HMA (Spearman's correlation: 0.02 for quadP, 0.07 for quadT, 20.08 for nonLin, 20.04 for nonLinLog, 20.01 for nonLinLogC, and 20.004 for HMA), but weakly to moderately correlated for methods based on linear approximations (Spearman's correlation: 0.53 for linP.30, 0.47 for linT.30, and 20.56 for Condorelli).

Availability of parameter estimates
NO parameter estimates were available for all participants using models estimated by ordinary least squares (linear and quadratic approximation models), nonLin and nonLinLogC models, and the Condorelli method. The nonLinLog model failed to converge for 9 CHS participants. The HMA algorithm failed to produce estimates for 1 participant. Kerckx estimates were not available for 66 participants who had inadequate data at the 50 ml/s flow rate for calculation of ATS/ERS mean FeNO at 50 ml/s. This section included all available parameter estimates for each estimation method.

Discussion
In this paper, we developed a unifying framework for a comprehensive set of existing and novel estimators of twocompartment model NO parameters and compared these candidate methods. We used simulated data to assess properties of bias and inference, and used CHS data-one of the largest multiple flow FeNO datasets to date-to assess model fit, model assumptions, and the sensitivity of parameter estimates to the choice of estimation method. A novel nonlinear least squares model with natural log-transformation (nonLinLog or nonLin-LogC) produced unbiased NO parameter estimates with appropriate measures of uncertainty, had excellent fit to the CHS data, and satisfied modeling assumptions. Although popular for their simplicity of implementation, linear approximation methodsusing the flow rates available in the CHS-relied on an assumption necessary for the linear approximation that was invalid for most CHS participants (since D aw NO was .5 pl?s 21 ?ppb 21 for 72.3% of CHS participants), produced biased  NO parameter estimates, and failed to fractionate multiple flow FeNO into independent proximal (J9 aw NO) and distal (C A NO) contributions to FeNO. Estimates of J9 aw NO were highly correlated across estimation methods, and hence were robust to the estimation method. However, estimates of C A NO had relatively modest correlation across methods, and hence were sensitive to the estimation method. Because J9 aw NO is highly correlated with conventional FeNO at 50 ml/s, one of the primary goals of multiple flow analysis is estimation of C A NO. The sensitivity of C A NO estimates to the estimation method raises important concerns about the validity of comparing C A NO estimates from studies that used different estimation methods. The large number of negative C A NO estimates from the Condorelli method (and, to a lesser extent, the Kerckx method) suggest that the existing implementations of these methods may not be appropriate for the CHS study population or may overcorrect for back-diffusion, mirroring concerns raised elsewhere [51]. While many of our candidate estimation methods have been developed and applied previously (linP [17], linT [9,16], quadP with more assumptions [36], and nonLin [15,18]), the nonLinLog and nonLinLogC models are novel to this application. Nonlinear models are popular in applications where processes are described by differential equations (e.g., pharmacokinetics). A nonlinear model (nonLin) had been proposed previously for multiple flow FeNO [18], but it has not been widely adopted and it has limitations that we acknowledge here. The nonLinLog model was inspired by the typical analytic approach to FeNO data measured at the conventional 50 ml/s flow rate. These data are approximately log-normally distributed and are typically log-transformed prior to modeling via linear regression. Log-transformation cannot be implemented in linear or quadratic approximation models because the resultant regression coefficients would no longer be interpretable as NO parameters. Log-transforming both sides in the nonLinLog model is a straightforward approach that allows us to simultaneously addressed the two issues of: (1) right skew in FeNO at a given flow and (2) non-constant variance across flows, while retaining the physiologic interpretation of the model parameters. Two alternative approaches could have been used to address violations of the equal variance assumption: (1) weighted least squares, although this method assumes known weights which would have to be calculated from appropriate reference data, or (2) formal modeling of the variance structure in addition to the mean structure in the nonlinear regression model [52].
Our results agree with and extend the results of previous studies comparing NO parameter estimation methods. An earlier simulation study (500 datasets generated under assumptions similar to our second ''sensitivity'' simulation scenario) investigated bias-but not inference-for linear approximations of the twocompartment model and found that the median bias for estimators of C A NO and J9 aw NO was smaller for the T formulation than for the P formulation [48]. We observed the same pattern in our simulations. An advantage of the T formulation is that it more heavily weights FeNO from higher flow rates, where the linear approximation is most valid [48]. However, for quadratic approximation models, where the approximation is valid for lower flow rates, we found a negligible difference in bias between formulations. Nonlinear models are appropriate for data from all flow rates and had negligible bias. Roy et al [15] compared NO parameter estimates, model fit (sum of squared error), and the impact of considering select flow rates across a number of different models using data from adults (35 healthy and 50 with chronic obstructive pulmonary disease) and found that a nonlinear model fit the observed data better than a model based on the HMA method, similar to our finding of higher adjusted R 2 in P formulation or nonlinear models than for T formulation models.
FeNO is typically measured multiple times at each target flow rate with the goal of assessing reproducibility. For the HMA, mean FeNO at a given flow rate is calculated from the multiple maneuvers and mean FeNO is used to estimate NO parameters [45]. We prefer using maneuver-specific FeNO values rather than mean FeNO at a given target flow rate for two reasons: (1) if, for a given target flow rate, the actual flow rates from two maneuvers are different, then the FeNO values should also be different and (2) if the number of acceptable maneuvers differ across target flow rates (CHS data is unbalanced across target flow rates by design and from missing data due to technical problems), then using mean FeNO calculated with different sample sizes imposes heteroscedasticity in the data, since for independent and identically distributed NO parameters have physiological interpretations, so parameter estimates should lie within a range of plausible values. Negative estimates of C A NO have been obtained in previous studies using two-compartment models [36] or trumpet with axial diffusion models [10,53]. Under unconstrained estimation of a positive parameter, sampling variation alone can produce negative estimates, as we observed in the simulation study where ''true'' C A NO was 2 ppb. A statistical approach to ensure plausible parameter estimates is to impose constraints, as in the non-LinLogC method. However, we observed that all 98 negative C A NO estimates from the unconstrained nonLinLog model were assigned the lower bound of 0.1 in the constrained nonLinLogC model. Additional information may be gained through careful study of participants with negative or boundary estimates of C A NO. Negative or boundary C A NO estimates could reflect: (a) inadequately performed exhalation maneuvers or otherwise invalid data values, (b) inadequacies of parameter estimation methods, and/or (c) inadequacies of the underlying mathematical models to adequately represent a complex physiological process that might vary according to patient characteristics. In future research, comparing participants with negative or boundary C A NO estimates to those with positive estimates may provide insight into the problem.
This study has several strengths. We provided a unifying framework for an extensive set of methods used to estimate the parameters of the two-compartment model. We developed and evaluated a novel nonlinear least squares model with natural log transformation. We derived the approximate variance of calculated parameter estimates for the quadratic approximation models, so that estimates of uncertainty are available for all regressionbased estimation methods. We developed and provided code for nonlinear models and HMA in the freely available statistical software R. We applied the estimation methods to one of the largest sets of multiple flow FeNO data to date. The CHS data was collected according to strict protocol by well-trained field staff, using state of the art online collection techniques, and with detailed screening and review of raw data. We performed the first comprehensive assessment of the statistical properties of twocompartment model parameter estimation methods using simulated data sets under several scenarios (emphasizing one motivated by CHS data), and assessed model fit and the sensitivity of parameter estimates across models using CHS data. Finally, we explored the issue of negative C A NO estimates which has been largely overlooked in the literature.
This study also has limitations. First, we focused on methods to estimate parameters from the basic -but robust -two compartment mathematical model of NO exchange [9], which has a simple closed form solution to the set of governing equations  ) but neglects axial diffusion for simplicity. In our simulation studies, we assumed data were generated from the twocompartment model, so the observed statistical properties may not hold for data for which a two-compartment model is inadequate. Simulations involving generating data from the TMAD model or estimation of parameters from the multiple airway compartment extension of the TMAD model [11] were beyond the scope of this paper. Second, we evaluated estimation methods assuming multiple flow FeNO data from target flow rates in the CHS (30, 50, 100, and 300 ml/s), which reflect a range of flows that can reasonably be collected in our child study population. Some of the estimation methods considered had clear restrictions on the range of appropriate flow rates (linear approximation models: linP, linT), while other estimation methods had no theoretical limitations on flow rates (HMA, nonLin, nonLinLog). It could be possible that the set of CHS target flow rates favored the nonLinLog method over other methods. Our sensitivity analysis (simulation study with a low flow rate of 20 ml/s rather than 30 ml/s) suggests that the relative statistical performance of the methods was not affected by a slight decrease in the lower bound on flow rates. There is no standard protocol for the set of flow rates for multiple flow FeNO collection, but an interesting area for future research would be to determine the optimal set of reasonable flow rates at which to collect a small set of FeNO measurements to use for NO parameter estimation in large studies. Third, the fact that convergence is not guaranteed for nonlinear least squares models was not a major problem in the CHS (convergence failed for only 9 of 1507 participants for nonLinLog), but missing parameter estimates could potentially limit the generalizability of studies of determinants of NO parameters estimated by nonLin or non-LinLog models. Finally, the statistical theory underlying the calculation of inference (e.g., confidence intervals) relies on the properties of larger sample sizes, but each regression model was fit to a dataset with a small number of observations (N = 8 in the simulation study and 6#N#12 in the CHS). We may have observed better 95% confidence interval coverage than could be expected with real data because the unexplained error in the simulated data was generated from a normal distribution.
In conclusion, we recommend the novel nonLinLog or nonLinLogC method (nonlinear least squares models with natural log-transformation of both sides) for estimation of two-compartment model parameters from multiple flow FeNO data. These methods can be readily used to quantify the uncertainty in parameter estimation, have good statistical properties in our simulation studies, have no theoretical limitations in terms of valid flow rates, and can be implemented in any software capable of fitting nonlinear least squares regression (e.g., R, SAS, Stata, SPSS, or using the solver tool in Microsoft Excel). C A NO is often of primary interest in multiple flow FeNO analyses, but we demonstrated that C A NO estimates are sensitive to the estimation method. This sensitivity highlights the need for an appropriate, standardized statistical method for NO parameter estimation. Widespread adoption of the nonLinLog or nonLinLogC method would produce more comparable two-compartment model parameters estimates across studies and would allow researchers to better acknowledge the statistical uncertainties in parameter estimation.