Beta regression model nonlinear in the parameters with additive measurement errors in variables

We propose in this paper a general class of nonlinear beta regression models with measurement errors. The motivation for proposing this model arose from a real problem that we discuss here. The application concerns a usual oil refinery process in which the main covariate is the concentration of a reagent that is typically measured with error, and the response is the percentage of crystallinity of a catalyst involved in the process. Such data have been modeled by nonlinear beta and simplex regression models. Here we propose a nonlinear beta model that allows the chemical reagent concentration to be measured with error. The model parameters are estimated by different methods. We perform Monte Carlo simulations to evaluate the performance of point and interval estimators of the model parameters. Both the simulation results and the application favor estimation by maximum pseudo-likelihood approximation.


Introduction
Regression models for dependent variables that assume values in the unit interval have proven quite important in the literature, with special highlight to the beta regression model proposed by [1] and generalized by [2], who proposed the nonlinear beta regression model. In the above-mentioned models, all covariates are traditionally considered fixed, not random, and free of measurement error. In practice, covariates may not be observed directly or may be subject to measurement errors. It is important to emphasize that if this assumption is not respected, unreliable inferential results are obtained [3]. In these circumstances, regression models with measurement errors are usually defined and structured so that the mean response is explained by covariates $x_t$ whose measurements are inaccurate. Thus, instead of the true value of $x_t$, the value of another predictor variable, $w_t$, which is associated with a measurement error, is considered.
It is reasonable that, in regression models for dependent variables assuming values in the unit interval, some of the covariates are not observed directly, but acquired with possible measurement errors. [4] proposed the beta regression model with measurement errors, in which covariates measured with error can enter the mean and precision submodels described by linear predictors. A possible extension to the model proposed by [4] is to consider a nonlinear modeling for the parameters. In fact, a real problem has shown the necessity of this modeling proposal. The interest of the problem lies in modeling the percentage of crystallinity of a chemical reagent based on different concentrations of vanadium and steam, considering two values of temperature. It is expected that the higher the concentrations of vanadium and steam, the lower the percentage of crystallinity. The loss of crystallinity is a negative effect of the process, and precise knowledge of how the vanadium concentration destroys this crystallinity is one of the chief goals of this regression modeling. As we shall describe in the application, the measurement of the vanadium concentration may be inaccurate, which characterizes the possibility of this covariate being subject to measurement error. These data were modeled by [5] based on beta and simplex nonlinear regressions in which vanadium was taken as a fixed covariate. In our application the nonlinearity is related to steam, which is here treated as a fixed covariate. Nonlinear models with measurement errors have recently been studied in the literature; see [6][7][8][9]. Our aim here is to propose a particular class of nonlinear beta regression models with measurement errors, in which the nonlinearity is on a parameter related to a fixed covariate. We consider three methods for the estimation of the parameters, namely: approximate maximum likelihood, approximate pseudo maximum likelihood and regression calibration. We compare these three methods with the estimation of the naive model, which assumes that the regression has no covariates measured with error, that is, the classical nonlinear beta regression model. We evaluate the properties of point and interval estimators for the four estimation methods considered, based on Monte Carlo simulations under different scenarios. Both the simulations and our application indicate that the approximate pseudo maximum likelihood method presents the best performance. Furthermore, [10] recently proposed simplex regression models with measurement error, and as future research we shall extend that proposal to the nonlinear case.
Following [4], we consider that the variables with measurement errors in the submodels for the mean and the precision are the same. Also, the vector of random measurements $x_t$ is not directly observed. The observed vector is $w_t = (w_{t1}, \ldots, w_{tr})^\top$, which we here suppose related to $x_t$ in the form $w_t = \tau_1 + \tau_2 \odot x_t + e_t$, where $e_t$ is a vector of random errors, $\tau_1 = (\tau_{11}, \ldots, \tau_{1r})^\top \in \mathbb{R}^r$ and $\tau_2 = (\tau_{21}, \ldots, \tau_{2r})^\top \in \mathbb{R}^r$ are vectors of unknown parameters, and $\odot$ represents the Hadamard product; for two vectors $u_1 = (u_{11}, \ldots, u_{1r})^\top$ and $u_2 = (u_{21}, \ldots, u_{2r})^\top$ of the same dimension, $u_1 \odot u_2 = (u_{11}u_{21}, \ldots, u_{1r}u_{2r})^\top$. We may assume $\tau_1$ to be a vector of zeros and $\tau_2$ to be a vector of ones. However, we can also take them as unknown parameters to be estimated, as we shall see in the application.
Hence, our vector of unknown parameters is $\Psi = (\theta^\top, \xi^\top)^\top$, where $\theta$ and $\xi = (\mu_x, \sigma_x^2, \sigma_e^2)^\top$ are, respectively, the vectors of interest and nuisance parameters. The joint density function of $(y_t, w_t)$ is given in (3), where $f(y_t|x_t;\theta)$ represents a beta density, $f(w_t|x_t;\sigma_e^2)$ is the conditional density of $w_t$ given $x_t$, and $f(x_t;\xi)$ is a density for the explanatory variable $x_t$. Observe that, to obtain (3), we must assume that $y_t$ is independent of $w_t$ given $x_t$, that is, the distribution of $y_t$ given $(w_t, x_t)$ involves only $x_t$ [8]. Therefore, the log-likelihood function for the $n$ independent observations, given in (4), involves integrals that are analytically unsolvable, which means that approximation methods need to be used. We will consider here three different approaches to estimate the parameters and, in order to describe each of the three methods, we will suppose, for illustration purposes, that only $x_t$ is measured with error. However, it is important to stress that the methods used here can be easily generalized to $r$ covariates with measurement errors.
From now on, we make the following assumptions: $x_t \sim N(\mu_x, \sigma_x^2)$, $e_t \sim N(0, \sigma_e^2)$ and, finally, $x_t$ and $e_{t'}$, with $t, t' = 1, \ldots, n$, are independent. This is a structural model, in which the vector of non-observed variables follows a normal distribution. It is important to stress that the models for $\mu_t$ and $\phi_t$, $t = 1, \ldots, n$, are based on (2). From the above assumptions we obtain the distributions in (5) and (6), where $k_x = \sigma_x^2/(\sigma_x^2 + \sigma_e^2)$ is known as the reliability coefficient. An additional assumption is that the variance $\sigma_e^2$ of the measurement error is known; this assumption is necessary to avoid identifiability problems. When replicates of $w_t$ are available, the sample variance of $w_t$ can be used to estimate the variance of the measurement error. In order to establish the log-likelihood for the model under study, we use the results in (5) and set $\sigma_e^2 = \hat{\sigma}_e^2$, obtaining the decomposition $\ell(\Psi) = \sum_{t=1}^n \ell_{1t}(\xi) + \sum_{t=1}^n \ell_{2t}(\theta,\xi)$, with $\ell_{1t}$ the logarithm of the normal marginal density of $w_t$, given in (7), and $\ell_{2t}$ defined by the integral in (8). We also suppose that the log-likelihood is a concave function of the parameters to be estimated. The estimators we shall propose have no closed form. Therefore, we shall use Fisher's scoring nonlinear optimization method, with the initial values proposed by [11] for nonlinear beta regression models.
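For concreteness, under these assumptions with $\tau_1 = 0$ and $\tau_2 = 1$ (so that $w_t = x_t + e_t$), standard bivariate normal theory gives the conditional distribution underlying (5) and (6); the following is a sketch of the reconstruction, not a transcription of the paper's displayed equations:

```latex
% Conditional distribution of the latent covariate given the observed one,
% assuming w_t = x_t + e_t, x_t ~ N(mu_x, sigma_x^2), e_t ~ N(0, sigma_e^2):
\[
  x_t \mid w_t \sim N\!\left(\mu_{x_t|w_t},\ \sigma^2_{x_t|w_t}\right), \qquad
  \mu_{x_t|w_t} = \mu_x + k_x\,(w_t - \mu_x), \qquad
  \sigma^2_{x_t|w_t} = (1 - k_x)\,\sigma_x^2,
\]
\[
  \text{with } k_x = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_e^2}
  \text{ the reliability coefficient.}
\]
```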

Estimation by approximate maximum likelihood
The integral in (8) can be approximated using Gauss-Hermite quadrature, given in (9), where $s_q$ and $\nu_q$ are, respectively, the $q$-th zero and weight of the $Q$-th order orthogonal Hermite polynomial, $Q$ being the number of quadrature points [12]. Using the change of variable $x_t = \mu_{x_t|w_t} + \sqrt{2\sigma^2_{x_t|w_t}}\, u_t$, we obtain an approximation for the integral in (8) based on (9), which yields the approximation for the log-likelihood function given in (10), where $\mu_{tq}$ and $\phi_{tq}$ are the parameters of the beta distribution of $y_t$ when the explanatory variable $x_t$ takes the value $x^*_{tq} = \mu_{x_t|w_t} + \sqrt{2\sigma^2_{x_t|w_t}}\, s_q$, with $\mu_{x_t|w_t}$ and $\sigma^2_{x_t|w_t}$ defined in (6) and $s_q$ being the $q$-th zero of the $Q$-th order orthogonal Hermite polynomial. Consequently, from (2), we obtain (11). As is typical in beta linear models with errors in covariates, the asymptotic distribution of the MLEa is normal with mean $\Psi$ and covariance matrix $J_a^{-1}(\Psi)$ for $n$ and $Q$ large enough [4]. Here, we shall also use this result to build asymptotic confidence intervals, at level $\alpha$, for the interest parameters of the model defined in (2). We define $J_a(\theta)$ as the partition of $J_a(\Psi)$ related to the interest parameters in the Appendix. Before the $J_a(\theta)$ matrix, we also present in the Appendix the score functions for the parameters of the model. Here, MLEa denotes the approximate maximum likelihood estimator.
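To make the quadrature step concrete, the following is a minimal sketch in Python (not the authors' Ox implementation), assuming a scalar mismeasured covariate and hypothetical user-supplied mean and precision submodels `mu_fn` and `phi_fn`:

```python
import numpy as np
from scipy.stats import beta as beta_dist

def loglik_gh(theta, y, w, mu_fn, phi_fn, mu_xw, var_xw, Q=50):
    """Gauss-Hermite approximation of sum_t log f(y_t | w_t; theta).

    mu_fn(x, theta), phi_fn(x, theta): hypothetical mean/precision submodels.
    mu_xw, var_xw: conditional mean and variance of x_t given w_t (Eqs. 5-6).
    """
    s, v = np.polynomial.hermite.hermgauss(Q)   # nodes s_q and weights v_q
    ll = 0.0
    for t in range(len(y)):
        # change of variable x*_tq = mu_{x|w} + sqrt(2 var_{x|w}) s_q
        x_star = mu_xw[t] + np.sqrt(2.0 * var_xw[t]) * s
        mu, phi = mu_fn(x_star, theta), phi_fn(x_star, theta)
        # beta density parametrized by mean mu and precision phi
        dens = beta_dist.pdf(y[t], mu * phi, (1.0 - mu) * phi)
        ll += np.log(np.sum(v * dens) / np.sqrt(np.pi))
    return ll
```

The factor $1/\sqrt{\pi}$ comes from rewriting the normal mixing density in the Gauss-Hermite weight form $e^{-u^2}$.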

Estimation by approximate pseudo maximum likelihood (pseudolikelihood estimation)
This estimation method maximizes a function that depends only on the parameters of interest: following [13], the nuisance parameters are replaced by consistent estimators in the original likelihood function defined in (3). Pseudo-likelihood estimation consists of first obtaining the maximizer $\hat{\xi}$ of the log-likelihood restricted to the nuisance parameters, which is given by $\ell_r(\xi) = \sum_{t=1}^n \ell_{1t}(\xi)$, with $\ell_{1t}(\xi)$ as in (7). Once $\hat{\xi}$ is obtained, the pseudo log-likelihood is defined as $\ell_p(\theta; \hat{\xi}) = \sum_{t=1}^n \ell_{1t}(\hat{\xi}) + \sum_{t=1}^n \ell_{2t}(\theta; \hat{\xi})$. It is important to emphasize that the integral in $\ell_{2t}$, defined in (8), will also be approximated by Gauss-Hermite quadrature. However, since $\ell_{2t}(\theta; \hat{\xi})$ involves only $\theta$, its approximation will be a function only of the interest parameters. Let $\ell^*_{2t}(\theta; \hat{\xi})$ be the approximation for $\ell_{2t}(\theta; \hat{\xi})$ obtained with (9). From this, the approximate pseudo log-likelihood function for the nonlinear beta regression model with unidimensional measurement errors is defined in (12), where $\ell_{tq}(\mu_{tq}; \phi_{tq})$, defined in (11), is evaluated at $\hat{\xi}$, that is, with $x^*_{tq} = \hat{\mu}_{x_t|w_t} + \sqrt{2\hat{\sigma}^2_{x_t|w_t}}\, s_q$. Let $\tilde{\theta}$ be the approximate pseudo maximum likelihood estimator of $\theta$, obtained by maximizing (12). Then, under assumptions that are usually satisfied in practice, it can be shown that the asymptotic distribution of $\sqrt{n}(\tilde{\theta} - \theta)$ is normal with zero mean and a covariance matrix involving the matrices $I_\theta$, $I_\xi$ and $I_{\theta\xi}$, where $\ell_{rt}(\xi)$ and $\ell_{pt}(\theta,\xi)$ are, respectively, the $t$-th summands of the log-likelihood restricted to the nuisance parameters, $\ell_r(\xi) = \sum_{t=1}^n \ell_{1t}(\xi)$, and of $\ell_p(\theta,\xi)$ given in (12). For the nonlinear beta regression model with measurement errors, these matrices have no explicit forms. Since the integral is approximated numerically, the second derivatives of $\ell_{rt}(\xi)$ and $\ell_{pt}(\theta,\xi)$ are simply used, and thus the expected information matrix is replaced by the observed information matrix [14].
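A minimal sketch of the two-stage scheme, assuming $\sigma_e^2$ known and reusing the hypothetical `loglik_gh` function sketched above; the first stage is available in closed form because it reduces to the marginal normal model of $w_t$:

```python
import numpy as np
from scipy.optimize import minimize

def pseudo_ml(y, w, sigma2_e, mu_fn, phi_fn, theta0):
    """Two-stage pseudo-likelihood sketch (sigma2_e assumed known).

    Stage 1: estimate the nuisance parameters xi = (mu_x, sigma_x^2)
    from w_t ~ N(mu_x, sigma_x^2 + sigma_e^2), in closed form.
    Stage 2: plug xi-hat into the Gauss-Hermite log-likelihood and
    maximize over the interest parameters theta only.
    """
    # Stage 1 (closed form):
    mu_x = w.mean()
    sigma2_x = max(w.var(ddof=1) - sigma2_e, 1e-8)
    k_x = sigma2_x / (sigma2_x + sigma2_e)            # reliability coefficient
    mu_xw = mu_x + k_x * (w - mu_x)                   # E(x_t | w_t)
    var_xw = np.full_like(w, (1.0 - k_x) * sigma2_x)  # Var(x_t | w_t)

    # Stage 2: numerical maximization in theta alone
    res = minimize(lambda th: -loglik_gh(th, y, w, mu_fn, phi_fn,
                                         mu_xw, var_xw),
                   x0=theta0, method="BFGS")
    return res.x, (mu_x, sigma2_x, k_x)
```

The practical gain is that the numerical search runs over $\theta$ only; the nuisance part is handled analytically, reducing the dimension of the optimization problem.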

Estimation by regression calibration
When we estimate by regression calibration, the central idea is to replace the non-observed variable $x_t$ in the likelihood function by an estimate of the expected value of $x_t$ given $w_t$, $E(x_t|w_t)$. If only one variable is measured with error, the calibration function is $r(w_t; \xi, \sigma_e^2) = \mu_x + k_x(w_t - \mu_x)$. Since $w_t \sim N(\mu_x, \sigma_x^2 + \sigma_e^2)$, $\bar{w} = \sum_{t=1}^n w_t/n$ and $s_w^2 = \sum_{t=1}^n (w_t - \bar{w})^2/(n-1)$ are optimal estimators of $\mu_x$ and $\sigma_x^2 + \sigma_e^2$, respectively. However, it is not possible to estimate $\sigma_x^2$ alone from the observed data $w_1, \ldots, w_n$. We assume $k_x$ or $\sigma_e^2$ to be known or, alternatively, that $\sigma_e^2$ can be estimated when replicates of $w_t$ are available.
Replacing the estimated calibration function in the probability density function of $y_t$ given $x_t$, that is, in $f(y_t|x_t; \alpha, \beta, \gamma, \lambda)$ given in (1), we obtain the modified log-likelihood, in which $x^*_t = \bar{w} + \hat{k}_x(w_t - \bar{w})$, $\hat{k}_x$ being the known or estimated value of $k_x$. Observe that the logarithm of the modified likelihood involves only the interest parameter $\theta$ and has the same form as that of the usual beta regression model, since $x^*_t$ plays the role of an explanatory variable measured without error. Standard errors of these estimators can be estimated via nonparametric bootstrap.
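As an illustration (hypothetical names; the beta regression fit itself is delegated to whatever routine one uses for the error-free model):

```python
import numpy as np

def calibrate(w, k_x):
    """Regression-calibration substitute x*_t = w_bar + k_x (w_t - w_bar)."""
    w_bar = w.mean()
    return w_bar + k_x * (w - w_bar)

# Usage sketch: x_star then enters a standard nonlinear beta regression
# exactly as an error-free covariate would; standard errors are obtained
# by nonparametric bootstrap over the (y_t, w_t) pairs.
# x_star = calibrate(w, k_x_hat)
# theta_rc = fit_beta_regression(y, x_star, z)   # hypothetical fitting routine
```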

Simulations
In order to assess the performance of the estimation methods described in Section (2), we carried out simulation studies with 10,000 Monte Carlo replications each. We consider beta regression models in which some of the covariates are measured with error and, at the same time, the predictor is nonlinear with respect to the parameters associated with the covariates measured without error. This is the structure used in our application. The model for which we perform our simulations is a particular case of the class of models proposed in Section (2).
Model parameters are estimated with the naive model (ι_naive), which maximizes the traditional beta likelihood function, approximate maximum likelihood (ι_a), approximate pseudo maximum likelihood (ι_p) and regression calibration (ι_rc). In all optimization processes, we used the nonlinear BFGS quasi-Newton algorithm with numerical derivatives, as implemented in the Ox programming language [15]. Integral approximations were tried with Q = 12, 30, 50 and 80 quadrature points; these four numbers of points yielded very similar results. We present here the results for Q = 50.
We ran simulations for n = 40, 80 and 160. At each Monte Carlo replication, the n fixed observations of the covariate without measurement error, denoted by $z_{t1}$, were obtained as independent draws from a uniform distribution on (0.2, 1.2), $t = 1, \ldots, n$. Furthermore, $x_t$ and $e_{t'}$ were generated independently, $t, t' = 1, \ldots, n$. The simulations were performed for three different values of $k_x$, namely 0.50, 0.75 and 0.95. The link functions are $g_1(\mu_t) = \log\{\mu_t/(1 - \mu_t)\}$, the logit link, and $g_2(\phi_t) = \log(\phi_t)$, $t = 1, \ldots, n$.
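For illustration, one Monte Carlo replication under a constant-precision configuration could be generated as in the sketch below; the nonlinear predictor shown is hypothetical, standing in for the scenario equations that are given in the paper's displayed formulas:

```python
import numpy as np

rng = np.random.default_rng(2021)

def gen_data(n, k_x, gamma1, theta, sigma2_x=1.0, mu_x=0.0):
    """One Monte Carlo replication (sketch; the predictor is illustrative)."""
    sigma2_e = sigma2_x * (1.0 - k_x) / k_x        # so k_x = var_x/(var_x+var_e)
    z = rng.uniform(0.2, 1.2, n)                   # error-free covariate
    x = rng.normal(mu_x, np.sqrt(sigma2_x), n)     # latent covariate
    w = x + rng.normal(0.0, np.sqrt(sigma2_e), n)  # observed, with error
    a1, a2, b1 = theta
    eta = a1 * z**a2 + b1 * x                      # hypothetical nonlinear predictor
    mu = 1.0 / (1.0 + np.exp(-eta))                # logit link
    phi = np.exp(gamma1)                           # constant precision, log link
    y = rng.beta(mu * phi, (1.0 - mu) * phi)
    return y, w, z
```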

Scenario 1-Constant precision
The model is defined with a nonlinear mean submodel and constant precision. The true values of the submodel parameters are α₁ = −0.6, α₂ = 2.4 and β₁ = 0.8. Aiming to compare the performances of the competing estimation schemes presented in Section (2) under different precision magnitudes, we chose three values for γ₁, namely 2.8, 4.0 and 5.7, leading to ϕ = 16.44, ϕ = 54.60 and ϕ = 298.87; in beta regression, such values of ϕ typically represent three degrees of model precision, namely substantially low, low to medium and medium to high, respectively. Concerning the nuisance parameters, we chose $\mu_x = 0$ and $\sigma_x^2 = 1$, with $k_x = 0.50$ ($\sigma_e^2 = 1$, high measurement error), $k_x = 0.75$ ($\sigma_e^2 = 0.333$, moderate measurement error) and $k_x = 0.95$ ($\sigma_e^2 = 0.053$, low measurement error).

Scenario 3-Varying precision II
This is the most complex scenario: a beta regression model with varying precision in which both submodels involve measurement error and nonlinearity. In the complete model we take $v_{t1} = z_{t1}$, for $t = 1, \ldots, n$.

Table 1 displays the RMSE (root mean square error) of the Scenario 1 estimators. From this table, overall, we note that the RMSE decreases considerably when the model precision, the value of $k_x$ and the sample size increase, which is the expected behavior. However, this behavior does not hold for the estimators of γ₁, especially under the naive and regression calibration methods. It is noteworthy how the RMSE of $\hat{\gamma}_1$ for these two methods becomes larger when ϕ increases. Consider $k_x = 0.50$ and n = 160. For ϕ = 16.4 and ϕ = 298.9, the RMSEs of $\hat{\gamma}_1$ for the naive (ι_naive), regression calibration (ι_rc), approximate likelihood (ι_a) and pseudo-likelihood (ι_p) estimation methods are (0.7796, 0.7796, 0.7784, 0.7149) and (3.0352, 3.0352, 0.9895, 0.9565), respectively. Furthermore, the RMSE of $\hat{\gamma}_1$ does not decrease for the naive and regression calibration methods when the sample size increases, notably for $k_x = 0.50$ and $k_x = 0.75$.

Simulation results
Overall, the best RMSE performances are those of the approximate likelihood schemes. The largest RMSEs are those of $\hat{\gamma}_1$; when ϕ = 298.9 these RMSEs are larger than when ϕ = 16.4. However, they decrease substantially when the sample size and the reliability coefficient increase. Furthermore, $\hat{\alpha}_1$, $\hat{\alpha}_2$ and $\hat{\beta}_1$ do not display large RMSEs even when the precision is quite low. Let us focus on the ι_p (pseudo-likelihood) scheme and on $\hat{\beta}_1$, the estimator related to the covariate measured with error. For $k_x = 0.75$ and n = 40, when ϕ = 16.4 and ϕ = 298.9 its RMSEs are 0.1825 and 0.1059, respectively. Finally, it is important to note that only for n = 40, $k_x = 0.50$ and the parameter β₁ does estimation by approximate likelihood outperform the pseudo-likelihood method in terms of RMSE. Fig 1 presents plots of the biases of the parameter estimators in Scenario 1, for n = 40, 80, 160, ϕ ≈ 17, 55, 300 and $k_x = 0.50$. It is noteworthy how poorly the naive structure and regression calibration perform, in particular for the precision estimation. As the value of ϕ increases, the biases of the γ₁ estimators move considerably away from zero, reaching absolute values close to three. Both estimation schemes also show the worst performances for estimating the nonlinearity parameter α₂, since their biases move further away from zero as the sample size increases. However, it is important to note that the absolute biases of $\hat{\alpha}_2$, equally close to 0.2 for both methods, are considerably small compared to the true value of the parameter, α₂ = 2.4.
In fact, it is remarkable how badly the naive estimator of β₁ behaves. Its bias is constant for any value of n and ϕ and considerably high, equal to 0.4, while β₁ = 0.8. Since β₁ is the coefficient of the covariate measured with error, this behavior strongly disfavors the naive structure.
Regarding the methods based on approximate likelihood (ι_a and ι_p), Fig 1 leads to important conclusions. In general, their biases are not markedly high, especially when the sample size increases, even when ϕ is quite low. As an example, for β₁, the coefficient of the covariate measured with error, the largest biases of these two estimators occur when n = 40 and ϕ ≈ 17, and are close to −0.1, getting closer and closer to zero for ϕ ≈ 55 and ϕ ≈ 300 (Fig 1(g)-1(i)). Consider now α₂, the nonlinearity parameter. We note that the biases of the approximate likelihood and pseudo-likelihood estimators of this parameter are close to zero when n = 160 (Fig 1(d)-1(f)).

Table 1. Root mean square error (RMSE) for the estimators of α₁, α₂, β₁ and γ₁ for the nonlinear beta regression model with constant precision. Scenario 1.
For a substantial measurement error, i.e., $k_x = 0.50$, only when the ι_a and ι_p methods are used do the biases of the estimators of all model parameters decrease as the sample size increases, and their performances are quite similar. However, for $\hat{\gamma}$, the ι_p method exhibits slightly lower bias than the ι_a method, particularly for n = 40.
The fact described above is the first evidence of the typically better performance of the ι_p method. The smaller the bias of $\hat{\gamma}$, the smaller the bias of the estimated variances of the model estimators, and the better the performance of confidence intervals and hypothesis tests concerning the model parameters.
In what follows we discuss interval estimation with respect to Scenario 1. Table 2 displays the coverage rates of the estimated asymptotic 95% confidence intervals for $k_x$ = 0.50, 0.75, 0.95, n = 80 and ϕ = 16.4, 298.9.
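The intervals summarized below are Wald-type intervals built from the inverse of the (observed) information matrix, as described in the estimation sections; a sketch of the interval and coverage computations, with `obs_info_inv` standing for the inverse observed information returned by the fit:

```python
import numpy as np
from scipy.stats import norm

def wald_ci(theta_hat, obs_info_inv, level=0.95):
    """Asymptotic (Wald) confidence intervals: theta_hat_j +/- z * se_j."""
    z = norm.ppf(0.5 + level / 2.0)
    se = np.sqrt(np.diag(obs_info_inv))   # standard errors from the fit
    return theta_hat - z * se, theta_hat + z * se

def coverage_rate(lowers, uppers, true_value):
    """Fraction of Monte Carlo intervals containing the true parameter."""
    lowers, uppers = np.asarray(lowers), np.asarray(uppers)
    return np.mean((lowers <= true_value) & (true_value <= uppers))
```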
At this point, it is interesting to note how the high biases of $\hat{\gamma}_1$ negatively affect the performance of the interval estimation. In particular, the considerable overestimation of γ₁ under the naive and regression calibration methods leads to severely underestimated variances of the estimators, the more so the larger the value of ϕ. Therefore, the interval lengths are too small, to the point of not covering the true values of the parameters. For example, when ϕ = 298.9 the coverage rates for the parameter γ₁ based on these two methods are equal to zero, and 100% of the estimated values are greater than the true value of the parameter (Table 2).
We turn now to the coefficient of the covariate with measurement error, i.e., the parameter β₁. Since the naive $\hat{\beta}_1$ considerably overestimates that parameter regardless of the value of ϕ (Fig 1(j)-1(l)), in association with the underestimation of variances already discussed, its interval estimator for β₁ produces intervals that lie above the true value of the parameter in 100% of the cases. The performance of the interval estimator based on regression calibration is considerably better than that of the naive method for β₁. However, it is still considerably poorer than those of the ι_a and ι_p methods. For $k_x = 0.50$ and ϕ = 298.9, the coverage rates and average lengths for ι_rc, ι_a and ι_p are (66.59%, 73.27%, 93.38%) and (0.40, 0.75, 2.02), respectively. Here we note the superiority of the ι_p scheme: its coverage rate is the closest to the nominal 95% level, and its average length, equal to 2.02, decreases to 0.31 for $k_x = 0.75$, while ensuring a coverage rate close to the nominal level (Table 2).
We now evaluate the interval estimators of α₂. Based on Table 2, we note that for ϕ = 16.4 the schemes ι_naive, ι_rc, ι_a and ι_p show coverage rates near each other, equal to (90.12%, 88.94%, 91.65%, 89.45%), respectively, for $k_x = 0.75$. These coverage rates are even closer for $k_x = 0.50$ and $k_x = 0.95$. However, when ϕ = 298.9 the behavior of the four methods differs considerably. When $k_x = 0.75$ those coverage rates become (90.93%, 89.82%, 72.78%, 75.20%). When $k_x = 0.50$, they differ further, especially for the ι_a estimator, being equal to (87.91%, 87.26%, 61.40%, 75.29%). Here it is necessary to increase the sample size to better understand the behavior of the estimators.
Thus, Fig 2 displays the coverage rates of the interval estimators for the beta regression model with measurement errors and nonlinearity, with constant precision equal to 16.4. We consider n = 40, 80, 160 and $k_x$ = 0.95, 0.75, 0.50.
These coverage rates are related to the biases of the point estimators. In this case, the ι_naive and ι_rc methods have been incorrectly favored precisely because they are more biased: their point estimators slightly overestimate α₂, and this fact, associated with the considerable underestimation of the variance, in particular when ϕ = 298.9 (Fig 1(l)), leads to interval estimators that happen to cover the true value of the parameter without apparent loss of accuracy. However, this changes as the sample size increases. We can see in Fig 2(f) that when n = 160 the coverage rates of the interval estimators of α₂ under the ι_a and ι_p methods are close to the nominal 95% level, while those of the ι_naive and ι_rc methods remain constant across sample sizes and close to 90%.

For the ι_a and ι_p methods, the largest average bias of $\hat{\alpha}_2$ is close to −0.1, while α₂ = 2.4. Thus, the chance that their interval estimates cover the true value of the parameter depends only on how well the normal distribution approximates the distribution of the point estimator. For the ι_a estimators this approximation seems to improve as the sample size increases, since the coverage rate of 84.68% mentioned above, for n = 80 and ϕ = 16.4, increases and comes to remain close to the nominal level, in this case 95%. This behavior can be seen in Fig 2(f).
From this figure, we confirm what was discussed in the last paragraph. The interval estimators of the ι_a and ι_p methods are the ones presenting coverage rates closest to the nominal 95% level, taking into account all parameters of the model. Most importantly, these rates get closer to the nominal level as the sample size increases. For the nonlinear beta regression with measurement error and constant precision, we conclude that the best performance in interval estimation belongs to the approximate pseudo maximum likelihood method.

Table 3 presents the RMSE (root mean square error) of the estimators for the second and third scenarios. We begin by considering the second scenario. We observe that, in practically all situations, the RMSE decreases when the sample size increases. The exceptions lie in the estimation of γ₁ of the precision submodel when we use the naive model and regression calibration under substantial measurement error, i.e., $k_x = 0.75$ and $k_x = 0.50$. For example, when using the naive model with $k_x = 0.50$, we obtained RMSE$(\hat{\gamma}_1) = (0.6956, 0.7475, 0.7660)$ for n = (40, 80, 160), respectively.
In fact, for smaller values of $k_x$ it is reasonable that, for at least one estimator (in our case $\hat{\gamma}_1$), the RMSE grows with n under the naive model. This is because, for small values of $k_x$, the naive model is a poor approximation of the true model, and larger values of n carry more information about the data. Therefore, larger values of n will reveal that the approximation is not good, which translates into a larger RMSE for at least one estimator. Also, since regression calibration does not work with the full information about the data distribution, but only with moments, it is expected that this intrinsic loss of information will be felt in the estimation process when more data are available. In any case, the poor performance of the naive model is a clear indication of the practical necessity of the model proposed here.
We turn now to the parameter β₁, the coefficient of the covariate measured with error. Estimation by approximate maximum likelihood (ι_a) and approximate pseudo maximum likelihood (ι_p) presents better performance. This is expected, since these methods use the full information about the data distribution. We then investigate whether regression calibration has comparable performance, and we can see it does not. For example, in the first of these scenarios, for $k_x = 0.50$ and n = 40, we have RMSE$(\hat{\lambda}_1) = 7.549$ for the regression calibration estimator, while the RMSEs of the other estimators are typically smaller than 1.
In general, with respect to the complete estimation of the model, when we have large measurement errors ($k_x = 0.50$) and the sample size increases, the performances of the approximate pseudo maximum likelihood and the approximate maximum likelihood methods are comparable. Therefore, the pseudo maximum likelihood method is recommended here.
From the simulation results in Table 3, we can compare, for different reliability coefficients, the RMSE performances of the different estimators in the beta regression model with measurement errors in one covariate and nonlinearity in one parameter in both the mean and precision submodels, i.e., Scenario 3. For the coefficient β₁ of the covariate with measurement error, the naive model presents the worst performance, in particular when the reliability coefficient is $k_x = 0.50$. For example, for n = 80 and $k_x = 0.50$, the RMSEs of the ι_naive, ι_rc, ι_a and ι_p estimators of β₁ are, respectively, 0.8355, 0.4412, 0.3088 and 0.3032. Thus, the approximate maximum likelihood and approximate pseudo maximum likelihood estimators present much better performances. The much worse performance of the naive estimator again indicates the usefulness of our proposed model. Also, we observe that approximate maximum likelihood and approximate pseudo maximum likelihood are comparable, which means that pseudo-likelihood can be a useful tool in our specific proposal.

Figs 3 and 4 present plots of the biases of the estimators of the parameters in Scenario 2, for n = 40, 80, 160. It is possible to see that the naive structure and regression calibration perform poorly.

Table 3. Root mean square error (RMSE) for the estimators of α₁, α₂, β₁, γ₁, γ₂ and λ₁ for the nonlinear beta regression model with nonconstant precision. Scenario 2 and Scenario 3.
It is remarkable how the bias of $\hat{\alpha}_2$ (related to the nonlinearity) increases with the measurement error under the regression calibration method (Fig 3(d)-3(f)). The same behavior can be observed for the precision submodel parameters, in particular for $\hat{\lambda}_1$, which estimates the coefficient of the covariate with measurement error. In this case, the biases of the regression calibration estimators increase considerably for large sample sizes and measurement errors (Fig 4(d)-4(f)). In fact, only approximate maximum likelihood and approximate pseudo maximum likelihood produce estimators with biases tending to zero as the sample size increases.
Figs 5 and 6 present plots of the biases of the estimators of the parameters in Scenario 3, namely nonlinear predictors with measurement error in both submodels. Approximate maximum likelihood (ι_a) and approximate pseudo maximum likelihood (ι_p) present much better performances than the naive and regression calibration (ι_rc) methods for $\hat{\beta}_1$, the estimator of the coefficient of the covariate with measurement error in the mean submodel (Fig 5(g)-5(i)). Estimators of the parameters related to the nonlinearity are more biased but, even so, the performances of the ι_a and ι_p methods are slightly superior to those of ι_naive and ι_rc, in particular for larger values of the measurement error (Fig 5(f)). We can also check that, for the parameters of the precision submodel, the maximum likelihood based methods are clearly superior (Fig 6). In fact, once more, only the ι_a and ι_p methods yield estimators whose biases decrease when the sample size increases.

Fig 5. Biases of the estimators of α₁, α₂ and β₁ for $k_x$ = 0.95, 0.75, 0.50. ι_a (square), ι_p (circle), ι_rc (triangle) and ι_naive (star). Scenario 3.

Fig 6. Biases of the estimators of γ₁, γ₂ and λ₁ for $k_x$ = 0.95, 0.75, 0.50. ι_a (square), ι_p (circle), ι_rc (triangle) and ι_naive (star). Scenario 3.

It is still important to stress how inference based on interval estimation of β₁ can be affected when we do not consider that the covariate is measured with error, that is, when we use the naive method. This is all the more remarkable given that this is perhaps the parameter of greatest interest, the coefficient of the covariate in the mean submodel. Fig 7(g)-7(i) clearly shows how much we lose with the naive estimation, even when the measurement error is small ($k_x = 0.95$), since the coverage rates for the naive estimation become considerably distant from the nominal 95% level. This is particularly true for large sample sizes (Fig 7(g)). The sample size n = 160 provides enough information to detect that we may be dealing with a wrong asymptotic normal distribution for the parameters or, perhaps, even with inconsistent estimators, due to the wrong assumption that the covariate has no measurement error. This behavior tends to become more severe as the measurement error increases, as expected (Fig 7(h) and 7(i)).

Fig 7. Coverage rates of the nominal 95% confidence intervals for the parameters α₁, α₂ and β₁. ι_a (square), ι_p (circle), ι_rc (triangle) and ι_naive (star). Scenario 3.

Fig 8. Coverage rates of the nominal 95% confidence intervals for the parameters γ₁, γ₂ and λ₁. ι_a (square), ι_p (circle), ι_rc (triangle) and ι_naive (star). Scenario 3. n = 40, 80, 160.

Fig 8 presents the interval estimation results for the parameters of the precision submodel. Here we emphasize, in particular, the very good performance of approximate pseudo maximum likelihood estimation. Considering all parameters, measurement errors and sample sizes, this seems to be the recommended method. This is an important aspect, since good estimation of the precision parameters is directly linked to the efficiency of the estimators and the robustness of hypothesis tests about the model parameters.

An application: Fluid Catalytic Cracking (FCC)
Fluid Catalytic Cracking (FCC) is an important chemical process used in petroleum refineries to convert heavy hydrocarbons into smaller molecules of high commercial value. This is accomplished by contact of those hydrocarbons with a catalyst whose main component is a mineral known as zeolite Y [16]. A chemical element that can be found in FCC is vanadium, which takes part in the process by reducing, among other characteristics, the crystallinity of zeolite Y, in particular when water steam is present. The chemical reaction also depends on the temperature, which must be near 720˚C (Salazar, 2005). Hence, at the end of the process, it is important to study how the crystallinity fraction of zeolite Y is influenced by different concentrations of vanadium, by water steam and by temperature. The data set consists of n = 28 observations. The authors report that the response is moderately concentrated in the upper part of the unit interval; according to them, 75% of the observations are not smaller than 0.77. This characteristic will be considered an important factor in our modeling. [5] modeled the response, considering the beta and simplex distributions, using a nonlinear predictor for the mean but without measurement errors in the covariates. However, following [16], the concentration of vanadium for each observation was determined by a spectrophotometric method, with the aid of a chemical complex including the element and using a calibration curve. This calibration curve was defined with the same spectrophotometric method, using another complex that includes vanadium and a different reagent. Therefore, the concentration of vanadium here can be considered a covariate obtained with measurement error. Extending the nonlinear model in [5], we propose the nonlinear beta regression model with measurement errors given in (13), where $x_{1t}$ represents the vanadium concentration, $z_{1t}$ denotes water steam and $z_{2t}$ is a categorical variable indicating at which of the two temperatures the experiment was run (0 = 700˚C and 1 = 760˚C). Since $x_{1t}$ is a covariate with measurement error, we first need to estimate the nuisance parameters $\mu_x$, $\sigma_x^2$ and $\sigma_e^2$. Thus, we fit a simple linear regression for the relation between the original $x$ and observed $w$ chemical complexes, as $w_t = \tau_1 + \tau_2 x_t + e_t$, with $t = 1, \ldots, 15$ and $e_t \sim N(0, \sigma_e^2)$. The obtained estimates were $\hat{\tau}_1 = 0.181$ (0.123), $\hat{\tau}_2 = 0.713$ (0.089) and $\hat{\sigma}_e^2 = 0.03810$, with $R^2 = 0.98$. Also, we have $\hat{\sigma}_x^2 = 0.68$, supposing $x_t \sim N(\mu_x, \sigma_x^2)$. This yields an estimate of the reliability coefficient $k_x = \sigma_x^2/(\sigma_x^2 + \sigma_e^2)$ of $\hat{k}_x = 0.9436$, a low measurement error.
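A sketch of this nuisance-parameter estimation step, assuming the 15 calibration pairs are available as arrays `x_cal` and `w_cal` (hypothetical names); the numerical values quoted above are those reported in the paper:

```python
import numpy as np

def calibration_fit(x_cal, w_cal):
    """OLS fit of w_t = tau1 + tau2 * x_t + e_t from calibration pairs,
    returning (tau1_hat, tau2_hat, sigma2_e_hat)."""
    X = np.column_stack([np.ones_like(x_cal), x_cal])
    coef, *_ = np.linalg.lstsq(X, w_cal, rcond=None)
    resid = w_cal - X @ coef
    sigma2_e = resid @ resid / (len(w_cal) - 2)   # residual variance estimate
    return coef[0], coef[1], sigma2_e

# With sigma2_e_hat and sigma2_x_hat in hand, the reliability coefficient
# follows as k_x_hat = sigma2_x_hat / (sigma2_x_hat + sigma2_e_hat).
```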
We have tried two different link functions: the logit and the complementary log-log. According to [17], when the mean of the response variable is near 1, the complementary log-log link function for the mean softens the impact of influential points on maximum likelihood estimation. In this sense, we consider both the logit and the complementary log-log link functions and check the performance of each modeling.
Tables (4) and (5) present the lower and upper limits of the estimated confidence intervals for the parameters of Model (13), obtained at confidence levels of 90%, 95% and 99%, considering the logit and complementary log-log link functions, respectively. We also present the lengths of the interval estimates. From both tables, it can be seen that the only interval estimate containing zero is that for α₂ at the nominal level of 99%, using the method of regression calibration with the logit model for the mean. Note that by accepting the hypothesis that α₂ = 0, we eliminate the term $\alpha_2 z_{1t}/(z_{1t} + \alpha_3)$ from the model and, consequently, not only the covariate $z_{1t}$ but also the nonlinearity. Also, observe that the length 0.245 of this confidence interval is quite large, more than twice the length of 0.0936 of the confidence interval for the same parameter and estimation method using the complementary log-log link function.
In fact, it is remarkable how the complementary log-log model presents better interval estimation, with confidence intervals having much smaller lengths than the corresponding ones for the logit model in all cases. For example, for α₃, the coefficient related to temperature, when the nominal confidence level is 95% and we use approximate pseudo maximum likelihood, the lengths of the interval estimates are 0.4220 and 0.2099 for the logit and complementary log-log models, respectively. Comparing the estimation methods, it is important to observe that approximate pseudo maximum likelihood presents much better performance than the naive model, which does not take measurement error into account, even for a very low measurement error, that is, $\hat{k}_x \approx 0.95$. This shows how important it can be to consider measurement errors when estimating nonlinear beta regression models.
From all the results presented here, we select for this particular data set the nonlinear beta regression model in which the covariate $x_{1t}$, the vanadium concentration, has measurement error, with the regression structure defined in (13), $t = 1, \ldots, 28$. Inference for the measurement error in $x_{1t}$ produces $\hat{\sigma}_e^2 = 0.03810$, $\hat{\sigma}_x^2 = 0.68$ and $\hat{k}_x \approx 0.95$, and estimation based on approximate pseudo maximum likelihood yields the parameter estimates 0.910, −0.049, −26.451, −0.182, −0.292 and 4.405 for the mean submodel coefficients ($\hat{\alpha}_1$, $\hat{\alpha}_2$, $\hat{\alpha}_3$, $\hat{\beta}$'s) and the precision coefficient $\hat{\gamma}_1 = 4.405$, respectively, with standard errors 0.063, 0.011, 1.891, 0.054, 0.050 and 0.286, all parameters being significant at the 1% level.
Finally, note that $\hat{\phi} = \exp(\hat{\gamma}_1) \approx 82$, which for beta regression can be considered a moderate precision, not too low but far from being considered high. That is, the application shows similarities with Scenario 1 of the simulations for ϕ = 55 and $k_x = 0.95$. The results of the application thus confirm those of the simulations, in which, among the methods evaluated, the approximate pseudo maximum likelihood method presents the best performance regarding interval estimation.

Conclusions
We have proposed in this work a beta regression model with parameter nonlinearity and covariates measured with error in the mean and precision submodels. The log-likelihood for this kind of model is written in terms of integrals without analytical solutions. For this reason, we proposed approximating those integrals using Gaussian quadrature, yielding the estimators referred to in this paper as the approximate maximum likelihood estimators. Because the approximate likelihood can be a very difficult function to maximize, we also considered an approximate pseudo maximum likelihood, in which we first estimate the nuisance parameters. The advantage of this two-phase method is to reduce the dimension of the problem, which makes the maximization easier. We also tried a regression calibration method, in which the non-observed variable is replaced by its conditional expectation in the likelihood function.
Numerical simulations have shown that the approximate pseudo maximum likelihood proposal is a very good alternative for estimating the model parameters, competing with the approximate maximum likelihood estimators. On the other hand, regression calibration is not a good proposal, which suggests that too much information about the observations is discarded when this method is used. Therefore, regression calibration is not recommended here, whereas we strongly recommend approximate pseudo maximum likelihood.
The application has also shown that approximate pseudo maximum likelihood presents much better performance than the naive model, which does not take measurement error into account, even for a very low measurement error. This shows how important it can be to consider measurement errors when estimating nonlinear beta regression models. Furthermore, noteworthy are the superior performances related to the use of the complementary log-log link function in a sample whose responses are close to the upper limit of the unit interval.
Supporting information
S1 File. As supplementary material we provide two simulation programs, the application program, the data set, the file of quadrature points, and a description of the data set. (ZIP)