Application of an Integrated Statistical Design for Optimization of Culture Condition for Ammonium Removal by Nitrosomonas europaea

Statistical methodology was applied to optimize ammonium oxidation by Nitrosomonas europaea with respect to biomass concentration (C_B), nitrite yield (Y_N) and ammonium removal (R_A). An initial screening by Plackett-Burman design was performed to select the major variables out of nineteen factors, among which NH4Cl concentration (C_N), trace element solution (TES), agitation speed (AS), and fermentation time (T) were found to have significant effects. The path of steepest ascent and response surface methodology were then applied to optimize the levels of the selected factors. Finally, multi-objective optimization was used to obtain an optimal condition as a compromise among the three objectives, by combining the weighted coefficient method with entropy measurement methodology. These models enabled us to identify the optimum operating conditions (C_N = 84.1 mM; TES = 0.74 ml; AS = 100 rpm and T = 78 h), under which C_B = 3.386×10^8 cells/ml, Y_N = 1.98 mg/mg and R_A = 97.76% were simultaneously obtained. The optimized conditions were shown to be feasible through verification tests.


Introduction
Ammonium is a major pollutant in wastewater. It can be toxic to aquatic life, cause eutrophication in receiving waters, and reduce chlorine disinfection efficiency [1]. Ammonium removal from wastewater has traditionally relied on biological technology, which is more effective and relatively inexpensive [2]. Biological ammonium removal takes place through nitrification/denitrification, a key process in the global nitrogen cycle [3]. Among the microorganisms known to degrade ammonium, Nitrosomonas europaea, the model chemolithoautotrophic ammonium-oxidizing bacterium (AOB), obtains energy from the aerobic oxidation of ammonia (NH3) to nitrite (NO2−) [4]. Carbon dioxide (CO2) is the preferred assimilative carbon source, fixed via the Calvin-Benson-Bassham cycle for the growth and maintenance of N. europaea [5]. A significant part of the energy obtained from the oxidation of hydroxylamine has to be invested in the oxidation of ammonia to hydroxylamine, in reverse electron transport to generate nicotinamide adenine dinucleotide (NADH), and in CO2 fixation. As a consequence, the growth rate and yield of N. europaea are relatively low [6].
To successfully apply this bacterium to the biological removal of ammonium, large quantities of biomass need to be produced. The growth of N. europaea and its ammonium oxidation are strongly affected by various nutritional and environmental conditions [7,8]. For instance, according to Yu and Chandran [9], the physiological growth state of N. europaea and its substrate utilization efficiency are influenced by NH4+, dissolved oxygen and NO2−. Meanwhile, the nitrous acid produced by the aerobic oxidation of ammonia continuously lowers the pH, so regular pH adjustment is required to neutralize it in the culture [10]. Previous improvements have been achieved by manipulating nutritional and physical parameters such as temperature [11], amino acids [12], and oxygen concentration [13]. The range of carbon sources capable of supporting growth has also been extended to include pyruvate [14] and fructose [15]. Moreover, metal ions can reprogram the cellular metabolic network during fermentation: the addition of Mg2+ [16], Cu2+ and Fe3+ [5] has been found to increase ammonia oxidation activity.
Optimizing culture conditions by a one-dimensional search (one factor at a time) is laborious and time consuming, especially for a large number of variables, and it does not guarantee that the desirable conditions are found [17]. Statistical experimental design techniques are useful tools for developing, improving, evaluating and optimizing biochemical and biotechnological processes on the basis of only a small number of experiments. In this paper, a novel integrated statistical design incorporating Plackett-Burman design, the path of steepest ascent, response surface methodology (RSM) and multi-objective optimization is presented; it will hopefully provide a valuable approach for optimizing biological ammonium removal technology in wastewater treatment.

Materials and Methods

Microorganism and Preparation
N. europaea (ATCC 19718) suspensions were prepared by transferring a fresh −80°C frozen cell stock to minimal growth medium (ATCC medium 2265) containing 45 mM NH4Cl. The pH was adjusted to 8.0±0.1 via periodic addition of sterile 10 N NaOH. The cultures were grown in the dark at 26°C with shaking at 100 rpm for 72 h. Cells were harvested by centrifugation (9 000 rpm, 30 min, 4°C) and washed twice with mineral medium without ammonium. The cell pellet was re-suspended in 40 mM KH2PO4 buffer (pH 7.8) at a concentration of 5×10^9 cells/ml with an average viability of 98%. Purity of the culture was checked by periodically plating onto Luria broth agar plates.

Growth of Bacterium
Batch experiments were conducted in 250 ml Erlenmeyer flasks containing 100 ml of liquid medium. The composition of the medium used in the PBD experiments was as described in [9]. The initial pH was adjusted to 7.8 using 10 N NaOH. Phenol red (0.0003% final concentration) was used as a pH indicator as described elsewhere [18]. All flasks were heat-sterilized by autoclaving at 121°C and 103 kPa for 15 min before inoculation, and were then incubated in a shaking incubator in the dark. Cell concentrations at the beginning of each experiment were measured and normalized to ensure consistency among experiments. The pH of the culture was readjusted daily to pH 6.8-7.4 by adding sterile 1 M NH4HCO3 or 1 M NaHCO3. All batch reactor conditions were run in triplicate. Each data point was expressed as an average with an error bar (i.e. the standard deviation of three independent samples).

Table 3. Central Composite Design (CCD) matrix, experimental data and predicted values by the response surface analysis.

Analytical Methods
Cell counts were performed by light microscopy using a Helber chamber (standard deviation [SD], 5%) [19]. Biomass concentration (C_B, cells/ml) was calculated as the number of cells per volume of fermentation mash. Ammonium and nitrite were determined spectrophotometrically according to the standard method [20]. The cellular protein content was quantified with the Micro BCA Protein Assay Kit (Rockford, Illinois) according to the manufacturer's instructions after cells were collected by centrifugation at 1 000 × g for 10 min, rinsed with MilliQ water to remove salt, and digested in 3 N NaOH at 65°C for 30 min. Nitrite yield (Y_N, mg/mg) was calculated as the ratio of nitrite to total cellular protein according to Shu et al. [21]. Ammonium removal (R_A, %) was determined as (ammonium_initial − ammonium_end)/ammonium_total.
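The two derived responses reduce to simple ratios. A minimal Python sketch follows, assuming that the total ammonium is supplied by the user (the text does not say whether it equals the initial charge or also includes ammonium added during pH adjustment). The function names and example values are hypothetical.

```python
def ammonium_removal(initial, end, total):
    """R_A in percent: (initial - end) / total * 100. All inputs share one unit."""
    if total <= 0:
        raise ValueError("total ammonium must be positive")
    return (initial - end) / total * 100.0


def nitrite_yield(nitrite_mg, protein_mg):
    """Y_N in mg nitrite per mg total cellular protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return nitrite_mg / protein_mg


# Hypothetical measurements, for illustration only (not data from this study).
r_a = ammonium_removal(initial=50.0, end=5.0, total=50.0)
y_n = nitrite_yield(nitrite_mg=3.0, protein_mg=2.0)
```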

Experimental Designs
4.1. Screening of significant factors. Plackett-Burman design (PBD) is a highly effective technique for screening the critical factors that significantly influence a process and eliminating the insignificant ones from a large number of candidate factors [22]. The technique assumes that interactions among the factors are much smaller than the important main effects. It is a fraction of a two-level factorial design (+1 or −1) that requires fewer runs than a comparable fractional design and allows the investigation of n−1 factors in at least n experiments.
Here, 19 independent factors, with initial values determined by preliminary experiments and literature review, are shown in Table 1. The design matrix, created with the statistical software package Minitab 16.0 (Minitab Inc., USA), is also given in Table 1. C_B, Y_N and R_A were measured in triplicate and the averages were taken as the responses.
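For illustration, a 20-run, 19-factor two-level screening design of the Plackett-Burman type can be built with the Paley construction, and each main effect estimated as the difference between the mean response at +1 and at −1. This is a Python sketch, not the Minitab matrix of Table 1: the run order and sign pattern may differ, and the function names are hypothetical.

```python
def legendre(a, q):
    """+1 if a is a nonzero square mod prime q, -1 if a non-square, 0 if a = 0."""
    a %= q
    if a == 0:
        return 0
    squares = {(x * x) % q for x in range(1, q)}
    return 1 if a in squares else -1


def plackett_burman_paley(q=19):
    """(q+1)-run, q-factor orthogonal two-level design; q prime with q % 4 == 3."""
    if q % 4 != 3:
        raise ValueError("this construction needs a prime q with q % 4 == 3")
    # Cyclic block: +1 on the diagonal, quadratic character elsewhere.
    rows = [[1 if i == j else legendre(i - j, q) for j in range(q)]
            for i in range(q)]
    rows.append([-1] * q)  # closing run with every factor at its low level
    return rows


def main_effects(design, y):
    """Main effect of each factor: mean(y at +1) - mean(y at -1)."""
    n = len(design)
    k = len(design[0])
    return [2.0 / n * sum(design[r][j] * y[r] for r in range(n))
            for j in range(k)]


design = plackett_burman_paley(19)
```

Because the columns are balanced and mutually orthogonal, each main effect is estimated independently of the others, which is what makes screening 19 factors in 20 runs possible.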
4.2. Steepest ascent method. To approach the optimal range of the selected factors, the steepest ascent method was used to move rapidly to the general vicinity of the optimum via experimentation [23]. This method constructs a path through the center of the design based on the coefficients of the PBD functions [24].
In this study, experiments for each response were performed along the path of steepest ascent with defined intervals, which were determined by the estimated coefficients and practical experience. The design and experimental results obtained are shown in Table 2. Once the path of ascent no longer led to an increase, the point would be near the optimal point and could be used as the center point for subsequent optimization.
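The construction of such a path can be sketched in Python as follows. Each factor moves in proportion to its first-order coefficient, with one base factor given a fixed coded step. All coefficients, centers and ranges below are hypothetical placeholders, not values from Table 2 or Table 4, and the function name is an assumption.

```python
def steepest_ascent_path(coefs, centers, half_ranges, base, base_step, n_steps):
    """Return n_steps points (natural units) along the path of steepest ascent.

    coefs:       first-order coefficients in coded units, per factor
    centers:     natural value of each factor at coded level 0
    half_ranges: natural change corresponding to one coded unit
    base:        factor whose coded move per step has magnitude base_step
    """
    scale = abs(coefs[base])
    if scale == 0:
        raise ValueError("base factor must have a nonzero coefficient")
    path = []
    for step in range(1, n_steps + 1):
        point = {}
        for name, b in coefs.items():
            coded = step * base_step * b / scale  # sign follows the coefficient
            point[name] = centers[name] + coded * half_ranges[name]
        path.append(point)
    return path


# Hypothetical inputs, for illustration only.
path = steepest_ascent_path(
    coefs={"C_N": 4.0, "TES": 2.0, "AS": -1.0},
    centers={"C_N": 40.0, "TES": 1.0, "AS": 120.0},
    half_ranges={"C_N": 10.0, "TES": 0.2, "AS": 8.0},
    base="C_N",
    base_step=1.0,
    n_steps=4,
)
```

In practice the computed steps are usually rounded to experimentally convenient values, which is consistent with the intervals here being set partly by practical experience.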

4.3. Optimization of significant variables using CCD. Response-surface methodology (RSM), which combines factorial design and regression analysis, helps in understanding the interactions among factors at varying levels and in selecting the optimum conditions for the desired response [25]. This method has been widely used for the optimization of various processes in biotechnology.
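A central composite design of the kind used in RSM combines a two-level factorial core, axial points at ±α, and replicated center points. The Python sketch below builds one in coded units; for four factors the rotatable choice α = (2^4)^(1/4) = 2 yields the five levels −2, −1, 0, +1, +2. The number of center points (n_center=6) and the function name are assumptions, not values taken from Table 3.

```python
from itertools import product


def central_composite_design(k=4, alpha=None, n_center=6):
    """Coded CCD runs: 2^k factorial points, 2k axial points, n_center centers."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable design
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for j in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[j] = level
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


ccd = central_composite_design(k=4)
levels = sorted({round(v, 6) for run in ccd for v in run})
```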
In this study, the four selected independent factors were studied at five coded levels (−2, −1, 0, +1, +2); the design matrix is given in Table 3.
4.4. Multi-objective optimization. Multi-objective optimization (MOO) was applied once the second-order polynomial function for each objective had been determined. The entropy measurement technique (EMT), which evaluates the individual response weights on the basis of the entropy of the entire process [26], was combined with the weighted coefficient methodology (WCM), which forms a scalar objective function for solving the MOO problem [27], so that C_B, Y_N and R_A could be optimized simultaneously.
In the EMT, the original sequence of each of the three responses is first normalized to a value r_pn* of the higher-the-better type, according to Datta [26]. After the entropy of each response (h_n) has been calculated from the distinguishing coefficient j_ij and the normalizing coefficient k, the weight coefficient (w_n) of each response is obtained.
The weight of each response must be a real number with w_n > 0 for all responses. The three second-order polynomial functions were normalized and aggregated. The Pareto optimal set for the problem is generated by systematically varying the weighting parameters of the objective functions. An overall maximum strategy was obtained by solving the aggregated second-order polynomial function.
After the weight coefficient of each objective had been calculated, the relational coefficients (C_n) were computed using Eq. 1. The higher the relational coefficient, the closer the corresponding factor combination is to the optimum.
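The weighted-coefficient step can be illustrated with a small Python sketch: each objective is scaled to [0, 1] over the candidate region, the scaled objectives are combined with the weights, and the combined score is maximized. The grid search, the two toy objective functions and all names are assumptions made for illustration; they are not the fitted models of this study, which has three objectives and was solved analytically from the aggregated polynomial.

```python
from itertools import product


def normalize(f, points):
    """Rescale objective f to the range [0, 1] over the candidate points."""
    values = [f(x) for x in points]
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("objective is constant over the candidate points")
    return lambda x: (f(x) - lo) / (hi - lo)


def weighted_sum_optimum(objectives, weights, points):
    """Candidate point maximizing the weighted sum of normalized objectives."""
    scaled = [normalize(f, points) for f in objectives]

    def score(x):
        return sum(w * g(x) for w, g in zip(weights, scaled))

    return max(points, key=score)


# Coded grid from -2 to +2 in steps of 0.1 for two hypothetical factors.
grid = [(i / 10, j / 10) for i, j in product(range(-20, 21), repeat=2)]


# Two hypothetical concave responses with different individual optima.
def f1(x):
    return -(x[0] - 1.0) ** 2 - x[1] ** 2


def f2(x):
    return -(x[0] + 1.0) ** 2 - x[1] ** 2


compromise = weighted_sum_optimum([f1, f2], [0.5, 0.5], grid)
only_f1 = weighted_sum_optimum([f1, f2], [1.0, 0.0], grid)
```

Sweeping the weights between these extremes traces out compromise solutions between the individual optima, which is the sense in which varying the weighting parameters generates a Pareto set.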

Experimental Strategy
In this paper, the culture condition of N. europaea leading to maximum biomass concentration (C_B), nitrite yield (Y_N) and ammonium removal (R_A) was determined by a highly efficient integrated statistical design. The design comprised the following steps: (1) apply the Plackett-Burman Design (PBD) to identify the significant factors among 19 candidate factors; (2) use the path of steepest ascent to move rapidly to the general vicinity of the optimum via experimentation; (3) employ response surface methodology (RSM) to evaluate the significant factors and obtain the optimal condition for each response; (4) combine the weighted coefficient method with entropy measurement methodology to obtain the multi-objective optimization (MOO) condition for all three objectives; (5) carry out confirmatory experiments under the optimized conditions to check the models.

Statistical Analysis
Variation due to model inadequacy was evaluated by the lack-of-fit (LOF) test. Analysis of variance (ANOVA), carried out with Fisher's statistical test (F-test), was employed to determine the significance of the models. Furthermore, the quality of each model was evaluated by the coefficient of determination R^2. The significance of the regression coefficients was tested with Student's t-test.
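The lack-of-fit statistic can be sketched in Python as follows. The residual sum of squares is split into pure error (scatter among replicate runs at the same settings) and lack of fit, and their mean squares are compared. The data are hypothetical, the function name is an assumption, and the p-value lookup from the F distribution is left out because it needs a statistics package.

```python
from collections import defaultdict


def lack_of_fit_f(points, observed, predicted, n_params):
    """Return (F, df_lof, df_pe) for a fitted model with n_params coefficients."""
    groups = defaultdict(list)
    for x, y in zip(points, observed):
        groups[tuple(x)].append(y)

    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    ss_pe = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        ss_pe += sum((y - mean) ** 2 for y in ys)

    df_pe = len(observed) - len(groups)  # replicates beyond one per setting
    df_lof = len(groups) - n_params      # distinct settings beyond parameters
    if df_pe <= 0 or df_lof <= 0:
        raise ValueError("need replicates and more settings than parameters")
    ss_lof = ss_res - ss_pe
    return (ss_lof / df_lof) / (ss_pe / df_pe), df_lof, df_pe


# Hypothetical duplicate runs at three settings, fitted by y = 1.1 + 1.0 * x.
xs = [(0,), (0,), (1,), (1,), (2,), (2,)]
ys = [0.9, 1.1, 2.2, 2.4, 2.9, 3.1]
fit = [1.1, 1.1, 2.1, 2.1, 3.1, 3.1]
f_value, df_lof, df_pe = lack_of_fit_f(xs, ys, fit, n_params=2)
```

A large F relative to the F distribution with (df_lof, df_pe) degrees of freedom gives a small p-value and signals that the chosen model form is inadequate.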

Screening of Significant Variables using Plackett-Burman Design
In the first step of this study, the influences of 19 independent factors on C_B, Y_N and R_A were investigated using PBD. The experimental data for each response in Table 1 were fitted with first-order models, which are shown in Table 4. The coefficients of determination (R^2) of the models for C_B (0.9352), Y_N (0.9340) and R_A (0.9559) indicated that the models explained the variability of the data very well. Probability (P) values were used to check the significance of the coefficients, which is necessary for understanding the pattern of mutual interactions among the test factors. A smaller probability value means a more significant coefficient. The significance of the regression coefficients was tested at the 95% confidence level: p ≤ 0.05 was considered statistically significant, p ≤ 0.01 indicated a higher level of significance, and p ≤ 0.0001 was considered very significant.

Path of Steepest Ascent
Based on the aforementioned linear model equations (Table 1), the path of steepest ascent was determined in order to find the proper direction in which to change the variables, according to the signs of the main effects, so as to improve C_B, Y_N and R_A. The path of steepest ascent therefore started from the center of the PBD and moved along the direction that altered the levels of TES, C_N and AS, with the other factors fixed at their zero level.
The experimental design and corresponding results are shown in Table 2. The highest response was reached in experiment 4, when C_N, TES and AS were set to 80.0 mM, 0.80 ml and 104 rpm, respectively, which suggested that these levels of the three factors were near the region of maximum response.

CCD and RSM Analysis
Lack-of-fit (LOF) tests were carried out to derive the best correlation between the independent factors and the responses. If a specific type of function, such as the linear function, the two-factor interaction (2FI) function or the second-order polynomial function, adequately fits the data, the largest portion of the error sum of squares is not due to lack of fit and the LOF p-value should be large (> 0.05). The p-values were 0.0027, 0.0001 and 0.0092 for the linear functions and 0.0018, 0.0001 and 0.0112 for the 2FI functions for C_B, Y_N and R_A, respectively. Both the linear and the 2FI functions thus had small p-values (< 0.05) and should be used only cautiously as response predictors. By applying multiple regression analysis to the experimental data, the second-order polynomial functions were established. Models that include insignificant terms are termed over-fitted and are often unrealistically well fitted. Backward elimination was therefore used to improve the reliability of the RSM models: the model terms whose p-values were larger than 0.05 were omitted to give a better fit. As shown in Table 5, the LOF F-values were 1.92 (C_B), 9.08 (Y_N) and 2.21 (R_A), with LOF p-values of 0.2429 (C_B), 0.0119 (Y_N) and 0.1960 (R_A). The lack of fit was therefore not significant relative to the pure error for C_B and R_A; for Y_N the p-value lies below 0.05, so the lack of fit is not significant only at the 0.01 level.
The results of ANOVA for the second-order polynomial functions are tabulated in Table 5. The model F-values of 28.87 (C_B), 23.21 (Y_N) and 14.60 (R_A) indicated that the models were significant. There was only a 0.01% chance that a model F-value this large could occur due to noise (P < 0.0001), and most of the variation in the responses could be explained by the regression equations. Meanwhile, the fit of each model was examined by the coefficient of determination R^2 (Table 4).
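A second-order polynomial function of this kind has the usual form y = b0 + Σ b_i x_i + Σ b_ij x_i x_j + Σ b_ii x_i^2 in coded variables. The Python sketch below expands a coded point into these terms and evaluates a model; a term dropped by backward elimination simply gets a coefficient of 0. The term ordering and function names are assumptions, and no fitted coefficients from Table 5 are reproduced here.

```python
from itertools import combinations


def quadratic_terms(x):
    """Terms of a full second-order model: 1, x_i, x_i*x_j (i < j), x_i^2."""
    linear = list(x)
    interactions = [a * b for a, b in combinations(x, 2)]
    squares = [a * a for a in x]
    return [1.0] + linear + interactions + squares


def predict(coefficients, x):
    """Evaluate the model at coded point x; coefficients follow the term order."""
    terms = quadratic_terms(x)
    if len(coefficients) != len(terms):
        raise ValueError("expected %d coefficients" % len(terms))
    return sum(c * t for c, t in zip(coefficients, terms))


# Four coded factors (C_N, TES, AS, T) give 1 + 4 + 6 + 4 = 15 terms.
n_terms = len(quadratic_terms([0.0, 0.0, 0.0, 0.0]))
```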
The significance test for each coefficient of the equations showed that the linear term of C_N was very significant for C_B (Table 5). Stein and Arp [28] reported that producing dense cultures of N. europaea with active cells requires an adequate supply of ammonium. Meanwhile, the linear term of agitation speed (AS), which is related to the supply of the carbon source (CO2) to the culture [10], also showed very significant effects on both C_B and R_A (Table 5). The interaction term TES×T showed a very significant effect on Y_N. The quadratic terms, except for T^2 on C_B, TES^2 on Y_N, and TES^2, AS^2 and T^2 on R_A, also showed very significant effects.
The graphical representations of the regression models, namely the response surface plots and their corresponding contour plots, were obtained using Design-Expert software and are presented in Fig. 2-4. The three-dimensional (3D) response surface curves were based on the second-order polynomial functions, with two variables kept constant at their zero levels while the other two varied. The shapes of the contour plots, such as elliptical or saddle, indicated that the mutual interactions between the variables were significant. The interaction terms C_N×AS and T×AS showed significant effects on all three responses. Fig. 2b, Fig. 3a and Fig. 4b show the 3D response surface curves for the combined effect of C_N and AS on C_B, Y_N and R_A, respectively. They revealed that Y_N was minimal at both low and high levels of C_N and AS. Meanwhile, both C_B and R_A increased with increasing C_N and decreasing AS. The effects of AS and T on C_B, Y_N and R_A at fixed C_N and TES levels are shown in Fig. 2d, Fig. 3d and Fig. 4d, respectively. The surface and the curvature of the contours at the bottom indicated that Y_N was minimal at both low and high levels of AS and T. Meanwhile, increasing T drastically increased both C_B and R_A, while increasing AS drastically decreased both responses.
The optimal conditions for maximum C_B, Y_N and R_A were extracted with the Design-Expert software through graphical model optimization and are listed in Table 6. The maximum C_B, Y_N and R_A were predicted to be 3.919×10^8 cells/ml, 2.26 mg/mg and 97.86%, respectively.

Multi-objective Optimization
Through graphical model optimization, the maxima of C_B, Y_N and R_A were achieved under different optimal conditions. A multi-objective strategy is therefore needed in which all three requirements are met simultaneously, that is, a compromise among the conditions for the three responses. Multi-objective optimization is used to achieve this goal. The weighted coefficient method has been applied successfully to the optimization of multiple-response processes [27].
Both C_B and R_A are significant indexes of the ammonium removal process, and both should be taken into account when the overall process is evaluated. In addition, nitrite is the product of this process, so Y_N should also be considered as an important index. Because the three responses differ in importance, their weights have to be determined. In this study, entropy measurement methodology was used to calculate the weight coefficients. After the experimental data in Table 3 were normalized, the entropy values calculated for C_B, Y_N and R_A were 0.912, 0.938 and 0.960, respectively. The resulting weights of C_B, Y_N and R_A are W = [0.463, 0.328, 0.209].
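The reported weights are consistent with the usual entropy-weight rule w_n = (1 − h_n)/Σ(1 − h_m). The Python sketch below applies that rule to the three entropies quoted above. Both the rule and the textbook Shannon-entropy helper are assumptions, since the exact formulation of [26] is not reproduced in the text. With the rounded entropies the rule gives weights of roughly 0.463, 0.326 and 0.211, within about 0.002 of the reported values, a gap that rounding of the entropies could explain.

```python
import math


def shannon_entropy(shares):
    """Normalized entropy h = -(1/ln m) * sum(p * ln p) of m shares summing to 1."""
    m = len(shares)
    return -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)


def entropy_weights(entropies):
    """Weights proportional to the divergence 1 - h_n, normalized to sum to 1."""
    divergence = [1.0 - h for h in entropies]
    total = sum(divergence)
    if total <= 0:
        raise ValueError("at least one entropy must be below 1")
    return [d / total for d in divergence]


# Entropies reported in the text for C_B, Y_N and R_A.
weights = entropy_weights([0.912, 0.938, 0.960])
```

A response whose normalized values are spread more unevenly across the runs has lower entropy and therefore receives a larger weight, which is why C_B dominates here.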
After the three response functions had been normalized and linearly combined using the obtained weight coefficients, a MOO second-order polynomial response function was obtained. By solving this function, values of C_B (3.492×10^8 cells/ml), Y_N (1.92 mg/mg) and R_A (97.86%) were predicted to be obtained simultaneously. The optimized culture condition and its predicted response values are listed in Table 6. The relational coefficient (C_n) of each strategy was also calculated using Eq. 1. The C_n of 0.613 indicated that the MOO strategy performed best among the four strategies.

Model Validation
To confirm the adequacy of the model equations, confirmatory experiments under the optimized condition were carried out. All the confirmatory experiments were conducted in triplicate and the values predicted by the optimization model were set as controls.
In the single-objective maximum strategy, the measured C_B of (3.886±0.072)×10^8 cells/ml reached 99.2% of the value predicted by the software (3.919×10^8 cells/ml). The Y_N obtained under the optimal culture condition was 2.18±0.08 mg/mg, which reached 96.4% of the predicted value (2.26 mg/mg) under the same condition. The mean value of the maximum R_A was 98.02±0.67%, which also matched well with the predicted value (99.8%). All RSM models were thus successfully built with good validity.
In the MOO strategy, all the response values measured under the optimal condition simultaneously reached above 95% of the values predicted by the software under the same condition. The integrated statistical design strategies were therefore successfully built with good validity. To the best of our knowledge, this is the first time that such high values of all three objectives have been obtained simultaneously [1,2].

Conclusion
In this study, a four-step design comprising the PBD, the path of steepest ascent, RSM, and WCM coupled with EMT for MOO was used to optimize the culture condition of N. europaea. The method provides an attractive way to simultaneously optimize the effects of four main influential variables (C_N, TES, AS and T) on C_B, Y_N and R_A. Confirmatory experiments demonstrated that such an integrated statistical design is an effective and powerful approach. In summary, the proposed integrated statistical design was useful for optimizing the ammonium removal process and holds promise for wastewater treatment technology.