Experimental Design for Parameter Estimation of Gene Regulatory Networks

doi:10.1371/journal.pone.0040052

Figure 1.

Models of the DREAM6 Estimation of Model Parameters challenge.

Transcription and translation have been combined into a single step. Green arrows represent activating interactions, while black lines indicate repression. Numbers on edges correspond to the numbering of the Hill kinetics used by the organizers of the challenge. The proteins marked by a red star have to be predicted under a perturbed setting to evaluate the accuracy of model predictions. A: Model with 29 kinetic parameters and 8 Hill coefficients/interactions; B: Model with 35 kinetic parameters and 10 Hill coefficients/interactions; C: Model with 49 kinetic parameters and 15 Hill coefficients/interactions.

Figure 2.

Experimental design procedure based on profile likelihood.

The left panel shows the profile likelihood of a practically non-identifiable parameter. To resolve the corresponding uncertainty of the parameter estimate, a set of parameter vectors along its profile is chosen, represented by the dashes on the x-axis. The parameters are subsequently modified according to two designs: a highly informative experiment in the upper branch and a weakly informative experiment in the lower branch. Model predictions are simulated for the chosen parameter vectors, represented by the time courses in the middle column. A wide spread of the simulated time courses indicates an informative experiment. The impact of purchasing data is shown by the blue curves in the rightmost subplots: acquiring informative data narrows the parameter estimate down to a small variance, whereas a poor experiment yields almost no improvement.
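
The spread criterion described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the scoring rule (range of predictions per time point, averaged and scaled by an assumed measurement noise `sigma`), and the toy trajectories are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def spread_score(trajectories: Sequence[Sequence[float]], sigma: float) -> float:
    """Average range of the predictions over all time points, in units of sigma.

    Each trajectory is the simulated time course for one parameter vector
    sampled along the profile likelihood (hypothetical input format).
    """
    n_times = len(trajectories[0])
    total = 0.0
    for j in range(n_times):
        values = [traj[j] for traj in trajectories]
        total += (max(values) - min(values)) / sigma
    return total / n_times


def rank_experiments(candidates: Dict[str, List[List[float]]], sigma: float) -> List[str]:
    """Return candidate experiment names, most informative (widest spread) first."""
    return sorted(
        candidates,
        key=lambda name: spread_score(candidates[name], sigma),
        reverse=True,
    )


# Toy data: three parameter vectors along the profile, three time points each.
candidates = {
    "weak": [[1.0, 1.1, 1.2], [1.0, 1.1, 1.25], [1.0, 1.12, 1.2]],
    "strong": [[1.0, 2.0, 3.0], [1.0, 1.5, 2.0], [1.0, 1.0, 1.0]],
}
ranking = rank_experiments(candidates, sigma=0.1)
```

Here the "strong" candidate ranks first because its simulated time courses fan out, while the "weak" candidate's time courses nearly coincide. Other spread measures (for example a variance-based score) would serve the same purpose.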

Table 1.

Overview of the criteria that were considered for the final decisions.

Figure 3.

Number of practically non-identifiable parameters during the experimental design process.

The elimination of non-identifiabilities was one of the major goals of the applied experimental design strategy. The number of practically non-identifiable parameters, i.e. parameters whose profile likelihood reaches one or both borders of the parameter domain while staying below a given statistical threshold, demonstrates the performance of the strategy (dashed lines). As the parameter domain for the Hill coefficients was restricted to a smaller range, i.e. , some of the underlying true values lay at the boundary of the parameter space. Counting the non-identifiabilities for Hill coefficients can therefore give a wrong impression when the MLE was correctly located at the border of the parameter space. The number of practically non-identifiable parameters excluding the Hill coefficients is plotted by solid lines. Note that the number of non-identifiabilities is calculated from noisy data, so the measurement errors propagate into these values. Therefore, the number can increase at some steps even though it decreases overall. An analogous observation can be made in Fig. 4.
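
The counting rule in this caption can be illustrated with a short Python sketch. It assumes that each profile is available as a list of (parameter value, -2 log profile likelihood) pairs spanning the whole parameter domain, and that the statistical threshold is the 95% quantile of a chi-square distribution with one degree of freedom (about 3.84) above the profile minimum. Both the data layout and the choice of threshold are assumptions for this example, as are all names.

```python
from typing import Dict, Iterable, List, Tuple

Profile = List[Tuple[float, float]]  # (parameter value, -2 log profile likelihood)


def is_practically_non_identifiable(profile: Profile, delta: float = 3.84) -> bool:
    """True if the profile stays below the threshold at one or both domain borders."""
    ordered = sorted(profile)  # sort by parameter value
    threshold = min(value for _, value in ordered) + delta
    at_lower_border = ordered[0][1] < threshold
    at_upper_border = ordered[-1][1] < threshold
    return at_lower_border or at_upper_border


def count_non_identifiable(profiles: Dict[str, Profile], exclude: Iterable[str] = ()) -> int:
    """Count non-identifiable parameters, optionally skipping some (e.g. Hill coefficients)."""
    skipped = set(exclude)
    return sum(
        is_practically_non_identifiable(profile)
        for name, profile in profiles.items()
        if name not in skipped
    )


# Toy profiles on the domain [-2, 2] (hypothetical parameter names).
profiles = {
    "k_identifiable": [(-2, 16.0), (-1, 4.0), (0, 0.0), (1, 4.0), (2, 16.0)],
    "k_flat_right": [(-2, 16.0), (-1, 4.0), (0, 0.0), (1, 0.5), (2, 1.0)],
    "hill_1": [(-2, 0.2), (-1, 0.0), (0, 5.0), (1, 9.0), (2, 20.0)],
}
n_all = count_non_identifiable(profiles)
n_without_hill = count_non_identifiable(profiles, exclude=["hill_1"])
```

The two counts correspond to the dashed and solid lines of the figure, respectively: one includes every parameter, the other leaves out the Hill coefficients.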

Figure 4.

Deviation of true parameters and our estimates during the experimental design process.

Purchasing informative experiments reduces the distance d between the estimates and the true parameters. Because the estimated parameters are random variables, the distance d sometimes increases by chance. This occurs preferentially when some parameters have flat profiles.
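
The caption does not define the distance d. As a purely illustrative stand-in, the sketch below uses the mean squared log10 ratio between estimated and true parameters, which treats over- and underestimation by the same factor symmetrically. The function name and the toy values are hypothetical, and the actual measure used for the figure may differ.

```python
import math
from typing import Sequence


def log_ratio_distance(estimates: Sequence[float], truth: Sequence[float]) -> float:
    """Mean squared log10 ratio between estimated and true (positive) parameters.

    Assumed form of d for illustration only; not taken from the source.
    """
    if len(estimates) != len(truth):
        raise ValueError("estimates and truth must have the same length")
    return sum(math.log10(e / t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


# Toy example: one parameter is off by a factor of ten, then estimated exactly.
truth = [1.0, 10.0, 0.1]
d_before = log_ratio_distance([10.0, 10.0, 0.1], truth)
d_after = log_ratio_distance([1.0, 10.0, 0.1], truth)
```

With such a measure, a parameter with a flat profile can move far from its true value between steps without noticeably worsening the fit, which is one way the distance can increase by chance.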

Table 2.

Summary of the decisions to spend the budget for model .

Figure 5.

Profile likelihoods at step 5 in the experimental design process for model .

The marked profile for the production strength of mRNA4 shows a practical non-identifiability, since its upper confidence bound is limited only by the border of the parameter space. Most parameters are already in an asymptotic setting; however, some exhibit deviations from a quadratic profile likelihood.

Figure 6.

Trajectories of knock-down (siRNA) of mRNA5 experiment for model along the marked profile of Fig. 5.

Three cases can be distinguished in this figure. In case A, there is almost no spread in the predictions, so measuring protein p1 yields no additional information and represents a poor experiment. In case B, some spread can be seen in the predictions, but many trajectories still lie on top of each other. Hence, data acquisition of protein p3 for this perturbation is a moderately informative experiment. In case C, represented by protein p4, a maximally informative experiment is shown: it is able to accurately identify the parameter, since the curves associated with each point on the profile are clearly distinguishable. Note that the time courses have been normalized by corresponding standard deviation , hence all trajectories of proteins start at .

Figure 7.

Comparison of all possible experiments at step 5 of the experimental design process of model .

This figure demonstrates the information gain that could be obtained by purchasing time-course data. All possible experiments were taken into account. The smaller the distance (x-axis) after adding the data, the closer the maximum likelihood estimates are to the underlying truth. The purchased experiment was among the most informative. Moreover, this particular experiment was cheaper than the other similarly informative experiments.
