
M. Kupiec conceived and designed the experiments. M. Kupiec performed the experiments. A. Kaufman analyzed the data. A. Kaufman, A. Keinan, I. Meilijson, M. Kupiec, and E. Ruppin contributed reagents/materials/analysis tools. A. Kaufman, A. Keinan, M. Kupiec, and E. Ruppin wrote the paper.

A. Keinan, I. Meilijson and E. Ruppin are the authors of a pending PCT patent entitled “A New Method for Multi-Silencing Analysis.”

Perturbation studies, in which functional performance is measured after deletion, mutation, or lesion of elements of a biological system, have been traditionally employed in many fields in biology. The vast majority of these studies have been qualitative and have employed single perturbations, often resulting in little phenotypic effect. Recently, newly emerging experimental techniques have allowed researchers to carry out concomitant multi-perturbations and to uncover the causal functional contributions of system elements. This study presents a rigorous and quantitative multi-perturbation analysis of gene knockout and neuronal ablation experiments. In both cases, a quantification of the elements' contributions, and new insights and predictions, are provided. Multi-perturbation analysis has a potentially wide range of applications and is gradually becoming an essential tool in biology.

Which are the important elements of a system? What are their relative contributions to the performance of the various tasks the system is involved in? These simple and basic questions typically arise when analyzing the workings of any system, and of biological systems in particular. In the latter, the elements may be genes, proteins, cells, or tissues, depending on the level and scope of the analysis. To address these questions in a causal manner, perturbations are required, where the elements are perturbed and the resulting performance function is recorded. This approach has been one of the cornerstones of biological research. However, it has usually been confined to the perturbation of a single element at a time, which may lead to misleading results if the elements of the system functionally interact with each other. This paper addresses these questions by providing a quantitative and rigorous method for the analysis of multi-perturbation experiments, where more than one element may be concomitantly perturbed. The workings of the new method are demonstrated in the analysis of genetic multi-knockout experiments of DNA repair in the yeast

System identification (localization of function) in biological networks is currently mainly studied in genetics by high-throughput expression profiling and in neuroscience by functional brain imaging. While these techniques have proved to be very useful and productive [

To address this challenge, Keinan et al. [

The goal of MPA is to define and calculate the contribution (importance) of system elements to a certain function, from a dataset of a series of multi-perturbation experiments. In each such experiment, a different subset of the system elements is concomitantly perturbed (denoting a perturbation configuration), and the system's performance in the studied function is measured. The FIN algorithm analyzes the same multi-perturbation data. It describes the incremental contribution of each subset of elements to the function studied, and produces a compact representation, composed only of the most important subsets. As the full set of all theoretically possible multi-perturbation experiments required for the MPA and FIN computation is usually unavailable, both analyses employ a predictor algorithm to compute the system's performance on the missing multi-perturbation experiments.

In the following sections we describe the MPA and FIN methods and present their application to two different biological systems: the DNA post-replication repair (PRR) pathway in

The starting point of MPA [

The basic observation underlying MPA is that the multi-perturbation setup is essentially equivalent to a coalitional game. A coalitional game is defined by a pair (

A payoff profile of a coalitional game is the assignment of a payoff to each of the players. A value is a function that assigns a unique payoff profile to a coalitional game. The function is efficient if the sum of the payoffs assigned to all players is

Then, the Shapley value is defined by the payoff

assigned to player _{i}
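As an illustration, the Shapley value can be computed directly for a small system by averaging each player's marginal importance over all orderings of the players. The sketch below assumes the standard textbook form of the definition, since the formula itself is not legible in this text. The function names and the three-element toy game are hypothetical and are not taken from the study's data.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley value of each player in the coalitional game (players, v).

    `v` maps a frozenset of intact (unperturbed) players to the measured
    performance of that configuration.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal importance of p to the players preceding it in this order.
            totals[p] += v(with_p) - v(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical toy system: performance is 1.0 only when "a" and "b" are both
# intact; "c" contributes nothing.
def toy_performance(intact):
    return 1.0 if {"a", "b"} <= intact else 0.0


values = shapley_values(["a", "b", "c"], toy_performance)
print(values)
```

The contributions sum to the performance of the intact system minus that of the fully perturbed one, which is the efficiency property mentioned above. Enumerating all orderings is feasible only for small systems.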

Conducting the large number of multi-perturbation experiments required to compute the Shapley value (exponential in the size of the system) is most often intractable. In such cases, MPA involves training a predictor on a given subset of multi-perturbation experiments to predict the performance levels of all missing experiments. The predicted Shapley value is then the Shapley value calculated from the measured and predicted outcomes of all multi-perturbation experiments. The accuracy of such an analysis depends on the accuracy of the predictions [
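The predicted Shapley value procedure can be sketched as follows. The predictor here is a deliberately simple nearest-neighbor average over Hamming distance, chosen only to keep the example short; it is a stand-in and not the predictor used in the study. The function names and the toy measurements are likewise hypothetical.

```python
from itertools import permutations, product
from math import factorial


def shapley_values(n, performance):
    """Exact Shapley values for elements 0..n-1. `performance` maps a
    configuration tuple (1 = intact, 0 = perturbed) to a performance level."""
    totals = [0.0] * n
    for order in permutations(range(n)):
        config = [0] * n
        before = performance[tuple(config)]
        for i in order:
            config[i] = 1
            after = performance[tuple(config)]
            totals[i] += after - before
            before = after
    return [t / factorial(n) for t in totals]


def nearest_neighbor_predictor(measured):
    """Stand-in predictor (an assumption of this sketch): average the measured
    performance of the configurations closest in Hamming distance."""
    def predict(config):
        def dist(other):
            return sum(a != b for a, b in zip(config, other))
        best = min(dist(c) for c in measured)
        nearest = [perf for c, perf in measured.items() if dist(c) == best]
        return sum(nearest) / len(nearest)
    return predict


def predicted_shapley(n, measured):
    """Fill in missing experiments with predictions, then compute Shapley values."""
    predict = nearest_neighbor_predictor(measured)
    full = {}
    for config in product((0, 1), repeat=n):
        # Keep real measurements; predict only the missing experiments.
        full[config] = measured.get(config, predict(config))
    return shapley_values(n, full)


# Hypothetical measurements for 5 of the 8 configurations of a 3-element system.
measured = {
    (1, 1, 1): 1.0,
    (0, 1, 1): 0.2,
    (1, 0, 1): 0.6,
    (1, 1, 0): 1.0,
    (0, 0, 0): 0.0,
}
result = predicted_shapley(3, measured)
print(result)
```

Because measured configurations are kept as they are, the result coincides with the exact Shapley value whenever the dataset is complete.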

The FIN algorithm, based on the series of multi-perturbation experiments, begins with the computation of a performance prediction function ^{n}

We performed a multi-knockout study of the DNA PRR system of the yeast

A key physiological target of the PRR pathway is PCNA, a homotrimeric ring-shaped protein that encircles DNA, functioning as a freely sliding clamp that tethers DNA polymerase to the DNA template. The current hypothesis posits that following the stalling of the replicative DNA polymerases (when lesions are encountered), PCNA is modified, and the replicative polymerase is replaced by trans-lesion polymerases. Ubiquitination of PCNA is carried out by the Rad6 ubiquitin-conjugating enzyme, which is targeted to the stalled replication fork through physical interactions with the Rad18 cofactor [

The analyzed data include a series of multi-knockout experiments carried out in the lab of one of the authors (M. K.), testing the ability of the resulting mutants to resolve the single-stranded gaps created after UV irradiation. Hence, the perturbations were gene knockouts, and the elements were the five genes listed above. The performance under investigation was UV survival, measured by the relative number of colonies that survived compared to the wild-type yeast strain (normalized on a scale from zero to one). The dataset included 21 multi-knockout experiments (see

The multi-knockout data can be utilized to construct a unique weighted multilinear performance prediction function ^{n}

Visualizes the compact performance function

The FIN analysis gave rise to two new hypotheses. First, as is evident in

We turn to address the question of function localization in the nervous system, focusing on laser ablation experiments of the

The contributions of the neuron pairs across the different tasks can be summarized in a contribution matrix, where _{ij}

Uses the two main principal components of the SVD, which together explain 96% of the data's variance.

(A) “Task space,” presenting the projections of the neurons' contribution vectors (column vectors of the contribution matrix) onto the two main principal eigenvectors (PCs) of the task space.

(B)

This paper presents a multi-perturbation analysis of two different biological systems. The analysis quantitatively reinforces existing knowledge and leads to new insights. The MPA analysis of the PRR system shows that each of the RLCs has a different magnitude of contribution to the PRR process. The FIN analysis gives rise to the hypotheses that there are additional polymerase loading complexes in yeast and that DNA polymerase ζ encoded by

To our knowledge, MPA and FIN are the first methods to harness game theory concepts for the analysis of biological systems. Further work is needed to better adapt these methods to the constraints of biological systems, most notably the limited depth (i.e., number of concomitantly perturbed elements) of multi-perturbations in biology. However, this is likely to be a very rewarding endeavor, as such multi-perturbation analysis has many potential applications. The most direct and natural ones concern the analysis of causal perturbation data, e.g., in genetics, using gene silencing with RNA interference. In neuroscience, there is now a new prospect of carrying out experimental perturbation studies using transcranial magnetic stimulation. This technique allows researchers to induce “virtual lesions” in normal subjects performing various cognitive and perceptual tasks [

Importantly, MPA and FIN are not limited to causal perturbation analysis, where one controls the lesions made. They may also be applied to sets of naturally occurring multi-perturbations, e.g., by studying the brain localization of cognitive functions from “multi-lesion” data from stroke patients. In summary, multi-perturbation studies are a necessity if one wants to understand the processing of biological networks in a quantitatively causal manner. The methods described in this paper are a harbinger of this new kind of study, offering a novel and rigorous way of making sense of its results.

The basic MPA and FIN analysis methods are described at the beginning of the Results. Here we provide a description of the extension of MPA to a two-dimensional interaction analysis and the details of the FIN algorithm.

In complex systems, the importance of an element may strongly depend on the state (perturbed or intact) of other elements. A higher order description may be necessary to capture these interactions. Such high-dimensional analysis provides further insights into the network's functional organization.

We focus on the description of two-dimensional interactions. A natural definition of the latter is as follows [

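Let $\gamma_{i,\bar{j}} = \gamma_i(N \setminus \{j\}, v^{N \setminus \{j\}})$ be the Shapley value of element $i$ in the subgame of all elements without element $j$, where $v^{N \setminus \{j\}}$ is the value function over the set $N \setminus \{j\}$. Intuitively, this is the average marginal importance of element $i$ when element $j$ is perturbed.

Let us now define the coalitional game $(M, v^{M})$, where $M = (N \setminus \{i, j\}) \cup \{(i, j)\}$, with $(i, j)$ a new compound element, and $v^{M}(S)$, for $S \subseteq M$, is defined by

$$v^{M}(S) = \begin{cases} v(S) & \text{if } (i, j) \notin S, \\ v\big((S \setminus \{(i, j)\}) \cup \{i, j\}\big) & \text{if } (i, j) \in S, \end{cases}$$

where $v$ is the value function of the original game over $N$. Then $\gamma_{(i, j)} = \gamma_{(i, j)}(M, v^{M})$, the Shapley value of the compound element $(i, j)$ in this game, is the average marginal importance of elements $i$ and $j$ when they are jointly added to a configuration. The two-dimensional interaction between elements $i$ and $j$, $j \neq i$, is then defined as

$$I_{i,j} = \gamma_{(i, j)} - \gamma_{i,\bar{j}} - \gamma_{j,\bar{i}},$$

which quantifies how much the average marginal importance of the two elements together is larger (or smaller) than the sum of the average marginal importance of each of them when the other is perturbed. Intuitively, this symmetric definition ($I_{i,j} = I_{j,i}$) states whether the pair contributes more (positive interaction) or less (negative interaction) than the sum of its parts.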
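The two-dimensional interaction described above can be sketched in code for small systems. The sketch assumes the interaction is the Shapley value of the compound element minus the Shapley values of each element computed with the other one perturbed. The function names and the two toy games (one synergistic, one redundant) are hypothetical illustrations and do not come from the study's data.

```python
from itertools import permutations
from math import factorial


def shapley_value(player, players, v):
    """Shapley value of `player` in the game (players, v); v takes a frozenset
    of intact elements and returns the performance."""
    total = 0.0
    for order in permutations(players):
        preceding = frozenset(order[:order.index(player)])
        total += v(preceding | {player}) - v(preceding)
    return total / factorial(len(players))


def interaction(i, j, players, v):
    """Interaction of i and j: Shapley value of the compound element (i, j)
    minus the Shapley value of each element when the other is perturbed."""
    without_j = [p for p in players if p != j]
    without_i = [p for p in players if p != i]
    gamma_i = shapley_value(i, without_j, v)  # j stays perturbed throughout
    gamma_j = shapley_value(j, without_i, v)  # i stays perturbed throughout

    pair = (i, j)  # compound element replacing i and j
    merged = [p for p in players if p not in (i, j)] + [pair]

    def v_merged(coalition):
        # Expand the compound element back into its two members.
        if pair in coalition:
            coalition = (coalition - {pair}) | {i, j}
        return v(coalition)

    gamma_pair = shapley_value(pair, merged, v_merged)
    return gamma_pair - gamma_i - gamma_j


players = ["a", "b", "c"]


def both_needed(intact):      # toy synergy: "a" AND "b" required
    return 1.0 if {"a", "b"} <= intact else 0.0


def either_suffices(intact):  # toy redundancy: "a" OR "b" suffices
    return 1.0 if intact & {"a", "b"} else 0.0


synergy = interaction("a", "b", players, both_needed)
redundancy = interaction("a", "b", players, either_suffices)
print(synergy, redundancy)
```

In the first toy game the two elements are useless alone, so the interaction is positive; in the second each can fully replace the other, so it is negative.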

The performance prediction function

The dividend computation is performed in an iterated manner. It begins from the dividend of the null group, and each iteration computes the dividend (incremental contribution) of subsequently larger, subsuming subsets.
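The iterated computation can be sketched as below. The sketch assumes the standard recursive form of a dividend, in which the dividend of a subset equals its performance minus the dividends of all of its proper subsets; the exact formula used by FIN is not legible in this text, so treat this as an illustration. The names `dividends` and `evaluate` and the toy performance function are hypothetical.

```python
from itertools import combinations


def dividends(elements, v):
    """Dividend d(S) = v(S) - sum of d(T) over all proper subsets T of S,
    computed from the empty set upward by increasing subset size."""
    d = {}
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            s = frozenset(subset)
            # `t < s` selects proper subsets only, all of which were
            # already computed in earlier (smaller-size) iterations.
            d[s] = v(s) - sum(dt for t, dt in d.items() if t < s)
    return d


def evaluate(d, intact):
    """Multilinear reconstruction: the performance of a configuration is the
    sum of the dividends of all subsets that are fully intact."""
    intact = frozenset(intact)
    return sum(dt for t, dt in d.items() if t <= intact)


# Hypothetical toy system: performance is 1.0 only when "a" and "b" are intact.
def toy_performance(intact):
    return 1.0 if {"a", "b"} <= intact else 0.0


d = dividends(["a", "b", "c"], toy_performance)
nonzero = {s: x for s, x in d.items() if x != 0.0}
print(nonzero)
```

In this toy case a single subset carries all of the incremental contribution, which is the kind of compact representation the FIN analysis aims for; summing the dividends of intact subsets reproduces the original performance of every configuration.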

To compute a compact and intelligible approximation of

where


The SwissProt (

We thank Nir Yosef, Ya'acov Ritov, Shawn Lockery, and Cori Bargmann for their valuable comments and suggestions. A. Kaufman is supported by the Yeshaya Horowitz Association through the Center of Complexity Science. A. Keinan was supported by the Dan David Prize Scholarship. M. Kupiec was supported by grants from the Recanati Foundation and the Israel Cancer Fund. E. Ruppin's research is supported by the Tauber Fund and the Center of Complexity Science.

FIN, functional influence network

MPA, multi-perturbation Shapley value analysis

PRR, post-replication repair

RFC, Replication factor C

RLC, Replication factor C–like complex

SVD, Singular value decomposition