True versus false parasite interactions: a robust method to take risk factors into account and its application to feline viruses.

Background Multiple infections are common in natural host populations, and interspecific parasite interactions are therefore likely within a host individual. As they may seriously impact the circulation of certain parasites and the emergence and management of infectious diseases, their study is essential. In the field, detecting parasite interactions is made difficult by the fact that a large number of co-infected individuals may also be observed when two parasites share common risk factors. To correct for these “false interactions”, methods accounting for parasite risk factors must be used.
Methodology/Principal Findings In the present paper we propose such a method for presence-absence data (i.e., serology). Our method enables the calculation of the expected frequencies of single and double infected individuals under the independence hypothesis, before comparing them to the observed ones using the chi-square statistic. The method is termed “the corrected chi-square.” Its robustness was compared to that of a pre-existing method based on logistic regression, and the corrected chi-square proved to be much more robust for small sample sizes. Since the logistic regression approach is easier to implement, we propose as a rule of thumb to use logistic regression when the ratio between the sample size and the number of parameters is above ten, and the corrected chi-square otherwise. Applied to serological data for four viruses infecting cats, the approach revealed pairwise interactions between the Feline Herpesvirus, Parvovirus and Calicivirus, whereas infection by FIV, the feline equivalent of HIV, did not modify the risk of infection by any of these viruses.
Conclusions/Significance This work therefore points out possible interactions that can be further investigated in experimental conditions and, by providing a user-friendly R program and a tutorial example, offers new opportunities for animal and human epidemiologists to detect interactions of interest in the field, a crucial step in the challenge of multiple infections.


Introduction
Numerous parasite species circulate simultaneously in natural populations. Many of them are able to infect the same host species, and a host individual can therefore be infected by several parasites at the same time. These multiple infections are not only common in nature but usually more frequently encountered than infections by a single parasite [1]. Within a host individual, parasites can thus interact, either in a synergistic manner (parasite A favours infection by parasite B or worsens the symptoms caused by B) or in an antagonistic manner (parasite A decreases the infection risk by parasite B or reduces the symptoms caused by B) [2]. As these interactions can have important epidemiological, biological and clinical consequences (e.g., [3][4][5][6][7]), detecting, understanding and evaluating them is essential to understand the phenomena and to control and manage infectious diseases.
In recent years, the question of polyparasitism has attracted considerable attention [4,8,9], although in reality the subject has a long history of experimental investigation under laboratory conditions [10,11]. Many epidemiological studies have also been conducted on the main human pathogens, motivated in particular by the urgency to understand the epidemiological and clinical consequences of infection by parasites potentially interacting with HIV and other emerging diseases [12], and the mechanisms of their interactions. A large amount of work indeed revealed interactions between HIV and tuberculosis, malaria, sexually transmitted diseases, and helminths (e.g., [6,13,14,15,16,17,18,19,20]), as well as interactions between plasmodia parasites and helminths (e.g., [21][22][23][24]). Studies on animal hosts also revealed interactions between their parasites, with many studies on helminth communities (mammals: [4,25,26], birds: [27], fish: [28]), and fewer on protozoan species (e.g., [29]) or viruses (e.g., [30]). Many diseases have been shown to be affected by the presence of other disease-causing agents, which alter the rates of species co-occurrence, levels of infection and disease severity. Parasite interactions have also been shown to affect the success of parasite vaccination strategies [31] and could be involved in disease (re)emergence [32], reinforcing the interest of these studies.
Although laboratory experiments have clearly demonstrated that interspecific parasite interactions occur, often mediated by host immune responses [1,33,34,35], attempts to detect such effects in natural populations have generally been less successful. Indeed, detecting their existence in the field is not easy, because complex networks of indirect effects make it difficult to infer underlying processes. Field studies are however essential, as experimental systems are oversimplified and require an existing suspicion of interaction between the studied parasites. In addition, only studies in natural populations can give access to infection and co-infection probabilities. In other words, before studying their mechanisms in the lab, interactions of interest must be identified in the field. The main difficulties encountered in field studies are methodological. Many confounding factors can create statistical associations between parasites even if there is no true biological interaction between them, which may alter conclusions about the importance of interspecific interactions [36][37][38][39][40]. A similar transmission mode, for example, can alone increase the risk of co-infection. The excess of positive associations found in strongylid communities in domestic horses, ruminants and macropod marsupials is in particular likely to be due to the common habit of these hosts feeding on pastures contaminated with the larvae of a number of nematode species [41][42][43]. In addition, environmental, behavioural or host-specific factors can be associated with both types of infection and influence epidemiological and geographic patterns of infection and disease. Among such common risk factors, some have long been recognised, such as sexual behaviours for sexually transmitted diseases (e.g., [44]), socio-economic status for infections particularly prevalent in poor regions such as helminth infection and malaria [45], or age for many diseases (e.g., [46]).
As apparent associations between two infections may be due to common risk factors, these factors are crucial to identify and to take into account in the analysis. However, such confounding factors are difficult to control and few methods allow them to be taken into account.
A variety of analytical approaches have been suggested to detect associations in parasite communities, primarily focusing on macroparasite (parasitic helminth) communities (e.g., [4,39,47,48]). However, they implicitly assume that the direction and strength of an observed association between parasite species reflect an underlying biological interaction, and their reliability in detecting interactions has recently been questioned [49]. The adoption of a generalized linear mixed modelling (GLMM)-based approach has instead been suggested by Fenton et al. [49] (see also [50]). Apparently more robust for detecting interactions between macroparasites, this method has the advantage of being able to take into account the variance caused by other factors.
Nevertheless, field data, particularly relating to microparasites, are most of the time serological (i.e., presence-absence data). Indeed, viral excretion is usually too short to make antigen detection an efficient tool to follow microparasites in natural populations, as host capture and sampling would have to be done exactly during the excretion period, especially during nonepidemic phases. Most field data are thus limited to observed frequencies of seronegative, seropositive and doubly seropositive individuals. In this context, the search for potential interactions between pairs of microparasites is traditionally done by calculating odds ratios in stratified data or by a Pearson's chi-square test of independence (e.g., [29,51,52]). The latter compares the observed frequencies to the frequencies expected if parasites are independent, under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals. However, such methods ignore confounding factors and/or the possible simultaneous action or interaction of several of them. Significant associations detected in this manner can therefore be either true biological interactions or statistical associations, with no means of distinguishing the two. Alternative methods have therefore been proposed to determine the expected frequencies in a modified chi-square analysis. Some are based on the estimation of "pre-interactive" species prevalences [53], which requires previous knowledge of dominance relationships between parasite species. Others are based on log-linear models (e.g., [54][55][56]). In addition, another way to take risk factors into account is to include them in a logistic regression analysis and to determine whether parasite B status is still a predictor of parasite A status [52,57].
However, the main drawback of methods based on log-linear or logistic regression models is that they are based on an asymptotic approximation of the deviance, which might not be relevant for small sample size data.
In the present paper, we propose another method (termed the "corrected chi-square") to detect microparasite interactions from serological data, based on an adaptation of Pearson's chi-square test. By combining logistic regressions and chi-square tests, we are able to calculate the frequencies of co-infected individuals expected if parasites are independent given their risk factors, and to compare them to the observed ones. In a first step, we perform a theoretical comparison of the robustness of the corrected chi-square and the logistic regression approaches. In a second step, both approaches are applied to serological data obtained in natural populations of domestic cats to search for potential interactions between four feline viruses. The domestic cat is indeed an appropriate model to investigate such questions, as its main viruses are well known and rather easy to survey in the field, and its natural populations, although very flexible in their social and spatial organisation, have been extensively studied [32,58,59,60,61,62].

Statistical analysis
1.1. Logistic regression analysis. A first way to test the interaction between two pathogens is to test the effect of the serological status for one virus on the probability of being seropositive for the other. A logistic regression was used for that purpose. The approach corrects for common risk factors by adding known or suspected risk factors as correction variables. The logistic regression model reads:

$$\operatorname{logit}(p_1) = \ln\left(\frac{p_1}{1-p_1}\right) = a_0 + \sum_{k=1}^{K} a_k F_k + b\,S_2$$

where $F_k$ denotes the k-th risk factor, $p_1$ is the probability of seropositivity to pathogen 1 and $S_2$ the serological status for pathogen 2. The $a_k$ (k = 0…K) and $b$ are the coefficients of the logistic regression. The interaction between the two pathogens was tested using a likelihood-ratio test (LRT) of H0: $b = 0$ vs H1: $b \neq 0$. The asymptotic chi-square approximation was used to derive the P-value of the test of independence between the two viruses [63].
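As an illustration of the likelihood-ratio test described above, the following is a minimal sketch in Python using only the standard library. It is not the authors' R program: the function names (`sigmoid`, `solve`, `fit_logistic`, `lrt_interaction`) are hypothetical, the fit uses a plain Newton-Raphson scheme chosen for self-containment, and no safeguard against perfect separation is included.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_logistic(X, y, iters=50, tol=1e-8):
    """Newton-Raphson maximum-likelihood fit.
    X: rows whose first entry is 1.0 (intercept); y: list of 0/1.
    Returns (coefficients, maximised log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]
        grad = [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, p))
                for j in range(k)]
        hess = [[sum(pi * (1 - pi) * row[a] * row[c] for row, pi in zip(X, p))
                 for c in range(k)] for a in range(k)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    eps = 1e-12  # guards the logarithms against fitted values of 0 or 1
    loglik = 0.0
    for row, yi in zip(X, y):
        pi = sigmoid(sum(b * v for b, v in zip(beta, row)))
        pi = min(max(pi, eps), 1 - eps)
        loglik += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return beta, loglik

def lrt_interaction(factors, s1, s2):
    """LRT of H0: b = 0 in logit(p_1) = a_0 + sum_k a_k F_k + b S_2.
    factors: one list of risk-factor values per individual."""
    X0 = [[1.0] + list(f) for f in factors]            # risk factors only
    X1 = [row + [float(s)] for row, s in zip(X0, s2)]  # plus status S_2
    _, ll0 = fit_logistic(X0, s1)
    beta1, ll1 = fit_logistic(X1, s1)
    stat = max(0.0, 2.0 * (ll1 - ll0))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return beta1[-1], stat, p_value
```

Calling `lrt_interaction(factors, s1, s2)` returns the estimate of $b$, the LRT statistic and its asymptotic P-value. Qualitative factors with more than two modalities would need to be dummy-coded before being passed in.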
1.2. Corrected Pearson's chi-square tests. The corrected chi-square approach is based on the idea that the coefficients of the logistic regression of the two viruses can be used to estimate the number of seronegative, single- and double-seropositive individuals expected if the two pathogens are independent. Like the classical chi-square, the corrected chi-square compares the observed ($O_{i,j}$) and theoretical ($E_{i,j}$) numbers of individuals with different combinations of status (seropositive or seronegative) for the two pathogens using the chi-square statistic:

$$\chi^2_c = \sum_{i=0}^{1}\sum_{j=0}^{1} \frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}$$

where $i$ is the status for pathogen 1 (0 for seronegative and 1 for seropositive) and $j$ is the status for pathogen 2. To calculate the $E_{i,j}$, for each pathogen taken separately, a logistic regression including K risk factors (see previous section) is run to estimate the probability of being seropositive for each individual (termed $\hat{p}_{p,x}$ for individual $x$ and pathogen $p$, $p \in \{1,2\}$):

$$\operatorname{logit}(\hat{p}_{p,x}) = \hat{a}_{p,0} + \sum_{k=1}^{K} \hat{a}_{p,k} F_{k,x}$$

where $\hat{a}_{p,k}$ denotes the estimate of the regression coefficients for pathogen $p$ and $F_{k,x}$ the value of the k-th risk factor in individual $x$.
The theoretical contingency table is then deduced from these probabilities:

$$E_{1,1} = \sum_x \hat{p}_{1,x}\,\hat{p}_{2,x}, \quad E_{1,0} = \sum_x \hat{p}_{1,x}\,(1-\hat{p}_{2,x}), \quad E_{0,1} = \sum_x (1-\hat{p}_{1,x})\,\hat{p}_{2,x}, \quad E_{0,0} = \sum_x (1-\hat{p}_{1,x})\,(1-\hat{p}_{2,x})$$

For each pair of viruses, the distribution of the corrected chi-square was determined by a parametric bootstrap run as follows:
Step 1: Estimated seropositivity probabilities ($\hat{p}_{p,x}$) are used to generate in silico serological data for both pathogens independently.
Step 2: The corrected chi-square is calculated for this in silico dataset.
Steps 1 and 2 were repeated 1000 times, leading to 1000 independent realisations of the corrected chi-square statistic under the null hypothesis of independence between the two pathogens.
Two ways of calculating the P-value were derived from this procedure. P-value1 was estimated assuming that the corrected chi-square is proportional to a chi-square with one degree of freedom, the coefficient of over- (or under-) dispersion (ĉ) being defined by the mean of the bootstrapped corrected chi-squares. P-value2 was given by the proportion of bootstrapped corrected chi-squares that were larger than the observed value. In principle, P-value2 is better (no assumption on the distribution of the test statistic is made), but it requires running enough simulations, which may take a long time in some cases. P-value1 allows working with smaller numbers of simulations when simulation times are too long.
The R program is available as supplementary file (File S2) and can be applied to any presence-absence data to calculate the corrected chi-square and the associated P-values. A tutorial example (File S3) illustrates its use step-by-step using an example dataset (File S4).
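The procedure of this section can be sketched as follows. This is an illustrative Python reading of the text, not the R program of File S2: all function names are hypothetical, the expected table is taken as the sum over individuals of the products of fitted probabilities, the logistic regressions are refitted on every in silico dataset, and the fit is a bare Newton-Raphson without safeguards against separation.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fitted_probs(X, y, iters=50, tol=1e-8):
    """Fit a logistic regression (rows of X start with 1.0 for the
    intercept) and return the fitted probability of each individual."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]
        grad = [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, p))
                for j in range(k)]
        hess = [[sum(pi * (1 - pi) * row[a] * row[c] for row, pi in zip(X, p))
                 for c in range(k)] for a in range(k)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]

def tables(s1, s2, p1, p2):
    """Observed table O[i][j] and table E[i][j] expected under independence."""
    O = [[0, 0], [0, 0]]
    E = [[0.0, 0.0], [0.0, 0.0]]
    for i, j, a, b in zip(s1, s2, p1, p2):
        O[i][j] += 1
        E[0][0] += (1 - a) * (1 - b)
        E[0][1] += (1 - a) * b
        E[1][0] += a * (1 - b)
        E[1][1] += a * b
    return O, E

def corrected_chi2(X, s1, s2):
    """Corrected chi-square statistic and the fitted probabilities."""
    p1, p2 = fitted_probs(X, s1), fitted_probs(X, s2)
    O, E = tables(s1, s2, p1, p2)
    stat = sum((O[i][j] - E[i][j]) ** 2 / E[i][j]
               for i in (0, 1) for j in (0, 1))
    return stat, p1, p2

def corrected_chi2_test(X, s1, s2, n_boot=1000, seed=0):
    """Parametric bootstrap; returns (statistic, c_hat, P-value1, P-value2)."""
    rng = random.Random(seed)
    observed, p1, p2 = corrected_chi2(X, s1, s2)
    sims = []
    for _ in range(n_boot):
        b1 = [int(rng.random() < a) for a in p1]    # Step 1: in silico data,
        b2 = [int(rng.random() < b) for b in p2]    # pathogens independent
        sims.append(corrected_chi2(X, b1, b2)[0])   # Step 2: statistic
    c_hat = sum(sims) / n_boot                      # dispersion coefficient
    p_value1 = math.erfc(math.sqrt(observed / (2.0 * c_hat)))
    p_value2 = sum(s >= observed for s in sims) / n_boot
    return observed, c_hat, p_value1, p_value2
```

Here `X` holds one row per individual, starting with 1.0 for the intercept and followed by the risk-factor values. The same risk-factor matrix is used for both pathogens, as in the pairwise minimal models described for the cat data.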

Robustness of the two approaches
The main criticism that could be made of the logistic regression approach is that it is based on the asymptotic distribution of the LRT. In practice, the chi-square approximation holds only for large datasets. In the present paper we investigated the robustness of the logistic regression to different sample sizes and numbers of correction risk factors. We also aimed to compare how robustness is affected by the type of risk factors considered (qualitative or quantitative). The same investigations were performed with the corrected chi-square test to compare the robustness of the two approaches.
For that purpose, random seroprevalence datasets were generated, assuming independent viruses. Random data were always generated assuming that all individuals had an independent 0.5 probability of being seropositive for each pathogen. $N_F$ randomly generated risk factors were considered in the logistic regression for the two pathogens. By construction these factors have no effect (they are chosen independently of the serological status of the individuals), but from a theoretical point of view it is interesting to measure how their inclusion in the model can introduce biases depending on the approach.
Randomly generated factors could be either qualitative or quantitative. For simplicity, qualitative factors had only two modalities, individuals having a 0.5 probability of being in each one. Quantitative factors were chosen for each individual randomly according to a standard normal distribution. To investigate how the nature of risk factors affects robustness, three scenarios were tested: i) all factors are qualitative; ii) all factors are quantitative and iii) half of the factors are quantitative while the other half are qualitative factors (mixed scenario).
Our objective was to understand how data characteristics (the number of individuals, $n$, the number of factors, $N_F$, and their type, scenario i, ii or iii) would affect the probability of wrongly concluding that there is an interaction between the two pathogens (type I error). For a given combination of these characteristics, a thousand random seroprevalence datasets were generated and we estimated the type I error associated with each approach as the proportion of random datasets for which the P-value was below 5%.
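The simulation design described in this section can be sketched as below. The function names and the scenario labels are hypothetical, and the classical Pearson chi-square (which ignores the factors) is used purely as a placeholder test so that the example stays short; in the study, the P-value function would be the logistic regression LRT or the corrected chi-square. For an odd number of factors, the mixed scenario rounds the number of quantitative factors down, which is an assumption.

```python
import math
import random

def generate_dataset(n, n_factors, scenario, rng):
    """Random data under independence: each individual is seropositive for
    each pathogen with probability 0.5, and the factors have no effect.
    scenario: 'qualitative', 'quantitative' or 'mixed'."""
    n_quant = {"qualitative": 0, "quantitative": n_factors,
               "mixed": n_factors // 2}[scenario]
    factors, s1, s2 = [], [], []
    for _ in range(n):
        quant = [rng.gauss(0.0, 1.0) for _ in range(n_quant)]   # standard normal
        qual = [float(rng.randint(0, 1))                        # two modalities
                for _ in range(n_factors - n_quant)]
        factors.append(quant + qual)
        s1.append(int(rng.random() < 0.5))
        s2.append(int(rng.random() < 0.5))
    return factors, s1, s2

def pearson_p_value(factors, s1, s2):
    """Placeholder test: classical 2x2 Pearson chi-square, factors ignored."""
    n = len(s1)
    O = [[0, 0], [0, 0]]
    for i, j in zip(s1, s2):
        O[i][j] += 1
    rows = [sum(O[0]), sum(O[1])]
    cols = [O[0][0] + O[1][0], O[0][1] + O[1][1]]
    if 0 in rows or 0 in cols:
        return 1.0  # degenerate table: no evidence against independence
    stat = sum((O[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i in (0, 1) for j in (0, 1))
    return math.erfc(math.sqrt(stat / 2.0))  # chi-square(1 df) upper tail

def type_one_error(p_value_fn, n, n_factors, scenario,
                   n_datasets=1000, alpha=0.05, seed=0):
    """Proportion of independent datasets for which the test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_datasets):
        factors, s1, s2 = generate_dataset(n, n_factors, scenario, rng)
        if p_value_fn(factors, s1, s2) < alpha:
            rejections += 1
    return rejections / n_datasets
```

Any function with the signature `p_value_fn(factors, s1, s2)` can be plugged into `type_one_error`, which makes it straightforward to compare approaches on identical simulated datasets by reusing the same seed.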
3. Application to cat data
3.1. Ethics Statement. The field work was carried out by qualified people in accordance with French legislation. Accreditation has been granted to the UMR-CNRS 5558 (accreditation number 692660703) for the program.
3.2. The feline viruses. The Feline Immunodeficiency Virus (FIV) is a major non-traumatic cause of death in adult cats, and is associated with immunosuppression causing secondary infections [64]. This retrovirus can infect other felids, most of which are threatened or endangered species, e.g., the European wildcat (F. s. silvestris) [64][65][66]. It is mainly transmitted by bites, through a direct horizontal mode [67], principally during aggressive or sexual contacts [64,68]. The Feline Herpesvirus (FHV) and the Feline Calicivirus (FCV) are responsible for upper respiratory tract disease, of concern in veterinary medicine [69,70]. Both viruses are transmitted through 'amicable' contacts, by oral, nasal and ocular secretions during close interactions [71,72]. FHV-infected cats become asymptomatic carriers, but the latent infection can be reactivated by stress (e.g., change of habitat, lactation or fights between males; [73]). The Feline Parvovirus (FPV) infects all felids, as well as other carnivores [74], and FPV infection may be fatal, especially in kittens [75]. The virus is transiently excreted in feces, urine, saliva and vomit, and its high resistance in the environment (still infectious after 13 months at 4-25°C; [76]) makes indirect transmission through feces and contaminated areas largely predominant [77,78].
3.3. Serological data. The serological statuses for FIV, FHV, FCV and FPV were obtained in 2007 in 15 natural rural populations of domestic cats in North-Eastern France [62,79]. Cats were captured using baited traps or directly caught by the owner, anaesthetized, measured, and blood samples were taken from the jugular vein. FIV-antibodies were immediately searched for with a commercial kit using the ELISA method (SNAP Combo +, Idexx), whereas specific antibodies against FHV, FCV or FPV were measured by a specific blocking ELISA [80]. None of the cats was vaccinated. All six pairs of viruses were tested for potential association. Between 467 and 474 cats were tested for each virus and 465 to 469 were double-tested (depending on the virus pair).
Previous analyses using logistic regression models with the same dataset revealed the combination of risk factors that were supported by our data [62]. Five factors were initially investigated: age (AGE), sex (SEX), way of life (owned or unowned, WOL), orange phenotype (orange or non orange, PHENO) and body mass (MASS) and one correction factor (the population of origin, POP) was considered. For each virus, the most appropriate model was selected using the Akaike Information Criterion adjusted for small sample size (AICc, [81]). Ideally, all factors potentially creating apparent associations should be included in the model. But to limit the number of correction risk factors, the minimal model containing the identified risk factors for the two viruses was retained as a compromise for each pair (Table 1).

Robustness of the two approaches
The corrected chi-square was robust for all tested sample sizes and numbers of parameters, whatever the nature of the factors (scenarios i, ii, iii) and the method used to calculate the P-value (P-value1, P-value2, see File S1 and Fig. S3 for more details). The type I error of this method indeed remained very close to 5% (Fig. 1). In contrast, the robustness of the logistic regression approach decreased as the $N_F/n$ ratio (number of factors/sample size) increased. In scenarios i (only qualitative factors) and ii (only quantitative factors), the type I error was around 5% for a ratio of 0.005, around 8% for a ratio of 0.15 and around 20% for a ratio of 0.35. It became significantly different from 5% for ratios larger than 0.12 (type I error = 6.7%, z = 2.47, p = 0.019) and 0.08 (type I error = 7.9%, z = 4.21, $p = 5.7 \times 10^{-5}$) for scenarios i and ii, respectively. In the mixed scenario (iii), the type I error became significantly different from 5% for all $N_F/n$ ratios larger than 0.075 (type I error = 7.1%, z = 3.047, p = 0.0038). More details are available in File S1 and Fig. S2. Taken together, these results show that, as a rule of thumb, the logistic regression approach is robust for $N_F/n$ ratios below 0.1 for all types of factors.
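The z values quoted above are consistent with a normal-approximation test of an observed rejection proportion against the nominal 5%, using the thousand simulated datasets per combination. That this is how they were obtained is an assumption; the short sketch below (with a hypothetical function name) only reproduces the arithmetic.

```python
import math

def z_against_nominal(type1_error, n_datasets=1000, alpha=0.05):
    """z statistic comparing an observed type I error to the nominal level,
    using the binomial standard error under the null proportion alpha."""
    se = math.sqrt(alpha * (1.0 - alpha) / n_datasets)
    return (type1_error - alpha) / se

for err in (0.067, 0.079, 0.071):
    print(err, round(z_against_nominal(err), 3))
```

With 1000 datasets the standard error is about 0.0069, so observed type I errors of 6.7%, 7.9% and 7.1% correspond to z values of roughly 2.47, 4.21 and 3.05.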

Feline viruses associations
The two approaches (corrected chi-square and logistic regression) were used for the analysis of the interactions between four cat viruses (Table 2).
Results showed that the interaction was not significant for pairs involving FIV. All other pairs (FHV-FCV, FHV-FPV and FCV-FPV) were found to interact, i.e., the number of individuals coinfected by two viruses could not be explained by shared risk factors. The three significant associations were all positive, meaning that there were always more co-infected individuals than expected considering shared risk factors (Table 3).
Pairwise interactions between FHV, FCV and FPV could have come from the fact that one virus was a common risk factor for the two others. This possibility was tested (see the three last lines of Table 2) by adding the serological status to one virus as a common risk factor for the two others. Results led to reject this hypothesis, meaning that the observed associations cannot be solely explained by the fact that one virus interacts with the two others.
The two P-values obtained for the corrected chi-squares are consistent with each other. The P-values obtained for the logistic regression approach are usually slightly lower than those of the corrected chi-squares, probably because of the over-predictive tendency of logistic regressions.
In addition, as with simulated data, the logistic regression approach was less robust to small sample sizes than the corrected chi-square (Table S1). This was tested by randomly sampling smaller subsets of the cat data in order to increase the $N_F/n$ ratio.
Finally, to emphasise the need to consider risk factors in the analysis of interactions, we also calculated the classical Pearson's chi-square test of independence. This approach, which does not integrate risk factors, predicted an association between five of the six tested pairs. In the case of the FIV-FCV and FIV-FHV pairs, it would lead to the wrong conclusion that an interaction exists, whereas the two approaches have shown that these apparent interactions were in fact explicable by shared factors.

Discussion
Common risk factors can create statistical associations. This work confirmed that ignoring them would lead to wrong conclusions. Ignoring them would indeed result in an over-estimation of the number of interactions, as any association, biological or statistical, would be put in one basket. The loss of significance after controlling for other factors was illustrated in this paper with feline virus data, and was previously found by Behnke et al. [39] for helminth parasites of the wood mouse. The next step was to identify an appropriate way to take those risk factors into account.

Logistic regression approach versus corrected chi-square tests
Two approaches to take risk factors into account with serological data (i.e., presence-absence) were proposed and examined: the use of logistic regression models, as previously done by some authors [52,57], and an adaptation of the chi-square test for independence, presented for the first time in this paper.
Table 1. Risk factors models used to test for potential association between pairs of feline viruses.
To determine which method should be used under which circumstances, we need to make the following considerations.
First, the corrected chi-square involves 2n+2 estimations of the logistic regression coefficients, n being the number of bootstraps. In comparison, only two models must be parameterized in the logistic regression. As a consequence, the logistic regression approach is much faster to run (less than a second versus 2.5 minutes for the corrected chi-square for a model with 6 factors in full interaction and 300 individuals, for 1000 bootstraps, using a desktop computer with an Intel(R) core(TM)2 Quad CPU Q6600 processor). Second, the corrected chi-square is more robust than the logistic regression, especially for small sample size. A first solution would be to use the corrected chi-square as soon as simulation times are acceptable. For a 5% rejection threshold, a more straightforward alternative is to use the corrected chi-square by default as soon as the ratio between the sample size and the number of parameters is below 10 and the logistic regression in the opposite case. However, we did not test all potential situations and further analyses are needed to determine the limit of robustness of the logistic regression approach (in particular in situations where the probability of infection is not 50% and can be affected by risk factors).
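The rule of thumb stated above can be written as a small helper. The function name and return strings are hypothetical, and the threshold of 10 applies to the 5% rejection level and the simulation settings explored here.

```python
def recommend_method(sample_size, n_parameters, threshold=10.0):
    """Rule of thumb: corrected chi-square when the ratio of sample size to
    number of parameters is below the threshold, logistic regression otherwise."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    ratio = sample_size / n_parameters
    if ratio < threshold:
        return "corrected chi-square"
    return "logistic regression"
```

For example, 300 individuals and 6 parameters give a ratio of 50 and point to logistic regression, whereas 100 individuals and 20 parameters give a ratio of 5 and point to the corrected chi-square.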
Two P-values have been proposed for the corrected chi-square. The first one relies on the assumption that the corrected chi-square is proportional to a chi-square with one degree of freedom; the second one simply counts the proportion of in silico datasets for which the value of the corrected chi-square is above the observed value. Both P-values led to consistent results using a 5% rejection threshold, consistent with the fact that for all tested pairs the corrected chi-square fitted well with an under-dispersed chi-square with one degree of freedom (Fig. S1, Fig. S2). Which one should be used in practice depends on the simulation time. If simulations are fast enough and running 1000 bootstraps is acceptable, P-value2 should be preferred. Otherwise, a good option is to run far fewer bootstraps (typically 30) and to use P-value1.
Even if other alternative methods allow taking covariates into account, we only compared the corrected chi-square to the logistic regression approach. We could have compared it as well to loglinear models, which model the probability of infection with single and multiple parasite species from contingency tables and allow including known risk factors. However, in this approach the independence between parasites is tested using likelihood ratio tests, which are based on an asymptotic approximation of the deviance, as in the logistic regression approach. They should therefore have the same limitations as logistic regressions, and their robustness should be similarly influenced by the $N_F/n$ ratio. In addition, continuous variables are usually discretized in loglinear models, whereas the corrected chi-square allows working with continuous data.

Interactions between pairs of feline viruses
After correction by the known risk factors of the viruses, three pairs of feline viruses out of six appeared to be significantly associated. Since the $N_F/n$ ratio was between 0.04 and 0.06, the logistic regression approach can be considered robust, at least for a 5% rejection threshold.
First, it is worth noting that age is a crucial covariate. The infection probability of all viruses increases with host age [62], thus age must contribute strongly to the generation of false interactions. This age-dependence is due to both a biological effect (i.e., behaviors and immune defenses may evolve with age, [82,83]) and a mechanical effect (i.e., older individuals are more likely to be seropositive because of a longer exposure time). Disentangling both effects would require the use of Susceptible-Infected-Recovered (SIR) models, but was not necessary here. Indeed, to correct for age in the study of interactions, what matters is to model the evolution of the probability of infection with age.
Figure 1 (partial caption). The type I error of the corrected chi-square tests represented here is based on P-value2 but similar results were observed with P-value1 (Fig. S3). Note that for the logistic regression approach, points resulting from a given sample size were linked to show the effect of the $N_F/n$ ratio for different sample sizes (solid line: n = 100, dashed line: n = 200, dotted line: n = 300). The dashed horizontal line represents a type I error of 5%. doi:10.1371/journal.pone.0029618.g001
Correcting for all risk factors, no pair of viruses involving the Feline Immunodeficiency Virus (FIV-FHV, FIV-FCV, FIV-FPV) was significantly associated. This result is at first surprising because, as in humans infected by HIV, feline AIDS is characterised by a chronic immunodeficiency, allowing subsequent opportunistic infections (review in [84]). Indeed, although FIV positive cats can mount immune responses to administered antigens other than during the terminal phase of infection, their primary immune responses may be delayed or diminished [85,86]. Experimental studies also revealed that cats co-infected by FIV and FCV or FHV had more severe disease signs than non-FIV infected cats [87,88]. In addition, the presence of FHV was shown to accelerate FIV transcription through the activation of the FIV long terminal repeat [89], a phenomenon that was also shown in vitro for the human versions of the viruses, HSV2 and HIV [90][91][92][93]. Those laboratory experiments show that FIV infection may increase the severity of FHV or FCV-induced clinical signs but do not address the question of the effect of FIV on the sensitivity to FHV or FCV infection. Furthermore, the few epidemiological studies interested in the question did not demonstrate any epidemiological association between FIV and FHV [94]. In other words, if experimental investigations suggest a synergy between FIV and FHV and between FIV and FCV towards a more severe disease, our sero-epidemiological study suggests that the identified risk factors explain by themselves the apparent increase of double sero-positive individuals.
As for the FIV-FPV pair, this study is to our knowledge the first to search for a potential association. Whether risk factors were taken into account or not, we did not find any significant association between the two viruses. Again, this could be at first surprising as both viruses are supposed to be immunosuppressive [84,95,96]. In experimental conditions, FPV infection is more severe in FIV-infected cats [97]. Consequently, a positive association could have been expected if infections had facilitated   each other (leading to numerous co-infections) or a negative association if the co-infection had led to a strong host mortality (leading to few co-infections). However, the FPV-induced decrease in the immune response is transient and more likely to occur in young kittens, whereas FIV infection is more frequent in adult cats. The persistence of FPV-antibodies can be longer than 7 years [98], and consequently, double seropositivity against FPV and FIV is not synonymous of co-infection. It is likely that co-infections by the two viruses are not frequent and mainly occur in adult animals which are less sensitive to FPV. As no association was evidenced for these three pairs of viruses, the FIV infection does not seem to modify the risk of infection by another virus. However, our results do not exclude the occurrence of an interaction once both parasites are in contact within the host (e.g., directly through competition or indirectly via the host immune system), as suggested by several experimental co-infection studies. In addition, the FIV seropositivity status may encompass different stages of the infection with various degrees of immunodeficiency. The results of this study do not exclude the possibility that late stage FIV infection may increase the sensitivity to the other feline viruses.
In contrast, the three other pairs (FHV-FCV, FHV-FPV and FCV-FPV) were significantly associated after correction for their known risk factors. This is to our knowledge the first evidence of a possible interaction between those viruses. As more double seropositive cats than expected under the independence hypothesis were observed, possible synergies are suggested. After an acute infection, FHV is known to persist life-long in a latent form, which can be reactivated in stressful conditions [73]. Infection with FPV or FCV could thus be responsible for the reactivation of FHV in latently infected animals, resulting in seroconversion against both FHV and the newly infecting virus. This could explain the FHV-FCV and FHV-FPV associations. In addition, since FPV is more immunosuppressive than FCV, the interaction between FPV and FHV is expected to be stronger than that between FCV and FHV, which is consistent with our results. The immunosuppressive effect of FPV could also explain the association with FCV. In that case, however, contrary to FHV, it would require that the FCV infection occur during the immunosuppression that takes place within the two weeks following FPV infection. Interestingly, a similar association between FPV and FCV antibodies was described in free-ranging lions in East Africa [99].

Real interactions or confounding factors?
This work pointed out new probable synergies between feline viruses that can now be further investigated in laboratory conditions. However, the associations could also result from the existence of an unknown confounding factor common to FHV, FCV and FPV. The feline parvovirus is immunosuppressive, as a result of the strong leukopenia occurring within the two weeks post-infection [95,96]. This virus could therefore be a confounding factor to the FHV-FCV pair if FPV-seropositive cats are more susceptible to FHV and FCV at the same time. However, as shown in this paper, the FHV-FCV interaction remained significant after correction by FPV (Table 2).
If FPV is not a confounding factor, we cannot exclude the existence of another one, such as a greater susceptibility of certain individuals to infections whatever the parasite involved. Numerous studies have shown that extensive inter-individual variability exists in the response to certain pathogens, such as HIV (review in [100]), trypanosomiasis (review in [101]), or human and bovine tuberculosis (reviews in [102,103]), including variations in susceptibility to the parasite, its transmission, and/or the course of disease progression. This variability has been attributed to host determinants and to variability in multiple genes that regulate virus cell entry, acquired and innate immunity (e.g., macrophages, molecular and cellular actors of the inflammatory reaction), and others that influence the outcome of the infection. Hosts with a diminished or delayed innate immune response may in fact be more susceptible to any infection, with physiological parameters, such as hormonal profiles (e.g., [104]), possibly playing a role in the modulation of transmission efficiency and/or in the intensity of the immune response. A weaker physical condition could also lead to a higher susceptibility to infectious agents (lower dose-effect, different intra-host dynamics) (e.g., [105]). More generally, individual personality may also be involved [61,106]. A better understanding of the genetic, physiological and immunological bases of such inter-individual variability would therefore be of particular interest in the context of polyparasitism. Another perspective of this work is the development of new methods able to distinguish pairwise interactions from associations due to confounding factors shared by the three viruses. Such methods could use the proportion of infected individuals that are in reality triply infected.

Conclusion
While the study of macroparasites usually uses quantitative data (i.e., parasite load per individual host), the study of microparasites in the field is most of the time limited to presence-absence data (i.e., serology), making the detection of associations between parasites more complicated from a methodological point of view. The corrected chi-square proposed in this study is, with the logistic regression approach, currently one of the rare ways to search for interactions between parasites from presence-absence data. This work provides evidence of the efficiency of such methods in reducing the bias introduced by common risk factors and encourages their use. However, it also points out the low robustness of the likelihood ratio test for certain data characteristics. The corrected chi-square test must indeed be preferred for small sample sizes.
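The principle of comparing observed frequencies with those expected under independence given the risk factors can be sketched as follows. This is a minimal Python illustration and not the provided R program: the function names are hypothetical, the per-individual infection probabilities are assumed to come from some previously fitted risk-factor model, and the counts are invented. The sketch computes only the test statistic; it does not reproduce the dispersion coefficient or the bootstrap-based P-values of the corrected chi-square.

```python
def expected_counts(p1, p2):
    """Expected numbers of (double positive, parasite 1 only, parasite 2 only,
    double negative) hosts if the two infections are independent given each
    host's own infection probabilities p1[i] and p2[i]."""
    pairs = list(zip(p1, p2))
    both = sum(a * b for a, b in pairs)
    only1 = sum(a * (1 - b) for a, b in pairs)
    only2 = sum((1 - a) * b for a, b in pairs)
    neither = sum((1 - a) * (1 - b) for a, b in pairs)
    return both, only1, only2, neither

def chi_square(observed, expected):
    """Pearson chi-square statistic over the four infection classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical probabilities (assumption): 100 low-risk and 100 high-risk hosts.
p1 = [0.1] * 100 + [0.6] * 100
p2 = [0.1] * 100 + [0.6] * 100
expected = expected_counts(p1, p2)

# Hypothetical observed counts with an excess of double positives.
observed = (45, 25, 25, 105)
statistic = chi_square(observed, expected)
print(expected, statistic)
```

Because the expected frequencies are built from host-level probabilities rather than from pooled prevalences, an excess of double positives that is fully explained by the risk factors contributes nothing to the statistic; only the remaining excess does.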
Those methods can be applied to any epidemiological study based on serology, within human or animal host populations. Applied here to feline viruses, they revealed significant associations between three pairs of feline viruses. Although they do not yet allow us to decide whether such associations are true interactions or whether they reveal the existence of "over-susceptible" hosts, we believe this is an important step forward, as it makes it possible to point out parasite associations that should be further investigated in experimental conditions. Understanding parasite interactions and their consequences for disease evolution, emergence and management is indeed a crucial challenge for human and animal epidemiologists of our time.

Figure S2 Outcome of the conformity tests of the type I error to 5% according to the N_F/n ratio for the logistic regression approach. The outcome was coded 1 when the test was significant and 0 when it was not, and the resulting logistic regression was drawn (dark line). Three scenarios are considered: i) all factors are qualitative (A); ii) all factors are quantitative (B) and iii) half of the factors are quantitative and the other half are qualitative (mixed scenario, C). (EPS)

Figure S3 Type I error (%) of the corrected chi-square tests according to the N_F/n ratio and the type of P-value used for the corrected chi-square: P-value1 (blue empty points) or P-value2 (red full points). Three scenarios are considered: i) all factors are qualitative (A); ii) all factors are quantitative (B) and iii) half of the factors are quantitative and the other half are qualitative (mixed scenario, C). The dashed horizontal line represents a type I error of 5%. (EPS)

Table S1 Corrected chi-square tests and logistic regressions to search for interactions between feline viruses, using subsets randomly sampled from the cat data such that the N_F/n ratio takes various values. (DOC)

File S1 Robustness of the logistic regression approach and of the corrected chi-square test. (1) Conformity tests of the type I error to 5%, (2) Influence of the way the P-value of the corrected chi-square test is calculated on the robustness of the study. (DOC)

File S2 "Chi2corr", an R program for the application of the corrected chi-square test to any presence-absence data: test statistic, observed and expected frequencies, estimated dispersion coefficient (parametric bootstrap), P-values and distribution of the bootstrapped corrected chi-square. (R)

File S3 A step-by-step example of application of the corrected chi-square test to search for interaction between two parasites, using a provided dataset ("data_example.txt", File S4) and the provided R program ("Chi2corr.R", File S2).