Factors Associated with Post-Seasonal Serological Titer and Risk Factors for Infection with the Pandemic A/H1N1 Virus in the French General Population

The CoPanFlu-France cohort of households was set up in 2009 to study the risk factors for infection by the pandemic influenza virus (H1N1pdm) in the French general population. The authors developed an integrative data-driven approach to identify individual, collective and environmental factors associated with the post-seasonal serological H1N1pdm geometric mean titer (GMT), and derived a nested case-control analysis to identify risk factors for infection during the first season. This analysis included 1377 subjects (601 households). The GMT for the general population was 47.1 (95% confidence interval (CI): 45.1, 49.2). According to a multivariable analysis, pandemic vaccination, seasonal vaccination in 2009, recent history of influenza-like illness, asthma, chronic obstructive pulmonary disease, social contacts at school and use of public transport by the local population were associated with a higher GMT, whereas history of smoking was associated with a lower GMT. Additionally, young age at inclusion and risk perception of exposure to the virus at work were identified as possible risk factors, whereas presence of an air humidifier in the living room was a possible protective factor. These findings will be interpreted in light of the longitudinal analyses of this ongoing cohort.

The CoPanFlu-France cohort, described previously [11], was designed to study the risk of influenza infection as a complex combination of biological characteristics (including immunity), individual or collective behaviors and environmental context. This integrative approach consists of comprehensively collecting and analyzing epidemiological data on subjects and their environment, as well as biological samples [12,13].
Inclusion of households started in December 2009, at the end of the first H1N1pdm season in metropolitan France. We studied factors associated with the post-pandemic H1N1pdm titer from blood samples collected at inclusion. Previous studies showed that post-pandemic titer was linked to age class [2,14-20] and to pandemic vaccination status [21]. Drawing on the large amount of data collected at cohort entry, we looked for other independent associations with this titer. In a complementary study, we carried out a nested case-control analysis in these subjects to identify risk factors for probable infection during the first H1N1pdm season.

Study design
This study relies on 601 households (1450 subjects) included in the study between December 2009 and July 2010, according to a stratified geographical sampling scheme in the French general population. More details on this sampling procedure, the representativeness of the sample and the global study design are available in a previous publication [11]. A total of 575 households (96%) were included after the first pandemic season (September 7 to December 27, 2009 [22]).
During the inclusion visit, nurses collected detailed data from all subjects with questionnaires and blood samples for serological analyses. As 73 of these samples (5.0%) were either too difficult to obtain (young children especially) or of insufficient quality or quantity to be analyzed, the analyses presented here focused on the 1377 subjects for whom haemagglutination inhibition (HI) titer was measured.
Variables
HI assay. The outcome measure was the post-seasonal HI titer, measured from blood samples collected at inclusion. A standard HI technique was adapted to the detection and quantification of antibodies to H1N1pdm. The HI assay was conducted in a Bio-Safety Level 3 laboratory using 5.33 haemagglutinating units of non-inactivated antigen [14]. The antigen was a dilution of cell culture supernatant of an H1N1pdm strain (strain OPYFLU-1 isolated from a young patient returning from   [23]. A final volume of 75 µl was used, including 25 µl of serum dilution, 25 µl of virus suspension, and 25 µl of a 1% RBC suspension in PBS (v/v: 0.33%). The HI titer was determined as the highest dilution providing clear inhibition of haemagglutination in two independent readings [24]. All experiments were conducted using serial dilutions (1/10-1/1280) of heat-inactivated sera and group O human erythrocytes (French Blood Bank). All experiments were performed with the same negative and positive controls [25] and with a serum agglutinating activity control. All steps of the HI assay were performed on Eppendorf epMotion working stations.
Definition of infections (case-control analysis). Though some authors previously carried out risk factors analyses after defining cases as subjects with HI titer ≥1/40 [2,26], we chose in our main analysis a higher threshold for our definition as titers between 1/40 and 1/80 were likely to result from a cross-reaction. We therefore defined cases as subjects with HI titer ≥1/80 and all other subjects were considered as controls. In two sensitivity analyses, we additionally defined (i) controls as subjects with HI titer <1/40 and (ii) cases as subjects with HI titer ≥1/80 who reported an influenza-like illness (ILI) during the pandemic season and controls as subjects with HI titer <1/40 and no history of ILI. All pandemic vaccine recipients were excluded from these analyses.
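The three case definitions above can be sketched as a small classification function. This is only an illustration: the function name, the labels "main", "sens_i" and "sens_ii", and the use of the reciprocal titer (80 for 1/80) are assumptions introduced here, and the sketch assumes that subjects matching neither the case nor the control definition in a sensitivity analysis are excluded from it.

```python
from typing import Optional


def classify_subject(titer: float, ili: bool, definition: str = "main") -> Optional[str]:
    """Classify one unvaccinated subject as "case", "control" or None (excluded).

    titer is the reciprocal HI titer (e.g. 80 for 1/80); ili is the reported
    history of influenza-like illness during the pandemic season.
    The definition labels are hypothetical names used for this sketch only.
    """
    if definition == "main":
        # Main analysis: cases have a titer >= 1/80, everyone else is a control.
        return "case" if titer >= 80 else "control"
    if definition == "sens_i":
        # Sensitivity analysis (i): controls restricted to titer < 1/40.
        if titer >= 80:
            return "case"
        return "control" if titer < 40 else None
    if definition == "sens_ii":
        # Sensitivity analysis (ii): cases also need ILI, controls need no ILI.
        if titer >= 80 and ili:
            return "case"
        if titer < 40 and not ili:
            return "control"
        return None
    raise ValueError(f"unknown definition: {definition}")
```

Under these assumptions, a subject with a titer of 1/40 is a control in the main analysis but is left out of both sensitivity analyses.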
Covariates. All covariates used in the analysis are listed elsewhere [11] and detailed in Tables S1-S6 in File S1. The relation with HI titer was studied for 310 covariates, gathered according to 6 main dimensions: 1) sociodemographic characteristics, smoking habits and medical history, 2) vaccination and preventive measures against the virus, 3) indoor housing, 4) attitudes, beliefs and risk perception, 5) nature of meetings with other people and characteristics of contacts and 6) ecological data regarding the surrounding environment. For dimensions 1 to 5, we used data collected from questionnaires completed by the household members, with the help of the visiting nurse. For geographic data, we geocoded the addresses of households and used information on the surrounding demographic and socioeconomic context provided by the French Institut national de la statistique et des études économiques (Insee) regarding statistical block groups of about 2000 inhabitants (IRIS) [27].
Definitions and coding. Some quantitative covariates were either dichotomized or log-transformed to enhance log-linearity of the studied relation (see supplementary material for details). Age was studied as a quantitative covariate. Subjects reported medical history and vaccinations with the help of their health records. We defined history of ILI as fever ≥37.8°C and cough and/or sore throat without another known cause [28] between September 7, 2009 and the date of inclusion. This covariate was excluded from the case-control analysis, which focused on possible risk factors. Daily frequency of hand washing was reported for the day before inclusion. For covariates describing smoking habits and preventive measures against the virus, the characteristics of other members of the household were studied as an individual explanatory covariate (as a mean for quantitative covariates and as a proportion for binary covariates). Covariates regarding attitudes, beliefs and risk perception were collected from all subjects aged over 15 years with a dedicated questionnaire. Subjects were presented with affirmative statements and asked, for each, whether they totally agreed, partly agreed, partly disagreed or totally disagreed. These answers were dichotomized (agree/disagree).
A contact was defined as someone the subject either spoke with (at least 3 words) or had physical contact with. All subjects reported meetings with their contacts during a 3-day period ending the day before inclusion. Duration and location of meetings were collected, as well as age of contacts. In order to study meetings as covariates likely to be associated with the HI titer, we summed the individual durations of daily meetings according to their location (home, work, transport and school) and to the age of contacts, respectively. Summed durations of meetings were log-transformed (with an imputed value of 0.01 minute for subjects reporting a null summed duration of meetings). No information was collected on simultaneity of meetings, and the total reported duration of meetings was additive (e.g., a 10-minute meeting with 3 contacts simultaneously accounted for 30 minutes of meeting).
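The construction of the meeting covariates described above can be sketched as follows. The record layout (location, minutes, number of contacts), the function name, the averaging over the reporting days and the use of the natural logarithm are assumptions made for illustration; only the additive rule and the 0.01-minute imputation come from the text.

```python
import math
from collections import defaultdict

LOCATIONS = ("home", "work", "transport", "school")
ZERO_IMPUTE_MIN = 0.01  # value imputed for a null summed duration (from the text)


def log_daily_durations(meetings, n_days=3):
    """Return the log of the mean daily summed meeting duration per location.

    meetings: iterable of (location, minutes, n_contacts) tuples reported over
    the n_days reporting period (hypothetical record layout).
    """
    totals = defaultdict(float)
    for location, minutes, n_contacts in meetings:
        # Durations are additive: 10 minutes with 3 simultaneous contacts = 30 minutes.
        totals[location] += minutes * n_contacts
    result = {}
    for location in LOCATIONS:
        daily = totals[location] / n_days  # assumption: average over reporting days
        # Natural log is an assumption; the base is not stated in the text.
        result[location] = math.log(daily if daily > 0 else ZERO_IMPUTE_MIN)
    return result
```

For instance, `log_daily_durations([("school", 10, 3)], n_days=1)["school"]` equals `math.log(30)`, matching the worked example in the text, while locations with no reported meeting receive `math.log(0.01)`.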

Statistical methods
All collected covariates likely to be associated with post-seasonal elevated HI titer were studied. Comparison tests between subgroups were Fisher's exact test (for binomial covariates) and Kruskal-Wallis rank sum test (for continuous covariates).
Estimation of geometric mean titers (GMTs). GMTs were estimated for HI assays with the use of regression models for interval-censored data [29,30] accounting for the within-household correlation. Post-stratification was used to compute representative post-seasonal GMT in the French general population. Calculation and use of sampling weights were detailed elsewhere [11].
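To illustrate why HI titers call for interval-censored regression, the sketch below fits a log-normal distribution to titers read from two-fold dilutions. It is a deliberately simplified stand-in, not the authors' model: it has no covariates, ignores within-household correlation and sampling weights, and uses a crude grid search instead of the `survreg` fit reported later in this section. The convention that a reading of 1/t places the true titer in [t, 2t), and the handling of readings outside the 1/10-1/1280 range, are assumptions.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def titer_interval(titer, lowest=10, highest=1280):
    """Log-scale interval assumed to contain the true titer for one reading."""
    if titer < lowest:            # no inhibition at the first dilution: left-censored
        return (-math.inf, math.log(lowest))
    if titer >= highest:          # inhibition at the last dilution: right-censored
        return (math.log(highest), math.inf)
    return (math.log(titer), math.log(2 * titer))


def log_likelihood(mu, sigma, titers):
    """Interval-censored log-likelihood of a normal model for log titers."""
    total = 0.0
    for t in titers:
        lo, hi = titer_interval(t)
        p = norm_cdf((hi - mu) / sigma) - norm_cdf((lo - mu) / sigma)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def fit_gmt(titers, n_mu=200, sigmas=(0.25, 0.5, 0.75, 1.0, 1.5, 2.0)):
    """Grid-search maximum likelihood; returns (GMT estimate, sigma)."""
    mu_lo, mu_hi = math.log(5), math.log(2560)
    best = None
    for i in range(n_mu + 1):
        mu = mu_lo + (mu_hi - mu_lo) * i / n_mu
        for sigma in sigmas:
            ll = log_likelihood(mu, sigma, titers)
            if best is None or ll > best[0]:
                best = (ll, mu, sigma)
    return math.exp(best[1]), best[2]
```

The GMT estimate is the exponential of the fitted mean of the log titers, so a sample in which every reading is 1/40 yields an estimate between 40 and 80 rather than exactly 40.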
We defined the "GMT ratio" (GMTR) as the multiplicative factor between the GMT in exposed and non-exposed subjects (for a binary covariate) or for each unit increase (for a quantitative covariate).
Control for confounding. As age and pandemic vaccination status had a strong effect on the serological titer, GMTR was systematically adjusted for these major confounders in all univariable analyses. For analyses regarding environmental characteristics of the bedroom and of the IRIS, correlations were considered at these two levels respectively. Analyses of contacts were adjusted for the proportion of weekend days in the 3-day period.
Case-control analysis. Risk factors for infection were studied with the use of alternating logistic regression to model the pairwise odds ratios (ORs) between responses of subjects living in the same room, the same household or the same IRIS [31]. All univariable analyses were adjusted for age, and control for confounding was carried out with the same adjustment measures as those used for the GMT analysis.
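One way to read the GMTR definition, assuming a log-linear regression of the titer on a single covariate, is written out below. The symbols T_i (titer of subject i), x_i (covariate), beta_0, beta_1, sigma and epsilon_i (zero-mean error) are notation introduced here for illustration and do not appear in the source.

```latex
\log T_i = \beta_0 + \beta_1 x_i + \sigma \varepsilon_i ,
\qquad
\mathrm{GMT}(x) = \exp\!\left(\beta_0 + \beta_1 x\right),
\qquad
\mathrm{GMTR} = \frac{\mathrm{GMT}(x+1)}{\mathrm{GMT}(x)} = \exp(\beta_1).
```

For a binary covariate this reduces to GMT(1)/GMT(0), so under this reading a GMTR above 1 corresponds to a higher GMT in exposed subjects and a GMTR of 1 to no association.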
Model selection. The same selection process was used for both analyses. GMTRs and ORs were estimated for all covariates individually. Since a large number of covariates were tested, we adjusted the p-values to control the alpha inflation associated with multiple hypothesis testing and to account for the false discovery rate (FDR) for all covariates [32]. All covariates with an adjusted P<0.30 in univariable analyses were included in a multivariable analysis. Thirty datasets were imputed via multiple imputation by chained equations (MICE) [33]. Covariates related to attitudes, beliefs and risk perception for children were sampled from subjects over 15 years in the same household.
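The p-value adjustment described above can be illustrated with the Benjamini-Hochberg step-up procedure. The text only cites reference [32] for its FDR method, so the choice of this particular procedure is an assumption, as are the function name and the example values.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical univariable p-values for four covariates.
raw = [0.01, 0.04, 0.03, 0.5]
adj = bh_adjust(raw)
# Covariates carried forward to the multivariable analysis (adjusted P < 0.30).
selected = [i for i, q in enumerate(adj) if q < 0.30]
```

With these illustrative values, the first three covariates pass the 0.30 threshold and the fourth does not.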
The criterion for model selection was the mean residual sum of squares under 10-fold cross-validation (aimed at avoiding overfitting and controlling the FDR), computed over the 30 imputed datasets and considering only models with P<0.05 for all covariates.
Resulting coefficients and standard errors were combined to obtain the reported results [34]. Additional multivariable models were estimated separately stratified by pandemic vaccination status.
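Combining coefficients and standard errors across imputed datasets is commonly done with Rubin's rules. The text cites reference [34] without naming the method, so the sketch below assumes Rubin's rules, and it assumes pooling is done on the coefficient (log) scale before exponentiating to a GMTR or OR. The function name and example values are hypothetical.

```python
import math


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                   # pooled estimate
    within = sum(se ** 2 for se in std_errors) / m               # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total_var = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total_var)


# Hypothetical log-GMTR coefficients from two imputed datasets.
coef, se = pool_rubin([0.2, 0.4], [0.1, 0.1])
gmtr = math.exp(coef)  # back-transform to the ratio scale
```

The pooled standard error is larger than either input standard error here because the two imputed datasets disagree, which is the between-imputation uncertainty the rule is meant to capture.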
Statistical analyses were performed with R version 2.15. We estimated GMTRs with the function "survreg" (package "survival" version 2.36). Multiple imputation was done with the package "mice" and we ran alternating logistic regression with the package "orth".

Descriptive data
Characteristics of the 1377 subjects are given in Tables S1-S6 in File S1. The median age at inclusion was 43.1 years (interquartile range (IQR): 20.7, 59.9 years), 38 children were aged 2 to 5 years and 14 children were aged <2 years. A total of 561 subjects (40.1%) had at least one history of chronic disease. History of ILI since the beginning of the pandemic wave was reported in 99 subjects (7.5%). For the 3 previous seasons, the proportion of ILI ranged from 7.3% to 19.1%. History of smoking was reported in 544 subjects (39.5%).
The proportion of pandemic vaccine recipients was 12.2%. The median time since pandemic vaccination was 3.2 (IQR: 1.7, 5.
Detailed data on meetings were collected for 1360 of the 1377 subjects. The median number of reported daily meetings was 6 (IQR: 3, 10) and the median summed duration was 963 (IQR: 503, 1646) min/day, with significant differences according to age groups and locations (Figure 1). Subjects aged less than 15 years had a higher daily duration of meetings than older ones: 1847 (IQR: 1147, 2564) min vs. 848 (IQR: 440, 1378) min, P<0.0001. Children at school reported a large number of meetings with children of the same age. Working adults aged 20 to 60 years had many meetings with persons of their age. At home, subjects had meetings with people of their age and with persons from the previous or next generation.
Figure 2 gives an overview of the estimated post-stratified GMT with respect to the general population structure, in relation to pandemic vaccination status, history of ILI and age of subjects.

Factors associated with the GMT
All univariable GMTR estimates are listed in Tables S1-S6 in File S1. A total of 40 covariates with adjusted P<0.30 were retained in the multivariable analysis.
Selected multivariable models are listed in Table 1. Considering all the subjects (irrespective of vaccination status), the final model retained (i) pandemic vaccination, 2009 seasonal vaccination, history of ILI for season 2009-2010, asthma, COPD, duration of meetings at school and IRIS proportion of workers using public transport as covariates associated with a higher GMT, and (ii) history of smoking as a covariate associated with a lower GMT.
Considering the 1,207 subjects without pandemic vaccination, "asthma" was the only covariate that did not remain in the final model. Considering the 171 pandemic vaccine recipients, history of ILI remained in the model, while older age at inclusion and time since pandemic vaccination were associated with a lower GMT.

Case-control analysis
The 1,207 unvaccinated subjects were included in this analysis, 171 as cases and 1,036 as controls. The proportions of subjects with a history of ILI were 18.0% in cases and 6.5% in controls (P<0.0001). The final multivariable model retained (i) COPD, asthma, duration of meetings at school, proportion of workers using public transport and belief that not going to work protects against H1N1pdm as factors associated with a higher risk of probable infection, and (ii) older age and having an air humidifier in the living room as factors associated with a lower risk (see Table 2 for details). Though we estimated pairwise odds ratios between responses of subjects living in the same room, the same household or the same IRIS, only the household level was kept in the final model as the other ones were not significant, OR = 3.31 (95% CI: 1.82, 6.02). Multivariable models for the sensitivity analyses retained subsets of these factors with no additional factors (Tables S7 and S8 in File S2).
Factors Associated with H1N1pdm Serological Titer PLOS ONE | www.plosone.org

Covariates associated with HI titer
Post-pandemic elevated HI titer can be explained by a prepandemic elevated titer, a recent increase in titer due to an infection by the pandemic virus or to another antigenic stimulation (e.g., pandemic vaccination), or by any combination of these different factors. We review our findings in light of other studies on the same topic.
The global multivariable model including pandemic vaccination gave information on the association of this factor with the GMT. Adjustment for this factor in the same model allowed us to study factors that may have an impact on GMT increase after either vaccination or infection, whereas analyses stratified by vaccination status were intended to focus more specifically on factors associated with other causes of elevated GMT.
We found a lower anti-H1N1pdm GMT in older subjects in the univariable analysis and in the multivariable model run among pandemic vaccine recipients. This covariate did not remain in the other multivariable models, mainly because of the adjustment for duration of contacts at school (age was significantly associated with the GMT in all models when we excluded this covariate). Older age was also associated with a lower risk of probable infection in the case-control analysis.
As expected, a reported history of ILI was associated with an elevated GMT, which indicates that some of these ILIs were probably caused by H1N1pdm infection. Though this factor lacks the sensitivity and specificity needed for a good correlate of infection, its coefficient in the selected multivariable models gives more information on the relative role of infections among all causes of a GMT increase. Its association with the GMT in vaccinated subjects indicates that elevated titers in this group were also partly due to H1N1pdm infections. Indeed, as most vaccinations occurred at the end of the pandemic course (Figure 3), we could not distinguish whether the increased GMT in vaccine recipients was caused by vaccination itself or by previous infection.
Asthma and COPD were associated with a higher GMT and were possible risk factors in the case-control analysis. Asthmatics may have increased susceptibility to H1N1pdm infection [45], possibly because of alterations in the airway architecture [46,47] and impairment of innate immunity [47]. Another hypothesis to explain a higher GMT in subjects with such medical conditions, regardless of their susceptibility to infection, would be a more severe illness [48] involving a greater immune response [49].
We found that smoking history was associated with a lower GMT. Several studies have found an association between cigarette smoking and the risk of contracting influenza infection [50-52]. However, smokers have a well-known diminished serological response to influenza infection or vaccination [52], although the immunosuppressive mechanism is still unclear [53-56].
Seasonal vaccination for any season since 2006-2007 was associated with an increase in the GMT, possibly because of a cross-reactive immune response to seasonal vaccine H1N1 strains [57], though studies investigating this association were all inconclusive [3,58]. Another hypothesis is that an elevated post-seasonal titer might be a consequence of an increased risk of pandemic infection in seasonal vaccine recipients [59], though conflicting results were reported about this association [60-64].
Among covariates related to the environmental characteristics of the housing, only the association between presence of an air humidifier in the living room and lower risk remained in the case-control multivariable model, which may be consistent with the possible impact of relative humidity on influenza aerosol transmission [65,66].
The multivariable analysis retained no covariate related to attitudes, beliefs and risk perception, except the belief that not going to work may protect against H1N1pdm infection, associated with a higher risk in the case-control analysis. We have no clear interpretation for this finding, except that this covariate may be a correlate of more general characteristics of risk perception, which affect the transmission patterns of pandemic influenza.
The higher GMT and higher risk of probable infection associated with duration of meetings at school were not surprising, since schools are identified as places with high meeting rates between influenza-susceptible subjects [67]. Interestingly, we did not find a significant association of GMT with daily duration of meetings with children younger than 10 years old regardless of location, suggesting that school favors transmission through a particular pattern of contacts or environmental characteristics [67,68].
The multivariable analysis retained no covariate related to the characteristics of the surrounding area, except the proportion of workers using public transportation to go to work, which also appeared as a possible risk factor.
The high pairwise OR we found in the case-control analysis for subjects living in the same household suggests a common environmental exposure or susceptibility for these subjects, who often belong to the same family, or more probably an elevated intra-household secondary attack rate (estimated at 4 to 37% in previous household studies [10]).

Limitations
Though households were sampled in the general population, some households refused to participate, which may induce a selection bias. However, comparisons with French population census data suggest that this bias was limited [11], and post-stratification of the GMT by age and vaccination status with respect to the French population structure did not modify the results significantly. We did not post-stratify our estimations of the GMTR, as the choice of the auxiliary covariates used to adjust the sampling weights could have induced important changes in the standard error of our estimates, leading to spurious associations [69].
The timeline of inclusion may have induced recall or reporting biases. The cohort was designed to include households before the 2009 pandemic season and to follow up subjects during the influenza season. As inclusions were delayed, data regarding ILIs were collected retrospectively and recall bias may be important in subjects with late inclusion. Moreover, we found a decreasing GMT according to time since vaccination in pandemic vaccine recipients, and we cannot exclude an antibody loss in the months following an infection, although we did not find any association between GMT and date of inclusion in unvaccinated subjects.
Such limitations may have biased the association between GMT and other covariates.
In the case-control analysis, cases were defined serologically, yet we know that an elevated titer can sometimes be explained by cross-reactions, especially in the elderly [16], and that infected subjects can show a low titer a few months after infection [70]. This lack of specificity and sensitivity to identify infections must be considered in light of the sensitivity analyses results, which often showed similar results with different case definitions.
Another limitation may be linked to the amount of data collected. Though we controlled the FDR with the use of specific procedures, multiple testing of hundreds of covariates results in an important risk of finding spurious associations, due to the alpha inflation phenomenon.
Because of these limitations, our analysis must be understood as a hypothesis-generating study aimed at identifying the possible role of many factors that would probably not have been studied otherwise. Further studies would be necessary to confirm the impact of these factors and their implications for the control of influenza.

Conclusion
We used a data-driven framework to carry out an exploratory analysis of potentially relevant risk factors for infection. This hypothesis-generating tool, relying on an integrated approach, allowed us to highlight the possible impact of previously unknown factors from several dimensions usually studied separately, such as presence of an air humidifier (indoor environment), duration of meetings at school (social contacts), characteristics of the local population or risk perception. Additional data are being collected and analyzed in this ongoing cohort. The longitudinal analysis of these households will permit integrative analyses of complex phenomena such as individual, collective and environmental risk factors for infection, routes of transmission, or determinants of the immune response to infection or vaccination.

Supporting Information
File S1 Tables S1-S6. Description and univariable analyses for all covariates.