Genetic Heterogeneity of Oesophageal Cancer in High-Incidence Areas of Southern and Northern China

Background and Objective: Oesophageal cancer is one of the most common and deadliest cancers worldwide. Our previous population-based study reported a high prevalence of oesophageal cancer in Chaoshan, Guangdong Province, China. Ancestors of the Chaoshan population migrated from the Taihang Mountain region of north-central China, which is another high-incidence area for oesophageal cancer. The purpose of the present study was to obtain evidence of inherited susceptibility to oesophageal cancer in the Chaoshan population, with reference to the Taihang Mountain population, with the eventual goal of molecular identification of the disease genes.

Methods: We conducted familial correlation, commingling, and complex segregation analyses of 224 families from the Chaoshan population and 403 families from the Taihang population using the FPMM program of S.A.G.E. version 5.3.0. A second analysis focused on specific families having large numbers of affected individuals or early onset of the disease.

Results: In the general population, a moderate sib-sib correlation was observed for oesophageal cancer, and the brother-brother correlation was higher still. Commingling analyses indicated that a three-component distribution model best accounts for the variation in age of onset of oesophageal cancer, and that a multifactorial model provides the best fit to the general population data. An autosomal dominant mode and a dominant or recessive major gene with polygenic inheritance were found to be the best models of inherited susceptibility to oesophageal cancer in some large families.

Conclusions: The current results provide evidence for inherited susceptibility to oesophageal cancer in certain high-risk groups in China, and support efforts to identify the susceptibility genes.


Introduction
The incidence of oesophageal cancer varies by more than 300-fold worldwide, with the highest rates recorded in certain areas of China and central Asia. [1,2,3] For most Chinese populations, the vast majority of oesophageal cancers are squamous cell carcinomas. [1,4,5] In north-central China, the high-incidence areas are mainly located along the northern borders of three provinces (Hebei, Henan, and Shanxi), abutting the southern flank of the Taihang Mountains. The mortality rate can be as high as 1100/100,000, as occurs in Linxian, Henan Province, and in Yangcheng, Shanxi Province. [6,7,8] Over the last two decades, Wu and his colleagues [9] have sought epidemiological evidence of genetic susceptibility in the development of oesophageal cancer as well as effective ways of screening for individuals who are highly susceptible to the disease. Familial aggregation of oesophageal cancer was found in this high-incidence area. [10] Moreover, the results of segregation studies by Wu et al. [7,11] on 221 high-risk nuclear families from Linxian, all with offspring ≥40 years old, and 225 high-risk families collected from Yangquan City suggested an autosomal recessive mode of inheritance of oesophageal cancer in these two high-risk locales.
In southern China, in the Chaoshan littoral region two thousand kilometres from the Taihang Mountains, there exists another high-incidence area for oesophageal cancer, notably Nanao Island. [4] According to our previous study, the age-standardised incidence rates of oesophageal cancer in males and females were 72-150/100,000 and 26-64/100,000, respectively, from 1995 to 2004 in Nanao Island. Oesophageal cancer and cardiac cancer were the most prevalent malignancies, comprising 52% of the total malignant tumour cases, with oesophageal cancer showing an upward trend. [12] Despite the geographic separation, could there be a common aetiology for oesophageal cancer in these two areas? Historical records, supported by recent genetic data in the form of polymorphisms of both the Y chromosome and mtDNA, indicate that the ancestors of the Chaoshan people migrated from the Taihang Mountain region. Analysis of mtDNA haplogroups showed that a shared maternal genetic background is associated with the high-risk populations in the two areas. [13] Epidemiological studies have shown that certain environmental risk factors are also associated with oesophageal cancer in the Chaoshan area, including fermented fish sauce, dietary habits, alcohol consumption, tobacco smoking, and drinking of Kongfu tea. [6,14,15,16,17,18,19,20,21,22] Nonetheless, familial clustering of oesophageal cancer has been documented in the Chaoshan high-incidence population, providing an important clue to genetic aetiology. [23] Moreover, the existence of two high-risk populations (Chaoshan and Taihang) in two obviously different environments, but related through common ancestors, indicates that genetic susceptibility may play an important role in the risk of developing oesophageal cancer. Therefore, it is essential to explore whether genetic factors are indeed involved in the aetiology of oesophageal cancer in the Chaoshan high-incidence area.
The aim of the present study was to obtain evidence of a specific model of inherited susceptibility to oesophageal cancer in the Chaoshan and Taihang Mountain high-risk populations that would support further studies to localise susceptibility genes in familial oesophageal cancer. A total of 224 population-based pedigrees from the Chaoshan area and 403 hospital-based pedigrees from the Taihang Mountain area were collected and studied for familial correlation, commingling, and complex segregation using the program Statistical Analysis for Genetic Epidemiology (S.A.G.E.).

Study subjects and data collection
Subjects from two populations were studied: population-based subjects of Chaoshan in southern China, and hospital-based subjects of the Taihang Mountain region in northern China. The Chaoshan area, with a population of approximately 10 million, is a littoral area in the eastern part of Guangdong Province. Its major cities are Shantou, Chaozhou, and Jieyang. Directly east is Fujian Province and to the south is the South China Sea. Chaoshan residents comprise a relatively isolated population and have kept the old Chinese language (Chaoshan dialect) and traditional customs. Nanao Island, with a population of approximately 70,000, is a county attached to Shantou, opposite Taiwan. The residents of Nanao Island are participants in a long-term study of oesophageal cancer, and a cancer registry for Nanao Island has been operating in cooperation with the Department of Pathology, Shantou University Medical College and the Health Bureau of Nanao Island since 1995. The system is a large population-based network covering village clinics as well as town and county hospitals. Recorded information includes patient demographics and native origin, age of onset, X-ray and pathologic confirmation of diagnosis, and stage of the disease for each case of oesophageal and cardiac cancer. Guided by the records and after obtaining informed consent, we interviewed all new subjects at their homes, accompanied by staff of the Board of Health of Nanao Island. A structured questionnaire was administered to the patients, who were asked to provide verbal answers. We recorded the information, including lifestyle habits (e.g., tobacco smoking and alcohol drinking) and diet, as well as the family history of four successive generations. Whenever a proband was unsure of an answer, we interviewed additional family members or older neighbours to ensure the accuracy of the information.
Because difficulty in swallowing is an important clinical symptom, it was used to ascertain older cases: patients from before the 1980s, when medical conditions were poor, who had been diagnosed by village doctors or were suspected of having died of oesophageal cancer were ascertained by this symptom. From this population, a total of 224 oesophageal cancer and cardiac cancer pedigrees were enrolled from registry data collected in 2003-2005.
The Taihang Mountain area has the highest incidence of oesophageal cancer in northern China. For this region we chose a cross-sectional study of a hospital-based population. We investigated the medical records of all patients with oesophageal cancer who presented at four hospitals (Linzhou (Linxian) Tumour Hospital, Linzhou People Hospital, Anyang Centre Hospital, and Anyang Tumour Hospital) during June to August 2005. After receiving consent from the patients and their physicians, we interviewed the patients. Families were ascertained through single probands, with the most recent affected family member being selected as the proband. Familial occurrence of oesophageal cancer was investigated in face-to-face interviews. Date at diagnosis was recorded for the proband and for all affected relatives. A total of 7379 individuals in 403 families ascertained through probands were enrolled in the study.
To explore the heterogeneity of age at onset we used the following criteria for selection of large pedigrees: oesophageal cancer patients in three successive generations, and age of onset progressively decreased from generation to generation. Finally, we obtained two early-onset families from 224 pedigrees in Chaoshan, and four early-onset families and a single extended pedigree (six generations deep) with 32 of 293 individuals affected with oesophageal cancer in the Taihang Mountains.
Informed-consent documents were signed by each participant before entering the project. This study was approved by the ethical review committee of Shantou University.

Familial correlation and statistical analysis
Familial correlation analysis was performed using the FCOR program of S.A.G.E. version 5.3.0, which estimates the correlations in trait values between pairs of relatives. Here, oesophageal cancer status was taken as the trait, and pair correlations were estimated for the following relative types: parent-offspring, sibling, avuncular, and cousin. In addition, Chi-square statistics and P-values were calculated to test the homogeneity of correlations among the subtypes within each main type.
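As an illustration of what a pairwise familial correlation measures, the following minimal sketch computes a plain Pearson correlation of a binary affection status over relative pairs. It is not the FCOR algorithm (which also handles pair weighting and standard errors); the function name `pair_correlation` and the sib-pair data are hypothetical.

```python
from math import sqrt

def pair_correlation(pairs):
    """Pearson correlation of a 0/1 trait over (relative_1, relative_2) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one member never varies")
    return cov / sqrt(var_x * var_y)

# Hypothetical sib pairs: (status of sib 1, status of sib 2), 1 = affected.
sib_pairs = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (0, 0), (1, 1), (0, 0)]
r_sib = pair_correlation(sib_pairs)
print(f"sib-sib correlation: {r_sib:.3f}")
```

Running the same function separately on brother-brother, sister-sister, and parent-offspring pairs would give the kind of subtype comparison described above, although without the homogeneity test.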

Commingling analysis
The covariance structure was analysed using the Commingling segregation program of S.A.G.E. version 5.3.0. One-, two-, and three-component distributions were estimated separately, and no transmission was specified for any of them. The remaining parameters for each component mean were estimated by maximum likelihood: the numerical methods start from a given set of initial values of the unknown parameters and traverse the likelihood surface until a boundary is met, a local maximum is found, or numerical difficulties are encountered.
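To make the comparison of one-, two-, and three-component distributions concrete, the sketch below evaluates the log-likelihood of a normal mixture with a common standard deviation. It is only an illustration under stated assumptions: the ages are invented, the parameters are fixed by hand rather than maximised, and no power transformation, ascertainment correction, or familial correlation is modelled, so it does not reproduce the S.A.G.E. procedure.

```python
from math import exp, log, pi, sqrt

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))

def mixture_loglik(ages, weights, means, sd):
    """Log-likelihood of ages under a normal mixture with a common SD."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must sum to 1")
    total = 0.0
    for age in ages:
        density = sum(w * normal_pdf(age, m, sd) for w, m in zip(weights, means))
        total += log(density)
    return total

# Hypothetical ages at onset (years); not data from this study.
ages = [38, 41, 44, 57, 59, 60, 62, 63, 65, 74, 76, 79]

ll_one = mixture_loglik(ages, [1.0], [59.8], 12.0)
ll_three = mixture_loglik(ages, [0.25, 0.5, 0.25], [41.0, 61.0, 76.0], 4.0)

# With maximised likelihoods, twice this difference would be the
# likelihood-ratio statistic comparing the two distributions.
print(f"one component: {ll_one:.2f}, three components: {ll_three:.2f}")
```

In a real analysis each log-likelihood would be maximised over the mixing proportions, means, and variance before the models are compared.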

Complex segregation analysis
We conducted a complex segregation analysis on the families to model genetic susceptibility to, and age at onset of, oesophageal cancer using the class of finite polygenic mixed models (FPMM) in S.A.G.E. version 5.3.0, run under a Linux 3.0 operating system. FPMM is the only option currently available for binary traits with variable age of onset.
This model leads to a likelihood that can be calculated using efficient algorithms developed for oligogenic models. The profiles for FPMM were closest to the profiles for the usual mixed model with exact calculations. [24,25,26] The basic theory of complex segregation analysis is that a subset of individuals in the population is assumed to be susceptible to oesophageal cancer and therefore to have an age of onset of oesophageal cancer, provided they live long enough, whereas the distribution of age of onset might depend on the possible segregating genes. If Mendelian transmission exists, it is assumed to be through a single autosomal locus with two alleles, A and B, A being the hypothesized disease allele. The frequencies of alleles A and B are denoted $q_A$ and $(1 - q_A)$, respectively. The segregation of a possible major locus is allowed for by letting one or more parameters depend on an unobserved (latent) qualitative factor $u$ = AA, AB or BB. We call $u$ an individual's type. In this context, type is best defined in terms of the expected distribution of an individual's offspring. Thus we use the term type to allow for many kinds of discrete transmission, whether Mendelian or not. The distribution of types in the population is assumed to be in Hardy-Weinberg equilibrium. Individuals of each type are assumed to transmit allele A to their offspring with transmission probabilities $t_{AA}$, $t_{AB}$, and $t_{BB}$, respectively. In what follows, $b$ denotes a baseline parameter and $a$ the age coefficient.
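The role of the allele frequency and the transmission probabilities can be illustrated with a short sketch. It computes Hardy-Weinberg type frequencies from the frequency of allele A and the distribution of offspring types from the parents' transmission probabilities. The function names and the numerical allele frequency are hypothetical; the Mendelian values (1, 0.5, 0) are the standard ones for types AA, AB, and BB.

```python
def hardy_weinberg(q_a):
    """Population frequencies of types AA, AB, BB for allele-A frequency q_a."""
    return {"AA": q_a ** 2, "AB": 2 * q_a * (1 - q_a), "BB": (1 - q_a) ** 2}

def offspring_types(father, mother, t):
    """Offspring type distribution given parental types.

    t[type] is the probability that a parent of that type transmits allele A.
    """
    pa, pb = t[father], t[mother]
    return {
        "AA": pa * pb,
        "AB": pa * (1 - pb) + (1 - pa) * pb,
        "BB": (1 - pa) * (1 - pb),
    }

# Mendelian transmission: AA always, AB half the time, BB never transmits A.
MENDELIAN = {"AA": 1.0, "AB": 0.5, "BB": 0.0}

q_a = 0.1  # hypothetical allele frequency, for illustration only
population = hardy_weinberg(q_a)
child_of_carriers = offspring_types("AB", "AB", MENDELIAN)

# If every type transmits A with probability q_a, offspring types no longer
# depend on the parents: the "type" is not transmitted.
NOT_TRANSMITTED = {"AA": q_a, "AB": q_a, "BB": q_a}
child_env = offspring_types("AA", "BB", NOT_TRANSMITTED)
```

Under the non-transmitted setting, `child_env` matches the population frequencies whatever the parental types, which is why equal transmission probabilities serve as a non-genetic comparison model.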
In this study, age, oesophageal cancer status, and sex were included to improve model fitting; cancer status was treated as the primary binary trait (0 for no oesophageal cancer, 1 for cancer), and sex was fitted as a covariate of the primary trait (0 for male, 1 for female). Under the FPMM of segregation analysis, four major Mendelian gene models (dominant, recessive, codominant, descending), two polygenic models (pure polygenic, and major gene with polygenic) and three non-genetic models (no major type, pure environmental, and general) were fitted to the data to determine which model is most consistent with the observations. The nine hypothesized models are as follows:
1. Mendelian models: Mendelian transmission of a major gene is assumed ($t_{AA} = 1$, $t_{AB} = 0.5$, $t_{BB} = 0$). In the dominant model, genotype AA is equivalent to genotype AB, as reflected by the baseline parameters $b_{AA} = b_{AB}$. In the recessive model, genotype BB is the same as genotype AB, as reflected by the baseline parameters $b_{BB} = b_{AB}$. In the codominant model, genotype AB is intermediate to genotypes AA and BB, as reflected by the baseline parameters.
2. Non-genetic models: environmental models are also fitted that include a major type effect that is not transmitted from parent to offspring. In the non-transmitted environmental effect model, each of the transmission probabilities is taken to be equal to the frequency of allele A, i.e., $t_{AA} = t_{AB} = t_{BB} = q_A$. In the other environmental model, which allows for possible heterogeneity of exposure levels between generations, the transmission probabilities are taken to be equal, i.e., $t_{AA} = t_{AB} = t_{BB}$. The no major type model has no transmission and no major gene or environment-type effects. The general (multifactorial) model is a full model with arbitrary transmission probabilities, in which all parameters are unrestricted and allowed to fit the data.
3. Polygenic, or major gene and polygenic, models: the age of onset or the susceptibility has a polygenic component, specified under the FPMM block.
In the program, the type frequencies, baseline parameters ($b$), and transmission probabilities were estimated. The model with the lowest Akaike information criterion (AIC) value was considered to be the best model, supported by likelihood ratio tests.
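The AIC-based choice among candidate models can be sketched as follows, using the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L the maximised log-likelihood. The log-likelihoods and parameter counts below are invented for illustration and are not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (maximised log-likelihood, number of free parameters) per model.
fits = {
    "dominant": (-512.4, 5),
    "recessive": (-514.9, 5),
    "general": (-508.1, 8),
    "no major type": (-530.2, 3),
}

scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best_model = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:14s} AIC = {score:.1f}")
```

With these made-up numbers the general model has the lowest AIC despite its extra parameters, because its gain in log-likelihood outweighs the penalty of 2 per parameter.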
To explore for ascertainment bias and a single or a multipoint ascertainment, the model-free likelihood of each pedigree was conditioned on the proband by age at onset and estimated. The Commingling segregation analysis program can be used to fit mixtures of two or three normal distributions, simultaneously applying a power transformation to the data and also allowing for both ascertainment and residual familial correlations. [27] To compare with the results of the former study under a class D model, we generated the following covariates representing residual familial effects as described by Carter et al. [11] Briefly, these are F1 (affected father effect), M1 (affected mother effect), S1 (number of affected older sibs); F1, M1, and S1 were coded 1 if father, mother or sibs were affected, respectively, and 0 if unaffected or missing.
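The 0/1 coding of the residual familial covariates can be sketched as below. This follows the coding rule stated above (1 if affected, 0 if unaffected or missing); treating S1 as an indicator for any affected older sib, and the function and argument names, are assumptions made for illustration.

```python
def code_covariates(father_affected, mother_affected, older_sibs_affected):
    """Return (F1, M1, S1) as 0/1 codes.

    Each status is True (affected), False (unaffected), or None (missing);
    unaffected and missing are both coded 0.
    """
    f1 = 1 if father_affected else 0
    m1 = 1 if mother_affected else 0
    s1 = 1 if any(bool(status) for status in older_sibs_affected) else 0
    return f1, m1, s1

# Hypothetical individual: affected father, mother's status unknown,
# two older sibs of whom one is affected.
example = code_covariates(True, None, [False, True])
print(example)
```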

Results
In Chaoshan, of 7224 individuals in 224 pedigrees, 374 (5.18%) (including probands) had oesophageal cancer (table 1). In Taihang, of 7379 individuals in 403 pedigrees, 738 (10.0%) had oesophageal cancer. The pedigrees in Chaoshan, based on population data, tended to be large and multigenerational (>3 generations), whereas those ascertained from hospital-based data in Taihang included many nuclear families of two or three generations only. These differences in pedigree structure may account for the apparent twofold difference in the rate of oesophageal cancer between the two study groups.
Descriptive characteristics of the pedigrees are presented in table 1. More males than females were affected in both populations. The proportion of pedigrees with multiple (≥2) affected members was similar between the two populations (40.18% in Chaoshan, 43.42% in Taihang, P>0.05). The mean age of onset (at diagnosis) of oesophageal cancer in Taihang (61.65 years) was slightly higher than in Chaoshan (60.75 years, P>0.05).

Familial correlations
Familial correlation analysis was performed by using FCOR (S.A.G.E. version 5.3) to estimate correlations in trait values between pairs of relatives as described in the Materials and methods. We found that all correlation coefficient values for the Chaoshan population relative pairs were similar to those for the Taihang population (table 2). Overall, higher sibling-sibling than parent-offspring correlations were found, suggesting familial clustering of the genes, with each generation possibly having a different environment. The range of sibling correlations was 0.079 to 0.149 in Chaoshan and 0.102 to 0.171 in Taihang, with the brother-brother correlation coefficient being relatively high as compared with the other sib-sib correlations. These results indicate the involvement of genetic influences in the aetiology of oesophageal cancer in the general population.
For the large families, all correlation coefficient values were higher than those for the general population. The highest correlation coefficients were for the brother-brother (0.488) and sister-sister (0.416) pairs, indicating a strong genetic effect on susceptibility to oesophageal cancer in the large families of both the Chaoshan and Taihang areas (table 2).

Commingling analysis
Commingling analysis was carried out using the Commingling segregation program of S.A.G.E. (version 5.3.0) to assess whether the distribution of the mean age at onset of oesophageal cancer in the families could be better explained as a mixture of two or more distributions rather than a single distribution, and to determine the number of means to fit in the segregation analysis (see below). The results of commingling analysis are presented in fig. 1.
A mixture of two distributions fits significantly better than a single distribution (Chaoshan: χ² = 30.09, P<0.001; Taihang: χ² = 48.15, P<0.001). A mixture of three distributions with three arbitrary means has the largest likelihood and fits the data slightly better than the two-distribution model in Chaoshan (general population χ² = 4.1, P = 0.04288) and clearly better in Taihang (χ² = 19.54, P<0.001). Thus, the distributions in the two populations are better explained by a mixture than by a single distribution and are compatible with a possible major gene effect (fig. 1).

In the large pedigrees from both areas, the Mendelian transmission models had the lowest AIC scores; the Chi-square statistics and corresponding P-values are presented in tables 5 and 6. These results suggest that a major gene model underlies the aetiology of oesophageal cancer in these families. For the two large pedigrees of Chaoshan, the models with the smallest AIC values were recessive, dominant, and dominant with a polygenic mode of inheritance (table 5). For the four large pedigrees of Taihang, the recessive with a polygenic mode gave the smallest AIC value and is considered the best-fit model for those large families (table 6). In the extended pedigree from the Taihang Mountains, the Mendelian dominant models gave the best fit (table 7).

Table 3. Segregation analysis of oesophageal cancer in the Chaoshan general population, with type influencing age of onset and sex dependence.
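The P-values quoted for the commingling comparisons can be checked with a short calculation. The sketch below assumes each likelihood-ratio statistic is referred to a chi-square distribution with 1 degree of freedom; the degrees of freedom are not stated in the text, but this assumption reproduces the reported P = 0.04288 for χ² = 4.1. For 1 degree of freedom the upper-tail probability is erfc(sqrt(x/2)).

```python
from math import erfc, sqrt

def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return erfc(sqrt(statistic / 2.0))

# Likelihood-ratio statistics reported for the commingling analysis.
reported = {
    "Chaoshan, two vs one": 30.09,
    "Taihang, two vs one": 48.15,
    "Chaoshan, three vs two": 4.1,
    "Taihang, three vs two": 19.54,
}

p_values = {label: chi2_sf_1df(stat) for label, stat in reported.items()}
for label, p in p_values.items():
    print(f"{label}: P = {p:.3g}")
```

The Chaoshan three-versus-two comparison comes out at about 0.043, while the other three are far below 0.001, in line with the values given above.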

Discussion
A multifactorial mode of inheritance is the best model for susceptibility to oesophageal cancer in the two general populations
The results of the FCOR analysis confirmed that oesophageal cancer is moderately correlated amongst family members, and support the involvement of a genetic component for oesophageal cancer susceptibility in the two general populations. The moderate sister-sister correlation found in this study indicates that offspring had influential effects from their parents. Our results are consistent with those of previous studies showing familial correlation and aggregation of oesophageal cancer in families in the two populations. [10,23] The most satisfactory model from the commingling analysis for the two general populations consists of a mixture of three distributions, indicating that more than one component is needed to explain the distribution of oesophageal cancer. Our results are consistent with those of previous familial correlation and commingling studies, which indicated that a genetic risk factor(s) may be very important in the high-incidence areas of China. [9] The finding of multiple distributions is compatible with a major gene hypothesis; however, commingling may also arise through other causes. Thus, segregation analysis was used to determine whether these major effects segregated in families according to Mendelian expectations.
Development of advanced methods of segregation analysis and studying additional samples from other high-risk areas may be of help in better understanding the aetiology of oesophageal cancer. [7] In the Taihang Mountain area, a previous segregation analysis was performed using the REGTL program under a class D regressive model, and an autosomal recessive major gene was suggested [7]. However, until now, a similar formal segregation analysis had not been performed on the Chaoshan population. In the present study, we performed a complex segregation analysis using the SEGREG program under the FPMM model. Recent improvements to SEGREG give it significant advantages over the programs REGC, REGD, and REGTL of the previous S.A.G.E. versions; and FPMM, which is the only option currently available for binary traits with variable age of onset, is more effective than the class A regressive model for data with oesophageal disease as the main trait and age of onset as a censored trait. [28] When we modelled age of onset, all Mendelian models, the pure polygene model, and the pure environmental model were rejected, whereas the general model was accepted at the population level in the two high-incidence areas. Although we did not find evidence supporting involvement of a major gene in the aetiology of oesophageal cancer in either of the populations, our results of multifactorial inheritance suggest that a variety of polygenes and environmental factors contribute to the disease. Furthermore, in the general population, the environmental model had the lowest AIC value next to the general model, suggesting there is an important role for environmental factors in the development of oesophageal cancer.

Table 6. Segregation analysis of oesophageal cancer in four large pedigrees of Taihang.
Our results are consistent with and support the opinion of Garavello et al. [7,29] that a family history of cancer in combination with smoking and drinking increases the risk of oesophageal cancer. Additional genetic models need to be considered, including an interaction of susceptibility genes and environmental risk factors at the population level.
Segregation analysis is typically sensitive to ascertainment bias, and false assumptions on ascertainment could invalidate the estimates obtained through segregation analysis. An ascertainment correction was applied by conditioning the likelihood on the probands in the Taihang population.

A major Mendelian component confers susceptibility to oesophageal cancer in large families
After analysing the two general populations, our second approach was to focus on families that have a history of oesophageal cancer, including families with large numbers of affected individuals or early age of onset. From Taihang we evaluated a kindred of 293 individuals with 32 affected members as well as four other large pedigrees; from Chaoshan we studied two large pedigrees whose average age of onset was 49.25 years.
The familial correlation and commingling analyses demonstrated that more than one component is needed to explain the distributions, with support for a genetic component. In particular, the high sib-sib correlations (r = 0.2610 and 0.2982) found in the large families indicate an important role for parental factors in influencing the offspring's traits.
By complex segregation analysis of one large pedigree from Taihang and two large pedigrees from Chaoshan, both pure environmental and pure polygenic models were rejected; autosomal dominant and dominant with polygenic models were the best-fit patterns. For the other four pedigrees of Taihang, a recessive with polygenic model was the best model.
The results support the existence of a major susceptibility locus with Mendelian inheritance in some large families.
Genetic heterogeneity of oesophageal cancer in the two high-incidence areas
In a study by Wu et al. [7] to explore heterogeneity in the age at onset of oesophageal cancer, the authors separated families into two groups, one with probands whose age at onset was <60 years and the other with probands whose age at onset was >60 years; they found no evidence of significant heterogeneity between these two subsets. Identification of subsets of families with different probable genetic aetiologies for oesophageal cancer is taken as evidence of genetic heterogeneity. In this study, not only the age of the subjects but also the numbers of affected individuals, the generations with affected individuals, and age variance were taken into account in choosing the subsets for analysis. Our data indicate a non-Mendelian mode of inheritance in the general population and an apparent dominant mode of inheritance in certain large families. These findings suggest that oesophageal cancer has significant genetic heterogeneity, specifically multifactorial inheritance in the overall population and autosomal dominant inheritance in some subgroups of the population. Furthermore, results of previous studies suggested involvement of a major recessive gene in oesophageal cancer in Yangquan, Shanxi Province and Linxian, Henan Province. [7,11] There is also evidence of a familial oesophageal cancer susceptibility gene region on chromosome 13q. [30,31,32,33] Moreover, tumours from patients with a positive family history exhibit more frequent loss of heterozygosity (LOH) for this chromosome than do those from patients without a family history. [34] As for other cancers, a dominant mode of inheritance of oesophageal cancer evident in a few very large families suggests involvement of common susceptibility genes.
In other familial cases of oesophageal cancer and in sporadic cases, a polygenic/multifactorial aetiology can be postulated through sharing of alleles at many loci, each contributing to a small increase in cancer risk. [35] For example, only a subset of familial breast cancer is clearly hereditary, owing primarily to mutations in a single gene. Information about the genetic heterogeneity in oesophageal cancer is important to be able to identify the different subgroups and eventually reveal the disease loci.

Further studies of genetic linkage and susceptibility gene location
Combined molecular genetic analysis and genetic epidemiology may reveal the underlying basis of the genetic predisposition to oesophageal cancer. Currently, a rare autosomal dominant disorder defined by a genetic abnormality on chromosome 17q25 is the only recognised familial syndrome that predisposes patients to squamous cell carcinoma of the oesophagus. [36] The present study has provided some modelling parameters of oesophageal cancer for further genetic linkage studies. Linkage and association studies aimed at localising the susceptibility genes involved in the development of oesophageal cancer in the Chinese high-risk populations are currently under way.