Conceived and designed the experiments: NW RN CB JB. Performed the experiments: NW CB AM. Analyzed the data: NW RN CB. Contributed reagents/materials/analysis tools: RN JB. Wrote the paper: NW RN JB.
The authors have declared that no competing interests exist.
In this investigation, chromatin immunoprecipitation (ChIP) experiments and bioinformatic methods were used to analyze whether these binding sequences can also be verified on the chromatin of living cells.
After quantification of HMGA2 protein in different cell lines, the colon cancer-derived cell line HCT116 was chosen for further ChIP experiments because of its 3.4-fold higher HMGA2 protein level. Forty-nine DNA fragments were obtained by ChIP. These fragments, which contain HMGA2 binding sites, were analyzed for their AT-content, their location in the human genome, and their similarity to sequences generated by a SELEX study. The sequences show a significantly higher AT-content than the average of the human genome. The artificially generated SELEX sequences and short BLAST alignments (11 and 12 bp) of the ChIP fragments from living cells show similarities in their organization. The flanking regions are AT-rich, whereas a lower conservation is present in the center of the sequences.
High mobility group AT-hook 2 (HMGA2) is a chromatin-associated protein implicated in the development and progression of benign and malignant tumors as well as stem cell self-renewal
The first step for characterization of HMGA2 binding sites was to choose an adequate cell line showing high levels of HMGA2 for the following ChIP analyses. Therefore, we investigated the
Origin of the various human cell lines and fresh sample: MCF 7 (mamma carcinoma), MM 31 (myometrium); MRI-H215, MRI-H196 and MRI-H186 (cervical carcinoma); Ad 211 (pleomorphic adenoma); NB-4 (promyelocytic leukemia); FTC133 and FTC238 (follicular thyroid carcinoma); TPC-1 (papillary thyroid carcinoma); Li14 (lipoma); FRO (anaplastic thyroid carcinoma); WRO (follicular carcinoma); supT1 (T cell lymphoblastic lymphoma); HCT116 (colon carcinoma).
(A) Expression of HMGA2 in three cell lines was determined using β-actin as endogenous control. (B) Comparison of
For this study two basic protocols for ChIP
Sample | Average Ct-value | ΔCt (Ct NoAb − Ct IP) | x-fold enrichment |
32 IP | 29.85 | 8.55 | 374.29 |
32 NoAb | 38.39 | | |
33 IP | 30.76 | 6.25 | 76.21 |
33 NoAb | 37.01 | | |
35 IP | 30.37 | 8.86 | 464.65 |
35 NoAb | 39.23 | | |
36 IP | 29.49 | 6.97 | 125.37 |
36 NoAb | 36.46 | | |
37 IP | 29.84 | 7.57 | 190.15 |
37 NoAb | 37.41 | | |
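The enrichment values in the table above are consistent with raising 2 to the power of the Ct difference between the NoAb control and the IP sample. The following minimal sketch recomputes them under that assumption (one doubling of product per PCR cycle, i.e. 100% amplification efficiency). The function name `fold_enrichment` is illustrative and not taken from the study, and values recomputed from the rounded Ct averages can differ slightly from the published ones.

```python
def fold_enrichment(ct_ip: float, ct_noab: float) -> float:
    """Return 2 ** (Ct_NoAb - Ct_IP).

    Assumption: the amount of PCR product doubles in every cycle, so a
    Ct difference of n cycles corresponds to a 2**n-fold difference in
    starting template.
    """
    return 2.0 ** (ct_noab - ct_ip)


# (clone, average Ct of IP, average Ct of NoAb), copied from the table.
samples = [
    ("32", 29.85, 38.39),
    ("33", 30.76, 37.01),
    ("35", 30.37, 39.23),
    ("36", 29.49, 36.46),
    ("37", 29.84, 37.41),
]

for clone, ct_ip, ct_noab in samples:
    delta_ct = ct_noab - ct_ip
    enrichment = fold_enrichment(ct_ip, ct_noab)
    print(f"clone {clone}: dCt = {delta_ct:.2f}, enrichment = {enrichment:.1f}-fold")
```

A larger Ct difference means that the target was detected earlier in the IP sample than in the control, so the enrichment grows exponentially with that difference.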
Furthermore, the enrichment of HMGA2 during ChIP was confirmed by Western blot analysis (
The analysis shows an enrichment of HMGA2 in the IP sample but not in the corresponding supernatant. No HMGA2 is detectable in the eluate of the NoAb control because HMGA2 remains in the supernatant of the non-immunoprecipitated sample.
All 49 sequences were mapped to single loci in the human genome using the NCBI BLAST tool (
Clone | Length [bp] | AT [%] | Localization | Gene Symbol | Location to Gene | Distance to Gene |
41 | 105 | 61 | 1p35 | PTPRU | upstream | 180 kb |
28 | 403 | 68 | 1q25 | SEC16B | downstream | 70 kb |
3 | 440 | 61 | 1q25.1 | TNR | intron 1 | - |
23 | 253 | 63 | 1q31 | KCNT2 | downstream | 1000 kb |
14 | 652 | 69 | 1q31.1 | FDPSL1 | downstream | 300 kb |
27 | 1017 | 57 | 1q42 | CDC42BPA | intron 21 | - |
8 | 715 | 68 | 2p13.3 | GKN3P | intron 1 | - |
2 | 373 | 65 | 2p24.1 | WDR35 | intron 34 | - |
29 | 316 | 47 | 2q31 | HOXD10 | exon 2 | - |
48 | 1612 | 54 | 3p21 | LARS2 | intron 13 | - |
45 | 1592 | 59 | 3p22 | STAC | downstream | 170 kb |
44 | 234 | 70 | 3q26.1 | SI | downstream | 25 kb |
49 | 157 | 52 | 3q26.1 | KPNA4 | intron 1 | - |
40 | 171 | 62 | 4p15.1 | ARAP2 | downstream | 2000 kb |
16 | 395 | 63 | 4q31.1 | CLGN | upstream | 500 bp |
12 | 981 | 69 | 4q32.3 | SPOCK3 | upstream | 300 kb |
10 | 350 | 73 | 4q34.3 | RPL19P8 | downstream | 15 kb |
39 | 561 | 60 | 5p14 | PRDM9 | downstream | 117 kb |
32 | 1080 | 61 | 6p22 | DCDC2 | intron 2 | - |
22 | 574 | 67 | 6q16 | TSG1 | upstream | 113 kb |
20 | 574 | 67 | 6q22 | NKAIN2 | intron 1 | - |
6 | 639 | 62 | 6q22.31 | MAN1A1 | downstream | 830 kb |
35 | 233 | 68 | 6q23 | VNN3 | upstream | 900 bp |
34 | 105 | 74 | 7q22 | RELN | intron 33 | - |
17 | 323 | 58 | 7q36.1 | ACTR3C | upstream | 9 kb |
4 | 492 | 72 | 8q21.12 | PKIA | upstream | 220 kb |
7 | 219 | 72 | 8q23.2 | PKHD1L1 | intron 16 | - |
13 | 161 | 72 | 9q21.12 | ALDH1A1 | upstream | 90 kb |
36 | 276 | 52 | 9q22 | COL15A1 | intron 1 | - |
30 | 765 | 50 | 9q34 | ENG | intron 8 | - |
46 | 180 | 69 | 10p11.2 | CCDC7 | downstream | 32 kb |
43 | 300 | 60 | 10p13 | FAM107B | intron 2 | - |
15 | 305 | 71 | 10q21.3 | JMJD1C | intron 22 | - |
5 | 1848 | 67 | 13q32.3 | FGF-14 | intron 1 | - |
11 | 363 | 71 | 14q21.3 | RPL10L | downstream | 350 kb |
24 | 142 | 64 | 14q32 | PPP4R4 | intron 2 | - |
42 | 695 | 52 | 17q21 | PLEKHM1 | upstream | 3 kb |
33 | 1220 | 61 | 17q22 | MBTD1 | intron 6 | - |
19 | 508 | 51 | 17q23.3 | RGS9 | intron 19 | - |
18 | 495 | 64 | 18p11.22 | PPP4R1 | intron 23 | - |
31 | 412 | 60 | 18q21 | STARD6 | upstream | 149 kb |
26 | 287 | 64 | 19q12 | ZNF99 | downstream | 25 kb |
47 | 359 | 66 | 19q13.1 | FCGBP | intron 3 | - |
9 | 511 | 66 | 20q13.11 | PTPRT | intron 1 | - |
1 | 500 | 56 | 20q13.13 | NFATC2 | downstream | 6.5 kb |
25 | 235 | 46 | 21q22 | RUNX1 | intron 5 | - |
21 | 201 | 54 | | | | |
37 | 633 | 62 | | | | |
38 | 388 | 79 | | | | |
*Unplaced genomic region.
For further analysis we compared our ChIP fragments with known binding sites, as predicted by Cui and Leng
Because none of the sequences for HMGA2 binding described by Cui and Leng
Because HMGA2 is supposed to bind to the minor groove of AT-rich sequences
The whole human genome was split into pieces of 500 bp, and the AT-content of each piece was determined and compared to the AT-content of the sequences obtained by ChIP with the HMGA2 antibody. The Wilcoxon rank sum test shows that the AT-content in the ChIP DNA sequences is significantly higher than in the human genome (p<0.0012).
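The comparison described above can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the text does not state here which software was used, how ambiguous bases were handled, or how the final partial window was treated, so those choices (ignore non-ACGT bases, drop the trailing partial window) are assumptions. The function `rank_sum_w` returns only the rank sum statistic W (sum of the ranks of the first sample minus its minimum possible value, with average ranks for ties); it does not compute a p-value.

```python
from typing import List, Sequence


def at_content(seq: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * at / (at + gc)


def window_at_contents(genome: str, size: int = 500) -> List[float]:
    """AT-content of consecutive, non-overlapping windows.

    Assumptions: a trailing window shorter than `size` is dropped, and
    windows without any A/C/G/T (e.g. runs of N) are skipped.
    """
    values = []
    for start in range(0, len(genome) - size + 1, size):
        window = genome[start:start + size].upper()
        if any(base in "ACGT" for base in window):
            values.append(at_content(window))
    return values


def rank_sum_w(x: Sequence[float], y: Sequence[float]) -> float:
    """Rank sum statistic W for sample x against sample y."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the block of tied values starting at index i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2.0
```

In use, `x` would hold the AT-content of each ChIP fragment and `y` the values returned by `window_at_contents` for the genome. W ranges from 0 (every ChIP value below every genomic value) to the product of the two sample sizes (every ChIP value above every genomic value).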
The cloned sequences were analyzed for the presence of any conserved sequences using the NCBI BLAST tool. This analysis shows a high rate of matches. These sequences have a significantly higher AT-content compared to the human genome (W = 1169693292, p-value<2.2e−16, Wilcoxon rank sum test) and to the ChIP-isolated sequences themselves (W = 11787, p-value = 1.561e−05, Wilcoxon rank sum test). All sequences show multiple AT-stretches except for clones 25 and 49, which contain only one AT-stretch. To identify further similarities between these BLAST matching sequences, the 11 and 12 bp matches were manually cleaned of redundancies and used to create a sequence logo (WebLogo,
The sequence logo was created from 12-bp-long BLAST alignments. Sequence conservation, measured in bits of information, is illustrated by the height of the stack of the four letters at each position in the binding sites. The relative heights of the letters are proportional to their frequencies in the 134 BLAST sequences. The sequence logo was generated by WebLogo (available at
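To make the "bits of information" measure concrete, the sketch below computes the per-position information content of a set of aligned DNA sequences as 2 minus the Shannon entropy of the base frequencies. It is a simplified illustration: logo tools may additionally apply a small-sample correction, which is omitted here, and the toy alignment in the usage note is invented, not taken from the 134 BLAST sequences.

```python
import math
from collections import Counter
from typing import List, Sequence


def information_content(aligned: Sequence[str]) -> List[float]:
    """Information content in bits for each column of a DNA alignment.

    For four possible bases the maximum is log2(4) = 2 bits (a fully
    conserved column); a column with all four bases equally frequent
    carries 0 bits. No small-sample correction is applied.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")

    bits = []
    for pos in range(length):
        counts = Counter(seq[pos].upper() for seq in aligned)
        total = sum(counts[base] for base in "ACGT")
        if total == 0:
            raise ValueError(f"column {pos} contains no A, C, G or T")
        entropy = 0.0
        for base in "ACGT":
            if counts[base] > 0:
                freq = counts[base] / total
                entropy -= freq * math.log2(freq)
        bits.append(2.0 - entropy)
    return bits
```

For the invented alignment `["ATGA", "ATCA", "ATGT", "ATCT"]`, the first two columns are fully conserved (2 bits each), while the last two columns each split evenly between two bases (1 bit each). In a logo, each letter's height within a column is its frequency multiplied by that column's information content.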
A crucial question in the field of gene regulation is where and to what extent transcription factors bind to DNA. This study focuses on the architectural transcription factor HMGA2, which is abundantly expressed during embryonic and fetal development, whereas its expression in normal, fully differentiated adult cells is very low or even absent. This is the first time that HMGA2 binding to chromatin in living cells has been determined by ChIP analysis. The advantage of this method is that there is no need for prior identification of target genes regulated through binding of HMGA2. Furthermore, regulatory regions can be revealed whether they are located at promoters, in introns, or even at distant enhancer elements.
In our study we selected a cell line with abundant expression of HMGA2 but this is not necessarily associated with malignant cellular behavior because, e.g. embryonic stem cells show a high level of HMGA2 associated with differentiation and cell proliferation during embryonic development
We compared the sequences of the DNA fragments obtained to results of a previously performed SELEX analysis on protein-free DNA
The second possible explanation for the absence of similarities between the SELEX sequences and the ChIP DNA fragments is that the occurrence of the consensus sequences for HMGA2 binding described by Cui and Leng
The AT-content of the sequences generated by ChIP is significantly higher than the average of the human genome. This confirms the hypothesis that HMGA2 binds to AT-rich sequences. It therefore seems feasible to speculate that a motif with central GC bases and flanking AT bases is the possible target of HMGA2.
A comparison of the DNA fragments with each other shows a multitude of matches for conserved AT-stretches. All sequences but two contain multiple AT-stretches. A possible explanation for these two sequences having only one AT-stretch is that HMGA2 does not necessarily need to interact with DNA directly because it can bind to DNA- or chromatin-binding proteins as well
HMGA2 is able to regulate certain genes via binding to promoter or enhancer regions, which are located upstream or downstream to the target gene, as well as intronic e.g. in case of the
To the best of our knowledge, this is the first approach to characterize possible HMGA2 binding sites in the chromatin of living cells by ChIP and cloning. Via protein-DNA binding, HMGA2 plays important roles in tumor growth and stem cell self-renewal. The ability to screen the whole human genome for sequences bound by HMGA2, and to localize and characterize them, can help to understand how HMGA2 is associated with different biological processes.
The use of the human myometrium sample for this study was approved by the local medical ethics committee and followed the guidelines of the Declaration of Helsinki. The patient gave written informed consent for the clinical procedure and the research use of the tissues.
14 human cell lines and one sample of fresh tissue were examined in this study: MCF 7 (mamma carcinoma)
Total RNA was purified from cell lines and the tissue sample according to the “RNeasy mini protocol for isolation of total RNA from heart, muscle and skin tissue” (Qiagen, Hilden, Germany) including on-column DNase I digest and homogenisation with QIAshredder©. Following quantification, 5 µg RNA was digested a second time with DNase I (6.75 U) for 15 min at room temperature, and a cleanup according to the RNeasy mini protocol was performed to remove any contaminating DNA completely.
Approximately 1×10⁷ HCT116 cells were harvested with TrypLE Express (Invitrogen, Karlsruhe, Germany) and the cell suspension was transferred into a sterile tube filled with McCoy's 5A medium. Proteins were crosslinked to the DNA using a final concentration of 1% formaldehyde for 10 min at room temperature. The formaldehyde was quenched with 0.125 M glycine (final concentration). After centrifugation the cell pellet was rinsed with an ice-cold PBS/AEBSF solution and then suspended in ChIP Lysis Buffer (Santa Cruz, Heidelberg, Germany). The sample was incubated on ice for 5 min and the pellet was rinsed with an ice-cold PBS/AEBSF solution again. For sonication, the pellet was suspended in 300 µl ChIP Lysis Buffer High Salt (Santa Cruz, Heidelberg, Germany). Fragmentation of the DNA was performed on ice, first to isolate and break down the nuclei and then to fragment the DNA (size 200–500 nucleotides). The parameters were 10 s pulse on and 20 s pulse off for 37.5 min with a Bandelin sonicator HD 3200 plus (Bandelin, Berlin, Germany). The sheared chromatin was cleared by centrifugation at 4°C (10 min at 10,621×g).
Magnetic Dynabeads protein G (Invitrogen, Karlsruhe, Germany) were prepared before use following the manufacturer's instructions. To reduce the background signal a preclearing step was performed: 100 µl beads were added to the sample and the suspension was incubated for 30 min at 4°C with rotation. The supernatant was transferred and divided into two fractions (IP and NoAb). 4 µg anti-HMGA2 antibody (Santa Cruz, Heidelberg, Germany) was added to the IP sample; the fraction without antibody (NoAb) served as a negative control. Both fractions were incubated overnight at 4°C on a rotator. To avoid nonspecific interactions between DNA and beads, Dynabeads protein G were rotated with 22.2 µg salmon sperm DNA for 30 min at 4°C before use. After this second preclearing step, the IP and NoAb fractions were each incubated on a rotator for 2 h at 4°C with 50 µl of the blocked Dynabead suspension. The immune complexes were washed two times with 1 ml ChIP Lysis Buffer, four times with ChIP Lysis Buffer High Salt and ChIP Wash Buffer (Santa Cruz) and once with 1× TE buffer (10 mM Tris base, 1 mM EDTA). All washing steps were carried out at 4°C. To reverse crosslinks the Dynabeads protein G were suspended in 150 µl SDS elution buffer (1% SDS, 0.1 M NaHCO3) and incubated in a shaking water bath for 2 h at 67°C. The supernatants were transferred into new 2 ml plastic tubes and incubated with 5 µg Proteinase K (Qiagen, Hilden, Germany) for another 2 h at 67°C. To avoid precipitation during the DNA isolation the samples were diluted 1∶2 with H2O. The DNA was isolated using the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) following the manufacturer's instructions.
The protein concentration was measured with the BCA Protein Assay Kit (Pierce, Bonn, Germany). 15 µg of protein obtained from each sample was used for SDS-PAGE in an X-Cell Sure Lock Mini-Cell apparatus (Invitrogen, Karlsruhe, Germany) and transferred to a nitrocellulose membrane with the Fastblot 33 system (Biometra, Göttingen, Germany). The membrane was blocked with 5% BSA overnight and incubated with rabbit polyclonal anti-HMGA2 antibody (1∶3000, Biocheck, Foster City, USA) and mouse monoclonal anti-β-actin (1∶7500, Novus Biologicals, Cambridge, United Kingdom) for one hour. Secondary antibodies were alkaline phosphatase-bovine anti-rabbit IgG (1∶3750, Santa Cruz, Heidelberg, Germany) and alkaline phosphatase-goat anti-mouse IgG (1∶7500, Invitrogen, Karlsruhe, Germany). The detection of β-actin was used as an internal control to confirm equivalent total protein loading. Relative HMGA2 protein expression was determined from band intensities with the ImageJ program.
For determination of protein expression in the ChIP samples, supernatants of samples after DNA-protein-antibody-bead-complex formation (IP and NoAb) and samples before Proteinase K digestion (IP and NoAb) were taken. Proteins were separated by SDS-PAGE as described by Laemmli
All real-time PCRs were run on an ABI Prism 7300 Sequence Detection System (Applied Biosystems, Darmstadt, Germany). For quantification of
IP fragments were analyzed in triplicate starting with 3 µl of template DNA. The enrichment of DNA in the samples (IP, NoAb) was determined by amplification of
The ChIP-generated DNA fragments were A-tailed and ligated into the pGEM©-T easy vector (Promega, Mannheim, Germany) with T4 ligase at 4°C overnight. The transformation was carried out according to the manufacturer's protocol with 100 µl
Clones were sequenced by Eurofins MWG Operon (Ebersberg, Germany). After revising with the Lasergene software, a BLAST search of the human genome database at NCBI was performed to locate the sequences.
For identifying the consensus sequences in the human genomic sequences (NCBI refseq build 36) Perl (
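A search for a degenerate consensus sequence in genomic DNA can be sketched as follows. The study states that Perl was used; the Python code below is not the authors' script but a hedged illustration of the general technique, in which IUPAC ambiguity codes are translated into a regular expression and both strands are scanned. The motif `GAW` in the usage note is a hypothetical placeholder and not one of the consensus sequences described by Cui and Leng.

```python
import re
from typing import List, Tuple

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement of every IUPAC code (e.g. R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif: str) -> str:
    """Reverse complement of a sequence or IUPAC motif."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[code] for code in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(seq: str, motif: str) -> List[Tuple[int, str]]:
    """Return sorted (0-based start, strand) pairs for all motif matches.

    Minus-strand hits are reported at the position of the reverse
    complement on the given (plus) strand. Palindromic motifs are
    reported once, on the plus strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in motif_to_regex(motif).finditer(seq)]
    rc = reverse_complement(motif)
    if rc != motif.upper():
        hits += [(m.start(), "-") for m in motif_to_regex(rc).finditer(seq)]
    return sorted(hits)
```

For example, `find_motif("GAACTTC", "GAW")` reports a plus-strand match at position 0 (`GAA`) and a minus-strand match at position 4 (`TTC`, the reverse complement of `GAA`). For a whole-genome scan, the same function could be applied chromosome by chromosome to sequences read from FASTA files.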
Sequences of the cloned fragments.
(XLS)