Figure 1.
Directed Mapping of Transcription Start Sites (DMTSS).
a) Data selection using different databases in RegulonDB; b) Modified Rapid Amplification of cDNA Ends protocol. The key points to enhance the efficiency of the DMTSS protocol for large-scale TSS mapping were: 1) selection of highly expressed TUs under specific growth conditions, and rational oligonucleotide design; 2) linear amplification of cDNA; 3) PAGE separation and purification of PCR products, followed by sequencing.
Figure 2.
Analysis of different 3′ end polynucleotide incorporation efficiency.
A) Electropherograms show the incorporation of dCTP, dGTP, and dATP at the 3′ end of the cDNA to precisely map the TSS of the ompA gene (ompAp2*) [51]. dATP produced the most homogeneous tail. B) Sequence comparison shows the 5′ end of the different tailing reactions.
Figure 3.
Mapping the TSS of the hns gene.
A) Proximal and distal oligonucleotides were designed to prime 38 and 155 nucleotides downstream of the ATG, respectively. B) The PCR products generated with each oligonucleotide primer were separated by PAGE and purified from the gel. C) Nucleotide sequence of each PCR band after excision from the gel. The nucleotide immediately before the polynucleotide tail corresponds to the TSS. D) Comparison of the nucleotide sequences obtained with the previously reported TSS [22].
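The rule stated in panel C (the TSS is the nucleotide immediately before the polynucleotide tail) can be illustrated with a minimal Python sketch. The function name `tss_index_before_tail`, the default tail base, the minimum tail length of 8, and the example read are all hypothetical choices made for illustration; they are not part of the DMTSS protocol as described.

```python
import re

def tss_index_before_tail(read, tail_base="A", min_tail=8):
    """Return the 0-based index of the base just before the first
    homopolymer tail in `read`, or None if no usable tail is found.

    Assumption: the read is oriented so that the tail follows the
    gene-derived sequence, and a run of at least `min_tail` identical
    bases is treated as the tail (hypothetical threshold).
    """
    pattern = re.compile(f"{re.escape(tail_base)}{{{min_tail},}}")
    match = pattern.search(read.upper())
    if match is None or match.start() == 0:
        # No tail, or nothing precedes the tail.
        return None
    # Note: if the true TSS base equals the tail base, it is absorbed
    # into the run and the call is ambiguous by one or more positions.
    return match.start() - 1

# Made-up read: ten gene-derived bases followed by a poly(A) tail.
example_read = "GCTTACGGTCAAAAAAAAAAAA"
idx = tss_index_before_tail(example_read)
print(idx, example_read[idx])
```

When the initiating nucleotide is the same as the tailing nucleotide, this rule cannot place the TSS unambiguously, so such reads need to be flagged or re-tailed with a different nucleotide.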
Figure 4.
Determination of the unknown TSSs for the rpsB gene.
A) Proximal and distal oligonucleotides were designed to prime 4 and 67 nucleotides downstream of the ATG, respectively. B) The PCR products generated with the oligonucleotide primers were separated by PAGE and purified from the gel. C) Nucleotide sequence of each PCR band after excision from the gel. The nucleotide immediately before the polynucleotide tail is the TSS. D) Comparison of the nucleotide sequences obtained with the upstream region of rpsB.
Table 1.
Results obtained with DMTSS compared with previous reports of TSSs. In the cases marked with *, additional TSSs were identified.
Figure 5.
Mapping the TSSs of the cysK gene.
A) The oligonucleotide primer was designed to prime 97 nucleotides downstream of the ATG. B) The A and B PCR products generated with the oligonucleotide primer were separated by PAGE and purified from the gel. C) Nucleotide sequence of each PCR band. The nucleotide immediately before the polynucleotide tail is the TSS. D) Comparison of the nucleotide sequences obtained with the upstream region. The previously reported TSS [52] was located 34 nucleotides upstream of the ATG, whereas the new TSS was located 67 nucleotides downstream.
Figure 6.
TSS mapping for three genes with no previously determined 5′ end, as examples of the 317 TSSs mapped in this work.
The TSSs for the ychH (A), serS (B), and ycbB (C) genes, which code for a predicted inner membrane protein, a seryl-tRNA synthetase, and a predicted carboxypeptidase, respectively, were determined by DMTSS. The single PCR fragment obtained for each gene was sequenced. The positions of the TSSs are indicated by arrows.
Figure 7.
Number of TSSs per gene mapped.
Comparison of the TSSs obtained in this work with the ones in RegulonDB. Both data sets are very similar, indicating no bias in the genes selected in this work.
Figure 8.
A) Multiple new TSSs were obtained for the kup gene, 57, 135, and 213 nucleotides upstream of the ATG. B) A new TSS for hybO was identified 26 nucleotides upstream of the ATG, in addition to the previously reported one 102 nucleotides upstream of the ATG. C) For putP, three of the five reported TSSs were mapped, 17, 94, and 138 nucleotides upstream of the ATG.
Table 2.
Prediction of putative σ70 and σ38 −10 elements at the 5′ ends detected within coding regions.
Table 3.
Promoters identified by DMTSS.
Figure 9.
Graphical representation of the E. coli chromosome region of the tig gene obtained with the GenoSeqGrapher V1.0 program.
Each pyrosequencing read is displayed as an arrow below the genomic DNA. Colors represent the different growth conditions from which the sequences were obtained. Hovering the mouse over an arrow displays a box with the nucleotide sequence, its position in the genome, and its position relative to the ATG of the selected gene.
Table 4.
Promoters identified by HTPS.
Figure 10.
Display on the E. coli K-12 chromosome of all the TSSs obtained in this work by DMTSS (red) and by HTPS (black).
TSSs obtained by both methodologies are shown in blue.
Figure 11.
Multiple TSSs for a single TU.
The graph shows several sequences upstream of the csrB and cspA genes initiating at different positions, illustrating the ambiguity of the TSS in some TUs.
Figure 12.
Frequency of each initiation nucleotide.
The graph shows the frequency of the starting nucleotide (adenine, guanine, cytosine, and thymine) for the TSSs obtained by DMTSS, by HTPS, and for the TSSs with predicted promoters from the HTPS data set. AGCT in DMTSS indicates any nucleotide (see text).
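As an illustration of how such frequencies can be tabulated, the following minimal Python sketch counts the initiating nucleotide over a list of mapped TSSs. The function name `initiation_frequencies`, the use of "N" for an ambiguous start (corresponding to the "any nucleotide" class), and the input values are assumptions for illustration only, not data from this work.

```python
from collections import Counter

def initiation_frequencies(tss_bases):
    """Return the fraction of TSSs starting with A, G, C, T, or N.

    Assumption: any symbol other than A, G, C, or T is pooled into the
    ambiguous class "N" (hypothetical label for "any nucleotide").
    """
    normalized = [b.upper() if b.upper() in "AGCT" else "N" for b in tss_bases]
    counts = Counter(normalized)
    total = len(normalized)
    if total == 0:
        return {base: 0.0 for base in "AGCTN"}
    return {base: counts.get(base, 0) / total for base in "AGCTN"}

# Made-up input: one initiating base per mapped TSS.
example = ["A", "a", "G", "T", "N", "A", "G", "C"]
print(initiation_frequencies(example))
```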
Figure 13.
Distance of the predicted TF binding sites to the TSSs described in Table S1.
Data obtained in this work were compared with those of RegulonDB.
Figure 14.
Length of the 5′ untranslated region (5′ UTR).
The distance from each mapped TSS to the ATG translation initiation codon (the 5′ UTR length) is plotted for the data set obtained in this work (solid line) and for all the previously mapped TSSs in RegulonDB (dashed line). For both data sets, the most frequent 5′ UTR length was between 20 and 40 nucleotides.
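A distribution of this kind can be computed from TSS and start codon coordinates, as in the following minimal Python sketch. The function names, the strand convention, the 20-nucleotide bin width, and the coordinates are hypothetical; in particular, the 5′ UTR length is taken here as the simple coordinate difference between the TSS and the first base of the ATG, which may differ by one from the convention used in the figure.

```python
from collections import Counter

def utr_length(tss_pos, atg_pos, strand):
    """5' UTR length as the TSS-to-ATG coordinate difference.

    Assumption: positions are genome coordinates of the TSS and of the
    first base of the start codon; "+" genes read left to right and
    "-" genes read right to left.
    """
    if strand == "+":
        return atg_pos - tss_pos
    if strand == "-":
        return tss_pos - atg_pos
    raise ValueError(f"unknown strand: {strand!r}")

def bin_lengths(lengths, bin_width=20):
    """Count lengths per bin, keyed by the lower bound of each bin.

    Negative lengths (TSSs inside the coding region) are skipped here.
    """
    bins = Counter((n // bin_width) * bin_width for n in lengths if n >= 0)
    return dict(sorted(bins.items()))

# Made-up records: (TSS position, ATG position, strand).
records = [(100, 130, "+"), (500, 465, "-"), (1000, 1150, "+")]
lengths = [utr_length(t, a, s) for t, a, s in records]
print(lengths)
print(bin_lengths(lengths))
```

TSSs located within coding regions give negative values under this convention, so they are excluded from the binning here and would need to be reported separately.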