Table 1.
Primer sequences.
Figure 1.
Quantitative sequence analysis reveals positive selection for nucleotides surrounding translational start sites.
Nucleotide frequencies (%) at positions −9 to +6 are presented for the consensus Kozak sequence and 7 model species. Yellow indicates positions with statistically significant selection (p<0.03). Frequencies are color-coded on a 3-color gradient from 0% (red) through 25% (white) to >50% (green).
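As an illustration of how a per-position frequency table of this kind could be tabulated, the following is a minimal Python sketch. It assumes the input is a list of equal-length DNA sequences already aligned so that the first base is position −9 and the A of the start codon is +1. The function name `position_frequencies` and the toy sequences are hypothetical and are not taken from the study's data sets.

```python
from collections import Counter


def position_frequencies(sequences, start=-9):
    """Return {position: {nucleotide: percent}} for aligned, equal-length sequences.

    Positions follow the usual numbering around a start codon: there is no
    position 0, so the numbering runs ..., -2, -1, +1, +2, ...
    (assumption: the first base of every sequence sits at `start`).
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")

    table = {}
    pos = start
    for i in range(length):
        if pos == 0:
            pos = 1  # skip the non-existent position 0
        counts = Counter(s[i].upper() for s in sequences)
        table[pos] = {
            nt: 100.0 * counts.get(nt, 0) / len(sequences) for nt in "ACGT"
        }
        pos += 1
    return table


# Toy input (hypothetical): 15 bases each, covering positions -9 to +6.
toy = [
    "GCCGCCACCATGGCG",
    "AAAAAAAACATGGCT",
    "GCCGCCGCCATGACG",
    "TTTTTTAAAATGGCG",
]
freqs = position_frequencies(toy)
print(freqs[-3])  # frequencies at position -3
print(freqs[4])   # frequencies at position +4
```

With four equally likely nucleotides, 25% is the frequency expected by chance, which is why a table like this is naturally read against that midpoint.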
Table 2.
Bioinformatic data sets and consensus Kozak sequences.
Figure 2.
Natural frequency (y axis) correlates with the comparative sequence-selection score.
Unique translation initiation sequences show a relationship between natural frequency and the comparative score. The comparative scoring system was developed to relate each unique sequence to its individual nucleotide frequencies relative to the entire population (see text). The exponential function was fit to the median score from each frequency group, revealing a modest correlation between comparative score and natural frequency. Points are color-coded by natural frequency. The consensus/canonical Kozak sequence is marked in pink.
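One way an exponential curve could be fit to per-group medians is sketched below in Python. This is an assumption-laden illustration, not the authors' procedure: it assumes the model has the form frequency = a·exp(b·score), that the fit is done by ordinary least squares on the logarithm of frequency, and that sequences are grouped by identical natural frequency. The function name `fit_exponential_to_medians` and the data are hypothetical, and the comparative score itself is defined in the text, not here.

```python
import math
from statistics import median


def fit_exponential_to_medians(points):
    """Fit frequency = a * exp(b * score) through per-group median scores.

    points: iterable of (score, natural_frequency) pairs, frequency > 0.
    Sequences sharing the same natural frequency form one group; each group
    contributes a single point (median score, frequency). The fit is a
    least-squares line through (median score, ln frequency).
    """
    groups = {}
    for score, freq in points:
        if freq <= 0:
            raise ValueError("natural frequency must be positive")
        groups.setdefault(freq, []).append(score)

    xs = [median(scores) for scores in groups.values()]
    ys = [math.log(freq) for freq in groups]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two frequency groups")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("median scores are identical; slope is undefined")

    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


# Hypothetical data: three frequency groups whose median scores are 0, 2, 4.
data = []
for s in (0.0, 2.0, 4.0):
    freq = 2.0 * math.exp(0.5 * s)
    data += [(s - 1.0, freq), (s, freq), (s + 1.0, freq)]
a, b = fit_exponential_to_medians(data)
print(a, b)
```

Fitting through medians rather than every point keeps the large number of low-frequency sequences from dominating the curve.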
Figure 3.
Schematic of a PCR-based method to correlate Kozak sequence with translation efficiency.
Following PCR with nested primers, the purified PCR product is transcribed, and the RNA is quantified and injected into 50 embryos per RNA product. At 24 hours post-injection, the embryos are divided into 2 groups of 25: one group is used for RNA extraction, to validate the amount of RNA injected, and the other for protein extraction, to assess translation. elf1a was used as the endogenous normalization control for qRT-PCR. eGFP fluorescence is normalized to total protein (see text and Figure S1).
Figure 4.
Nucleotide sequence surrounding translational start sites modifies expression levels.
Translational efficiency is measured as fluorescence relative to the canonical/consensus Kozak sequence. All pairwise differences between samples are statistically significant (p<0.04) except Kozak vs. Middle. Each condition represents 4 replicates of 25 embryos each. One replicate of each condition was run at the same time (i.e. as one set) to maximize consistency, and the consensus Kozak replicate from that set was used to normalize the other conditions in the set. The consensus Kozak relative expression level is therefore defined as 1 for each set of replicates and has no error bars. The fold change over or under consensus spans an order of magnitude, with no sign of assay saturation.
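The per-set normalization described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each set is represented as a dictionary of protein-normalized fluorescence values keyed by condition, the spread is summarized as a sample standard deviation, and the function name `fold_over_consensus`, the condition label "High" and all numbers are hypothetical. Only the labels "Kozak" and "Middle" come from the caption.

```python
from statistics import mean, stdev


def fold_over_consensus(sets, reference="Kozak"):
    """Normalize each replicate set to its own consensus Kozak value.

    sets: list of dicts, one per replicate set, mapping condition name to
          protein-normalized fluorescence measured in that set.
    Returns {condition: (mean fold, sample SD of fold)}.
    The reference is exactly 1 in every set, so its SD is 0.
    """
    folds = {}
    for replicate_set in sets:
        ref_value = replicate_set[reference]
        if ref_value <= 0:
            raise ValueError("reference fluorescence must be positive")
        for condition, value in replicate_set.items():
            folds.setdefault(condition, []).append(value / ref_value)

    return {
        condition: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for condition, values in folds.items()
    }


# Hypothetical values for two replicate sets (arbitrary fluorescence units).
example_sets = [
    {"Kozak": 200.0, "Middle": 180.0, "High": 400.0},
    {"Kozak": 100.0, "Middle": 110.0, "High": 250.0},
]
for condition, (m, sd) in fold_over_consensus(example_sets).items():
    print(condition, round(m, 3), round(sd, 3))
```

Dividing within each set before averaging removes set-to-set variation in absolute signal, which is why the reference condition carries no error bars.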