A Powerful Method for Transcriptional Profiling of Specific Cell Types in Eukaryotes: Laser-Assisted Microdissection and RNA Sequencing

doi:10.1371/journal.pone.0029685

Figure 1.

Schematic representation of the flower and the embryo sac of Arabidopsis thaliana.

The flower of Arabidopsis thaliana consists of four whorls of organs: sepals, petals, anthers (male reproductive organs) and carpels (female reproductive organs). The carpels are fused and form the ovary, which harbors around fifty ovules. During ovule development, one embryo sac is formed within each ovule. The mature embryo sac contains three distinct cell types: the synergids and the two female gametes, the egg cell and the central cell [13]. The mature embryo sac of Arabidopsis thaliana, accession Landsberg erecta, is around long and wide [44]. The nuclei of the cells of the embryo sac are drawn as black circles, the vacuoles as white regions.

Table 1.

Classification of alignments.

Figure 2.

Examples of sequence coverage in annotated (A) and unannotated (B) regions.

Graphs in the upper parts of the panels represent the number of hits per base within the two replicates (CC1: cyan, CC2: yellow). Transcripts are drawn in the lower parts of the panels: dark boxes represent exons, bright lines mark introns and the arrowhead depicts the direction of transcription. (A) Sequence coverage at the region around the locus AT4G27960 (UBC9) on chromosome 4. The two transcripts represent two isoforms of AT4G27960. Clearly visible are the lack of coverage at the introns and the non-uniformity of sequence coverage, with the maxima close to the 3′ end of the transcripts. (B) Sequence coverage at a region on chromosome 5 that is not annotated as being transcribed. Hits in this region were assembled into transcripts using Cufflinks [11]. For each replicate, two transcripts with overlapping 3′ ends could be assembled (CC1: cyan, CC2: yellow). Notably, the sequence coverage along these transcripts resembles the coverage observed at annotated transcripts (A). Also visible are the imprecise transcript boundaries, which vary between the replicates.
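The "number of hits per base" plotted in the upper parts of the panels can be illustrated with a minimal sketch. It assumes that each hit is available as a 1-based, inclusive (start, end) interval on the chromosome; the function name and the example intervals are hypothetical and are not taken from the study.

```python
def per_base_coverage(alignments, region_start, region_end):
    """Count hits per base over [region_start, region_end] (1-based, inclusive).

    `alignments` is an iterable of (start, end) intervals, also 1-based and
    inclusive. This input format is an assumption made for illustration.
    """
    length = region_end - region_start + 1
    # Difference array: +1 where a hit starts, -1 just after it ends.
    diff = [0] * (length + 1)
    for start, end in alignments:
        lo = max(start, region_start)
        hi = min(end, region_end)
        if lo > hi:
            continue  # the hit does not overlap the region
        diff[lo - region_start] += 1
        diff[hi - region_start + 1] -= 1
    # The running sum of the difference array is the coverage at each base.
    coverage = []
    running = 0
    for delta in diff[:length]:
        running += delta
        coverage.append(running)
    return coverage


# Hypothetical hits around positions 100-110.
example_hits = [(101, 105), (103, 108), (107, 112)]
coverage = per_base_coverage(example_hits, 100, 110)
```

Computing one such coverage list per replicate would give the two curves (CC1 and CC2) that are overlaid in each panel.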

Figure 3.

Comparisons of expression values between the two RNA-Seq replicates.

In each panel, the expression values (log2 of the number of hits plus one) for each feature are plotted on the x-axis (CC2) and the y-axis (CC1). Colors indicate the point density: red indicates the highest and blue the lowest density. (A) refers to the approach that was based on the alignment of reads to the reference genome: shown are the expression values of the “expressed” genes (Pearson correlation: 0.99, Spearman correlation: 0.83). (B) refers to the approach that was based on de novo assembly of the short reads. Reads from both replicates were pooled and assembled together. To calculate expression values, reads from both replicates were aligned to the assembled transcriptome (Spearman correlation: 0.87).
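The transformation and the two correlation measures named in this caption can be sketched as follows. This is a plain-Python illustration under the assumption that hit counts per feature are available as two equally long lists; the function names are hypothetical, and the caption does not state on which scale the reported correlations were computed.

```python
import math


def log2_plus_one(counts):
    """Expression value as used on the axes: log2(number of hits + 1)."""
    return [math.log2(c + 1) for c in counts]


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical hit counts for four features in the two replicates.
cc1_hits = [0, 1, 3, 7]
cc2_hits = [0, 2, 5, 100]
cc1_expr = log2_plus_one(cc1_hits)
cc2_expr = log2_plus_one(cc2_hits)
rho = spearman(cc1_expr, cc2_expr)
```

Because the Spearman correlation depends only on ranks, it is unchanged by the log2(x+1) transformation, whereas the Pearson correlation is sensitive to it and to a few features with very high counts.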

Figure 4.

Comparisons between microarray and RNA-Seq data.

(A) The average number of hits (log2(x+1)) for each gene is plotted on the y-axis and the corresponding normalized expression value from the array data is shown on the x-axis. Expression values of the genes having a probeset on the array are well correlated between the technologies (Spearman correlation: 0.63). (B) A Venn diagram summarizing the overlap between genes detected as expressed in the RNA-Seq data sets and in the array data.
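The three regions of a two-set Venn diagram such as the one in panel (B) follow directly from set operations. The sketch below assumes that each technology yields a collection of gene identifiers called "expressed"; the function name is hypothetical and the gene lists are placeholders, not the study's results.

```python
def venn_counts(rnaseq_expressed, array_expressed):
    """Sizes of the three regions of a two-set Venn diagram."""
    rnaseq = set(rnaseq_expressed)
    array = set(array_expressed)
    return {
        "rnaseq_only": len(rnaseq - array),  # detected by RNA-Seq only
        "both": len(rnaseq & array),         # detected by both technologies
        "array_only": len(array - rnaseq),   # detected by the array only
    }


# Placeholder identifiers for illustration only.
rnaseq_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
array_genes = ["GENE_C", "GENE_D", "GENE_E"]
regions = venn_counts(rnaseq_genes, array_genes)
```

Using sets also removes duplicate identifiers, so a gene represented by several transcripts or probesets is counted once.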

Figure 5.

Test for enrichment of InterPro domains in RNA-Seq data compared to array data.

The graph shows the relative enrichment of (combinations of) InterPro domains (simplified description, details are given in Table S3) in the RNA-Seq data compared to the array data, which was found to be significant. If the combination did not occur in the array data, the enrichment value was set to the total number of occurrences of the combination in the RNA-Seq data (marked with "a"). We performed two tests to separate the effect of the higher sensitivity (yellow) from the effect caused by the whole-genome coverage (magenta). Combinations of protein domains in the upper, middle, and lower part of the figure were significantly enriched in both, the first, and the second test, respectively. Abbreviations: DUF: domain of unknown function, LRR: leucine rich repeat, PPR: pentatricopeptide repeat, bHLH: basic helix-loop-helix, NBS: nucleotide binding site, SI-: self-incompatibility, DEFL: defensin-like. The term “unknown” comprises all transcripts without an InterPro annotation (includes also non-protein-coding genes).
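The fallback rule for combinations that are absent from the array data can be sketched as below. The caption does not define the enrichment value itself, so this sketch assumes a ratio of relative frequencies (fraction in the RNA-Seq data divided by fraction in the array data); that definition, the function name and the example annotations are assumptions, and the significance tests are not reproduced here.

```python
from collections import Counter


def relative_enrichment(rnaseq_annotations, array_annotations):
    """Map each domain combination to (enrichment value, used_fallback).

    Each input list holds one annotation label per detected gene.
    ASSUMPTION: enrichment is the ratio of relative frequencies. Following
    the caption, a combination missing from the array data gets its total
    number of occurrences in the RNA-Seq data instead, and is flagged.
    """
    rnaseq_counts = Counter(rnaseq_annotations)
    array_counts = Counter(array_annotations)
    n_rnaseq = len(rnaseq_annotations)
    n_array = len(array_annotations)
    result = {}
    for combo, count in rnaseq_counts.items():
        if array_counts[combo] == 0:
            # Fallback: the ratio is undefined, so report the raw count.
            result[combo] = (float(count), True)
        else:
            rnaseq_fraction = count / n_rnaseq
            array_fraction = array_counts[combo] / n_array
            result[combo] = (rnaseq_fraction / array_fraction, False)
    return result


# Hypothetical annotations, one label per detected gene.
rnaseq_labels = ["LRR", "LRR", "DEFL", "DEFL", "DEFL", "unknown"]
array_labels = ["LRR", "unknown", "unknown", "unknown"]
enrichment = relative_enrichment(rnaseq_labels, array_labels)
```

The boolean flag plays the role of the marker in the figure: it distinguishes true ratios from fallback counts, which are not on the same scale and should not be compared directly.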

Figure 6.

Genes enriched in the central cell compared to other tissues of Arabidopsis thaliana.

Expression values of genes preferentially expressed in central cells are summarized in a heatmap (blue/red: low/high expression values). Expression values were equalized using edgeR [31] and log2(x+1) transformed. Samples and genes were clustered using Spearman correlation and hierarchical agglomerative clustering. Transcriptomes from whole plant and seedlings, unopened flowers, early globular embryos, male meiocytes, and 2–4 cell and globular stage embryos were obtained from [18], [19], [29], [30], and [12], respectively.
