Reader Comments
Differential expression methodology is contradictory and incomplete
Posted by RichardSmith on 18 Nov 2013 at 11:41 GMT
In the *Identification of differentially expressed genes* section, the authors state:
"To identify the genes associated with the growth of shoot and assess the molecular basis involved in the shoots' rapid growth, the expressional levels of genes were analyzed and the fold changes were assessed by the log2 ratio (RPKM-H/RPKM-CK). After the expressional abundances in each library were normalized to transcript per million (RPKM), then the most differentially regulated genes (differentially expressed genes, DEGs) with a log2 ratio (> 2 or <2) using a greater statistically significant value (P<0.001) as well as false discovery rates (FDR<0.01) were selected."
I would like to highlight the following issues:
*1.* RPKM does not stand for 'transcript per million'; it stands for 'reads per kilobase of transcript per million mapped reads'. This is a different measure from TPM. Which did the authors use? If RPKM was used, this is by definition not a suitable normalization for between-sample differential expression analysis, because the normalization factor varies between samples. This is explained in detail in the following articles: http://www.ncbi.nlm.nih.g... http://bib.oxfordjournals....
*2.* The authors do not state which statistical test and software were used to test for differential expression. A p-value and an FDR cannot be interpreted without knowing the method that produced them.
Please could the authors describe this section of the methodology with appropriate detail and accuracy?
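To make the distinction in point 1 concrete, here is a minimal Python sketch of the two measures. The function names, gene lengths and read counts are made up for illustration and are not taken from the article. It shows that TPM values sum to one million in every sample, whereas the sum of RPKM values depends on the sample.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    # count / (length in kb) / (total reads in millions)
    return [c * 1e9 / (total_reads * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]


# Hypothetical data: three genes, two libraries (CK and H).
lengths = [1000, 2000, 500]
counts_ck = [100, 200, 50]
counts_h = [100, 200, 500]

rpkm_ck, rpkm_h = rpkm(counts_ck, lengths), rpkm(counts_h, lengths)
tpm_ck, tpm_h = tpm(counts_ck, lengths), tpm(counts_h, lengths)

print("sum RPKM:", sum(rpkm_ck), sum(rpkm_h))  # differs between libraries
print("sum TPM: ", sum(tpm_ck), sum(tpm_h))    # the same in both libraries
```

In this toy example only the third gene changes between libraries, yet the RPKM values of the first two genes also shift, because the total read count in the denominator changes.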
RE: Differential expression methodology is contradictory and incomplete
Chunling replied to RichardSmith on 02 Dec 2013 at 21:17 GMT
Hi Richard, let me address your first concern. I have to admit that you are right: 'transcript per million' was a misnomer for RPKM in our manuscript. RPKM stands for 'reads per kilobase of transcript per million mapped reads' and is used to calculate gene expression levels in transcriptome sequencing (Mortazavi, A., Williams B. A., et al. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods. 2008, 5(7): 621-8.), while TPM is used in digital gene expression (DGE) profiling; the two are used in different fields. In our article, transcriptome sequencing was performed, and therefore RPKM was used to normalize the expression abundances.
RE: RE: Differential expression methodology is contradictory and incomplete
Chunling replied to Chunling on 02 Dec 2013 at 21:27 GMT
As for your next question about the statistical test and software: we followed 'The significance of digital gene expression profiles' (Audic, S. and Claverie J. M. The significance of digital gene expression profiles. Genome Res 1997, 7(10): 986-95.), and developed a strict algorithm based on it to identify differentially expressed genes between two samples (we apologize that the formula cannot easily be posted on this website). The significance of differentially expressed genes was determined via the FDR (false discovery rate) control method (Benjamini, Y. and Yekutieli D. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics. 2001, 29:1165-1188.) to correct the p-values for multiple testing. Therefore, genes with a log2 ratio (> 2 or < -2), a p-value of P<0.001, and a false discovery rate of FDR<0.01 were selected as the most differentially regulated genes.
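Since the formula could not be posted, here is a minimal Python sketch of the test described in the cited Audic and Claverie paper. It is not the authors' script. It assumes raw read counts x and y for one gene in two libraries with total read counts n1 and n2, and the two-sided p-value convention used here (twice the smaller tail, capped at 1) is only one of several in use. The function names are illustrative.

```python
import math


def ac_log_pmf(y, x, n1, n2):
    """log p(y | x) from Audic & Claverie (1997):
    p(y|x) = (n2/n1)^y * (x+y)! / (x! * y! * (1 + n2/n1)^(x+y+1))
    """
    q = n2 / n1
    return (y * math.log(q)
            + math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
            - (x + y + 1) * math.log1p(q))


def ac_pvalue(x, y, n1, n2):
    """Two-sided p-value for observing y reads given x reads (assumed convention)."""
    below = sum(math.exp(ac_log_pmf(i, x, n1, n2)) for i in range(y))  # P(Y < y)
    lower = below + math.exp(ac_log_pmf(y, x, n1, n2))                 # P(Y <= y)
    upper = max(0.0, 1.0 - below)                                      # P(Y >= y)
    return min(1.0, 2.0 * min(lower, upper))


# Hypothetical counts for one gene in two libraries of one million reads each.
print(ac_pvalue(10, 10, 1_000_000, 1_000_000))   # no change: p close to 1
print(ac_pvalue(10, 100, 1_000_000, 1_000_000))  # tenfold increase: very small p
```

The upper tail is computed as one minus a sum, so very small p-values lose precision. That is acceptable for a sketch but would need care in a real analysis.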
Addition/Change: RE: RE: RE: Differential expression methodology is contradictory and incomplete
Chunling replied to Chunling on 30 Dec 2013 at 02:49 GMT
Referring to 'The significance of digital gene expression profiles' (Audic, S. and Claverie J. M. The significance of digital gene expression profiles. Genome Res 1997, 7(10): 986-95.), a strict algorithm, implemented in an in-house R script, was developed to identify differentially expressed genes between two samples (we apologize that the formula cannot easily be posted on this website). The significance of differentially expressed genes was determined via the FDR (false discovery rate) control method (Benjamini, Y. and Yekutieli D. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics. 2001, 29:1165-1188.), which was used to set the p-value threshold under multiple testing. For greater accuracy, and in keeping with developments in sequencing, genes with a log2 ratio ≥ 2 or ≤ -2 and a false discovery rate of FDR ≤ 0.001 were selected as the most differentially regulated genes.
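The selection step described in this post can be sketched in Python as follows. This is an illustration under stated assumptions, not the in-house R script. It assumes per-gene p-values are already available, applies the Benjamini-Yekutieli step-up adjustment, and keeps genes with |log2 ratio| ≥ 2 and adjusted FDR ≤ 0.001. The function names, the input layout and the skipping of genes with a zero expression value are all assumptions.

```python
import math


def by_adjust(pvalues):
    """Benjamini-Yekutieli adjusted p-values (valid under arbitrary dependency)."""
    m = len(pvalues)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, fdr_cutoff=0.001, lfc_cutoff=2.0):
    """genes: list of (name, expr_ck, expr_h, pvalue) tuples (hypothetical layout)."""
    fdrs = by_adjust([g[3] for g in genes])
    selected = []
    for (name, expr_ck, expr_h, _), fdr in zip(genes, fdrs):
        if expr_ck <= 0 or expr_h <= 0:
            continue  # log2 ratio undefined; how to treat zeros is an assumption
        lfc = math.log2(expr_h / expr_ck)
        if abs(lfc) >= lfc_cutoff and fdr <= fdr_cutoff:
            selected.append((name, lfc, fdr))
    return selected


# Hypothetical genes: only "a" passes both the fold-change and the FDR cutoff.
genes = [("a", 1.0, 8.0, 1e-8), ("b", 4.0, 5.0, 1e-8), ("c", 8.0, 1.0, 0.5)]
print(select_degs(genes))
```

Gene "b" is significant but changes too little, and gene "c" changes eightfold but is not significant, so neither is selected.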
RE: Differential expression methodology is contradictory and incomplete
Chunling replied to RichardSmith on 02 Dec 2013 at 21:29 GMT
Once more, we would like to express our appreciation for the comments and for the patience and care shown toward our article. We hope to check our manuscript carefully, correct the highlighted technical terms, and update it for greater accuracy in keeping with developments in sequencing. We hope that these posts provide appropriate detail in response to the comments. Please let me know if you have any further questions or suggestions. Thank you again.