Table 1.
Cohort Description.
Fig 1.
Percentage of genes detected above the limit of detection (LOD) by cohort.
Each point on the boxplot represents a unique NanoString nCounter run (duplicates and triplicates included where available). The colored boxes represent the distribution of the percentage of genes detected in each cohort. The white line indicates the median. A cutoff of 50% was used for cell lines and clinical samples, and a cutoff of 95% was used for oligonucleotide samples. HL: Hodgkin lymphoma clinical samples, OC: ovarian cancer clinical samples, OVCL: ovarian cancer cell lines, HLO: oligonucleotides corresponding to the HL CodeSet, OVO: oligonucleotides corresponding to the OC CodeSet.
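The detection metric in Fig 1 can be illustrated with a minimal sketch. The cohort cutoffs (50% and 95%) come from the legend above. The LOD definition (mean of the negative-control counts plus two standard deviations), the inclusive comparison against the cutoff, and all function names are assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean, stdev

# Cutoffs from the legend: 50% for cell lines and clinical samples,
# 95% for oligonucleotide samples.
CUTOFFS = {"HL": 50.0, "OC": 50.0, "OVCL": 50.0, "HLO": 95.0, "OVO": 95.0}


def limit_of_detection(negative_controls):
    """Assumed LOD: mean of negative-control counts plus two standard deviations."""
    return mean(negative_controls) + 2 * stdev(negative_controls)


def percent_genes_detected(gene_counts, lod):
    """Percentage of genes whose counts are strictly above the LOD."""
    detected = sum(1 for count in gene_counts if count > lod)
    return 100.0 * detected / len(gene_counts)


def passes_detection_qc(gene_counts, negative_controls, cohort):
    """True if the run meets the cohort cutoff (inclusive comparison is an assumption)."""
    lod = limit_of_detection(negative_controls)
    return percent_genes_detected(gene_counts, lod) >= CUTOFFS[cohort]
```

Under these assumptions, a run with three of four genes above the LOD (75%) would pass as a clinical sample but fail as an oligonucleotide sample.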
Fig 2.
Percentage of genes detected as a function of signal-to-noise ratio by cohort.
Each point on the plot represents a unique NanoString nCounter run (duplicates and triplicates included where available). The zoomed-in section illustrates how the selected cutoff excludes samples that have both a low signal-to-noise ratio and a low percentage of genes detected. HL: Hodgkin lymphoma clinical samples, OC: ovarian cancer clinical samples, OVCL: ovarian cancer cell lines, HLO: oligonucleotides corresponding to the HL CodeSet, OVO: oligonucleotides corresponding to the OC CodeSet.
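As a rough sketch of how each point in such a plot could be computed, the snippet below pairs a signal-to-noise ratio with the percentage of genes detected for one run. The legend does not define the ratio. Here it is assumed to be the mean housekeeping-gene count divided by the LOD, with the LOD assumed to be the mean of the negative controls plus two standard deviations. The function names are hypothetical.

```python
from statistics import mean, stdev


def assumed_lod(negative_controls):
    """Assumed LOD: mean of negative-control counts plus two standard deviations."""
    return mean(negative_controls) + 2 * stdev(negative_controls)


def signal_to_noise(housekeeping_counts, negative_controls):
    """Assumed SNR: mean housekeeping-gene count divided by the LOD."""
    lod = assumed_lod(negative_controls)
    return mean(housekeeping_counts) / lod if lod > 0 else float("inf")


def qc_point(gene_counts, housekeeping_counts, negative_controls):
    """Return (SNR, percent of genes above LOD) for a single run."""
    lod = assumed_lod(negative_controls)
    pct = 100.0 * sum(1 for c in gene_counts if c > lod) / len(gene_counts)
    return signal_to_noise(housekeeping_counts, negative_controls), pct
```

Plotting these pairs per cohort would give a scatter of the kind described, in which runs with few detected genes also tend to sit at low ratios.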
Table 2.
Overall QC Measures by cohort.
Fig 3.
PVCA and PCA plots of the Hodgkin Lymphoma clinical samples.
We considered the PVCA plot (A) of the HL clinical samples run in different batches. The percentages represent the variability explained by each factor and by the first-order interactions between factors. The PCA plot (B) shows pairwise two-dimensional plots of the first three principal components, which together account for 49% of the variability in the data. HL1, HL2, and HL3 label the unique CodeSets corresponding to the HL gene list.
Fig 4.
PVCA and PCA plots of the ovarian cancer clinical samples.
We considered the PVCA plot (A) of the OC clinical samples run in different batches. The percentages represent the variability explained by each factor and by the first-order interactions between factors. The PCA plot (B) shows pairwise two-dimensional plots of the first three principal components, which together account for 40% of the variability in the data. CS1, CS2, and CS3 label the unique CodeSets corresponding to the OC gene list.
Table 3.
Concordance between duplicates of HL clinical samples obtained from two CodeSets, after adjusting using different methods.
Table 4.
Concordance between duplicates of OC clinical samples obtained from two CodeSets, after adjusting using different methods.
Fig 5.
PVCA of the HL clinical samples after adjusting batch effect using different methods.
We considered the PVCA plot of the HL clinical samples run in different batches after adjusting for batch effect (BE) with different methods. In each plot, the percentages represent the variability explained by each factor and by the first-order interactions between factors.
Fig 6.
PVCA of the OC clinical samples after adjusting batch effect using different methods.
We considered the PVCA plot of the OC clinical samples run in different batches after adjusting for batch effect (BE) with different methods. In each plot, the percentages represent the variability explained by each factor and by the first-order interactions between factors.
Fig 7.
Impact of batch effect (BE) on downstream analysis, illustrated using an HL prognostic model.
The x and y axes correspond to risk scores obtained in HL1 and HL2, respectively. The dashed line represents the identity line, and the solid line represents the best linear fit. The horizontal line indicates the threshold used for prediction. The results in (A) correspond to scores not corrected for BE, and those in (B) correspond to scores corrected using three reference samples that were run in both CodeSets.
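One way a reference-based correction of this kind could work is sketched below. The legend states only that three reference samples run in both CodeSets were used. The per-gene mean-difference offset on a log scale, the function names, the gene labels, and the numeric values are all assumptions for illustration.

```python
from statistics import mean


def reference_offsets(ref_batch1, ref_batch2):
    """Per-gene offset: mean reference expression in batch 2 minus batch 1 (log scale)."""
    return {
        gene: mean(ref_batch2[gene]) - mean(ref_batch1[gene])
        for gene in ref_batch1
    }


def correct_sample(sample_batch2, offsets):
    """Shift a batch-2 sample onto the batch-1 scale by subtracting the offsets."""
    return {gene: value - offsets[gene] for gene, value in sample_batch2.items()}


# Illustrative log2 values for three reference samples run in both CodeSets.
refs_hl1 = {"GENE_A": [8.0, 8.2, 7.8], "GENE_B": [5.0, 5.1, 4.9]}
refs_hl2 = {"GENE_A": [8.5, 8.7, 8.3], "GENE_B": [4.6, 4.7, 4.5]}

offsets = reference_offsets(refs_hl1, refs_hl2)
corrected = correct_sample({"GENE_A": 9.0, "GENE_B": 6.0}, offsets)
```

A risk score computed from the corrected values would then be compared against the same prediction threshold as scores from the first CodeSet.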