RNA-seq quality control checklist

RNA-seq quality control helps determine whether sequencing data are reliable enough for downstream analysis.

Poor-quality RNA-seq data can produce misleading expression estimates and false biological conclusions. Quality control should be performed both before and after alignment or quantification. The goal is to identify low-quality reads, adapter contamination, poor mapping, sample swaps, outliers, and unexpected expression patterns.

Raw read quality

FASTQ quality reports can reveal low base quality, adapter contamination, biased sequence composition, or overrepresented sequences. Adapter trimming or quality filtering may be needed depending on the dataset.
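
As a rough illustration of what a base-quality check measures, the sketch below computes the mean Phred score of each read in a FASTQ stream. It assumes plain four-line FASTQ records and the common Phred+33 quality encoding; the function name mean_phred_per_read and the example reads are hypothetical, and a dedicated QC tool reports far more than this.

```python
import io


def mean_phred_per_read(handle, offset=33):
    """Yield (read_id, mean Phred score) for each four-line FASTQ record.

    Assumes Phred+33 encoding (an assumption; older data may use Phred+64).
    """
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of input
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        quality = handle.readline().strip()
        scores = [ord(char) - offset for char in quality]
        mean_score = sum(scores) / len(scores) if scores else 0.0
        yield header[1:], mean_score  # drop the leading '@'


# Two made-up reads: one uniformly high quality, one with a weak start.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n")
results = list(mean_phred_per_read(example))
for read_id, score in results:
    print(read_id, score)
```

Reads with a low mean score are candidates for trimming or filtering, but the cutoff is a judgment call that depends on the dataset.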

Mapping or quantification rate

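In a good dataset, the majority of reads should typically map to the reference or be assigned to annotated features; the rate to expect depends on the organism, the library type, and the tool used. A low mapping rate may indicate the wrong species, the wrong genome build, contamination, poor read quality, or library preparation issues.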
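
The mapping rate itself is simple arithmetic: mapped (or assigned) reads divided by total reads. The sketch below computes it per sample and flags samples below a cutoff. The sample names, read counts, and the 0.70 cutoff are all made up for illustration and are not a recommended threshold.

```python
def mapping_rate(mapped_reads, total_reads):
    """Return the fraction of reads that were mapped or assigned."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


# Hypothetical (mapped, total) read counts per sample.
samples = {
    "sample_A": (18_500_000, 20_000_000),
    "sample_B": (9_000_000, 20_000_000),
}

MIN_RATE = 0.70  # illustrative cutoff only; choose one suited to the data

rates = {name: mapping_rate(mapped, total)
         for name, (mapped, total) in samples.items()}
flagged = [name for name, rate in rates.items() if rate < MIN_RATE]

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print("Needs review:", flagged)
```

A flagged sample is a prompt to check the species, genome build, and read quality rather than an automatic exclusion.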

Library type and strandedness

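RNA-seq libraries may be single-end or paired-end, and stranded or unstranded. Specifying the wrong library type, for example treating a reverse-stranded library as forward-stranded, can cause reads to be assigned to the wrong strand or left unassigned, which distorts quantification. Metadata for public datasets may be incomplete or incorrect, so the library type should be verified before analysis.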
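
One way to check strandedness is to count how many reads agree with the annotated gene strand and how many oppose it, then look at the ratio. The sketch below shows only that final classification step. It assumes the sense and antisense counts have already been obtained elsewhere; the function name classify_strandedness and the 0.8 cutoff are hypothetical choices, not a standard.

```python
def classify_strandedness(sense_reads, antisense_reads, cutoff=0.8):
    """Classify a library from counts of reads on each strand.

    sense_reads: reads matching the annotated gene strand.
    antisense_reads: reads on the opposite strand.
    cutoff: illustrative fraction above which a library is called stranded.
    """
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("no reads to classify")
    sense_fraction = sense_reads / total
    if sense_fraction >= cutoff:
        return "forward-stranded"
    if sense_fraction <= 1 - cutoff:
        return "reverse-stranded"
    return "unstranded or ambiguous"


# Made-up counts for three libraries.
print(classify_strandedness(950, 50))
print(classify_strandedness(40, 960))
print(classify_strandedness(510, 490))
```

A roughly even split suggests an unstranded library, while an intermediate value is worth investigating before choosing a setting.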

Sample-level patterns

PCA, clustering, and heatmaps can reveal whether samples group by the expected condition. An outlier may indicate a sample mix-up, a technical failure, a batch effect, or genuine biological heterogeneity.
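
The sketch below is not PCA. It is a simpler related check that uses only the Python standard library: it log-transforms counts, computes the Pearson correlation between every pair of samples, and reports the sample that is least similar to the rest. The sample names and counts are invented, and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / math.sqrt(var_x * var_y)


def mean_correlation_to_others(counts):
    """Map each sample to its mean correlation with all other samples."""
    # log2(count + 1) reduces the influence of a few highly expressed genes.
    logged = {name: [math.log2(c + 1) for c in values]
              for name, values in counts.items()}
    result = {}
    for name in logged:
        others = [pearson(logged[name], logged[other])
                  for other in logged if other != name]
        result[name] = sum(others) / len(others)
    return result


# Made-up counts for five genes in four samples; ctrl_4 looks different.
counts = {
    "ctrl_1": [100, 200, 50, 10, 400],
    "ctrl_2": [110, 190, 55, 12, 390],
    "ctrl_3": [95, 210, 48, 9, 410],
    "ctrl_4": [400, 10, 300, 250, 20],
}

scores = mean_correlation_to_others(counts)
least_similar = min(scores, key=scores.get)
print("Least similar sample:", least_similar)
```

A sample that stands apart in this way still needs to be examined against the possible causes listed above before it is excluded.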

Practical checklist

THRAISE helps organize RNA-seq analysis outputs, but users should still inspect QC indicators before interpreting differential expression results.

This guide is provided for research and educational purposes. RNA-seq results should be interpreted with appropriate experimental design, quality control, statistical review, and biological validation.
