
Quality filtering

Bokulich Lab

To perform quality control we will use fastp, wrapped in the q2-fastp QIIME 2 plugin. Below you will see two scenarios: running fastp without any filtering, only to generate a quality report, and running it to filter the reads and generate a report at the same time.

Quality overview

We can get an overview of the read quality by using the process-seqs action from the fastp QIIME 2 plugin. With the options shown below, this command runs fastp without performing any trimming or filtering. We then pass the resulting reports to the visualize action to generate a report visualization.

mosh fastp process-seqs \
    --i-sequences cache:reads_paired \
    --p-disable-quality-filtering \
    --p-no-dedup \
    --p-disable-adapter-trimming \
    --p-no-correction \
    --p-thread 4 \
    --o-processed-sequences cache:reads_paired_fastp_not_processed \
    --o-reports cache:fastp_reports_before \
    --verbose

To generate the visualization, run:

mosh fastp visualize \
    --i-reports cache:fastp_reports_before \
    --o-visualization fastp-before.qzv \
    --verbose

Read trimming and quality filtering

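Alternatively, we can remove low-quality bases from the reads and generate a report at the same time. To do this, we run the same command without disabling the QC steps, trimming low-quality bases from the 3′ end (--p-cut-tail with --p-cut-mean-quality 30) and requiring a minimum read length of 90 (--p-length-required 90):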

mosh fastp process-seqs \
    --i-sequences cache:reads_paired \
    --p-length-required 90 \
    --p-cut-mean-quality 30 \
    --p-cut-tail \
    --p-thread 4 \
    --o-processed-sequences cache:reads_paired_fastp \
    --o-reports cache:fastp_reports \
    --verbose
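To build intuition for how tail trimming and the length filter interact, the sketch below applies the same idea to a single Phred+33 quality string: bases are dropped from the 3′ end while the mean quality of the last window is below the threshold (Q30 corresponds to an estimated error rate of 1 in 1000), and the read is then checked against the minimum length. This is a simplified illustration and not fastp's actual implementation. The helper name trim_tail is hypothetical, and the window size of 4 is an assumption based on fastp's documented default, so check which value the plugin uses.

```shell
#!/usr/bin/env bash
# Simplified sketch of sliding-window tail trimming (not fastp's real code).
# trim_tail is a hypothetical helper; it prints how many bases would be kept.
trim_tail() {
    local qual="$1" threshold="$2" window="${3:-4}"   # window of 4 is an assumption
    printf '%s\n' "$qual" | awk -v thr="$threshold" -v w="$window" '
    BEGIN {
        # Lookup table from printable ASCII character to its code point.
        for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
    }
    {
        n = length($0)
        while (n >= w) {
            sum = 0
            # Sum the Phred scores (ASCII code minus 33) of the last w bases.
            for (i = n - w + 1; i <= n; i++) sum += ord[substr($0, i, 1)] - 33
            if (sum / w >= thr) break   # window is good enough: stop trimming
            n--                         # otherwise drop the last base and retry
        }
        if (n < w) n = 0                # no window reached the threshold
        print n
    }'
}

kept=$(trim_tail 'IIIIIIII####' 30)     # I = Q40, # = Q2
echo "bases kept: $kept"
# A read is retained only if enough bases survive (90 in the command above).
if [ "$kept" -ge 90 ]; then echo "read retained"; else echo "read discarded"; fi
```

For this toy read the sketch keeps 9 bases: the last accepted window contains three Q40 bases and one Q2 base, whose mean of 30.5 clears the threshold. The decision is made on the window mean rather than on each base, so an isolated low-quality base can survive trimming.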

Finally, we generate the visualization:

mosh fastp visualize \
    --i-reports cache:fastp_reports \
    --o-visualization fastp.qzv \
    --verbose

You should see something similar to this result.
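One quick way to judge whether the filtering settings are too aggressive is to compare the total read counts shown in the before and after reports. The helper below is a hypothetical convenience function, not part of the plugin, and the counts in the usage line are placeholders rather than results from this dataset; substitute the totals from your own reports.

```shell
#!/usr/bin/env bash
# retained_pct is a hypothetical helper: percentage of reads kept by filtering.
# $1 = total reads before filtering, $2 = total reads after filtering.
retained_pct() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        if (before <= 0) { print "NA"; exit }   # avoid dividing by zero
        printf "%.1f\n", 100 * after / before
    }'
}

# Placeholder counts for illustration only.
pct=$(retained_pct 1000000 874000)
echo "reads retained: ${pct}%"
```

With the placeholder counts this prints "reads retained: 87.4%". A much lower value than expected may suggest relaxing --p-cut-mean-quality or --p-length-required.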

References
  1. Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884–i890. 10.1093/bioinformatics/bty560