Trimming and filtering of NGS reads

Currently, cutadapt is one of the most convenient tools to:

  • Remove adapters, poly-A tails and any other unwanted sequences from NGS reads
  • Remove a fixed number of bases from either end of NGS reads
  • Remove or trim reads with low-quality bases
  • Extract NGS reads carrying certain sequence motifs
  • Size select NGS reads
  • Rename NGS reads
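As a rough guide, each task in the list corresponds to one or two cutadapt options. The sketch below is illustrative only: the file names, the adapter sequence and the motif are placeholders, and `run` is a hypothetical helper that prints each command instead of executing it unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Sketch only: file names, adapter and motif sequences are placeholders.

# Hypothetical helper: print the command, or execute it when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

reads=input.fastq

# Remove a 3' adapter and everything that follows it
run cutadapt -a AGATCGGAAGAGC -o no_adapter.fastq "$reads"
# Remove 3 bases from the 5' end; a negative value cuts from the 3' end
run cutadapt -u 3 -o cut_5prime.fastq "$reads"
run cutadapt -u -3 -o cut_3prime.fastq "$reads"
# Trim low-quality 3' ends using a quality cutoff of 20
run cutadapt -q 20 -o quality_trimmed.fastq "$reads"
# Keep only reads that contain a motif, leaving the reads untrimmed
run cutadapt -b ACGTACGT --no-trim --discard-untrimmed -o with_motif.fastq "$reads"
# Keep reads between 18 and 50 nt long
run cutadapt -m 18 -M 50 -o size_selected.fastq "$reads"
# Trim an adapter and append a suffix to the read names
run cutadapt -a AGATCGGAAGAGC -y _trimmed -o renamed.fastq "$reads"
```

Running the script as is only lists the commands; `RUN=1 bash script.sh` would execute them.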


# Download and unpack cutadapt (version 1.16 in this example):

wget https://pypi.python.org/packages/68/73/2ae48245bbf6d84a24bdf29540ed01669f09c5d21c26258f2ce07e13c767/cutadapt-1.16.tar.gz

tar -xzf cutadapt-1.16.tar.gz

# Install the Python setuptools:

sudo apt-get install python-setuptools

# Install cutadapt:

cd cutadapt-1.16

sudo python setup.py install
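To confirm that the installation worked, one can check that the `cutadapt` executable is on the PATH and print its version. This is a minimal sketch; `require_tool` is a hypothetical helper name, not part of cutadapt.

```shell
# Hypothetical helper: succeed if the named program is on the PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found: $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Print the installed version (1.16 is expected after the steps above).
if require_tool cutadapt; then
    cutadapt --version
fi
```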

Trimming raw reads from different NGS kits:

CATS RNA and DNA library preparation kits (Diagenode)

# Read 1

cutadapt -u 3 input.fastq | cutadapt -a AAAAAAAA - | cutadapt -a AAAAAAAN$ -a AAAAAAN$ -a AAAAAN$ - | cutadapt -a AGAGCACACGTCTG - | cutadapt -O 8 -g GTTCAGAGTTCTACAGTCCGACGATCNNN - | cutadapt -m 18 -o output.fastq -

# Read 2

cutadapt -a CCCGATCGTCGG read2.fastq | cutadapt -a GGGGATCGTCGG - | cutadapt -m 18 -o output.fastq -
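The Read 1 command above chains six cutadapt calls. The same pipeline can be written as a shell function with one step per line, which makes each step easier to follow. The function name `trim_cats_read1` is hypothetical, and the comments describe only what the cutadapt options do, not the kit vendor's rationale for them.

```shell
# Sketch: the CATS Read 1 pipeline, one cutadapt step per line.
# Usage: trim_cats_read1 input.fastq output.fastq
trim_cats_read1() {
    cutadapt -u 3 "$1" |                                    # remove the first 3 bases
    cutadapt -a AAAAAAAA - |                                # trim the poly-A tail and anything after it
    cutadapt -a 'AAAAAAAN$' -a 'AAAAAAN$' -a 'AAAAAN$' - |  # trim A runs plus one base anchored at the 3' end
    cutadapt -a AGAGCACACGTCTG - |                          # trim the 3' adapter
    cutadapt -O 8 -g GTTCAGAGTTCTACAGTCCGACGATCNNN - |      # trim the 5' adapter, minimum overlap 8
    cutadapt -m 18 -o "$2" -                                # discard reads shorter than 18 nt
}
```

The `$` anchors are quoted here so that the shell always passes them to cutadapt literally.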

NEBNext® Ultra™ and NEBNext® Ultra™ II DNA Library Prep Kits for Illumina® (NEB)

# Read 1

cutadapt -a GATCGGAAGAGCACACGT input.fastq | cutadapt -m 18 -o output.fastq -

# Read 2

cutadapt -a GATCGGAAGAGCACACGT input.fastq | cutadapt -m 18 -o output.fastq -

NEBNext® Small RNA Library Prep kit (NEB)

# Read 1

cutadapt -a AGATCGGAAGAGCACACGTCT input.fastq | cutadapt -m 18 -o output.fastq -

# Read 2

cutadapt -a GATCGTCGGACTGTAGAACTC input.fastq | cutadapt -m 18 -o output.fastq -

TruSeq Small RNA Library Preparation Kits (Illumina)

# Read 1

cutadapt -a TGGAATTCTCGGGTGCCAAGG input.fastq | cutadapt -m 18 -o output.fastq -

# Read 2

cutadapt -a GATCGTCGGACTGTAGAACTC input.fastq | cutadapt -m 18 -o output.fastq -

TruSeq RNA Library Prep Kit v2, TruSeq Stranded mRNA, TruSeq Stranded Total RNA (Illumina)

# Read 1

cutadapt -a AGATCGGAAGAGCACACGTCT input.fastq | cutadapt -m 18 -o output.fastq -

# Read 2

cutadapt -a AGATCGGAAGAGCGTCGTGTA input.fastq | cutadapt -m 18 -o output.fastq -
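When Read 1 and Read 2 are trimmed separately with `-m 18`, a read can be discarded from one file while its mate stays in the other, which leaves the two files out of sync. cutadapt's paired-end mode (`-A` for the Read 2 adapter, `-p` for the second output file) processes both files together and keeps or discards the two reads of a pair together. The following is a minimal sketch using the adapter sequences from the two commands above; the function name and file names are hypothetical.

```shell
# Sketch: paired-end trimming in a single cutadapt call.
# -a / -A : 3' adapters for Read 1 / Read 2
# -m 18   : discard pairs with a read shorter than 18 nt
# -o / -p : output files for Read 1 / Read 2
# Usage: trim_truseq_pair in_R1.fastq in_R2.fastq out_R1.fastq out_R2.fastq
trim_truseq_pair() {
    cutadapt \
        -a AGATCGGAAGAGCACACGTCT \
        -A AGATCGGAAGAGCGTCGTGTA \
        -m 18 \
        -o "$3" -p "$4" \
        "$1" "$2"
}
```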

ScriptSeq RNA-Seq Library Preparation Kit (Illumina)

# Read 1

cutadapt -a AGATCGGAAGAGCACACGTCT input.fastq | cutadapt -m 18 -o output.fastq -

# Read 2

cutadapt -a AGATCGGAAGAGCGTCGTGTA input.fastq | cutadapt -m 18 -o output.fastq -
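After trimming, a quick sanity check is to confirm that no read in the output is shorter than the `-m 18` threshold. The sketch below uses plain awk; `fastq_lengths` is a hypothetical helper, and it assumes the standard four-line FASTQ layout with unwrapped sequence lines.

```shell
# Hypothetical helper: print read count, shortest and longest read length.
# Assumes 4 lines per record, so the sequence is every line where NR % 4 == 2.
# Usage: fastq_lengths output.fastq
fastq_lengths() {
    awk 'NR % 4 == 2 {
             n++
             len = length($0)
             if (n == 1 || len < min) min = len
             if (n == 1 || len > max) max = len
         }
         END { printf "reads=%d min=%d max=%d\n", n, min, max }' "$1"
}
```

For a file trimmed with `-m 18`, the reported `min` should be at least 18.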
