Advanced Bioinformatics Services

Primary analysis of FASTQ files

Currently, the FastQC software (Babraham Bioinformatics) is the most convenient tool for:

  • Preliminary quality check of raw sequence data obtained on Illumina, BGI and Ion Torrent platforms
  • Calculating the percentage of duplicated reads (duplication rate)
  • Visualizing the distribution of read lengths and nucleotide content
  • Calculating and visualizing the GC content distribution across bases and reads
  • Identifying overrepresented sequences in a FASTQ file
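
To make these metrics concrete, here is a minimal sketch that computes a few of them directly from an uncompressed FASTQ file with awk. The function name fastq_stats is a hypothetical helper introduced for illustration and is not part of FastQC. It assumes strict 4-line records, and the duplicate percentage counts exact sequence matches only, which is a simplification and not the estimate FastQC reports.

```shell
# fastq_stats FILE
# Print read count, mean read length, overall GC percentage and the
# percentage of reads that exactly duplicate an earlier read.
# Assumes an uncompressed FASTQ file with exactly 4 lines per record.
fastq_stats() {
    awk 'NR % 4 == 2 {                      # line 2 of each record is the sequence
             seq = $0
             reads++
             bases += length(seq)
             if (!(seq in seen)) { seen[seq] = 1; distinct++ }
             gc += gsub(/[GCgc]/, "", seq)  # gsub returns the number of G and C characters
         }
         END {
             if (bases == 0) { print "no sequence data"; exit }
             printf "reads=%d mean_length=%.1f gc_percent=%.1f duplicate_percent=%.1f\n",
                    reads, bases / reads, 100 * gc / bases, 100 * (reads - distinct) / reads
         }' "$1"
}

# Usage: fastq_stats file1.fastq
```

FastQC remains the better choice for real data, since it also reports per-base quality, per-position content and overrepresented sequences, and it reads compressed input.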

# Download and unzip a FastQC release (v0.11.9 is used here; check the project page for newer releases):

wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.9.zip

unzip fastqc_v0.11.9.zip

# Make the fastqc wrapper script executable (sudo is not needed for files you unpacked yourself):

cd FastQC

chmod 755 fastqc

# Create a link in your $PATH (e.g. /usr/local/bin):

sudo ln -s /path/to/FastQC/fastqc /usr/local/bin/fastqc
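
Before moving on, it can help to confirm that the link works. The sketch below is one possible check, not an official installation step. The function name check_fastqc is hypothetical. It relies on the fact that FastQC is a Java application, so it looks for both java and fastqc in $PATH and then asks FastQC for its version.

```shell
# check_fastqc
# Verify that java and fastqc can be found in $PATH, then print the
# FastQC version. Returns 1 if either tool is missing.
check_fastqc() {
    for tool in java fastqc; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found in PATH" >&2
            return 1
        fi
    done
    fastqc --version
}

# Usage: check_fastqc
```

If the function reports that fastqc is missing, check that /path/to/FastQC/fastqc in the ln command was replaced with the actual location of the unzipped directory.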

# Launch the interactive interface from any location:

fastqc

# Or run it non-interactively on one or more FASTQ files from any directory:

fastqc file1.fastq file2.fastq .... fileN.fastq
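
For larger batches it is often convenient to collect all reports in one directory and to scan the results for problems. The following is a minimal sketch under stated assumptions. The function names run_fastqc_batch and list_flagged_modules, and the THREADS variable, are hypothetical and introduced here for illustration. The FastQC options used (--outdir, --threads, --extract) are standard, and summary.txt is the tab-separated status file found inside each extracted report directory.

```shell
# run_fastqc_batch OUTDIR FILE...
# Run FastQC on all given files and write the reports to OUTDIR.
# THREADS is a hypothetical environment variable for this sketch (default: 2).
run_fastqc_batch() {
    outdir=$1
    shift
    mkdir -p "$outdir"    # FastQC does not create the output directory itself
    fastqc --outdir "$outdir" --threads "${THREADS:-2}" --extract "$@"
}

# list_flagged_modules SUMMARY
# Print the modules marked WARN or FAIL in an extracted summary.txt
# (tab-separated columns: status, module name, input file name).
list_flagged_modules() {
    awk -F '\t' '$1 == "WARN" || $1 == "FAIL" { print $1 ": " $2 }' "$1"
}

# Usage:
#   run_fastqc_batch qc_reports file1.fastq file2.fastq
#   list_flagged_modules qc_reports/file1_fastqc/summary.txt
```

A WARN or FAIL flag is only a hint to inspect the corresponding plot in the HTML report. Whether it indicates a real problem depends on the library type and the experiment.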