Many FASTA reference files (e.g. those downloaded from the UCSC and NCBI FTP servers) contain duplicated sequences. These duplicates not only decrease the number of uniquely mapped reads but may also interfere with downstream processing by other software packages (e.g. read quantification with eXpress).
The faFilter utility offers a reliable way to remove duplicated reference sequences from any FASTA file.
# Download the faFilter binary and make it executable:
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/faFilter
chmod +x faFilter
# Create a symbolic link in a directory on your $PATH (e.g. /usr/local/bin):
sudo ln -s /path/to/faFilter/faFilter /usr/local/bin/faFilter
# Apply to a FASTA reference file (-uniq keeps the first record for each duplicated sequence ID):
faFilter -uniq reference.fa reference_no_duplicates.fa
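To check whether a reference actually contains duplicates, and to confirm that the filtered file no longer does, a small helper along the following lines may be useful. This is only a sketch: the function name list_duplicate_ids is hypothetical, and it compares sequence IDs only (the first word of each header line), not the sequence content itself.

```shell
# Print every sequence ID that occurs more than once in a FASTA file.
# The ID is taken to be the first word of a header line, without the leading ">".
list_duplicate_ids() {
    awk '/^>/ {print substr($1, 2)}' "$1" | sort | uniq -d
}

# Hypothetical usage with the file names from the commands above:
#   list_duplicate_ids reference.fa
#   list_duplicate_ids reference_no_duplicates.fa
```

If the second call prints nothing, no sequence ID occurs more than once in the filtered file.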