BOWTIE ALIGNMENT USING COMMAND LINES
Import data
We first create a working directory for our bowtie alignment and download the required input data into it:
wget https://ftp.flybase.net/genomes/dmel/dmel_r6.54_FB2023_05/fasta/dmel-all-chromosome-r6.54.fasta.gz \
https://psilo.sorbonne-universite.fr/index.php/s/p6SEEQGQw39NJ3N/download/GRH-103.fastq.gz
We also need to uncompress the .gz files.
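A minimal sketch of the decompression step, assuming both archives were downloaded into the current directory under the file names used in the wget command. The helper name decompress_if_present is only illustrative and is not part of the original tutorial.

```shell
# Hypothetical helper: gunzip each archive that is present,
# and report the ones that are not.
decompress_if_present() {
    for archive in "$@"; do
        if [ -f "$archive" ]; then
            gunzip "$archive"    # replaces file.gz with file
        else
            echo "not found: $archive" >&2
        fi
    done
}

decompress_if_present dmel-all-chromosome-r6.54.fasta.gz GRH-103.fastq.gz
```

After this step the working directory should contain dmel-all-chromosome-r6.54.fasta and GRH-103.fastq, which are the file names used in the following steps.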
Install required packages
We will need the bowtie and samtools programs.
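The installation method depends on your system. As a sketch, the snippet below only checks whether the programs are already on the PATH and, if not, prints one possible install command. That command assumes a conda setup with the bioconda channel, which is an assumption and not something stated in this tutorial. The helper name missing_tools is illustrative.

```shell
# Print the name of every requested program that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools bowtie bowtie-build samtools)
if [ -n "$missing" ]; then
    echo "Missing programs:" $missing
    # Assumption: conda with the bioconda channel is available.
    echo "One possible install: conda install -c bioconda -c conda-forge bowtie samtools"
fi
```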
Clip the sequencing adapter from the fastq reads and output the clipped sequences in fasta format
cat GRH-103.fastq | \
perl -ne 'if (/^([GATC]{18,})TGGAATT/){$count++; print ">$count\n"; print "$1\n"}' \
> clipped_GRH-103.fa
Prepare the dmel.r6.54 bowtie index
The following command line is masked. Before unmasking it, you can try to find the appropriate command line using the man command or the --help argument.
Bowtie indexing command line
Note that the time command here only reports the time taken to index the genome; it is optional.
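For readers who want to compare with their own attempt, one plausible form of the indexing command is sketched here. It assumes the uncompressed genome file from the download step and uses the index basename dmel.r6.54 that appears in the alignment command. The variable names and the guard around the call are illustrative additions.

```shell
GENOME=dmel-all-chromosome-r6.54.fasta   # uncompressed genome fasta
INDEX=dmel.r6.54                         # basename of the bowtie index files

# Only run the indexing if the program and the genome file are present.
if command -v bowtie-build >/dev/null 2>&1 && [ -f "$GENOME" ]; then
    time bowtie-build "$GENOME" "$INDEX"
else
    echo "bowtie-build or $GENOME not found; skipping indexing" >&2
fi
```

bowtie-build takes the reference fasta first and the index basename second; the basename is what is later passed to bowtie as its first argument.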
Align the clipped fasta reads to dmel.r6.54 using bowtie
time bowtie dmel.r6.54 -f clipped_GRH-103.fa \
-v 0 \
-k 1 \
-p 7 \
--al dmel_matched_GRH-103.fa \
--un unmatched_GRH-103.fa \
-S \
> GRH-103.sam
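As a quick sanity check after the alignment, we can count the fasta records in the input file and in the two files written by --al and --un. This is a sketch, not part of the original tutorial, and the helper name count_fasta_records is illustrative. With the options used above, the matched and unmatched counts would be expected to add up to the number of clipped reads.

```shell
# Print "file<TAB>number of records" for each fasta file that exists.
# A record is counted as one header line starting with ">".
count_fasta_records() {
    for fasta in "$@"; do
        if [ -f "$fasta" ]; then
            printf '%s\t%s\n' "$fasta" "$(grep -c '^>' "$fasta")"
        fi
    done
}

count_fasta_records clipped_GRH-103.fa dmel_matched_GRH-103.fa unmatched_GRH-103.fa
```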