BOWTIE ALIGNMENT USING galaxy

Import data¶

Rename the Unnamed history to Bowtie using the pencil icon
Go to Upload Data (to the left bar) and select Paste/Fetch Data

Paste the following content

https://ftp.flybase.net/genomes/dmel/dmel_r6.54_FB2023_05/fasta/dmel-all-chromosome-r6.54.fasta.gz
https://psilo.sorbonne-universite.fr/index.php/s/HYLtfo9d2eD3Q2A/download/GRH-103_R1.fastq.gz

And click the start button
Check the imported datasets in the history bar
Check the content of the imported datasets by clicking the eye icon in each dataset

Install required packages¶

Required packages (bowtie and samtools) are already installed in your Galaxy server

Clip fastq reads from their sequence adapter and output clipped sequences in a fasta format¶

type "clip adapter" in the search toolbar box
select the Clip adapter Galaxy toolbar
Fill the tool form as following, indicating which file to clip, the min and max sizes of the reads you wish to keep in the processed dataset, that you want a fasta output, do no want N in the retrieved clipped reads, and that the adapter in the dataset is the Illumina TruSeq adapter.

Clip adapter parameters

Source file: 2: GRH-103_R1.fastq.gz
min size: 18
max size: 36
Select output format: fasta
Accept reads containing N?: reject
Source: Use a built-in adapter (select from the list below)
Select Adapter to clip: Illumina TruSeq TGGAATTCTCGGGTGCCAAGTGGAAT

clip tool

Click the Execute icon

Check the result in the history:

how many clipped sequences ? → click on the dataset to deploy it
which format ?
How do the sequences look like ? → click on the eye icon

Prepare dmel_r6.54 bowtie index¶

No need to prepare the bowtie index, the next tool will do it for us on the fly

Align the clipped fasta reads to dmel.r6.54 using `bowtie`¶

In the search toolbar box, type bowtie
Select the tool sR_bowtie for small RNA short reads

sR_bowtie for small RNA short reads parameters

Input fasta or fastq file: reads clipped from their adapter: Clipped GRH-103_R1.fastq.gz-then-fasta
What kind of matching do you want to do?: Match on DNA as fast as possible, ...
Number of mismatches allowed: 0
Will you select a reference genome from your history or use a built-in index?: Use one from the history
Select a fasta file, to serve as index reference: dmel-all-chromosome-r6.54.fasta
Select output format: bam
additional fasta output: both aligned and unaligned

Examine the output datasets (Bowtie Output, Matched reads and Unmatched reads)

Convert SAM file to BAM file and sort the alignments by chromosome positions¶

This is automatically done by Galaxy