BOWTIE ALIGNMENT USING GALAXY
Import data¶
- Rename the "Unnamed history" to "Bowtie" using the pencil icon
- Go to Upload Data (in the left bar) and select Paste/Fetch Data
- Paste the following content
- Click the Start button
- Check the imported datasets in the history bar
- Check the content of the imported datasets by clicking the eye icon in each dataset
Install required packages¶
The required packages (bowtie and samtools) are already installed on your Galaxy server.
Clip fastq reads from their sequence adapter and output clipped sequences in a fasta format¶
- Type "clip adapter" in the search toolbar box
- Select the Clip adapter tool from the Galaxy toolbar
- Fill in the tool form as follows, indicating which file to clip, the min and max sizes of the reads you wish to keep in the processed dataset, that you want a fasta output, that you do not want N in the retrieved clipped reads, and that the adapter in the dataset is the Illumina TruSeq adapter.
Clip adapter parameters
- Source file: 2: GRH-103_R1.fastq.gz
- min size: 18
- max size: 36
- Select output format: fasta
- Accept reads containing N?: reject
- Source: Use a built-in adapter (select from the list below)
- Select Adapter to clip: Illumina TruSeq TGGAATTCTCGGGTGCCAAGTGGAAT
- Click the Execute icon
Check the result in the history:
- How many clipped sequences are there? → click on the dataset to expand it
- Which format is the dataset in?
- What do the sequences look like? → click on the eye icon
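
The clipping step can be pictured with a short Python sketch. It is not the Galaxy tool's implementation: the function names, the 6-base seed used to locate the adapter, and the choice to discard reads in which no adapter is found are assumptions made for illustration. Only the adapter sequence, the 18 to 36 size window, the fasta output and the rejection of N come from the form above.

```python
# Illumina TruSeq adapter, as selected in the Clip adapter form
ADAPTER = "TGGAATTCTCGGGTGCCAAGTGGAAT"


def clip_read(seq, adapter=ADAPTER, min_size=18, max_size=36, seed=6):
    """Return the clipped read, or None if the read is rejected.

    Assumption: the adapter is located by an exact match of its first
    `seed` bases, because long inserts leave only part of the adapter
    inside the read.
    """
    pos = seq.find(adapter[:seed])
    if pos == -1:
        return None  # no adapter found (assumed behaviour: discard)
    clipped = seq[:pos]
    if not min_size <= len(clipped) <= max_size:
        return None  # outside the requested size window
    if "N" in clipped:
        return None  # "Accept reads containing N?" is set to reject
    return clipped


def fastq_to_clipped_fasta(fastq_lines):
    """Yield fasta lines for every read that survives clipping."""
    kept = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the 2nd line of each 4-line record
            clipped = clip_read(line.strip())
            if clipped is not None:
                kept += 1
                yield f">{kept}"  # hypothetical header: a running number
                yield clipped
```

Feeding the lines of an uncompressed fastq file to `fastq_to_clipped_fasta` yields header and sequence lines in turn, which is the two-line-per-read layout to expect when you inspect the clipped dataset with the eye icon.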
Prepare dmel_r6.54 bowtie index¶
There is no need to prepare the bowtie index: the next tool builds it for us on the fly.
Align the clipped fasta reads to dmel.r6.54 using bowtie¶
- In the search toolbar box, type "bowtie"
- Select the tool sR_bowtie for small RNA short reads
sR_bowtie for small RNA short reads parameters
- Input fasta or fastq file: reads clipped from their adapter: Clipped GRH-103_R1.fastq.gz-then-fasta
- What kind of matching do you want to do?: Match on DNA as fast as possible, ...
- Number of mismatches allowed: 0
- Will you select a reference genome from your history or use a built-in index?: Use one from the history
- Select a fasta file, to serve as index reference: dmel-all-chromosome-r6.54.fasta
- Select output format: bam
- additional fasta output: both aligned and unaligned
Examine the output datasets (Bowtie Output, Matched reads and Unmatched reads).
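
One quick way to examine the two fasta outputs is to compare their read counts. The sketch below is only an illustration: the function names are hypothetical, and it assumes that each clipped read appears exactly once, in either the Matched reads or the Unmatched reads dataset.

```python
def count_fasta_records(lines):
    """Count sequences in fasta text by counting header lines."""
    return sum(1 for line in lines if line.startswith(">"))


def alignment_rate(matched_lines, unmatched_lines):
    """Fraction of reads found in the matched dataset."""
    matched = count_fasta_records(matched_lines)
    unmatched = count_fasta_records(unmatched_lines)
    total = matched + unmatched
    return matched / total if total else 0.0
```

After downloading the two datasets, passing their open file handles to `alignment_rate` gives the share of clipped reads that bowtie placed on the genome with 0 mismatches. The sum of the two counts should equal the number of clipped sequences reported for the Clip adapter output.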
Convert SAM file to BAM file and sort the alignments by chromosome positions¶
Galaxy does this automatically, so no separate step is needed.
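
To make "sorted by chromosome positions" concrete, here is a toy Python sketch of coordinate sorting. It is not what Galaxy runs internally: the tuple layout and the function name are assumptions for illustration. The one point it is meant to show is that chromosomes are ordered as they are listed in the reference, not alphabetically, and alignments are then ordered by position within each chromosome.

```python
def coordinate_sort(alignments, reference_order):
    """Sort (chromosome, position, read_name) tuples by reference order,
    then by position on the chromosome."""
    rank = {name: i for i, name in enumerate(reference_order)}
    return sorted(alignments, key=lambda aln: (rank[aln[0]], aln[1]))


# Hypothetical example: chromosome names and positions are made up
reference_order = ["2L", "2R", "3L", "X"]
alignments = [
    ("X", 500, "read1"),
    ("2L", 1200, "read2"),
    ("2R", 10, "read3"),
    ("2L", 30, "read4"),
]
sorted_alignments = coordinate_sort(alignments, reference_order)
```

A coordinate-sorted file is what genome browsers and indexing tools expect, which is why the sort is applied before the bam dataset appears in the history.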