HTSeq_counts

Use of htseq-count tool on PRJNA630433 datasets

Create a new history named PRJNA630433 htseq-count Counting on HISAT2 bam alignments, and copy the input datasets you need from another history (the previous FeatureCounts one, for instance):

  • The 3 collections Dc, Mo and Oc HISAT2 alignments
  • The GTF Mus_musculus.GRCm38.102.chr.gtf

htseq-count settings

  • Aligned SAM/BAM File

    → Click on the collection icon and select Dc HISAT2 alignments (BAM)

  • GFF File

    → select Mus_musculus.GRCm38.102.chr.gtf. This is a good opportunity to note that the GTF format is a specific case of the more general GFF format.

  • Mode → Union. See the help section of the tool for a detailed description of the possible modes.

  • Stranded

    → select Reverse. By now, you should not need to ask why!

  • Minimum alignment quality

    → leave at 10

  • Feature type

    → leave exon. If you are working with a bacterial genome, you may have to set gene here, since there is no splicing in bacteria.

  • ID Attribute

    → leave gene_id. This may have to be tweaked with some non-standard GTF files.

  • Set advanced options

    → Leave default settings

  • Click Execute (or Run Tool with the latest Galaxy version)
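
The Feature type and ID Attribute settings above refer to two parts of each GTF record: the third column (feature type, here exon) and the gene_id key in the ninth column (attributes). The following Python sketch illustrates this on a made-up record; the gene and transcript names are hypothetical, and the attribute parsing is simplified (it assumes no semicolon appears inside a quoted value).

```python
def parse_gtf_line(line):
    """Return (feature_type, attributes) for one tab-separated GTF record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF record has 9 tab-separated columns")
    feature_type = fields[2]          # third column: exon, gene, CDS, ...
    attributes = {}
    for item in fields[8].split(";"):  # ninth column: key "value"; pairs
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip('"')
    return feature_type, attributes


# Made-up record written in the Ensembl GTF style (hypothetical names).
example = (
    "chr1\texample\texon\t1000\t1200\t.\t+\t.\t"
    'gene_id "GENE_A"; transcript_id "TX_A1"; exon_number "1";'
)
feature_type, attributes = parse_gtf_line(example)
print(feature_type, attributes["gene_id"])
```

With Feature type set to exon and ID Attribute set to gene_id, reads overlapping records like this one are counted under the gene named in gene_id.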

Repeat the exact same operation twice for the collections Mo and Oc HISAT2 alignments

💡 Use the rerun functionality! 💡 Do not wait for the first htseq-count run to finish before rerunning the tool on the 2 other collections!
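
For readers curious about what these settings look like outside Galaxy, here is a rough Python sketch that assembles a comparable htseq-count command line. It only builds and prints the commands, it does not run them. The BAM file names are hypothetical placeholders, and the exact command that the Galaxy wrapper generates may differ (it is an assumption that the settings map one-to-one to these flags).

```python
import shlex


def htseq_count_command(bam_path, gtf_path):
    """Build an htseq-count argument list mirroring the settings chosen above."""
    return [
        "htseq-count",
        "-f", "bam",        # input format
        "-m", "union",      # Mode
        "-s", "reverse",    # Stranded
        "-a", "10",         # Minimum alignment quality
        "-t", "exon",       # Feature type
        "-i", "gene_id",    # ID Attribute
        bam_path,
        gtf_path,
    ]


gtf = "Mus_musculus.GRCm38.102.chr.gtf"
# One hypothetical BAM file per collection; real collections hold several files.
for bam in ["Dc_sample.bam", "Mo_sample.bam", "Oc_sample.bam"]:
    print(shlex.join(htseq_count_command(bam, gtf)))
```

Only the input BAM changes between the three runs, which is why the rerun functionality is convenient here.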

MultiQC

Unfortunately, the MultiQC tool works poorly with the latest version of htseq-count.

MultiQC settings

  • 1: Results/ Which tool was used generate logs?

    → HTSeq

  • 1: Results/ Output of HTSeq

    → Click on the collection icon, then select the three collections generated by htseq-count and suffixed with (no feature)

  • Click Execute (or Run Tool in the latest Galaxy version)

Examine the results by clicking the eye icon of the generated collection MultiQC... ...others:Webpage

The result is misleading, since the count of reads properly aligned to the genome features is not reported in any of the htseq-count outputs!
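
Although this total is not reported directly, it can be derived by summing the per-gene counts. In htseq-count output, the special counters (such as __no_feature or __ambiguous) are the rows whose name starts with a double underscore; every other row is a gene. The Python sketch below works on made-up numbers and hypothetical gene names. It accepts rows from either output (counts or no feature), and skipping rows with a non-numeric second column is an assumption to tolerate a possible header line.

```python
def summarize_htseq(lines):
    """Return (assigned_reads, special_counters) from htseq-count rows."""
    assigned = 0
    special = {}
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            count = int(value)
        except ValueError:
            continue  # assumed header line, not a count row
        if name.startswith("__"):
            special[name] = count   # e.g. __no_feature, __ambiguous
        else:
            assigned += count       # reads assigned to a gene
    return assigned, special


# Made-up rows: two hypothetical genes followed by the special counters.
example_rows = [
    "GENE_A\t120",
    "GENE_B\t30",
    "__no_feature\t40",
    "__ambiguous\t10",
    "__too_low_aQual\t0",
    "__not_aligned\t0",
    "__alignment_not_unique\t50",
]
assigned, special = summarize_htseq(example_rows)
print("assigned to features:", assigned)
print("no feature:", special["__no_feature"])
```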

Compare the counts produced by the tools featureCounts and htseq-count

This is your job!

Using Galaxy tools you should be able to find a method (there are several possible methods) to show that counts produced by featureCounts and htseq-count are identical in this use case at least...
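
The exercise is meant to be solved with Galaxy tools. As a point of comparison only, the underlying logic (join the two tables on the gene identifier, then look for rows that differ) can be sketched in Python. The rows below are made up, the gene names are hypothetical, and the table layouts (a header line in the featureCounts table, trailing special counters in the htseq-count table) are assumptions about the exported files.

```python
def read_counts(lines):
    """Map gene identifier to count, skipping headers, comments and __ counters."""
    counts = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2 or parts[0].startswith(("#", "__")):
            continue
        try:
            counts[parts[0]] = int(parts[-1])  # count taken from the last column
        except ValueError:
            continue  # assumed header line
    return counts


def differing_genes(counts_a, counts_b):
    """Return the sorted gene identifiers whose counts differ or are missing."""
    genes = counts_a.keys() | counts_b.keys()
    return sorted(g for g in genes if counts_a.get(g) != counts_b.get(g))


# Made-up tables for one hypothetical sample.
featurecounts_rows = ["Geneid\tsample", "GENE_A\t120", "GENE_B\t30"]
htseq_rows = ["GENE_A\t120", "GENE_B\t30", "__no_feature\t40"]

diff = differing_genes(read_counts(featurecounts_rows), read_counts(htseq_rows))
print("identical" if not diff else f"differences in: {diff}")
```

An empty list of differing genes is the outcome expected in this use case; any gene listed would deserve a closer look.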