Filtering datasets to remove or trim low quality sequences

This step is optional; about half of the attendees should perform it.

Cutadapt with single reads


  1. Create a new history named Cutadapt (wheel --> Create New)
  2. Copy the fastq files from the RNAseq data library to this new history (wheel --> Copy datasets)
  3. Select the Cutadapt tool
  4. Start by selecting Single-end in the Single-end or Paired-end reads? menu
  5. Select the multiple datasets button for this menu
  6. Cmd-click (Ctrl-click on Windows/Linux) to select the non-adjacent single-read fastq.gz files (3 datasets)
  7. Filter Options
    • Minimum length: 20
  8. Read Modification Options
    • Quality cutoff: 20
  9. Output Options
    • Report: Yes
  10. Do not change the other available parameters and click Execute
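
For attendees curious about what these settings mean outside Galaxy, here is a minimal Python sketch that assembles a roughly equivalent cutadapt command line. It is an illustration only, not what the Galaxy wrapper necessarily runs: the function name, the input file name and the trimmed_ output prefix are hypothetical, while -q (quality cutoff), -m (minimum length) and -o (output file) are standard cutadapt options.

```python
import shlex


def build_single_end_command(fastq, min_length=20, quality_cutoff=20):
    """Return a cutadapt argument list for one single-end FASTQ file."""
    trimmed = "trimmed_" + fastq  # hypothetical output naming scheme
    return [
        "cutadapt",
        "-q", str(quality_cutoff),  # Read Modification Options > Quality cutoff
        "-m", str(min_length),      # Filter Options > Minimum length
        "-o", trimmed,              # trimmed reads are written here
        fastq,
    ]


# "sample_single.fastq.gz" is a placeholder, not a file from the data library.
cmd = build_single_end_command("sample_single.fastq.gz")
print(shlex.join(cmd))
# cutadapt -q 20 -m 20 -o trimmed_sample_single.fastq.gz sample_single.fastq.gz
```

The sketch only builds and prints the command; passing the list to subprocess.run would execute it, provided cutadapt is installed locally.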

Cutadapt with paired-end reads


Repeat the same procedure as above, but select Paired-end in step 4. The quickest way is to click the re-run button on one of the Cutadapt datasets and change Single-end to Paired-end.

  • Then you have two input boxes, one for file #1 and one for file #2.

  • In the file #1 box, click the multiple datasets button and carefully select the fastq.gz files with the _1 suffix

  • In the file #2 box, click the multiple datasets button and carefully select the fastq.gz files with the _2 suffix

  • Do not change the other parameters (they are set to the same value as previously because you used the re-run button).

  • Click the Execute button
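
Matching each _1 file with its _2 mate is the error-prone part of this step. The sketch below shows one way to check the pairing programmatically and to build a paired-end cutadapt command. It assumes file names follow the pattern SAMPLE_READ_REST, as in GSM461181_2_treat_paired.fastq.gz; the function names and the trimmed_ output prefix are hypothetical. The -p option is cutadapt's standard flag for the second output file of a pair.

```python
import re

# Assumed naming pattern: <sample>_<1 or 2>_<rest>, e.g. GSM461181_2_treat_paired.fastq.gz
NAME = re.compile(r"^(?P<sample>[^_]+)_(?P<read>[12])_(?P<rest>.+)$")


def pair_reads(names):
    """Group file names into (file #1, file #2) tuples, failing on orphans."""
    groups = {}
    for name in names:
        m = NAME.match(name)
        if m is None:
            raise ValueError(f"unexpected file name: {name}")
        groups.setdefault((m["sample"], m["rest"]), {})[m["read"]] = name
    pairs = []
    for key in sorted(groups):
        reads = groups[key]
        if set(reads) != {"1", "2"}:
            raise ValueError(f"incomplete pair for sample {key[0]}")
        pairs.append((reads["1"], reads["2"]))
    return pairs


def build_paired_end_command(read1, read2, min_length=20, quality_cutoff=20):
    """Return a cutadapt argument list for one pair of FASTQ files."""
    return [
        "cutadapt",
        "-q", str(quality_cutoff),
        "-m", str(min_length),
        "-o", "trimmed_" + read1,  # hypothetical output name for file #1
        "-p", "trimmed_" + read2,  # hypothetical output name for file #2
        read1, read2,
    ]


# The _1 name is inferred from the _2 name used later in this tutorial.
pairs = pair_reads([
    "GSM461181_2_treat_paired.fastq.gz",
    "GSM461181_1_treat_paired.fastq.gz",
])
paired_cmd = build_paired_end_command(*pairs[0])
```

Raising an error on an incomplete pair is deliberate: a missing mate is easier to fix at this point than after trimming.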


Run MultiQC on Cutadapt jobs


  1. Select the MultiQC tool
  2. Select Cutadapt/Trim Galore! in the menu Which tool was used generate logs?
  3. Cmd-click to select the Report datasets generated by Cutadapt
  4. Press Execute
  5. Now, the boring but essential job: carefully rename the Output datasets generated by Cutadapt. To find out which input each output came from, use the Info button at the bottom of the green dataset boxes.

    Example: Rename Cutadapt on data 10 and data 9: Read 2 Output to GSM461181_2_treat_paired.fastq.gz

  6. Trash the 11 original (unfiltered, untrimmed) fastq.gz files. This is important to avoid mixing filtered and unfiltered datasets in the next steps.
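
The renaming logic of step 5 can be sketched in Python as follows. This is only an illustration of the reasoning, not a Galaxy feature: the function name is hypothetical, the history numbers 9 and 10 are assumed to point to the _1 and _2 files of sample GSM461181, and the handling of single-input names is a guess based on the paired example above.

```python
import re

# Default output name as shown in the example of step 5.
DEFAULT_NAME = re.compile(r"^Cutadapt on (?P<inputs>.+): Read (?P<read>[12]) Output$")


def original_name(default_name, history):
    """Find the input file name a Cutadapt output should be renamed to.

    history maps history item numbers to input file names (what the
    Info button lets you look up by hand).
    """
    m = DEFAULT_NAME.match(default_name)
    if m is None:
        raise ValueError(f"not a Cutadapt output name: {default_name}")
    numbers = [int(n) for n in re.findall(r"data (\d+)", m["inputs"])]
    candidates = [history[n] for n in numbers]
    if len(candidates) == 1:
        return candidates[0]  # single-end run: only one possible input
    tag = f"_{m['read']}_"    # "_1_" or "_2_" inside the file name
    matches = [c for c in candidates if tag in c]
    if len(matches) != 1:
        raise ValueError(f"cannot resolve {default_name}")
    return matches[0]


# Hypothetical history content; check the real numbers with the Info button.
history = {
    9: "GSM461181_1_treat_paired.fastq.gz",
    10: "GSM461181_2_treat_paired.fastq.gz",
}
new_name = original_name("Cutadapt on data 10 and data 9: Read 2 Output", history)
print(new_name)  # GSM461181_2_treat_paired.fastq.gz
```

Choosing the input by its read tag rather than by its position in the default name avoids assuming which of the two listed datasets is file #1.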