Import files in Galaxy using the edited file metadata
Now that we have built the appropriate metadata file, we are going to use it to import the GDC data into a Galaxy user account. You therefore need an account on a Galaxy server.
Import the metadata file in a Galaxy History
- Create a new Galaxy history by clicking on the "+" of the top right menu
- Rename your history "TARGET Expression datasets"
- Open the upload panel by clicking on the upload icon
- Select the "Choose local file" tab and your edited gdc_sample_sheet.2020-02-19.tsv (see previous section)
- Click on "Start" and close the panel. The edited metadata file should now be in your "TARGET Expression datasets" history.
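The same three actions (create a history, name it, upload the sheet) can also be scripted. The sketch below is only an illustration under stated assumptions: it assumes the BioBlend Python library is installed and that gi is a bioblend.galaxy.GalaxyInstance connected to your server. The function name upload_metadata_sheet is hypothetical, and the tutorial itself only uses the web interface.

```python
def upload_metadata_sheet(gi, sheet_path, history_name="TARGET Expression datasets"):
    """Create a history and upload the edited sample sheet into it.

    Assumption: `gi` is a bioblend.galaxy.GalaxyInstance (or any object
    exposing the same `histories` and `tools` clients).
    """
    # Equivalent of clicking "+" and renaming the history.
    history = gi.histories.create_history(name=history_name)
    # Equivalent of choosing the local file and clicking "Start".
    gi.tools.upload_file(sheet_path, history["id"])
    return history["id"]
```

With BioBlend, gi would typically be created with GalaxyInstance(url=..., key=...) from bioblend.galaxy, using your own server address and API key, and then passed to the function together with the path to gdc_sample_sheet.2020-02-19.tsv.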
Import the expression datasets specified in the metadata file
- Click again on the upload icon
- This time, select the "Paste/Fetch data" tab
- Click on the "Rule-based" tab
- Select Upload data as: "Collection(s)", Load tabular data from: "History Dataset", Select dataset to load: "gdc_sample_sheet.2020-02-19.tsv"

  You should see the content of the metadata file appearing.
- Click on the "Build" button
- On the left-hand side of the panel, there is a "Rules" section, and a link to click on
- Then, click a first time on "Add Definition" and select "URL"

  You should now see that column A will be recognized as providing the URLs of the datasets to download.
- Click a second time on "Add Definition" and select "List Identifier(s)". Further select the "B" column.
- Finish the Rules settings by clicking on "Apply"

  Now, there is one task remaining: we have to tell Galaxy that the first line of the metadata file contains the column headers and should not be treated as data.
- Click on the "Filter" button, select "Matching a Supplied Value", from column "A" (should be already selected), paste "File ID" in the empty field, and finally check the "Invert filter" checkbox.

  Here is our filter: we do not want to consider lines that contain the string "File ID" in column A.
- Click on the "Apply" button
- Finally, give a name to the future collection of downloaded files in the "Name" field, and click the "Upload" button.

  You should see an upload banner.

  Now you just need to wait: the upload of the ~200 files, 1.6 GB each, is expected to take less than 5 min.
- After completion of the upload, take a look at your dataset collection:
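To make the rule settings of this section concrete, here is a minimal Python sketch of what they select: column A is read as the URL, column B as the list identifier, and the inverted "Matching a Supplied Value" filter drops the header line. This is not how Galaxy implements the rule builder. The function name apply_rules, the example URLs and the example file names are all made up for illustration.

```python
import csv
import io


def apply_rules(sheet_text, header_value="File ID"):
    """Return {list identifier: URL} as the rule settings would select them."""
    collection = {}
    for row in csv.reader(io.StringIO(sheet_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        url, identifier = row[0], row[1]  # column A and column B
        # Inverted "Matching a Supplied Value" filter on column A:
        # keep only the rows whose first cell is NOT the header text.
        if url == header_value:
            continue
        collection[identifier] = url
    return collection


# Made-up sheet: the URLs and file names are placeholders, not real GDC entries.
example = (
    "File ID\tFile Name\n"
    "https://example.org/data/aaa\tsample_1.txt\n"
    "https://example.org/data/bbb\tsample_2.txt\n"
)
print(apply_rules(example))
```

Running the same function on the text of your own edited sheet gives a quick way to check that the number of entries matches the number of elements in the uploaded collection.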