How to submit metadata and upload data using morphic-util

dipayan1985 edited this page May 23, 2024 · 2 revisions

Requirements

morphic-util is a command-line tool that assists with data upload to the MorPhic DRACC AWS S3 storage. To use morphic-util you will need:

  1. Python3 installed on your computer
  2. Basic knowledge of how to navigate the command line and run commands in a terminal, e.g. cd, ls, pwd
  3. AWS username and password, provided to you by the DRACC team

If you are missing any of the prerequisites above, please contact the MorPhic DRACC helpdesk at helpdesk@morphic.bio or on the dedicated Slack channel.

Installation

Check Python 3 Installation

Check that you have Python 3 installed by opening a terminal and running the following command:

which python3

It should return the location of your Python 3 installation e.g.

/path/to/bin/python3

If nothing is returned it means you do not have Python 3 installed.
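
A slightly more informative check can be sketched as follows. It uses `command -v`, the POSIX counterpart of `which`, and also prints the interpreter version. The function name `check_python3` is made up for this illustration and is not part of morphic-util.

```shell
# Hypothetical helper: report where python3 lives and which version it is.
check_python3() {
    if command -v python3 >/dev/null 2>&1; then
        echo "python3 found at: $(command -v python3)"
        python3 --version
    else
        echo "python3 not found on PATH" >&2
        return 1
    fi
}

check_python3 || echo "Install Python 3 before continuing."
```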

Many online tutorials cover installing Python, so follow one that suits your operating system, or contact your system administrator for help. One example can be found here: https://realpython.com/installing-python/

Install morphic-util with pip

Once you have installed Python 3, run the following command in your terminal to install morphic-util:

pip3 install morphic-util

Once installed, you can see the available commands by running the following command in your terminal:

morphic-util -h

If this succeeds, continue to the Configuration section.

Depending on your Python installation, you may need to use sudo to install the tool in your system directory. If the pip3 install command above gives a Permission denied error, try the following command:

sudo pip3 install morphic-util

You will then be prompted for your system password. If this succeeds, continue to the Configuration section below.

Configuration

You will need to configure the morphic-util tool the first time you use it. The configuration creates an AWS profile called morphic-util that will give you the appropriate permissions to upload to your upload area.

To configure your morphic-util tool, use the following command with the AWS username and password that you have obtained from your DRACC administrator.

morphic-util config AWS_USERNAME AWS_PASSWORD

This would look something like:

$ morphic-util config myUserName my#Pass#12345
Credentials saved.
Valid credentials
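
Typing the password directly on the command line usually leaves it in your shell history. One possible workaround, shown here as a minimal sketch, is a small wrapper that prompts for both values and then calls the documented config command. The function name `configure_morphic` and the `MORPHIC_UTIL` override variable are assumptions introduced for this example; they are not part of morphic-util.

```shell
# Hypothetical wrapper around the documented "morphic-util config" command.
# MORPHIC_UTIL is an assumed override (not a feature of the tool) that lets
# you dry-run the wrapper with another command, for example "echo".
configure_morphic() {
    printf 'AWS username: ' >&2
    IFS= read -r mu_user
    printf 'AWS password: ' >&2
    # Turn off terminal echo while the password is typed (terminals only).
    if [ -t 0 ]; then stty -echo; fi
    IFS= read -r mu_pass
    if [ -t 0 ]; then stty echo; fi
    printf '\n' >&2
    # Quoting keeps special characters in the password intact.
    "${MORPHIC_UTIL:-morphic-util}" config "$mu_user" "$mu_pass"
}

# Usage (interactive): configure_morphic
```

The password is still passed to morphic-util as a command-line argument, so it may be briefly visible in the process list while the command runs.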

Submit your metadata

You can prepare your study metadata file in tsv, csv or json format. An example tsv file will look similar to this file.

Register your study metadata

Request

morphic-util submit --type study --file <local_path_to_your_study_metadata_file>

Response

You will find your unique study ID in the response. Please keep a note of it, as you will need it in the next step.

Study created successfully: 664f1ef55a564312eb478177
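
If you script these steps, the ID can be captured rather than copied by hand. The sketch below assumes the response line has exactly the form shown above ("... created successfully: ID"); the helper name `extract_created_id` is hypothetical, and the output format of the tool may change between versions.

```shell
# Hypothetical helper: print the ID that follows "created successfully: "
# on the first matching line read from standard input.
extract_created_id() {
    sed -n 's/^.* created successfully: *\([^[:space:]]\{1,\}\).*$/\1/p' | head -n 1
}

# Demonstration using the sample response shown above.
printf 'Study created successfully: 664f1ef55a564312eb478177\n' | extract_created_id
# prints 664f1ef55a564312eb478177

# Usage with the real tool (study.tsv is a placeholder file name):
#   study_id=$(morphic-util submit --type study --file study.tsv | extract_created_id)
```

Because the helper matches only "created successfully" lines, it should also pick the dataset ID out of a dataset response that follows the same wording.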

Register your dataset

Request

A dataset does not require any metadata. If you want to provide some, use the --file option with the path to a file that contains the dataset metadata.

Request type 1

Passing the study ID while creating a dataset

morphic-util submit --type dataset --study 664f1ef55a564312eb478177

Response

You will find your unique dataset ID in the response. Note that the dataset is automatically linked to the study.

Dataset created successfully: 664f1fd35a564312eb478179
Linking dataset 664f1fd35a564312eb478179 to study 664f1ef55a564312eb478177
Dataset linked successfully to study: 664f1ef55a564312eb478177

Request type 2

Interactively creating and linking a dataset to a study

morphic-util submit --type dataset

Response

Dataset created successfully: 664f20265a564312eb47817b
Do you want to link this dataset to a study? (yes/no): yes
Input study id: 664f1ef55a564312eb478177
Linking dataset 664f20265a564312eb47817b to study 664f1ef55a564312eb478177
Dataset linked successfully to study: 664f1ef55a564312eb478177
  • Please note that an upload area identified by your dataset ID has now been created in Amazon S3 for uploading your data files.

Selecting your upload area

Once configured, you need to select your upload area using the dataset ID that you just created in the last step.

morphic-util select UPLOADAREANAME

For example:

$ morphic-util select 664f20265a564312eb47817b
Selected 664f20265a564312eb47817b

You are now ready to upload files to your upload area!

Uploading files

Once your upload area is selected, you can use the upload command to upload the files related to your project into the upload area. The command takes a path to a file, a space-separated list of paths to files, or a path to a directory. Sub-directories of a provided directory path are ignored.

To upload a single file or a space-separated list of files, specify the relative or absolute path to each file after the upload command. If file names contain spaces, the spaces must be escaped or the names enclosed in quotes:

morphic-util upload /path/to/file/file1.txt "/path/to/file/file 2.txt"

This could look something like:

$ morphic-util upload /path/to/file/sample1_R1.fastq.gz /path/to/file/sample1_R2.fastq.gz
Uploading...
/path/to/file/sample1_R1.fastq.gz 2845965046 / 2845965046.0  (100.00%)
/path/to/file/sample1_R2.fastq.gz 2845965046 / 2845965046.0  (100.00%)

$ morphic-util upload /path/to/file/sample1_R1.fastq.gz "/path/to/file/dissociation protocol.pdf" /path/to/file/enrichment\ protocol.pdf
Uploading...
/path/to/file/sample1_R1.fastq.gz 2845965046 / 2845965046.0  (100.00%)
/path/to/file/dissociation protocol.pdf 354 / 354.0  (100.00%)
/path/to/file/enrichment protocol.pdf 354 / 354.0  (100.00%)
Successful upload.
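
When a directory mixes data files with other material, a shell glob can select just the files you want. The shell expands the pattern into one argument per file, so names containing spaces need no extra quoting. The following is a minimal sketch; the function name `upload_fastq` and the `MORPHIC_UTIL` override variable are assumptions made for this example, not features of the tool.

```shell
# Hypothetical helper: upload only the .fastq.gz files found directly in DIR.
# MORPHIC_UTIL is an assumed override so the helper can be dry-run.
upload_fastq() (
    set -- "$1"/*.fastq.gz            # expand the glob into positional arguments
    if [ ! -e "$1" ]; then            # an unmatched glob leaves no existing file
        echo "no .fastq.gz files found" >&2
        exit 1                        # leaves only the function's subshell
    fi
    "${MORPHIC_UTIL:-morphic-util}" upload "$@"
)

# Usage: upload_fastq /path/to/file
```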

To upload all files in a directory, specify the path to the directory or use the . operator to upload all files in your current working directory:

morphic-util upload .

This would look something like:

$ morphic-util upload .
Uploading...
sample1_R1.fastq.gz  2845965046 / 2845965046.0  (100.00%)
sample1_R2.fastq.gz  2845965046 / 2845965046.0  (100.00%)
sample2_R1.fastq.gz  2845965046 / 2845965046.0  (100.00%)
sample2_R2.fastq.gz  2845965046 / 2845965046.0  (100.00%)
dissociation protocol.pdf 354 / 354.0  (100.00%)
enrichment protocol.pdf 354 / 354.0  (100.00%)
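
Because sub-directories are ignored, it can help to inspect a directory before uploading it. The sketch below lists the regular files directly inside a directory and, separately, the sub-directories whose contents would be skipped. The function names are hypothetical, and `-maxdepth` and `-mindepth` are `find` extensions available in the GNU and BSD versions rather than strict POSIX. Whether morphic-util uploads hidden files is not stated here, so treat the first list as an approximation.

```shell
# Hypothetical helpers for inspecting DIR before "morphic-util upload DIR".

# Print the regular files directly inside DIR (the likely upload candidates).
list_top_level_files() {
    find "$1" -maxdepth 1 -type f
}

# Print the sub-directories of DIR, whose contents would be ignored.
list_ignored_subdirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d
}

# Usage:
#   list_top_level_files .
#   list_ignored_subdirs .
```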

To check that all the files you expected to upload are present in your upload area, use the list command.

This should look something like:

$ morphic-util list
664f20265a564312eb47817b/sample1_R1.fastq.gz
664f20265a564312eb47817b/sample1_R2.fastq.gz
664f20265a564312eb47817b/sample2_R1.fastq.gz
664f20265a564312eb47817b/sample2_R2.fastq.gz
664f20265a564312eb47817b/dissociation protocol.pdf
664f20265a564312eb47817b/enrichment protocol.pdf
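
For a large upload, comparing the listing with your local directory by eye is error-prone. The sketch below reads the list output on standard input and prints the local file names that do not appear in it. It assumes each listing line has the form "upload area ID/file name", as in the example above; the function name `report_missing_uploads` is hypothetical, and hidden files are not checked.

```shell
# Hypothetical helper: print the names of regular files in DIR that are
# absent from the "morphic-util list" output supplied on standard input.
report_missing_uploads() (
    dir=$1
    listing=$(sed 's|^[^/]*/||')      # strip the leading "<upload-area>/" prefix
    for path in "$dir"/*; do
        [ -f "$path" ] || continue    # skip sub-directories and unmatched globs
        name=${path##*/}              # keep only the file name
        # -F fixed string, -x whole line, -q quiet
        printf '%s\n' "$listing" | grep -Fxq -- "$name" || printf '%s\n' "$name"
    done
)

# Usage: morphic-util list | report_missing_uploads /path/to/file
```

No output means every top-level file in the directory was found in the listing.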

By default the upload command won't upload files that have the same name as files already present in the upload area. If you do need to overwrite an uploaded file with a file of the same name, you will need to use the -o flag. For example:

$ morphic-util upload -o /path/to/file/sample1_R1.fastq.gz
Uploading...
/path/to/file/sample1_R1.fastq.gz 2845965046 / 2845965046.0  (100.00%)

Notes:

* If you change your mind and wish to cancel the upload, press Ctrl+C.

* Sub-directories within a folder are ignored, so please ensure all files to be uploaded sit directly within the provided path.

* File names that contain spaces should be quoted in any command, for example morphic-util upload 'a file name' 'another file'.

Help and feedback

If you have any issues with uploading files or using the morphic-util tool, or wish to discuss more options for transfer of data, please contact the MorPhic DRACC helpdesk at helpdesk@morphic.bio or on the dedicated Slack channel.

Updating the tool

Periodically there will be updates to the tool to fix bugs and release new features. The latest version of the tool can be installed using pip's --upgrade option.

$ pip3 install --upgrade --no-cache morphic-util
Successfully installed morphic-util-0.0.2

We suggest using the --no-cache flag so that pip downloads the latest package instead of reusing an old cached copy.
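
To confirm which version is installed after an upgrade, you can read the "Version:" field that `pip3 show` prints. This is a minimal sketch; the helper name `installed_version` is made up for this example.

```shell
# Hypothetical helper: extract the "Version:" field from "pip3 show" output
# supplied on standard input.
installed_version() {
    awk '/^Version:/ { print $2; exit }'
}

# Usage: pip3 show morphic-util | installed_version
```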
