
multiDUMP

Easy-to-use Snakemake pipeline to download SRA runs in parallel via fastq-dump.



Introduction

multiDUMP is a simple Snakemake-based pipeline that parallelizes the download of SRA data through fastq-dump.

Citation

If you use this package, please acknowledge Sebastian Gregoricchio in your paper.

Installation and dependencies

To install the pipeline, download this repository. Installing the dependencies in a dedicated conda environment is strongly recommended. Follow the steps below for the installation:

If you encounter problems during the installation via conda, try using mamba instead.



How to run the pipeline

To download a list of SRA runs, prepare a sample configuration table that lists each SRA number and the name to assign to the corresponding fastq files:

SRA_ID sample_name
SRR125346 sampleA
SRR578951 sampleB
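
Before launching the download, it can be useful to check that the table is well formed. The snippet below is a minimal sketch, not part of multiDUMP: the function name `parse_sample_table` is hypothetical, the columns are assumed to be separated by tabs or spaces, and run accessions are assumed to look like SRR, ERR or DRR followed by digits.

```python
import re

# Assumption: run accessions start with SRR, ERR or DRR followed by digits.
RUN_ID = re.compile(r"^[SED]RR\d+$")


def parse_sample_table(text):
    """Return a list of (SRA_ID, sample_name) pairs from the table text."""
    # Split every non-empty line on whitespace (tabs or spaces).
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != ["SRA_ID", "sample_name"]:
        raise ValueError("expected header: SRA_ID sample_name")
    rows = []
    seen = set()
    for fields in lines[1:]:
        if len(fields) != 2:
            raise ValueError(f"expected 2 columns, got {fields}")
        sra_id, name = fields
        if not RUN_ID.match(sra_id):
            raise ValueError(f"not a run accession: {sra_id}")
        if name in seen:
            # Duplicated names would presumably map to the same output files.
            raise ValueError(f"duplicated sample name: {name}")
        seen.add(name)
        rows.append((sra_id, name))
    return rows


example = "SRA_ID\tsample_name\nSRR125346\tsampleA\nSRR578951\tsampleB\n"
print(parse_sample_table(example))
```
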

Then, after activating the conda environment, run the following command (add the -n flag for a dry run):

snakemake \
-s </target/folder>/multiDUMP/workflow/multiDUMP.snakefile  \
--cores 5 \
--config \
TABLE="/path/to/sample_config_table.txt" \
OUTDIR="/full/path/to/output/directory" \
SUFFIX="['_R1', '_R2']" \
EXTENSION=".fastq.gz"

Where the config flags correspond to:

As an alternative to passing the --config flags manually, you can provide a .yaml file as follows:

TABLE: "/path/to/sample_config_table.txt"
OUTDIR: "/full/path/to/output/directory"
SUFFIX: ['_R1', '_R2']
EXTENSION: ".fastq.gz"

Then run the following command:

snakemake \
-s </target/folder>/multiDUMP/workflow/multiDUMP.snakefile  \
--cores 5 \
--configfile /path/to/config.yaml
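
If the pipeline is launched from a script, the same invocation can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper, the snakefile path is a placeholder for your own clone location, and only the flags shown in this README (-s, --cores, --configfile, -n) are used.

```python
import shlex


def build_command(snakefile, configfile, cores=5, dry_run=False):
    """Assemble the snakemake call shown above as an argument list."""
    cmd = [
        "snakemake",
        "-s", snakefile,
        "--cores", str(cores),
        "--configfile", configfile,
    ]
    if dry_run:
        # -n asks snakemake for a dry run, as mentioned in this README.
        cmd.append("-n")
    return cmd


# Placeholder paths: replace them with your own locations.
cmd = build_command(
    "multiDUMP/workflow/multiDUMP.snakefile",
    "/path/to/config.yaml",
    dry_run=True,
)
print(shlex.join(cmd))
# To actually execute it: subprocess.run(cmd, check=True)
```

Starting with a dry run makes it possible to review the planned jobs before any data are downloaded.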


Collect several SRA accession numbers

To inspect and collect the samples belonging to a specific project, you can follow the fastq downloading tutorial. Tab-delimited tables can be downloaded from the ENA browser as described in paragraph 2.2.1 of the tutorial.
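
A tab-delimited ENA report can then be turned into the two-column table expected by the pipeline. The following is a minimal sketch, not part of multiDUMP: `ena_report_to_table` is a hypothetical helper, and it assumes that the report was exported with the `run_accession` and `sample_alias` columns (pass another column name if your report uses a different one). Whitespace in sample names is replaced by underscores to keep the file names safe.

```python
import csv
import io


def ena_report_to_table(report_text, name_column="sample_alias"):
    """Convert an ENA tab-delimited report into a multiDUMP sample table."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    lines = ["SRA_ID\tsample_name"]
    for row in reader:
        # Collapse any whitespace in the name into single underscores.
        name = "_".join(row[name_column].split())
        lines.append(f"{row['run_accession']}\t{name}")
    return "\n".join(lines) + "\n"


# Toy report with the two assumed columns.
report = (
    "run_accession\tsample_alias\n"
    "SRR125346\tsample A\n"
    "SRR578951\tsampleB\n"
)
print(ena_report_to_table(report), end="")
```
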



Package history and releases

A list of all releases, with a description of the changes applied in each, can be found here.

Contact

For suggestions, bug reports, or comments, please use the issues/request tab of this repository.

License

This repository is released under the GNU General Public License (version 3).

