Snakemake usage

IMP uses the Snakemake workflow manager to handle the analysis steps properly.

You can run this workflow directly on your own infrastructure, but you must install all required tools beforehand.

Also, it is possible to use Snakemake directly if you have entered the container using the --enter flag.

We use environment variables to change the workflow dynamically based on the user input. All parameters can also be changed in a configuration file. Please refer to the configuration section.

To use Snakemake, you can either:

  • Pass a variable from the command line:
MYVARIABLE="myvalue" MYOTHERVARIABLE="myothervalue" snakemake
  • Or change the variables inside the config file:
CONFIGFILE="/path/to/config.json" snakemake

with the config file containing valid JSON:

    {
        "MYVARIABLE": "myvalue",
        "MYOTHERVARIABLE": "myothervalue"
    }

If you use impy, most of these environment variables are automatically set before entering the container.
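
To make the interplay between environment variables and the config file concrete, here is a minimal Python sketch of how a parameter could be resolved. It is not taken from the IMP source: the function name `resolve_parameters` is hypothetical, and the precedence order (environment variable over config file entry over default) is an assumption that should be checked against the configuration section.

```python
import json
import os


def resolve_parameters(defaults, environ=None):
    """Merge defaults, the JSON file named by CONFIGFILE, and the environment.

    Assumed precedence (not confirmed by the IMP documentation):
    environment variable > config file entry > default.
    """
    environ = os.environ if environ is None else environ
    params = dict(defaults)

    # CONFIGFILE points at a JSON object such as the one shown above.
    config_path = environ.get("CONFIGFILE")
    if config_path:
        with open(config_path) as handle:
            params.update(json.load(handle))

    # A known parameter that is also set in the environment wins.
    for name in params:
        if name in environ:
            params[name] = environ[name]
    return params
```

With this sketch, calling `resolve_parameters({"MYVARIABLE": "default"})` after exporting `MYVARIABLE="myvalue"` would return the value from the environment rather than the default.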


Before anything else, IMP needs databases that are not shipped inside the container, because including them would make it too large. You therefore need a working internet connection to download them over the network:

snakemake -s <src>/rules/init

where <src> is the path to the IMP source code.


In order to run the workflow to the end:

snakemake -s <src>/Snakefile
# or if you are inside the <src> directory
snakemake

To list all available steps:

snakemake -l

In order to run the workflow to a specific step:

snakemake <step>.done

where <step> is one of preprocessing, assembly, analysis, binning, report or workflow.
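
If you launch these commands from a script, a small helper can validate the step name and assemble the command line. The Python sketch below is illustrative only: `snakemake_command` is a hypothetical name, `/path/to/IMP` is a placeholder, and combining `-s <src>/Snakefile` with a `<step>.done` target in one call is an assumption based on the two commands shown above.

```python
import os
import shlex

# Step names as listed in this section.
STEPS = ("preprocessing", "assembly", "analysis", "binning", "report", "workflow")


def snakemake_command(step, src=None):
    """Build the argument list that runs the workflow up to `step`."""
    if step not in STEPS:
        raise ValueError(f"unknown step {step!r}; expected one of {', '.join(STEPS)}")
    command = ["snakemake"]
    if src is not None:
        # -s selects the Snakefile explicitly when not running from <src>.
        command += ["-s", os.path.join(src, "Snakefile")]
    command.append(f"{step}.done")
    return command


# Print the command instead of running it; /path/to/IMP is a placeholder.
print(shlex.join(snakemake_command("assembly", src="/path/to/IMP")))
```

The helper only builds the command. Pass the returned list to `subprocess.run` if you want to execute it.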

Some parameters are required for the workflow to behave correctly. Please refer to the configuration section.