
Welcome to Biopet

(Bio Pipeline Execution Tool)

Introduction

Biopet is short for Bio Pipeline Execution Tool and packages several functionalities:

  1. Tools for working on sequencing data
  2. Pipelines to do analysis on sequencing data
  3. Running analysis on a computing cluster (Open Grid Engine)
  4. Running analysis on your local desktop computer

System Requirements

Biopet is built on top of GATK Queue, which requires Java to be installed on the analysis machine(s).

For end-users:

For developers:

How to use

Running a pipeline

  • Help:
java -jar Biopet(version).jar (pipeline of interest) -h
  • Local:
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -run
  • Cluster:
    • Note that -qsub is cluster specific (Sun Grid Engine)
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -qsub* -jobParaEnv YourParallelEnv -run
  • DryRun:
    • A dry run checks whether the scheduling and creation of the pipeline's jobs work. Nothing is executed; only the job commands are created. If it succeeds, that is a good indication your actual run will succeed as well.
    • Each pipeline is available as an option inside the jar file Biopet[version].jar, which is located in the target directory and can be started with java -jar <pipelineJarFile>
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) 

A dry run also generates the config report. From this config report you can identify all configurable options.
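The dry-run-then-run workflow described above can be wrapped in a small shell function. This is only a sketch: the function name dry_then_run and the BIOPET_JAR variable are hypothetical, and it assumes that a failed dry run exits with a non-zero status.

```shell
#!/bin/sh
# Hypothetical default; point BIOPET_JAR at your own Biopet jar.
BIOPET_JAR="${BIOPET_JAR:-Biopet.jar}"

# Perform a dry run (no -run flag) and start the real run only if it succeeds.
# All arguments (pipeline name and pipeline options) are passed through unchanged.
# Assumption: java exits non-zero when the dry run fails.
dry_then_run() {
    java -jar "$BIOPET_JAR" "$@" && java -jar "$BIOPET_JAR" "$@" -run
}
```

It would be called as dry_then_run (pipeline of interest) (pipeline options), with the same arguments you would otherwise pass after the jar file.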

Shark Compute Cluster specific

On the SHARK compute cluster, a module is available to load the necessary dependencies.

$ module load biopet/v0.2.0

With this module loaded, java -jar Biopet-<version>.jar can be omitted and Biopet can be started using:

$ biopet

Running pipelines

$ biopet pipeline <pipeline_name>

Note that each pipeline needs a config file written in JSON format; see config & How To! Config

Multiple configs can be passed to a pipeline, for example the sample, settings and executables configs, of which sample and settings are mandatory.

  • Here one can find how to create a sample and settings config
  • More info can be found here: How To! Config
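Because the config files must be valid JSON, it can help to check a file for syntax errors before handing it to a pipeline. The sketch below is not part of Biopet: the function name check_json_config is hypothetical, and it assumes python3 is available on the machine. It only checks JSON syntax, not whether the keys are ones Biopet understands.

```shell
#!/bin/sh
# Return 0 if the given file parses as JSON, 1 otherwise.
# Assumption: python3 is on PATH; json.tool is part of its standard library.
check_json_config() {
    if python3 -m json.tool "$1" > /dev/null 2>&1; then
        return 0
    fi
    echo "Not valid JSON: $1" >&2
    return 1
}
```

For example, check_json_config followed by the path to your sample or settings config reports a problem before any pipeline jobs are scheduled.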

Running a tool

$ biopet tool <tool_name>
  • BedToInterval
  • BedtoolsCoverageToCounts
  • BiopetFlagstat
  • CheckAllelesVcfInBam
  • ExtractAlignedFastq
  • FastqSplitter
  • FindRepeatsPacBio
  • MpileupToVcf
  • SageCountFastq
  • SageCreateLibrary
  • SageCreateTagCounts
  • VcfFilter
  • VcfToTsv
  • WipeReads

Developers

Compiling Biopet

  1. Clone Biopet with git clone git@git.lumc.nl:biopet/biopet.git biopet
  2. Go to the biopet directory
  3. Run mvn_install_queue.sh; this installs the Queue jars into the local Maven repository
  4. Alternatively, download the queue.jar from the GATK website
  5. Run mvn verify to compile and package, or run mvn install to also install the jars in the local Maven repository
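The steps above can be sketched as a single shell function. The function name build_biopet is hypothetical, and the sketch assumes that git, Maven and a JDK are on the PATH and that mvn_install_queue.sh sits in the root of the cloned repository. It follows step 3 rather than the manual download in step 4.

```shell
#!/bin/sh
# Clone and build Biopet. The body runs in a subshell, so the caller's
# working directory is left unchanged.
build_biopet() (
    git clone git@git.lumc.nl:biopet/biopet.git biopet || exit 1
    cd biopet || exit 1
    # Install the Queue jars into the local Maven repository.
    # Assumption: the script is located in the repository root.
    sh mvn_install_queue.sh || exit 1
    # Compile and package; use "mvn install" instead to also install
    # the jars in the local Maven repository.
    mvn verify
)
```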

About

Go to the about page

License

See: License