Welcome to Biopet
(Bio Pipeline Execution Tool)
Introduction
Biopet is an abbreviation of Bio Pipeline Execution Tool and packages several functionalities:
- Tools for working on sequencing data
- Pipelines to do analysis on sequencing data
- Running analysis on a computing cluster (Open Grid Engine)
- Running analysis on your local desktop computer
System Requirements
Biopet is built on top of GATK Queue, which requires Java to be installed on the analysis machine(s).
For end-users:
For developers:
How to use
Running a pipeline
- Help:
java -jar Biopet(version).jar (pipeline of interest) -h
- Local:
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -run
- Cluster:
- Note that -qsub is cluster specific (Sun Grid Engine).
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -qsub* -jobParaEnv YourParallelEnv -run
- DryRun:
- A dry run can be performed to check whether the scheduling and creation of the pipeline's jobs work correctly. Nothing is executed; only the job commands are created. If this succeeds, it is a good indication that your actual run will be successful as well.
- Each pipeline is available as an option inside the jar file Biopet[version].jar, which is located in the target directory, and can be started with
java -jar <pipelineJarFile>
java -jar Biopet(version).jar (pipeline of interest) (pipeline options)
A dry run also generates the config report, from which you can identify all configurable options.
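As a minimal sketch, the help, dry-run, and local invocations can be wrapped in a small shell helper that only prints the command line, which makes the effect of -run easy to compare. The jar file name Biopet-0.2.0.jar, the choice of Flexiprep, and the helper name biopet_cmd are hypothetical placeholders and not part of Biopet itself.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own jar file and pipeline name.
BIOPET_JAR="Biopet-0.2.0.jar"
PIPELINE="Flexiprep"

# Print (do not execute) the Biopet command line for a given mode.
# Modes: help, dry, local. Any further arguments are treated as
# pipeline options and passed through unchanged.
biopet_cmd() {
    mode="$1"
    shift
    case "$mode" in
        help)  set -- -h ;;          # show the pipeline's help text
        dry)   ;;                    # no -run: job commands are only created
        local) set -- "$@" -run ;;   # -run actually executes the jobs
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    set -- java -jar "$BIOPET_JAR" "$PIPELINE" "$@"
    echo "$*"
}

biopet_cmd help
biopet_cmd dry
biopet_cmd local
```

The only difference between the dry and local lines is the trailing -run, so a command can be checked as a dry run first and then repeated with -run appended.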
SHARK compute cluster specific
In the SHARK compute cluster, a module is available to load the necessary dependencies.
$ module load biopet/v0.2.0
Using this option, the java -jar Biopet-<version>.jar can be omitted and biopet can be started using:
$ biopet
Running pipelines
$ biopet pipeline <pipeline_name>
- Flexiprep
- Mapping
- Gatk Variantcalling
- BamMetrics
- Basty
- GatkBenchmarkGenotyping
- GatkGenotyping
- GatkPipeline
- GatkVariantRecalibration
- GatkVcfSampleCompare
- Gentrap
- Sage
- Yamsvp (Under development)
__Note that each pipeline needs a config file written in JSON format; see config & How To! Config.__
Multiple configs can be passed to a pipeline, for example the sample, settings and executables configs, of which sample and settings are mandatory.
- Here one can find how to create a sample and settings config
- More info can be found here: How To! Config
Running a tool
$ biopet tool <tool_name>
- BedToInterval
- BedtoolsCoverageToCounts
- BiopetFlagstat
- CheckAllelesVcfInBam
- ExtractAlignedFastq
- FastqSplitter
- FindRepeatsPacBio
- MpileupToVcf
- SageCountFastq
- SageCreateLibrary
- SageCreateTagCounts
- VcfFilter
- VcfToTsv
- WipeReads
Developers
Compiling Biopet
- Clone biopet with
git clone git@git.lumc.nl:biopet/biopet.git biopet
- Go to the biopet directory
- Run mvn_install_queue.sh; this installs the Queue jars into the local Maven repository
- Alternatively, download queue.jar from the GATK website
- Run mvn verify to compile and package, or run mvn install to also install the jars into the local Maven repository
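The build steps can be sketched as one script. This is a hedged example, not an official build script: the run helper and the DRY_RUN switch are hypothetical additions, and it assumes mvn_install_queue.sh sits in the repository root and can be run with sh. By default the script only prints each step; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# By default the steps are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo a command, and run it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_biopet() {
    run git clone git@git.lumc.nl:biopet/biopet.git biopet || return 1
    run cd biopet || return 1
    # Assumption: the script is in the repository root and runs under sh.
    run sh mvn_install_queue.sh || return 1
    # Use "mvn install" here instead to also install the jars
    # into the local Maven repository.
    run mvn verify
}

build_biopet
```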
About
Go to the about page
License
See: License