Commit 7aace72b authored by Peter van 't Hof

Merge branch 'develop' into biopet-bios

parents f3dc2736 d36878da
......@@ -5,6 +5,7 @@
Biopet (Bio Pipeline Execution Toolkit) is the main pipeline development framework of the LUMC Sequencing Analysis Support Core team. It contains our main pipelines and some of the command line tools we develop in-house. It is meant to be used in the main [SHARK](https://humgenprojects.lumc.nl/trac/shark) computing cluster. While usage outside of SHARK is technically possible, some adjustments may need to be made in order to do so.
Full documentation is here: [Biopet documentation](http://biopet-docs.readthedocs.io/en/latest/)
## Quick Start
......@@ -48,9 +49,9 @@ $ biopet pipeline <pipeline_name> -config <path/to/config.json> -qsub -jobParaEn
It is usually a good idea to do the real run using `screen` or `nohup` to prevent the job from terminating when you log out of SHARK. In practice, running `biopet` directly is also fine. Keep in mind that each pipeline has its own expected config layout. You can read more about the general structure of our config files [here](docs/config.md). For the specific structure that each pipeline accepts, please consult the respective pipeline page.
### Running Biopet in your own computer
## Testing
At the moment, we do not provide links to download the Biopet package. If you are interested in trying out Biopet locally, please contact us at [sasc@lumc.nl](mailto:sasc@lumc.nl).
Our code is tested at our local Jenkins installation for every change. We are using a [JenkinsFile](Jenkinsfile) in our repository to do this.
## Contributing to Biopet
......@@ -59,27 +60,7 @@ Biopet is based on the Queue framework developed by the Broad Institute as part
We welcome any kind of contribution, be it merge requests on the code base, documentation updates, or any kinds of other fixes! The main language we use is Scala, though the repository also contains a small bit of Python and R. Our main code repository is located at [https://github.com/biopet/biopet](https://github.com/biopet/biopet/issues), along with our issue tracker.
## Local development setup
To develop Biopet, Java 7, Maven 3.3.3, and GATK Queue 3.5 are required. Please consult the Java homepage and Maven homepage for the respective installation instructions. After you have both Java and Maven installed, you then need to install GATK Queue. However, as the GATK Queue package is not yet available as an artifact in Maven Central, you will need to download, compile, and install GATK Queue first.
~~~
$ git clone https://github.com/broadgsa/gatk-protected
$ cd gatk-protected
$ git checkout 3.5 # the current release is based on GATK 3.5
$ mvn -U clean install
~~~
This will install all the required dependencies to your local maven repository. After this is done, you can clone our repository and test if everything builds fine:
~~~
$ git clone https://github.com/biopet/biopet.git
$ cd biopet
$ mvn -U clean install
~~~
If everything builds fine, you're good to go! Otherwise, don't hesitate to contact us or file an issue at our issue tracker.
For more information, please go to our [Developer documentation](http://biopet-docs.readthedocs.io/en/develop/developer/getting-started/).
## About
......
......@@ -19,7 +19,6 @@ import java.io.File
import nl.lumc.sasc.biopet.core.summary.{ SummaryQScript, WriteSummary }
import nl.lumc.sasc.biopet.utils.config.Configurable
import nl.lumc.sasc.biopet.core.report.ReportBuilderExtension
import nl.lumc.sasc.biopet.core.workaround.BiopetQCommandLine
import nl.lumc.sasc.biopet.utils.Logging
import org.broadinstitute.gatk.queue.{ QScript, QSettings }
import org.broadinstitute.gatk.queue.function.QFunction
......@@ -118,11 +117,10 @@ trait BiopetQScript extends Configurable with GatkLogging { qscript: QScript =>
}
functions.filter(_.jobOutputFile == null).foreach(f => {
try {
val className = if (f.getClass.isAnonymousClass) f.getClass.getSuperclass.getSimpleName else f.getClass.getSimpleName
f.jobOutputFile = new File(f.firstOutput.getAbsoluteFile.getParent, "." + f.firstOutput.getName + "." + className + ".out")
} catch {
case e: NullPointerException => logger.warn(s"Can't generate a jobOutputFile for $f")
val className = if (f.getClass.isAnonymousClass) f.getClass.getSuperclass.getSimpleName else f.getClass.getSimpleName
BiopetQScript.safeOutputs(f) match {
case Some(o) => f.jobOutputFile = new File(o.head.getAbsoluteFile.getParent, "." + f.firstOutput.getName + "." + className + ".out")
case _ => f.jobOutputFile = new File("./stdout") // Line is here for test backup
}
})
......@@ -159,7 +157,7 @@ trait BiopetQScript extends Configurable with GatkLogging { qscript: QScript =>
case that: BiopetQScript =>
that.init()
that.biopetScript()
case _ => subPipeline.script
case _ => subPipeline.script()
}
addAll(subPipeline.functions)
}
......
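Review note on the `jobOutputFile` hunk above: the change replaces a `try`/`catch` on `NullPointerException` with a pattern match on the `Option` returned by `BiopetQScript.safeOutputs`. The implementation of `safeOutputs` is not part of this diff, so the following is only a minimal sketch of the pattern under assumptions: `FakeFunction` is a hypothetical stand-in for a Queue function, and the wrapper shown here is one possible way to turn a throwing or empty output list into `None`. For simplicity the sketch takes both the parent directory and the file name from the first output.

```scala
import java.io.File
import scala.util.Try

object SafeOutputSketch {
  // Hypothetical stand-in for a Queue function; reading `outputs` may throw
  // (for example a NullPointerException) while fields are still unset.
  trait FakeFunction { def outputs: Seq[File] }

  // Assumed behaviour: an exception, a null, or an empty list all become None.
  def safeOutputs(f: FakeFunction): Option[Seq[File]] =
    Try(Option(f.outputs)).toOption.flatten.filter(_.nonEmpty)

  // Mirrors the match in the diff: derive a hidden ".out" file next to the
  // first output, or fall back to "./stdout" when no output is available.
  def jobOutputFile(f: FakeFunction, className: String): File =
    safeOutputs(f) match {
      case Some(o) =>
        new File(o.head.getAbsoluteFile.getParent, "." + o.head.getName + "." + className + ".out")
      case None => new File("./stdout")
    }
}
```

Compared with catching the exception at the call site, this keeps the failure handling in one helper, and the fallback branch becomes an explicit case rather than a logged warning.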
......@@ -101,6 +101,9 @@ object WriteDependencies extends Logging with Configurable {
file.addOutputJob(function)
files += output -> file
}
val file = files.getOrElse(function.jobOutputFile, QueueFile(function.jobOutputFile))
file.addOutputJob(function)
files += function.jobOutputFile -> file
}
val jobs = functionNames.par.map {
......@@ -116,7 +119,7 @@ object WriteDependencies extends Logging with Configurable {
"depends_on_intermediate" -> BiopetQScript.safeOutputs(f).getOrElse(Seq()).exists(files(_).isIntermediate),
"depends_on_jobs" -> BiopetQScript.safeOutputs(f).getOrElse(Seq()).toList.flatMap(files(_).outputJobNames).distinct,
"output_used_by_jobs" -> BiopetQScript.safeOutputs(f).getOrElse(Seq()).toList.flatMap(files(_).inputJobNames).distinct,
"outputs" -> BiopetQScript.safeOutputs(f).getOrElse(Seq()).toList,
"outputs" -> (f.jobOutputFile :: BiopetQScript.safeOutputs(f).getOrElse(Seq()).toList),
"inputs" -> BiopetQScript.safeOutputs(f).getOrElse(Seq()).toList,
"done_files" -> BiopetQScript.safeDoneFiles(f).getOrElse(Seq()).toList,
"fail_files" -> BiopetQScript.safeFailFiles(f).getOrElse(Seq()).toList,
......
......@@ -40,10 +40,10 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
var vepScript: String = config("vep_script")
@Input(doc = "input VCF", required = true)
var input: File = null
var input: File = _
@Output(doc = "output file", required = true)
var output: File = null
var output: File = _
override def subPath = {
if (vepVersion.isSet) super.subPath ++ List("vep_settings") ++ vepVersion()
......@@ -160,7 +160,7 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
override def defaultCoreMemory = 4.0
@Output
private var _summary: File = null
private var _summary: File = _
override def beforeGraph(): Unit = {
super.beforeGraph()
......@@ -312,11 +312,11 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
(for ((header, headerIndex) <- headers) yield {
val name = header.stripPrefix("[").stripSuffix("]")
name.replaceAll(" ", "_") -> (contents.drop(headerIndex + 1).takeWhile(!isHeader(_)).flatMap { line =>
name.replaceAll(" ", "_") -> contents.drop(headerIndex + 1).takeWhile(!isHeader(_)).flatMap { line =>
val values = line.split("\t", 2)
if (values.last.isEmpty || values.last == "-") None
else Some(values.head.replaceAll(" ", "_") -> tryToParseNumber(values.last).getOrElse(values.last))
}.toMap)
}.toMap
}).toMap
}
}
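Review note on the summary-parsing hunk above: the change only drops a redundant pair of parentheses, since method selection binds tighter than the infix `->`. To make the surrounding logic easier to follow, here is a self-contained sketch of the same shape. The helpers are assumptions, because the diff does not show them: `isHeader` here treats any line wrapped in square brackets as a section header, `parseNumber` is a hypothetical stand-in for `tryToParseNumber`, and `headers` is built with `zipWithIndex`, which may differ from the real code.

```scala
import scala.util.Try

object StatsParseSketch {
  // Assumption: a section header is a line wrapped in square brackets.
  def isHeader(line: String): Boolean = line.startsWith("[") && line.endsWith("]")

  // Hypothetical stand-in for tryToParseNumber: try a Long first, then a Double.
  def parseNumber(s: String): Option[Any] =
    Try(s.toLong).orElse(Try(s.toDouble)).toOption

  // Turns "[Section]" blocks of tab-separated "key<TAB>value" lines into nested maps.
  def parse(contents: List[String]): Map[String, Map[String, Any]] = {
    val headers = contents.zipWithIndex.filter { case (line, _) => isHeader(line) }
    (for ((header, headerIndex) <- headers) yield {
      val name = header.stripPrefix("[").stripSuffix("]")
      // No parentheses needed: the whole chain on the right is the value of `->`.
      name.replaceAll(" ", "_") -> contents.drop(headerIndex + 1).takeWhile(!isHeader(_)).flatMap { line =>
        val values = line.split("\t", 2)
        if (values.last.isEmpty || values.last == "-") None // skip empty and "-" values
        else Some(values.head.replaceAll(" ", "_") -> parseNumber(values.last).getOrElse(values.last))
      }.toMap
    }).toMap
  }
}
```

Values that do not parse as numbers are kept as strings, and entries whose value is empty or `-` are dropped, as in the original expression.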
......@@ -6,6 +6,7 @@ import org.testng.annotations.Test
/**
* Created by Sander Bollen on 12-10-16.
* Here we test utils
*/
class UtilsTest extends TestNGSuite with Matchers {
......
import htsjdk.variant.variantcontext.{ Allele, Genotype, GenotypeBuilder }
package nl.lumc.sasc.biopet.utils
import htsjdk.variant.variantcontext.{ Allele, GenotypeBuilder }
import org.scalatest.Matchers
import org.scalatest.testng.TestNGSuite
import org.testng.annotations.Test
import scala.collection.JavaConversions._
import nl.lumc.sasc.biopet.utils.VcfUtils
/**
* Created by Sander Bollen on 4-10-16.
*/
......
......@@ -2,6 +2,7 @@
### Requirements
- Maven 3.3
- Java 8
- Installed Gatk to maven local repository (see below)
- Installed Biopet to maven local repository (see below)
- Some knowledge of the programming language [Scala](http://www.scala-lang.org/) (The pipelines are scripted using Scala)
......@@ -16,17 +17,9 @@ Make sure both tools are installed in your local maven repository. To do this on
```bash
# Replace 'mvn' with the location of your Maven executable, or put it in your PATH with the export command.
git clone https://github.com/broadgsa/gatk
cd gatk
git checkout 3.6
# The GATK version is bound to a version of Biopet. Biopet 0.7.0 uses Gatk 3.6
mvn clean install
cd ..
git clone https://github.com/biopet/biopet.git
git clone --recursive https://github.com/biopet/biopet.git
cd biopet
git checkout 0.7.0
mvn -DskipTests=true clean install
```
......
......@@ -50,7 +50,7 @@ trait MultisampleMappingReportTrait extends MultisampleReportBuilder {
val wgsExecuted = summary.getSampleValues("bammetrics", "stats", "wgs").values.exists(_.isDefined)
val rnaExecuted = summary.getSampleValues("bammetrics", "stats", "rna").values.exists(_.isDefined)
val insertsizeExecuted = summary.getSampleValues("bammetrics", "stats", "CollectInsertSizeMetrics", "metrics").values.exists(_ != Some(None))
val mappingExecuted = summary.getLibraryValues("mapping").nonEmpty
val mappingExecuted = summary.getLibraryValues("mapping").exists(_._2.isDefined)
val pairedFound = !mappingExecuted || summary.getLibraryValues("mapping", "settings", "paired").exists(_._2 == Some(true))
val flexiprepExecuted = summary.getLibraryValues("flexiprep")
.exists { case ((sample, lib), value) => value.isDefined }
......
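Review note on the `mappingExecuted` change above: `nonEmpty` is true as soon as any library key exists, even if every value is `None`, whereas `exists(_._2.isDefined)` requires at least one library to actually carry a value. The sketch below illustrates the difference. The map type is an assumption inferred from the `case ((sample, lib), value)` pattern in the same hunk; the real return type of `summary.getLibraryValues` is not shown in the diff.

```scala
object MappingExecutedSketch {
  // Assumed shape: (sample, library) -> optional summary value.
  type LibKey = (String, String)

  // Old check: true whenever any library is listed at all.
  def byNonEmpty(values: Map[LibKey, Option[Any]]): Boolean = values.nonEmpty

  // New check: true only if at least one library has a defined value.
  def byDefined(values: Map[LibKey, Option[Any]]): Boolean = values.exists(_._2.isDefined)
}
```

With a map that lists libraries but holds only `None` values, the old check reports that mapping was executed and the new one does not, which in turn affects `pairedFound` on the following line.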