Commit f7ac5718 authored by Peter van 't Hof

Merge branch 'feature-pipeline-example' into 'develop'

Documentation on how to create a project that depends on Biopet

Fixes #323 

See merge request !431
parents 0bc09ac2 36e1f1a2
......@@ -22,7 +22,7 @@ class extractReads {}
```
- Avoid using `null`, the Option `type` in Scala can be used instead
- Avoid using `null`; Scala's `Option` type should be used instead
```scala
// correct:
......
# Developer - Using Biopet as a dependency for your own project
You can use Biopet as a library for your own Scala project.
This can be useful if you want to make your own pipeline that you don't want to add back upstream.
## Prerequisites
At bare minimum you will need:
* Java 8
* Scala 2.10 or higher
* SBT or Maven
We highly recommend using an IDE such as IntelliJ IDEA for development.
### Maven dependencies
If you decide to use Maven, you should clone the [GATK public GitHub repository](https://github.com/broadgsa/gatk).
You should use GATK 3.6.
After cloning GATK 3.6, run the following in a terminal to build it and install it into your local Maven repository:
`mvn clean install`
Perform the same steps for [Biopet](https://github.com/biopet/biopet). This document assumes you are working with Biopet 0.7 or higher.
### SBT dependencies
You can develop Biopet pipelines with SBT as well. However, since GATK uses Maven, you will still need to install GATK
into your local Maven repository with `mvn install`.
After this, you can create a regular `build.sbt` file in the project root directory. In addition to the regular
SBT settings, you will also need to make SBT aware of the GATK artifacts you just installed into your local Maven
repository. This can be done by adding a new resolver object:
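```scala
resolvers += {
  // IBiblioResolver comes from Ivy, which is on sbt's build classpath; the
  // fully qualified name avoids relying on an import
  val repo = new org.apache.ivy.plugins.resolver.IBiblioResolver
  repo.setM2compatible(true)
  repo.setName("localhost")
  repo.setRoot(s"file://${Path.userHome.absolutePath}/.m2/repository")
  repo.setCheckconsistency(false)
  new RawRepository(repo)
}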
```
Having set this, you can then add specific Biopet modules as library dependencies. Here is one example that adds
the Flexiprep version 0.7.0 dependency:
```
libraryDependencies ++= Seq(
  "nl.lumc.sasc" % "Flexiprep" % "0.7.0"
)
```
In some cases, there may be a conflict with the `org.reflections` package used (this is a transitive dependency of
GATK). If you encounter this, we recommend forcing the version to 0.9.9-RC1 like so:
```
libraryDependencies ++= Seq(
  "org.reflections" % "reflections" % "0.9.9-RC1" force()
)
```
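Putting the pieces above together, a minimal `build.sbt` might look like the sketch below. The project name, organization, version and Scala patch release are placeholders, not values prescribed by Biopet. For brevity the sketch uses sbt's built-in `Resolver.mavenLocal` instead of the custom resolver shown earlier; unlike that resolver it does not disable the consistency check, so fall back to the custom one if dependency resolution fails.

```scala
// build.sbt: a minimal sketch, all project coordinates below are placeholders
name := "MyProject"

organization := "path.to.your"

version := "0.1.0-SNAPSHOT"

// Assumption: a Scala 2.10.x release, matching the *_2.10 artifacts in the example pom.xml
scalaVersion := "2.10.6"

// Look up artifacts installed with `mvn install` in ~/.m2/repository
resolvers += Resolver.mavenLocal

// Biopet modules are published without a Scala-version suffix, hence `%` and not `%%`
libraryDependencies ++= Seq(
  "nl.lumc.sasc" % "BiopetCore" % "0.7.0",
  "nl.lumc.sasc" % "Flexiprep" % "0.7.0"
)
```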
## Project structure
You should follow the typical Scala folder structure. Ideally, your IDE will handle this for you.
An example structure looks like this:
```
.
├── pom.xml
└── src
    ├── main
    │   ├── resources
    │   │   └── path
    │   │       └── to
    │   │           └── your
    │   │               └── myProject
    │   │                   └── a_resource.txt
    │   └── scala
    │       └── path
    │           └── to
    │               └── your
    │                   └── myProject
    │                       └── MyProject.scala
    └── test
        ├── resources
        └── scala
            └── path
                └── to
                    └── your
                        └── myProject
                            └── MyProjectTest.scala
```
## POM
(skip this section if using SBT)
When using Biopet, your Maven pom.xml file should at minimum contain the following dependency:
```xml
<dependencies>
    <dependency>
        <groupId>nl.lumc.sasc</groupId>
        <artifactId>BiopetCore</artifactId>
        <version>0.7.0</version>
    </dependency>
</dependencies>
```
If you want to use a specific pipeline, add it to your dependencies as well. For example, to use Shiva:
```xml
<dependencies>
    <dependency>
        <groupId>nl.lumc.sasc</groupId>
        <artifactId>BiopetCore</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>nl.lumc.sasc</groupId>
        <artifactId>Shiva</artifactId>
        <version>0.7.0</version>
    </dependency>
</dependencies>
```
For a complete example pom.xml see [here](../examples/pom.xml).
## SBT build
You can use SBT to build a fat JAR that contains all the required class files in a single JAR file. This can be done
using the [sbt-assembly plugin](https://github.com/sbt/sbt-assembly). Keep in mind that you have to explicitly define a specific merge strategy for conflicting
file names. In our experience, the merge strategy below works quite well:
```
assemblyMergeStrategy in assembly := {
  case "git.properties" => MergeStrategy.first
  // Discard the GATK's queueJobReport.R and use the one from Biopet
  case PathList("org", "broadinstitute", "gatk", "queue", "util", x) if x.endsWith("queueJobReport.R") =>
    MergeStrategy.first
  case "GATKText.properties" => MergeStrategy.first
  case "dependency_list.txt" => MergeStrategy.discard
  case other => MergeStrategy.defaultMergeStrategy(other)
}
```
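The merge strategy only takes effect once the sbt-assembly plugin is enabled for the build. A minimal sketch of the remaining wiring is shown here; the plugin version, main class and JAR name are assumptions for illustration, so pick a plugin version that matches your sbt release and substitute your own fully qualified pipeline object.

```scala
// project/plugins.sbt
// Assumption: sbt 0.13.x with an sbt-assembly 0.14.x release
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// build.sbt (in addition to the settings shown earlier)
// Hypothetical main class: use the object that extends PipelineCommand in your project
mainClass in assembly := Some("path.to.your.myProject.MyProject")

// Hypothetical output name for the fat JAR
assemblyJarName in assembly := "MyProject.jar"
```

Running `sbt assembly` then writes the fat JAR under `target/scala-2.10/` when building against Scala 2.10.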
## New pipeline
To create a new pipeline in your project, you need a class that extends `QScript` and mixes in `SummaryQScript`.
E.g.:
```scala
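// Import paths assume Biopet 0.7 and GATK Queue 3.6
import java.io.File
import nl.lumc.sasc.biopet.core.summary.SummaryQScript
import nl.lumc.sasc.biopet.utils.config.Configurable
import org.broadinstitute.gatk.queue.QScript

class MyProject(val root: Configurable) extends QScript with SummaryQScript {
  def init(): Unit = {}
  def biopetScript(): Unit = {} // pipeline code here
  // Explicit types are needed: a bare Map() is inferred as Map[Nothing, Nothing]
  def summarySettings: Map[String, Any] = Map()
  def summaryFiles: Map[String, File] = Map()
  def summaryStats: Map[String, Any] = Map()
  // java.io.File has no zero-argument constructor; this path is a placeholder
  def summaryFile: File = new File("MyProject.summary.json")
}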
```
To make your pipeline runnable from the command line, you need to add a one-line companion object:
```scala
object MyProject extends PipelineCommand
```
Once you have built your JAR, you can simply run:
```
java -jar MyProject.jar -config some_config.yml <other arguments>
```
This JAR comes with all standard Biopet arguments.
\ No newline at end of file
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>path.to.your</groupId>
<artifactId>MyProject</artifactId>
<version>YourVersion1.0</version>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scoverage.plugin.version>1.0.4</scoverage.plugin.version>
<sting.shade.phase>package</sting.shade.phase>
<app.main.class>path.to.your.MyProject</app.main.class>
</properties>
<dependencies>
<dependency>
<groupId>nl.lumc.sasc</groupId>
<artifactId>BiopetCore</artifactId>
<version>0.7.0</version>
</dependency>
<dependency>
<groupId>nl.lumc.sasc</groupId>
<artifactId>Shiva</artifactId>
<version>0.7.0</version>
</dependency>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>6.8</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.10</artifactId>
<version>2.2.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-native_2.10</artifactId>
<version>3.3.0</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>${basedir}/src/main/scala</sourceDirectory>
<testSourceDirectory>${basedir}/src/test/scala</testSourceDirectory>
<testResources>
<testResource>
<directory>${basedir}/src/test/resources</directory>
<includes>
<include>**/*</include>
</includes>
</testResource>
</testResources>
<resources>
<resource>
<directory>${basedir}/src/main/resources</directory>
<includes>
<include>**/*</include>
</includes>
</resource>
</resources>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.1</version>
<configuration>
<!--suppress MavenModelInspection -->
<finalName>${project.artifactId}-${project.version}-${git.commit.id.abbrev}</finalName>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<manifestEntries>
<Main-Class>${app.main.class}</Main-Class>
<!--suppress MavenModelInspection -->
<X-Compile-Source-JDK>${maven.compile.source}</X-Compile-Source-JDK>
<!--suppress MavenModelInspection -->
<X-Compile-Target-JDK>${maven.compile.target}</X-Compile-Target-JDK>
</manifestEntries>
</transformer>
</transformers>
<filters>
</filters>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
<configuration>
<forkCount>1C</forkCount>
<workingDirectory>${project.build.directory}</workingDirectory>
</configuration>
</plugin>
<plugin>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.10</version>
<executions>
<execution>
<id>copy-installed</id>
<phase>prepare-package</phase>
<goals>
<goal>list</goal>
</goals>
<configuration>
<outputFile>${project.build.outputDirectory}/dependency_list.txt</outputFile>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<id>scala-compile</id>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
<arg>-deprecation</arg>
<arg>-feature</arg>
</args>
</configuration>
</execution>
</executions>
<!-- ... (see other usage or goals for details) ... -->
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.5</version>
<executions>
<execution>
<goals>
<goal>test-jar</goal>
</goals>
</execution>
</executions>
<configuration>
<archive>
<manifest>
<addDefaultImplementationEntries>true</addDefaultImplementationEntries>
<addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<showDeprecation>true</showDeprecation>
</configuration>
</plugin>
<plugin>
<groupId>org.scalariform</groupId>
<artifactId>scalariform-maven-plugin</artifactId>
<version>0.1.4</version>
<executions>
<execution>
<phase>process-sources</phase>
<goals>
<goal>format</goal>
</goals>
<configuration>
<rewriteArrowSymbols>false</rewriteArrowSymbols>
<alignParameters>true</alignParameters>
<alignSingleLineCaseStatements_maxArrowIndent>40
</alignSingleLineCaseStatements_maxArrowIndent>
<alignSingleLineCaseStatements>true</alignSingleLineCaseStatements>
<compactStringConcatenation>false</compactStringConcatenation>
<compactControlReadability>false</compactControlReadability>
<doubleIndentClassDeclaration>false</doubleIndentClassDeclaration>
<formatXml>true</formatXml>
<indentLocalDefs>false</indentLocalDefs>
<indentPackageBlocks>true</indentPackageBlocks>
<indentSpaces>2</indentSpaces>
<placeScaladocAsterisksBeneathSecondAsterisk>false
</placeScaladocAsterisksBeneathSecondAsterisk>
<preserveDanglingCloseParenthesis>true</preserveDanglingCloseParenthesis>
<preserveSpaceBeforeArguments>false</preserveSpaceBeforeArguments>
<rewriteArrowSymbols>false</rewriteArrowSymbols>
<spaceBeforeColon>false</spaceBeforeColon>
<spaceInsideBrackets>false</spaceInsideBrackets>
<spaceInsideParentheses>false</spaceInsideParentheses>
<spacesWithinPatternBinders>true</spacesWithinPatternBinders>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>pl.project13.maven</groupId>
<artifactId>git-commit-id-plugin</artifactId>
<version>2.1.10</version>
<executions>
<execution>
<goals>
<goal>revision</goal>
</goals>
</execution>
</executions>
<configuration>
<prefix>git</prefix>
<dateFormat>dd.MM.yyyy '@' HH:mm:ss z</dateFormat>
<verbose>false</verbose>
<dotGitDirectory>${basedir}/../../.git</dotGitDirectory>
<useNativeGit>true</useNativeGit>
<skipPoms>false</skipPoms>
<generateGitPropertiesFile>true</generateGitPropertiesFile>
<generateGitPropertiesFilename>src/main/resources/git.properties</generateGitPropertiesFilename>
<failOnNoGitDirectory>false</failOnNoGitDirectory>
<abbrevLength>8</abbrevLength>
<skip>false</skip>
<gitDescribe>
<skip>false</skip>
<always>false</always>
<abbrev>8</abbrev>
<dirty>-dirty</dirty>
<forceLongFormat>false</forceLongFormat>
</gitDescribe>
</configuration>
</plugin>
<plugin>
<groupId>com.mycila</groupId>
<artifactId>license-maven-plugin</artifactId>
<version>2.6</version>
<configuration>
<excludes>
<exclude>**/*git*</exclude>
<exclude>**/*.bam</exclude>
<exclude>**/*.bai</exclude>
<exclude>**/*.gtf</exclude>
<exclude>**/*.fq</exclude>
<exclude>**/*.sam</exclude>
<exclude>**/*.bed</exclude>
<exclude>**/*.refFlat</exclude>
<exclude>**/*.R</exclude>
<exclude>**/*.rscript</exclude>
</excludes>
</configuration>
</plugin>
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
<configuration>
<scalaVersion>2.10.2</scalaVersion>
<!-- other parameters -->
</configuration>
</plugin>
</plugins>
</build>
<reporting>
<plugins>
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
</plugin>
</plugins>
</reporting>
</project>
\ No newline at end of file
......@@ -45,9 +45,10 @@ pages:
- Developer:
- Getting Started: 'developer/getting-started.md'
- Code Style: 'developer/code-style.md'
- Example pipeline: 'developer/example-pipeline.md'
- Example tool: 'developer/example-tool.md'
- Example pipeable: 'developer/example-pipeable.md'
- Example Dependencies: 'developer/example-depends.md'
- Example Pipeline: 'developer/example-pipeline.md'
- Example Tool: 'developer/example-tool.md'
- Example Pipeable: 'developer/example-pipeable.md'
- Scala docs: 'developer/scaladocs.md'
#- ['developing/Setup.md', 'Developing', 'Setting up your local development environment']
#theme: readthedocs
......