Commit 7f5d8873 authored by Wai Yi Leung

Adding examples for developers creating pipeline, tool, wrapper and reports.

WIP
parent 2d9184ce
# Developer - Code style
## General rules
- Variable names should always be in *CamelCase* and should **not** start with a capital letter
```scala
// correct:
val outputFromProgram: String = "foobar"
// incorrect:
val OutputFromProgram: String = "foobar"
```
- Class names should always be in *CamelCase* and should **always** start with a capital letter
```scala
// correct:
class ExtractReads {}
// incorrect:
class extractReads {}
```
- Avoid using `null`; Scala's `Option` type can be used instead
```scala
// correct:
val inputFile: Option[File] = None
// incorrect:
val inputFile: File = null
```
- If a method/value is designed to be overridden, make it a `def` and override it with a `def`; we encourage you not to use `val`
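The source does not state why `def` is preferred, but one common reason is initialization order. The sketch below uses hypothetical classes (`Tool`, `BigTool`, `BrokenTool`; none of them are part of Biopet) to show what can go wrong when a `val` override is read while the parent constructor is still running.

```scala
// Hypothetical classes for illustration only; not part of Biopet.
abstract class Tool {
  // Designed to be overridden, so it is declared as a def.
  def defaultThreads: Int = 1
  // Initialised while Tool's constructor runs, before any subclass fields are set.
  val description: String = s"runs with $defaultThreads thread(s)"
}

class BigTool extends Tool {
  // def override: evaluated on every call, so the parent constructor sees 8.
  override def defaultThreads: Int = 8
}

class BrokenTool extends Tool {
  // val override: the backing field is still 0 while Tool's constructor runs.
  override val defaultThreads: Int = 8
}

object OverrideExample {
  def main(args: Array[String]): Unit = {
    println(new BigTool().description)
    println(new BrokenTool().description)
  }
}
```

With the `def` override the description reports 8 threads; with the `val` override it reports 0, because the field has not been assigned yet when the parent reads it.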
# Developer - Example pipeline
### Adding maven project
### Initial pipeline code
### Config setup
### Test pipeline
### Summary output
### Reporting output (opt)
# Developer - Example pipeline report
### Concept
### Requirements
### Getting started - First page
### How to generate report independent from pipeline
### Branding etc.
# Developer - Example tool
In this tutorial we explain how to create a tool within the biopet-framework. We provide convenient helper methods which can be used in the tool.
We take a line counter as the use case.
### Initial tool code
```scala
package nl.lumc.sasc.biopet.tools
import java.io.{ PrintWriter, File }
import nl.lumc.sasc.biopet.utils.ConfigUtils._
import nl.lumc.sasc.biopet.utils.ToolCommand
import scala.collection.mutable
import scala.io.Source
/**
*/
object SimpleTool extends ToolCommand {
  /*
   * Main function executes the LineCounter.scala
   */
  def main(args: Array[String]): Unit = {
    println("This is the SimpleTool")
  }
}
```
This is the minimum setup for a working tool. It does not count any lines yet.
### Program arguments and environment variables
A basic application/tool usually takes arguments to configure and set parameters to be used within the tool.
Biopet provides ``AbstractArgs``, which a case class extends to store the arguments read from the command line.
```scala
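// `Nil` is an empty List and does not type-check as a File. `null` is used as a placeholder here;
// the required `-i` option replaces it before the value is read.
case class Args(inputFile: File = null, outputFile: Option[File] = None) extends AbstractArgs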
```
The arguments are stored in ``Args``.
Next, add an option parser that fills ``Args``:
```scala
class OptParser extends AbstractOptParser {
  head(
    s"""
       |$commandName - Count lines in a text file
     """.stripMargin)
  // Required input file; parsing fails when the file does not exist
  opt[File]('i', "input") required () unbounded () valueName "<inputFile>" action { (x, c) =>
    c.copy(inputFile = x)
  } validate {
    x => if (x.exists) success else failure("Input file not found")
  } text "Count lines from this file"
  // Optional output file
  opt[File]('o', "output") unbounded () valueName "<outputFile>" action { (x, c) =>
    c.copy(outputFile = Some(x))
  } text "File to write output to, if not supplied output goes to stdout"
}
```
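Each `action` callback above receives the parsed value and the current `Args`, and returns an updated copy rather than mutating anything. The sketch below mimics that pattern with a small hand-written loop so it can be tried without Biopet or its option parser. `ArgsCopyExample`, `fill` and this simplified `Args` are hypothetical names for illustration, not part of the framework.

```scala
import java.io.File

object ArgsCopyExample {
  // Hypothetical stand-in for the tool's Args; both fields are optional here.
  case class Args(inputFile: Option[File] = None, outputFile: Option[File] = None)

  // Walks over flag/value pairs and returns an updated copy for each recognised flag.
  @annotation.tailrec
  def fill(remaining: List[String], current: Args = Args()): Args = remaining match {
    case ("-i" | "--input") :: value :: rest =>
      fill(rest, current.copy(inputFile = Some(new File(value))))
    case ("-o" | "--output") :: value :: rest =>
      fill(rest, current.copy(outputFile = Some(new File(value))))
    case Nil => current
    case unknown :: _ =>
      throw new IllegalArgumentException(s"Unknown or incomplete argument: $unknown")
  }

  def main(args: Array[String]): Unit = {
    println(fill(List("-i", "reads.txt", "-o", "count.json")))
  }
}
```

Because every step returns a new `Args`, the defaults in the case class describe exactly what the tool sees when an option is omitted.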
In the end your tool would look like the following:
```scala
package nl.lumc.sasc.biopet.tools
import java.io.{ PrintWriter, File }
import nl.lumc.sasc.biopet.utils.ConfigUtils._
import nl.lumc.sasc.biopet.utils.ToolCommand
import scala.collection.mutable
import scala.io.Source
/**
*/
object SimpleTool extends ToolCommand {
  // `null` is a placeholder; the required `-i` option replaces it before the value is read
  case class Args(inputFile: File = null, outputFile: Option[File] = None) extends AbstractArgs
  class OptParser extends AbstractOptParser {
    head(
      s"""
         |$commandName - Count lines in a text file
       """.stripMargin)
    opt[File]('i', "input") required () unbounded () valueName "<inputFile>" action { (x, c) =>
      c.copy(inputFile = x)
    } validate {
      x => if (x.exists) success else failure("Input file not found")
    } text "Count lines from this file"
    opt[File]('o', "output") unbounded () valueName "<outputFile>" action { (x, c) =>
      c.copy(outputFile = Some(x))
    } text "File to write output to, if not supplied output goes to stdout"
  }
  // Counts the lines of the input file and returns the result as a JSON string
  def countToJSON(inputRaw: File): String = {
    val reader = Source.fromFile(inputRaw)
    // Close the file handle once the lines are counted
    val nLines = try reader.getLines().size finally reader.close()
    mapToJson(Map(
      "lines" -> nLines,
      "input" -> inputRaw
    )).spaces2
  }
  // Assumption: AbstractOptParser is a scopt OptionParser, so `parse` returns an Option[Args]
  def parseArgs(args: Array[String]): Args = {
    val argsParser = new OptParser
    argsParser.parse(args, Args()).getOrElse(throw new IllegalArgumentException)
  }
  /*
   * Main function executes the LineCounter.scala
   */
  def main(args: Array[String]): Unit = {
    val commandArgs: Args = parseArgs(args)
    // use the arguments
    val jsonString: String = countToJSON(commandArgs.inputFile)
    commandArgs.outputFile match {
      case Some(file) =>
        val writer = new PrintWriter(file)
        writer.println(jsonString)
        writer.close()
      case _ => println(jsonString)
    }
  }
}
```
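The counting and output steps can also be tried outside Biopet. The following is a minimal sketch using only the Scala standard library; `LineCountExample`, `countLines` and `emit` are hypothetical names, and the JSON conversion is left out because it relies on Biopet's `ConfigUtils`.

```scala
import java.io.{ File, PrintWriter }
import scala.io.Source

object LineCountExample {
  // Counts the lines of a file and always closes the underlying handle.
  def countLines(input: File): Int = {
    val source = Source.fromFile(input)
    try source.getLines().size finally source.close()
  }

  // Writes to the file when one is given, otherwise to stdout.
  def emit(text: String, output: Option[File]): Unit = output match {
    case Some(file) =>
      val writer = new PrintWriter(file)
      try writer.println(text) finally writer.close()
    case None => println(text)
  }

  def main(args: Array[String]): Unit = {
    // Create a small temporary file to count.
    val tmp = File.createTempFile("linecount", ".txt")
    tmp.deleteOnExit()
    emit("a\nb\nc", Some(tmp))
    emit(s"lines: ${countLines(tmp)}", None)
  }
}
```

Matching on `Option[File]` keeps the "no output file given" case explicit, in line with the code style rule about avoiding `null`.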
### Running your new tool
### Debugging the tool with IDEA
### Setting up unit tests
### Adding tool-extension for usage in pipeline
To use this tool in a Biopet pipeline, a wrapper has to be added for it.
The wrapper would look like:
```scala
package nl.lumc.sasc.biopet.extensions.tools
import java.io.File
import nl.lumc.sasc.biopet.core.ToolCommandFunction
import nl.lumc.sasc.biopet.core.summary.Summarizable
import nl.lumc.sasc.biopet.utils.ConfigUtils
import nl.lumc.sasc.biopet.utils.config.Configurable
import org.broadinstitute.gatk.utils.commandline.{ Argument, Output, Input }
/**
* SimpleTool function class for usage in Biopet pipelines
*
* @param root Configuration object for the pipeline
*/
class SimpleTool(val root: Configurable) extends ToolCommandFunction with Summarizable {
  def toolObject = nl.lumc.sasc.biopet.tools.SimpleTool
  @Input(doc = "Input file to count lines from", shortName = "input", required = true)
  var input: File = _
  @Output(doc = "Output JSON", shortName = "output", required = true)
  var output: File = _
  override def defaultCoreMemory = 1.0
  // Build the command line from the input and output fields
  override def cmdLine = super.cmdLine +
    required("-i", input) +
    required("-o", output)
  def summaryStats: Map[String, Any] = {
    ConfigUtils.fileToConfigMap(output)
  }
  def summaryFiles: Map[String, File] = Map(
    "simpletool" -> output
  )
}
object SimpleTool {
  // `output` is the directory in which the JSON file is created
  def apply(root: Configurable, input: File, output: File): SimpleTool = {
    val tool = new SimpleTool(root)
    // The class defines `input`; there is no `inputReport` field
    tool.input = input
    tool.output = new File(output, input.getName.substring(0, input.getName.lastIndexOf(".")) + ".simpletool.json")
    tool
  }
  def apply(root: Configurable, input: File, outDir: String): SimpleTool = {
    val tool = new SimpleTool(root)
    tool.input = input
    tool.output = new File(outDir, input.getName.substring(0, input.getName.lastIndexOf(".")) + ".simpletool.json")
    tool
  }
}
```
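Both `apply` methods derive the output name by cutting the input name at its last dot, which throws a `StringIndexOutOfBoundsException` when the input file has no extension. A possible guard is sketched below; `OutputNameExample` and `outputFor` are hypothetical helpers for illustration and not part of Biopet.

```scala
import java.io.File

object OutputNameExample {
  // Strips the last extension, if there is one, and appends the suffix.
  def outputFor(outDir: File, input: File, suffix: String = ".simpletool.json"): File = {
    val name = input.getName
    val dot = name.lastIndexOf('.')
    // dot <= 0 means "no extension" or a dot-file such as ".hidden"; keep the full name then.
    val base = if (dot > 0) name.substring(0, dot) else name
    new File(outDir, base + suffix)
  }

  def main(args: Array[String]): Unit = {
    println(outputFor(new File("out"), new File("reads.txt")).getName)
    println(outputFor(new File("out"), new File("reads")).getName)
  }
}
```

Both calls in `main` yield `reads.simpletool.json`, so inputs with and without an extension are handled the same way.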
### Summary setup (for reporting results to JSON)
@@ -32,8 +32,8 @@ mvn -DskipTests=true clean install
### Basic components
### Qscript (pipeline)
A basic pipeline would look like this. [Extended example](example-pipeline.md)
```scala
package org.example.group.pipelines
@@ -73,7 +73,7 @@ class SimplePipeline(val root: Configurable) extends QScript with BiopetQScript
object SimplePipeline extends PipelineCommand
```
### Extensions (wrappers)
Wrappers have to be written for each tool used inside the pipeline. A basic wrapper (this example wraps the Linux `cat` command) would look like this:
```scala
package nl.lumc.sasc.biopet.extensions
@@ -101,9 +101,12 @@ class Cat(val root: Configurable) extends BiopetCommandLineFunction {
}
```
### Tools (Scala programs)
Within the Biopet framework it is also possible to write your own tools in Scala.
When a certain functionality or script is not part of the framework, one can write a tool that does the job.
Below you can see an example tool which is written for automatically building sample configs.
[Extended example](example-tool.md)
```scala
package nl.lumc.sasc.biopet.tools
......
@@ -40,6 +40,10 @@ pages:
- Developer:
- Getting Started: 'developer/getting-started.md'
- Code Style: 'developer/code-style.md'
- Example pipeline: 'developer/example-pipeline.md'
- Example tool: 'developer/example-tool.md'
- Example reporting: 'developer/example-reporting.md'
- Example pipeable: 'developer/example-pipeable.md'
- Scala docs:
- 0.4.0: 'developer/code-style.md'
#- ['developing/Setup.md', 'Developing', 'Setting up your local development environment']