Commit 7f5d8873 authored by Wai Yi Leung

Adding examples for developers creating pipeline, tool, wrapper and reports.

WIP
parent 2d9184ce
# Developer - Code style
## General rules
- Variable names should alway be in *CamelCase* and does **not** start with a capital letter
- Class names should alway be in *CamelCase* and does **always** start with a capital letter
- Variable names should always be in *CamelCase* and must **not** start with a capital letter
```scala
// correct:
val outputFromProgram: String = "foobar"
// incorrect:
val OutputFromProgram: String = "foobar"
```
- Class names should always be in *CamelCase* and must **always** start with a capital letter
```scala
// correct:
class ExtractReads {}
// incorrect:
class extractReads {}
```
- Avoid using `null`; Scala's `Option` type can be used instead
- If a method/value is designed to be overridden make it a `def` and override it with a `def`, we encourage you to not use `val`
\ No newline at end of file
```scala
// correct:
val inputFile: Option[File] = None
// incorrect:
val inputFile: File = null
```
- If a method/value is designed to be overridden, make it a `def` and override it with a `def`; we encourage you not to use `val`
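
The last rule can be illustrated with a small sketch. The original does not state why `val` is discouraged; one common reason, assumed here, is initialisation order: a parent that reads an overridable member while it is being constructed sees an uninitialised `val`, but a `def` works as expected. The names `Tool`, `DefTool` and `ValTool` are hypothetical and not part of Biopet.

```scala
// Hypothetical names, for illustration only; not part of Biopet.
trait Tool {
  // Designed to be overridden, so it is declared as a def.
  def threads: Int = 1
  // Evaluated while Tool is initialised, before subclass fields are set.
  val description: String = s"runs with $threads thread(s)"
}

// correct: a def override is already usable while Tool initialises
class DefTool extends Tool {
  override def threads: Int = 4
}

// incorrect: a val override is still uninitialised (0) at that point
class ValTool extends Tool {
  override val threads: Int = 4
}

object OverrideExample {
  def main(args: Array[String]): Unit = {
    println(new DefTool().description) // runs with 4 thread(s)
    println(new ValTool().description) // runs with 0 thread(s)
  }
}
```

Overriding with a `def` keeps the value available regardless of the order in which parent and child are initialised.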
## Pipeable commands
# Developer - Example pipeline
### Adding maven project
### Initial pipeline code
### Config setup
### Test pipeline
### Summary output
### Reporting output (opt)
\ No newline at end of file
# Developer - Example pipeline report
### Concept
### Requirements
### Getting started - First page
### How to generate report independent from pipeline
### Branding etc.
# Developer - Example tool
In this tutorial we explain how to create a tool within the biopet-framework. We provide convenient helper methods which can be used in the tool.
We take a line counter as the use case.
### Initial tool code
```scala
package nl.lumc.sasc.biopet.tools
import java.io.{ PrintWriter, File }
import nl.lumc.sasc.biopet.utils.ConfigUtils._
import nl.lumc.sasc.biopet.utils.ToolCommand
import scala.collection.mutable
import scala.io.Source
/**
*/
object SimpleTool extends ToolCommand {
/*
* Main function executes the LineCounter.scala
*/
def main(args: Array[String]): Unit = {
println("This is the SimpleTool");
}
}
```
This is the minimum setup for a tool that compiles and runs; it does not do any useful work yet.
### Program arguments and environment variables
A basic application/tool usually takes arguments to configure and set parameters to be used within the tool.
Biopet provides ``AbstractArgs``, which a case class can extend to store the arguments read from the command line.
```scala
// inputFile is always set by the required -i option; null is only the pre-parse placeholder
case class Args(inputFile: File = null, outputFile: Option[File] = None) extends AbstractArgs
```
The parsed arguments are stored in an instance of ``Args``.
Next, add an option parser that fills it:
```scala
class OptParser extends AbstractOptParser {
head(
s"""
|$commandName - Count lines in a textfile
""".stripMargin)
opt[File]('i', "input") required () unbounded () valueName "<inputFile>" action { (x, c) =>
c.copy(inputFile = x)
} validate {
x => if (x.exists) success else failure("Inputfile not found")
} text "Count lines from this file"
opt[File]('o', "output") unbounded () valueName "<outputFile>" action { (x, c) =>
c.copy(outputFile = Some(x))
} text "File to write output to; if not supplied, output goes to stdout"
}
```
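
The `c.copy(...)` calls in the parser above never mutate the arguments object; each option returns an updated copy. A minimal sketch of that idea in plain Scala, without Biopet or its option parser: the `parse` function and the `ArgsCopySketch` object are hypothetical, hand-rolled stand-ins, and `inputFile` is modelled as an `Option` here only to keep the sketch free of `null`.

```scala
import java.io.File

object ArgsCopySketch {
  // Similar shape to the tutorial's Args, minus the Biopet AbstractArgs parent.
  case class Args(inputFile: Option[File] = None, outputFile: Option[File] = None)

  // Hypothetical hand-rolled parser: each recognised flag returns an updated copy.
  @annotation.tailrec
  def parse(args: List[String], acc: Args = Args()): Args = args match {
    case ("-i" | "--input") :: path :: rest =>
      parse(rest, acc.copy(inputFile = Some(new File(path))))
    case ("-o" | "--output") :: path :: rest =>
      parse(rest, acc.copy(outputFile = Some(new File(path))))
    case Nil => acc
    case unknown :: _ =>
      throw new IllegalArgumentException(s"Unknown argument: $unknown")
  }

  def main(args: Array[String]): Unit = {
    val parsed = parse(List("-i", "reads.txt"))
    println(parsed.inputFile.map(_.getName)) // Some(reads.txt)
    println(parsed.outputFile)               // None
  }
}
```

Because every step produces a new `Args`, the defaults in the case class double as the documentation of what happens when an option is omitted.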
In the end your tool would look like the following:
```scala
package nl.lumc.sasc.biopet.tools
import java.io.{ PrintWriter, File }
import nl.lumc.sasc.biopet.utils.ConfigUtils._
import nl.lumc.sasc.biopet.utils.ToolCommand
import scala.collection.mutable
import scala.io.Source
/**
*/
object SimpleTool extends ToolCommand {
// inputFile is always set by the required -i option; null is only the pre-parse placeholder
case class Args(inputFile: File = null, outputFile: Option[File] = None) extends AbstractArgs
class OptParser extends AbstractOptParser {
head(
s"""
|$commandName - Count lines in a textfile
""".stripMargin)
opt[File]('i', "input") required () unbounded () valueName "<inputFile>" action { (x, c) =>
c.copy(inputFile = x)
} validate {
x => if (x.exists) success else failure("Inputfile not found")
} text "Count lines from this file"
opt[File]('o', "output") unbounded () valueName "<outputFile>" action { (x, c) =>
c.copy(outputFile = Some(x))
} text "File to write output to; if not supplied, output goes to stdout"
}
def countToJSON(inputRaw: File): String = {
val reader = Source.fromFile(inputRaw)
// close the file handle once the lines have been counted
val nLines = try reader.getLines.size finally reader.close()
mapToJson(Map(
"lines" -> nLines,
"input" -> inputRaw
)).spaces2
}
/*
* Main function executes the LineCounter.scala
*/
def main(args: Array[String]): Unit = {
val commandArgs: Args = parseArgs(args)
// use the arguments
val jsonString: String = countToJSON(commandArgs.inputFile)
commandArgs.outputFile match {
case Some(file) =>
val writer = new PrintWriter(file)
writer.println(jsonString)
writer.close()
case _ => println(jsonString)
}
}
}
```
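
The counting and output steps can also be tried out without the Biopet classes. The following is a minimal standalone sketch, assuming only the Scala standard library; `LineCountSketch`, `countLines` and `emit` are hypothetical names, and the JSON formatting done by `mapToJson` in the tool is left out.

```scala
import java.io.{ File, PrintWriter }
import scala.io.Source

object LineCountSketch {
  // Counts lines and always closes the underlying file handle.
  def countLines(input: File): Int = {
    val source = Source.fromFile(input)
    try source.getLines().size
    finally source.close()
  }

  // Writes to the file when one is given, otherwise to stdout.
  def emit(text: String, output: Option[File]): Unit = output match {
    case Some(file) =>
      val writer = new PrintWriter(file)
      try writer.println(text)
      finally writer.close()
    case None => println(text)
  }

  def main(args: Array[String]): Unit = {
    val tmp = File.createTempFile("linecount", ".txt")
    tmp.deleteOnExit()
    emit("a\nb\nc", Some(tmp))
    println(countLines(tmp)) // 3
  }
}
```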
### Running your new tool
### Debugging the tool with IDEA
### Setting up unit tests
### Adding tool-extension for usage in pipeline
To use this tool in a Biopet pipeline, you have to add a wrapper (extension) for it.
The wrapper would look like:
```scala
package nl.lumc.sasc.biopet.extensions.tools
import java.io.File
import nl.lumc.sasc.biopet.core.ToolCommandFunction
import nl.lumc.sasc.biopet.core.summary.Summarizable
import nl.lumc.sasc.biopet.utils.ConfigUtils
import nl.lumc.sasc.biopet.utils.config.Configurable
import org.broadinstitute.gatk.utils.commandline.{ Argument, Output, Input }
/**
* SimpleTool function class for usage in Biopet pipelines
*
* @param root Configuration object for the pipeline
*/
class SimpleTool(val root: Configurable) extends ToolCommandFunction with Summarizable {
def toolObject = nl.lumc.sasc.biopet.tools.SimpleTool
@Input(doc = "Input file to count lines from", shortName = "input", required = true)
var input: File = _
@Output(doc = "Output JSON", shortName = "output", required = true)
var output: File = _
override def defaultCoreMemory = 1.0
override def cmdLine = super.cmdLine +
required("-i", input) +
required("-o", output)
def summaryStats: Map[String, Any] = {
ConfigUtils.fileToConfigMap(output)
}
def summaryFiles: Map[String, File] = Map(
"simpletool" -> output
)
}
object SimpleTool {
def apply(root: Configurable, input: File, output: File): SimpleTool = {
val report = new SimpleTool(root)
report.input = input
report.output = new File(output, input.getName.substring(0, input.getName.lastIndexOf(".")) + ".simpletool.json")
report
}
def apply(root: Configurable, input: File, outDir: String): SimpleTool = {
val report = new SimpleTool(root)
report.input = input
report.output = new File(outDir, input.getName.substring(0, input.getName.lastIndexOf(".")) + ".simpletool.json")
report
}
}
```
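
Both `apply` methods derive the output name with `substring(0, lastIndexOf("."))`, which throws when the input file name contains no dot. A minimal sketch of a more defensive variant follows; `OutputNameSketch` and `outputFor` are hypothetical helpers and not part of Biopet.

```scala
import java.io.File

object OutputNameSketch {
  // Hypothetical helper: strips the last extension, if any, and appends the suffix.
  def outputFor(outDir: File, input: File, suffix: String = ".simpletool.json"): File = {
    val name = input.getName
    val dot = name.lastIndexOf('.')
    // lastIndexOf returns -1 when there is no dot; keep the whole name in that case.
    val base = if (dot > 0) name.substring(0, dot) else name
    new File(outDir, base + suffix)
  }

  def main(args: Array[String]): Unit = {
    val outDir = new File("results")
    println(outputFor(outDir, new File("sample.fastq")).getName) // sample.simpletool.json
    println(outputFor(outDir, new File("README")).getName)       // README.simpletool.json
  }
}
```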
### Summary setup (for reporting results to JSON)
......@@ -32,8 +32,8 @@ mvn -DskipTests=true clean install
### Basic components
#### Qscript (pipeline)
A basic pipeline would look like this.
### Qscript (pipeline)
A basic pipeline would look like this. [Extended example](example-pipeline.md)
```scala
package org.example.group.pipelines
......@@ -73,7 +73,7 @@ class SimplePipeline(val root: Configurable) extends QScript with BiopetQScript
object SimplePipeline extends PipelineCommand
```
#### Extensions (wrappers)
### Extensions (wrappers)
Wrappers have to be written for each tool used inside the pipeline. A basic wrapper (example wraps the linux ```cat``` command) would look like this:
```scala
package nl.lumc.sasc.biopet.extensions
......@@ -101,9 +101,12 @@ class Cat(val root: Configurable) extends BiopetCommandLineFunction {
}
```
#### Tools (Scala programs)
Within the Biopet framework it is also possible to write your own tools in Scala. If a give functionality or script is not incorporated within the framework
one can write a tool that does the job. Below you can see an example tool which is written for automatically building sample configs.
### Tools (Scala programs)
Within the Biopet framework it is also possible to write your own tools in Scala.
When a certain functionality or script is not incorporated within the framework one can write a tool that does the job.
Below you can see an example tool which is written for automatically building sample configs.
[Extended example](example-tool.md)
```scala
package nl.lumc.sasc.biopet.tools
......
......@@ -40,6 +40,10 @@ pages:
- Developer:
- Getting Started: 'developer/getting-started.md'
- Code Style: 'developer/code-style.md'
- Example pipeline: 'developer/example-pipeline.md'
- Example tool: 'developer/example-tool.md'
- Example reporting: 'developer/example-reporting.md'
- Example pipeable: 'developer/example-pipeable.md'
- Scala docs:
- 0.4.0: 'developer/code-style.md'
#- ['developing/Setup.md', 'Developing', 'Setting up your local development environment']
......