- 22 Sep, 2015 1 commit
-
-
jhoogenboom authored
Will complete this when updating to Vega 2.2.5 or newer, which contains a new feature that I contributed specifically for this.
-
- 10 Sep, 2015 1 commit
-
-
jhoogenboom authored
* Properly implemented the options on the StuttermodelVis HTML visualisation.
* Added filtering options for marker and repeat unit to StuttermodelVis.
* Added StuttermodelVis to the Vis tool.

General visualisation changes:
* Updated Vega to v2.2.4.
* Fixed glitch that caused mouseover events in HTML visualisations to stop working after the renderer was switched.
* The file name suggested by the Save Image link in HTML visualisations is now derived from the name of the loaded data file.
-
- 04 Sep, 2015 1 commit
-
-
jhoogenboom authored
And thereby removed a dirty workaround for a bug I found.
-
- 03 Sep, 2015 1 commit
-
-
jhoogenboom authored
* Added StuttermodelVis HTML file and JSON spec. The rendering works, but some of the options are not implemented yet. It is also not yet added to the Vis tool.
* Changed the order of stuttermodel's coefficients: 'a' used to be the most significant coefficient, now it is the least significant coefficient (the shift). The benefit of this is that when moving to higher-order polynomials, the extra coefficients do not change the meaning of the others. So 'a' is now always the shift, 'b' is the linear component, 'c' the quadratic, etc.
* Added some development notes (including todo list) that I had kept outside of the project until now.
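
The new coefficient order can be illustrated with a minimal sketch. The function name `evaluate_stutter_fit` and the example coefficients are hypothetical and not taken from FDSTools; the sketch only assumes that coefficients are stored least significant first, as described above.

```python
def evaluate_stutter_fit(coefficients, repeat_length):
    """Evaluate a polynomial with coefficients ordered a, b, c, ...

    coefficients[0] is 'a' (the shift), coefficients[1] is 'b' (linear),
    coefficients[2] is 'c' (quadratic), and so on.
    """
    return sum(coef * repeat_length ** power
               for power, coef in enumerate(coefficients))


# Hypothetical coefficients, for illustration only.
linear = [0.5, 2.0]            # a + b*x
quadratic = [0.5, 2.0, 0.0]    # a + b*x + c*x^2, with c = 0

# Appending a higher-order coefficient leaves the meaning of 'a' and 'b' intact.
print(evaluate_stutter_fit(linear, 10))
print(evaluate_stutter_fit(quadratic, 10))
```

With this ordering, a fit of any order can be read the same way: index 0 is always the shift.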
-
- 01 Sep, 2015 2 commits
-
-
jhoogenboom authored
* BGCorrect and Stuttermark will now exit with an error message if more than one input file for the same sample is specified and no separate output files are given. Previously these tools would just overwrite the output file repeatedly, discarding the output of all but the last data file of the sample.
* Removed the main() functions and related stubs from the tools because they are not actually runnable directly anyway.
* Added some more help text to some of the tools.
* Doubled the size of the marker name filter input element on the HTML visualisations.
-
jhoogenboom authored
Fixed:
* Fixed crash that would occur when an empty sequence (primer dimer) is converted from raw to TSSV-style (or allelename) format.
* Fixed bug in BGHomRaw that caused incorrect sample tags in the output.
* Fixed bug that caused allele names with negative CE numbers and names of primer dimers to be regarded as 'invalid allele names' even though FDSTools generated those names itself.
* Fixed crash when reading sample data while looking for an annotation column.
* Fixed bug in Allelefinder resulting in the complete absence of output that occurred when a column name with Stuttermark output was specified.

Changed:
* Restyled the Options box on HTML visualisations. It is now less transparent and oriented more vertically to reduce overlap with the visualisation. Options are now presented in groups.
* Updated Vega to version 2.2.1.

New:
* Added *_corrected columns to BGCorrect output for convenience. E.g., the total_corrected column contains the value of total-total_noise+total_add.
* Added -L/--log-scale option to the Vis tool.
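
The arithmetic behind the *_corrected columns can be sketched as follows. The helper `add_corrected_columns` and the read counts are hypothetical; only the formula total-total_noise+total_add comes from the commit message, and applying it to other column prefixes is an assumption.

```python
def add_corrected_columns(row, prefixes=("total",)):
    """Return a copy of row with <prefix>_corrected columns added.

    Each value is computed as <prefix> - <prefix>_noise + <prefix>_add.
    """
    result = dict(row)
    for prefix in prefixes:
        result[prefix + "_corrected"] = (
            row[prefix] - row[prefix + "_noise"] + row[prefix + "_add"])
    return result


# Hypothetical read counts, for illustration only.
row = {"total": 1000, "total_noise": 120, "total_add": 35}
print(add_corrected_columns(row)["total_corrected"])  # 915
```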
-
- 26 Aug, 2015 1 commit
-
-
jhoogenboom authored
* Added new visualisation BGRawVis to the Vis tool. It visualises BGHomRaw output data.
* Now using more reliable linear X axis label formatting in Profilevis.
* Changed filtering operands in Profilevis and Samplevis from > to >=.
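
The practical effect of switching the filtering operand from > to >= is that values exactly equal to the threshold are now kept. A minimal sketch, using a hypothetical `passes_filter` helper and made-up read counts (the actual filtering lives in the Vega specs, not in Python):

```python
def passes_filter(value, threshold, inclusive=True):
    """Return True if value passes the threshold.

    inclusive=True mimics the new '>=' behaviour, False the old '>'.
    """
    return value >= threshold if inclusive else value > threshold


read_counts = [5, 10, 25]  # hypothetical values
print([n for n in read_counts if passes_filter(n, 10)])         # [10, 25]
print([n for n in read_counts if passes_filter(n, 10, False)])  # [25]
```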
-
- 25 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool BGHomRaw computes noise ratios for all detected noise in all homozygous reference samples. The idea is to plot this data in a visualisation that will be added later.
-
- 24 Aug, 2015 1 commit
-
-
jhoogenboom authored
* Added options for the graph width and filtering on marker name to Samplevis and Profilevis.
* The text fields in the HTML versions of Samplevis and Profilevis now update the graph OnChange instead of OnKeyUp. This is done because rendering the graph takes a while with large data files.
* Fixed glitch in Profilevis that caused useless horizontal axis labels when the logarithmic scale is used.
* Fixed glitch in Profilevis that caused Vega to render the graph even before data was loaded.
* Changed -R option of SeqConvert to -r to avoid a potential collision with the -R/--report option if SeqConvert ever gets report output support in the future.
-
- 21 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New visualisation Profilevis added to the package, but not yet to the Vis tool.
* The Vis tool now prints a helpful error message if no output file was specified, instead of printing half a megabyte of HTML and minified JavaScript to the terminal.
* Fixed crash that occurred when attempting to convert the sequence of an alias to its allele name.
* Fixed various bugs in the functions that convert sequences to TSSV-style and allele names. Only the conversion of non-matching sequences was affected.
* Added "max_expected_copies" section to the FDSTools library format. The default value is 2. Allelefinder will now use these as the maximum number of alleles per marker if the -a/--max-alleles option is not specified.
* The section headers in the FDSTools library format are now case insensitive.
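
The fallback order for the maximum number of alleles per marker could look roughly like this. The function `max_alleles_for_marker`, its parameters, and the marker names are hypothetical; only the default of 2 and the precedence of -a/--max-alleles are taken from the commit message.

```python
DEFAULT_MAX_EXPECTED_COPIES = 2  # default value stated in the commit message


def max_alleles_for_marker(marker, max_expected_copies, cli_max_alleles=None):
    """Pick the maximum number of alleles to report for a marker.

    cli_max_alleles stands in for the -a/--max-alleles option; when it is
    not given, the library's max_expected_copies value is used instead.
    """
    if cli_max_alleles is not None:
        return cli_max_alleles
    return max_expected_copies.get(marker, DEFAULT_MAX_EXPECTED_COPIES)


# Hypothetical library section contents, for illustration only.
library_copies = {"MarkerA": 1}
print(max_alleles_for_marker("MarkerA", library_copies))     # 1
print(max_alleles_for_marker("MarkerB", library_copies))     # 2
print(max_alleles_for_marker("MarkerA", library_copies, 4))  # 4
```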
-
- 18 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool Vis creates an interactive visualisation in HTML format, or a bare Vega graph spec (JSON format). The user can choose to supply a data file that will be embedded in the visualisation file. If no data file is given, the HTML visualisation will offer a file selection element, or the bare JSON output will refer to a file called 'data.csv'.
* Changes to Samplevis:
* The Options box can now be opened/closed.
* Added options to change the width of the bars and the space between subgraphs (markers).
* Added options to filter by read count or percentage vs the highest allele of the marker.
* Replaced deprecated 'zip' data transforms in the Vega spec with the new 'lookup' transform.
* Updated bundled Vega to v2.1.1.
-
- 14 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New visualisation Samplevis visualises sample data files. (Note: visualisations are currently stored in the package, but are not available via FDSTools commands yet. A new tool is going to be introduced later, which will copy the visualisation files to a user-selected folder.)
* Including the current versions of Vega and D3 for completeness.
* Fixed missing numpy dependency in setup.py.
* Clarified some option help texts in Allelefinder based on feedback by Rick and Kris.
-
- 12 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool BGMerge can be used to merge background noise profiles (e.g., merge BGPredict output with a database previously obtained from BGEstimate).
* Fixed two major bugs in BGPredict that resulted in incorrect fit functions being used.
* BGEstimate, BGPredict, BGHomStats, Blame, and StutterModel no longer crash if a library file is specified.
* Added reverse strand profile estimation to BGPredict.
-
- 11 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool BGPredict predicts background noise profiles (containing only stutter products) for user-supplied alleles/sequences using a trained stutter model obtained from Stuttermodel. Currently only the amounts of the forward strand are predicted.
* New option -L/--min-lengths for Stuttermodel sets the minimum required number of unique repeat lengths to base the fits on (default: 5).
* Updated formatting of output of Stuttermodel: added '+' sign to positive stutter, limited r2 scores to 3 decimal places, and now all coefficients are written in scientific notation with 3 decimal places.
* The --output-column option of SeqConvert now defaults to using the value of --allele-column.
-
- 10 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool StutterModel fits polynomials to stutter ratio vs repeat length.
* Changed -R to -Q (--limit-reads) so that I can reassign -R to an option that is used more often.
* Changed -r to -R (--report) to make sure it will not collide with the -r option in Stuttermark, if I ever want to add report output to Stuttermark.
* BGHomStats now checks whether all alleles are detected
-
- 07 Aug, 2015 1 commit
-
-
jhoogenboom authored
* All tools now write to stdout by default. Tools that support writing report files write those to stderr by default. The -o/--output and -r/--report options can be used to override these.
* Tools that operated on one sample at a time (bgcorrect, seqconvert, stuttermark) now support batch processing. The new -i/--input argument takes a list of files. In batch mode, the -o/--output argument can be used to specify a list of corresponding output files (which must be the same length). It is also possible to specify a format string to automatically generate file names. -o/--output defaults to "\1-\2.out" which is automatically expanded to "sampletag-toolname.out". The old positional arguments [IN] and [OUT] are maintained and allow for conveniently running the tools on a single sample file. [IN] is mutually exclusive with -i/--input and [OUT] is mutually exclusive with -o/--output. [OUT] now also accepts the filename format, but when not in batch mode, it still defaults to stdout. Note that by default, the sample tag is extracted from the input filenames by simply stripping the extension. This means a minimal batch processing command like "fdstools stuttermark -i *.csv" automatically creates a "...-stuttermark.out" file next to each CSV file in the current working directory.
* Libconvert now also supports only specifying an output file. This makes it easier to write the default FDSTools library to a new file. E.g., "fdstools libconvert mynewfile.txt" now creates "mynewfile.txt" if it does not exist, and writes the default library to it. Most helpful.
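
The expansion of the default output format can be sketched as below. This is an illustration under stated assumptions, not the FDSTools implementation: the function name `expand_output_name` is hypothetical, and the sample tag is taken to be the input path with its extension stripped, as the note above describes.

```python
import os
import re


def expand_output_name(input_path, tool_name, output_format=r"\1-\2.out"):
    """Expand \\1 to the sample tag and \\2 to the tool name."""
    # Assumption: the sample tag is the input path minus its extension.
    sample_tag = os.path.splitext(input_path)[0]
    values = {"1": sample_tag, "2": tool_name}
    # Replace both placeholders in a single pass so that a sample tag
    # containing a backslash sequence is never expanded a second time.
    return re.sub(r"\\([12])", lambda m: values[m.group(1)], output_format)


print(expand_output_name("sample1.csv", "stuttermark"))
# sample1-stuttermark.out
```

Because the directory part of the path is kept in the tag, the generated file ends up next to its input file.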
-
- 06 Aug, 2015 1 commit
-
-
jhoogenboom authored
* All tools now have a longer description in the tool-specific help page.
* Arguments are now presented in groups and the order is the same across tools.

Furthermore:
* Fixed bug that rendered BGHomStats and BGEstimate with the -H option useless.
* The report of Allelefinder and BGEstimate is now written to sys.stderr by default. This means the report is now always generated (but it may be sent directly to /dev/null explicitly by the user). The big plus is that the progress of the tools is visible in the terminal when the tools are run by hand.
-
- 05 Aug, 2015 1 commit
-
-
jhoogenboom authored
-
- 04 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool Blame can be used to find particularly dirty samples and to construct a DNA profile of the contaminator.
* Fixed bug in BGCorrect that resulted in incorrect values in the *_add columns.
* BGEstimate and BGHomStats no longer crash if a library file is provided.
* SeqConvert can now use a different library file for the output, thereby offering some possibilities to update allele names when a library file gets updated.
* Replaced various uses of map() by generator expressions and listcomps for increased readability and (slightly) increased speed.
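
As a generic illustration of the last point (the values are made up and the code is not taken from FDSTools), a map() call with a lambda can usually be rewritten as a list comprehension, or as a generator expression when the result is consumed only once:

```python
values = ["1", "2", "3"]  # hypothetical input

# map() with a lambda: works, but the transformation is harder to read.
doubled_map = list(map(lambda x: int(x) * 2, values))

# Equivalent list comprehension.
doubled_listcomp = [int(x) * 2 for x in values]

# Generator expression: no intermediate list is built for the sum.
total = sum(int(x) * 2 for x in values)

print(doubled_map, doubled_listcomp, total)
```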
-
- 03 Aug, 2015 1 commit
-
-
jhoogenboom authored
* New tool BGHomStats computes statistics (minimum, maximum, mean, and sample variance) of noise ratios in homozygous samples.
* The default BGEstimate output format has been changed to be compatible with that of BGHomStats. The cross-tabular output format is still available as an option because it easily uses 90% less disk space. BGCorrect (and other future tools that use noise profiles) will work with both formats.
* Fixed bug in the --min-samples option of BGEstimate that could cause some alleles with less than the specified number of samples to be included if --drop-samples is used at the same time.
* The user now receives an error message if there are unknown arguments. The error message lists the usage string of the requested tool. (Argparse's default was to print the general FDSTools usage string, which is not helpful.)
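
The four statistics named above can be computed with the Python standard library as in this minimal sketch. The function `noise_ratio_stats` and the example ratios are hypothetical; how BGHomStats itself computes or groups these values is not shown in the commit message.

```python
import statistics


def noise_ratio_stats(ratios):
    """Return minimum, maximum, mean, and sample variance of the ratios.

    statistics.variance() is the sample variance (n - 1 denominator) and
    raises StatisticsError when fewer than two values are given.
    """
    return {
        "min": min(ratios),
        "max": max(ratios),
        "mean": statistics.mean(ratios),
        "variance": statistics.variance(ratios),
    }


# Hypothetical noise ratios (in percent), for illustration only.
print(noise_ratio_stats([1.0, 2.0, 3.0, 4.0]))
```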
-
- 31 Jul, 2015 1 commit
-
-
jhoogenboom authored
* Unknown arguments are now silently ignored. If this results in the tool not being able to run, the usage information of the tool is printed instead of the general fdstools usage.
* Seqconvert no longer crashes on an empty line in the input.
* Libconvert now maintains the order of prefix/suffix sequences.
* Allele names with aliases other than 'X' or 'Y' are now correctly recognised. These were previously rejected as 'unknown format'.
* Fixed bug where a prefix/suffix other than the first listed in the library file was sometimes used as the canonical sequence.
* Sequence format conversion from raw to TSSV-style sequences now attempts to match the prefix, suffix, and STR pattern to non-matching sequences on a best effort basis. This is especially useful when converting to allelenames (which is done via TSSV-style sequences), since it results in an allele name that matches more closely the names of other alleles.
* Generating allele names for sequences that lack a prefix and/or suffix is now supported (by adding a variant description that deletes the entire prefix/suffix).
-
- 30 Jul, 2015 1 commit
-
-
jhoogenboom authored
* Added BGCorrect tool for filtering noise in case samples.
* BGEstimate now writes its output in tab-separated format, instead of JSON.
* Small changes to help output formatting.
-
- 29 Jul, 2015 1 commit
-
-
jhoogenboom authored
I could write about all its features here, but instead I will point out some future plans to highlight the things that are possibly not optimal in their current implementation. There are a number of things I plan to change in the future:
* The output format is currently JSON; perhaps a carefully designed tabular format is a better choice. The benefit of switching to a tabular format is that the data can be loaded into e.g. Excel as well.
* The profiles are currently produced separately for forward and reverse reads. I would prefer to integrate these into a single computation that estimates allele balance in the heterozygotes using both strands as well.
* I would like to add information about strand bias of the alleles as well. The most straightforward way to do this is to set only the forward reads of the true allele to 100 and treat the reverse reads the same as all background products. You will then obtain the number of reverse reads observed for every 100 forward reads of the true allele.
* I think it would be appropriate to make sure the values in the allele balance matrices of each sample ('Ax' in the source code) add up to 1. For homozygotes, it is currently a scalar 1; for heterozygotes, the sum of the elements tends to be more than 1. This means that a heterozygous sample has a stronger influence on the profiles than a homozygous sample.
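
The strand bias idea in the third point boils down to a simple normalisation. The sketch below is an illustration of that arithmetic only; the function `reverse_per_100_forward` and the read counts are hypothetical, and this is a stated future plan rather than existing behaviour.

```python
def reverse_per_100_forward(forward_reads, reverse_reads):
    """Scale reverse reads to the number seen per 100 forward reads."""
    if forward_reads <= 0:
        raise ValueError("forward_reads must be positive")
    return 100.0 * reverse_reads / forward_reads


# Hypothetical read counts of a true allele, for illustration only.
print(reverse_per_100_forward(400, 300))  # 75.0
```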
-
- 27 Jul, 2015 1 commit
-
-
jhoogenboom authored
* Allelefinder can now combine data from multiple files into a single sample (this happens when the same sample tag was extracted from their names).
* Allelefinder can now automatically convert sequences to a given format (this is optional though). This is particularly useful when combining the knownalleles.csv and newalleles.csv files of a sample. (Note that allelefinder still assumes that the files contain different alleles; no attempt is made to check whether the same allele was represented in multiple files.)
-
- 24 Jul, 2015 1 commit
-
-
jhoogenboom authored
* Fixed crash when attempting to read a TSSV library from sys.stdin.
* Various large updates to allelefinder.
* libconvert now gives a useful default FDSTools library when given no input.
-
- 23 Jul, 2015 1 commit
-
-
jhoogenboom authored
* Introducing a new, extended library file format to support allele name generation. The new libconvert tool can convert TSSV libraries to the new format and vice versa.
* Added functions for converting between raw sequences, TSSV-style sequences, and allele names.
* Added global -d/--debug option.

Stuttermark updates:
* Stuttermark now automatically converts input sequences to TSSV-style if a library is provided.
* Stuttermark will no longer crash if there is no 'name' column. Instead, all sequences are taken to belong to the same marker.

New tools:
* libconvert converts between FDSTools and TSSV library formats.
* seqconvert converts between raw sequences, TSSV-style sequences, and allele names.
* allelefinder detects the true alleles in reference samples.
-
- 02 Jul, 2015 2 commits
-
-
jhoogenboom authored
-
jhoogenboom authored
FDSTools v0.0.1 with Stuttermark v1.3. Other tools will come later.
-