Commit ed71984b authored by Sander Bollen

some docs on split_genome

parent 09583c2a
1 merge request: !2 Review comments
......@@ -45,6 +45,14 @@ BASE_REFFLATS = [basename(x) for x in BEDS]
def split_genome(ref, approx_n_chunks=100):
    """
    Split genome in chunks.
    Chunks are strings in the format: `<ctg>:<start>-<end>`
    These follow the region string format as used by htslib,
    which uses _1_-based indexing.
    See: http://www.htslib.org/doc/tabix.html
    """
    fa = Fasta(ref)
    tot_size = sum([len(x) for x in fa.records.values()])
    chunk_size = tot_size//approx_n_chunks
......
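The remainder of `split_genome` is elided in this diff, so the following is only a minimal sketch of how the documented `<ctg>:<start>-<end>` strings could be produced with 1-based, inclusive coordinates. The helper name `chunk_regions` and its `contig_lengths` argument (a plain mapping of contig name to length, standing in for the `Fasta` records) are hypothetical and not part of the commit; the `max(1, ...)` guard is likewise an assumption, added so that a genome shorter than `approx_n_chunks` does not yield a chunk size of 0.

```python
def chunk_regions(contig_lengths, approx_n_chunks=100):
    """Hypothetical helper: build 1-based, inclusive htslib-style region strings.

    contig_lengths maps contig name -> length in bases (an assumed stand-in
    for the lengths read from the Fasta records in the original function).
    """
    tot_size = sum(contig_lengths.values())
    # Assumption: never let the chunk size drop to 0 for very small genomes.
    chunk_size = max(1, tot_size // approx_n_chunks)

    regions = []
    for ctg, length in contig_lengths.items():
        # Start at 1 because htslib region strings are 1-based;
        # both start and end are inclusive.
        for start in range(1, length + 1, chunk_size):
            end = min(start + chunk_size - 1, length)
            regions.append(f"{ctg}:{start}-{end}")
    return regions


regions = chunk_regions({"chr1": 10, "chr2": 5}, approx_n_chunks=3)
print(regions)  # ['chr1:1-5', 'chr1:6-10', 'chr2:1-5']
```

Because chunks never span contig boundaries in this sketch, the number of regions returned only approximates `approx_n_chunks`, which matches the "approx" in the parameter name.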