Commit a8ececf

modified the code structure
Signed-off-by: AdityaPandeyCN <[email protected]>
1 parent 03f5ea5 commit a8ececf

1 file changed: +16 -12 lines changed

_posts/2025-05-12-using-root-for-genome-sequencing.md

Lines changed: 16 additions & 12 deletions
@@ -50,25 +50,29 @@ Create advanced file splitting strategies (by chromosome, region, or read group)
 Implement high-performance query tools leveraging RNTuple's columnar structure
 
 ### Compression Strategy Analysis
+
 A key component of this project involves analyzing the compression techniques used by Samtools/HTSlib and comparing them with ROOT's compression capabilities:
-BGZF (Blocked GZIP Format) in BAM Files
 
-I'll study the 64KB block architecture that enables random access while maintaining gzip compatibility
-Test the nine compression levels (1-9) to determine optimal settings for genomic data
-Analyze the multi-threading implementation for parallel compression/decompression
+#### BGZF (Blocked GZIP Format) in BAM Files
+
+- I'll study the 64KB block architecture that enables random access while maintaining gzip compatibility
+- Test the nine compression levels (1-9) to determine optimal settings for genomic data
+- Analyze the multi-threading implementation for parallel compression/decompression
+
+#### CRAM Advanced Codecs
 
-CRAM Advanced Codecs
+- Investigate rANS (Asymmetric Numeral Systems) implementations
+- Examine CRAM transforms including interleaving, RLE, bit-packing, and striped encoding
+- Analyze integration techniques for external codecs like bzip2 and LZMA
 
-Investigate rANS (Asymmetric Numeral Systems) implementations
-Examine CRAM transforms including interleaving, RLE, bit-packing, and striped encoding
-Analyze integration techniques for external codecs like bzip2 and LZMA
+#### Implementation Strategy
 
 The findings from this analysis will inform the implementation of:
 
-Codec library integration with HTSlib's compression libraries where possible
-ROOT-native implementations of key algorithms where direct integration isn't possible
-Reference-based compression similar to CRAM
-Adaptive selection of optimal compression methods based on data characteristics
+- Codec library integration with HTSlib's compression libraries where possible
+- ROOT-native implementations of key algorithms where direct integration isn't possible
+- Reference-based compression similar to CRAM
+- Adaptive selection of optimal compression methods based on data characteristics
 
 ### Why RNTuple for Genomics?
 RNTuple is ROOT's successor to TTree columnar data storage, offering several advantages for genomic data:
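
The BGZF items in the diff above mention 64KB blocks and the nine compression levels. A minimal sketch of how such a level comparison might be set up is shown here, using Python's standard `zlib` module as a stand-in. It is not HTSlib's implementation: the names `BLOCK_SIZE`, `compress_in_blocks`, and `compare_levels` are hypothetical, the input is synthetic, and the real BGZF container (gzip headers with block-size metadata) is not reproduced. It only illustrates that compressing fixed-size chunks independently lets any one chunk be decoded without the others.

```python
import zlib

# Illustrative chunk size echoing the 64KB blocks mentioned in the post.
# This is an assumption for the sketch, not a BGZF-conformant block layout.
BLOCK_SIZE = 64 * 1024


def compress_in_blocks(data: bytes, level: int) -> list[bytes]:
    """Compress each fixed-size chunk independently so any chunk decodes alone."""
    return [
        zlib.compress(data[i:i + BLOCK_SIZE], level)
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def compare_levels(data: bytes) -> dict[int, int]:
    """Map each zlib level (1-9) to the total compressed size in bytes."""
    return {
        level: sum(len(block) for block in compress_in_blocks(data, level))
        for level in range(1, 10)
    }


if __name__ == "__main__":
    # Synthetic, made-up input standing in for sequence data.
    sample = b"ACGTTGCAAGCT" * 50_000
    for level, size in compare_levels(sample).items():
        print(f"level {level}: {size} bytes")
```

On real genomic data the size and speed trade-off between levels would need to be measured rather than assumed; the sketch reports sizes only and does no timing.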
