At Naturalis we are now Illumina sequencing genomes from 150+ year old herbarium specimens. One of the things about this old DNA is that it fragments in a predictable pattern: the strand breaks just after a purine. This means that when we do NGS of these genomes and map the short reads against a reference, we should see a compositional bias (an excess of purines) in the reference one base upstream of where each read maps. There are tools to compute and visualize this bias across an entire chromosome, but not across an interval (which is what we'd like), so I took the opportunity to play around with the SAMtools Perl API.
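
To make the idea concrete, here is a minimal sketch in Python (not the Perl code this post is about) of the tally we are after: for every read overlapping an interval, look up the reference base immediately 5' of the read and count it. The function name `upstream_base_counts`, the `(start, end, is_reverse)` tuple layout, and the 0-based half-open coordinates are all assumptions made for illustration; in practice the read positions would come out of a BAM file rather than a hand-written list.

```python
from collections import Counter

# Complement table, used for reads mapped to the reverse strand
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def upstream_base_counts(reference, reads, interval_start, interval_end):
    """Tally the reference base immediately 5' of each read in an interval.

    reference      -- reference sequence as a string (0-based)
    reads          -- iterable of (start, end, is_reverse), 0-based half-open
                      (hypothetical layout, chosen for this sketch)
    interval_start -- 0-based start of the interval of interest
    interval_end   -- exclusive end of the interval of interest
    """
    counts = Counter()
    for start, end, is_reverse in reads:
        # Skip reads that do not overlap the interval at all
        if end <= interval_start or start >= interval_end:
            continue
        if is_reverse:
            # The read's 5' end sits at the right-hand side of the alignment,
            # so the upstream base is the complement of reference[end]
            pos = end
            if pos >= len(reference):
                continue
            base = COMPLEMENT.get(reference[pos].upper(), "N")
        else:
            # Forward strand: the upstream base is just left of the start
            pos = start - 1
            if pos < 0:
                continue
            base = reference[pos].upper()
        counts[base] += 1
    return counts


# Toy data: a 10 bp reference and four made-up alignments
reference = "ACGTAGGCTA"
reads = [(3, 7, False), (5, 9, False), (1, 4, True), (0, 3, False)]
counts = upstream_base_counts(reference, reads, 0, 10)

# Fraction of upstream bases that are purines (A or G)
total = sum(counts.values())
purine_fraction = (counts["A"] + counts["G"]) / total if total else 0.0
```

If the fragmentation pattern holds, `purine_fraction` computed over real alignments should come out noticeably higher than the purine content of the reference as a whole. Reads whose upstream base falls off the end of the reference are simply skipped here, which is one of several simplifications.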