1. Introduction

1.1. Introduction

Collocated genomic intervals indicate biological association. Overlap analysis is therefore widely used to quality-check (QC), integrate, and impute the function of genomic intervals.

The conventional approach to measuring the “overlap between genomic intervals” relies on arbitrary thresholds to decide how many genomic regions count as overlapping, which leads to biased, non-reproducible, and incomparable results. Specifically,

  • The result derived from this threshold-and-count approach is non-reproducible and incomparable, as different thresholds produce different results.

  • The overlap between two intervals is a continuous variable, whereas the thresholded approach reduces it to a binary variable. Casting one-dimensional intervals as zero-dimensional points discards the information and sensitivity needed to accurately evaluate the collocation strength.

  • The absolute and relative counts are biased by the size and the total number of intervals.
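
The threshold dependence described in the points above can be illustrated with a minimal sketch. It is plain Python and does not use cobind; the helper names (`overlap_bp`, `count_overlaps`), the toy intervals, and the choice of "fraction of the first interval's length" as the threshold are all assumptions made for illustration only.

```python
def overlap_bp(a, b):
    """Base pairs shared by two half-open intervals given as (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def count_overlaps(set_a, set_b, min_fraction):
    """Count intervals in set_a whose overlap with some interval in set_b
    covers at least min_fraction of their own length (hypothetical rule)."""
    count = 0
    for a in set_a:
        length = a[1] - a[0]
        if any(overlap_bp(a, b) / length >= min_fraction for b in set_b):
            count += 1
    return count


# Toy data (assumed): the three pairs overlap by 20%, 50%, and 100%.
set_a = [(100, 200), (300, 400), (500, 600)]
set_b = [(180, 280), (350, 450), (500, 600)]

# The same data yields a different "number of overlaps" per threshold.
counts = {t: count_overlaps(set_a, set_b, t) for t in (0.1, 0.5, 1.0)}
print(counts)  # {0.1: 3, 0.5: 2, 1.0: 1}
```

In this toy case the reported count ranges from 3 down to 1 depending only on the threshold, while the underlying overlaps (20, 50, and 100 bp) never change.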

To address these limitations, cobind offers six threshold-free metrics that rigorously quantify the strength of genomic overlap. These metrics aim to provide more reliable and comparable results without arbitrary thresholds.