4. Input file and data format

4.1. BED format

The BED (Browser Extensible Data) format is commonly used to describe genomic intervals. A standard BED file has up to 12 columns, but cobind requires only the first three (chrom, start, end); all other columns are optional:

# BED3 format (chrom, start, end)
chr1    629149    629391
chr1    629720    630165
chr1    631404    631758
...

# BED4 format (chrom, start, end, name)
chr1    629149    629391   region_1
chr1    629720    630165   region_2
chr1    631404    631758   region_3
...

# BED6 format (chrom, start, end, name, score, strand)
chr1    629149    629391   region_1    0    +
chr1    629720    630165   region_2    0    +
chr1    631404    631758   region_3    0    -
...
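
As an illustration of the column layout above, the following is a minimal sketch of a reader that accepts BED3, BED4, and BED6 lines. It is not cobind's own parser: the names BedRecord and parse_bed are hypothetical, and the sketch assumes whitespace-separated fields, so a name containing spaces would not be handled. It relies on the BED convention that start is 0-based and end is exclusive, which makes the interval length end - start.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class BedRecord(NamedTuple):
    """One BED interval (hypothetical helper type, not part of cobind)."""
    chrom: str
    start: int                    # 0-based, inclusive
    end: int                      # exclusive
    name: Optional[str] = None    # column 4, if present
    score: Optional[str] = None   # column 5, kept as text
    strand: Optional[str] = None  # column 6, if present


def parse_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield a BedRecord for each data line; columns beyond the sixth are ignored."""
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and UCSC "track"/"browser" header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # assumption: fields contain no embedded spaces
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 columns, got: {raw!r}")
        start, end = int(fields[1]), int(fields[2])
        if start > end:
            raise ValueError(f"start is greater than end: {raw!r}")
        yield BedRecord(fields[0], start, end, *fields[3:6])
```

For example, list(parse_bed(open("regions.bed"))) would read a whole file into memory, where regions.bed is a placeholder file name.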

4.2. BED-like format

4.3. bigBed

bigBed is an indexed binary version of the BED format. UCSC’s bedToBigBed and bigBedToBed commands convert BED files to bigBed files and back.

4.4. bigWig

bigWig is an indexed binary version of the wiggle format and is widely used to represent genomic signals. UCSC’s wigToBigWig and bigWigToWig commands convert wiggle files to bigWig files and back.
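
The conversions mentioned in 4.3 and 4.4 can be scripted. The sketch below only builds the argument lists for the four UCSC commands, following their usual positional usage; the function names and all file names are hypothetical, and the UCSC binaries are assumed to be installed and on PATH. Converting to a binary format additionally needs a chromosome sizes file (two columns: chromosome name and length), and bedToBigBed expects its input sorted by chromosome and then by start position.

```python
from typing import List


def bed_to_bigbed_cmd(bed: str, chrom_sizes: str, out_bb: str) -> List[str]:
    """Argument list for: bedToBigBed in.bed chrom.sizes out.bb"""
    return ["bedToBigBed", bed, chrom_sizes, out_bb]


def bigbed_to_bed_cmd(bigbed: str, out_bed: str) -> List[str]:
    """Argument list for: bigBedToBed in.bb out.bed"""
    return ["bigBedToBed", bigbed, out_bed]


def wig_to_bigwig_cmd(wig: str, chrom_sizes: str, out_bw: str) -> List[str]:
    """Argument list for: wigToBigWig in.wig chrom.sizes out.bw"""
    return ["wigToBigWig", wig, chrom_sizes, out_bw]


def bigwig_to_wig_cmd(bigwig: str, out_wig: str) -> List[str]:
    """Argument list for: bigWigToWig in.bw out.wig"""
    return ["bigWigToWig", bigwig, out_wig]
```

Each list can be passed to subprocess.run(cmd, check=True) to run the conversion; passing a list rather than a single shell string avoids quoting problems with unusual file names.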