Download core data of CohesinDB

The processed data of each study can be downloaded via the Data page.

All cohesin binding sites with annotations

751,590 cohesin binding sites identified from ChIP-seq samples in CohesinDB, with annotations of epigenome, 3D genome, transcriptome, and cis-regulatory modules.

660 Mb (108 Mb compressed) file in .tsv format:
  1. Basic-Chrom: Chromosome of cohesin binding sites.
  2. Basic-Start: Start genomic loci of cohesin binding sites (hg38).
  3. Basic-End: End genomic loci of cohesin binding sites (hg38).
  4. Basic-Study: The studies that are used to identify a cohesin site (ENCODE, 4DN or GEO).
  5. Basic-CellType: The cell types that are used to identify this cohesin site.
  6. Category-CellSpecificity: A score that measures the cell specificity.
  7. Category-Subunit: The involved cohesin subunits.
  8. Category-CTCF: Whether this cohesin site overlaps with CTCF.
  9. Category-Location: (Ensembl) Intra: Intragenic; Inter: Intergenic; TSS: transcription start sites; TES: transcription end sites.
  10. 3Dgenome-Boundary: Whether this cohesin site overlaps with TAD boundary.
  11. 3Dgenome-Hub: Whether this cohesin site overlaps with chromatin hubs.
  12. 3Dgenome-HiCloop: Whether this cohesin site overlaps with Hi-C loops.
  13. 3Dgenome-HiChIPloop: Whether this cohesin site overlaps with Hi-ChIP loops.
  14. 3Dgenome-ChIAloop: Whether this cohesin site overlaps with ChIA-PET loops.
  15. Cis-Enhancer: Whether this cohesin site overlaps with enhancers (defined in the Fantom5 database).
  16. Cis-CobindTF: The transcription factors that may bind at this cohesin site. TF information is from the ReMAP2020 database.
  17. Cis-TargetGeneName: The candidate cohesin regulated gene names that are evidenced by both "cohesin DEGs" and "cohesin loops".
  18. Cis-TargetGeneID: The candidate cohesin regulated gene IDs that are evidenced by both "cohesin DEGs" and "cohesin loops".
  19. Function-SNP: The SNPs (from GWAS catalog) on this cohesin site. "." represents no SNPs on this site.
  20. Function-CodeMutation: Number of mutation events that are defined in COSMIC database.
  21. Function-NoncodeMutation: Number of noncoding mutation events that are defined in COSMIC database.
  22. Addition-CTCFmotif: Whether this cohesin site has a CTCF motif.
  23. Addition-SuperEnhancer: Whether this cohesin site overlaps with super-enhancers (defined in the SEdb database).
  24. Addition-CompartA: Percentage of Hi-C samples in which this site falls in Compartment A.
  25. Addition-top1HMMname: The name of the most overlapped chromatin state (Roadmap ChromHMM project).
  26. Addition-top1HMMper: The percentage of overlap with the most overlapped chromatin state (Roadmap ChromHMM project).
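
Because the table is large, it can help to stream it row by row instead of loading it all at once. The following is a minimal sketch using only the Python standard library. It assumes that the file has a header row whose names match the column list above, that coordinates can be treated as half-open intervals, and that chromosome names in the file match whatever you pass in; none of these are confirmed here, so check the first lines of the file. The function name `sites_overlapping` is hypothetical.

```python
import csv


def sites_overlapping(handle, chrom, start, end):
    """Yield rows (as dicts) whose cohesin site overlaps [start, end) on chrom.

    Assumes a tab-separated file with a header row containing the
    Basic-Chrom, Basic-Start and Basic-End columns described above.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row["Basic-Chrom"] != chrom:
            continue
        site_start = int(row["Basic-Start"])
        site_end = int(row["Basic-End"])
        # Two intervals overlap when each one starts before the other ends.
        if site_start < end and start < site_end:
            yield row
```

To use it, open the decompressed .tsv file with `open(path, newline="")` and pass the handle to `sites_overlapping` together with a chromosome name and a coordinate range; `path` stands for wherever you saved the download.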

All cohesin-mediated chromatin loops with annotations

957,868 cohesin-mediated interactions identified from ChIA-PET, Hi-ChIP and Hi-C.

101 Mb (19 Mb compressed) file in .tsv format:
  1. Chrom1: Chromosome of cohesin loop anchor 1.
  2. Start1: Start genomic loci of cohesin loop anchor 1 (hg38).
  3. End1: End genomic loci of cohesin loop anchor 1 (hg38).
  4. Chrom2: Chromosome of cohesin loop anchor 2.
  5. Start2: Start genomic loci of cohesin loop anchor 2 (hg38).
  6. End2: End genomic loci of cohesin loop anchor 2 (hg38).
  7. Assay: The assay type (Hi-C, Hi-ChIP or ChIA-PET) involved.
  8. Subunit: The cohesin subunits involved.
  9. Celltype: The cell type involved.
  10. Study: The studies that are used to identify a cohesin loop (ENCODE, 4DN or GEO).
  11. Looplength: The genomic length of a loop (kb).
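
A common question for this table is which loops have an anchor covering a given position. The sketch below shows one way to answer it with the standard library. It assumes a header row with the column names listed above and treats each anchor as a half-open interval; the function name `loops_anchored_at` is hypothetical.

```python
import csv


def loops_anchored_at(handle, chrom, position):
    """Yield loops for which anchor 1 or anchor 2 contains the position.

    Assumes a tab-separated file with a header row containing
    Chrom1/Start1/End1 and Chrom2/Start2/End2 as described above.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        for anchor in ("1", "2"):
            same_chrom = row["Chrom" + anchor] == chrom
            inside = int(row["Start" + anchor]) <= position < int(row["End" + anchor])
            if same_chrom and inside:
                yield row
                break  # report each loop once, even if both anchors match
```

Each yielded row still carries the Assay, Subunit, Celltype, Study and Looplength fields, so the result can be filtered further without re-reading the file.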

All genes annotated by cohesin and CRMs

60,607 genes with the annotations of cohesin and cis-regulatory modules.

110 Mb (27 Mb compressed) file in .tsv format:
  1. ID: Ensembl gene ID.
  2. Symbol: HUGO gene Symbol.
  3. Chrom: Chromosome of the gene.
  4. Start: Start genomic loci of the gene (hg38).
  5. End: End genomic loci of the gene (hg38).
  6. Strand: Strand of the gene, i.e., + or -.
  7. GeneType: Protein coding genes or other genes.
  8. Double-Whether: Whether a gene is regulated by cohesin as evidenced by cohesin loop and cohesin DEG.
  9. Double-RegulatoryCohesin: Regulatory cohesin sites for this gene. "." if the gene is not related to cohesin CRM.
  10. Interaction-Whether: Whether a gene is connected to cohesin sites via chromatin loops.
  11. Interaction-Type: The types of loops (e.g., Hi-C or ChIA-PET). "Direct" means that cohesin sites are located on the gene body.
  12. Interaction-Study: The studies involved in the cohesin-loop evidence.
  13. Interaction-Subunit: The cohesin subunits involved in the cohesin-loop evidence.
  14. DEG-Whether: Whether a gene is related to cohesin-DEG evidence.
  15. DEG-NunmerOfStudy: Number of studies that have identified this gene as a cohesin DEG.
  16. DEG-Study: The studies involved in the cohesin-DEG evidence.
  17. DEG-Subunit: The cohesin subunits involved in the cohesin-DEG evidence.
  18. Corraltion-Whether: (Deprecated) Whether a gene is co-expressed with cohesin genes.
  19. Correlation-Rho: (Deprecated) Rho of the correlation with cohesin genes.
  20. Correlation-FDR: (Deprecated) FDR of the correlation with cohesin genes.
  21. Correlation-Subunit: (Deprecated) Cohesin subunits that co-express with a gene.
  22. CelltypeInvolvedLoop: Cell types involved in the cohesin-loop evidence.
  23. CelltypeInvolvedDEG: Cell types involved in the cohesin-DEG evidence.
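
The "." placeholder in Double-RegulatoryCohesin is easy to mistake for a value. As an illustration, the sketch below looks up a few gene symbols and maps "." to `None`. It assumes a header row with the column names listed above. The field is returned as a raw string because how multiple sites are separated within it is not stated here. The function name `regulatory_sites_for` is hypothetical.

```python
import csv


def regulatory_sites_for(handle, symbols):
    """Map each requested gene symbol found in the table to its
    Double-RegulatoryCohesin field, or to None when the field is ".".
    """
    wanted = set(symbols)
    found = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        symbol = row["Symbol"]
        if symbol in wanted:
            value = row["Double-RegulatoryCohesin"]
            # "." marks a gene that is not related to any cohesin CRM.
            found[symbol] = None if value == "." else value
    return found
```

Symbols that do not occur in the table are simply absent from the returned dictionary, which keeps "not in the table" distinct from "in the table but without a cohesin CRM".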

All CRMs (cohesin-gene pairs)

2,229,500 double-evidenced cis-regulatory modules (cohesin-gene pairs) annotated with cell types.

727 Mb (138 Mb compressed) file in .tsv format: double-evidenced.pair.tsv.gz
  1. GeneName: HUGO gene symbol of a gene involved in double evidenced cohesin CRMs.
  2. GeneID: Ensembl gene ID of a gene involved in double evidenced cohesin CRMs.
  3. RegulatorySite: Regulatory cohesin sites in 'chromosome-start-end' style.
  4. CelltypeInvolvedAll: Cell types (combined from the following three columns) involved in double evidenced cohesin CRMs.
  5. CelltypeInvolvedCohesin: Cell types involved in the cohesin binding sites.
  6. CelltypeInvolvedLoop: Cell types involved in the cohesin-loop evidence.
  7. CelltypeInvolvedDEG: Cell types involved in the cohesin-DEG evidence.
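
The RegulatorySite column packs a genomic interval into a single 'chromosome-start-end' string. The following sketch reads the gzipped file directly and splits that string into its parts, grouping sites by gene. It assumes a header row with the column names listed above and exactly one site per row. The helper names `parse_site` and `sites_by_gene` are hypothetical.

```python
import csv
import gzip
from collections import defaultdict


def parse_site(site):
    """Split 'chromosome-start-end' into (chromosome, start, end).

    Splitting from the right keeps chromosome names that themselves
    contain a hyphen intact.
    """
    chrom, start, end = site.rsplit("-", 2)
    return chrom, int(start), int(end)


def sites_by_gene(path):
    """Return {GeneName: [(chrom, start, end), ...]} from the gzipped TSV."""
    grouped = defaultdict(list)
    # Mode "rt" decompresses on the fly and yields text lines.
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            grouped[row["GeneName"]].append(parse_site(row["RegulatorySite"]))
    return dict(grouped)
```

Calling `sites_by_gene("double-evidenced.pair.tsv.gz")` holds every pair in memory at once; for the full file it may be preferable to filter rows for the genes of interest inside the loop instead.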

All cohesin ChIP-seq peaks

776 cohesin-related ChIP-seq peak files in bed3 format.

1.01 Gb (352 MB compressed) file in .bed3 format: allpeak3col.tar.gz
  1. Chrom: Chromosome of the site.
  2. Start: Start genomic loci of the site (hg38).
  3. End: End genomic loci of the site (hg38).
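
The archive can be inspected without unpacking it to disk. The sketch below counts peaks per file using the standard `tarfile` module. It assumes that each archive member is an uncompressed three-column text file with one peak per line and no header or track lines; the internal layout of the archive is not described here, so treat this as a starting point. The function name `count_peaks_per_file` is hypothetical.

```python
import tarfile


def count_peaks_per_file(archive_path):
    """Return {member name: number of non-empty lines} for a .tar.gz archive."""
    counts = {}
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            extracted = archive.extractfile(member)
            # Each non-empty line is assumed to be one peak.
            counts[member.name] = sum(1 for line in extracted if line.strip())
    return counts
```

Running `count_peaks_per_file("allpeak3col.tar.gz")` reads each member sequentially, so memory use stays small even though the unpacked data is about 1 Gb.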