Databases

NGS

| Layer / Goal | Widely used method(s) | Paired newer method(s) | Why the newer one is superior | Main purpose |
|---|---|---|---|---|
| DNA: genome & SVs | Short-read WGS (Illumina) | Long-read WGS (PacBio HiFi, ONT Q20+); hybrid assemblies | Resolves repeats & SVs; phasing/haplotypes; fewer mapping ambiguities | Detect SNVs, indels, SVs, CNVs, haplotypes |
| DNA: methylation | WGBS / RRBS | EM-seq / TAPS; ONT direct 5mC calling | Less DNA damage, GC bias; direct multi-base detection | Base-level 5mC/5hmC methylome and allele-specific methylation |
| Chromatin accessibility | ATAC-seq / scATAC | Multiome (ATAC+RNA); long-read ATAC | Joint regulatory info; better peak-to-gene linkage | Identify open chromatin regions, TF footprints |
| Protein–DNA binding | ChIP-seq | CUT&Tag / CUT&RUN | Higher signal-to-noise, lower input, fewer artifacts | Map TF or histone mark binding sites |
| 3D genome | Hi-C (in situ) | Micro-C; HiChIP; single-cell Hi-C | Nucleosome resolution, mark-anchored loops, single-cell structure | Reveal chromatin loops, domains, and genome folding |
| RNA: expression | RNA-seq (ribo-minus, stranded) | Long-read RNA-seq (Iso-Seq, ONT); UMI-based high-throughput | Full-length isoforms, better quant accuracy | Quantify transcript abundance and differential expression |
| RNA: isoforms & splicing | Junction-aware short-read RNA-seq | Long-read RNA/cDNA sequencing | Unambiguous isoforms, fusion/splice variant resolution | Study alternative splicing, fusions, and isoform diversity |
| Nascent transcription / kinetics | PRO-seq / GRO-seq | TT-seq; SLAM-seq / TimeLapse-seq | Time-resolved labeling; synthesis/decay rates | Measure transcriptional dynamics, initiation, and elongation |
| RNA–protein binding | eCLIP / iCLIP | iCLIP2 / irCLIP / RNP-MaP | Higher positional accuracy, captures weak interactions | Identify RNA-binding protein targets and motifs |
| RNA modifications (post-RNA) | MeRIP-seq (m6A-IP) | miCLIP2; direct RNA nanopore sequencing | Base-level m⁶A/ψ/m⁵C detection; antibody-free | Map chemical RNA modifications (m⁶A, m⁵C, ψ) |
| Translation | Ribo-seq | QTI/GTI-seq; disome-seq | Detects initiation, ribosome collisions | Measure translation efficiency, start-site mapping |
| Proteome quantification | DDA LC-MS/MS | DIA/SWATH; PASEF-DIA; single-cell proteomics | Fewer missing values, reproducible quantification | Identify and quantify proteins in bulk or single cells |
| Protein post-translational modification (PTM) | Phospho-enrichment + DDA | DIA phospho-proteomics; FAIMS/tims ion-mobility | Higher PTM coverage, multiplexing, improved quant | Identify and quantify phospho/acetyl/ubiquitin sites |
| Protein interactions / proximity | AP-MS / Co-IP-MS | BioID / TurboID / APEX | Captures transient and weak interactors; spatial context | Detect protein–protein interactions and local neighborhoods |
| Spatial multi-omics | Visium (10x) | Slide-seqV2; MERFISH / CosMx / Xenium | Higher plex & subcellular resolution; multi-modal | Map RNA/protein distribution within tissue context |
| Single-cell multi-omics | Separate scRNA/scATAC | Same-cell multiome (ATAC+RNA), SHARE-seq, Paired-Tag | Links regulatory chromatin to RNA output directly | Integrate regulatory and expression layers per cell |

Databases

| Category | Database | Description |
|---|---|---|
| Gene Expression Databases | GTEx | RNA expression across multiple human tissues. |
| | Expression Atlas | Gene expression patterns across species and conditions, including diseases. |
| | GEO | Repository of high-throughput gene expression data, including RNA-seq and microarray data. |
| | ArrayExpress | Archive of gene expression data, similar in scope to GEO. |
| | Human Protein Atlas | RNA and protein expression data across tissues and organs. |
| Genomic Variation - population genetics | 1000 Genomes Project | Catalog of human genetic variation, including common SNPs and structural variants. |
| | Human Genome Diversity Project (HGDP) | Genetic diversity across global populations, including rare and indigenous groups. |
| | European Variation Archive (EVA) | Open-access repository for genomic variation data across species. |
| Genomic Variation - variant interpretation | gnomAD | Aggregates exome and genome sequencing data; focuses on population allele frequencies and rare variants. |
| | dbSNP | Database of single nucleotide polymorphisms (SNPs) and other genetic variations. |
| | ClinVar | Reports the clinical significance of genetic variants in relation to human health and disease. |
| Genomic Variation - Large-Scale Cohorts | UK Biobank | Large-scale biomedical database with genetic, lifestyle, and health information. |
| | GWAS Catalog | Genetic variants associated with traits and diseases in published GWAS. |
| | TOPMed | Genomic and other omics data for studying heart, lung, blood, and sleep disorders. |
| Genomic Variation - cancer focused | cBioPortal | Interactive platform for visualizing and analyzing multi-omics cancer data with clinical outcomes. |
| | TCGA / GDC | Comprehensive multi-omics data for various cancer types. |
| | COSMIC | Catalogue of somatic mutations in cancer, including driver mutations. |
| | dbGaP | Genotype and phenotype datasets from studies of the genetic basis of various diseases. |
| | ICGC | International repository for cancer genome data, covering rare and global cancers. |
| Epigenomics and Regulatory | ENCODE | Aims to catalog the functional elements in the human genome. |
| | Roadmap Epigenomics Project | Epigenomic landscape of different tissues and cell types. |
| | BLUEPRINT Epigenome | Epigenomic data of blood cells in health and disease. |
| Integrated and Multi-Omics | Ensembl | Comprehensive resource for genomic data, including annotations and variants. |
| | UCSC Genome Browser | Integrates data from various genomic resources with visualization tools. |
| | FANTOM | Gene expression and regulatory element data, with a focus on non-coding RNAs. |
| | Reactome | Pathway database for exploring molecular interactions and biological processes. |
| Single-Cell Databases | Human Cell Atlas | Single-cell RNA-seq data from various human tissues and organs. |
| | Tabula Sapiens | A single-cell transcriptomic atlas of human tissues. |
| Metagenomics Databases | MG-RAST | Analysis and archiving for metagenomic data. |
| | Human Microbiome Project (HMP) | Microbial communities found in and on the human body. |
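
When choosing a resource, it can help to keep this catalog in a small lookup structure and filter it by category keyword. The sketch below is only an illustration: it encodes a handful of rows from the table above, and the names `CATALOG`, `build_index`, and `find_databases` are hypothetical helpers invented for this example, not part of any of the listed resources.

```python
from collections import defaultdict

# Illustrative subset of the database table: (category, database) pairs.
# This layout is an assumption for the sketch, not an official schema.
CATALOG = [
    ("Gene Expression Databases", "GTEx"),
    ("Gene Expression Databases", "GEO"),
    ("Genomic Variation - variant interpretation", "gnomAD"),
    ("Genomic Variation - variant interpretation", "ClinVar"),
    ("Genomic Variation - cancer focused", "COSMIC"),
    ("Epigenomics and Regulatory", "ENCODE"),
    ("Single-Cell Databases", "Human Cell Atlas"),
]


def build_index(catalog):
    """Group database names by their category."""
    index = defaultdict(list)
    for category, name in catalog:
        index[category].append(name)
    return dict(index)


def find_databases(index, keyword):
    """Return sorted database names whose category contains the keyword
    (case-insensitive substring match)."""
    kw = keyword.lower()
    return sorted(
        name
        for category, names in index.items()
        if kw in category.lower()
        for name in names
    )


index = build_index(CATALOG)
print(find_databases(index, "variation"))
```

Matching on the category string keeps the sketch short; a fuller version might tag each entry with data types (expression, variants, epigenome) so that one database can appear under several headings.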