
What is Pythiomics?
Public omics data are valuable resources. However, they are often fragmented across multiple repositories and stored in different raw formats, units, and metadata standards. This fragmentation creates major challenges in accessibility, harmonization, and standardization, and prevents researchers and organizations from using the data effectively.
Pythiomics is a multi-omics database developed and curated by Pythia Biosciences with the aim of creating a single, unified multi-omics resource for scientists to explore. By combining state-of-the-art AI techniques for metadata harmonization and cell type prediction with meticulous manual curation and quality control, Pythiomics DB provides a standardized and reliable data resource that helps biopharmaceutical companies and research institutions accelerate data analysis, data integration, and data-driven drug discovery.
Most importantly, Pythiomics can be explored interactively in the C-DIAM Multi-Omics Studio platform, making it accessible to all scientists who want to use the data.
Centered on Quality, Structure, and Traceability
Every dataset in Pythiomics follows a consistent set of principles designed to ensure high quality, clear structure, and complete traceability.
- A rock-solid Standard Operating Procedure (SOP). Each dataset follows a defined framework outlining QC criteria, processing steps, naming conventions, and how to handle edge cases.
- Harmonized metadata. Diseases, tissues, treatments, cell types, genders, sampling ages, and more are all mapped to standardized vocabularies for consistency across studies.
- Rigorous QC. Automated checks handle scale, while manual review ensures accuracy and catches what algorithms might miss.
- Documentation and versioning. Every step is recorded, from input data types and QC thresholds to the rationale and code used, with full version histories for transparency and traceability.
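To make the idea of mapping metadata to a standardized vocabulary concrete, here is a minimal sketch in Python. It is not the Pythiomics pipeline, which the text above describes as combining AI techniques with manual curation; it shows only the simplest form of the idea, a synonym lookup. The vocabulary, the synonyms, and the names `DISEASE_VOCAB`, `build_lookup`, and `harmonize` are all hypothetical.

```python
# Hypothetical controlled vocabulary: canonical term -> known synonyms.
# These entries are illustrative only and are not taken from Pythiomics.
DISEASE_VOCAB = {
    "type 2 diabetes mellitus": {"t2d", "t2dm", "type 2 diabetes"},
    "breast carcinoma": {"breast cancer", "brca"},
}


def build_lookup(vocab):
    """Invert the vocabulary into a synonym -> canonical term lookup."""
    lookup = {}
    for canonical, synonyms in vocab.items():
        for term in synonyms | {canonical}:
            lookup[term.casefold()] = canonical
    return lookup


def harmonize(value, lookup, unknown="unmapped"):
    """Normalize whitespace and case, then map to the canonical term.

    Values with no match are returned as `unknown` so they can be
    routed to manual review instead of being guessed.
    """
    key = " ".join(str(value).split()).casefold()
    return lookup.get(key, unknown)


DISEASE_LOOKUP = build_lookup(DISEASE_VOCAB)

raw_labels = ["T2D", " Breast  Cancer ", "type 2 diabetes", "unknown disease"]
harmonized = [harmonize(label, DISEASE_LOOKUP) for label in raw_labels]
```

In this sketch, labels that differ only in case, spacing, or abbreviation collapse to one canonical term, and anything unrecognized is flagged rather than silently kept, which mirrors the split between automated checks and manual review described above.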

Interactively access 10,000+ multi-omics datasets
Pythiomics brings these data together in one place, with standardized formats and a graphical interface for exploration. It currently incorporates several omics types, including bulk RNA-seq, single-cell RNA-seq, proteomics, and spatial transcriptomics, drawn from multiple public databases.
Available now: Bulk RNA, Single-cell RNA, Proteomics, Visium Spatial, CosMx
Coming soon: Visium HD & Xenium, Metabolomics, ATAC-Seq & CITE-Seq, ChIP-Seq, Mutation & GWAS
Take a deep dive into the single-cell space
Single-cell RNA-seq data has been a key focus of Pythiomics. Explore some quick stats of our single-cell database below.
118,439,197 cells
5,940 donors
1,337 datasets
243 diseases

Extract actionable insights through a wide range of visualizations and analytics
Pythiomics is hosted in C-DIAM Multi-Omics Studio, where you can interactively explore the data through an easy-to-use graphical interface and a rich set of state-of-the-art machine learning algorithms and analysis workflows.
