1. Introduction

What is a BioProject?

BioProject is a central framework in K-BDS that connects and organizes diverse biological data generated from a single research initiative. It acts as a top-level container that provides an overview of the research, offering a single entry point to access related datasets across multiple K-BDS databases.

A BioProject can be established by a single research group or a consortium. It serves as a reference unit that links all associated data (sequencing, omics, chemical, imaging, and generalist data) under one research context.

Why Register a BioProject?

Registering a BioProject ensures that various types of data produced from the same study are contextually connected and traceable. This linkage provides several important benefits:

Integrated data access: Easily find and access all related datasets (e.g., genome, transcriptome, proteome) under one project.
Reproducibility and citation: Provide a unified reference ID that can be cited in publications, enhancing reproducibility and scientific transparency.
Data interoperability: Enable cross-database linking, improving the discoverability and reusability of your data.
Efficient project management: Organize complex multi-omics and cross-disciplinary data under a single project framework.

What Types of Data Can a BioProject Include?

A BioProject supports a wide range of biological and biomedical data generated from comprehensive research efforts, including:

Genomics data (sequence reads, functional genomics, nucleotide sequences and assemblies, variation)
Proteomics data from mass spectrometry (MS)
Metabolomics data from MS/NMR
Chemical data (structure, assay, profiling)
Bio-imaging data (optical, EM, MR, CT, EPhys, etc.)
Pre-clinical data
Others (generalist)

Relationship with Other Registration Units

In K-BDS, a BioProject serves as the top-level registration unit and connects to other components in the data submission ecosystem:

BioSample: Provides detailed information about the biological source materials (e.g., organism, tissue, environment).
KRA (Korea Sequence Read Archive): Stores raw sequencing data.
KNA (Korea Nucleotide Sequence Archive): Stores assembled nucleotide sequences.
KEA (Korea Expression Archive): Manages gene expression and functional genomics data.
KSO (Korea Spatial Omics): Handles spatial transcriptomics and multi-omics spatial data.
Other databases: Such as proteomics, metabolomics, chemical profiling, bio-imaging, and pre-clinical archives.

This hierarchical structure ensures that each dataset is properly linked back to the research context, making the BioProject the central node in the K-BDS data ecosystem.
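The linkage described above can be pictured as a small data structure. The following is only an illustrative sketch, not the K-BDS schema or API: the class names `BioProject` and `LinkedRecord`, the `link` and `by_database` helpers, and all identifiers such as `PROJECT-0001` are hypothetical placeholders, and real K-BDS accession formats may differ.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


# Hypothetical in-memory model for illustration only.
# This is NOT the K-BDS schema, and the IDs below are placeholders.
@dataclass(frozen=True)
class LinkedRecord:
    database: str   # e.g. "BioSample", "KRA", "KNA", "KEA", "KSO"
    record_id: str  # placeholder identifier, not a real accession format


@dataclass
class BioProject:
    project_id: str  # placeholder identifier
    title: str
    records: list[LinkedRecord] = field(default_factory=list)

    def link(self, database: str, record_id: str) -> LinkedRecord:
        """Attach a record from another database to this project."""
        record = LinkedRecord(database, record_id)
        self.records.append(record)
        return record

    def by_database(self) -> dict[str, list[str]]:
        """Group linked record IDs by the database they belong to."""
        grouped: dict[str, list[str]] = defaultdict(list)
        for record in self.records:
            grouped[record.database].append(record.record_id)
        return dict(grouped)


# One project acting as the single entry point to several datasets.
project = BioProject("PROJECT-0001", "Example multi-omics study")
project.link("BioSample", "SAMPLE-0001")
project.link("KRA", "RUN-0001")
project.link("KRA", "RUN-0002")
project.link("KEA", "EXPR-0001")

print(project.by_database())
```

In this sketch every record points back to exactly one project, which mirrors the idea that each dataset is traceable to its research context through the BioProject.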

💡 Tip: Think of a BioProject as the "home page" for your research. It tells the story of your study and organizes all associated data, no matter how diverse, into one accessible and citable reference point.

BioProject as a Central Hub for Integrated Biological Data

Last updated 2 months ago
