# 1. Introduction

### What is a BioProject?

**BioProject** is a central framework in K-BDS that connects and organizes diverse biological data generated from a single research initiative.\
It acts as a top-level container that provides an overview of the research, offering a single entry point to access related datasets across multiple K-BDS databases.

A BioProject can be established by a single research group or a consortium. It serves as a reference unit that links all associated data (sequencing, omics, chemical, imaging, and generalist data) under one research context.

***

### Why Register a BioProject?

Registering a BioProject ensures that various types of data produced from the same study are **contextually connected and traceable**.\
This linkage provides several important benefits:

* **Integrated data access:** Easily find and access all related datasets (e.g., genome, transcriptome, proteome) under one project.
* **Reproducibility and citation:** Provide a unified reference ID that can be cited in publications, enhancing reproducibility and scientific transparency.
* **Data interoperability:** Enable cross-database linking, improving the discoverability and reusability of your data.
* **Efficient project management:** Organize complex multi-omics and cross-disciplinary data under a single project framework.

***

### What Types of Data Can a BioProject Include?

A BioProject supports a wide range of biological and biomedical data generated from comprehensive research efforts, including:

* Genomics data (sequence reads, functional genomics data, nucleotide sequences and assemblies, variation)
* Proteomics data from mass spectrometry (MS)
* Metabolomics data from MS/NMR
* Chemical data (structure, assay, profiling)
* Bio-imaging data (optical, EM, MR, CT, EPhys, etc.)
* Pre-clinical data
* Others (generalist)

***

### Relationship with Other Registration Units

In K-BDS, a BioProject serves as the **top-level registration unit** and connects to other components in the data submission ecosystem:

* **BioSample:** Provides detailed information about the biological source materials (e.g., organism, tissue, environment).
* **KRA (Korea Sequence Read Archive):** Stores raw sequencing data.
* **KNA (Korea Nucleotide Sequence Archive):** Stores assembled nucleotide sequences.
* **KEA (Korea Expression Archive):** Manages gene expression and functional genomics data.
* **KSO (Korea Spatial Omics):** Handles spatial transcriptomics and multi-omics spatial data.
* **Other databases:** Archives for proteomics, metabolomics, chemical profiling, bio-imaging, and pre-clinical data.

This hierarchical structure ensures that each dataset is properly linked back to the research context, making the BioProject the central node in the K-BDS data ecosystem.
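As a rough illustration of this hierarchy, the following Python sketch models a BioProject as a container of records linked from the archives listed above. It is only a conceptual model, not the K-BDS schema or API: the class names, field names, and all identifiers (`PROJECT-0001`, `READS-0001`, and so on) are hypothetical placeholders and do not reflect real K-BDS accession formats.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedRecord:
    """A dataset registered in one K-BDS archive (e.g. KRA, KNA, KEA, KSO)."""
    archive: str    # archive name as listed above
    accession: str  # hypothetical placeholder ID, not a real accession format


@dataclass
class BioProject:
    """Top-level registration unit that groups every linked record."""
    project_id: str  # hypothetical placeholder ID
    title: str
    records: list[LinkedRecord] = field(default_factory=list)

    def link(self, archive: str, accession: str) -> None:
        """Attach a dataset from the given archive to this project."""
        self.records.append(LinkedRecord(archive, accession))

    def by_archive(self) -> dict[str, list[str]]:
        """Group linked accessions by archive, in the order they were linked."""
        grouped: dict[str, list[str]] = {}
        for record in self.records:
            grouped.setdefault(record.archive, []).append(record.accession)
        return grouped


# Illustrative data only: one project linking a sample, two read sets,
# and one expression dataset.
project = BioProject("PROJECT-0001", "Example multi-omics study")
project.link("BioSample", "SAMPLE-0001")
project.link("KRA", "READS-0001")
project.link("KRA", "READS-0002")
project.link("KEA", "EXPRESSION-0001")

for archive, accessions in project.by_archive().items():
    print(f"{archive}: {', '.join(accessions)}")
```

The point of the sketch is that every dataset carries a reference back to a single project, so anything registered under that project can be listed from one entry point regardless of which archive holds it.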

***

> 💡 **Tip:** Think of a BioProject as the "home page" for your research. It tells the story of your study and organizes all associated data, no matter how diverse, into one accessible and citable reference point.

#### BioProject as a Central Hub for Integrated Biological Data

