2.3 KRA Metadata

Introduction

The KRA metadata describes the technical aspects of sequencing experiments: the sequencing libraries, preparation techniques, and data files. Most descriptive information is captured at the level of the KRA EXPERIMENT and is displayed in the public record. It is therefore imperative that submitters provide a clear and informative Title and Description for each EXPERIMENT.

Anatomy of the KRA

The organizational framework of KRA data is based on the concepts of BioProject, BioSample, and the KRA objects (EXPERIMENT, RUN).

Anatomy of KRA submission

Relation between KRA object, data, and submission

The publicly accessioned KRA objects are BioProject (accession in the form KAP#), BioSample (KAS#), EXPERIMENT (KAE#), RUN (KAR#), and ANALYSIS (KAZ#). A SUBMISSION has a non-public accession in the form KRA#.

  • The KRA EXPERIMENT and RUN objects contain instrument and library information and are directly associated with sequence data.

  • The KRA ANALYSIS objects contain data and information derived from processing sequence data.

KRA data pertaining to a STUDY can be deposited in more than one SUBMISSION.

A SAMPLE related to a STUDY can be shared between SUBMISSIONs.
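The accession prefixes listed above can be checked mechanically. The following is a minimal Python sketch, not an official KRA tool: the function names are hypothetical, and it assumes that "#" stands for one or more decimal digits, which the text above does not specify.

```python
import re
from typing import Optional

# Prefix-to-object mapping taken from the accession forms listed above.
ACCESSION_PREFIXES = {
    "KAP": "BioProject",
    "KAS": "BioSample",
    "KAE": "EXPERIMENT",
    "KAR": "RUN",
    "KAZ": "ANALYSIS",
    "KRA": "SUBMISSION",  # the only non-public accession
}

# Assumption: "#" means one or more decimal digits.
_ACCESSION_RE = re.compile(r"^(KAP|KAS|KAE|KAR|KAZ|KRA)(\d+)$")


def classify_accession(accession: str) -> Optional[str]:
    """Return the object type for an accession, or None if it is not recognized."""
    match = _ACCESSION_RE.match(accession.strip())
    if match is None:
        return None
    return ACCESSION_PREFIXES[match.group(1)]


def is_public(accession: str) -> bool:
    """True for recognized accessions other than SUBMISSION (KRA#)."""
    kind = classify_accession(accession)
    return kind is not None and kind != "SUBMISSION"
```

For example, classify_accession("KAE000123") returns "EXPERIMENT", while is_public("KRA42") returns False because SUBMISSION accessions are not public.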

KRA metadata: EXPERIMENT

Each KRA EXPERIMENT (Experiment accession KAE#) is a unique sequencing result for a specific sample.

Example:

Six sequencing libraries were prepared from a single biological sample. Three were single-end libraries, and three paired-end, although the paired-end libraries were sequenced using both paired and unidirectional sequencing. Two of the single-end libraries were treated using a targeted selection approach for some runs. Libraries were sequenced on two different instruments at three sequencing labs.

  • In all, there are 13 different combinations of Sample name + Library strategy or source.

Each combination represents a unique EXPERIMENT.

Additional information may be included in the EXPERIMENT. For example, you should differentiate biological replicates using EXPERIMENTs if sequencing results were obtained separately from each animal in a group of otherwise identical animals (e.g., treated, non-treated, healthy, infected).

  • The above EXPERIMENTs can be represented by a combination of Sample name + Library strategy or source + replicate number.
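The grouping rule above can be sketched in Python as a composite key. This is an illustration only: the Library type, its field names, and the sample values are hypothetical, and the sketch assumes that both library strategy and library source are recorded and that the replicate number is optional.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Library(NamedTuple):
    # Hypothetical field names; they do not come from a KRA schema.
    sample_name: str
    library_strategy: str
    library_source: str
    replicate: Optional[int] = None


ExperimentKey = Tuple[str, str, str, Optional[int]]


def experiment_key(lib: Library) -> ExperimentKey:
    """Sample name + library strategy/source + replicate number."""
    return (lib.sample_name, lib.library_strategy, lib.library_source, lib.replicate)


def group_into_experiments(libraries: Iterable[Library]) -> Dict[ExperimentKey, List[Library]]:
    """Group libraries so that each distinct key maps to one EXPERIMENT."""
    groups: Dict[ExperimentKey, List[Library]] = defaultdict(list)
    for lib in libraries:
        groups[experiment_key(lib)].append(lib)
    return dict(groups)
```

Libraries that share every component of the key fall into the same EXPERIMENT; changing any one component, such as the replicate number, produces a separate EXPERIMENT.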

A KRA EXPERIMENT is the main publishable unit in the KRA database.

Most descriptive information is captured at the level of the KRA EXPERIMENT and will be displayed in the public record.

Linking metadata and data: RUN

A KRA RUN is simply a manifest of the data file(s) derived from sequencing a library described by the associated EXPERIMENT.

At submission time, submitters provide only the types and names of the sequence data files that they will upload.

Paired-end data files (forward/reverse) must be listed together in the same RUN in order for the two files to be correctly processed as paired-end.
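A RUN manifest and the paired-end rule above could be checked before upload along the following lines. This is a sketch under stated assumptions, not the KRA submission format: the DataFile and Run classes, the layout values "SINGLE" and "PAIRED", and the message texts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataFile:
    name: str       # file name as it will be uploaded
    file_type: str  # e.g. "fastq" (illustrative value)


@dataclass
class Run:
    experiment_accession: str  # the EXPERIMENT this RUN belongs to
    layout: str                # hypothetical values: "SINGLE" or "PAIRED"
    files: List[DataFile] = field(default_factory=list)


def check_run(run: Run) -> List[str]:
    """Return a list of problems found in the manifest (empty if none)."""
    problems: List[str] = []
    if not run.files:
        problems.append("RUN lists no data files")
    elif run.layout == "PAIRED" and len(run.files) < 2:
        # Forward and reverse files must be listed together in the same RUN.
        problems.append("paired-end RUN must list both forward and reverse files")
    names = [f.name for f in run.files]
    if len(names) != len(set(names)):
        problems.append("duplicate file name in RUN")
    return problems
```

A paired-end RUN that lists only its forward file is reported as a problem, which mirrors the requirement that both files be listed in the same RUN.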

Processed data: ANALYSIS

A KRA ANALYSIS refers to the result file(s) generated by processing RUN data files.

At submission time, submitters provide only the types, names, and methods for the processed data files that they will upload.

However, if REFERENCE_ALIGNMENT is selected as the analysis type, reference information must also be provided.
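The conditional requirement for ANALYSIS records can be expressed as a small check. The sketch below is illustrative only: the dictionary keys (analysis_type, file_type, file_name, method, reference) are hypothetical names, and REFERENCE_ALIGNMENT is the only analysis type value taken from the text.

```python
from typing import Dict, List

# Hypothetical field names for what a submitter provides per processed file.
REQUIRED_FIELDS = ("analysis_type", "file_type", "file_name", "method")


def missing_analysis_fields(record: Dict[str, str]) -> List[str]:
    """Return the names of required fields that are absent or empty."""
    missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
    # Reference information is required only for REFERENCE_ALIGNMENT.
    if record.get("analysis_type") == "REFERENCE_ALIGNMENT" and not record.get("reference"):
        missing.append("reference")
    return missing
```

A record with analysis type REFERENCE_ALIGNMENT and no reference entry yields ["reference"]; the same record with any other analysis type yields an empty list.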
