Human-associated
Mandatory Attributes
Collection date
Harmonized name: collection_date
Description: the date on which the sample was collected; date/time ranges are supported by providing two dates from among the supported value formats, delimited by a forward-slash character; collection times are supported by adding "T", then the hour and minute after the date, and must be in Coordinated Universal Time (UTC), otherwise known as "Zulu Time" (Z); supported formats include "DD-Mmm-YYYY", "Mmm-YYYY", "YYYY" or ISO 8601 standard "YYYY-mm-dd", "YYYY-mm", "YYYY-mm-ddThh:mm:ss"; e.g., 30-Oct-1990, Oct-1990, 1990, 1990-10-30, 1990-10, 21-Oct-1952/15-Feb-1953, 2015-10-11T17:53:03Z; valid non-ISO dates will be automatically transformed to ISO format
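As a rough illustration of the formats listed above, the following sketch checks a collection_date value before submission. The function names are hypothetical and not part of any NCBI tool; the check is more lenient than a strict validator (for example, it accepts single-digit months and relies on English month abbreviations in the current locale).

```python
from datetime import datetime

# strptime patterns corresponding to the formats named in the description.
_FORMATS = (
    "%d-%b-%Y",            # DD-Mmm-YYYY, e.g. 30-Oct-1990
    "%b-%Y",               # Mmm-YYYY, e.g. Oct-1990
    "%Y",                  # YYYY
    "%Y-%m-%d",            # ISO 8601 date
    "%Y-%m",               # ISO 8601 year and month
    "%Y-%m-%dT%H:%M:%S",   # ISO 8601 date and time
    "%Y-%m-%dT%H:%M:%SZ",  # same, with the trailing UTC marker "Z"
)


def is_valid_single_date(value):
    """Return True if value parses under one of the supported formats."""
    for fmt in _FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def is_valid_collection_date(value):
    """Accept a single date, or a range of two dates delimited by '/'."""
    parts = value.split("/")
    if len(parts) > 2:
        return False
    return all(is_valid_single_date(part.strip()) for part in parts)
```

For example, is_valid_collection_date("21-Oct-1952/15-Feb-1953") accepts the range, while a value such as "10/30/1990" is rejected because it splits into three parts.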
Broad-scale environmental context
Harmonized name: env_broad_scale
Description: Add terms that identify the major environment type(s) where your sample was collected. Recommend subclasses of biome [ENVO:00000428]. Multiple terms can be separated by one or more pipes e.g.: mangrove biome [ENVO:01000181]|estuarine biome [ENVO:01000020]
Local-scale environmental context
Harmonized name: env_local_scale
Description: Add terms that identify environmental entities having causal influences upon the entity at time of sampling. Multiple terms can be separated by pipes, e.g.: shoreline [ENVO:00000486]|intertidal zone [ENVO:00000316]
Environmental medium
Harmonized name: env_medium
Description: Add terms that identify the material displaced by the entity at time of sampling. Recommend subclasses of environmental material [ENVO:00010483]. Multiple terms can be separated by pipes e.g.: estuarine water [ENVO:01000301]|estuarine mud [ENVO:00002160]
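The three environmental context attributes (env_broad_scale, env_local_scale, env_medium) share the same "label [ENVO:id]" syntax with pipe separators. A minimal sketch of splitting such a value into its terms follows; parse_envo_terms is a hypothetical helper, and it only checks the syntax, not whether the identifier exists in ENVO.

```python
import re

# One term: a free-text label followed by an ENVO identifier in brackets.
_TERM = re.compile(r"^(?P<label>.+?)\s*\[(?P<id>ENVO:\d+)\]$")


def parse_envo_terms(value):
    """Split a pipe-separated value into (label, ENVO id) pairs."""
    terms = []
    for chunk in value.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate repeated pipes between terms
        match = _TERM.match(chunk)
        if match is None:
            raise ValueError(f"not in 'label [ENVO:id]' form: {chunk!r}")
        terms.append((match.group("label"), match.group("id")))
    return terms
```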
Geographic location
Harmonized name: geo_loc_name
Description: Geographical origin of the sample; use the appropriate name from this list: https://www.insdc.org/submitting-standards/geo_loc_name-qualifier-vocabulary/. Use a colon to separate the country or ocean from more detailed information about the location, e.g., "Canada: Vancouver" or "Germany: halfway down Zugspitze, Alps"
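The colon convention for geo_loc_name can be handled by splitting on the first colon only, since the detail part is free text. The sketch below uses a hypothetical helper name and does not check the country or ocean against the INSDC vocabulary.

```python
def split_geo_loc_name(value):
    """Split 'Country: detail' into (country_or_ocean, detail or None)."""
    # partition() splits on the first colon, leaving any later colons
    # inside the free-text detail untouched.
    country, _sep, detail = value.partition(":")
    return country.strip(), (detail.strip() or None)
```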
Host
Harmonized name: host
Description: The natural (as opposed to laboratory) host to the organism from which the sample was obtained. Use the full taxonomic name, eg, "Homo sapiens".
Latitude and longitude
Harmonized name: lat_lon
Description: The geographical coordinates of the location where the sample was collected. Specify as degrees latitude and longitude in format "d[d.dddd] N|S d[dd.dddd] W|E", eg, 38.98 N 77.11 W
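To make the lat_lon format concrete, here is a sketch that parses a value such as "38.98 N 77.11 W" into signed decimal degrees. The helper name and the sign convention (south and west negative) are assumptions made for illustration; the regular expression is one reading of the "d[d.dddd] N|S d[dd.dddd] W|E" pattern.

```python
import re

# Latitude: 1-2 integer digits; longitude: 1-3; both with optional decimals.
_LAT_LON = re.compile(
    r"^(\d{1,2}(?:\.\d+)?) ([NS]) (\d{1,3}(?:\.\d+)?) ([WE])$"
)


def parse_lat_lon(value):
    """Return (latitude, longitude) as signed decimal degrees."""
    match = _LAT_LON.match(value.strip())
    if match is None:
        raise ValueError(f"not in 'd[d.dddd] N|S d[dd.dddd] W|E' form: {value!r}")
    lat = float(match.group(1))
    lon = float(match.group(3))
    if lat > 90 or lon > 180:
        raise ValueError(f"coordinates out of range: {value!r}")
    if match.group(2) == "S":
        lat = -lat
    if match.group(4) == "W":
        lon = -lon
    return lat, lon
```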
NCBI Taxonomy ID
Description: NCBI’s taxonomy identifier of the organism for this sample. The NCBI taxonomy ID can be found at https://www.ncbi.nlm.nih.gov/taxonomy/. Enter 32644 (which is a taxonomy ID for unidentified organisms) for the following or similar cases: (1) when an NCBI taxonomy ID is not available because NCBI taxonomy does not yet cover the organism, (2) when a metagenome or environmental sample was used, whose organismal composition is unknown in advance
Example: 9606 (for Homo sapiens), 452680 (for Pseudomonas sp. UK4)
Organism
Description: The most descriptive organism name for this sample (to the species, if possible). In the case of a new species, provide the desired organism name. In the case of unidentified species, choose the appropriate Genus and include ‘sp.’, e.g. “Escherichia sp.”. When sequencing a genome from a non-metagenomic source, include a strain or isolate name too, e.g. “Pseudomonas sp. UK4”
Example: Homo sapiens, Pseudomonas sp. UK4
Sample name
Description: A name that you choose for the sample. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. Every sample name from a single submitter must be unique within a single BioProject
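Because sample names must be unique, it can help to look for duplicates before submitting a batch. A minimal sketch, assuming the names have already been read into a list (the helper name is hypothetical):

```python
from collections import Counter


def duplicate_sample_names(names):
    """Return the sample names that occur more than once, sorted."""
    # Surrounding whitespace is ignored so "gut_01 " and "gut_01" collide.
    counts = Counter(name.strip() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result means every name in the batch is distinct.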
Optional Attributes
Amniotic fluid color
Harmonized name: amniotic_fluid_color
Description: specification of the color of the amniotic fluid sample
Blood disorder
Harmonized name: blood_blood_disord
Description: history of blood disorders; can include multiple disorders
Chemical administration
Harmonized name: chem_administration
Description: list of chemical compounds administered to the host or site where sampling occurred, and when (e.g. antibiotics, N fertilizer, air filter); can include multiple compounds. For Chemical Entities of Biological Interest ontology (CHEBI) (v1.72), please see http://bioportal.bioontology.org/visualize/44603
Collection method
Harmonized name: collection_method
Description: Process used to collect the sample, e.g., bronchoalveolar lavage (BAL)
Derived from
Description: Indicates when one BioSample was derived from another BioSample. Value should include BioSample accession number(s)
Example: SAMN00000001, KAS24095074
Major diet change in last six months
Harmonized name: diet_last_six_month
Description: specification of major diet changes in the last six months; if yes, the change should be specified
Drug usage
Harmonized name: drug_usage
Description: any drug used by subject and the frequency of usage; can include multiple drugs used
Ethnicity
Harmonized name: ethnicity
Description: ethnicity of the subject
Fetal health status
Harmonized name: foetal_health_stat
Description: specification of foetal health status; should also include abortion
Gestation state
Harmonized name: gestation_state
Description: specification of the gestation state
Host age
Harmonized name: host_age
Description: Age of host at the time of sampling
Host body mass index
Harmonized name: host_body_mass_index
Description: body mass index of the host, calculated as weight divided by height squared
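The calculation can be written out as below. The description does not state units; this sketch assumes the common convention of kilograms and metres, and the function name is hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Compute BMI as weight / height**2 (assumed units: kg and m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2
```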
Host body product
Harmonized name: host_body_product
Description: substance produced by the host from which the sample was obtained, e.g. stool, mucus
Host body temperature
Harmonized name: host_body_temp
Description: core body temperature of the host when sample was collected
Host diet
Harmonized name: host_diet
Description: type of diet, depending on the sample: for animals, omnivore, herbivore, etc.; for humans, high-fat, Mediterranean, etc.; can include multiple diet types
Host disease
Harmonized name: host_disease
Description: Name of relevant disease, e.g. Salmonella gastroenteritis. Controlled vocabulary, http://bioportal.bioontology.org/ontologies/1009 or http://www.ncbi.nlm.nih.gov/mesh
Host family relationship
Harmonized name: host_family_relationship
Description:
Host genotype
Harmonized name: host_genotype
Host height
Harmonized name: host_height
Description: the height of subject
Host HIV status
Harmonized name: host_hiv_stat
Description: HIV status of subject; if yes, HAART initiation status should also be indicated as [YES or NO]
Host last meal
Harmonized name: host_last_meal
Description: content of last meal and time since feeding; can include multiple values
Host occupation
Harmonized name: host_occupation
Description: most frequent job performed by subject
Host phenotype
Harmonized name: host_phenotype
Host pulse
Harmonized name: host_pulse
Description: resting pulse of the host, measured as beats per minute
Host sex
Harmonized name: host_sex
Description: Gender or physical sex of the host
Host subject id
Harmonized name: host_subject_id
Description: a unique, de-identified identifier by which each subject can be referred to, e.g. #131
Observed host symbionts
Harmonized name: host_symbiont
Description: The taxonomic name of the organism(s) found living in mutualistic, commensalistic, or parasitic symbiosis with the specific host
Host tissue sampled
Harmonized name: host_tissue_sampled
Description: name of body site where the sample was obtained from, such as a specific organ or tissue, e.g., tongue, lung. For foundational model of anatomy ontology (fma) (v 4.11.0) or Uber-anatomy ontology (UBERON) (v releases/2014-06-15) terms, please see http://purl.bioontology.org/ontology/FMA or http://purl.bioontology.org/ontology/UBERON
Host total mass
Harmonized name: host_tot_mass
Description: total mass of the host at collection, the unit depends on host
Medication code
Harmonized name: ihmc_medication_code
Description: can include multiple medication codes
Isolation source
Harmonized name: isolation_source
Description: Describes the physical, environmental and/or local geographical source of the biological sample from which the sample was derived.
Kidney disorder
Harmonized name: kidney_disord
Description: history of kidney disorders; can include multiple disorders
Maternal health status
Harmonized name: maternal_health_stat
Description: specification of the maternal health status
Medical history performed
Harmonized name: medic_hist_perform
Description: whether full medical history was collected
Miscellaneous parameter
Harmonized name: misc_param
Description: any other measurement performed or parameter collected, that is not listed here
Negative control type
Harmonized name: neg_cont_type
Description: The substance or equipment used as a negative control in an investigation, e.g., distilled water, phosphate buffer, empty collection device, empty collection tube, DNA-free PCR mix, sterile swab, sterile syringe
Nose throat disorder
Harmonized name: nose_throat_disord
Description: history of nose-throat disorders; can include multiple disorders
Omics Observatory ID
Harmonized name: omics_observ_id
Description: A unique identifier of the omics-enabled observatory (or comparable time series) your data derives from. This identifier should be provided by the OMICON ontology; if you require a new identifier for your time series, contact the ontology's developers. Information is available here: https://github.com/GLOMICON/omicon. This field is only applicable to records which derive from an omics time-series or observatory.
Organism count
Harmonized name: organism_count
Description: total count of any organism per gram or volume of sample; should include name of organism followed by count; can include multiple organism counts
Oxygenation status of sample
Harmonized name: oxy_stat_samp
Description: oxygenation status of sample
Perturbation
Harmonized name: perturbation
Description: type of perturbation, e.g. chemical administration, physical disturbance, etc., coupled with time that perturbation occurred; can include multiple perturbation types
Presence of pets or farm animals
Harmonized name: pet_farm_animal
Description: specification of presence of pets or farm animals in the environment of subject; if yes, the animals should be specified; can include multiple animals present
Positive control type
Harmonized name: pos_cont_type
Description: The substance, mixture, product, or apparatus used to verify that a process which is part of an investigation delivers a true positive
Pulmonary disorder
Harmonized name: pulmonary_disord
Description: history of pulmonary disorders; can include multiple disorders
Reference for biomaterial
Harmonized name: ref_biomaterial
Description: Primary publication or genome report
Relationship to oxygen
Harmonized name: rel_to_oxygen
Description: Is this organism an aerobe or an anaerobe? Please note that aerobic and anaerobic are valid descriptors for microbial environments, e.g., aerobe, anaerobe, facultative, microaerophilic, microanaerobe, obligate aerobe, obligate anaerobe, missing, not applicable, not collected, not provided, restricted access
Sample collection device or method
Harmonized name: samp_collect_device
Description: Method or device employed for collecting sample
Sample material processing
Harmonized name: samp_mat_process
Description: Processing applied to the sample during or after isolation
Sample salinity
Harmonized name: samp_salinity
Description: The amount of salt dissolved within the collected sample
Sample size
Harmonized name: samp_size
Description: Amount or size of sample (volume, mass or area) that was collected
Sample storage duration
Harmonized name: samp_store_dur
Sample storage location
Harmonized name: samp_store_loc
Sample storage temperature
Harmonized name: samp_store_temp
Sample volume or weight for DNA extraction
Harmonized name: samp_vol_we_dna_ext
Description: volume (mL) or weight (g) of sample processed for DNA extraction
Size fraction selected
Harmonized name: size_frac
Description: Filtering pore size used in sample preparation, e.g., 0-0.22 micrometer
Smoker
Harmonized name: smoker
Description: Specification of smoking status
Source material identifiers
Harmonized name: source_material_id
Description: Unique identifier assigned to a material sample used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples.
Study completion status
Harmonized name: study_complt_stat
Description: Specification of study completion status; if no, the reason should be specified
Temperature
Harmonized name: temp
Description: Temperature of the sample at time of sampling
Travel outside the country in last six months
Harmonized name: travel_out_six_month
Description: Specification of the countries travelled to in the last six months; can include multiple travels
Twin sibling existence
Harmonized name: twin_sibling
Description: Specification of twin sibling presence
Urine collection method
Harmonized name: urine_collect_meth
Description: Specification of urine collection method
Urogenital tract disorder
Harmonized name: urogenit_tract_disor
Description: History of urogenital tract disorders; can include multiple disorders
Weight loss in last three months
Harmonized name: weight_loss_3_month
Description: Specification of weight loss in the last three months; if yes, the amount of weight lost should also be specified