Host-associated
Mandatory Attributes
Collection date
Harmonized name: collection_date
Description: the date on which the sample was collected; date/time ranges are supported by providing two dates from among the supported value formats, delimited by a forward-slash character; collection times are supported by adding "T", then the hour and minute after the date, and must be in Coordinated Universal Time (UTC), otherwise known as "Zulu Time" (Z); supported formats include "DD-Mmm-YYYY", "Mmm-YYYY", "YYYY" or ISO 8601 standard "YYYY-mm-dd", "YYYY-mm", "YYYY-mm-ddThh:mm:ss"; e.g., 30-Oct-1990, Oct-1990, 1990, 1990-10-30, 1990-10, 21-Oct-1952/15-Feb-1953, 2015-10-11T17:53:03Z; valid non-ISO dates will be automatically transformed to ISO format
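The accepted formats can be checked before submission with a short script. This is a minimal sketch, not the validator NCBI uses: the function names are hypothetical, the trailing "Z" is treated as optional, an hour-and-minute time without seconds is accepted, and month abbreviations are assumed to be English (the default C locale).

```python
from datetime import datetime

# Formats named in the description. Accepting "hh:mm" without seconds and
# treating the trailing "Z" as optional are assumptions of this sketch.
_FORMATS = (
    "%d-%b-%Y",           # 30-Oct-1990
    "%b-%Y",              # Oct-1990
    "%Y",                 # 1990
    "%Y-%m-%d",           # 1990-10-30
    "%Y-%m",              # 1990-10
    "%Y-%m-%dT%H:%M:%S",  # 2015-10-11T17:53:03
    "%Y-%m-%dT%H:%M",     # 2015-10-11T17:53
)


def _is_single_date(text):
    """Return True if text matches one of the supported formats."""
    if "T" in text and text.endswith("Z"):
        text = text[:-1]  # drop the UTC designator before parsing
    for fmt in _FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def is_valid_collection_date(value):
    """Accept a single date or a range of two dates separated by '/'."""
    parts = value.strip().split("/")
    if len(parts) > 2:
        return False
    return all(_is_single_date(part) for part in parts)


print(is_valid_collection_date("21-Oct-1952/15-Feb-1953"))  # True
print(is_valid_collection_date("10/30/1990"))               # False
```

Note that `strptime` is slightly lenient (for example, it accepts a one-digit day), so a stricter check may be needed if exact field widths matter.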
Broad-scale environmental context
Harmonized name: env_broad_scale
Description: Add terms that identify the major environment type(s) where your sample was collected. Recommend subclasses of biome [ENVO:00000428]. Multiple terms can be separated by one or more pipes e.g.: mangrove biome [ENVO:01000181]|estuarine biome [ENVO:01000020]
Local-scale environmental context
Harmonized name: env_local_scale
Description: Add terms that identify environmental entities having causal influences upon the entity at time of sampling, multiple terms can be separated by pipes, e.g.: shoreline [ENVO:00000486]|intertidal zone [ENVO:00000316]
Environmental medium
Harmonized name: env_medium
Description: Add terms that identify the material displaced by the entity at time of sampling. Recommend subclasses of environmental material [ENVO:00010483]. Multiple terms can be separated by pipes e.g.: estuarine water [ENVO:01000301]|estuarine mud [ENVO:00002160]
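The pipe-delimited term lists used by env_broad_scale, env_local_scale and env_medium can be split into label and identifier pairs. The sketch below is illustrative only: `parse_envo_terms` is a hypothetical helper, and it assumes every term is written as "label [ENVO:digits]" as in the examples above. It does not check that an identifier exists in ENVO.

```python
import re

# One term: free-text label followed by a bracketed ENVO identifier.
_TERM = re.compile(r"^(?P<label>.+?)\s*\[(?P<id>ENVO:\d+)\]$")


def parse_envo_terms(value):
    """Split a pipe-delimited field into (label, ENVO id) tuples."""
    terms = []
    for chunk in re.split(r"\|+", value):  # tolerate one or more pipes
        chunk = chunk.strip()
        if not chunk:
            continue
        match = _TERM.match(chunk)
        if match is None:
            raise ValueError(f"not in 'label [ENVO:id]' form: {chunk!r}")
        terms.append((match.group("label"), match.group("id")))
    return terms


print(parse_envo_terms("estuarine water [ENVO:01000301]|estuarine mud [ENVO:00002160]"))
# [('estuarine water', 'ENVO:01000301'), ('estuarine mud', 'ENVO:00002160')]
```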
Geographic location
Harmonized name: geo_loc_name
Description: Geographical origin of the sample; use the appropriate name from this list https://www.insdc.org/submitting-standards/geo_loc_name-qualifier-vocabulary/. Use a colon to separate the country or ocean from more detailed information about the location, eg "Canada: Vancouver" or "Germany: halfway down Zugspitze, Alps"
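As a small illustration of the colon convention, the value can be split into the country or ocean and the optional detail. `split_geo_loc_name` is a hypothetical helper; it does not check the first part against the INSDC vocabulary linked above.

```python
def split_geo_loc_name(value):
    """Return (country_or_ocean, detail); detail is None when no colon is present."""
    country, sep, detail = value.partition(":")  # split on the first colon only
    return country.strip(), (detail.strip() if sep else None)


print(split_geo_loc_name("Germany: halfway down Zugspitze, Alps"))
# ('Germany', 'halfway down Zugspitze, Alps')
```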
Host
Harmonized name: host
Description: The natural (as opposed to laboratory) host to the organism from which the sample was obtained. Use the full taxonomic name, eg, "Homo sapiens".
Latitude and longitude
Harmonized name: lat_lon
Description: The geographical coordinates of the location where the sample was collected. Specify as degrees latitude and longitude in format "d[d.dddd] N|S d[dd.dddd] W|E", eg, 38.98 N 77.11 W
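A value in this format can be converted to signed decimal degrees as sketched below. `parse_lat_lon` is a hypothetical helper; allowing any number of decimal places and rejecting latitudes above 90 or longitudes above 180 are assumptions of this sketch rather than documented rules.

```python
import re

# "d[d.dddd] N|S d[dd.dddd] W|E": unsigned degrees, each followed by a hemisphere letter.
_LAT_LON = re.compile(r"^(\d{1,2}(?:\.\d+)?) ([NS]) (\d{1,3}(?:\.\d+)?) ([WE])$")


def parse_lat_lon(value):
    """Return (latitude, longitude) as signed floats; south and west are negative."""
    match = _LAT_LON.match(value.strip())
    if match is None:
        raise ValueError(f"not in 'd[d.dddd] N|S d[dd.dddd] W|E' form: {value!r}")
    lat, lon = float(match.group(1)), float(match.group(3))
    if lat > 90 or lon > 180:
        raise ValueError(f"coordinates out of range: {value!r}")
    if match.group(2) == "S":
        lat = -lat
    if match.group(4) == "W":
        lon = -lon
    return lat, lon


print(parse_lat_lon("38.98 N 77.11 W"))  # (38.98, -77.11)
```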
NCBI Taxonomy ID
Description: NCBI’s taxonomy identifier of the organism for this sample. The NCBI taxonomy ID can be found at https://www.ncbi.nlm.nih.gov/taxonomy/. Enter 32644 (which is a taxonomy ID for unidentified organisms) for the following or similar cases: (1) when NCBI taxonomy ID is not available because NCBI taxonomy does not yet cover the organism, (2) when metagenome or environmental sample was used, whose organismal composition is unknown in advance
Example: 9606 (for Homo sapiens), 452680 (for Pseudomonas sp. UK4)
Organism
Description: The most descriptive organism name for this sample (to the species, if possible). In the case of a new species, provide the desired organism name. In the case of unidentified species, choose the appropriate Genus and include ‘sp.’, e.g. “Escherichia sp.”. When sequencing a genome from a non-metagenomic source, include a strain or isolate name too, e.g. “Pseudomonas sp. UK4”
Example: Homo sapiens, Pseudomonas sp. UK4
Sample name
Description: A name that you choose for the sample. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. Every sample name from a single submitter must be unique within a single BioProject
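Duplicate sample names can be caught before submission with a quick check like the one below. `duplicate_sample_names` is a hypothetical helper; ignoring surrounding whitespace and treating names as case-sensitive are assumptions.

```python
from collections import Counter


def duplicate_sample_names(names):
    """Return the sorted sample names that occur more than once."""
    counts = Counter(name.strip() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


print(duplicate_sample_names(["gut_01", "gut_02", "gut_01", "skin_01"]))
# ['gut_01']
```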
Optional Attributes
Altitude
Harmonized name: altitude
Description: The altitude of the sample is the vertical distance between Earth's surface above Sea Level and the sampled position in the air.
Ancestral data
Harmonized name: ances_data
Description: Information about either pedigree or other ancestral information, e.g., parental variety in case of mutant or selection, A/3*B (meaning [(A x B) x B] x B)
Biological status
Harmonized name: biol_stat
Description: The level of genome modification, e.g., wild, natural, semi-natural, inbred line, breeder's line, hybrid, clonal selection, mutant
Chemical administration
Harmonized name: chem_administration
Description: list of chemical compounds administered to the host or site where sampling occurred, and when (e.g. antibiotics, N fertilizer, air filter); can include multiple compounds. For Chemical Entities of Biological Interest ontology (CHEBI) (v1.72), please see http://bioportal.bioontology.org/visualize/44603
Collection method
Harmonized name: collection_method
Description: Process used to collect the sample, e.g., bronchoalveolar lavage (BAL)
Depth
Harmonized name: depth
Description: Depth is defined as the vertical distance below surface, e.g. for sediment or soil samples depth is measured from sediment or soil surface, respectively. Depth can be reported as an interval for subsurface samples.
Derived from
Description: Indicates when one BioSample was derived from another BioSample. Value should include BioSample accession number(s)
Example: SAMN00000001, KAS24095074
Elevation
Harmonized name: elev
Description: The elevation of the sampling site as measured by the vertical distance from mean sea level.
Genetic modification
Harmonized name: genetic_mod
Description: Genetic modifications of the genome of an organism, which may occur naturally by spontaneous mutation, or be introduced by some experimental means, e.g. specification of a transgene or the gene knocked-out or details of transient transfection
Gravidity
Harmonized name: gravidity
Description: whether or not subject is gravid, and if yes date due or date post-conception, specifying which is used
Host age
Harmonized name: host_age
Description: Age of host at the time of sampling
Host blood pressure diastolic
Harmonized name: host_blood_press_diast
Description: resting diastolic blood pressure of the host, measured as mm mercury
Host blood pressure systolic
Harmonized name: host_blood_press_syst
Description: resting systolic blood pressure of the host, measured as mm mercury
Host body habitat
Harmonized name: host_body_habitat
Description: original body habitat where the sample was obtained from
Host body product
Harmonized name: host_body_product
Description: substance produced by the host, e.g. stool, mucus, where the sample was obtained from
Host body temperature
Harmonized name: host_body_temp
Description: core body temperature of the host when sample was collected
Host color
Harmonized name: host_color
Description: the color of host
Host common name
Harmonized name: host_common_name
Description: The natural language (non-taxonomic) name of the host organism, e.g., mouse
Host diet
Harmonized name: host_diet
Description: type of diet, depending on the sample: for animals, omnivore, herbivore, etc.; for humans, high-fat, Mediterranean, etc.; can include multiple diet types
Host disease
Harmonized name: host_disease
Description: Name of relevant disease, e.g. Salmonella gastroenteritis. Controlled vocabulary, http://bioportal.bioontology.org/ontologies/1009 or http://www.ncbi.nlm.nih.gov/mesh
Host dry mass
Harmonized name: host_dry_mass
Description: measurement of dry mass
Host family relationship
Harmonized name: host_family_relationship
Description:
Host genotype
Harmonized name: host_genotype
Host growth conditions
Harmonized name: host_growth_cond
Description: literature reference giving growth conditions of the host
Host height
Harmonized name: host_height
Description: the height of subject
Host last meal
Harmonized name: host_last_meal
Description: content of last meal and time since feeding; can include multiple values
Host length
Harmonized name: host_length
Description: the length of subject
Host life stage
Harmonized name: host_life_stage
Description: description of host life stage
Host phenotype
Harmonized name: host_phenotype
Host sex
Harmonized name: host_sex
Description: Gender or physical sex of the host
Host shape
Harmonized name: host_shape
Description: morphological shape of host
Host subject id
Harmonized name: host_subject_id
Description: a unique identifier by which each subject can be referred to, de-identified, e.g. #131
Host subspecific genetic lineage
Harmonized name: host_subspecf_genlin
Description: Information about the genetic distinctness of the host organism below the subspecies level e.g., serovar, serotype, biotype, ecotype, variety, cultivar, or any relevant genetic typing schemes like Group I plasmid. Subspecies should not be recorded in this term, but in the NCBI taxonomy. Supply both the lineage name and the lineage rank separated by a colon, e.g., biovar:abc123
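Following the example value biovar:abc123, in which the rank comes first, the field can be split as sketched below. `split_lineage` is a hypothetical helper, and requiring both parts to be non-empty is an assumption.

```python
def split_lineage(value):
    """Split 'rank:name' (e.g. 'biovar:abc123') into (rank, name)."""
    rank, sep, name = value.partition(":")
    rank, name = rank.strip(), name.strip()
    if not sep or not rank or not name:
        raise ValueError(f"expected 'rank:name', got {value!r}")
    return rank, name


print(split_lineage("biovar:abc123"))  # ('biovar', 'abc123')
```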
Host substrate
Harmonized name: host_substrate
Description: the growth substrate of the host
Observed host symbionts
Harmonized name: host_symbiont
Description: The taxonomic name of the organism(s) found living in mutualistic, commensalistic, or parasitic symbiosis with the specific host
Host taxonomy ID
Harmonized name: host_taxid
Description: NCBI taxonomy ID of the host, e.g. 9606
Host tissue sampled
Harmonized name: host_tissue_sampled
Description: name of body site where the sample was obtained from, such as a specific organ or tissue, e.g., tongue, lung. For foundational model of anatomy ontology (fma) (v 4.11.0) or Uber-anatomy ontology (UBERON) (v releases/2014-06-15) terms, please see http://purl.bioontology.org/ontology/FMA or http://purl.bioontology.org/ontology/UBERON
Host total mass
Harmonized name: host_tot_mass
Description: total mass of the host at collection, the unit depends on host
Isolation source
Harmonized name: isolation_source
Description: Describes the physical, environmental and/or local geographical source of the biological sample from which the sample was derived.
Miscellaneous parameter
Harmonized name: misc_param
Description: any other measurement performed or parameter collected, that is not listed here
Negative control type
Harmonized name: neg_cont_type
Description: The substance or equipment used as a negative control in an investigation, e.g., distilled water, phosphate buffer, empty collection device, empty collection tube, DNA-free PCR mix, sterile swab, sterile syringe
Omics Observatory ID
Harmonized name: omics_observ_id
Description: A unique identifier of the omics-enabled observatory (or comparable time series) your data derives from. This identifier should be provided by the OMICON ontology; if you require a new identifier for your time series, contact the ontology's developers. Information is available here: https://github.com/GLOMICON/omicon. This field is only applicable to records which derive from an omics time-series or observatory.
Organism count
Harmonized name: organism_count
Description: total count of any organism per gram or volume of sample, should include name of organism followed by count; can include multiple organism counts
Oxygenation status of sample
Harmonized name: oxy_stat_samp
Description: oxygenation status of sample
Perturbation
Harmonized name: perturbation
Description: type of perturbation, e.g. chemical administration, physical disturbance, etc., coupled with time that perturbation occurred; can include multiple perturbation types
Positive control type
Harmonized name: pos_cont_type
Description: The substance, mixture, product, or apparatus used to verify that a process which is part of an investigation delivers a true positive
Reference for biomaterial
Harmonized name: ref_biomaterial
Description: Primary publication or genome report
Relationship to oxygen
Harmonized name: rel_to_oxygen
Description: Is this organism an aerobe or an anaerobe? Please note that aerobic and anaerobic are valid descriptors for microbial environments, e.g., aerobe, anaerobe, facultative, microaerophilic, microanaerobe, obligate aerobe, obligate anaerobe, missing, not applicable, not collected, not provided, restricted access
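If the values listed in the description are treated as a closed vocabulary, which is an assumption of this sketch, a value can be checked as follows. `is_valid_rel_to_oxygen` is a hypothetical helper, and matching is exact apart from surrounding whitespace.

```python
# Values copied from the description above; treating them as the complete
# set of permitted values is an assumption.
REL_TO_OXYGEN_VALUES = frozenset({
    "aerobe", "anaerobe", "facultative", "microaerophilic", "microanaerobe",
    "obligate aerobe", "obligate anaerobe",
    "missing", "not applicable", "not collected", "not provided",
    "restricted access",
})


def is_valid_rel_to_oxygen(value):
    """Return True if value is one of the listed terms."""
    return value.strip() in REL_TO_OXYGEN_VALUES


print(is_valid_rel_to_oxygen("obligate anaerobe"))  # True
print(is_valid_rel_to_oxygen("aerobic"))            # False
```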
Sample capture status
Harmonized name: samp_capt_status
Description: Reason for the sample, e.g., active surveillance in response to an outbreak, active surveillance not initiated by an outbreak, farm sample, market sample
Sample collection device or method
Harmonized name: samp_collect_device
Description: Method or device employed for collecting sample
Sample disease stage
Harmonized name: samp_dis_stage
Description: Stage of the disease at the time of sample collection, e.g., dissemination, growth and reproduction, infection, inoculation, penetration
Sample material processing
Harmonized name: samp_mat_process
Description: Processing applied to the sample during or after isolation
Sample salinity
Harmonized name: samp_salinity
Description: the amount of salt dissolved within the collected sample
Sample size
Harmonized name: samp_size
Description: Amount or size of sample (volume, mass or area) that was collected
Sample storage duration
Harmonized name: samp_store_dur
Sample storage location
Harmonized name: samp_store_loc
Sample storage temperature
Harmonized name: samp_store_temp
Sample volume or weight for DNA extraction
Harmonized name: samp_vol_we_dna_ext
Description: volume (mL) or weight (g) of sample processed for DNA extraction
Size fraction selected
Harmonized name: size_frac
Description: Filtering pore size used in sample preparation, e.g., 0-0.22 micrometer
Source material identifiers
Harmonized name: source_material_id
Description: unique identifier assigned to a material sample used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples.
Temperature
Harmonized name: temp
Description: temperature of the sample at time of sampling