Value Object Descriptor Specification¶
This draft specification, under development by Driver Projects of the GKS Work Stream, specifies standard data classes for the exchange of common information useful for the description of variation but superfluous to the salient elements necessary for specifying a value object. We describe these classes as Value Object Descriptors (VODs). The VOD specification introduced here is version-controlled and extensible, and envisioned to seed a larger collection of VODs used with other GA4GH standards beyond VRSATILE.
Use of VODs provides a convenience mechanism for passing labels and related descriptive information alongside a value object. As an example, a value object representing a genomic variant in VRS can be passed alongside human-readable identifiers (e.g. ClinVar IDs), expressions (e.g. HGVS descriptions), and important contexts (e.g. sequence type, gene, transcript) in a standard format. This additional structure is necessary due to the nature of value objects and VRS.
The GA4GH Variation Representation Specification (VRS) is a terminology, information model, and schema for the computational representation of variation. VRS also describes useful conventions for the normalization of variation forms for message passing between systems. Objects compliant with VRS are value objects: data objects that are compared by structure and value, in contrast to entities, which are compared by registered identifiers. For example, the variants represented by the NM_004415.2:c.8472_8483del and LRG_423t1:c.8472_8483del HGVS descriptions are found equivalent not by comparing these strings, but by comparing the structure of the reference sequence and indicated change underlying the descriptors. Conversely, the meaning of the variant (a specific deletion on a specific residue sequence) is clear without reference to any external naming authority (in this example, the NM and LRG sequence identifiers), and in fact the referenced entities can only be retrieved through lookup on a sequence registry instead of through inspection of the variant itself.
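The difference between value objects and entities can be illustrated with a short Python sketch. The class names and fields used here (`Deletion`, `RegisteredVariant`, `sequence_digest`, `registry_id`) are hypothetical and are not part of VRS; the coordinates and identifiers are placeholders. A frozen dataclass compares by field values, while the entity class compares only by its registered identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deletion:
    """Hypothetical value object: equality is derived from its fields."""
    sequence_digest: str  # placeholder for a digest of the reference sequence
    start: int
    end: int


class RegisteredVariant:
    """Hypothetical entity: equality is decided by a registered identifier."""

    def __init__(self, registry_id: str, payload: Deletion):
        self.registry_id = registry_id
        self.payload = payload

    def __eq__(self, other):
        return (
            isinstance(other, RegisteredVariant)
            and self.registry_id == other.registry_id
        )

    def __hash__(self):
        return hash(self.registry_id)


# Two deletions built independently but with identical structure are equal.
a = Deletion(sequence_digest="example-digest", start=100, end=112)
b = Deletion(sequence_digest="example-digest", start=100, end=112)
value_objects_equal = (a == b)

# Two entities wrapping the same structure differ because their ids differ.
e1 = RegisteredVariant("registry-A:1", a)
e2 = RegisteredVariant("registry-B:7", b)
entities_equal = (e1 == e2)
```

In this sketch the two `Deletion` instances play the role of the structures underlying the NM and LRG descriptions, and the two `RegisteredVariant` instances play the role of records held by different naming authorities.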
Value Object Descriptor¶
Computational Definition
The abstract Value Object Descriptor parent class. All attributes of this parent class are inherited by descendent classes.
Variation Descriptor¶
Computational Definition
This descriptor class is used for describing VRS Variation value objects.
Information Model
Some VariationDescriptor attributes are inherited from Entity.
Field | Type | Limits | Description |
---|---|---|---|
id | string | 1..1 | The ‘logical’ identifier of the entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system. The identified entity may have a different ‘id’ in a different system, or may refer to an ‘id’ for the shared concept in another system (e.g. a CURIE). |
type | string | 1..1 | MUST be “VariationDescriptor”. |
label | string | 0..1 | A primary label for the value object. |
extensions | | 0..m | |
description | string | 0..1 | A free-text description of the value object. |
xrefs | | 0..m | List of CURIEs representing associated concepts. |
alternate_labels | string | 0..m | List of strings representing alternate labels for the value object. |
variation | | 1..1 | MUST be a Variation or CURIE reference to a Variation. |
molecule_context | string | 0..1 | The molecular context of this variant. Must be one of “genomic”, “transcript”, or “protein”. |
structural_type | | 0..1 | The structural variant type associated with this variant. We RECOMMEND a descendent term of SO:0001537. |
expressions | | 0..m | Typically HGVS or ISCN nomenclature expressions. Other systems relevant to the description of variation MAY be used. |
gene_context | | 0..1 | A specific gene context that applies to this variant. |
vrs_ref_allele_seq | | 0..1 | A VRS Sequence corresponding to a “ref allele”, describing the sequence expected at a VRS SequenceLocation reference. |
allelic_state | | 0..1 | We RECOMMEND that the allelic_state of a variation be described by terms from the Genotype Ontology (GENO). These SHOULD descend from concept GENO:0000875 <http://purl.obolibrary.org/obo/GENO_0000875>. |
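As an illustration of the information model above, the following sketch builds a Variation Descriptor message as a plain Python dictionary and checks a few of the stated constraints. The function name `check_variation_descriptor` and the `example:` identifiers are assumptions made for this example; this is not a complete or official validator.

```python
ALLOWED_MOLECULE_CONTEXTS = {"genomic", "transcript", "protein"}


def check_variation_descriptor(vd: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # Fields with limits 1..1 are required.
    for field in ("id", "type", "variation"):
        if field not in vd:
            problems.append(f"missing required field: {field}")
    if vd.get("type") != "VariationDescriptor":
        problems.append("type MUST be 'VariationDescriptor'")
    # molecule_context is optional, but restricted when present.
    context = vd.get("molecule_context")
    if context is not None and context not in ALLOWED_MOLECULE_CONTEXTS:
        problems.append(f"unexpected molecule_context: {context}")
    return problems


example = {
    "id": "example:descriptor-1",        # hypothetical identifier
    "type": "VariationDescriptor",
    "label": "NM_004415.2:c.8472_8483del",
    "variation": "example:variation-1",  # hypothetical CURIE reference to a Variation
    "molecule_context": "transcript",
    "alternate_labels": ["LRG_423t1:c.8472_8483del"],
}

problems = check_variation_descriptor(example)
```

Here the `variation` field carries a CURIE reference; per the table, an inline Variation object could be supplied instead.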
Location Descriptor¶
This descriptor is intended to reference VRS Location value objects. In addition to the attributes inherited from its Value Object Descriptor parent class, the Location Descriptor has the following attributes:
Field | Type | Limits | Description |
---|---|---|---|
type | string | 1..1 | MUST be “LocationDescriptor” |
location_id | | 0..1 | This MUST be provided if location is omitted |
location | | 0..1 | This MUST be provided if location_id is omitted |
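The paired rule on `location_id` and `location` amounts to requiring at least one of the two fields. A minimal sketch of that check follows; the helper name `has_reference` and the `example:` identifiers are hypothetical. The sketch reads the rule as permitting both fields to be present, which is an interpretation rather than something the table states explicitly.

```python
def has_reference(descriptor: dict, id_field: str, value_field: str) -> bool:
    """True when at least one of the two fields is present and not None."""
    return (
        descriptor.get(id_field) is not None
        or descriptor.get(value_field) is not None
    )


# A descriptor that points at its Location by identifier only.
by_reference = {
    "id": "example:location-descriptor-1",  # hypothetical identifier
    "type": "LocationDescriptor",
    "location_id": "example:location-1",    # hypothetical CURIE
}

# A descriptor that provides neither field, which the rule disallows.
missing_both = {
    "id": "example:location-descriptor-2",
    "type": "LocationDescriptor",
}

ok = has_reference(by_reference, "location_id", "location")
not_ok = has_reference(missing_both, "location_id", "location")
```

The same helper applies to any descriptor with an identifier/value pair governed by the same rule, by passing the corresponding field names.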
Sequence Descriptor¶
This descriptor is intended to reference VRS Sequence value objects. In addition to the attributes inherited from its Value Object Descriptor parent class, the Sequence Descriptor has the following attributes:
Field | Type | Limits | Description |
---|---|---|---|
type | string | 1..1 | MUST be “SequenceDescriptor” |
sequence_id | | 0..1 | This MUST be provided if sequence is omitted |
sequence | | 0..1 | This MUST be provided if sequence_id is omitted |
residue_type | | 0..1 | CURIE MUST be SO:0000348 (nucleic acid), SO:0001407 (peptidyl), or a descendent of one of these concepts. |
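The `residue_type` constraint can only be partly checked without access to the Sequence Ontology. The sketch below recognizes the two root concepts named in the table and flags any other CURIE as needing an ontology lookup to decide whether it is a descendent term. The function name `residue_type_status` and its return strings are assumptions made for this example.

```python
# The two root concepts named in the residue_type description.
RESIDUE_TYPE_ROOTS = {
    "SO:0000348": "nucleic acid",
    "SO:0001407": "peptidyl",
}


def residue_type_status(curie):
    """Classify a residue_type CURIE without consulting the ontology."""
    if curie is None:
        # residue_type has limits 0..1, so it may be omitted.
        return "absent"
    if curie in RESIDUE_TYPE_ROOTS:
        return RESIDUE_TYPE_ROOTS[curie]
    # Descendent terms cannot be confirmed from the CURIE alone.
    return "needs ontology lookup"


status_nucleic = residue_type_status("SO:0000348")
status_other = residue_type_status("SO:9999999")  # placeholder CURIE
```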
Gene Descriptor¶
This descriptor is intended to reference VRS Gene value objects. In addition to the attributes inherited from its Value Object Descriptor parent class, the Gene Descriptor has the following attributes:
Field | Type | Limits | Description |
---|---|---|---|
type | string | 1..1 | MUST be “GeneDescriptor” |
gene_id | | 0..1 | This MUST be provided if gene is omitted |
gene | | 0..1 | This MUST be provided if gene_id is omitted |
Categorical Variation Descriptor¶
Computational Definition
This descriptor class is used for describing Categorical Variation value objects.
Other Data Classes¶
VCF Record¶
Extension¶
The Extension class provides VODs with a means to extend descriptions with other attributes unique to a content provider. These extensions are not expected to be natively understood under VRSATILE, but may be used for pre-negotiated exchange of message attributes when needed.
Field | Type | Limits | Description |
---|---|---|---|
type | string | 1..1 | MUST be “Extension” |
name | string | 1..1 | A name for the Extension |
value | any[] | 0..* | Any primitive or structured object |
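A short sketch of how a content provider might attach an Extension to a descriptor follows. The helper `make_extension`, the extension name `provider_review_status`, and its value are invented for illustration; a real exchange would use names agreed between the sender and receiver.

```python
import json


def make_extension(name: str, value) -> dict:
    """Build an Extension message with the fields listed in the table."""
    return {"type": "Extension", "name": name, "value": value}


descriptor = {
    "id": "example:descriptor-1",        # hypothetical identifier
    "type": "VariationDescriptor",
    "variation": "example:variation-1",  # hypothetical CURIE
    "extensions": [
        # Hypothetical provider-specific attribute with a structured value.
        make_extension("provider_review_status", {"stars": 2, "reviewed": True}),
    ],
}

# Extensions carry plain JSON-compatible values, so the message round-trips.
serialized = json.dumps(descriptor)
restored = json.loads(serialized)
```

A receiver that does not recognize an extension name can ignore that entry and still process the rest of the descriptor.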
Expression¶
The Expression class is designed to enable descriptions based on a specified nomenclature or syntax for representing an object. Common examples of expressions for the description of molecular variation include the HGVS and ISCN nomenclatures.
Field | Type | Limits | Description |
---|---|---|---|
type | string | 1..1 | MUST be “Expression” |
syntax | | 1..1 | CURIE referencing the expression syntax |
value | string | 1..1 | The concept expression as a string |
version | string | 0..1 | An optional version of the expression syntax |
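The Expression fields above map naturally onto a small data class. In the sketch below, the class layout and the `to_message` method are assumptions, and the syntax CURIE `example:hgvs` is a placeholder rather than a registered prefix. The optional `version` field is dropped from the message when it is not set.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Expression:
    syntax: str                    # CURIE referencing the expression syntax
    value: str                     # the expression itself, as a string
    version: Optional[str] = None  # optional version of the syntax
    type: str = "Expression"

    def to_message(self) -> dict:
        """Return a dict for exchange, omitting fields that are unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}


expr = Expression(syntax="example:hgvs", value="NM_004415.2:c.8472_8483del")
message = expr.to_message()
```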