Beacon - Discovery Services for Genomic Data
The Beacon protocol defines an open standard for genomics data discovery, developed by the Global Alliance for Genomics & Health (GA4GH) with technical implementation through the ELIXIR Beacon project. Since 2015 the Theoretical Cytogenetics and Oncogenomics Group at the University of Zurich has contributed to Beacon development, in part through the Beacon+ demonstrator, which shows current functionality and tests future Beacon protocol extensions. Beacon+, as well as the Progenetix and cancercelllines.org websites, runs on top of the open source bycon stack, which represents a full Beacon implementation.
Technical Documentation
An increasing amount of documentation relevant to the Progenetix API can be found in the following locations:
BeaconPlus Data / Query Model
The Progenetix / BeaconPlus query model utilises the Beacon core data model for genomic and (biomedical, procedural) queries and data delivery. The model uses an object hierarchy consisting of:
- variant (a.k.a. genomicVariation)
  - a single molecular observation, e.g. a genomic variant observed in the analysis of the DNA from a biosample
  - mostly corresponding to the "allele" concept, but with alternate use similar to that in VCF (e.g. CNVs are not typical "allelic variants")
  - in Progenetix, identical variants from different samples are identified through a compact digest (variantInternalId), which can be used to retrieve those distinct variants (cf. "line in VCF")
- analysis
  - the entirety of all variants observed in a single experiment on a single sample
  - the result of an analysis represents a callset, comparable to a data column in a VCF variant annotation file
  - callset has an optional position in the object hierarchy, since the variants themselves describe biological observations in a biosample
- biosample
  - a reference to a physical biological specimen on which analyses are performed
- individual
  - in typical use, a human subject from which the biosample(s) was/were extracted
The bycon framework, implemented for Progenetix and related collections such as cancercelllines.org, stores these core entities as data collections in a MongoDB database.
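The hierarchy above can be pictured as records that point to their parent entities. The following is a minimal sketch only: the class names, field names and digest strings are hypothetical illustrations, not the actual bycon or MongoDB schema.

```python
from dataclasses import dataclass


# Hypothetical, simplified records; the real Progenetix documents
# carry many more fields and may use different names.
@dataclass
class Individual:
    id: str


@dataclass
class Biosample:
    id: str
    individual_id: str  # subject the specimen was taken from


@dataclass
class Analysis:
    id: str
    biosample_id: str  # specimen the experiment was run on


@dataclass
class Variant:
    variant_internal_id: str  # digest shared by identical variants
    analysis_id: str
    biosample_id: str


def samples_with_variant(variants, digest):
    """Return the sorted biosample ids in which a given digest was observed."""
    return sorted({v.biosample_id for v in variants if v.variant_internal_id == digest})


# Made-up example data: the digest format is an assumption.
variants = [
    Variant("9:21000000-22000000:DEL", "an-1", "bs-1"),
    Variant("9:21000000-22000000:DEL", "an-2", "bs-2"),
    Variant("17:7500000-7700000:DEL", "an-2", "bs-2"),
]
matching_samples = samples_with_variant(variants, "9:21000000-22000000:DEL")
```

Because each variant record references its biosample directly, samples can be retrieved from a variant match without going through the analysis, which is consistent with the callset being optional in the hierarchy.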
BeaconPlus Extensions of the Beacon API
The Progenetix Beacon API implements the Beacon framework and Beacon v2 default model with some extended functionality, e.g.
- limited support for Boolean filter use (i.e. the ability to override the general AND with a general &filterLogic=OR option)
- experimental support of a /phenopackets entity type and &requestedSchema=phenopacket output option
- additional service endpoints, e.g. for biosamples or individuals
- geoqueries using $geoNear parameters or city matches
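As an illustration of the filterLogic option, a GET query URL could be assembled as sketched below. The base URL is a placeholder, and the comma-separated `filters` parameter is an assumption about the query syntax; only `filterLogic=OR` is taken from the text above.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption); replace with the actual Beacon root.
BASE_URL = "https://example.org/beacon"


def build_query_url(entity, filters, filter_logic="AND"):
    """Build a GET URL for an entity path with comma-separated filters."""
    params = {"filters": ",".join(filters)}
    # Only send filterLogic when overriding the general AND.
    if filter_logic != "AND":
        params["filterLogic"] = filter_logic
    # Keep ':' and ',' readable instead of percent-encoding them.
    return f"{BASE_URL}/{entity}?{urlencode(params, safe=':,')}"


url = build_query_url("biosamples", ["NCIT:C4033", "PMID:16004614"], filter_logic="OR")
```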
Filters / Filtering Terms
Besides variant parameters, the Beacon protocol defines filters as (self-)scoped query parameters, e.g. for phenotypes, diseases, biomedical performance or technical entities.
The Progenetix query filter system adopts a hierarchical logic for filtering terms: by default, a query for a term also matches its descendant (child) terms. The includeDescendantTerms pragma can be used to modify this behaviour.
Examples for codes with hierarchical treatment within the filter space are:
- NCIt
  - true, deep hierarchical ontology of cancer classifications
- Cellosaurus
  - derived cell lines are also accessible through the code of their parental line
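The effect of hierarchical filtering can be sketched with a toy term tree. The `EX:` codes and the `CHILDREN` mapping below are invented for illustration; this is not how bycon stores or resolves ontology hierarchies.

```python
# Hypothetical toy hierarchy: parent term -> direct child terms.
CHILDREN = {
    "EX:cancer": ["EX:carcinoma", "EX:sarcoma"],
    "EX:carcinoma": ["EX:adenocarcinoma"],
}


def expand_term(term, include_descendants=True):
    """Return the set of terms a filter on `term` would match."""
    if not include_descendants:
        return {term}
    matched, stack = set(), [term]
    # Depth-first walk collecting the term and all of its descendants.
    while stack:
        current = stack.pop()
        if current not in matched:
            matched.add(current)
            stack.extend(CHILDREN.get(current, []))
    return matched


all_terms = expand_term("EX:cancer")
exact_only = expand_term("EX:cancer", include_descendants=False)
```

With descendants included, a filter on the top-level term also matches records annotated with any child code; with the flag disabled, only the exact term matches.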
Most of the filter options are based on ontology terms or identifiers in CURIE format (e.g. NCIT:C4033, cellosaurus:CVCL_0030 or PMID:16004614). Please see Beacon's Filters documentation for more information, e.g. about the OntologyFilter, AlphanumericFilter and CustomFilter types.
More documentation of available ontologies and how to find out about available terms can be found on the Classifications and Ontologies page.
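Since CURIEs have the form prefix:identifier, filter values can be grouped by their prefix on the client side. The helpers below are a hypothetical sketch for illustration and are not part of bycon.

```python
def split_curie(curie):
    """Split 'PREFIX:local_id' at the first colon, rejecting malformed input."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix, local_id


def group_by_prefix(curies):
    """Group local identifiers under their CURIE prefix, keeping input order."""
    grouped = {}
    for curie in curies:
        prefix, local_id = split_curie(curie)
        grouped.setdefault(prefix, []).append(local_id)
    return grouped


grouped = group_by_prefix(["NCIT:C4033", "cellosaurus:CVCL_0030", "PMID:16004614"])
```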
Example
"filters": [
{"id": "NCIT:C4536", "includeDescendantTerms": false}
],
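The filter fragment above would normally sit inside a larger request body. The sketch below assumes the general Beacon v2 POST layout with `meta` and `query` objects; the exact envelope accepted by a given endpoint should be checked against the Beacon documentation.

```python
import json


def build_request_body(filter_ids, include_descendants=True):
    """Wrap filter ids in an assumed Beacon v2 style request envelope."""
    filters = [
        {"id": filter_id, "includeDescendantTerms": include_descendants}
        for filter_id in filter_ids
    ]
    return {"meta": {"apiVersion": "2.0"}, "query": {"filters": filters}}


# Query for the exact term only, without its descendant terms.
body = build_request_body(["NCIT:C4536"], include_descendants=False)
payload = json.dumps(body, indent=2)
```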
Beacon JSON responses
The Progenetix resource's API utilizes the bycon framework for implementation of the Beacon v2 API. The standard format for JSON responses corresponds to a generic Beacon v2 response. Depending on the endpoint, the main data will be a list of objects either inside response.results or (mostly) in response.resultSets[...].results. Additionally, most API responses provide access to data using handover objects.
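A client can handle both result locations with one small helper. This is a minimal sketch that assumes the parsed JSON is a dictionary shaped as described above; the sample documents are invented.

```python
def extract_results(beacon_response):
    """Collect result objects from response.results or response.resultSets[...].results."""
    response = beacon_response.get("response", {})
    if "results" in response:
        return list(response["results"])
    collected = []
    for result_set in response.get("resultSets", []):
        collected.extend(result_set.get("results", []))
    return collected


# Made-up responses illustrating the two layouts.
flat = {"response": {"results": [{"id": "a"}]}}
nested = {
    "response": {
        "resultSets": [
            {"id": "set-1", "results": [{"id": "b"}, {"id": "c"}]},
            {"id": "set-2", "results": []},
        ]
    }
}
flat_ids = [item["id"] for item in extract_results(flat)]
nested_ids = [item["id"] for item in extract_results(nested)]
```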
Beacon API is implemented through bycon
Progenetix' Beacon API is implemented through the bycon software. The code documentation site at bycon.progenetix.org provides live Beacon v2 path examples using the Progenetix resource.
bycon Beacon Server
The bycon project provides a combination of a Beacon-protocol based API with additional API services, used as backend and middleware for the Progenetix resource.
bycon has been developed to support Beacon protocol development following earlier implementations of Beacon+ ("beaconPlus") with now deprecated Perl libraries. The work tightly integrates with the ELIXIR Beacon project.
bycon has its own documentation at bycon.progenetix.org.