Progenetix News

Data model and plotting update

Now with "high level" support

Example Plot

Both the data model and the plot engine now support a separate color style for high-level CNV events. In histograms these are overplotted on the standard color scheme, e.g. as red areas where high-level events were detected, scaled by their absolute frequency. In the example plot above we performed a search for high-level gains involving EGFR in glioblastomas, and then plotted only chromosomes 7 and 9. While the ~100% peak at EGFR is expected, nearly 70% of the matched samples additionally have a focal, high-level deletion involving the CDKN2A locus.

Continue reading

cancercelllines.org listed in Expasy

Entry in the Swiss Institute of Bioinformatics Catalogue

Our recently launched cancer cell line genomics site cancercelllines.org is now listed as one of the resources in the Swiss Institute of Bioinformatics’ Expasy catalogue.

Continue reading

cancercelllines.org - a Novel Resource for Genomic Variants in Cancer Cell Lines

DATABASE Article

Rahel Paloots and Michael Baudis

Database (Oxford). 2024 Apr 30:2024:baae030. doi: 10.1093/database/baae030
bioRxiv preprint (2023-12-13): https://doi.org/10.1101/2023.12.12.571281

Abstract: Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource - cancercelllines.org - with the aim to enable easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variants (SNVs) data. We have gathered over 5,600 copy number profiles as well as SNV annotations for 16,000 cell lines and provide this data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 API and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme.

Availability and Implementation: Our resource is publicly available on the web at cancercelllines.org.

Continue reading

Progenetix as SIB and ELIXIR Resource

Recognizing the Progenetix platform as Swiss contribution to the European bioinformatics resources ecosystem

The Progenetix resource has finally been recognized as an official contribution to the ELIXIR European bioinformatics ecosystem. Besides Expasy, Progenetix is now linked through ELIXIR's resource page. Or just go directly to progenetix.org (and its daughter project cancercelllines.org).

Continue reading

Implementing alphanumeric filters

Age queries as first use case for comparator queries

While the Beacon v2 API in principle supports an alphanumeric filter type, bycon so far had no dedicated support for it. With yesterday's v1.0.49 update the library now supports such queries, implemented and tested specifically for age (...at diagnosis) values.
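
A comparator query on age can be pictured roughly as follows. This is a minimal sketch and not bycon's implementation: the filter string layout `age:<=P18Y`, the helper names and the restriction to year/month/day ISO 8601 durations are assumptions made for illustration only.

```python
import operator
import re

# Hypothetical filter syntax "<id>:<operator><value>", e.g. "age:<=P18Y".
FILTER_RE = re.compile(r"^(?P<id>\w+):(?P<operator><=|>=|<|>|=)(?P<value>P\w+)$")
# Only year / month / day components are handled in this sketch.
DURATION_RE = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")

COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}


def parse_filter(text):
    """Split a comparator filter string into id, operator and value."""
    match = FILTER_RE.match(text)
    if not match:
        raise ValueError(f"not a comparator filter: {text}")
    return match.groupdict()


def duration_to_days(iso):
    """Approximate an ISO 8601 duration (years/months/days only) in days."""
    match = DURATION_RE.match(iso)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {iso}")
    years, months, days = (int(g) if g else 0 for g in match.groups())
    return years * 365.25 + months * 30.4375 + days


def age_matches(age_iso, filter_text):
    """Test a sample's age at diagnosis against a comparator filter."""
    parsed = parse_filter(filter_text)
    compare = COMPARATORS[parsed["operator"]]
    return compare(duration_to_days(age_iso), duration_to_days(parsed["value"]))


print(age_matches("P12Y6M", "age:<=P18Y"))
```

Converting durations to approximate day counts keeps the comparison simple; an exact calendar-aware comparison would need a reference date.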

Continue reading

Plotting now handled by Python - Goodbye Perl and PGX

The plotting code has been re-implemented as part of bycon

The last month has seen the transition from the Perl-based PGX plotting apps to new libraries implemented as part of the main bycon Python library stack. Very few of the options have been removed although some (probably color side bars for clustering items ...) will be added again.

Continue reading

New plotting documentation and changes in parameter handling

Separate page for plot options and parameters

Inspired by some requests regarding the plotting of large sample numbers, we've added a dedicated page to this site as the one-stop place for plot generation information (though you will still find examples e.g. on the use-cases page).

Plot parameters lost the - prefix

We have removed the - prefix from the plot parameter names; e.g. the previous -plotChros selector is now simply plotChros.

Continue reading

New .pgxfreq Interval Frequencies File Type

Changed suffix and upcoming additions...

With the November 2022 update we changed the file suffix to pgxfreq to keep a clean separation between the (usually binned) CNV frequency files and the (usually raw) representation of sample-specific CNVs (and other variants).

Continue reading

Geographic Maps

Displaying geolocations query results or user-provided data on a map

This new feature utilizes the geolocations service to:

  • display matched cities on a map using the &output=map option
  • load arbitrary data from a hosted data table (e.g. on Github)
Continue reading

Variant Types Update

Correcting Hierarchical Queries for Variant Type

The variantType query parameter - recommended only for non-precise variants, i.e. those without a specified allele - is now being expanded correctly. In Progenetix these are only CNVs, all expressed as (sub)classes of EFO:0030066 (relative copy number variation):

  EFO:0030066:
    child_terms:
      - EFO:0030066
      - EFO:0030067
      - EFO:0030068
      - EFO:0030069
      - EFO:0030070
      - EFO:0030071
      - EFO:0030072
      - EFO:0030073
  EFO:0030070:
    child_terms:
      - EFO:0030070
      - EFO:0030071
      - EFO:0030072
      - EFO:0030073
...etc.

Continue reading
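
The expansion of a queried variant type into its child terms can be sketched as a simple lookup. The mapping below only copies the two entries listed above; the function names are hypothetical and do not reflect the actual bycon code.

```python
# Excerpt of the term hierarchy listed above; all other terms are omitted.
CHILD_TERMS = {
    "EFO:0030066": [
        "EFO:0030066", "EFO:0030067", "EFO:0030068", "EFO:0030069",
        "EFO:0030070", "EFO:0030071", "EFO:0030072", "EFO:0030073",
    ],
    "EFO:0030070": [
        "EFO:0030070", "EFO:0030071", "EFO:0030072", "EFO:0030073",
    ],
}


def expand_variant_type(term):
    """Return the term with its listed child terms; unknown terms map to themselves."""
    return CHILD_TERMS.get(term, [term])


def matches_variant_type(variant_term, query_term):
    """True if a variant's type is the queried term or one of its child terms."""
    return variant_term in expand_variant_type(query_term)


print(matches_variant_type("EFO:0030072", "EFO:0030070"))
```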

Implementation of the GA4GH Beacon protocol for discovery and sharing of genomic copy number variation data

Poster Abstract | ESHG Vienna 2022

Background & Objectives Genomic copy number variations (CNV) are a major contributor to inter-individual genomic variation, can be causative events in rare diseases, but especially represent the majority of the mutational landscape in most malignancies. While specific CNV events and some recurring patterns have contributed to the identification of individual cancer drivers and the recognition of cancer subtypes, the complexity of genomic CNV patterns requires large amounts of well-defined genomic profiles for statistically meaningful analyses. At the other end of the spectrum, in the area of rare disease genomics the potential pathogenicity of individual CNV events requires validation against a vast set of disease-related and reference genomic profiles and annotations.

Continue reading

CNVs in Prenatal Tests & Maternal Malignancies

Publication indicating rare CNV signatures from a nationwide Dutch screening program

In a new publication in the Journal of Clinical Oncology, CJ Heesterbeek, SM Aukema and co-authors from the Dutch NIPT Consortium report on the incidence and diagnostic significance of the incidental detection of maternal copy number variations in a large screening program aimed at detecting chromosomal imbalances in embryos for the prediction of developmental abnormalities.

Continue reading

Paginated Downloads

Chunk-wise downloads of search results

Through its Search Samples page, Progenetix has always offered options to download search results (biosamples, variants) in different formats (JSON, tab-delimited tables, pgxseg files ...). However, especially for large results with thousands of samples and potentially millions of variants, this led to inconsistent behaviour, e.g. time-outs or dropped connections.

Now, API responses are capped through the limit parameter to default "sensible" values which, however, can be adjusted for systematic data access & retrieval. This functionality is also implemented in the sample search form, allowing e.g. the limited retrieval of a subset of samples from large or general cancer types, or the "paging" through consecutive sample groups for partitioned data retrieval.
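
The "paging" through consecutive sample groups can be sketched as a generator of request parameters. This is an illustration under assumptions: only `limit` is named above, while the companion `skip` parameter and its interpretation as a count of pages (rather than records) are assumed here.

```python
def page_params(total, limit):
    """Yield one parameter set per page needed to retrieve `total` records."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    pages = -(-total // limit)  # ceiling division
    for page in range(pages):
        # `skip` is assumed to count pages, not individual records.
        yield {"skip": page, "limit": limit}


for params in page_params(total=2500, limit=1000):
    print(params)
```

Each yielded dictionary would be merged into the query parameters of one request, so that a large result set is retrieved in several bounded chunks.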

Continue reading

VRSified Variants

Variant Response in GA4GH Variant Representation Standard (VRS) Format

The variant format served through the API has now changed to a format compatible with the GA4GH Variant Representation Standard (VRS; bleeding edge version ...).

{
    "caseLevelData": [
        {
            "analysisId": "pgxcs-kftwfurn",
            "biosampleId": "pgxbs-kftvj7rz",
            "id": "pgxvar-5c86664409d374f2dc4eeb93"
        }
    ],
    "variation": {
        "location": {
            "interval": {
                "end": {
                    "type": "Number",
                    "value": 62947165
                },
                "start": {
                    "type": "Number",
                    "value": 23029501
                }
            },
            "sequenceId": "refseq:NC_000018.10",
            "type": "SequenceLocation"
        },
        "relativeCopyClass": "partial loss",
        "updated": "2022-03-29T15:06:46.526020",
        "variantInternalId": "18:23029501-62947165:DEL"
    }
}
{
    "caseLevelData": [
        {
            "analysisId": "pgxcs-kftwfurn",
            "biosampleId": "pgxbs-kftvj7rz",
            "id": "pgxvar-5c86664409d374f2dc4eeb93"
        }
    ],
    "variantInternalId": "18:23029501-62947165:DEL",
    "referenceName": "18",
    "start": 23029501,
    "end": 62947165,
    "variantType": "DEL",
    "updated": "2022-03-29T15:06:46.526020"
}
{
    "caseLevelData": [
        {
            "analysisId": "pgxcs-kl8hg4ky",
            "biosampleId": "pgxbs-kl8hg4ku",
            "id": "pgxvar-5be1840772798347f0eda0d8"
        }
    ],
    "variation": {
        "location": {
            "interval": {
                "end": {
                    "type": "Number",
                    "value": 7577121
                },
                "start": {
                    "type": "Number",
                    "value": 7577120
                },
                "type": "SequenceInterval"
            },
            "sequenceId": "refseq:NC_000017.11",
            "type": "SequenceLocation"
        },
        "state": {
            "sequence": "G",
            "type": "LiteralSequenceExpression"
        },
        "updated": "2022-03-29T15:35:35.700954",
        "variantInternalId": "17:7577121:C>G"
    }
}
{
    "caseLevelData": [
        {
            "analysisId": "pgxcs-kl8hg4ky",
            "biosampleId": "pgxbs-kl8hg4ku",
            "id": "pgxvar-5be1840772798347f0eda0d8"
        }
    ],
    "variantInternalId": "17:7577121:C>G",
    "start": 7577120,
    "end": 7577121,
    "referenceName": "17",
    "referenceBases": "C",
    "alternateBases": "G",
    "updated": "2022-03-29T15:35:35.700954"
}
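
The relation between the flat records and the nested VRS-style records shown above can be sketched as a conversion function. This is not the code used by the API: the function name is hypothetical, and the two lookup tables contain only the values visible in the examples.

```python
# Only the values that occur in the examples above are mapped here.
REFSEQ_IDS = {"17": "refseq:NC_000017.11", "18": "refseq:NC_000018.10"}
COPY_CLASSES = {"DEL": "partial loss"}


def to_vrs_like(legacy):
    """Convert a flat variant record into the nested VRS-style layout."""
    variation = {
        "location": {
            "interval": {
                "start": {"type": "Number", "value": legacy["start"]},
                "end": {"type": "Number", "value": legacy["end"]},
                "type": "SequenceInterval",
            },
            "sequenceId": REFSEQ_IDS[legacy["referenceName"]],
            "type": "SequenceLocation",
        },
        "updated": legacy["updated"],
        "variantInternalId": legacy["variantInternalId"],
    }
    if "alternateBases" in legacy:
        # Sequence variant: the alternate allele becomes the literal state.
        variation["state"] = {
            "sequence": legacy["alternateBases"],
            "type": "LiteralSequenceExpression",
        }
    else:
        # CNV: the variant type is expressed as a relative copy class.
        variation["relativeCopyClass"] = COPY_CLASSES[legacy["variantType"]]
    return {"caseLevelData": legacy["caseLevelData"], "variation": variation}


flat_cnv = {
    "caseLevelData": [{"analysisId": "pgxcs-kftwfurn",
                       "biosampleId": "pgxbs-kftvj7rz",
                       "id": "pgxvar-5c86664409d374f2dc4eeb93"}],
    "variantInternalId": "18:23029501-62947165:DEL",
    "referenceName": "18",
    "start": 23029501,
    "end": 62947165,
    "variantType": "DEL",
    "updated": "2022-03-29T15:06:46.526020",
}
print(to_vrs_like(flat_cnv)["variation"]["relativeCopyClass"])
```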
Continue reading

Histogram Improvements

Excluding reference samples from default plots

So far all samples matching a grouping code ("collation"; disease, publication etc.) have been included when generating the pre-computed CNV frequencies. However, the potential inclusion of normal/reference samples sometimes led to "dampened" CNV profiles. Now, samples labeled as "reference sample" (EFO:0009654) - a term we had introduced into the Experimental Factor Ontology - are excluded from pre-computed histograms. However, when e.g. calling up samples from publications using the search panel, reference samples will be included unless specifically excluded.

Pre-computed CNV Frequencies for PMID:22824167, now omitting reference samples by default
Continue reading

Query-based histograms

Direct generation of histogram plots from Beacon queries

So far, the plot API only provided (documented) access to generate CNV histogram plots from "collations" with pre-computed frequencies. The bycon API now offers direct access to the histograms by adding &output=histoplot to a Beacon (biosamples) query URL. The server will first query the samples and then perform a handover to the plotting API. Please be aware that this procedure is best suited for limited queries and may otherwise lead to a time-out.
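
Adding the output option to an existing query URL can be sketched with the standard library. The host and path in the example are placeholders, not the actual service address; only the `output=histoplot` parameter is taken from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_output(url, output="histoplot"):
    """Return `url` with its `output` query parameter set to `output`."""
    parts = urlsplit(url)
    # Drop any existing output parameter, keep everything else in order.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "output"]
    query.append(("output", output))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL for illustration only.
print(with_output("https://example.org/beacon/biosamples?filters=NCIT:C3052"))
```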

Continue reading

Genomic Interval Changes

New positions for the 1Mb interval maps

So far, CNV histograms and .pgxseg segment and matrix files used a 1Mb genome binning, based on the consecutive assignment of 1Mb intervals from 1pter -> Yqter. This resulted in 3102 intervals, with the last interval of each chromosome being smaller.

On 2022-02-11 we changed the procedure. Now, the last interval of the short arm of any chromosome is terminated at the centromere, leading to

  • a (potentially) shortened "last p" interval
  • a shift of most interval positions
  • a changed interval number from 3102 to 3106
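
The effect of terminating the last p-arm interval at the centromere can be sketched on a toy chromosome. The length and centromere position below are made-up numbers, and the 0-based, half-open coordinates are an assumption of this sketch.

```python
def consecutive_bins(length, size=1_000_000):
    """Previous scheme: consecutive bins, only the last one may be shorter."""
    return [(start, min(start + size, length)) for start in range(0, length, size)]


def arm_aware_bins(length, centromere, size=1_000_000):
    """New scheme: binning restarts at the centromere, so the last p bin ends there."""
    bins = []
    for arm_start, arm_end in ((0, centromere), (centromere, length)):
        start = arm_start
        while start < arm_end:
            end = min(start + size, arm_end)
            bins.append((start, end))
            start = end
    return bins


# Toy chromosome: 5.6 Mb long with the centromere at 2.4 Mb.
old = consecutive_bins(5_600_000)
new = arm_aware_bins(5_600_000, 2_400_000)
print(len(old), len(new))
print(new[2], new[3])
```

In this toy case the last p-arm bin is shortened, all q-arm bins shift, and the total bin count increases by one, which mirrors the three consequences listed above.
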
Continue reading

CNV Ontology Proposal - Now Live at EFO

EFO now contains terms for (relative) CNV levels

As part of the hCNV-X work - related to "Workflows and Tools for hCNV Data Exchange Procedures" and to the intersection with Beacon and GA4GH VRS - we now have a new proposal for the creation of an ontology for the annotation of (relative) CNV events. The CNV representation ontology is targeted for adoption by Sequence Ontology (SO) and then to be used by an updated version of the VRS standard. Please see the discussions linked from the proposal page. However, we have also contributed the CNV proposal to EFO, where it went live on January 21.

Continue reading

Introducing variant_state classes for CNVs

More granular annotation of CNV types

More information can be found in the description of ontology use for CNVs.

Continue reading

Term-specific queries

Allowing the de-selection of descendant terms in ontology filters

So far (and still as standard), any selected filter will also include matches on its child terms; i.e. "NCIT:C3052 - Digestive System Neoplasm" will include results from gastric, esophagus, colon ... cancer. Here we introduce a selector for the search panel to make use of the Beacon v2 filters includeDescendantTerms pragma, which can be set to false if one only wants to query for the term itself and exclude any child terms from the matching.

Please be aware that this can only be applied globally and will affect all filtering terms used in a query. More information is available in the Filtering Terms documentation.
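
In a Beacon v2 style request body, the setting might be expressed as in the following sketch. The helper name is hypothetical and the body is reduced to the filters part; the flag is applied to every filter at once, in line with the global behaviour described above.

```python
import json


def build_filters(term_ids, include_descendants=True):
    """Build filter objects that all share the same descendant-term setting."""
    return [
        {"id": term_id, "includeDescendantTerms": include_descendants}
        for term_id in term_ids
    ]


# Query only for the term itself, excluding its child terms.
body = {"query": {"filters": build_filters(["NCIT:C3052"], include_descendants=False)}}
print(json.dumps(body, indent=2))
```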

Continue reading

BUG FIX Frequency Maps

Fix "only direct code matches" frequencies

Pre-computed Progenetix CNV frequency histograms (e.g. for NCIT codes) are based on samples from all child terms; e.g. NCIT:C3262 will display an overview of all neoplasias, although no single case has this specific code.

However, there had been a bug: under specific circumstances (the code has some directly mapped samples and more samples in child terms), only the direct matches were used to compute the frequencies, although the full number of samples was indicated in the plot legend. FIXED.

Continue reading

Publications - Updated publication listings

Progenetix citations page and better map

Progenetixuse Page

We have introduced a new publications listing page which contains links to articles that cite or use Progenetix and resources from this "ecosystem." Please let us know if you are aware of other such cases - frequently the publications do not use a proper citation format but just refer to "according to the Progenetix resource" or similar in the text.

Continue reading

API: Biosample Schema Update

Conversion of `biocharacteristics` array to separate parameters

The Biosample schema used for exporting Progenetix data has been adjusted with respect to the representation of "bio-"classifications. The previous biocharacteristics list parameter has been deprecated and its previous content is now expressed in:

  • histologicalDiagnosis (PXF)
  • sampledTissue (PXF)
  • icdoMorphology (pgx)
  • icdoTopography (pgx)
Continue reading

Progenetix - An open reference resource for copy number variation data in cancer

Qingyao Huang

Cancer Genomics Consortium Annual Meeting 2021 Aug 1-4

Continue reading

API: Beacon Paths Updates

For testing the rapidly evolving Beacon v2 API, we have now implemented more paths/endpoints which mostly conform to the brand new & still "flexible" v2.0.0-draft.4 version. Please check the documentation and examples.

Continue reading

API: JSON Exports now camelCased

In "forward-looking" conformity with the Beacon v2 API, the JSON attributes of the API responses have been changed from snake_cased to camelCased. Please adjust your code, where necessary.
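
Client code that still holds snake_cased records could be adapted with a small recursive key conversion. This is a generic sketch, not part of the API; the example keys are chosen for illustration.

```python
import re


def to_camel(name):
    """Convert a snake_cased name to camelCase; leading underscores are kept."""
    return re.sub(r"(?<=[A-Za-z0-9])_([A-Za-z0-9])",
                  lambda match: match.group(1).upper(), name)


def camelize(obj):
    """Recursively rename the keys of nested dicts and lists."""
    if isinstance(obj, dict):
        return {to_camel(key): camelize(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [camelize(item) for item in obj]
    return obj


record = {"biosample_id": "pgxbs-kftvj7rz", "external_references": [{"type_label": "x"}]}
print(camelize(record))
```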

Continue reading

The Progenetix oncogenomic resource in 2021

The Progenetix oncogenomic resource in 2021

Qingyao Huang, Paula Carrio Cordo, Bo Gao, Rahel Paloots, Michael Baudis

Database (Oxford). 2021 Jul 17;2021:baab043.

This article provides an overview of recent changes and additions to the Progenetix database and the services provided through the resource.

Continue reading

New feature - LOH data

Loss of heterozygosity (LOH) is a phenomenon frequently observed in cancer genomes where the selective pressure to keep only the susceptible gene product from one allele removes the other healthy allele from the pool. In this context, copy neutral loss of heterozygosity (CN-LOH) is commonly observed in haematological malignancies (O'Keefe et al., 2010 and Mulligan et al., 2007). In the Progenetix oncogenomic resource, which as of 2021 comprises nearly 800 cancer types (by NCIt classification), we have added LOH as a new feature of our data collection, in addition to the total copy number, to open the door for the analysis of the frequency and impact of this phenomenon.

Update 2021-01-28:

LOH variants can now be queried through the Search and Beacon+ interfaces, either as specific variants or together with deletions.

Please be aware that in contrast to the "complete for chromosomes 1-22" DUP and DEL calls, LOH is only determined for a subset of samples and therefore will be underreported in the statistics section.

Continue reading

Improved Data Access through Histograms

Histograms for static datasets (e.g. NCIT codes, publications ...) now provide links to the dataset details page as well as a download option for the binned CNV frequency data.

Continue reading

Signatures of Discriminative CNA in 31 Cancer Subtypes

Bo Gao and Michael Baudis (2021)

Accepted at Frontiers in Genetics, 2021-04-15

Abstract

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles pose great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have been focused on the association of CNA to pre-selected "driver" genes with limited application to rare drivers and other genomic elements.

Continue reading

Beacon+ and Progenetix Queries by Gene Symbol

We have introduced a simple option to search directly by Gene Symbol, which will match to any genomic variant with partial overlap to the specified gene. This works by expanding the Gene Symbol (e.g. TP53, CDKN2A ...) into a range query for its genomic coordinates (maximum CDR).

Such queries - which would e.g. return all whole-chromosome CNV events covering the gene of interest, too - should be narrowed by providing e.g. Variant Type and Maximum Size (e.g. 2000000) values.
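
The combination of a gene-derived range query with type and size restrictions can be sketched as follows. The gene name and its coordinates are placeholders rather than annotation data, and the function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    start: int
    end: int
    variant_type: str


# Placeholder gene with illustrative coordinates (not real annotation data).
GENES = {"EXAMPLE1": ("9", 21_000_000, 22_000_000)}


def gene_query(variants, symbol, variant_type=None, max_size=None):
    """Return variants with any overlap to the gene, optionally narrowed."""
    chrom, gene_start, gene_end = GENES[symbol]
    hits = []
    for variant in variants:
        if variant.chrom != chrom:
            continue
        if variant.end <= gene_start or variant.start >= gene_end:
            continue  # no overlap with the gene's range
        if variant_type is not None and variant.variant_type != variant_type:
            continue
        if max_size is not None and variant.end - variant.start > max_size:
            continue
        hits.append(variant)
    return hits


variants = [
    Variant("9", 21_500_000, 21_900_000, "DEL"),  # focal deletion
    Variant("9", 0, 138_000_000, "DEL"),          # whole-chromosome event
    Variant("7", 55_000_000, 55_200_000, "DUP"),  # different chromosome
]
print(len(gene_query(variants, "EXAMPLE1")))
print(len(gene_query(variants, "EXAMPLE1", variant_type="DEL", max_size=2_000_000)))
```

Without the size limit, the whole-chromosome event matches as well; with Variant Type and Maximum Size set, only the focal deletion remains.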

Continue reading

Diffuse Intrinsic Pontine Glioma (DIPG) cohort

Diffuse Intrinsic Pontine Glioma (DIPG) is a highly aggressive tumor type that originates from glial cells in the pons area of the brainstem, which controls vital functions including breathing, blood pressure and heart rate. DIPG occurs frequently in early childhood and has a 5-year survival rate below 1 percent. Progenetix has now incorporated the DIPG cohort, consisting of 1067 individuals from 18 publications. The measured data include copy number variation as well as (in part) point mutations in relevant genes, e.g. TP53, NF1, ATRX, TERT promoter.

Continue reading

TCGA CNV Data

While TCGA CNV data (www.cancer.gov/tcga) has been available in Progenetix for quite some time, we have now launched a dedicated search page to facilitate data access and visualization using the standard Progenetix tools.

Additionally, the TCGA page section provides pre-computed CNV frequency data for the individual TCGA studies.

Continue reading

arrayMap is Back

After some months of dormancy, the arrayMap resource has been relaunched through integration with the new Progenetix site. All of the original arrayMap data has now been integrated into Progenetix, and as of today the arraymap.org domain maps to a standard Progenetix search page, where only data samples with existing source data (e.g. probe specific array files) will be presented.

Continue reading

Website updates

The new year brings some refinements to biosamples search and display:

  • added example for a pure filter search (HeLa)
  • made the UCSC link dependent on variants
  • added info pop-ups to biosamples table header
  • removed DEL and DUP fractions from biosamples table
  • added label display for external_references items in biosamples table

Enjoy ...

Continue reading

Genomic Copy Number Signatures...

Genomic Copy Number Signatures Based Classifiers for Subtype Identification in Cancer

Bo Gao and Michael Baudis (2020)

bioRxiv, 2020-12-18
Continue reading

API and Services Documentation

Following the launch of the updated Progenetix website (new interface, now much more data with >130'000 samples...) and the recent introduction of the new Python based bycon API for BeaconPlus and Progenetix Services we now also have some structured information for the different API options.

Continue reading

pgx namespace and persistent identifiers

While the pgx prefix had been registered in 2017 with identifiers.org, we recently changed the resolver and target mappings on the Progenetix server. This went hand-in-hand with the generation of unique & persistent identifiers for the main data items.

Continue reading

bycon powered BeaconPlus

Moving to a new, Python-based API

We've changed the Beacon backend to the bycon code base. The new project's codebase is accessible through the bycon project. Contributions welcome!

Example

Continue reading

GA4GH Beacon v2 at GA4GH Plenary

GA4GH Beacon v2 - Evolving Reference Standard for Genomic Data Exchange

GA4GH 8th Plenary

Gary Saunders, Jordi Rambla de Argila, Anthony Brookes, Juha Törnroos and Michael Baudis

For the ELIXIR Beacon project, GA4GH Discovery work stream and the international network of Beacon API developers

The Beacon driver project was one of the earliest initiatives of the Global Alliance for Genomics and Health, with the Beacon v1.0 API as the first approved GA4GH standard. Version 2 of the protocol is slated to provide fundamental changes, towards an Internet of Genomics foundational standard:

Continue reading

Progenetix at GA4GH 2020 Plenary

GA4GH 8th Plenary

Michael Baudis

The Progenetix oncogenomics resource provides sample-specific cancer genome profiling data and biomedical annotations as well as provenance data from cancer studies. Especially through currently 113322 curated genomic copy number variation (CNV) profiles from 1600 individual studies representing over 500 cancer types (NCIt), Progenetix empowers aggregate and comparative analyses which vastly exceed individual studies or single diagnostic concepts.

Continue reading

New Progenetix Website

The Progenetix website has been completely rebuilt using a JavaScript / React based framework and API based content delivery. At its core, the site is built around the Beacon standard, with some extensions for data collections and advanced query options.

Continue reading

Progenetix now licensed under CC-BY 4.0

After many years of using a Creative Commons CC-BY-SA ("attribution + share alike") license, the Progenetix resource has dropped the "SA - share alike" attribute and is now "attribution" only. This may facilitate the use of the data in more complex and/or commercial scenarios - enjoy!

Continue reading

Error Calibration ... for CNA Analysis

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis.

Bo Gao and Michael Baudis (2020)

bioRxiv, 2019-07-31. DOI 10.1101/720854
Genomics, Volume 112, Issue 5, September 2020, Pages 3331-3341, accepted 2020-05-06 doi.org/10.1016/j.ygeno.2020.05.008.
Continue reading

Geographic assessment of cancer genome profiling studies

Geographic assessment of cancer genome profiling studies.

Paula Carrio Cordo, Elise Acheson, Qingyao Huang and Michael Baudis (2020)

DATABASE, Volume 2020, 2020, baaa009, doi.org/10.1093/database/baaa009
Continue reading

CURIE Prefix Remapping - NCIT & PMID

Wherever possible, data annotation in Progenetix uses {S}[B] OntologyClass objects for categorical values, with CURIEs as id values. So far, the Progenetix databases had used pubmed: for PubMed identifiers and ncit: for NCI Metathesaurus (Neoplasm) ids.
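
A prefix remapping of this kind can be sketched as a small lookup on the CURIE prefix. The target prefixes PMID and NCIT are inferred from the title of this entry; the function name is hypothetical.

```python
# Previous prefix -> remapped prefix (targets inferred from the entry title).
PREFIX_MAP = {"pubmed": "PMID", "ncit": "NCIT"}


def remap_curie(curie):
    """Replace a known legacy prefix; other CURIEs are returned unchanged."""
    prefix, separator, local_id = curie.partition(":")
    if not separator:
        return curie  # not a CURIE, leave untouched
    return f"{PREFIX_MAP.get(prefix, prefix)}:{local_id}"


print(remap_curie("pubmed:22824167"))
print(remap_curie("ncit:C3052"))
```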

Continue reading

Population assignment from cancer genomes

Enabling population assignment from cancer genomes with SNP2pop.

Huang Q and Baudis M. (2020)

Sci Rep 10, 4846 (2020). doi.org/10.1038/s41598-020-61854-x
Continue reading

BeaconPlus in ELIXIR Beacon Network

The Beacon+ implementation of the GA4GH Beacon protocol has become a part of the ELIXIR Beacon Network, an expanding Beacon service to query multiple Beacon resources and aggregate their query results.

Continue reading

Beacon Variants in UCSC Browser

The response element of the Beacon+ interface now contains a link for displaying the matched variants of e.g. a CNV query in the UCSC genome browser.

Continue reading

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis

Minimum Error Calibration and Normalization for Genomic Copy Number Analysis.

Bo Gao and Michael Baudis (2019)

bioRxiv, 2019-07-31. DOI 10.1101/720854
Continue reading

New info.progenetix.org site

Launch of new info.progenetix.org resource site

Today, we started to provide a new documentation structure for our group's work and software projects.

The site is expected, over time, to replace the previous Progenetix guide.

Continue reading

arrayMap 2014: an...

arrayMap 2014: an updated cancer genome resource.

Cai H, Gupta S, Rath P, Ai N, Baudis M.

Abstract Somatic copy number aberrations (CNA) represent a mutation type encountered in the majority of cancer genomes. Here, we present the 2014 edition of arrayMap (www.arraymap.org), a publicly accessible collection of pre-processed oncogenomic array data sets and CNA profiles, representing a vast range of human malignancies. Since the initial release, we have enhanced this resource both in content and especially with regard to data mining support.

Continue reading

Chromothripsis-like...

Chromothripsis-like patterns are recurring but heterogeneously distributed features in a survey of 22,347 cancer genome screens.

Cai H, Kumar N, Bagheri HC, von Mering C, Robinson MD, Baudis M.

Abstract BACKGROUND: Chromothripsis is a recently discovered phenomenon of genomic rearrangement, possibly arising during a single genome-shattering event. This could provide an alternative paradigm in cancer development, replacing the gradual accumulation of genomic changes with a "one-off" catastrophic event. However, the term has been used with varying operational definitions, with the minimal consensus being a large number of locally clustered copy number aberrations.

Continue reading

Progenetix: 12 years...

Progenetix: 12 years of oncogenomic data curation.

Cai H, Kumar N, Ai N, Gupta S, Rath P, Baudis M.

Abstract DNA copy number aberrations (CNAs) can be found in the majority of cancer genomes and are crucial for understanding the potential mechanisms underlying tumor initiation and progression. Since the first release in 2001, the Progenetix project (www.progenetix.org) has provided a reference resource dedicated to provide the most comprehensive collection of genome-wide CNA profiles.

Continue reading

Progenetix & arrayMap Changes (2012-06-01 - 2013-05-22)

2013-05-22

  • bug fix: fixing lack of clustering for CNA frequency profiles in the analysis section
  • removed "Series Search" from the arrayMap side bar; kind of confusing - just search for the samples & select the series

2013-05-12

  • introduced a method to combine sample annotations and segmentation files for user data processing (see "FAQ & GUIDE")
  • fixed some array plot presentation and replotting problems

2013-05-05

  • consolidation of script names - again, don't use deep links (besides for "api.cgi?...")
  • moving of remaining sample selection options (random sample number, segments number, age range) to the sample selection page, leaving the pre-analysis page (now "prepare.cgi") for plotting/grouping options
  • fixed the KM-style survival plots

2013-04-10

  • re-factoring of the cytobands plotting for histograms and heatmaps; this also fixes missing histogram tiles
  • analysis output page: the circular histogram/connections plot and group specific histograms are now all available as both SVG and PNG image files

2013-04-06

Some changes to the plotting options:

  • the circular plot is now added as a default; and connections are drawn in for <= 30 samples (subject to change)
  • one can now mark up multiple genes (or other loci of interest), for all plot types

2013-03-25

  • added option to create custom analysis groups based on text match values
  • rewritten circular plot code

2013-02-27

  • copied data for PMIDs 17327916, 17311676, 18506749 and 18246049 from arrayMap to Progenetix

2013-02-24

  • bug fix: gene selector was broken for about a week; fixed

2013-02-17

  • In many places, images are now converted server side to PNG data streams and embedded into the web pages. This will substantially decrease web data traffic and page download times. Fully linked SVG images (including region links etc.) are still available through the analysis pipeline.

2013-02-13

  • data fix: PMID 18160781 had missing loss values (due to irregular character encoding); fixed, thanks to Emanuela Felley-Bosco for the note!

2012-12-14

  • moved the region filter from the analysis to the sample selection page
  • added a "mark region" option to the analysis page: one now can highlight a genome region in histograms and matrix plots

2012-11-29

  • added "select all" option to entity lists
  • implemented first version of sample-to-entity match score
  • added single sample annotation input field to "User File Processing"; i.e. one can now type in CNA data for a single case, and have this visualised and similar cases listed
  • added per sample CNA visualisation to the samples details listings (currently if up to 100 samples)
  • added direct access to sample details listing to the subsets pages

2012-11-09

  • adding of abstract search to the publication search page

2012-10-25

  • introduction of a matching function for similar cases by CNA profile, accessible through the sample details pages of both Progenetix and arraymap

2012-10-22

  • Introduction of SEER groups

2012-09-26

The database now contains the copy number status for different interval sizes (e.g. 1MB). With this, users can now create their own data plots (histograms etc.) using more than 10000 cancer copy number profiles with a high resolution. The options here are still being tested and improved - comments welcome!

2012-09-18

  • added a new export file format "ANNOTATED SEGMENTS FILE", which uses the first columns for standard segment annotation, followed by some diagnostic and clinical data; i.e., the information for a case is repeated for each segment:
GSM255090   22  25063244    25193559    1   NA  C50 8500/3  breast  Infiltrating duct carcinoma, NOS    Carcinomas: breast ca.  NA  1   51  0.58  
GSM255090   22  25368299    48899534    -1  NA  C50 8500/3  breast  Infiltrating duct carcinoma, NOS    Carcinomas: breast ca.  NA  1   51  0.58  
GSM255091   1   2224111 30146401    -1  NA  C50 8500/3  breast  Infiltrating duct carcinoma, NOS    Carcinomas: breast ca.  NA  0   72  0.54  
GSM255091   1   35418712    37555461    1   NA  C50 8500/3  breast  Infiltrating duct carcinoma, NOS    Carcinomas: breast ca.  NA  0   72  0.54  
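
A file of this kind could be read along the following lines. This is a sketch under stated assumptions: the columns are taken to be tab-delimited, and the first five are interpreted as sample identifier, chromosome, start, end and a gain/loss value (1 or -1), based only on the example rows above. The field names and the `parse_annotated_segments` helper are hypothetical; the remaining diagnostic and clinical columns are kept as uninterpreted strings.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    sample_id: str          # assumed meaning of column 1
    chromosome: str         # assumed meaning of column 2
    start: int              # assumed meaning of column 3
    end: int                # assumed meaning of column 4
    value: int              # assumed: 1 = gain, -1 = loss
    annotations: List[str]  # remaining columns, left uninterpreted


def parse_annotated_segments(text: str) -> List[Segment]:
    """Parse tab-delimited annotated segment lines into Segment records."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 5:
            continue  # not a complete segment line
        segments.append(
            Segment(
                sample_id=fields[0],
                chromosome=fields[1],
                start=int(fields[2]),
                end=int(fields[3]),
                value=int(fields[4]),
                annotations=fields[5:],
            )
        )
    return segments


# Two of the example rows, written with explicit tab separators.
sample_text = (
    "GSM255090\t22\t25063244\t25193559\t1\tNA\tC50\t8500/3\tbreast\n"
    "GSM255090\t22\t25368299\t48899534\t-1\tNA\tC50\t8500/3\tbreast\n"
)
segments = parse_annotated_segments(sample_text)
```

Because the case information is repeated on every segment line, grouping the records by `sample_id` recovers one annotation set per case.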

2012-09-13

  • added gene selection for region specific replotting of array data

2012-08-22

  • the gene database has been changed to the last version of the complete (HUGO names only) Ensembl gene list for HG18; previously, only a subset of "cancer related genes" was offered in the gene selection search fields

2012-07-04

  • some interface and form elements have been streamlined (e.g. less commonly used selector fields, sample selection options)
  • some common options are now displayed only if activated (e.g. "mouse over" to see all files available for download)
  • icon quality has been enhanced for all but the details pages

2012-06-13

  • New: All pre-generated histogram and ideogram plots are now produced based on a 1Mb matrix, with a 500Kb minimum size filter to remove CNV/platform dependent background from some high resolution array platforms. The unfiltered data can still be visualized through the standard analysis procedures.
  • Bug fix: Interactive segment size filtering so far only worked for region specific queries, but not as a general filter (see above). This has been fixed; a minimum segment size in the visualization options now will remove all smaller segments.
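
The combination of a minimum segment size filter and a 1Mb matrix can be sketched as follows. This is an illustrative sketch only, not the site's pipeline: the tuple layout, the function names and the rule that any overlap marks a bin are assumptions, and a real implementation may use coverage thresholds instead.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start, end, value), value 1 = gain, -1 = loss.
Segment = Tuple[int, int, int]

BIN_SIZE = 1_000_000  # 1Mb bins, as in the pre-generated plots
MIN_SIZE = 500_000    # 500Kb minimum segment size


def filter_by_size(segments: List[Segment], min_size: int = MIN_SIZE) -> List[Segment]:
    """Drop all segments shorter than min_size bases."""
    return [s for s in segments if s[1] - s[0] >= min_size]


def bin_status(segments: List[Segment], chrom_length: int,
               bin_size: int = BIN_SIZE) -> List[int]:
    """Map segments of one chromosome onto fixed-size bins.

    Assumption: a bin takes the value of any segment overlapping it;
    later segments overwrite earlier ones.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    status = [0] * n_bins
    for start, end, value in segments:
        first = start // bin_size
        last = min((end - 1) // bin_size, n_bins - 1)
        for i in range(first, last + 1):
            status[i] = value
    return status


# Hypothetical chromosome of 5Mb with one large gain and one small loss.
raw = [(0, 2_500_000, 1), (3_100_000, 3_300_000, -1)]
unfiltered = bin_status(raw, chrom_length=5_000_000)
filtered = bin_status(filter_by_size(raw), chrom_length=5_000_000)
```

In this example the 200Kb loss appears in the unfiltered matrix but is removed by the size filter, which is the intended effect on small, platform dependent background segments.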

2012-06-01

  • NEW: change log; that is what is shown here
  • FEATURE: The interval selector now has options to include the p-arms of acrocentric chromosomes (though the data itself there may be incompletely annotated!). Feature requested by Melody Lam.
Continue reading

arrayMap feature update(s)

Over the last weeks, we have introduced a number of new search/ordering features to arrayMap. Some of those mimic functions previously implemented in Progenetix. Overall, the highlights are:

  • ICD entity aggregation: all ICD-O entities with their corresponding samples
  • ICD locus aggregation: all tumor loci with their corresponding samples
  • Clinical group aggregation: clinical super-entities (e.g. "breast ca.": all carcinoma types with locus breast) with their samples
  • Publication aggregation: all publications with samples in arrayMap

In contrast to Progenetix, we do not offer precomputed SCNA histograms. Users can generate them on the fly, but should consider the specific challenges in doing so (e.g. noise background in frequency calculations).

Continue reading

Genomic imbalances in 5918 malignant tumors

Genomic imbalances in 5918 malignant epithelial tumors: an explorative meta-analysis of chromosomal CGH data.

Baudis M.

Abstract BACKGROUND: Chromosomal abnormalities have been associated with most human malignancies, with gains and losses on some genomic regions associated with particular entities. METHODS: Of the 15429 cases collected for the Progenetix molecular-cytogenetic database, 5918 malignant epithelial neoplasias analyzed by chromosomal Comparative Genomic Hybridization (CGH) were selected for further evaluation. For the 22 clinico-pathological entities with more than 50 cases, summary profiles for genomic imbalances were generated from case specific data and analyzed.

Continue reading

Online database and...

Online database and bioinformatics toolbox to support data mining in cancer cytogenetics.

Baudis M.

Continue reading

Progenetix.net: an...

Progenetix.net: an online repository for molecular cytogenetic aberration data.

Baudis M, Cleary ML.

Abstract Through sequencing projects and, more recently, array-based expression analysis experiments, a wealth of genetic data has become accessible via online resources. In contrast, few of the (molecular-) cytogenetic aberration data collected in the last decades are available in a format suitable for data mining procedures. www.progenetix.net is a new online repository for previously published chromosomal aberration data, allowing the addition of band-specific information about chromosomal imbalances to oncologic data analysis efforts.

Continue reading