Progenetix Data Review

This is a tracker for datasets which should be reviewed / flagged / re-processed or added to Progenetix in the first place. Please check & comment when done.

This mostly looks like noise?


The samples are highly skewed towards deletion...


Excluded due to lacking CNV annotations (source file w/ complex karyotypes but not parsed correctly in FMP).

  • 2022-02-02 PMID:17934521

  • SOLVED 2022-02-02: the 12 samples w/ platform geo:GPL5055 have only chr1 probes; removed. The other 96 arrays (like the example below) are GPL5056 and also have genome-covering probes

  • odd provenance: the samples have been tagged as (and seem to correspond to) GSE7428, with the per-sample data on the server corresponding to the chr1-only arrays from GEO

  • however, the stored variants & summary profile indicate nice whole-genome CNV profiles
  • have to look up provenance; maybe a mix of annotations (from where?) and arrays?

  • This example really looks like a combination of whole-chromosome CNVs & array for chr1:





  • 2021-12-17 PMID:19330026

  • SOLVED 2022-02-03: removed

  • only partial genome coverage => should be flagged/removed?
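
A rough sketch of how arrays with only partial genome coverage (as here, or the chr1-only GPL5055 arrays above) could be flagged automatically. This is not existing Progenetix code: the record layout (`array_id`, `chromosome`), the function name and the 50% threshold are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: coverage is judged against the 22 autosomes only.
EXPECTED_CHROMOSOMES = {str(n) for n in range(1, 23)}


def flag_partial_coverage(probes, min_fraction=0.5):
    """Return {array_id: sorted chromosomes} for arrays whose probes
    cover fewer than min_fraction of the expected chromosomes.

    probes: iterable of dicts with hypothetical keys
            'array_id' and 'chromosome'.
    """
    chroms_by_array = defaultdict(set)
    for probe in probes:
        chroms_by_array[probe["array_id"]].add(str(probe["chromosome"]))

    flagged = {}
    for array_id, chroms in chroms_by_array.items():
        covered = len(chroms & EXPECTED_CHROMOSOMES)
        if covered / len(EXPECTED_CHROMOSOMES) < min_fraction:
            flagged[array_id] = sorted(chroms)
    return flagged
```

Counting probe positions rather than called CNVs matters here: a sample with few CNVs is legitimate, while an array without probes outside chr1 is not a whole-genome profile.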


  • 2021-09-07 PMID:23417712

  • highly noisy/spiky

  • AFAIK was from methylation arrays & kept for DIPG project?
  • review / discard / select samples?


  • 2021-09-07 NCIT:C7431 FIXED

  • very strange frequency plot with just a few spikes; looks like either only 2 or so samples with only background were processed (while 3234 are listed), or there is some value error?

  • all (?) child terms are fine
  • byconeer/frequencymapsCreator.py -d progenetix -p "NCIT:C7431" doesn't help...
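
For cross-checking such a frequency map by hand, a minimal sketch of a per-bin gain/loss frequency calculation that also reports how many samples actually went into it. This is not the logic of frequencymapsCreator.py; the input layout (one list of 'DUP' / 'DEL' / None per sample, one entry per genomic bin) and the function name are assumptions.

```python
def cnv_frequencies(sample_profiles):
    """Return (gain_pct, loss_pct, n_samples) for binned CNV profiles.

    sample_profiles: list of per-sample lists (hypothetical layout),
    each holding 'DUP', 'DEL' or None for every genomic bin.
    """
    n_samples = len(sample_profiles)
    if n_samples == 0:
        return [], [], 0

    n_bins = len(sample_profiles[0])
    gains = [0] * n_bins
    losses = [0] * n_bins
    for profile in sample_profiles:
        if len(profile) != n_bins:
            raise ValueError("all profiles must have the same number of bins")
        for i, status in enumerate(profile):
            if status == "DUP":
                gains[i] += 1
            elif status == "DEL":
                losses[i] += 1

    # Percent of samples with a gain / loss in each bin.
    gain_pct = [100 * g / n_samples for g in gains]
    loss_pct = [100 * l / n_samples for l in losses]
    return gain_pct, loss_pct, n_samples
```

If the returned sample count is far below the listed 3234, the problem would be in sample selection rather than in the values themselves.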