reGenotyper: Detecting mislabeled samples in genetic data

Research output: Contribution to journal › Article

  • External authors:
  • Konrad Zych
  • Basten L. Snoek
  • Miriam Rodriguez
  • K. Joeri Van Der Velde
  • Danny Arends
  • Harm Jan Westra
  • Morris A. Swertz
  • Jan E. Kammenga
  • Ritsert C. Jansen
  • Yang Li

Abstract

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the "ideal" genotype and identify "best-matched" labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a "data cleaning" step before standard data analysis.

Bibliographical metadata

Original language: English
Article number: 0171324
Journal: PLoS ONE
Volume: 12
Issue number: 2
DOIs
State: Published - 13 Feb 2017