Web service to optimize case-control pairing based on both demographic and genetic data

Randomised trials assess the effectiveness of a treatment by comparing cases, who receive the treatment, with matched controls, who receive a placebo. If the cases and controls come from different genetic backgrounds that are not captured by broad racial categories, we may see an effect (or a lack of effect) that is due solely to these differences rather than to the treatment, leading to false conclusions about the treatment's effectiveness. These problems exist in any study that uses cases and controls (e.g., a genetic investigation for a disease allele). To overcome these problems, we have developed the CaseControlMatcher tool. CaseControlMatcher matches cases with controls based on demographic and genetic similarity and allows users (pharmaceutical companies, population geneticists, randomised trial designers) to make more informed decisions about the results of their trials.

Our web service allows you to upload PLINK-formatted genetic and demographic data for your test cohort. CaseControlMatcher infers the geographical origins of the samples you provide using the Geographic Population Structure (GPS) tool [Elhaik et al. 2014] and then optimises the case/control pairings. The results file contains each individual's admixture components, the optimised pairings, and a list of any unmatched individuals. For 1000 individuals and 150,000 SNPs, the analysis takes ~20 minutes, and the results will be emailed to you.
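The pairing algorithm itself is not described here, so the following is only a minimal sketch of the general idea, not CaseControlMatcher's actual method. It assumes each individual already has a sex, an age, and a vector of admixture proportions, and it pairs cases with controls greedily by smallest Euclidean distance between admixture vectors. The function name `match_cases_to_controls`, the record layout, and the `max_distance` and `age_tolerance` thresholds are all hypothetical choices made for illustration.

```python
import math
from itertools import product


def match_cases_to_controls(cases, controls, max_distance=0.2, age_tolerance=5):
    """Greedily pair cases with controls (illustrative sketch only).

    cases, controls: dict mapping an individual ID to a record of the form
        {"sex": str, "age": int, "admixture": tuple of floats}
    max_distance: hypothetical cutoff on the Euclidean distance between
        admixture vectors; pairs further apart are never matched.
    age_tolerance: hypothetical cutoff on the age difference in years.

    Returns (pairs, unmatched_cases, unmatched_controls).
    """
    # Collect every case/control combination that passes the demographic
    # filters and lies within the genetic distance cutoff.
    candidates = []
    for (case_id, case), (ctrl_id, ctrl) in product(cases.items(), controls.items()):
        if case["sex"] != ctrl["sex"]:
            continue
        if abs(case["age"] - ctrl["age"]) > age_tolerance:
            continue
        distance = math.dist(case["admixture"], ctrl["admixture"])
        if distance <= max_distance:
            candidates.append((distance, case_id, ctrl_id))

    # Accept the closest candidate pairs first; each individual is used once.
    candidates.sort()
    pairs = {}
    used_controls = set()
    for _, case_id, ctrl_id in candidates:
        if case_id in pairs or ctrl_id in used_controls:
            continue
        pairs[case_id] = ctrl_id
        used_controls.add(ctrl_id)

    unmatched_cases = sorted(set(cases) - set(pairs))
    unmatched_controls = sorted(set(controls) - used_controls)
    return pairs, unmatched_cases, unmatched_controls


# Made-up example data: three admixture components per individual.
cases = {
    "case1": {"sex": "F", "age": 40, "admixture": (0.80, 0.10, 0.10)},
    "case2": {"sex": "M", "age": 55, "admixture": (0.20, 0.70, 0.10)},
    "case3": {"sex": "F", "age": 30, "admixture": (0.10, 0.10, 0.80)},
}
controls = {
    "ctrl1": {"sex": "F", "age": 42, "admixture": (0.75, 0.15, 0.10)},
    "ctrl2": {"sex": "M", "age": 53, "admixture": (0.25, 0.65, 0.10)},
    "ctrl3": {"sex": "F", "age": 31, "admixture": (0.70, 0.20, 0.10)},
}

pairs, unmatched_cases, unmatched_controls = match_cases_to_controls(cases, controls)
print(pairs)
print(unmatched_cases, unmatched_controls)
```

In this toy data, case3 and ctrl3 agree on sex and age but have very different admixture profiles, so they are reported as unmatched rather than paired. A greedy pass does not guarantee the globally best set of pairings; an assignment solver could be substituted for that step if a true optimum is needed.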

Case-control pairing using only demographic data can produce stratification bias. Here the bias is towards Western European origin, which is higher in the cases.

Eran Elhaik 2

Case-control pairing using genomic and demographic data produces optimised pairings and highlights any unpaired individuals.
