Up for a Challenge? (U4C) – Stimulating Innovation in Breast Cancer Genetic Epidemiology
- Breast Cancer GWAS Data Available for the Challenge
- Evaluation Criteria
- Winning Teams and Project Descriptions
- U4C Publications
- NCI Contacts
Breast cancer remains a major public health burden. Researchers have performed genome-wide association studies (GWAS) to identify key genes and biological pathways potentially affecting disease risk. More than 100 common genetic variants have been associated with breast cancer; however, these variants explain only a small proportion of the estimated genetic contribution to the risk of breast cancer.
To stimulate innovation in the field of genetic epidemiology, the Epidemiology and Genomics Research Program (EGRP) sponsored a prize competition to inspire novel cross-disciplinary approaches to more fully decipher the genomic basis of breast cancer. "Up For A Challenge (U4C) – Stimulating Innovation in Breast Cancer Genetic Epidemiology" was launched in 2015.
The goal of this Challenge was to use innovative approaches to identify novel pathways—including new genes or combinations of genes, genetic variants, or sets of genomic features—involved in breast cancer susceptibility in order to generate new biological hypotheses. Several data sets were made available for use in the Challenge, some of which were shared for the first time. In addition, Challenge participants could use any other publicly available data sets for the purposes of developing and applying methods for identification of novel pathways.
The U4C advanced innovation in the field of genetic epidemiology by providing further insight into the genetic contribution to breast cancer, increasing the number and diversity of minds tackling a tough scientific problem, and making breast cancer genetic epidemiologic data more widely available.
Breast Cancer GWAS Data Available for the Challenge
For the purposes of this Challenge, data contributors made several GWAS data sets (phenotype, genotype, and imputation data) available in the database of Genotypes and Phenotypes (dbGaP). Challenge participants could use any of the dbGaP data sets or any other publicly available or controlled-access data from NIH repositories.
Summary Table of Breast Cancer GWAS Data Available for U4C
| dbGaP Accession Number | Study Name | Data Contributors |
|---|---|---|
| phs000912 | Admixture Mapping for Breast Cancer in Latinas | Esther John, Elad Ziv* |
| phs000851 | African American Breast Cancer GWAS | Christine Ambrosone, Leslie Bernstein, Christopher Haiman*, Jennifer Hu, Esther John, Andrew Olshan, Regina Ziegler |
| phs000812 | Breast and Prostate Cancer Cohort Consortium GWAS | Federico Canzian, Stephen Chanock, Susan Gapstur, Montserrat Garcia-Closas, Christopher Haiman, Brian Henderson, David Hunter, Peter Kraft*, Sara Lindström, Elio Riboli |
| phs000147 | Cancer Genetic Markers of Susceptibility Breast Cancer GWAS | Stephen Chanock, David Hunter, Peter Kraft* |
| phs000517 | GWAS in African Americans, Latinos, and Japanese | Christopher Haiman* |
| phs000383 | GWAS of Breast Cancer in the African Diaspora | Clement Adebamowo, Stefan Ambs, Susan M. Domchek, Adeyinka G. Falusi, Anselm J.M. Hennis, Dezheng Huo, Esther John, Maria Cristina Leske, Katherine L. Nathanson, Barbara Nemesure, Temidayo O. Ogundiran, Olufunmilayo I. Olopade*, Timothy R. Rebbeck, Suh-Yuh Wu, Yonglan Zheng |
| phs000799 | Shanghai Breast Cancer Genetics Study | Yu-Tang Gao, Wei Zheng* |
*Denotes contact Principal Investigator for study
Evaluation Criteria
Entries were scored by the Challenge Evaluation Panel using the criteria listed below. The highest scoring applications were evaluated for reproducibility. In order to qualify for a Challenge prize, the entry results had to be reproduced by data scientists. NCI judges reviewed scores and reproduction and made recommendations to the NCI Director. The NCI Director made the final selection of entries for award.
Scoring Criteria (100 points)
1. Identification of Novel Findings (25 points)
Using breast cancer GWAS data sets available in dbGaP and/or any other publicly available data sets, Challenge participants must identify new genes or combinations of genes, genetic variants, or sets of genomic features associated with breast cancer susceptibility.
- The National Human Genome Research Institute’s (NHGRI) Catalog of Published Genome Wide Association Studies or variants/loci identified in the following publications can be used to evaluate possible novel findings:
- The following scale for novelty is provided as a guide for the Challenge Evaluation Panel:
- New variants in well-established high or moderate penetrance genes (e.g., BRCA1/BRCA2; ATM; PALB2)
- New variants in GWAS-identified genes or loci
- New combinations of variants which were previously identified (i.e., the combination or combined effect is new, but the variants were previously identified)
- New genes or loci
- New combinations of variants from genes or loci not identified previously (i.e., the combination and some of the variants are new)
2. Replication of Findings (25 points)
Evidence of the validity of the proposed novel finding will be evaluated through replication.
- Replication can be accomplished in several different ways. These might include using data sets as testing and training data (or discovery in one data set and replication in another data set) or dividing the data into several portions and performing some type of cross-validation. The Challenge Evaluation Panel will also be open to other innovative approaches for replication.
- The Challenge participant will need to select criteria for replication and provide a justification for the selected criteria. Using the criteria selected by the Challenge participant, the Challenge participant must demonstrate replication of findings.
- NOTE: Challenge participants should provide their criteria for replication in the narrative portion of their Challenge Entry.
- The adequacy of criteria selected by the Challenge participant and evidence for replication will be scored by the Challenge Evaluation Panel.
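The two replication strategies mentioned above (a discovery/replication split and cross-validation) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the 30% replication fraction, and the use of plain sample indices are hypothetical and were not prescribed by the Challenge.

```python
import random


def split_discovery_replication(sample_ids, replication_fraction=0.3, seed=0):
    """Randomly split sample IDs into a discovery set and a held-out replication set.

    The 0.3 default is an illustrative assumption, not a Challenge requirement.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_replication = int(round(len(ids) * replication_fraction))
    return ids[n_replication:], ids[:n_replication]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    discovery, replication = split_discovery_replication(range(10))
    print(len(discovery), len(replication))
    for train, test in k_fold_indices(10, k=5):
        print(sorted(test))
```

Fixing the random seed keeps the split reproducible, which matters here because qualifying entries had to be reproduced independently.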
3. Innovation of Approach (25 points)
Innovation and creativity of the submitted approach will be evaluated. Innovation will be defined as a new or significantly improved method. The submitted narrative must describe what is innovative about the approach, what this approach is building on, and why the approach is necessary or how it improves upon existing approaches. Some criteria for innovation include the following:
- Does the Entry seek to shift current paradigms by utilizing novel theoretical concepts, approaches, or methodologies?
- Are the concepts, approaches, or methods, in the Entry novel to this field of research or novel in a broader sense?
- Does the Entry represent a refinement, improvement, or new application of theoretical concepts, approaches or methodologies?
4. Evidence of Novel Biological Hypothesis(es) (10 points)
- Evaluation of this aspect of Challenge Entries will be based on whether findings (i.e., new genes or combinations of genes, genetic variants, or sets of genomic features) lead to novel biological hypotheses. A description of these hypotheses should be provided in the final project Entry.
- Novel biological hypotheses should be testable, either using computational or laboratory approaches. Evaluation will be based on the narrative description of the design of testable experiments, which could examine the novel biological hypothesis identified through these new genes or combinations of genes, genetic variants, or sets of genomic features associated with breast cancer. The format should mirror an outline of grant-specific aims.
- NOTE: The “Evidence of Novel Biological Hypothesis(es)” criterion (4) is distinct from the “Identification of Novel Findings” criterion (1). Criterion (4) is based on the narrative description of hypotheses generated from the findings and of proposed follow-up experiments. In contrast, criterion (1) concerns the identification of new genes or combinations of genes, genetic variants, or sets of genomic features associated with breast cancer susceptibility.
5. Collaboration (15 points)
Points will be awarded based on (a) the number of different fields represented on the Team; (b) the number of new collaborations represented on the Team (defined as individuals not having published together in the past 5 years); and (c) the number of individuals invited to participate in the Challenge by Team members resulting in Entries to the Challenge.
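As a quick check of the rubric arithmetic, the five criteria above can be tabulated and summed. The maximum points come from the criteria as listed; the dictionary keys, the `total_score` helper, and the example entry's scores are hypothetical and do not represent any actual Challenge entry.

```python
# Maximum points per criterion, as listed in the scoring criteria above.
MAX_POINTS = {
    "novel_findings": 25,
    "replication": 25,
    "innovation": 25,
    "biological_hypotheses": 10,
    "collaboration": 15,
}


def total_score(scores):
    """Sum per-criterion scores, checking each against its maximum."""
    total = 0
    for criterion, maximum in MAX_POINTS.items():
        points = scores.get(criterion, 0)  # a missing criterion scores zero
        if not 0 <= points <= maximum:
            raise ValueError(f"{criterion}: {points} is outside 0..{maximum}")
        total += points
    return total


if __name__ == "__main__":
    # The five criteria add up to the stated 100 points.
    print(sum(MAX_POINTS.values()))
    # Hypothetical entry, for illustration only.
    example_entry = {
        "novel_findings": 20,
        "replication": 18,
        "innovation": 22,
        "biological_hypotheses": 7,
        "collaboration": 10,
    }
    print(total_score(example_entry))
```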
Finalists and Project Descriptions
Back row: Jason Moore, Ph.D.; Leah Mechanic, Ph.D., M.P.H.; Sara Lindström, MSc., Ph.D.
Middle row: Joshua Hoffman, Ph.D.; Michael Guertin, Ph.D.; Elizabeth Gillanders, Ph.D.
Front row: Yunxian (Fureya) Liu, Ph.D.; Chad Myers, Ph.D.; Wen Wang, Ph.D.
GRAND PRIZE: Team UCSF
Team Captain: John Witte, Ph.D., University of California, San Francisco
Team Members: Nima Emami, Ph.D.; Rebecca Graff, Ph.D.; Dexter Hadley, M.D., Ph.D.; Josh Hoffman, Ph.D., M.S.; Donglei Hu, Ph.D.; Scott Huntsman, M.S.; Lancelote Leong, B.A.; Arunabha Majumdar, Ph.D.; Michael Passarelli, Ph.D., M.P.H.; Caroline Tai, Ph.D., M.P.H.; Noah Zaitlen, Ph.D.; Elad Ziv, M.D.
Team UCSF used all the designated GWAS data sets and performed a traditional GWAS to reproduce previously published findings, followed by a genome-wide association of gene expression (GWAGE) analysis and admixture mapping to identify new genes. Using the GWAGE approach, they identified novel associations between breast cancer and the ACAP1 and RTKN2 genes. These findings were replicated in the UK Biobank study. ACAP1 and RTKN2 are in the same gene family. Moreover, ACAP1 interacts with the third cytoplasmic loop of SLC2A4/GLUT4, while RTKN2 is implicated in the activation of the NF-κB pathway, suggesting possible biological mechanisms for these findings.
GRAND PRIZE: UMN-CSBIO
Team Captain: Chad Myers, Ph.D., University of Minnesota
Team Members: Carol Lange, Ph.D.; Wen Wang, Ph.D.; Zhiyuan Xu, Ph.D.
Team UMN-CSBIO used an innovative computational approach to search for pathway-level interactions instead of examining individual variants or genes. By examining pathway interactions using two of the U4C-designated GWAS data sets, the team identified steroid hormone biosynthesis as a major hub of interactions. This pathway was implicated as interacting with many other pathways, including a gene set previously associated with acute myeloid leukemia (AML). Several existing studies have reported chemotherapy treatment for breast cancer as a risk factor for AML. Importantly, these interactions would have been missed using traditional approaches.
SECOND PLACE: Team Transcription
Team Captain: Michael Guertin, Ph.D., University of Virginia
Team Members: Mete Civelek, Ph.D.; Mikhail Dozmorov, Ph.D.; Yunxian (Fureya) Liu, Ph.D.; Stephen Rich, Ph.D.
Team Transcription employed a novel integrative genomics approach to explore the hypothesis that many of the non-coding single nucleotide polymorphisms (SNPs) identified by GWAS alter transcription factor (TF) binding sites and mediate their effect on disease by modulating TF binding and gene regulation. This team identified a SNP, rs4802200, in perfect linkage disequilibrium with a GWAS-identified SNP, which is predicted to disrupt ZNF143 transcription factor binding within a breast cancer-relevant regulatory element. This SNP is a strong expression quantitative trait locus (eQTL) for ZNF404 in breast tissue. This pipeline can be used as a general framework to identify candidate causal variants within regulatory regions and TF binding sites that confer phenotypic variation and disease risk.
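For readers unfamiliar with the term, "perfect linkage disequilibrium" is commonly taken to mean that the squared correlation r² between the alleles at two sites equals 1. The sketch below computes r² from phased haplotypes using the standard formula r² = D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA·pB. It is a generic illustration with made-up toy haplotypes, not the team's pipeline, and the function name is hypothetical.

```python
def ld_r_squared(haplotypes):
    """Compute r^2 between two biallelic sites from phased haplotypes.

    Each haplotype is an (a, b) tuple of 0/1 alleles at site A and site B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # allele-1 frequency at site A
    p_b = sum(b for _, b in haplotypes) / n  # allele-1 frequency at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either site is monomorphic")
    return d * d / denominator


if __name__ == "__main__":
    # Toy data (made up): alleles always travel together, so r^2 is 1.
    perfect = [(1, 1), (1, 1), (0, 0), (0, 0)]
    # Toy data (made up): alleles are independent, so r^2 is 0.
    independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(ld_r_squared(perfect), ld_r_squared(independent))
```

When r² is 1, the two SNPs carry the same association signal, so functional evidence such as a disrupted TF binding site is needed to prioritize one as the candidate causal variant.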
Team: U4C Maroons
Team Captain: Dezheng Huo, M.D., Ph.D., University of Chicago
Team Members: Guimin Gao, Ph.D.; Hae Kyung Im, Ph.D.; Olufunmilayo Olopade, Ph.D.; Brandon Pierce, Ph.D.
Team Captain: Knut M. Wittkowski, Ph.D., Sc.D., The Rockefeller University
Team Members: Christina Dadurian, B.A.
U4C Publications
Editorial Describing Challenge
- Mechanic LE, Lindström S, Daily KM, Sieberts SK, Amos CI, Chen HS, Cox NJ, Dathe M, Feuer EJ, Guertin MJ, Hoffman J, Liu Y, Moore JH, Myers CL, Ritchie MD, Schildkraut J, Schumacher F, Witte JS, Wang W, Williams SM, U4C Challenge Participants, U4C Challenge Data Contributors, Gillanders EM. Up For A Challenge (U4C): Stimulating innovation in breast cancer genetic epidemiology. PLoS Genetics. 2017 Sep 28 [Epub].
Publications from U4C Teams
- Gao G, Pierce BL, Olopade OI, Im HK, Huo D. Trans-ethnic predicted expression genome-wide association analysis identifies a gene for estrogen receptor-negative breast cancer. PLoS Genetics. 2017 Sep 28 [Epub].
- Hoffman JD, Graff RE, Emami NC, Tai CG, Passarelli MN, Hu D, Huntsman S, Hadley D, Leong L, Majumdar A, Zaitlen N, Ziv E, Witte JS. Cis-eQTL-based trans-ethnic meta-analysis reveals novel genes associated with breast cancer risk. PLoS Genetics. 2017 Mar 31; 13(3): e1006690.
- Liu Y, Walavalkar NM, Dozmorov MG, Rich SS, Civelek M, Guertin MJ. Identification of breast cancer associated variants that modulate transcription factor binding. PLoS Genetics. 2017 Sep 28 [Epub].
- Wang W, Xu ZZ, Costanzo M, Boone C, Lange CA, Myers CL. Pathway-based discovery of genetic interactions in breast cancer. PLoS Genetics. 2017 Sep 28 [Epub].
NCI Contacts
- Elizabeth Gillanders, Ph.D., Chief, Genomic Epidemiology Branch, EGRP
- Leah Mechanic, Ph.D., M.P.H., Program Director, Genomic Epidemiology Branch, EGRP