Genomic Datasets for Cancer Research
Examples of Research Use Statements and Non-Technical Summaries
Research Use Statement:
We have been involved in the development of statistical tools for high-throughput data and have expertise we can leverage to create statistical methods for existing public Genome-Wide Association Studies (GWAS). Using simulated data, we have also explored the use of alternative functionals for statistical significance in GWAS. By applying our new methods to specific datasets, we hope to bring new insights into the genetics of specific conditions, including autoimmune disease, schizophrenia, attention deficit hyperactivity disorder, psychiatric health and related somatic conditions, type 1 diabetes, and bipolar disorders. We do not intend to combine these datasets. Instead, we plan to study each of them separately. We understand that these datasets have use limitations, and we intend to respect these fully. However, the focus of our proposal is to develop statistical methodology that can be applied to any genetic study. Studying several conditions in parallel lets us determine whether the statistical methods we develop are generally useful or specific to certain conditions.
In general, we plan to develop novel statistical approaches that take advantage of the structure of these data to give more informative assessments of significance. Specifically, we will examine the behavior of estimators under both the family-wise error rate and the false discovery rate. We hope that the statistical methods we are developing and applying may lead to a clearer picture of the truly significant results in these high-dimensional studies.
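To illustrate the difference between the two error criteria named above, here is a minimal sketch, not the applicants' method. It assumes the standard Bonferroni correction as a family-wise error rate procedure and the Benjamini-Hochberg step-up rule as a false discovery rate procedure. The function names and the eight p-values are hypothetical and chosen only for illustration.

```python
from typing import List


def bonferroni_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Family-wise error rate control: reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """False discovery rate control with the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            largest_k = rank
    # Reject every hypothesis whose p-value is among the k smallest.
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for eight variants (made up for illustration only).
example_p = [0.0001, 0.004, 0.012, 0.02, 0.03, 0.2, 0.5, 0.9]
fwer_hits = bonferroni_reject(example_p)
fdr_hits = benjamini_hochberg_reject(example_p)
print(sum(fwer_hits), "rejections under Bonferroni")
print(sum(fdr_hits), "rejections under Benjamini-Hochberg")
```

On these made-up values, the Bonferroni rule rejects 2 hypotheses and the Benjamini-Hochberg rule rejects 5, which shows why the choice of error criterion changes which results are reported as significant.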
Non-Technical Summary:
The great success of GWAS is partly due to the development of rigorous data analysis pipelines created by highly qualified statistical geneticists. However, it is possible that more information can be extracted from these data using alternative approaches. We plan to explore the possibility of expanding the utility of these already valuable datasets. In particular, we will apply newly developed and validated statistical methodology to gain new insights into the genetics of the following conditions: autoimmune disease, schizophrenia, attention deficit hyperactivity disorder, psychiatric health and related somatic conditions, type 1 diabetes, and bipolar disorders.
Research Use Statement:
It is commonly believed that a common disease is caused by a common variant, but growing evidence shows that human diseases may instead be caused by the cumulative effects of many common genetic variants. This polygenic model poses a significant challenge in Genome-Wide Association Studies (GWAS), as the individual variants in a polygenic model may not show a strong association with the disease phenotype. Therefore, additional biological evidence is needed to identify polygenic models from genes that are only marginally associated with disease phenotypes in GWAS.
We have developed the concept of ontology fingerprints to characterize genes and diseases. An ontology fingerprint is a list of the ontology terms overrepresented in the PubMed abstracts linked to a gene or a disease, along with their corresponding enrichment p-values. We have further quantified the relationship between a gene and a disease by comparing their ontology fingerprints: the more similar the two fingerprints, the more likely the gene is to play a role in the disease.
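The fingerprint idea described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual scoring: it assumes a one-sided hypergeometric test for term enrichment and a cosine similarity over -log10(p) weights, neither of which is specified in the statement. All function names, ontology terms, and p-values are hypothetical.

```python
import math
from typing import Dict


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k) (assumed test).

    N: abstracts in the corpus
    K: corpus abstracts that mention the ontology term
    n: abstracts linked to the gene or disease
    k: linked abstracts that mention the term
    """
    total = math.comb(N, n)
    upper = min(n, K)
    tail = sum(math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def fingerprint_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of -log10(p) weights (assumed similarity measure)."""
    # Floor the p-values so that log10 is always defined.
    wa = {term: -math.log10(max(p, 1e-300)) for term, p in a.items()}
    wb = {term: -math.log10(max(p, 1e-300)) for term, p in b.items()}
    dot = sum(wa[term] * wb[term] for term in wa.keys() & wb.keys())
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical fingerprints: ontology term -> enrichment p-value.
gene_a = {"glomerular filtration": 1e-6, "insulin secretion": 1e-3}
gene_b = {"synaptic transmission": 1e-5}
disease = {"glomerular filtration": 1e-8, "insulin secretion": 1e-2, "proteinuria": 1e-4}

sim_a = fingerprint_similarity(gene_a, disease)
sim_b = fingerprint_similarity(gene_b, disease)
print(f"gene_a vs disease: {sim_a:.3f}")
print(f"gene_b vs disease: {sim_b:.3f}")
```

In this toy example, gene_a shares two enriched terms with the disease and therefore scores higher than gene_b, which shares none.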
We propose to: 1) develop a novel human gene network using ontology fingerprints; 2) identify gene network modules from networks relevant to kidney diseases and diabetes; and 3) test the association of these polygenic models with kidney disease and diabetes using the Genetics of Kidneys in Diabetes (GoKinD) data.
Access to the data from the GoKinD study would allow exploration of polygenic pathways to kidney disease in those with type I diabetes, as well as other phenotypes well described in this cohort, including complications of type I diabetes such as hypertension, cardiovascular disease, peripheral vascular disease, neuropathy, and retinal complications. In addition, we would explore pathways that may be associated with biomarkers of these complications, such as creatinine, HDL, and BMI. Our work to identify gene networks implicated in these conditions may inform the development of treatments for individuals suffering from these complications of type I diabetes, and possibly interventions in non-diabetic populations at increased risk.
Non-Technical Summary:
Identifying all the genetic factors that contribute to a human disease is essential to developing effective therapeutics to treat the disease. We have developed a new approach that uses the biomedical literature to study the multiple genetic factors that contribute to human disease. This approach can help identify these genetic factors from GWAS.