Danielle Carrick, PhD, MHS
Program Director, Genomic Epidemiology Branch
carrick@mail.nih.gov

Somdat Mahabir, PhD, MPH
Program Director, Epidemiology and Genomics Research Program
mahabir@mail.nih.gov
Overview
Cohort studies are one of the fundamental designs for epidemiological research. Cancer epidemiology cohorts are large observational population studies in which groups of people with a set of characteristics or exposures are prospectively followed for the incidence of new cancers and cancer-related outcomes. Data from cohort studies have helped researchers to better understand the complex etiology of cancer, and have provided fundamental insights into key environmental, lifestyle, clinical, and genetic determinants of this disease and its outcomes.

Funded Projects
Related Research Resources
This non-exhaustive list links to resources for cancer epidemiologists who are interested in or conducting cohort-based studies.
Descriptive Information from Existing Cohort Studies
- Cancer Epidemiology Descriptive Cohort Database
This searchable database contains descriptive information about existing cohorts, including study design, eligibility criteria, enrollment numbers, numbers of biospecimens, and numbers of cancer and other health outcomes.
- Biospecimen Resources for Population Sciences
This list provides links to biospecimen resources that may be of interest to cancer epidemiologists, but is not exhaustive.
- NCI Cohort Consortium
The NCI Cohort Consortium is an extramural-intramural partnership formed by NCI to address the need for large-scale collaborations to pool the large quantity of data and biospecimens necessary to conduct a wide range of cancer studies. It includes investigators responsible for more than 73 high-quality cohorts involving more than 7 million people. The cohorts are international in scope and cover large, rich, and diverse populations. Investigators team up to use common protocols and methods, and to conduct coordinated parallel and pooled analyses.
NIH-Sponsored Data Repositories
- Database of Genotypes and Phenotypes (dbGaP)
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans. EGRP cohorts often participate in genetic and genomic studies that have shared their data in dbGaP. In addition, cohorts have shared their own data in dbGaP, often with extensive non-genomic data collected by the cohort. The table below lists cohort details. These and other dbGaP datasets are available via controlled access.

Cohort Data in dbGaP

| dbGaP Accession Number | Dataset Title | Contact PI |
| --- | --- | --- |
| phs002835 | Breast Cancer Family Registry (BCFR) | Mary Beth Terry, Columbia University |
| phs002733 | Colon Cancer Family Registry (CCFR) | Mark Jenkins, University of Melbourne |
| phs002171 | Serrated Colorectal Cancer: An Emerging Disease Subtype (CCFR) | Polly Newcomb and Amanda Phipps, Fred Hutchinson Cancer Center |
| phs002460 | Health Professionals Follow-Up Study (HPFS) | Lorelei Mucci, Harvard T.H. Chan School of Public Health |
| phs003538 | Light at Night and Prostate Cancer in the Health Professionals Follow-Up Study (HPFS) | Lorelei Mucci, Harvard T.H. Chan School of Public Health |
| phs002183 | Multiethnic Cohort (MEC) Study | Loïc Le Marchand, University of Hawaii Cancer Center |
| phs003786 | Mind Body Study: a sub-study on psychosocial factors and microbiomes of nurses in the Nurse’s Health Study II (NHS2) | Heather Eliassen, Harvard T.H. Chan School of Public Health |
| phs001964 | Women's Health Study (WHS) Accelerometer Data | I-Min Lee and Julie Buring, Brigham and Women's Hospital, Medical School |

Investigators who are not familiar with dbGaP public study pages may wish to review an informational guide developed by EGRP on how to navigate the different sections of those pages, in particular the sections related to non-genomic data.
- BioLINCC
The National Heart, Lung, and Blood Institute (NHLBI) hosts this centralized, controlled-access database where investigators can deposit and access datasets related to heart, lung, and blood diseases.
- EpiShare
EpiShare is a web-based platform for sharing biospecimens and/or datasets with the greater research community. EpiShare provides a central location for researchers to see summaries of National Institute of Environmental Health Sciences (NIEHS) Epidemiology Branch studies and specimen inventories, submit requests, and track all requestor correspondence.
Cohort-related Analytical Tools
- Nested Cohort Software Package
NCI's intramural Division of Cancer Epidemiology and Genetics (DCEG) has made available this software package for fitting Kaplan-Meier and Cox models to estimate standardized survival and attributable risks for studies where covariates of interest are observed on only a sample of the cohort. Standard designs that can be handled by this software include case-cohort and case-control studies conducted within defined cohorts. The software does not yet support nested case-control designs.
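To give a rough sense of what survival estimation on a sampled subset of a cohort can involve, the sketch below computes a Kaplan-Meier curve in which each sampled participant is weighted by the inverse of their sampling probability. This is an illustration only and is not the package's code or interface: the function name `weighted_kaplan_meier`, the toy data, and the choice of inverse-probability weighting are all assumptions made for this example.

```python
from collections import defaultdict


def weighted_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier estimate (hypothetical helper, not a package API).

    times   -- follow-up time for each sampled participant
    events  -- 1 if the outcome occurred at that time, 0 if censored
    weights -- assumed inverse sampling probabilities (1.0 = always sampled)

    Returns a list of (time, survival) pairs at each distinct event time.
    """
    if not (len(times) == len(events) == len(weights)):
        raise ValueError("times, events, and weights must have equal length")

    weighted_events = defaultdict(float)  # weighted outcomes at each time
    weighted_exits = defaultdict(float)   # weighted people leaving the risk set
    for t, e, w in zip(times, events, weights):
        weighted_exits[t] += w
        if e:
            weighted_events[t] += w

    at_risk = float(sum(weights))  # weighted size of the initial risk set
    survival = 1.0
    curve = []
    for t in sorted(weighted_exits):
        d = weighted_events.get(t, 0.0)
        if d > 0:
            # Product-limit step: multiply by the weighted fraction surviving t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= weighted_exits[t]
    return curve


# Toy data (made up): cases are always sampled (weight 1); non-cases are
# assumed to be sampled with probability 0.5, so each carries weight 2.
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 0, 0]
weights = [1.0, 2.0, 1.0, 2.0, 2.0]

curve = weighted_kaplan_meier(times, events, weights)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

With all weights set to 1.0, the function reduces to the ordinary Kaplan-Meier estimator for a fully observed cohort. Standard errors, standardization, and attributable risk estimation are deliberately left out of this sketch.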