What Have We Learned from Epidemiology Cohorts and Where Should We Go Next?

The information on this page is archived and provided for reference purposes only.

Trends in 21st Century Epidemiology: From Scientific Discoveries to Population Health Impact

The Epidemiology and Genomics Research Program (EGRP) has initiated a strategic planning effort to develop scientific priorities for cancer epidemiology research in the next decade in the midst of a period of great scientific opportunity but also of significant resource constraints. EGRP would like to engage the research community and other stakeholders in a planning effort that will include a workshop in December 2012 to help shape cancer epidemiology research.

EGRP Invites Your Feedback

To facilitate this process, we invite the research community to join in an ongoing Web-based conversation to develop priorities and influence the next generation of high-impact studies.

This week, we address the evolution of epidemiologic cohorts in the study of cancer and other diseases. Research involving epidemiology cohorts falls primarily within two categories:

  • Large observational studies consisting of groups of people with a set of characteristics or exposures who are followed systematically and prospectively for the incidence of new cancers and for cancer mortality.
  • Large cohorts of cancer survivors assembled for observational studies designed to study a variety of cancer-related outcomes, including responses to therapies, cancer recurrence, treatment-related second cancers, and additional short- and long-term health outcomes occurring after diagnosis.

Throughout the last two decades, EGRP-supported epidemiology cohort-based studies have helped us better understand the complex etiology of cancer and have provided fundamental insights into key environmental, lifestyle, and genetic determinants of this disease. Findings from cancer epidemiology cohorts are critical for many areas of trans-disciplinary and translational research.

The adoption of new methods and technologies in cohort studies has the potential to improve recruitment and the measurement of exposures and lifestyle factors; advance biobanking and molecular characterization of study participants; and develop disease prediction and prognostic models. In addition, collaboration among existing cohorts, as consortia, will allow for increased power to detect complex interactions and to study rarer outcomes and special and underserved populations. These cohort studies can yield population-based discoveries that help basic scientists more fully understand biologic mechanisms and allow clinicians to make precision medicine a reality.

We would like to get your feedback on the following fundamental questions:

  • What developments are needed to make epidemiologic cohorts a cornerstone of the discovery to practice continuum—bridging the transition from etiology to outcomes to policy and practice?
  • How should NCI and NIH facilitate multidisciplinary collaboration to integrate these developments into the research portfolio?

Please use the comment section below to share your perspectives.

We encourage you to be as specific as possible. You can use or be inspired by the NCI Provocative Questions exercise. Your comments will be used to shape the workshop discussion in December.

Comments are also still welcome in response to the first two questions of the strategic planning series:

EGRP’s Workshop Science Advisory Group


  • Graham Colditz - August 28, 2012 at 11:00 AM (UTC -4)

    I have blogged on my site about the role of cohorts informing prevention (across the whole continuum) and published a piece following the 2-day meeting in Banff that addressed the role of cohorts. See http://bit.ly/aI2jAr

    I have also emphasized how cohorts contribute across etiology to survivorship when structured appropriately and ideally when using repeated measures of key lifestyle variables (and obtaining tumor samples for classification). See http://www.springerlink.com/content/92727v218605388u/

    So where does this leave us? Cohorts make major contributions to public health policy and practice. See, for example, ACS CPSI and CPSII cited in Surgeon General Reports on Smoking and Health, or http://jnci.oxfordjournals.org/content/100/13/918.long

    Perhaps foremost, emphasis must be placed on approaches to updating exposure assessment as technology, lifestyle, and drugs change over time. The best approach to relating these changes to cancer and cancer outcomes is through large observational studies.

    Given the importance of early life exposures for lifetime cancer risk (whether measured as telomere length or cancer diagnosis), the lack of substantial prospective data, or even recalled data, for childhood and adolescence is a pressing challenge, as more and more evidence points to growth through the late adolescent years as key to driving lifetime cancer risk.

  • Gertraud Maskarinec - August 28, 2012 at 1:34 PM (UTC -4)

    As to the question of how to expand the possibilities of epidemiologic cohorts into the area of outcomes and policy and practice, one route is to establish more links with health plans. Epidemiologists have been using Medicare data and the SEER-Medicare database to look at treatment and cancer survival, but these studies are limited to participants 65 years and over. In particular, HMOs offer the possibility to link data with younger cohorts in order to add information on new cases of chronic diseases other than cancer, treatment, and health care utilization. With improved computerization, the technology is here to do it; however, the IRB and HIPAA approvals are challenging. So one topic of discussion could be how to overcome the latter issues.

  • Julie Palmer - October 15, 2012 at 5:07 PM (UTC -4)

    The productivity of cohort studies and impact of their findings is clearly established (http://www.springerlink.com/content/j2853r1n12722565/fulltext.pdf?MUD=MP). In almost all instances, each cohort was designed and conceptualized by a few investigators who identified important gaps in existing research, developed original methods for a cohort that could address specific hypotheses, and obtained funding after rigorous peer review. U.S. cancer cohorts are diverse in design: they differ by sex, ethnicity, age, and social class distribution of the population; by the types of biospecimens collected; by whether exposure data are updated over time or collected at baseline only; by the kinds of exposure data collected; and by the method of follow-up. The diversity of these cohorts and approaches is a strength, allowing for quick adjustments as new scientific questions arise.

    The NCI has institutionalized support of existing cohorts through the infrastructure grant mechanism. Implicit is the recognition that these cohorts represent 10-40 years of data collection on diverse populations, a basis from which to launch investigations of new hypotheses. The NCI has increased the future value of existing cohorts by adding support for collection of blood samples and tumor tissue samples.

    How will existing cohorts remain productive and move us closer to the goal of reducing cancer morbidity and mortality? While some questions can be answered within a single cohort, collaborations which combine data and investigators from multiple cohorts are needed. Such collaborations will be most likely to succeed if appropriate funding support is available. How best to support consortial projects has been the subject of much discussion in the 10+ years of the NCI Cohort Consortium. Most projects fall into one of two categories, with different implications for funding.

    Collaborations of 2-6 cohorts (e.g., Breast and Prostate Cancer Cohort Consortium (http://www.ncbi.nlm.nih.gov/pubmed/18596909) and AMBER consortium on breast cancer subtypes in African American women (http://www.genengnews.com/gen-news-highlights/nci-awards-19-3m-grant-to-study-breast-cancer-disparity-between-african-american-and-european-women/81245589/)). Funding through R01, R21, and P01 mechanisms will provide support for the scientific effort required and will encourage a disparate group of investigators to work together, bringing their best ideas to the new project.

    Collaborations of large numbers of cohorts for study of rare cancers when survival is short or for examination of important subgroups of the population. Successful projects of this type have been carried out under the auspices of the NCI Cohort Consortium (http://www.ncbi.nlm.nih.gov/pubmed/20103627), but cost recovery and cohort burnout have been a concern. Continued participation in these projects will be more likely if consortial projects undergo external peer review, with funding requested to support scientific effort by the lead investigator(s), preparation of data files and MTAs (fixed amount to each cohort, e.g. $5,000), and data harmonization by a central NCI contractor. NCI has already supported harmonization of numerous data items across many cohorts. Continued support for a contractor to store cohort data and update harmonization as new cohorts are added will greatly reduce data harmonization costs for each new project. Support for investigator time for each of 20+ participating studies is not feasible. However, cohort PIs are likely to participate without funding if the project has undergone formal peer review (e.g. study section) and been judged to be of high significance.

    Longstanding cohort studies can also be a valuable resource for investigating factors related to recurrence, second cancers, survival, and quality of life after cancer. Pre-diagnosis exposure data and tumor characteristics have already been obtained. Cohorts with active follow-up will identify recurrences and second cancers through routine data collection. A new priority is for cohorts to develop protocols to obtain data on cancer treatments, quality of life, and co-morbidities that frequently occur with cancer treatment.

  • Daniela Seminara - November 13, 2012 at 11:13 AM (UTC -4)

    In 1981, a prominent epidemiologist prophesied the "Fall of Epidemiology" (1). However, thirty years later, the principles of epidemiology are still the backbone of many fields of inquiry across the translational continuum. Further, epidemiology has evolved to respond promptly to technical advances and changes in the pattern of diseases by adopting novel high-throughput technologies and research strategies to remain relevant to the complexity of the current research questions, and to expedite the translation of research results to the benefit of Public Health. A large component of what has made and will make this evolution possible is the rise of consortia as "hubs" of collaborative and interdisciplinary research (2). In cancer research, consortia-related publications in many scientific domains, including epidemiology, have increased exponentially during the last decade (PubMed search, October 2012). This trend has been driven initially, although not exclusively, by the confluence of rapid advances in genomic technologies and the need for extremely large sample sizes to address the complex interactions defining the mechanisms leading to cancer (3). The proliferation of multi-institutional cancer consortia has created a virtual "network of networks", with enormous potential not only to investigate the causes and mechanisms underlying common diseases, but also to evaluate risk, and lay the basis for designing preventions and tailoring targeted therapies (4). Most recently, cohorts have become essential contributors to cancer consortia, supporting broad, agnostic, hypothesis-generating approaches and hypothesis-driven research on complex diseases. In addition, it has been proposed that for cohort-based research to evolve into the 21st century, a "synthetic cohort" needs to be assembled. This can happen through a consortium of the major existing cohorts which will be able to focus on common diseases with a lifespan approach (5) and to address a wide range of issues spanning from prevention to control to survivorship. Results could also help inform the establishment of a future "mega-cohort" by identifying gaps and needs.

    Some of the approaches adopted by successful large consortia and nation-wide cooperative cohorts as well as the lessons learned during the implementation of multi-institutional collaborative research could be useful to this end (6):

    • Enhance follow-up of participants utilizing contemporary technologies and real-time feedback for exposure assessment;
    • Expedite outcomes and clinical and demographic data collection and integration utilizing all available health care, clinical and surveillance infrastructures and databases;
    • Optimize associated biobanks with systematic collection and management of high-quality biospecimens (blood and tissues) from adequate numbers of relevant subsets of the targeted populations;
    • Accelerate integration of emerging technologies through the establishment of standards and guidelines for "high throughput" readiness;
    • Involve multi-disciplinary teams with a wide range of expertise in planning cohort-based research (meeting of the minds);
    • Establish and maintain forums to exchange experiences and pilot results, and to plan and coordinate individual and collaborative research;
    • Implement clear, multi-level data sharing and publication guidelines and processes for prioritization of data access among investigators while protecting the privacy and confidentiality of participants;
    • Enhance communication strategies for diffusion of results and recommendations to cohort participants and health care providers, and evaluate the impact of such dissemination on health, lifestyle and clinical practices.

    The continued support of a global cohort consortium will require careful planning and review to maintain balanced representation, relevance to rapidly changing research questions, state-of-the-art methodology, and the timely involvement of investigators from many disciplines and at all stages of their careers. To this end, the Epidemiology and Genomics Research Program (EGRP) is sponsoring and planning several initiatives to support the establishment, maintenance and enhancement of cohorts and consortia infrastructures and research (7).


  • Les Robison - November 13, 2012 at 2:56 PM (UTC -4)

    Cohort studies have provided, and will continue to provide, important insights across the discovery to practice continuum. The future challenge is to further increase, and ultimately maximize, the yield from these significant investments in research. Typically, a cohort is proposed, established, maintained, and evaluated by a relatively limited number of investigators, generally within a single academic institution. While the establishment of cohort consortia has provided expanded opportunities for collaboration, consideration needs to be given to approaches to encourage and incentivize access and use of cohorts by the broader scientific community.

    Opening access to cohorts by investigators outside of the primary center can be challenging and has to be handled correctly so as not to be deleterious to the cohort. Also, it is essential to be mindful of the time and effort commitment of individuals directly responsible for establishing and maintaining the cohort. Thus, any involvement of outside investigators needs to be within a collaborative structure, where the cohort investigators are part of the research team for any ancillary studies.

    Each cohort, whether existing or proposed, should be evaluated with regard to research potential for various aspects of the cancer discovery to practice continuum (e.g., etiology, pathophysiological mechanisms of malignant initiation/progression, risk prediction, screening/early detection, health promotion, survivorship, intervention trials, dissemination research, etc.). This evaluation should be performed considering the rationale for establishing the cohort, with a view toward issues such as: (1) how expansion of the research scope may potentially impact the ability to investigate the primary aims for establishing the cohort, (2) potential conflicts with coordination/prioritization of an expanded research portfolio, (3) feasibility, and (4) cost implications and resource utilization.

    Expanded use of cohorts could be achieved through a variety of approaches including access to existing data, access to cohort members, and/or access to biological samples. Based upon attributes of the cohort, the principal investigator and research team could direct the topics and scope of expanded access through issuance of requests for proposals, to target specific aspects of the continuum that are viewed as high priority, with potential high impact, and currently not being addressed to the extent the cohort could support. Additionally, approaches to broaden the impact of cohorts may include requiring that some level of public access to data be incorporated into the funding plan. This can take the form of public access to selected data sets or data tables (similar to those made available through SEER, NHANES, NHIS, dbGaP, etc.).

    While many academic institutions with existing cohorts have integrated trainees (pre- and post-doctoral) and junior faculty into their research efforts, it is likely that many cohorts could support a greater level of involvement. To maximize use of cohorts as a vehicle for career development, funding mechanisms should be established to provide cohorts with supplemental developmental funds to specifically target career development of promising investigators outside of the cohort institution.

    Any expanded use of cohorts by the scientific community will require additional investment of resources. Thus, transforming cohorts to a "research resource model" cannot and should not be implemented through unfunded mandates.

  • Lyle Palmer - November 29, 2012 at 6:03 PM (UTC -4)

    The value of large, prospective study designs and their concomitant statistical power to quantify the combined effects of the environment, lifestyle and genes has been extensively articulated. Current large, prospective observational studies such as UK Biobank, Kadoorie, and EPIC are designed to allow the reliable assessment of different causes of chronic disease over a long period of time and with detailed follow-up of cause-specific morbidity and mortality. These studies are also designed to enable the assessment of a wide range of exposures to a wide range of health-related outcomes. In parallel with the growing availability of large and well-characterized cohort resources, the -omics revolutions are transforming epidemiology, medicine and drug discovery. In particular, human genomics has had enormous and unprecedented success, through using population-based studies, in discovering and validating new genes for common disease susceptibility, natural history and treatment response. Identification of specific etiological factors (e.g., genes) affecting susceptibility, progression and response to therapy for such diseases is likely to allow fundamental insights into disease biology, which will in turn help to better define interventional, therapeutic and health promotion strategies. The vital question now is: "how do we cost-effectively translate new knowledge and technologies into improved health outcomes?" Increasingly, it has become clear that there is a critical need for comprehensive, blood-based, prospective resources.

    Key developments necessary to enable large cohort studies as vehicles for translation include: constructing large, representative, population-based resources; a life-course approach to epidemiology and a concomitant focus on participant retention and follow-up; ensuring blood and DNA are available on all or most participants; enabling dynamic, online, bi-directional platforms for participant engagement; linking participant study data with objective health outcome data from administrative health records; and constructing the necessary networks across academia, industry and government to ensure effective translation of evidence into policy and clinical practice. To single out one exciting development, a number of major cohorts have now implemented online solutions for data acquisition and for transmitting information to participants. The online nature of such studies provides the basis for ongoing, cost-effective and timely follow-up, bi-directional communication, interventions and close engagement with participants. There are profound new opportunities to engage individuals and communities in their own health. Recognition by funding bodies of the need for secure and long-term funding of longitudinal cohorts will also be critical. Further key enabling developments in collaborative models, data harmonization (prospective and retrospective), and health economics (providing an evidence base for social good and cost savings) will also be important.

  • Julie Buring - December 10, 2012 at 9:05 AM (UTC -4)

    The question of what we have learned from epidemiology cohorts and where we should be going next is a particularly timely one, as the NCI Cohort Consortium just held its annual symposium in October 2012, with the focus being exactly this question. One of the main tasks for the symposium was to take a hard look at ourselves, 12 years after we began. Formed by the NCI's intramural and extramural staff together with the cohort PIs, it currently includes 46 cohorts in more than 15 countries, with 4 million study participants, and 2 million DNA samples. What are our unique strengths and limitations to better understand the complex etiology of cancer? How can we extend the evaluations in our cohorts to prevention and treatment? What role can we play over the next decade to accomplish these goals and make the cohorts a cornerstone of our progress? What specific gaps in knowledge are most imperative for the cohorts to address? What obstacles do we need to overcome?

    In the next decade, there will only be an increasing need for the data from large, well-designed and conducted epidemiologic studies of long duration, as we increasingly turn our focus for cancer to more complex interactions of gene and environment, as well as to rarer outcomes. Consortia of cohort studies are in a unique position to provide those needed data. Their strengths include their necessarily large sample sizes; their multi-ethnic composition; their extensive collection of phenotypes, often with serial measurements on study participants that can address time-varying characteristics; and their large accompanying biobanks, that can provide or obtain extensive genetic information. A key strength of the cohort consortium activities is the strong, cross-disciplinary collaboration of diverse cohorts and approaches, allowing the cancer field to be pushed forward from discovery to practice.

    Where do we need to extend or expand the cohorts to meet the priorities of the next decade? With regard to cancer, the next questions to be explored will likely require the availability of tumor tissues and involve the ability to evaluate detailed molecular characteristics of subtypes of cancer. We will also need to extend beyond first events, to include an evaluation of recurrence, second cancers, and survivorship, as well as cancer treatment. It will also be important to extend to an evaluation of the lifecourse, including the challenge of inclusion of groups that have been underrepresented, such as children and adolescents. We also need to examine whether further methodology is needed to validate, improve, adapt or extend current assessments of exposures. Finally, as the next generation of genetic tools is developed, stored samples will be revisited so that relevant information can be extracted.

    More broadly, there is a compelling imperative that we move beyond examining only cancer endpoints, and focus on multiple disease endpoints within the cohort setting. This is a worthwhile and achievable goal. Being able to evaluate multiple outcomes for a single exposure is cost-effective and value-added. There is a commonality of major risk factors for multiple diseases. Moreover, many of the cohort studies were jointly funded as multi-purpose studies by multiple NIH Institutes, and as such, they were originally designed to evaluate multiple outcomes with the same methodologic rigor as the cancer endpoints. In fact, many members of our cancer consortium are also members of other non-cancer consortia. As a first step, the NCI cancer consortium has proposed that cohorts that have validated non-cancer outcomes such as cardiovascular disease take the lead and demonstrate the proof-of-principle that this extension of cancer consortium activities can be accomplished.

    There is no question there are obstacles to be overcome. For some structural and methodologic issues, we need help, and the NIH is in a unique position to be of assistance and accelerate the process. Most importantly, to achieve our goal of using the cohorts to address multiple outcomes, joint funding by multiple Institutes will need to be expedited, as well as non-disease specific funding mechanisms, and integrated management of the cohorts across the NIH. Support for the cohort studies to maintain their infrastructure long-term is critical to continued collaboration. Funding is also needed to evaluate and add important new methodologic technologies to assess exposure information, as well as to provide central administrative assistance in cross-cohort projects such as harmonization of data. The NIH can also serve as a liaison for cohorts to obtain the lowest cost opportunities for record linkage, such as with regard to databases at the Centers for Medicare & Medicaid Services, or as a driver to overcome hurdles in obtaining record linkage for younger individuals, or tracking outcomes in an easily accessible and cost-effective way. Some of this assistance has already begun to occur: the NCI, for example, has been critical to the Cohort Consortium activities, including facilitating the feasible coordination of the harmonization of data, working to restore access to assessment of mortality outcomes, and issuing an NCI Cancer Epidemiology Cohort Funding Opportunity Announcement for infrastructure support of the maintenance of the cohorts.

    As we think ahead to continuing the contributions of the cohorts over the long term, we will have to consider how to best nurture the pipeline of young investigators who will be working with the consortial activities over the years to come. To do so, the issue of their career development will need to be addressed. While promotion committees of academic institutions understand the scientific contributions of the consortia, they do not yet know how to recognize the contributions of an individual in the context of a necessarily team endeavor. The onus will be on our senior investigators to educate all of our promotion committees in this area.

    Finally, it has often been asked whether the use of synthetic cohorts is the best approach versus launching a new mega-cohort to address the many complex questions of the next decade. I believe it is important to realize that one does not preclude the other, and we can leverage the existing cohorts while developing any new ones. We don't need or want to wait. We may not have everything we need from all cohorts, but we have enough on most cohorts to establish a research portfolio that will continue to provide fundamental insights into key environmental, lifestyle, and genetic factors playing a role in the etiology of cancer and other diseases.
