Novel Approaches and Challenges to Data Harmonization: Maximizing the Use of Multi-level Data in Collaborative Studies

The Novel Approaches and Challenges to Data Harmonization Workshop, sponsored by the Epidemiology and Genomics Research Program (EGRP), will be held on October 6-7, 2014, at the Neuroscience Center Building in Rockville, Maryland.


The breadth and scope of studies utilizing an epidemiologic framework have grown significantly during the last few decades. This trend is mainly due to the increasing complexity of the research questions being asked and the consequent need to coalesce data across many studies. In fact, the case has been made that "integrative epidemiology" (the integration and analysis of heterogeneous and multi-layered data sets) may be the key to advancing the practice of epidemiology in the twenty-first century.

The current climate of scarce resources necessitates assembling these expansive and heterogeneous data sets through extensive multi-disciplinary collaboration or consortia. Throughout the last decade, consortia-based research has grown exponentially, has contributed to a better understanding of the complex etiology of cancer, and has provided fundamental insights into key environmental, lifestyle, clinical, and genetic determinants of these diseases and their outcomes. Using existing epidemiology data sets and complementing them with newly acquired genomic, clinical and other types of data can empower researchers to address important health issues more expeditiously and in a more cost-effective manner than through the creation of new large research infrastructures. Data harmonization, or the process of assessing compatibility of data accrued from independent sources, is an essential step to support these integrative analyses.

The goals of the Novel Approaches and Challenges to Data Harmonization Workshop are:

  1. To explore the theory and practice of multi-level data harmonization;
  2. To consider available tools and new approaches;
  3. To review representative case-studies; and
  4. To develop recommendations for best practices.

Neuroscience Center Building, Conference Room A1-A2
6001 Executive Blvd, Rockville, Maryland

Day 1 - Monday, October 6, 2014

Time Topic
9:00 a.m. - 10:20 a.m.

Consortia in Epidemiology and the Need for Data Harmonization
Daniela Seminara, Ph.D., M.P.H.
Senior Scientist and Consortia Scientific Coordinator
Epidemiology and Genomics Research Program (EGRP), Division of Cancer Control and Population Sciences (DCCPS), National Cancer Institute (NCI)

The Harmonization Process: The Maelstrom Research Experience
Isabel Fortier, Ph.D.
Research Institute of the McGill University Health Centre

Data Harmonization for Signature Projects in the NCI Cohort Consortium
Michelle Brotzman, M.P.H.
Study Manager

Discussion - 20 min.
10:20 a.m. - 10:40 a.m. Break
Session I: Epidemiologic Risk Factor Data Harmonization
Moderator: Sara Olson, Ph.D., Memorial Sloan Kettering Cancer Center
10:40 a.m. - 11:15 a.m.

The Data Coordinating Center: Challenges of Data Harmonization
Susan Slager, Ph.D.
Professor of Biostatistics
Mayo Clinic

Combining Variables from Case-Control and Cohort Studies: Experience from E2C2
V. Wendy Setiawan, Ph.D.
Assistant Professor of Preventive Medicine
University of Southern California Keck School of Medicine

Data Harmonization Across Large Consortia: Analytic Challenges
Donna Spiegelman, Sc.D.
Professor of Epidemiologic Methods
Harvard University School of Public Health

Discussion - 30 min.
12:10 p.m. - 1:25 p.m. Lunch
Session II: Clinical and Outcome Data Harmonization
Moderators: Jonine Bernstein, Ph.D., Memorial Sloan Kettering Cancer Center and Lindsay Morton, Ph.D., Division of Cancer Epidemiology and Genetics (DCEG), NCI
1:25 p.m. - 2:55 p.m.

Harmonizing Tumor Subtypes
Peggy Porter, M.D.
Member, Human Biology Division and Public Health Sciences Division
Fred Hutchinson Cancer Research Center

Harmonizing Treatment Data from Cancer Survivor Studies
Lawrence Kushi, Sc.D.
Director of Scientific Policy
Kaiser Permanente Northern California Division of Research

Leveraging Existing Studies to Pursue an Array of Research Questions
Lindsay Morton, Ph.D.
Investigator, Radiation Epidemiology Branch

Discussion - 30 min.
2:55 p.m. - 3:15 p.m. Break - 20 min.
Session III: Biomarkers and Conduct of Data Harmonization
Moderators: Ulrike Peters, Ph.D., Fred Hutchinson Cancer Research Center and Gabriel Lai, Ph.D., EGRP, DCCPS, NCI
3:15 p.m. - 4:45 p.m.

Combining Biomarkers Other Than Genotypes
Anne Zeleniuch-Jacquotte, M.D.
Professor, Departments of Population Health and Environmental Medicine
NYU Langone Medical Center

Facilitating Large Scale Harmonization for Many Variables
Leslie Lange, Ph.D.
Associate Professor of Genetics
University of North Carolina School of Medicine

Harmonizing Genomic Data
Christopher Amos, Ph.D.
Associate Director for Population Sciences, Norris Cotton Cancer Center
Professor, Geisel School of Medicine at Dartmouth

Discussion - 30 min.
4:45 p.m. End of Session III and Day 1

Day 2 - Tuesday, October 7, 2014

Time Topic
Session IV: Analytic Issues
Moderator: Daniela Seminara, Ph.D., M.P.H., EGRP, DCCPS, NCI
9:00 a.m. - 10:00 a.m.

GxE Harmonization
Carolyn Hutter, Ph.D.
Program Director, Division of Genomic Medicine
National Human Genome Research Institute (NHGRI)

Meta-Analysis: Mega-Analysis and Pooling
Ken Rice, Ph.D.
Associate Professor
University of Washington

Outliers and Approaches in the Presence of Biological Heterogeneity
Nilanjan Chatterjee, Ph.D.
Biostatistics Branch Chief and Senior Investigator

Discussion - 30 min.
10:30 a.m. - 10:50 a.m. Break - 20 min.
Breakout Groups
10:50 a.m. - 12:15 p.m. Harmonization with an Eye to the Future
  • Epidemiology Risk Factors
  • Clinical/Outcome
  • Biomarkers
  • Analysis
12:15 p.m. - 12:40 p.m. Lunch
12:40 p.m. - 1:25 p.m. Keynote Speaker: Multi-Level Data Integration
John Ioannidis, M.D.
Stanford University School of Medicine
1:25 p.m. - 2:40 p.m. Breakout Groups Reports and Recommendations
Moderator: Jonine Bernstein, Ph.D., Memorial Sloan Kettering Cancer Center
2:40 p.m. - 2:55 p.m.

Concluding Remarks and Charge to the Group
Sara Olson, Ph.D.
Associate Attending Epidemiologist
Memorial Sloan Kettering Cancer Center

2:55 p.m. Meeting Adjourned

Note: A link to the recorded webinar will be added to this site approximately one month after the workshop.

Planning Committee

  • Sara Olson, Ph.D., Memorial Sloan Kettering Cancer Center
  • Jonine Bernstein, Ph.D., Memorial Sloan Kettering Cancer Center
  • Ulrike Peters, Ph.D., University of Washington School of Public Health, Department of Epidemiology
  • Carolyn Hutter, Ph.D., M.S., National Human Genome Research Institute, Division of Genomic Medicine
  • Daniela Seminara, Ph.D., M.P.H., National Cancer Institute (NCI), Division of Cancer Control and Population Sciences, EGRP
  • Lindsay Morton, Ph.D., NCI, Division of Cancer Epidemiology and Genetics
  • Gabriel Lai, Ph.D., NCI, DCCPS, EGRP
