Trends in 21st Century Epidemiology: From Scientific Discoveries to Population Health Impact

Geoffrey S. Ginsburg Presentation: Technology-Driven Epidemiology: A Paradigm Shift

Slide 1 of 37: Technology-Driven Epidemiology: A Paradigm Shift

[Images] of people and Duke University logos

Geoffrey S. Ginsburg M.D., Ph.D.
Director, Genomic Medicine,
Institute for Genome Sciences & Policy
Executive Director, Center for Personalized Medicine,
Duke University Health System


Slide 2 of 37: The Goals Are the Same

  • Discovery: to explain the etiology of diseases and health conditions
  • Development: to provide the basis for -
    • Clinical prevention and control measures for populations at risk
    • Public health measures and practices
  • Delivery: implementation and use of findings by -
    • Clinicians, public health practitioners
    • Public, policy makers, others

Source: Colditz & Winn JNCI 2008; 100:918-25.


Slide 3 of 37: "Factors of Risk"

[Image] of Annals of Internal Medicine journal cover.

Factors of Risk in the Development of Coronary Heart Disease-Six-Year Follow-up Experience

  • High blood pressure
  • Increased cholesterol
  • Smoking
  • Diabetes
  • Family history
  • Male sex

Source: Kannel WB et al. Ann Intern Med 1961;55:33-50.


Slide 4 of 37: Transitions in Biology Over 50 Years

Observational Science → Molecular Science → Genomic (Digital) Science

[Images] of cells, western blot, and heatmap.


Slide 5 of 37: Genomics: A Toolbox for High Dimensional Data

  • Human Genome Sequence: SNPs, CNVs; 10,000,000+
  • Gene Expression Profiles: ~25,000 gene transcripts; miRNA
  • Proteome: Specific protein products; ~100,000+
  • Metabolome: Small molecule metabolites; ~5,000

Source: Ginsburg GS et al. J Am Coll Cardiol 2005;46:1615-1627.


Slide 6 of 37: New Predictive Models of Disease and Outcomes

[Image] showing that signatures and models incorporate imaging, clinical data (treatments, family history, demographics, environmental), gene expression profiles, genome data (SNPs and CNVs, genome scale sequence), metabolomics and proteomic data, and other factors to predict risk, individualized prognosis and diagnosis, drug response, and environmental response.

Source: Ginsburg GS et al. J Am Coll Cardiol 2005;46:1615-1627.


Slide 7 of 37: Crowdsourcing: Data Sharing

[Image] of timeline from 2000 to 2011 with the following milestones:

  • Jan 2001: wikipedia.com
  • Feb 2001: HGP first results
  • Apr 2003: completion of HGP
  • Feb 2004: Facebook.com
  • 2004: PatientsLikeMe.com
  • Feb 2005: YouTube.com
  • Apr 2005: PGP approved for 1 person
  • Nov 2007: first companies offer genome scanning DTC
  • Apr 2008: PGP approved for 100,000 people

The timeline also indicates the emergence of GWAS in 2004 to 2005.


Slide 8 of 37

[Image] of a man holding a cell phone.

  • 6 billion cell phone accounts
  • 60% of people have one
  • Transforming health information delivery
  • Text messages
  • Developing nations
  • Cardiac health
  • HIV/AIDS
  • Diabetes

Source: Feder, 2010, Health Aff


Slide 9 of 37: Whole Genome Sequencing: A Big Opportunity

[Images] of computers getting progressively smaller over the years.

  • HGP 2001 (13 years) $2.7B
  • Jim Watson 2007 $1M
  • Complete Genomics 2009 $4,400
  • Ion Torrent 2012 $1,000 (1 day)
  • Nanopore 2012 USB sequencer
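
The price points listed on this slide can be turned into a fold reduction and an average halving time. The following is a minimal sketch of that arithmetic using only the figures above. The function names are illustrative, and the comparison against a cost that halves every two years is an assumption about how Moore's Law is commonly stated, not something taken from the slide.

```python
import math

# Cost per genome as listed on the slide: (year, US dollars).
COSTS = [
    (2001, 2.7e9),  # Human Genome Project
    (2007, 1.0e6),  # Jim Watson
    (2009, 4.4e3),  # Complete Genomics
    (2012, 1.0e3),  # Ion Torrent
]


def fold_reduction(start_cost, end_cost):
    """Return how many times cheaper end_cost is than start_cost."""
    return start_cost / end_cost


def halving_time_years(start, end):
    """Average number of years per halving of cost between two (year, cost) points."""
    (year0, cost0), (year1, cost1) = start, end
    halvings = math.log2(cost0 / cost1)
    return (year1 - year0) / halvings


if __name__ == "__main__":
    first, last = COSTS[0], COSTS[-1]
    fold = fold_reduction(first[1], last[1])
    print(f"Fold reduction {first[0]} to {last[0]}: {fold:,.0f}x")
    print(f"Average halving time: {halving_time_years(first, last):.2f} years")
    # Assumed reference curve: one halving every 2 years over the same span.
    reference_fold = 2 ** ((last[0] - first[0]) / 2)
    print(f"Fold reduction at one halving per 2 years: {reference_fold:,.0f}x")
```

Working on a log scale keeps the comparison readable even though the costs span more than six orders of magnitude.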

Slide 10 of 37: Moore's Law and Metcalfe's Law – Convergence and Opportunity; Obstacles to be Overcome

[Image] of a graph showing that the cost of genome sequencing has plummeted.

Tung JY et al. Efficient replication of over 180 genetic associations with self-reported medical data. PLoS One. 2011; 6(8): e23473. doi: 10.1371/journal.pone.0023473. Epub 2011 Aug 17.


Slide 11 of 37: Health Systems as Research Systems

[Image] showing graphic for the eMERGE Network (electronic medical records and genomics), a consortium of biorepositories linked to electronic medical records data for conducting genomic studies.

  • Opportunity to use high quality biospecimens and genomic data linked to electronic medical records for discovery of genomic variants
  • Develop the methodology, standards, and policy frameworks for large-scale studies using EMR-defined phenotypes and outcomes
  • Incorporation of genomic information into EMRs for clinical decision making and outcomes research
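
An "EMR-defined phenotype" is typically a rule applied to coded record data that sorts patients into cases, controls, and an excluded middle. The sketch below shows the general shape of such a rule. It is not an eMERGE algorithm: the record schema, the medication list, and the visit threshold are all hypothetical values chosen for illustration; only the use of the ICD-9 493 family for asthma reflects a real coding convention.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Toy stand-in for one patient's EMR extract (hypothetical schema)."""
    patient_id: str
    diagnosis_codes: list = field(default_factory=list)  # one code per coded visit
    medications: list = field(default_factory=list)


# Hypothetical rule parameters, for illustration only.
CASE_CODE_PREFIX = "493"  # ICD-9 code family for asthma
CASE_MEDICATIONS = {"albuterol", "fluticasone"}
MIN_CODED_VISITS = 2


def classify(record):
    """Return 'case', 'control', or 'excluded' under the toy rules above."""
    coded_visits = sum(
        code.startswith(CASE_CODE_PREFIX) for code in record.diagnosis_codes
    )
    on_case_medication = any(
        med.lower() in CASE_MEDICATIONS for med in record.medications
    )
    if coded_visits >= MIN_CODED_VISITS and on_case_medication:
        return "case"
    if coded_visits == 0 and not on_case_medication:
        return "control"
    # Partial evidence is kept out of both groups to protect specificity.
    return "excluded"


if __name__ == "__main__":
    example = PatientRecord("p1", ["493.90", "493.00"], ["Albuterol"])
    print(example.patient_id, classify(example))
```

Requiring agreement between two independent kinds of evidence, and excluding ambiguous records rather than forcing them into a group, is one common way to trade sample size for specificity.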

Slide 12 of 37

[Images] from scientific journals for the following publications:

Kho et al. Electronic medical records for genetic research: results of the eMERGE consortium. Sci Transl Med. 2011 Apr 20; 3(79): 79re1. doi: 10.1126/scitranslmed.3001807.

Liao KP et al. Electronic medical records for discovery research in rheumatoid arthritis. Arthritis Care Res. 2010 Aug; 62(8): 1120-7.

Pacheco JA et al. A highly specific algorithm for identifying asthma cases and controls for genome-wide association studies. AMIA Annu Symp Proc. 2009; 2009: 497-501.

Perlis RH et al. Using electronic medical records to enable large-scale studies in psychiatry: treatment resistant depression as a model. Psychol Med. 2012 Jan; 42(1): 41-50.


Slide 13 of 37

Wild CP. Complementing the genome with an "exposome": the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol Biomarkers Prev. 2005; 14: 1847-50.

  • "There is a desperate need to develop methods with the same precision for an individual's environmental exposure as we have for the individual's genome. I would like to suggest that there is need for an "exposome" to match the "genome."
  • It is pertinent to ask whether the new "omics" technologies of transcriptomics, proteomics, and metabonomics can help unlock the problem of environmental exposure assessment.
  • Advances will require increasing collaboration between epidemiologists, biostatisticians, experts in bioinformatics, and laboratory and environmental scientists. In addition, funding agencies must take a medium- to long-term view and encourage research that focuses on improved measures of environmental risk factors."

Slide 14 of 37: Exposures, the Exposome, and Exposomics: The Human As A Sensor

[Image] showing elements of the external environment (radiation, stress, lifestyle, infections, drugs, diet, pollution) influencing the internal chemical environment (xenobiotics, inflammation, preexisting disease, lipid peroxidation, oxidative stress, gut flora), which may in turn be phenotypically expressed via the exposome (reactive electrophiles, metals, endocrine disruptors, immune modulators, receptor-binding proteins). Gene, protein, and metabolite expression profiling, along with phenotypic data, can be used in multi-dimensional predictive models of exposure, health, and disease.


Slide 15 of 37: Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes (N=1)

[Image] illustrating how the study of a single individual using a wide variety of genomic measures over a period of 14 months can capture a variety of exposures (e.g. infectious diseases), and the unmasking of something his genome predicted (e.g. a latent tendency to diabetes).


Slide 16 of 37: Personal Microbiomics Profiling Dynamic Changes in Gut Flora (N=2)

[Images] from personal communication with David and Alm showing personal microbiomics profiling of gut microbiome over a period of a year for two individuals. One individual traveling abroad had a noticeable acute change in their gut flora and then returned to baseline while another individual experienced a Salmonella exposure with a very disruptive change in their microbial communities for quite a long time.


Slide 17 of 37: Lead Exposure and Radiation Exposures Blood RNA Profiles in Murine Models

[Images] from publications that use model systems to harness blood-based transcriptional profiling to quantify toxicities from our environment. On the left side of the slide is a mouse study showing two distinct blood gene expression biomarker panels for low- and high-dose lead exposure over a period of several weeks. On the right side of the slide are images from a study, also done in mice, in which exposure to Cesium 137 ionizing radiation elicited dose-dependent exposure signatures from the animals.


Slide 18 of 37

[Image] from a publication about a cohort being followed for cardiovascular disease demonstrating that never or former smokers have a very distinct gene expression pattern measured in blood compared to recent and current smokers.

Beineke P et al. A whole blood gene expression-based signature for smoking status. BMC Medical Genomics. 2012. 5: 58. doi:10.1186/1755-8794-5-58.


Slide 19 of 37: A Molecular Classifier of Aerobic-Training Adaptation

[Image] showing that a physiologic challenge using exercise elicits a response in a blood-based RNA profiling measure that is associated with VO2 max.


Slide 20 of 37: A General Model for Developing Molecular Classifiers of Exposure

[Image] showing paradigm for experimental settings where physiologic/phenotypic/PRO measures and molecular profiling are taken from healthy volunteers before they receive an exposure and then repeating these measures at various intervals post-exposure.

  • Clinico-molecular models
  • Prediction of response
  • Classification of exposure
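
The before-and-after design on this slide lends itself to classifiers built on each subject's change from baseline. The sketch below is one simple way to do that, assuming a nearest-centroid rule on the post-minus-pre difference. It is not the presenter's method; the function names and the two-feature expression values are invented purely for illustration.

```python
import math


def delta(pre, post):
    """Change from baseline for each measured feature."""
    return [after - before for before, after in zip(pre, post)]


def centroid(vectors):
    """Feature-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def train(samples):
    """samples: list of (pre, post, label). Returns {label: centroid of deltas}."""
    deltas_by_label = {}
    for pre, post, label in samples:
        deltas_by_label.setdefault(label, []).append(delta(pre, post))
    return {label: centroid(vs) for label, vs in deltas_by_label.items()}


def classify(model, pre, post):
    """Assign the label whose centroid is closest to this subject's delta."""
    change = delta(pre, post)
    return min(model, key=lambda label: math.dist(change, model[label]))


# Synthetic two-feature measurements (pre, post, label), invented for illustration.
TRAINING = [
    ([1.0, 2.0], [3.1, 2.1], "exposed"),
    ([1.2, 1.9], [3.0, 2.0], "exposed"),
    ([0.9, 2.1], [1.0, 2.0], "unexposed"),
    ([1.1, 2.0], [1.0, 2.2], "unexposed"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(classify(model, [1.0, 2.0], [2.8, 2.1]))
```

Using each volunteer as their own control removes stable between-person differences, which is the main advantage of collecting the pre-exposure baseline.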

Slide 21 of 37: Pathogen Exposures in a Human Exposure Model: Viral Challenge Studies (Rhino, RSV, H3N2, H1N1)

[Image] showing blood RNA expression profiling, blood/urine/saliva/breath proteomics and metabolomics profiling being done for healthy volunteers before receiving a standard viral challenge and then repeated profiling being done during a 5 day observation period waiting to determine which volunteers develop disease.


Slide 22 of 37: A Blood RNA Model for Influenza

[Image] of a graph from an RNA profiling experiment from an influenza challenge study. There are three groups of data on the graph: one became symptomatic, another group remained healthy. Graph also indicates when symptoms first appeared and the peak in symptoms.

Source: Chen et al. BMC Bioinformatics. 2010, 11:552.


Slide 23 of 37: H1N1 Influenza: Temporal Analysis Molecular vs. Clinical Detection

[Image] of graph from a study using an RT-PCR platform showing molecular detection (e.g. global gene expression) precedes peak of clinical symptoms by about 50 hours.

Source: Woods et al, PLoS ONE, in press


Slide 24 of 37: Host Blood RNA Profiles Can Classify Viral vs. Bacterial Infection

[Image] showing gene expression profile from whole blood that can clearly distinguish between infection of viral etiology from bacterial infections.

Source: Zaas et al, Cell Host and Microbe, 2009


Slide 25 of 37: Generating a Comprehensive Catalog of Molecular Classifiers of Exposure

[Image] showing study design using multiple healthy volunteers to study exposure using repeat monitoring.


Slide 26 of 37: Sensors, Sensors (and Phenotypes) Everywhere

[Image] showing "The Borg" and "Modern Man," both wearing various sensors.

Lots of data being generated.


Slide 27 of 37

[Images] showing a Nike sensor in a sneaker, a Fitbit monitor on a jeans pocket, a Philips monitor on a necklace, and a bracelet monitor to measure movement.


Slide 28 of 37

[Image] showing a smartphone device displaying electrocardiogram results.


Slide 29 of 37: Gaming, Cognitive Function, Performance

[Image] showing screen shot of Angry Birds game.


Slide 30 of 37

[Images] of the Sensimed Triggerfish logo and three pictures of a contact lens with an embedded chip. The first states "a breakthrough solution to continuously monitor fluctuations of intraocular pressure"; the second points to details on the lens for 1) active strain gage, 2) telemetry chip, and 3) loop antenna. The third states "personalized IOP monitoring."


Slide 31 of 37: Paper-Based Sensors

[Images] of a penny next to a small square paper-based sensor and a woman blowing her nose.

http://gmwgroup.harvard.edu/research


Slide 32 of 37: A New Era of Data and Connectivity

"We're all aware of the approximately two billion people now on the Internet [and] also upward of a trillion interconnected and intelligent objects and organisms - what some call the Internet of Things. All this is generating vast stores of information... reaching 35 zettabytes in 2020."
- Samuel Palmisano, Chairman, IBM


Slide 33 of 37: Incorporation of Population Data into Individual Models: Google Flu Trends

[Images] from Google showing flu trends around the world and flu search activity from 2004 to 9/23/12. http://www.google.org/flutrends


Slide 34 of 37: Twitter Data Models and Influenza: 2010

[Images] showing data that has been mined from Twitter feeds, searching for individual comments on flu-like symptoms and comparing that to influenza-like illness patterns in the community to build multi-dimensional models.

Source: Culotta, KDD Workshop on Social Media Analytics, 2010.
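
Slides 33 and 34 both relate an online signal (search queries or tweets) to influenza-like illness (ILI) rates. A minimal sketch of that kind of model follows, assuming a single predictor and a straight-line fit between the logit of the flu-related message fraction and the logit of the ILI rate. The logit-linear form, the function names, and the weekly numbers are assumptions for illustration and are not taken from the slides.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map a real number back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_flu_model(message_fractions, ili_rates):
    """Fit logit(ILI rate) against logit(fraction of flu-related messages)."""
    xs = [logit(q) for q in message_fractions]
    ys = [logit(p) for p in ili_rates]
    return fit_line(xs, ys)


def predict_ili(model, message_fraction):
    """Predict an ILI rate from a new message fraction."""
    intercept, slope = model
    return inv_logit(intercept + slope * logit(message_fraction))


if __name__ == "__main__":
    # Synthetic weekly values, invented for illustration only.
    fractions = [0.0010, 0.0015, 0.0030, 0.0050, 0.0040]
    ili = [0.010, 0.014, 0.025, 0.040, 0.033]
    model = fit_flu_model(fractions, ili)
    print(f"Predicted ILI rate: {predict_ili(model, 0.0035):.4f}")
```

Fitting on the logit scale keeps predictions between 0 and 1, which a straight line on raw proportions would not guarantee.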


Slide 35 of 37: 21st Century Epidemiology: Integrated Data

[Image] showing several types of data (geo-spatial, social network, sensory, electronic health records, exposure, omics and imaging data) being integrated into a repository to be used for discovery and modeling simulation.


Slide 36 of 37: What Might Medicine Look Like if We Are Successful?

[Video] from http://gizmodo.com/5965143/holy-spock-the-star-trek-medical-tricorder-is-real-and-its-only-150


Slide 37 of 37: Some Key Questions/Opportunities

  • What data? Is it reliable?
  • What data should be collected? When? How?
  • Systematizing exposome data
  • Interoperability and standards
  • Developing the mathematical models
  • Validating the findings
  • Establishing utility
  • Communicating the results
