Epidemiology and Genomics Research Program

Genomic Summary Results

What are genomic summary results (GSR)?

Genomic summary results (GSR), also referred to as “aggregate genomic data” or “genomic summary statistics,” are summary genomic data generated from primary analyses of genomic research across many individuals. GSR include data calculated from a study sample, such as genotype counts, allele frequencies, effect size estimates and standard errors, and p-values. Many research and clinical questions can be addressed using summary information without requiring individual-level data. Because sharing summary statistics is easier and more efficient than sharing individual-level genetic data, analytical methods that use summary statistics have proliferated.
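
As a small illustration of how one kind of GSR is derived from a study sample, the sketch below computes an allele frequency from genotype counts at a single biallelic variant. The class name, function name, and the counts themselves are hypothetical and chosen only for this example; they do not come from any NIH dataset or tool.

```python
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    """Number of individuals with each genotype at one biallelic variant (hypothetical container)."""
    hom_ref: int  # individuals carrying two reference alleles
    het: int      # individuals carrying one reference and one alternate allele
    hom_alt: int  # individuals carrying two alternate alleles


def alt_allele_frequency(counts: GenotypeCounts) -> float:
    """Return the alternate allele frequency implied by the genotype counts."""
    # Each individual contributes two alleles at an autosomal variant.
    total_alleles = 2 * (counts.hom_ref + counts.het + counts.hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Heterozygotes carry one alternate allele; alternate homozygotes carry two.
    return (counts.het + 2 * counts.hom_alt) / total_alleles


# Made-up counts for illustration only.
counts = GenotypeCounts(hom_ref=640, het=320, hom_alt=40)
freq = alt_allele_frequency(counts)
print(f"alternate allele frequency: {freq:.3f}")
```

Only the three counts and the resulting frequency would be shared as summary data; no individual's genotype appears in the output.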

The following references from NIH provide additional information about GSR:

NIH policy for sharing GSR

In 2016, the National Human Genome Research Institute (NHGRI) held a workshop to discuss the benefits and risks of sharing GSR. The workshop resulted in the November 2018 notice in the NIH Guide for Grants and Contracts, NOT-OD-19-023, which updated NIH’s management of GSR under the Genomic Data Sharing (GDS) Policy. Previously, all GSR were placed under controlled access in NIH-designated repositories; the update allows GSR from most datasets (i.e., those not considered “sensitive” due to privacy risks) to be made available through unrestricted access.

Where can I find cancer GSR data?

The following list includes databases and websites where cancer GSR data may be found and accessed:

If you have questions or know of other databases or websites with cancer GSR data that could be added to the list above, please contact Dr. Danielle Carrick by emailing Danielle.Carrick@nih.gov.

Where can I deposit GSR data?

  • When deciding on where to deposit your GSR data to make it available to the public, some questions you might ask are:
    • Is the GSR data sensitive (i.e., requiring controlled access due to potential privacy risks) and/or is it associated with a dataset already in dbGaP? If so, dbGaP is a good option. More information about privacy risks is available on NHGRI’s website.
    • Is the GSR data associated with a specific publication, or is it unpublished, non-sensitive data? If so, the GWAS Catalog could be a good option. Learn how to submit summary statistics in the GWAS Catalog.

If I get individual-level data from dbGaP and generate GSR, can I share the newly generated GSR data?

The answer depends on whether the individual-level dataset you requested is designated as “sensitive” or “non-sensitive,” which determines whether public posting of GSR is “restricted” or “allowed.” This information can be found on the dataset’s dbGaP public study page, in the “Authorized Access” section, which describes the terms of access for secondary users.

  • If the data you requested is non-sensitive, public posting of genomic summary results is “allowed.”
  • You can determine this by looking under the Authorized Access section of the dbGaP entry; if sharing the GSR data generated as a secondary user is allowed, it would say “Public Posting of Genomic Summary Results: Allowed.”

    For non-sensitive datasets, requester-generated GSR can be shared or posted. Requesters who wish to post GSR more broadly than publication within the scientific literature (where GSR serve as an intrinsic piece of evidence supporting a study’s conclusions) can indicate plans to generate and disseminate GSR in their research use statement, which may be approved by a Data Access Committee (DAC). Requesters do not need to indicate what specific GSR they plan to generate and disseminate.

  • If the data you requested is sensitive, public posting of genomic summary results is “restricted.”
  • You can determine this by looking under the Authorized Access section of the dbGaP entry; if sharing the GSR data generated as a secondary user is “restricted,” it would say “Public Posting of Genomic Summary Results: Not Allowed.”

    For datasets that are designated as sensitive, DACs will not approve research use statements that indicate plans to disseminate GSR more broadly than publication within the scientific literature to support a study’s conclusions.
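
The check described above can be sketched as a small helper. This is only an illustration under stated assumptions: it assumes you have copied the text of the Authorized Access section from the dbGaP study page into a string yourself, and the function name `public_posting_allowed` is hypothetical. It does not call any dbGaP service, and it is not a substitute for reading the study's terms of access.

```python
LABEL = "Public Posting of Genomic Summary Results:"


def public_posting_allowed(authorized_access_text: str) -> bool:
    """Return True if the pasted Authorized Access text says posting is allowed.

    Hypothetical helper: the input is text copied by hand from a dbGaP
    public study page; nothing here contacts dbGaP.
    """
    for line in authorized_access_text.splitlines():
        if LABEL in line:
            # Take whatever follows the label and normalize it.
            value = line.split(LABEL, 1)[1].strip().rstrip(".").lower()
            if value == "allowed":
                return True   # non-sensitive dataset
            if value == "not allowed":
                return False  # sensitive dataset
            raise ValueError(f"unrecognized value after label: {value!r}")
    raise ValueError("label not found in the pasted text")


# Example with made-up page text.
example_text = "Authorized Access\nPublic Posting of Genomic Summary Results: Not Allowed"
print(public_posting_allowed(example_text))
```

Raising an error when the label is missing or the value is unexpected, rather than returning a default, keeps an unclear case from being read as permission to post.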