Epidemiology and Genomics Research Program

Frequently Asked Questions

Calculating HEI Scores

What information do I need to calculate an HEI score from a set of foods?

Scoring a set of foods requires that the foods be mapped to food groups and certain nutrients (such as sodium and fatty acids). NCI-developed tools accomplish this using the Food and Nutrient Database for Dietary Studies (FNDDS) and the Food Patterns Equivalents Database (FPED). These databases provide nutrient amounts in foods (FNDDS) and disaggregate foods into the food group components (FPED) that are used in calculating HEI component scores. Information about which FPED and FNDDS constituents are used to calculate each HEI component can be found in the table Dietary Constituents for HEI-2020.

Examples of diet assessment tools NCI has created include the Automated Self-Administered 24-hour Dietary Assessment Tool (ASA24, which collects 24-hour recalls or food records) and the Diet History Questionnaire (DHQ, a food frequency questionnaire).

A set of foods can theoretically be mapped to the FNDDS and FPED by hand, for example, if you have collected information about a person’s diet using a food frequency questionnaire whose output is not automatically linked to the food codes in FNDDS and FPED. Another option is to analyze the set of foods you are studying with a program that is already linked to these or similar databases, such as ASA24, DHQ, or the fee-based Nutrition Data System for Research (NDSR), available through the University of Minnesota.

How do I know which method to use to calculate the HEI score of my data?

There are a variety of methods you can use to calculate an HEI score depending on the research question and the data available. See the Overview of the Methods and Calculations webpage for more information about the methods that can be used to calculate HEI scores based on the research question.

How do the methods proposed for analysis of the HEI adjust for measurement error?

The recommended approaches for estimating means and distributions of scores for a population, subpopulation, or group are intended to minimize the effects of measurement error in dietary intake data such that the results better reflect HEI scores for usual intake.

The methods proposed for analysis of usual intake based on 2 days of intake only adjust for random error. Little is known about the impact of systematic error on the results of analyses that make use of the HEI. Biomarker-based validation studies focusing on energy and protein have shown that the observed effects of diet on health are biased (typically toward the null, or attenuated) when diet is measured with error. Energy adjustment appears to lessen, though not eliminate, this problem. The extent to which these findings apply to models using the HEI is not yet fully understood. Until more is known about the effects of measurement error on analyses using HEI total or component scores as exposures in regression models, researchers should consider the potential for bias due to error in the interpretation of their results. For additional information on measurement error in dietary assessment, please consult the NCI Dietary Assessment Primer.

Is there an HEI tool or instrument needed to calculate an HEI score?

There is no data collection tool or questionnaire specific to the HEI because it can be used to score any set of foods, such as a population’s diet, a shopping basket, a menu, or an individual’s diet.

To score dietary intakes, a researcher or clinician should use a food record, or ideally at least several 24-hour recalls, or a food frequency questionnaire, to determine the variables needed to calculate the HEI. NCI has developed versions of these tools that could be used to get the food group output needed to calculate the HEI, including the Automated Self-Administered 24-hour Dietary Assessment Tool (ASA24, which collects 24-hour recalls or food records) and the Diet History Questionnaire (DHQ, a food frequency questionnaire).

Researchers and clinicians are encouraged to become familiar with the pros and cons of the different types of tools available to collect data for their research question and population. More information can be found in the Dietary Assessment Primer.

Once dietary intake variables have been collected with a diet assessment tool, methods for calculating the HEI and SAS Code to help calculate scores are available on the HEI website.

Can I calculate HEI scores using Nutrition Data System for Research (NDSR) or Automated Self-Administered 24-Hour Recall System (ASA24)?

Yes. Code for calculating HEI scores using ASA24 output can be found on the HEI SAS Code page. Some commercial tools, such as NDSR, provide information and code to calculate HEI scores for a fee.

How do I figure out food group amounts? How do I disaggregate foods into food groups?

To calculate the HEI, a researcher must know the amounts of several dietary components within the foods being assessed. For example, it is necessary to know information such as amount of whole fruit and amount of monounsaturated fats in the set of foods on which the HEI is being calculated. To determine this information, the food must be disaggregated into its constituent parts. See Step 2, Determine the amount of each dietary constituent in the set of foods for more details on what it means to disaggregate foods into dietary constituents.

If you are planning a study, how you will disaggregate your foods should be a consideration when deciding what type of tool to use for dietary assessment. If you are collecting food records or recalls, tools such as the freely available Automated Self-Administered 24-hour (ASA24) Dietary Assessment Tool or other commercially available tools provide data output that includes the disaggregated food variables needed to calculate the HEI from your participants' dietary intake information. Data collected using a food frequency questionnaire (FFQ), with a tool such as the Diet History Questionnaire (DHQ), will provide similar information in the data output files.

What is the Food Patterns Equivalents Database (FPED) and when do I need to use it?

The Food Patterns Equivalents Database (FPED) is a database developed by the U.S. Department of Agriculture (USDA) that determines amounts of dietary constituents in foods. The database breaks down foods as eaten into 37 different components such as whole grains, dairy, and solid fats. FPED is described further in this fact sheet and on the FPED overview website.

FPED components include all of the variables necessary for calculating the HEI. Information about which FPED and FNDDS constituents are used to calculate each HEI component can be found in the table Dietary Constituents for HEI-2020.

For researchers using the National Health and Nutrition Examination Survey (NHANES) data to calculate HEI scores, it may be necessary to use multiple versions of FPED (or MPED, the MyPyramid Equivalents Database that preceded FPED) depending on which cycles of NHANES are being used. See the FAQ on which information is needed for NHANES analyses to find which versions of FPED or MPED correspond to each cycle of NHANES. FPED and MPED databases can be accessed on the USDA's website.

What information do I need when using National Health and Nutrition Examination Survey (NHANES) data to calculate HEI scores?

When using NHANES data, a variety of datasets, databases, and new variables may be needed to successfully calculate HEI scores. The table below contains details on the datasets and files needed to calculate HEI scores with NHANES datasets. We provide examples of HEI analyses using NHANES data on the HEI SAS code page. We recommend reviewing the Read Me file included with any SAS code downloaded from this website for the purpose of calculating the HEI. When performing HEI analyses with NHANES data, ensure that your code reads in only reliable recalls and reads in the relevant demographic dataset (for example, consider which variables to keep based on your analysis).
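As a rough illustration of reading in only reliable recalls, the Python sketch below filters a list of recall records by a status variable. It assumes the NHANES day 1 dietary recall status variable DR1DRSTZ, where a value of 1 indicates a reliable recall that met the minimum criteria; confirm the variable name and coding in the documentation for your NHANES cycle. The function name and the sample records are hypothetical, and this is not the NCI SAS code.

```python
def keep_reliable_recalls(records, status_var="DR1DRSTZ", reliable_code=1):
    """Return only the records whose recall status equals reliable_code.

    Assumption: each record is a dict holding the recall status under
    status_var (DR1DRSTZ for day 1 in NHANES), with 1 meaning reliable.
    Records missing the status variable are dropped.
    """
    return [r for r in records if r.get(status_var) == reliable_code]


# Hypothetical records; SEQN is the NHANES respondent sequence number.
sample = [
    {"SEQN": 1, "DR1DRSTZ": 1},
    {"SEQN": 2, "DR1DRSTZ": 2},
    {"SEQN": 3},
]
reliable = keep_reliable_recalls(sample)
```

For a second recall day, the same function could be called with a different status_var.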

See the footnotes below the table for information about steps needed to resolve issues noted for various NHANES cycles. Example code performing many of these extra steps, such as reading in the CNPP Whole Fruit and Fruit Juice Database and adjusting soy beverages, legumes, and pizza code values, can be found in many of the zip files on the HEI SAS code page (see, for example, NHANES-2003-2004-MPED-Population Ratio HEI-2010 [ZIP - 53.1 KB]).

Dataset (Year) | Database for guidance-based food groups | Other databases needed | Other variable creation issues | Other database coding issues
NHANES (2015-16) | FPED 2015-2016 | - | - | -
NHANES (2013-14) | FPED 2013-2014 | - | - | -
NHANES (2011-12) | FPED 2011-2012 | - | - | -
NHANES (2009-10) | FPED 2009-2010 | - | - | -
NHANES (2007-08) | FPED 2007-2008 | - | - | -
NHANES (2005-06) | FPED 2005-2006 | - | - | -
NHANES (2003-04) | MPED 2.0 | CNPP Whole Fruit and Fruit Juice Database¹ | Adjust soy beverages and units for legumes²,³ | Adjust nutrient values on pizza food codes⁴
NHANES (2001-02) | MPED 1.0 | CNPP Whole Fruit and Fruit Juice Database¹ | Adjust soy beverages and units for legumes²,³ | -
NHANES (1999-2000) | MPED 1.0 | CNPP Whole Fruit and Fruit Juice Database¹ | Adjust soy beverages and units for legumes²,³ | -

HEI version-specific details (applies to all rows): Differences between HEI-2005, HEI-2010 and HEI-2015 should be considered and include changes to the:

  • allocation of legumes in 2015
  • scoring of SoFAAS, Empty Calories, and Saturated Fat/Added Sugars
  • number and type of included components
View details about changes across versions.

1 An additional step is needed to separate whole fruit and fruit juice to calculate the HEI for these years. Datasets available from USDA's Center for Nutrition Policy and Promotion (CNPP) MyPyramid Equivalents Databases for Whole Fruit and Fruit Juice can be merged with the NHANES Individual Food Files by food code to properly allocate foods that contain some amount of fruit into whole fruit or fruit juice for the creation of the Whole Fruit HEI component. For the 1999/2000 NHANES cycle, there are 14 food codes (that appear in the Individual Food File dataset 19 times) that contain some amount of fruit but do not exist in the 1999/2000 CNPP database. To determine the whole fruit amount for these foods, you can use the values from the 2001/2002 CNPP fruit database.
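The fallback described in this footnote can be sketched in Python as a two-step lookup. This is only an illustration under stated assumptions: the function name, the dictionary layout (food code mapped to a whole fruit value), and the food codes and values in the example are hypothetical placeholders, not entries from the CNPP databases.

```python
def whole_fruit_value(food_code, cnpp_1999_2000, cnpp_2001_2002):
    """Look up the whole fruit value for a food code.

    Uses the 1999/2000 CNPP table when the code is present and falls
    back to the 2001/2002 table otherwise. Returns None when the code
    is in neither table, so the caller can decide how to handle it.
    """
    if food_code in cnpp_1999_2000:
        return cnpp_1999_2000[food_code]
    return cnpp_2001_2002.get(food_code)


# Hypothetical tables: the codes and values are made up for illustration.
cnpp_9900 = {90000001: 0.50}
cnpp_0102 = {90000001: 0.55, 90000002: 0.25}

in_both = whole_fruit_value(90000001, cnpp_9900, cnpp_0102)
fallback = whole_fruit_value(90000002, cnpp_9900, cnpp_0102)
missing = whole_fruit_value(90000003, cnpp_9900, cnpp_0102)
```

Checking the 1999/2000 table first means the 2001/2002 values are used only for codes that are missing from it.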

2 The calculation of soy beverages affects the Dairy and Total Protein Foods components. Soy beverages are counted as part of the Dairy component of the HEI-2010. This differs from the MyPyramid Equivalents Database (MPED), which groups them with other Soybean Products (M_SOY). Soy beverages (food codes 11310000, 11320000, 11321000, and 11330000) are moved from Soybean Products (M_SOY), in ounce equivalents, to Total Milk (D_TOTAL), in cup equivalents, based on the weight in grams of 1 cup. Below are the conversion factors for the four affected food codes:

  • 11310000, MILK, IMITATION, FLUID, SOY BASED (1 cup=244 g)
  • 11320000, MILK, SOY, READY-TO-DRINK, NOT BABY (1 cup=245 g)
  • 11321000, MILK, SOY, READY-TO-DRINK, NOT BABY'S, CHOCOLATE (1 cup=240 g)
  • 11330000, MILK, SOY, DRY, RECONSTITUTED, NOT BABY (1 cup=245 g)
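The reallocation above can be sketched in Python as follows. This is a minimal illustration, not the NCI SAS code. The function name and arguments are hypothetical, and it assumes that all of the food's M_SOY amount comes from the soy beverage itself, so M_SOY is set to zero when the amount is moved.

```python
# Grams per cup for the four affected soy beverage food codes (from the list above).
SOY_BEVERAGE_GRAMS_PER_CUP = {
    11310000: 244,
    11320000: 245,
    11321000: 240,
    11330000: 245,
}


def reallocate_soy_beverage(food_code, grams, m_soy, d_total):
    """Return (m_soy, d_total) after moving a soy beverage into Total Milk.

    grams is the amount of the food consumed. For the four soy beverage
    codes, the cup equivalents (grams / grams per cup) are added to
    d_total and m_soy is zeroed (an assumption, see the lead-in).
    Other food codes are returned unchanged.
    """
    grams_per_cup = SOY_BEVERAGE_GRAMS_PER_CUP.get(food_code)
    if grams_per_cup is None:
        return m_soy, d_total
    return 0.0, d_total + grams / grams_per_cup


# 488 g of food code 11310000 is 488 / 244 = 2 cup equivalents.
adjusted = reallocate_soy_beverage(11310000, 488, m_soy=2.0, d_total=0.0)
```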

3 Legume amounts in the MPED are in cup equivalents; therefore, the cup equivalents are first converted to ounce equivalents of meat when they are counted for the Meat and Beans component, and are then converted back to cup equivalents when counted as vegetables. One-fourth cup of legumes is equal to 1-ounce equivalent of meat. Thus, the number of cup equivalents of legumes is multiplied by 4 to convert to ounce equivalents of meat.
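The legume unit conversion is simple arithmetic, shown here as a short Python sketch. The function and constant names are hypothetical; the factor of 4 comes from the statement that one-fourth cup of legumes equals 1 ounce equivalent of meat.

```python
# 1/4 cup of legumes = 1 ounce equivalent of meat, so 1 cup = 4 ounce equivalents.
OZ_EQ_PER_CUP_OF_LEGUMES = 4


def legume_cups_to_oz_eq(cup_eq):
    """Convert legume cup equivalents to ounce equivalents of meat."""
    return cup_eq * OZ_EQ_PER_CUP_OF_LEGUMES


def legume_oz_eq_to_cups(oz_eq):
    """Convert ounce equivalents of meat back to legume cup equivalents."""
    return oz_eq / OZ_EQ_PER_CUP_OF_LEGUMES


half_cup_as_oz = legume_cups_to_oz_eq(0.5)   # 0.5 * 4 = 2.0
two_oz_as_cups = legume_oz_eq_to_cups(2.0)   # 2.0 / 4 = 0.5
```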

4 In the MPED database related to NHANES 2003-04 only, errors have been identified in the nutrient and food group values for the three pizza food codes below. To correct these errors, the values for these codes can be updated to match those in FPED 2011-2012.

  • 58106210, PIZZA, CHEESE, NS AS TO TYPE OF CRUST
  • 58106220, PIZZA, CHEESE, THIN CRUST
  • 58106230, PIZZA, CHEESE, THICK CRUST

HEI Code

Is there SAS code available for FFQ data?

SAS code that can be applied to FFQ data to estimate total and component scores for each individual is currently available on the SAS Code page.

This code uses NIH-AARP Diet and Health Study data as an example and can be modified for use with other FFQs.

Where do I find the SAS code to calculate the HEI score?

See SAS Code for links to code that will help calculate HEI scores and perform other tasks, such as calculating distributions of scores of estimated usual intake. Questions about the sample SAS code can be sent to RFAB@mail.nih.gov.

Can I use other statistical software packages besides SAS to calculate the HEI score?

Yes, it is possible to use other software packages, such as Stata or R, to calculate HEI scores. Researchers are encouraged to ask questions and share their code via the HEI list-serv. To join the HEI list-serv, send an email to listserv@list.nih.gov with SUBSCRIBE HEI <first name> <last name> in the body of the message (omit the angle brackets; for example: SUBSCRIBE HEI John Smith).

Please note that the NCI is only able to provide sample code in SAS at this time.

Citing HEI in Research Papers

How do I cite the HEI website?

Though citation format will vary depending on the journal style you are following, the general information that should be included in the citation of the HEI website includes the author (National Cancer Institute), the web page title (such as The Healthy Eating Index - Population Ratio Method) and the URL where the information is located. For example:

National Cancer Institute. The Healthy Eating Index – Population Ratio Method. https://epi.grants.cancer.gov/hei/population-ratio-method.html. Updated December 14, 2021.

What are other resources about the HEI to cite?

See Selected HEI Publications for peer-reviewed articles about the HEI.

How do I interpret the HEI scores?

A graded approach can be used to aid interpretation of the HEI scores. The letter grade should not be reported alone; it should only be reported in combination with the numerical score. The grading system is as follows:

  • Overall scores of 90 to 100, or component scores that are 90% to 100% of maximum score: A;
  • Overall scores of 80 to 89, or component scores that are 80% to 89% of maximum score: B;
  • Overall scores of 70 to 79, or component scores that are 70% to 79% of maximum score: C;
  • Overall scores of 60 to 69, or component scores that are 60% to 69% of maximum score: D; and
  • Overall scores of 0 to 59, or component scores that are 0% to 59% of maximum score: F.
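The grading bands above can be expressed as a small Python sketch. The function names are hypothetical. The bands are stated for whole-number scores, so this sketch assumes that a fractional score (for example, 89.5) stays in the lower band until it reaches the next cutoff.

```python
def hei_letter_grade(score, max_score=100):
    """Map a total score (max 100) or a component score to a letter grade.

    The score is converted to a percentage of max_score and compared
    with the cutoffs 90, 80, 70, and 60. Assumption: fractional values
    below a cutoff fall into the lower band.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    if not 0 <= score <= max_score:
        raise ValueError("score must be between 0 and max_score")
    percent = 100 * score / max_score
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percent >= cutoff:
            return grade
    return "F"


def describe_score(score, max_score=100):
    """Format the numerical score together with its letter grade."""
    return f"{score:g} ({hei_letter_grade(score, max_score)})"


total_label = describe_score(72)              # total score out of 100
component_grade = hei_letter_grade(8, 10)     # component with a maximum of 10
```

The describe_score helper always pairs the grade with the numerical score, in keeping with the guidance that the letter grade not be reported alone.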

Furthermore, we do not recommend using letter grades as a way of categorizing scores for subsequent analyses because translating scaled data into categories discards useful information. Also, given the variability in diets, misclassification can occur, especially affecting scores at or near cutpoints.