HEI Scores for Examining Association between Diet and Another Variable

When HEI scores are used as predictors of other variables (outcomes), for example in epidemiological studies, the goal is to quantify the relationship between the HEI and the outcome.

The recommended approaches for estimating means and distributions of scores for a population, subpopulation, or group are intended to minimize the effects of measurement error in dietary intake data such that the results better reflect HEI scores for usual intake.

For analyses that require estimating scores for each person rather than means or distributions for a group, such as examining relationships between diet and health or other variables, there currently are few options for correcting for error in dietary intake data. Further, little is known about the impact of measurement error on the results of analyses that make use of the HEI.

Biomarker-based validation studies focusing on energy and protein have shown that the observed effects of diet on health are biased (typically attenuated, that is, pulled toward the null) when diet is measured with error (see Freedman et al. 2014; Freedman et al. 2015). Energy adjustment appears to lessen, though not eliminate, this problem. The extent to which these findings apply to models using the HEI is not yet understood. Further, techniques for dealing with measurement error in such analyses have not yet been developed. Until more is known about the effects of measurement error on analyses using HEI total or component scores as exposures in regression models, researchers should consider the potential for bias due to measurement error when interpreting their results.
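The attenuation described above can be illustrated with a small simulation. This is a minimal sketch under the classical measurement error model (observed score = usual score + independent random error), which is an assumption made here for illustration only; as noted above, how well such results carry over to the HEI is not yet understood. All numeric values (mean score, standard deviations, true slope) and the function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, true_slope=0.5, sd_usual=10.0, sd_error=10.0, seed=1):
    """Return (slope using the usual score, slope using the error-prone score)."""
    rng = random.Random(seed)
    # Hypothetical "usual" score for each person (not observable in practice).
    usual = [rng.gauss(60.0, sd_usual) for _ in range(n)]
    # The outcome depends on the usual score plus unrelated noise.
    outcome = [true_slope * t + rng.gauss(0.0, 5.0) for t in usual]
    # Assumed classical error: single-day score = usual score + random error.
    observed = [t + rng.gauss(0.0, sd_error) for t in usual]
    return ols_slope(usual, outcome), ols_slope(observed, outcome)


slope_usual, slope_observed = simulate()
print(f"slope with usual score:       {slope_usual:.3f}")
print(f"slope with error-prone score: {slope_observed:.3f}")
```

Under these assumptions, the slope estimated from the error-prone score is about half the slope estimated from the usual score, in line with the factor var(usual) / (var(usual) + var(error)) = 100 / 200.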

Guidance is in preparation on using the Markov chain Monte Carlo (MCMC) approach to estimate HEI scores from 24-hour recall data for use in models of relationships with health or other variables. Further details will be posted when available.

The following summarizes which methods are possible, and which are recommended, for the various types of nutritional intake data available.

Examining Association between Diet and Another Variable (e.g. health status)

Purpose: To calculate total and component HEI scores and estimate regression coefficients

Method: Simple HEI Scoring Algorithm + Regression

  Data: Single 24HR on all
  Considerations & Caveats:
    - Yields HEI scores for a single day, contrary to the recommendation that HEI be based on usual intake.
    - Will be biased to the extent that the 24HR-report of the intake of the component is biased.

  Data: Single 24HR on all, multiple 24HRs on at least a subset
  Considerations & Caveats:
    - Yields HEI scores for a single day or the mean intake over a limited number of days, contrary to the recommendation that HEI be based on usual intake.
    - Will be biased to the extent that the 24HR-report of the intake of the component is biased.

  Data: FFQ on all
  Considerations & Caveats:
    - Can be done, but is biased because FFQ-reported intakes are usually biased.

Method: Bivariate Approach + Regression

  Data (either of the following):
    - Single 24HR on all, multiple 24HRs on at least a subset
    - FFQ on all, multiple 24HRs on at least a subset
  Considerations & Caveats:
    - Can only be used to put a single component (or other ratio) into a regression model, unlike the MCMC method, which allows the multivariate HEI total score to be included in the model.
    - Doesn’t account for the relationship between HEI components.
    - Gives estimates of a single component (or other ratio) consistent with the MCMC method.
    - More user-friendly than the MCMC method.
    - Including the FFQ data would only be recommended when the subset with 24HRs is very small compared to the total sample and/or the subset with the 24HRs is not a random subsample of the total sample. (Outside of these cases, the recommendation would be to use only the information from the 24HRs, as the FFQ would not add much new information.)

Method: Multivariate Approach (MCMC) + Regression

  Data (any of the following):
    - Single 24HR on all, multiple 24HRs on at least a subset
    - FFQ on all, multiple 24HRs on at least a subset
    - Multiple 24HRs and FFQ on all participants
  Considerations & Caveats:
    - This method is under development and is not currently available.
    - Allows the multivariate total HEI score to be included in a regression model, accounting for the relationship between the HEI components.
    - The MCMC method is computationally intensive and has a steep learning curve.
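
The caveats above note that a single 24HR, or the mean over a limited number of days, does not capture usual intake. As a rough illustration of why additional days help with random day-to-day variation, the sketch below computes the expected attenuation factor when the exposure is the mean of k recalls, using the classical result that averaging divides the within-person variance by k. It assumes purely random, independent within-person error and does not address the systematic reporting bias mentioned in the caveats. The variance values and the function name are hypothetical.

```python
def attenuation_factor(var_between, var_within, k):
    """Expected ratio of the observed slope to the true slope when the
    exposure is the mean of k repeat measurements, each carrying
    independent random error (classical model, assumed here)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return var_between / (var_between + var_within / k)


# Hypothetical case: between-person and within-person variances are equal.
factors = {k: attenuation_factor(100.0, 100.0, k) for k in (1, 2, 4)}
for k, factor in factors.items():
    print(f"{k} recall(s): attenuation factor = {factor:.2f}")
```

With these hypothetical variances the factor rises from 0.50 with one recall to 0.80 with four, so the estimated association moves closer to, but does not reach, the value that would be obtained with usual intake.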
