Epidemiology and Genomics Research Program

Using the NHANES.RAW80Hz Package

2022-04-05

Introduction

This package is intended for use with the publicly available 2011-2014 NHANES and 2012 NNYFS data from the following websites:
2011-2012 NHANES
2013-2014 NHANES
2012 NNYFS

The raw 80 Hz physical activity monitor (accelerometer) data along with other data files can be downloaded using this package.

Required Packages

The NHANES.RAW80Hz package requires the “foreign” package for reading “.xpt” files. First, check whether this package is already installed. If it is not, it can be downloaded and installed from a CRAN repository.

# Compare against installed package names only (the row names of the matrix)
if (!("foreign" %in% rownames(installed.packages()))) install.packages("foreign")

Load the NHANES.RAW80Hz Package

library(NHANES.RAW80Hz)

1. Non-raw Data Files

Before getting to the large raw accelerometer data, let's start with some of the smaller data files available from the above websites. To see which non-raw data files are available for a particular data source, use the getDataFileInfo() function. Let's see the files for 2012 NNYFS. A data frame of file information is returned.

info <- getDataFileInfo("NNYFS")
info

##             Type
## 1   Demographics
## 2        Dietary
## 3        Dietary
## 4        Dietary
## 5        Dietary
## 6        Dietary
## 7        Dietary
## 8        Dietary
## 9        Dietary
## 10       Dietary
## 11   Examination
## 12   Examination
## 13   Examination
## 14   Examination
## 15   Examination
## 16   Examination
## 17   Examination
## 18   Examination
## 19   Examination
## 20   Examination
## 21   Examination
## 22   Examination
## 23 Questionnaire
## 24 Questionnaire
## 25 Questionnaire
## 26 Questionnaire
## 27 Questionnaire
## 28 Questionnaire
## 29 Questionnaire
## 30 Questionnaire
## 31 Questionnaire
## 32 Questionnaire
## 33 Questionnaire
## 34 Questionnaire
##                                                               Name     File
## 1                           Demographic Variables & Sample Weights   Y_DEMO
## 2                             Dietary Interview - Individual Foods Y_DR1IFF
## 3                       Dietary Interview - Total Nutrient Intakes Y_DR1TOT
## 4                  Dietary Supplement Database - Blend Information     DSBI
## 5             Dietary Supplement Database - Ingredient Information     DSII
## 6                Dietary Supplement Database - Product Information     DSPI
## 7  Dietary Supplement Use 24-Hour - Individual Dietary Supplements Y_DS1IDS
## 8       Dietary Supplement Use 24-Hour - Total Dietary Supplements Y_DS1TOT
## 9   Dietary Supplement Use 30 Day - Individual Dietary Supplements Y_DSQIDS
## 10       Dietary Supplement Use 30-Day - Total Dietary Supplements Y_DSQTOT
## 11                                                   Body Measures    Y_BMX
## 12                                     Cardiorespiratory Endurance    Y_CEX
## 13                                          Cardiovascular Fitness    Y_CVX
## 14                                      Lower Body Muscle Strength    Y_LMX
## 15                                                Modified Pull-Up    Y_MPX
## 16                                     Muscle Strength - Grip Test    Y_MGX
## 17                                 Physical Activity Monitor - Day Y_PAXDAY
## 18                              Physical Activity Monitor - Header  Y_PAXHD
## 19                                Physical Activity Monitor - Hour  Y_PAXHR
## 20                              Physical Activity Monitor - Minute Y_PAXMIN
## 21                                                           Plank    Y_PLX
## 22                                 Test of Gross Motor Development    Y_GMX
## 23                                                   Acculturation    Y_ACQ
## 24                                                        Diabetes    Y_DIQ
## 25                                       Diet Behavior & Nutrition    Y_DBQ
## 26                                                 Early Childhood    Y_ECQ
## 27                                                Health Insurance    Y_HIQ
## 28                           Hospital Utilization & Access to Care    Y_HUQ
## 29                                              Medical Conditions    Y_MCQ
## 30                                               Physical Activity    Y_PAQ
## 31                                            Physical Functioning    Y_PFQ
## 32                                        Prescription Medications Y_RXQ_RX
## 33                                              Respiratory Health    Y_RDQ
## 34                                         Smoking - Cigarette Use    Y_SMQ
##    Size_compressed Date_published                                         Path
## 1         350.4 KB   January 2015   https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DEMO.XPT
## 2          14.3 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DR1IFF.XPT
## 3           1.2 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DR1TOT.XPT
## 4           4.4 MB September 2014     https://wwwn.cdc.gov/Nchs/Nnyfs/DSBI.XPT
## 5            31 MB September 2014     https://wwwn.cdc.gov/Nchs/Nnyfs/DSII.XPT
## 6           2.9 MB September 2014     https://wwwn.cdc.gov/Nchs/Nnyfs/DSPI.XPT
## 7         224.9 KB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DS1IDS.XPT
## 8         573.4 KB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DS1TOT.XPT
## 9         645.7 KB   January 2015 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DSQIDS.XPT
## 10        505.8 KB   January 2015 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DSQTOT.XPT
## 11        274.7 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_BMX.XPT
## 12        393.1 KB   January 2016    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_CEX.XPT
## 13        132.1 KB     April 2014    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_CVX.XPT
## 14          1.1 MB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_LMX.XPT
## 15         43.3 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_MPX.XPT
## 16        204.5 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_MGX.XPT
## 17          1.3 MB  November 2020 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXDAY.XPT
## 18         77.4 KB  November 2020  https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXHD.XPT
## 19         26.3 MB  November 2020  https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXHR.XPT
## 20          1.6 GB  November 2020 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXMIN.XPT
## 21         50.5 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PLX.XPT
## 22        165.3 KB   January 2016    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_GMX.XPT
## 23         65.5 KB    August 2014    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_ACQ.XPT
## 24         78.4 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DIQ.XPT
## 25        156.1 KB  February 2015    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DBQ.XPT
## 26         91.4 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_ECQ.XPT
## 27        169.1 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_HIQ.XPT
## 28         52.5 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_HUQ.XPT
## 29         91.4 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_MCQ.XPT
## 30          1.2 MB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAQ.XPT
## 31         65.5 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PFQ.XPT
## 32        288.4 KB      July 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_RXQ_RX.XPT
## 33         39.6 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_RDQ.XPT
## 34         16.6 KB September 2013    https://wwwn.cdc.gov/Nchs/Nnyfs/Y_SMQ.XPT
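
Because the returned object is an ordinary data frame, it can be filtered with base R before anything is downloaded. The sketch below is illustrative only: it rebuilds a few rows of the table above by hand so that it runs without network access, and it uses a hypothetical helper, sizeToMB(), that is not part of the package. The helper assumes 1 GB = 1024 MB, which may not match how the listed sizes were computed.

```r
# Hand-copied subset of the getDataFileInfo("NNYFS") output shown above
info.demo <- data.frame(
  Type = c("Demographics", rep("Examination", 5)),
  Name = c("Demographic Variables & Sample Weights", "Body Measures",
           "Physical Activity Monitor - Day", "Physical Activity Monitor - Header",
           "Physical Activity Monitor - Hour", "Physical Activity Monitor - Minute"),
  File = c("Y_DEMO", "Y_BMX", "Y_PAXDAY", "Y_PAXHD", "Y_PAXHR", "Y_PAXMIN"),
  Size_compressed = c("350.4 KB", "274.7 KB", "1.3 MB", "77.4 KB", "26.3 MB", "1.6 GB"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): "26.3 MB" -> 26.3
sizeToMB <- function(x) {
  parts <- strsplit(x, " ", fixed = TRUE)
  value <- as.numeric(vapply(parts, `[`, character(1), 1))
  unit  <- vapply(parts, `[`, character(1), 2)
  mult  <- c(KB = 1 / 1024, MB = 1, GB = 1024)  # assumed binary multiples
  value * unname(mult[unit])
}

# Physical activity monitor summary files only
pam <- info.demo[grepl("^Physical Activity Monitor", info.demo$Name), ]
pam$File

# Files larger than 100 MB compressed, worth a second look before downloading
big <- info.demo$File[sizeToMB(info.demo$Size_compressed) > 100]
big
```

With the real info data frame, the same two subsetting expressions can be applied directly once sizeToMB() is defined.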

Let's take a look at the file containing the “Body Measures” data. We can download and read this file into a data frame using the downloadAndReadFile() function. The first argument to this function will be “NNYFS”, but the second argument can be either “Body Measures” or “Y_BMX”, which are the values of the “Name” and “File” columns in the data frame above for row 11.

x.bm <- downloadAndReadFile("NNYFS", "Body Measures") 
x.bm[1:5, ]

##    SEQN BMDSTATS BMXWT BMIWT BMXHT BMIHT BMXBMI BMDBMIC BMXARML BMIARML BMXARMC
## 1 71917        4    NA    NA    NA    NA     NA      NA      NA      NA      NA
## 2 71918        1  38.6    NA 131.6    NA   22.3       4    27.7      NA    25.4
## 3 71919        1  58.7    NA 172.0    NA   19.8       2    38.4      NA    26.0
## 4 71920        3  92.5    NA 167.1    NA   33.1       4    35.9      NA    37.9
## 5 71921        1  12.4    NA  90.2    NA   15.2       2    18.3      NA    15.1
##   BMIARMC BMXWAIST BMIWAIST BMXCALF BMICALF BMXCALFF BMICALFF BMXTRI BMITRI
## 1      NA       NA       NA      NA      NA       NA       NA     NA     NA
## 2      NA     71.9       NA    32.3      NA     22.0       NA   19.9     NA
## 3      NA     79.4       NA    35.3      NA     18.4       NA   15.0     NA
## 4      NA     96.4       NA    46.8      NA       NA        1   20.6     NA
## 5      NA     46.8       NA    19.4      NA      8.4       NA    8.6     NA
##   BMXSUB BMISUB
## 1     NA     NA
## 2   17.4     NA
## 3    9.8     NA
## 4   22.8     NA
## 5    5.7     NA
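
A quick consistency check can confirm that the file was read correctly. The sketch below copies three columns from the five rows printed above and recomputes body mass index from weight and height. It relies on the NHANES variable definitions (BMXWT is weight in kg, BMXHT is standing height in cm, BMXBMI is body mass index in kg/m^2); the one-decimal rounding is an assumption that happens to match these five rows.

```r
# First five rows of selected columns, copied from the output above
x.demo <- data.frame(
  SEQN   = c(71917, 71918, 71919, 71920, 71921),
  BMXWT  = c(NA, 38.6, 58.7, 92.5, 12.4),     # weight (kg)
  BMXHT  = c(NA, 131.6, 172.0, 167.1, 90.2),  # standing height (cm)
  BMXBMI = c(NA, 22.3, 19.8, 33.1, 15.2)      # body mass index (kg/m^2)
)

# Recompute BMI from weight and height, rounded to one decimal place
bmi.check <- round(x.demo$BMXWT / (x.demo$BMXHT / 100)^2, 1)

# Side-by-side comparison; rows with missing measurements stay NA
data.frame(SEQN = x.demo$SEQN, reported = x.demo$BMXBMI, recomputed = bmi.check)

# Keep only subjects with a recorded weight and height
complete <- x.demo[!is.na(x.demo$BMXWT) & !is.na(x.demo$BMXHT), ]
nrow(complete)
```

On the full data frame, the same expressions can be applied to x.bm in place of x.demo.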

The downloadAndReadFile() function only works for a single file; however, several files can be downloaded and saved using the downloadFiles() function. The first argument to this function is the destination directory where the data will be saved. As with the downloadAndReadFile() function, the files to be downloaded can be given as values from either the “Name” or “File” column of the info data frame above. The downloaded files can be saved as type “csv” (comma-separated values), “rda” (binary R object file), or “xpt” (SAS transport file). Let us download the data for “Cardiorespiratory Endurance” and “Cardiovascular Fitness” and save them as csv files in the working directory. The saved file names will be prefixed with the data source (NNYFS) and contain the value of the “File” column in the data frame returned from getDataFileInfo(). Note that there are different ways to specify the files argument, as shown in the comments below.

destDir <- getwd()
data    <- "NNYFS"
	
# Different ways to specify the same files
files   <- c("Cardiorespiratory Endurance", "Cardiovascular Fitness")
# files <- c("Cardiorespiratory Endurance", "Y_CVX")
# files <- c("Y_CEX", "Y_CVX")
# files <- c("Cardiovascular Fitness", "Y_CEX")

downloadFiles(destDir, data, files=files, save.type="csv")

## NULL

# Check that the files exist
file1 <- paste0(destDir, "/", data, "_", "Y_CEX", ".csv")
file2 <- paste0(destDir, "/", data, "_", "Y_CVX", ".csv")
file.exists(file1)

## [1] TRUE

file.exists(file2)

## [1] TRUE
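
The existence check above can be generalized to any number of files. The following sketch assumes only the naming rule described earlier (data source, underscore, “File” value, extension). The helper expectedPaths() is hypothetical, and a placeholder csv is written to a temporary directory so that the example runs without downloading anything.

```r
# Hypothetical helper: expected paths under the naming rule described above
expectedPaths <- function(destDir, data, fileCodes, save.type = "csv") {
  file.path(destDir, paste0(data, "_", fileCodes, ".", save.type))
}

# Stand-in for a real download so the sketch runs offline:
# write one tiny placeholder csv into a temporary directory
demoDir <- file.path(tempdir(), "nnyfs_demo")
dir.create(demoDir, showWarnings = FALSE)
write.csv(data.frame(SEQN = 71917),
          expectedPaths(demoDir, "NNYFS", "Y_CEX"),
          row.names = FALSE)

# Vectorized check over all expected files
paths   <- expectedPaths(demoDir, "NNYFS", c("Y_CEX", "Y_CVX"))
present <- file.exists(paths)
names(present) <- basename(paths)
present   # Y_CVX is absent here because only the Y_CEX placeholder was written

# Read back whatever is present
loaded <- lapply(paths[present], read.csv)
```

Using file.path() instead of pasting "/" by hand keeps the path construction portable across operating systems.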

2. Raw 80 Hz Physical Activity Monitor Data

Given that the raw physical activity monitor data is very large, the users must ensure that they have sufficient disk space to download and extract the data. The compressed sizes are 1.04 TB for 2011-2012 NHANES, 1.17 TB for 2013-2014 NHANES, and 245 GB for 2012 NNYFS. The uncompressed data will be about 12 times larger than the compressed data.
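
As a rough planning aid, the figures above can be turned into an estimate of the uncompressed size. This is only a back-of-the-envelope sketch: it assumes 1 TB = 1000 GB and applies the approximate factor of 12 uniformly to each data source.

```r
# Compressed sizes quoted above, in GB (assuming 1 TB = 1000 GB)
compressed.gb <- c("2011-2012 NHANES" = 1040,
                   "2013-2014 NHANES" = 1170,
                   "2012 NNYFS"       = 245)

# Approximate expansion factor quoted above
expansion <- 12

# Estimated uncompressed size in TB for each data source
uncompressed.tb <- compressed.gb * expansion / 1000
round(uncompressed.tb, 1)
```

Under these assumptions the estimates come to roughly 12.5 TB, 14 TB, and 2.9 TB respectively.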

The raw data is partitioned by subject and stored in .tar.bz2 files. Each .tar.bz2 file contains many csv files for that subject, one csv file per day. The data is downloaded subject by subject into a separate folder for each subject. Only the downloadAcclRawData80Hz() function can be used to download this data. The syntax of this function is
downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn="72102", extract=TRUE, compress=TRUE, zipcmd=NULL, checkForFiles=TRUE, delete.bz2=TRUE, DEBUG=FALSE),
and the complete documentation for all input arguments can be found in the user manual. The user must specify the destination folder to store the data (destDir), the data to download (data), and the subjects to download (subject.seqn). If running on Windows, the user must also specify the “zipcmd” argument and have either the 7-zip or WinZip software installed in order to extract the csv files from the .tar.bz2 file; the path to the 7-zip or WinZip executable is then used as the zipcmd argument. If running on a Mac or Unix system, the zipcmd argument is not required. The downloadAcclRawData80Hz() function returns a list containing the SEQN ids that do not have accelerometer data and the SEQN ids for which an error occurred while downloading.
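
On Windows, it may help to confirm that the extraction software is where you expect before starting a long download. The sketch below uses a hypothetical helper, findZipCmd(), which is not part of the package. The two candidate paths are common default install locations and are assumptions; adjust them for your machine.

```r
# Hypothetical helper: return the first candidate path that exists, else NULL
findZipCmd <- function(candidates = c("C:/Program Files/7-Zip/7z.exe",
                                      "C:/Program Files/WinZip/winzip64.exe")) {
  found <- candidates[file.exists(candidates)]
  if (length(found) > 0) found[1] else NULL
}

# zipcmd is only needed on Windows; leave it NULL elsewhere
zipcmd <- if (.Platform$OS.type == "windows") findZipCmd() else NULL

if (.Platform$OS.type == "windows" && is.null(zipcmd)) {
  warning("Neither 7-Zip nor WinZip was found at the assumed paths; set zipcmd manually.")
}
```

The resulting value can then be passed as the zipcmd argument of downloadAcclRawData80Hz().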

As a simple example, we will download to the working directory the 2012 NNYFS raw data for subjects 72102 and 72343. In the working directory, a folder named NNYFS will be created, and within the NNYFS directory, folders named 72102 and 72343 will be created. The code below works on Windows, Unix, and Mac.

destDir <- getwd()
subs    <- c(72102, 72343)

if (.Platform$OS.type == "windows") {
  # On Windows, the zipcmd argument must be specified. Modify below.
  #zipcmd <- "C:/Program Files/7-Zip/7z.exe"        # 7-zip software
  zipcmd <- "C:/Program Files/WinZip/winzip64.exe"  # WinZip software
  downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn=subs, zipcmd=zipcmd)
} else {
  # For Unix and Mac operating systems
  downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn=subs)
} 

## Downloading file 1 of 2

## Extracting files

## Compressing files

## Downloading file 2 of 2

## Extracting files

## Compressing files

## $missing.data
## NULL
## 
## $error.download
## NULL
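
The returned list can be used to decide which subjects to retry. The sketch below does not call the package; it uses a made-up return value with the two element names shown in the output above, and imagines that subject 72343 failed to download. The storage type of the ids in the real return value (numeric or character) is not confirmed here.

```r
# Stand-in for the value returned by downloadAcclRawData80Hz();
# the failure of subject 72343 is hypothetical
ret <- list(missing.data = NULL, error.download = 72343)

requested <- c(72102, 72343)

# Subjects worth retrying: download errors only
retry <- ret$error.download

# Subjects that appear to have completed
done <- setdiff(requested, c(ret$missing.data, ret$error.download))

if (length(retry) > 0) {
  message("Retry these SEQN ids: ", paste(retry, collapse = ", "))
  # e.g. downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn=retry)
}
```

Subjects listed under missing.data have no accelerometer data, so retrying them would not help.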

Note that if the subject.seqn argument is set to NULL, then all subjects with accelerometer data will be downloaded.

To obtain all available subjects with raw accelerometer data, the getSEQNids() function can be called as follows.

all.subs <- getSEQNids("NNYFS")
all.subs[1:10]

##  [1] 71917 71918 71919 71920 71921 71922 71923 71924 71925 71926

length(all.subs)

## [1] 1477
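
With this many subjects, it may be more practical to download in batches than to pass every id at once. The following sketch splits an id vector into groups of 100; the batch size is arbitrary, and a stand-in vector of the same length is used so that the example runs without the package. The per-subject size estimate assumes that the 245 GB total quoted earlier is spread over all 1477 ids.

```r
# Stand-in id vector so the sketch runs without the package;
# in practice use ids <- getSEQNids("NNYFS")
ids <- seq_len(1477)

batch.size <- 100   # arbitrary choice
batches <- split(ids, ceiling(seq_along(ids) / batch.size))

length(batches)                           # number of batches
lengths(batches)[c(1, length(batches))]   # sizes of the first and last batch

# Rough average compressed size per subject, in GB
round(245 / length(ids), 2)

# Each batch could then be downloaded in turn, for example:
# for (b in batches) {
#   downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn=b)
# }
```

Downloading in batches makes it easier to check the returned list after each call and to resume after an interruption.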

sessionInfo()

## R version 4.1.2 (2021-11-01)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] NHANES.RAW80Hz_0.0.3
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.29   magrittr_2.0.2  evaluate_0.14   rlang_1.0.1    
##  [5] stringi_1.7.6   cli_3.1.1       rstudioapi_0.13 jquerylib_0.1.4
##  [9] rmarkdown_2.11  tools_4.1.2     foreign_0.8-81  stringr_1.4.0  
## [13] xfun_0.29       yaml_2.2.2      fastmap_1.1.0   compiler_4.1.2 
## [17] htmltools_0.5.2 knitr_1.37