2022-04-05
Introduction
This package is for use with the publicly available 2011-2014 NHANES and 2012 NNYFS data from the following websites:
2011-2012 NHANES
2013-2014 NHANES
2012 NNYFS
The raw 80 Hz physical activity monitor (accelerometer) data along with other data files can be downloaded using this package.
Required Packages
The NHANES.RAW80Hz package requires the “foreign” library for reading “.xpt” files. First, check if this library needs to be installed. If it does, then it can be downloaded and installed from a CRAN repository.
if (!("foreign" %in% rownames(installed.packages()))) install.packages("foreign")
Load the NHANES.RAW80Hz Package
library(NHANES.RAW80Hz)
1. Non-raw Data Files
Before getting to the large raw accelerometer data, let's start with some of the smaller data files available from the above websites. To see which non-raw data files are available for a particular data source, use the getDataFileInfo() function. Let's see the files for 2012 NNYFS. A data frame of file information will be returned.
info <- getDataFileInfo("NNYFS")
info
## Type
## 1 Demographics
## 2 Dietary
## 3 Dietary
## 4 Dietary
## 5 Dietary
## 6 Dietary
## 7 Dietary
## 8 Dietary
## 9 Dietary
## 10 Dietary
## 11 Examination
## 12 Examination
## 13 Examination
## 14 Examination
## 15 Examination
## 16 Examination
## 17 Examination
## 18 Examination
## 19 Examination
## 20 Examination
## 21 Examination
## 22 Examination
## 23 Questionnaire
## 24 Questionnaire
## 25 Questionnaire
## 26 Questionnaire
## 27 Questionnaire
## 28 Questionnaire
## 29 Questionnaire
## 30 Questionnaire
## 31 Questionnaire
## 32 Questionnaire
## 33 Questionnaire
## 34 Questionnaire
## Name File
## 1 Demographic Variables & Sample Weights Y_DEMO
## 2 Dietary Interview - Individual Foods Y_DR1IFF
## 3 Dietary Interview - Total Nutrient Intakes Y_DR1TOT
## 4 Dietary Supplement Database - Blend Information DSBI
## 5 Dietary Supplement Database - Ingredient Information DSII
## 6 Dietary Supplement Database - Product Information DSPI
## 7 Dietary Supplement Use 24-Hour - Individual Dietary Supplements Y_DS1IDS
## 8 Dietary Supplement Use 24-Hour - Total Dietary Supplements Y_DS1TOT
## 9 Dietary Supplement Use 30 Day - Individual Dietary Supplements Y_DSQIDS
## 10 Dietary Supplement Use 30-Day - Total Dietary Supplements Y_DSQTOT
## 11 Body Measures Y_BMX
## 12 Cardiorespiratory Endurance Y_CEX
## 13 Cardiovascular Fitness Y_CVX
## 14 Lower Body Muscle Strength Y_LMX
## 15 Modified Pull-Up Y_MPX
## 16 Muscle Strength - Grip Test Y_MGX
## 17 Physical Activity Monitor - Day Y_PAXDAY
## 18 Physical Activity Monitor - Header Y_PAXHD
## 19 Physical Activity Monitor - Hour Y_PAXHR
## 20 Physical Activity Monitor - Minute Y_PAXMIN
## 21 Plank Y_PLX
## 22 Test of Gross Motor Development Y_GMX
## 23 Acculturation Y_ACQ
## 24 Diabetes Y_DIQ
## 25 Diet Behavior & Nutrition Y_DBQ
## 26 Early Childhood Y_ECQ
## 27 Health Insurance Y_HIQ
## 28 Hospital Utilization & Access to Care Y_HUQ
## 29 Medical Conditions Y_MCQ
## 30 Physical Activity Y_PAQ
## 31 Physical Functioning Y_PFQ
## 32 Prescription Medications Y_RXQ_RX
## 33 Respiratory Health Y_RDQ
## 34 Smoking - Cigarette Use Y_SMQ
## Size_compressed Date_published Path
## 1 350.4 KB January 2015 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DEMO.XPT
## 2 14.3 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DR1IFF.XPT
## 3 1.2 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DR1TOT.XPT
## 4 4.4 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/DSBI.XPT
## 5 31 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/DSII.XPT
## 6 2.9 MB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/DSPI.XPT
## 7 224.9 KB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DS1IDS.XPT
## 8 573.4 KB September 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DS1TOT.XPT
## 9 645.7 KB January 2015 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DSQIDS.XPT
## 10 505.8 KB January 2015 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DSQTOT.XPT
## 11 274.7 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_BMX.XPT
## 12 393.1 KB January 2016 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_CEX.XPT
## 13 132.1 KB April 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_CVX.XPT
## 14 1.1 MB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_LMX.XPT
## 15 43.3 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_MPX.XPT
## 16 204.5 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_MGX.XPT
## 17 1.3 MB November 2020 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXDAY.XPT
## 18 77.4 KB November 2020 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXHD.XPT
## 19 26.3 MB November 2020 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXHR.XPT
## 20 1.6 GB November 2020 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAXMIN.XPT
## 21 50.5 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PLX.XPT
## 22 165.3 KB January 2016 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_GMX.XPT
## 23 65.5 KB August 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_ACQ.XPT
## 24 78.4 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DIQ.XPT
## 25 156.1 KB February 2015 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_DBQ.XPT
## 26 91.4 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_ECQ.XPT
## 27 169.1 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_HIQ.XPT
## 28 52.5 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_HUQ.XPT
## 29 91.4 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_MCQ.XPT
## 30 1.2 MB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PAQ.XPT
## 31 65.5 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_PFQ.XPT
## 32 288.4 KB July 2014 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_RXQ_RX.XPT
## 33 39.6 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_RDQ.XPT
## 34 16.6 KB September 2013 https://wwwn.cdc.gov/Nchs/Nnyfs/Y_SMQ.XPT
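Because getDataFileInfo() returns an ordinary data frame, standard subsetting applies to it. The sketch below rebuilds a few rows by hand from the output above, so it runs without the package or a network connection, and shows one way to pick out the physical activity monitor files and to flag large ones before downloading. The helper sizeToMB() is hypothetical (it is not part of NHANES.RAW80Hz), and it assumes the sizes are character strings with decimal KB/MB/GB units.

```r
# Stand-in for a few rows of getDataFileInfo("NNYFS"), copied from the output above.
info <- data.frame(
  Type = c("Demographics", "Examination", "Examination", "Examination", "Examination"),
  Name = c("Demographic Variables & Sample Weights", "Body Measures",
           "Physical Activity Monitor - Day", "Physical Activity Monitor - Hour",
           "Physical Activity Monitor - Minute"),
  File = c("Y_DEMO", "Y_BMX", "Y_PAXDAY", "Y_PAXHR", "Y_PAXMIN"),
  Size_compressed = c("350.4 KB", "274.7 KB", "1.3 MB", "26.3 MB", "1.6 GB"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert strings such as "26.3 MB" to megabytes,
# assuming decimal units (1 MB = 1000 KB, 1 GB = 1000 MB).
sizeToMB <- function(x) {
  parts <- strsplit(x, " ", fixed = TRUE)
  value <- as.numeric(vapply(parts, `[`, character(1), 1))
  unit  <- vapply(parts, `[`, character(1), 2)
  mult  <- c(KB = 1 / 1000, MB = 1, GB = 1000)
  unname(value * mult[unit])
}

info$Size_MB <- sizeToMB(info$Size_compressed)

# Physical activity monitor files only
pam <- info[grepl("^Physical Activity Monitor", info$Name), ]

# Files larger than 10 MB, which may be slow to download and read
big <- info$File[info$Size_MB > 10]
big
```

With the real data frame, the same two subsetting lines can be applied directly to info.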
Let's take a look at the file containing the “Body Measures” data. We can download and read this file into a data frame using the downloadAndReadFile() function. The first argument to this function will be “NNYFS”, but the second argument can be either “Body Measures” or “Y_BMX”, which are the values of the “Name” and “File” columns in the data frame above for row 11.
x.bm <- downloadAndReadFile("NNYFS", "Body Measures")
x.bm[1:5, ]
## SEQN BMDSTATS BMXWT BMIWT BMXHT BMIHT BMXBMI BMDBMIC BMXARML BMIARML BMXARMC
## 1 71917 4 NA NA NA NA NA NA NA NA NA
## 2 71918 1 38.6 NA 131.6 NA 22.3 4 27.7 NA 25.4
## 3 71919 1 58.7 NA 172.0 NA 19.8 2 38.4 NA 26.0
## 4 71920 3 92.5 NA 167.1 NA 33.1 4 35.9 NA 37.9
## 5 71921 1 12.4 NA 90.2 NA 15.2 2 18.3 NA 15.1
## BMIARMC BMXWAIST BMIWAIST BMXCALF BMICALF BMXCALFF BMICALFF BMXTRI BMITRI
## 1 NA NA NA NA NA NA NA NA NA
## 2 NA 71.9 NA 32.3 NA 22.0 NA 19.9 NA
## 3 NA 79.4 NA 35.3 NA 18.4 NA 15.0 NA
## 4 NA 96.4 NA 46.8 NA NA 1 20.6 NA
## 5 NA 46.8 NA 19.4 NA 8.4 NA 8.6 NA
## BMXSUB BMISUB
## 1 NA NA
## 2 17.4 NA
## 3 9.8 NA
## 4 22.8 NA
## 5 5.7 NA
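As a quick sanity check on a freshly read file, a derived column can be recomputed from its inputs. The sketch below re-enters the four measured rows shown above by hand and recomputes body mass index from weight and height. It assumes BMXWT is in kilograms and BMXHT is in centimetres; the column name bmi.check is introduced here for illustration only.

```r
# The four measured rows shown above, re-entered by hand (SEQN 71918 to 71921).
x <- data.frame(
  SEQN   = c(71918, 71919, 71920, 71921),
  BMXWT  = c(38.6, 58.7, 92.5, 12.4),     # weight, assumed to be in kg
  BMXHT  = c(131.6, 172.0, 167.1, 90.2),  # standing height, assumed to be in cm
  BMXBMI = c(22.3, 19.8, 33.1, 15.2)
)

# BMI = weight (kg) / height (m)^2, rounded to one decimal like BMXBMI
x$bmi.check <- round(x$BMXWT / (x$BMXHT / 100)^2, 1)

# TRUE when every recomputed value agrees with the published column
all(abs(x$bmi.check - x$BMXBMI) < 0.05)
```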
The downloadAndReadFile() function only works for a single file; however, several files can be downloaded and saved using the downloadFiles() function. The first argument to this function is the destination directory where the data will be saved. As with the downloadAndReadFile() function, the files to be downloaded can be given as values from either the “Name” or “File” column in the info data frame above. The downloaded files can be saved as type “csv” (comma-separated values), “rda” (binary R object file), or “xpt” (SAS transport file). Let us download the data for “Cardiorespiratory Endurance” and “Cardiovascular Fitness” and save them as csv files in the working directory. The saved file names will be prefixed with the data source (NNYFS) and contain the value of the “File” column in the data frame returned from getDataFileInfo(). Note that there are different ways to specify the files argument, as shown in the comments below.
destDir <- getwd()
data <- "NNYFS"
# Different ways to specify the same files
files <- c("Cardiorespiratory Endurance", "Cardiovascular Fitness")
# files <- c("Cardiorespiratory Endurance", "Y_CVX")
# files <- c("Y_CEX", "Y_CVX")
# files <- c("Cardiovascular Fitness", "Y_CEX")
downloadFiles(destDir, data, files=files, save.type="csv")
## NULL
# Check that the files exist
file1 <- paste0(destDir, "/", data, "_", "Y_CEX", ".csv")
file2 <- paste0(destDir, "/", data, "_", "Y_CVX", ".csv")
file.exists(file1)
## [1] TRUE
file.exists(file2)
## [1] TRUE
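The naming rule described above (data source, underscore, “File” value, extension) can be wrapped in small helpers so that the check works no matter how the files argument was written. The following is a minimal sketch under stated assumptions: toFileCode() and savedPath() are hypothetical helpers, not functions of NHANES.RAW80Hz, and the lookup table is a two-row stand-in for the data frame returned by getDataFileInfo().

```r
# Stand-in lookup table with the two rows used above (from getDataFileInfo("NNYFS")).
info <- data.frame(
  Name = c("Cardiorespiratory Endurance", "Cardiovascular Fitness"),
  File = c("Y_CEX", "Y_CVX"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map each entry to its "File" code, whether the caller
# supplied a value from the "Name" column or from the "File" column.
toFileCode <- function(files, info) {
  byName <- match(files, info$Name)
  byFile <- match(files, info$File)
  idx    <- ifelse(is.na(byName), byFile, byName)
  if (anyNA(idx)) stop("Unknown file(s): ", paste(files[is.na(idx)], collapse = ", "))
  info$File[idx]
}

# Hypothetical helper: expected path of a saved file, following the
# <data source>_<File>.<save.type> pattern described above.
savedPath <- function(destDir, data, fileCode, save.type = "csv") {
  file.path(destDir, paste0(data, "_", fileCode, ".", save.type))
}

codes <- toFileCode(c("Cardiorespiratory Endurance", "Y_CVX"), info)
paths <- savedPath(tempdir(), "NNYFS", codes)
basename(paths)
```

Passing the resulting paths to file.exists() then checks all requested files in one call.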
2. Raw 80 Hz Physical Activity Monitor Data
Because the raw physical activity monitor data is very large, users must ensure that they have sufficient disk space to download and extract the data. The compressed sizes are 1.04 TB for 2011-2012 NHANES, 1.17 TB for 2013-2014 NHANES, and 245 GB for 2012 NNYFS. The uncompressed data will be about 12 times larger than the compressed data.
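To put these figures in perspective, the arithmetic can be done directly in R. This is only a rough estimate based on the sizes and the factor of 12 quoted above; it treats 245 GB as 0.245 TB and ignores any space saved if extracted files are compressed again.

```r
# Compressed sizes quoted above, in TB (245 GB taken as 0.245 TB).
compressed.tb <- c("NHANES 2011-2012" = 1.04,
                   "NHANES 2013-2014" = 1.17,
                   "NNYFS 2012"       = 0.245)

expansion <- 12  # approximate ratio of uncompressed to compressed size

# Approximate uncompressed size per data source, in TB
uncompressed.tb <- compressed.tb * expansion
round(uncompressed.tb, 2)

# Rough total if all three sources were fully extracted at once
total.tb <- sum(uncompressed.tb)
total.tb
```

This gives roughly 12.5 TB, 14 TB, and 2.9 TB respectively, or about 29.5 TB in total.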
The raw data is partitioned by subject and stored in .tar.bz2 files. Each .tar.bz2 file contains many csv files for that subject, one csv file per day. The data is downloaded subject by subject into a separate folder for each subject. To download this data, only the downloadAcclRawData80Hz() function can be used. The syntax of this function is
downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn="72102", extract=TRUE, compress=TRUE, zipcmd=NULL, checkForFiles=TRUE, delete.bz2=TRUE, DEBUG=FALSE)
and the complete documentation for all input arguments can be found in the user manual. The user must specify the destination folder to store the data (destDir), the data to download (data), and the subjects to download (subject.seqn). If running on Windows, the user must also specify the zipcmd argument and have either the 7-zip or WinZip software installed in order to extract the csv files from the .tar.bz2 file. The path to the 7-zip or WinZip software is then used as the zipcmd argument. If running on a Mac, the zipcmd argument is not required. The downloadAcclRawData80Hz() function returns a list containing the SEQN ids that do not have accelerometer data and the SEQN ids where an error occurred while downloading.
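The returned list can be used to decide which subjects are complete and which are worth retrying. The sketch below uses a made-up return value so that it runs without downloading anything; the element names missing.data and error.download are the ones printed in this vignette's example output, the id 72500 is an arbitrary placeholder, and whether the ids come back as numbers or character strings is an assumption to verify against the user manual.

```r
# IDs requested, and a made-up return value with the two list elements.
# In real use, `res` would be the value returned by downloadAcclRawData80Hz().
requested <- c(72102, 72343, 72500)
res <- list(missing.data = 72500, error.download = NULL)

# Subjects with neither a missing-data flag nor a download error
ok <- setdiff(requested, c(res$missing.data, res$error.download))

# Subjects worth retrying: only those that failed during download
retry <- intersect(requested, res$error.download)

ok
length(retry)
```

Subjects listed under missing.data have no accelerometer data, so retrying them would not help.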
As a simple example, we will download the 2012 NNYFS raw data for subjects 72102 and 72343 to the working directory. A folder named NNYFS will be created in the working directory, and within it, folders named 72102 and 72343 will be created. The code below handles Windows as well as Unix and Mac.
destDir <- getwd()
subs <- c(72102, 72343)
if (.Platform$OS.type == "windows") {
  # On Windows, the zipcmd argument must be specified. Modify below.
  # zipcmd <- "C:/Program Files/7-Zip/7z.exe"  # 7-zip software
  zipcmd <- "C:/Program Files/WinZip/winzip64.exe"  # WinZip software
  downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn=subs, zipcmd=zipcmd)
} else {
  # For Unix and Mac operating systems
  downloadAcclRawData80Hz(destDir, data="NNYFS", subject.seqn=subs)
}
## Downloading file 1 of 2
## Extracting files
## Compressing files
## Downloading file 2 of 2
## Extracting files
## Compressing files
## $missing.data
## NULL
##
## $error.download
## NULL
Note that if the subject.seqn argument is set to NULL, then all subjects with accelerometer data will be downloaded.
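Since each subject is stored in its own folder under the data source folder, base R is enough to see which subjects are already present on disk. The sketch below simulates that layout in a temporary directory rather than touching real downloads; the directory name raw80hz-demo and the id 72500 are arbitrary choices for illustration.

```r
# Simulate the layout described above inside a temporary directory:
# <destDir>/NNYFS/<SEQN>/ for each downloaded subject.
destDir <- file.path(tempdir(), "raw80hz-demo")
for (s in c(72102, 72343)) {
  dir.create(file.path(destDir, "NNYFS", as.character(s)),
             recursive = TRUE, showWarnings = FALSE)
}

# Which of the requested subjects already have a folder?
subs <- c(72102, 72343, 72500)  # 72500 is an arbitrary id used for illustration
have <- dir.exists(file.path(destDir, "NNYFS", as.character(subs)))
names(have) <- subs
have

# Subjects still to be downloaded
todo <- subs[!have]
todo
```

A folder's existence does not by itself guarantee that the download finished, so this is a first check rather than a verification.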
To obtain the SEQN ids of all subjects with raw accelerometer data, the getSEQNids() function can be called as follows.
all.subs <- getSEQNids("NNYFS")
all.subs[1:10]
## [1] 71917 71918 71919 71920 71921 71922 71923 71924 71925 71926
length(all.subs)
## [1] 1477
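With this many subjects, it can be practical to download in batches, so that an interrupted run loses only one batch and disk usage can be checked in between. A minimal sketch of the batching step follows. It uses a placeholder vector of the same length instead of the real SEQN ids, and the batch size of 100 is an arbitrary choice.

```r
# Stand-in for getSEQNids("NNYFS"): 1477 placeholder ids, not the real SEQN values.
all.subs <- seq_len(1477)

batch.size <- 100  # arbitrary choice for illustration

# Assign consecutive ids to batches 1, 2, ... and split into a list of vectors
batches <- split(all.subs, ceiling(seq_along(all.subs) / batch.size))

length(batches)             # number of batches
lengths(batches)[c(1, 15)]  # sizes of the first and last batch
```

Each element of batches could then be passed as the subject.seqn argument of downloadAcclRawData80Hz() in a loop.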
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] NHANES.RAW80Hz_0.0.3
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.29 magrittr_2.0.2 evaluate_0.14 rlang_1.0.1
## [5] stringi_1.7.6 cli_3.1.1 rstudioapi_0.13 jquerylib_0.1.4
## [9] rmarkdown_2.11 tools_4.1.2 foreign_0.8-81 stringr_1.4.0
## [13] xfun_0.29 yaml_2.2.2 fastmap_1.1.0 compiler_4.1.2
## [17] htmltools_0.5.2 knitr_1.37