1          The SAS System          17:06 Thursday, March 8, 2007

NOTE: Copyright (c) 1999-2001 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software Release 8.2 (TS2M0)
      Licensed to NATIONAL INSTITUTES OF HEALTH, Site 0040679003.
NOTE: This session is executing on the WIN_PRO platform.
NOTE: SAS initialization used:
      real time           0.10 seconds
      cpu time            0.10 seconds

NOTE: AUTOEXEC processing beginning; file is C:\Program Files\SAS Institute\SAS\V8\autoexec.sas.

NOTE: Libref GDEVICE0 was successfully assigned as follows:
      Engine:        V8
      Physical Name: C:\Documents and Settings\DoddK\My Documents\SASGRAPH

NOTE: AUTOEXEC processing completed.

1          *----------------------------------------------------------------------------*;
2          *create.pam_perminute.sas                                                    *;
3          *                                                                            *;
4          *This program edits the Physical Activity Monitor (PAM) data. It edits       *;
5          *invalid and unreliable intensity values. It creates a SAS dataset named     *;
6          *pam_perminute in the location specified by the "libname myfolder" statement *;
7          *below.                                                                      *;
8          *                                                                            *;
9          *Before running the code below:                                              *;
10         *1. Modify the libname statement to refer to the folder where you want to    *;
11         *   store the input and output datasets.                                     *;
12         *2. Create a SAS dataset named paxraw_c in that folder from the Physical     *;
13         *   Activity Monitor data from                                               *;
14         *   http://www.cdc.gov/nchs/about/major/nhanes/nhanes2003-2004/exam03_04.htm.*;
15         *3. Save PAM_formats.txt (included with these programs) and list the full    *;
16         *   path in the %include statement below. You will need to include these     *;
17         *   formats in any program that uses the output dataset.                     *;
18         *----------------------------------------------------------------------------*;
19         libname myfolder "&home/EATS_NHANES/sasdata";
NOTE: Libref MYFOLDER was successfully assigned as follows:
      Engine:        V8
      Physical Name: C:\Documents and Settings\DoddK\My Documents\EATS_NHANES\sasdata
20         %include "&home/EATS_NHANES/sasprog/AccelerometryPA/PAM_formats.sas";
NOTE: Format YESNO has been output.
NOTE: Format WKDAY has been output.
NOTE: Format GENDER has been output.
NOTE: Format AGEGRP has been output.
NOTE: PROCEDURE FORMAT used:
      real time           0.14 seconds
      cpu time            0.00 seconds

55
56         *----------------------------------------------------------------------------*;
57         *Create the sequential day variable DAY. (Note that DAY is different from the*;
58         *day of the week variable PAXDAY.)                                           *;
59         *----------------------------------------------------------------------------*;
60         data paxraw;
61           set myfolder.paxraw_c;
62           day=ceil(paxn/1440);
63           label day='Sequential Day';
64           format paxday wkday.;
65         run;

NOTE: There were 72250027 observations read from the data set MYFOLDER.PAXRAW_C.
NOTE: The data set WORK.PAXRAW has 72250027 observations and 9 variables.
NOTE: DATA statement used:
      real time           1:56.73
      cpu time            56.09 seconds

66
67         *----------------------------------------------------------------------*;
68         *1. edit records with intensity count=32767(invalid monitor reading)   *;
69         *2. assign missing values for other unreliable data points             *;
70         *----------------------------------------------------------------------*;
71         * identify people with fewer than 6 data points;
72         * with intensity count = 32767 (invalid monitor reading);
73         * These data will be replaced with the average of the adjacent minutes that have
73       !  counts that are not = 32767;
74         data invalid valid;
75           set paxraw;
76           by seqn;
77           if first.seqn then invalid_cnt=0;
78           retain invalid_cnt;
79           if paxinten=32767 then invalid_cnt=invalid_cnt+1;
80           if last.seqn then do;
81           if 01 minutes with intensity count(s) = 32767, take the average of
112      !  ;
113          *the valid intensity counts immediately before and after the invalid minute(s);
114          else if paxinten > . and paxinten ne 32767 then do;
115            if paxn_invalid ne . then do;
116              sv_int=paxinten;
117              sv_paxn=paxn;
118              *one or more consecutive minutes with count=32767;
119              do i=paxn_valid+1 to paxn-1;
120                paxn=i;
121                paxinten=round(sum(sv_int,last_int)/2);
122                output;
123              end;
124              paxn_invalid=.;
125              *last minute with a valid intensity;
126              last_int=sv_int;
127              paxn_valid=sv_paxn;
128            end;
129            else do;
130              paxn_valid=paxn;
131              last_int=paxinten;
132            end;
133          end;
134          *if the last minute of the day has intensity count = 32767;
135          *use the last valid minute;
136          else if paxinten=32767 and last.day then do;
137            paxinten=last_int;
138            if paxn_invalid ne . then do;
139              do i=paxn_invalid to paxn; * >=1 consecutive minutes of count=32767 up to the
139      !  last minute;
140                paxn=i;
141                output;
142              end;
143            end;
144            else output;
145          end;
146        run;

NOTE: There were 584640 observations read from the data set WORK.PAXRAW_INVALID.
NOTE: The data set WORK.PAXRAW_NEW has 74 observations and 4 variables.
NOTE: DATA statement used:
      real time           1.03 seconds
      cpu time            0.57 seconds

147
148        proc sort data=paxraw_new;
149          by seqn day paxn ;
150        run;

NOTE: There were 74 observations read from the data set WORK.PAXRAW_NEW.
NOTE: The data set WORK.PAXRAW_NEW has 74 observations and 4 variables.
NOTE: PROCEDURE SORT used:
      real time           0.00 seconds
      cpu time            0.00 seconds

151
152        proc sort data=paxraw;
153          by seqn day paxn ;
154        run;

NOTE: There were 72250027 observations read from the data set WORK.PAXRAW.
NOTE: The data set WORK.PAXRAW has 72250027 observations and 9 variables.
NOTE: PROCEDURE SORT used:
      real time           25:44.05
      cpu time            2:58.29

155
156        *Update the raw data with the modified values above and assign;
157        *missing values for other unreliable data points. Records noted
158        *by SEQN below were identified as having partial or completely;
159        *anomalous data. Anomalous data are set to missing to preserve;
160        *remaining data. Alternatively, users may exclude all data with;
161        *PAXSTAT=2;
162        data pam_perminute;
163          update paxraw paxraw_new;
164          by seqn day paxn ;
165
166          if (seqn = 29585 and paxn>1400 ) or
167             (seqn = 23953 and paxn>=1196 ) or
168             seqn = 23163 or
169             (seqn = 27791 and paxn<6000) or
170             (seqn = 26976 and paxinten=32767) or
171             (seqn = 24719 and paxinten = 32767) or
172             (seqn = 26984 and (8572<=paxn<=8613 or 8634<=paxn<=9556)) or
173             (seqn in ( 22577, 28310, 21901, 29814) and paxinten = 32767) or
174             (seqn = 22194) or
175             (seqn = 25262 and paxn<=939) or
176             seqn in (22732, 23111, 27441, 21347) or
177             (seqn = 21310 and paxn> 1728) or
178             (seqn = 23443 and paxn < 7200) or
179             (seqn = 29229 and paxn > 5472) or
180             (seqn = 24515 and paxn > 6048) or
181             (seqn = 27900 and paxn > 7920) or
182             (seqn = 28107 and paxn < 3312) or
183             (seqn = 27434 and paxn < 3024) or
184             (seqn = 25319 and paxn > 8496) or
185             (seqn = 25318 and paxn < 1872) or
186             (seqn = 22213 and 4500<=paxn<=4700) or
187             (seqn = 24845 and 8150<=paxn<=9580) or
188             (seqn = 30908 and 1<=paxn<=320) or
189             (seqn = 23561 and 1800<=paxn<=3800) then paxinten=.;
190        run;

NOTE: There were 72250027 observations read from the data set WORK.PAXRAW.
NOTE: There were 74 observations read from the data set WORK.PAXRAW_NEW.
NOTE: The data set WORK.PAM_PERMINUTE has 72250027 observations and 9 variables.
NOTE: DATA statement used:
      real time           7:39.12
      cpu time            2:28.57

191
192        *copy the work dataset pam_perminute to the folder referenced by the libname
192      !  statement above;
193        data myfolder.pam_perminute;
194          set pam_perminute;
195        run;

NOTE: There were 72250027 observations read from the data set WORK.PAM_PERMINUTE.
NOTE: The data set MYFOLDER.PAM_PERMINUTE has 72250027 observations and 9 variables.
NOTE: DATA statement used:
      real time           3:32.06
      cpu time            47.54 seconds

196
197        proc contents data=myfolder.pam_perminute;
198        run;

NOTE: PROCEDURE CONTENTS used:
      real time           1.04 seconds
      cpu time            0.01 seconds

NOTE: The PROCEDURE CONTENTS printed page 1.

199
200        *-------------------------------------------------------------------------*;
201        *You have now created a dataset that has had two kinds of problem data    *;
202        *removed (invalid and unreliable intensity counts). It contains the       *;
203        *variables listed in pam_perminute_contents.doc. It is used in            *;
204        *create.pam_perday.sas to create a dataset with one record per person     *;
205        *per day, and it may also be used for other analyses.                     *;
206        *-------------------------------------------------------------------------*;

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           45:34.52
      cpu time            8:22.99
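
The closing comment says the output dataset may be used for other analyses, and the program header says the formats must be included in any program that uses it. A minimal sketch of such a follow-on program is given here. It assumes the macro variable home and the folder layout match those in the log. The names minutes_per_day, nonmissing_minutes and missing_minutes are hypothetical and are introduced only for illustration. This is not the logic of create.pam_perday.sas.

```sas
*Sketch only, not part of the original program;
*Assumes the same libref and format file locations as the log above;
libname myfolder "&home/EATS_NHANES/sasdata";
%include "&home/EATS_NHANES/sasprog/AccelerometryPA/PAM_formats.sas";

*Count non-missing and missing intensity values per person per day;
*minutes_per_day and its two count variables are hypothetical names;
*BY processing relies on the seqn and day order produced by the sorts above;
proc means data=myfolder.pam_perminute noprint;
  by seqn day;
  var paxinten;
  output out=minutes_per_day n=nonmissing_minutes nmiss=missing_minutes;
run;
```

The sketch uses BY-group processing rather than a CLASS statement because the per-minute data are already ordered by seqn, day and paxn. The count of non-missing values only reflects records whose intensity was not set to missing, so it should not be read as a count of reliable minutes.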