1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
70
71         /********************* cars2.sas ***************************/
72         title 'Regression on Metric Cars Data';
73
74         /* Read data directly from Excel spreadsheet */
75         proc import datafile="/home/brunner0/441s20/mcars4.xlsx"
76                     out=cars dbms=xlsx replace;
77                     getnames=yes;
78         /* Input data file is mcars4.xlsx
79            Output data set is called cars
80            dbms=xlsx     The input file is an Excel spreadsheet.
81                          Necessary to read an Excel spreadsheet directly under unix/linux
82                          Works in PC environment too except for Excel 4.0 spreadsheets
83                          If there are multiple sheets, use sheet="sheet1" or something.
84            replace       If the data set cars already exists, replace it.
85            getnames=yes  Use column names as variable names. */
86

NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with
      options MSGLEVEL=I.
NOTE: The import data set has 100 observations and 4 variables.
NOTE: WORK.CARS data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.01 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              2791.81k
      OS Memory           29608.00k
      Timestamp           01/20/2020 02:01:43 AM
      Step Count                        24  Switch Count  2
      Page Faults                       0
      Page Reclaims                     842
      Page Swaps                        0
      Voluntary Context Switches        15
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264

87         proc print;
88              title2 'Look at input data set';
89

NOTE: There were 100 observations read from the data set WORK.CARS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.13 seconds
      user cpu time       0.12 seconds
      system cpu time     0.00 seconds
      memory              2732.09k
      OS Memory           29864.00k
      Timestamp           01/20/2020 02:01:43 AM
      Step Count                        25  Switch Count  1
      Page Faults                       0
      Page Reclaims                     865
      Page Swaps                        0
      Voluntary Context Switches        6
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           48

90         data auto;
91              set cars;
92              mpg = 100/lper100k * 0.6214/0.2642;
93              Country = Cntry; /* I just like the spelling more */
94              label Country = 'Location of Head Office'
95                    lper100k = 'Litres per 100 kilometers'
96                    mpg = 'Miles per Gallon'
97                    weight = 'Weight in kg'
98                    length = 'Length in meters';
99              /* Indicator dummy vars: Ref category is Japanese */
100             if country = 'US' then c1=1; else c1=0;
101             if country = 'Europ' then c2=1; else c2=0;
102             label c1 = 'US = 1'
103                   c2 = 'Europe = 1';
104             /* Interaction Terms */
105             cw1 = c1*weight; cw2 = c2*weight;
106             cL1 = c1*length; cL2 = c2*length;
107             /* This way of creating dummy variables is safe only because
108                Country is never missing. If it could be missing, better is
109                if country = ' ' then c1 = .;
110                   else if country = 'US' then c1=1;
111                   else c1=0;
112                if country = ' ' then c2 = .;
113                   else if country = 'Europ' then c2=1;
114                   else c2=0;
115                Note that a blank space is the missing value code for character variables,
116                while a period is missing for numeric variables. */
117

NOTE: There were 100 observations read from the data set WORK.CARS.
NOTE: The data set WORK.AUTO has 100 observations and 12 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              976.28k
      OS Memory           30380.00k
      Timestamp           01/20/2020 02:01:43 AM
      Step Count                        26  Switch Count  2
      Page Faults                       0
      Page Reclaims                     158
      Page Swaps                        0
      Voluntary Context Switches        17
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264

118        proc freq;
119             title2 'Check dummy variables';
120             tables (c1 c2)*country / norow nocol nopercent;
121
122        /* First an analysis with country only. */
123
124        /* Questions for every significance test:
125           * What is E(y|x) for the model SAS is using?
126           * Give the null hypothesis in symbols.
127           * Do you reject H0 at alpha = 0.05? Answer Yes or No.
128           * In plain, non-statistical language, what do you conclude? */
129
130

NOTE: There were 100 observations read from the data set WORK.AUTO.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.04 seconds
      user cpu time       0.04 seconds
      system cpu time     0.00 seconds
      memory              1771.15k
      OS Memory           31152.00k
      Timestamp           01/20/2020 02:01:43 AM
      Step Count                        27  Switch Count  5
      Page Faults                       0
      Page Reclaims                     538
      Page Swaps                        0
      Voluntary Context Switches        27
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           528

131        proc means;
132             title2 'Litres per 100 k Broken Down by Country';
133             class Country;
134             var lper100k;
135

NOTE: There were 100 observations read from the data set WORK.AUTO.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.02 seconds
      user cpu time       0.02 seconds
      system cpu time     0.01 seconds
      memory              9036.65k
      OS Memory           40124.00k
      Timestamp           01/20/2020 02:01:43 AM
      Step Count                        28  Switch Count  2
      Page Faults                       0
      Page Reclaims                     2400
      Page Swaps                        0
      Voluntary Context Switches        35
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           24

136        proc reg plots = none; /* Suppress diagnostic plots for now*/
137             title2 'Regression with Just Country';
138             model lper100k = c1 c2;
139             USvsEURO: test c1=c2;
140

NOTE: PROCEDURE REG used (Total process time):
      real time           0.06 seconds
      user cpu time       0.06 seconds
      system cpu time     0.01 seconds
      memory              2579.96k
      OS Memory           34752.00k
      Timestamp           01/20/2020 02:01:43 AM
      Step Count                        29  Switch Count  2
      Page Faults                       0
      Page Reclaims                     847
      Page Swaps                        0
      Voluntary Context Switches        19
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           64

141        proc glm;
142             title2 'Compare Oneway with proc glm';
143             class country;
144             model lper100k = country;
145

NOTE: PROCEDURE GLM used (Total process time):
      real time           2.56 seconds
      user cpu time       0.12 seconds
      system cpu time     0.02 seconds
      memory              14954.23k
      OS Memory           45240.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        30  Switch Count  3
      Page Faults                       0
      Page Reclaims                     4304
      Page Swaps                        0
      Voluntary Context Switches        579
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           912

146        proc reg plots = none data = auto;
147             title2 'Country, Weight and Length';
148             model lper100k = c1 c2 weight length;
149             country: test c1 = c2 = 0;      /* Country controlling for wgt, length */
150             USvsEURO: test c1=c2;           /* US vs. Europe controlling for wgt, length */
151             wgt_len: test weight=length=0;  /* wgt, length controlling for Country */
152
153        /* Proportions of remaining variation, using a = sF/(n-p+sF) */
154

NOTE: PROCEDURE REG used (Total process time):
      real time           0.08 seconds
      user cpu time       0.08 seconds
      system cpu time     0.00 seconds
      memory              2463.31k
      OS Memory           46528.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        31  Switch Count  2
      Page Faults                       0
      Page Reclaims                     354
      Page Swaps                        0
      Voluntary Context Switches        20
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           88

155        proc iml;
NOTE: IML Ready
156             title2 'Proportion of remaining variation';
157             print "Country controlling for Weight and Length";
158             n = 100;
158      !               p = 5;
158      !                      s = 2;
159             F = 6.90;
159      !                a = s*F/(n-p + s*F);
160             print a;
161
162             print "Weight and Length controlling for Country";
163             F = 115.16;
163      !                  a = s*F/(n-p + s*F);
164             print a;
165

NOTE: Exiting IML.
NOTE: PROCEDURE IML used (Total process time):
      real time           0.01 seconds
      user cpu time       0.02 seconds
      system cpu time     0.00 seconds
      memory              678.21k
      OS Memory           44708.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        32  Switch Count  1
      Page Faults                       0
      Page Reclaims                     266
      Page Swaps                        0
      Voluntary Context Switches        11
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           0

166        proc glm data=auto plots=none;
167             title2 'Country, weight and length with proc glm';
168             class country;
169             model lper100k = weight length country;
170             lsmeans country / pdiff tdiff adjust = bon;
171

NOTE: PROCEDURE GLM used (Total process time):
      real time           0.08 seconds
      user cpu time       0.09 seconds
      system cpu time     0.00 seconds
      memory              2293.25k
      OS Memory           46520.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        33  Switch Count  3
      Page Faults                       0
      Page Reclaims                     340
      Page Swaps                        0
      Voluntary Context Switches        25
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           312

172        proc reg plots = none;
173             title2 'Country, Weight and Length with Interactions';
174             model lper100k = c1 c2 weight length cw1 cw2 cL1 cL2;
175             country: test c1 = c2 = 0; /* Is it really still country? */
176             Interactions: test cw1 = cw2 = cL1 = cL2 = 0;
177
178        /* Centering an explanatory variable by subtracting off the mean affects the
179           intercept, but not the relationships among variables. I want to create a new
180           data set with weight and length centered, and to avoid confusion
181           I will make sure the variables are nicely labelled. */
182

NOTE: PROCEDURE REG used (Total process time):
      real time           0.07 seconds
      user cpu time       0.08 seconds
      system cpu time     0.00 seconds
      memory              2396.28k
      OS Memory           47040.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        34  Switch Count  2
      Page Faults                       0
      Page Reclaims                     282
      Page Swaps                        0
      Voluntary Context Switches        18
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           80

183        proc standard mean=0 data=auto out=cntrd;
184             var weight length;
185
186        /* In the new data set "cntrd," weight and length are adjusted to have mean
187           zero (the sample means have been subtracted from each observation). If I had
188           said mean=0 std=1, they would have been converted to z-scores. All the other
189           variables (including the product terms) are as they were before, and the
190           labels are the same as before too. */
191

NOTE: The data set WORK.CNTRD has 100 observations and 12 variables.
NOTE: PROCEDURE STANDARD used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              841.75k
      OS Memory           45740.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        35  Switch Count  2
      Page Faults                       0
      Page Reclaims                     118
      Page Swaps                        0
      Voluntary Context Switches        14
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264

192        data centered;
193             set cntrd; /* Now centered has everything in cntrd */
194             /* Re-create Interaction Terms and re-label explanatory vars*/
195             cw1 = c1*weight; cw2 = c2*weight;
196             cL1 = c1*length; cL2 = c2*length;
197             label weight = 'Weight in kg (Centered)'
198                   length = 'Length in cm (Centered)';
199
200        /* By default, SAS procedures use the most recently created data set,
201           but specify it anyway. */
202

NOTE: There were 100 observations read from the data set WORK.CNTRD.
NOTE: The data set WORK.CENTERED has 100 observations and 12 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              958.03k
      OS Memory           45996.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        36  Switch Count  2
      Page Faults                       0
      Page Reclaims                     139
      Page Swaps                        0
      Voluntary Context Switches        14
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264

203        proc reg plots=none simple data=centered;
204             title2 'Weight and length are now centered: Mean=0';
205             model lper100k = c1 c2 weight length cw1 cw2 cL1 cL2;
206             country: test c1 = c2 = 0; /* Does this make better sense? */
207             Interactions: test cw1 = cw2 = cL1 = cL2 = 0;
208
209        quit;

NOTE: PROCEDURE REG used (Total process time):
      real time           0.10 seconds
      user cpu time       0.11 seconds
      system cpu time     0.00 seconds
      memory              2398.71k
      OS Memory           47296.00k
      Timestamp           01/20/2020 02:01:46 AM
      Step Count                        37  Switch Count  2
      Page Faults                       0
      Page Reclaims                     263
      Page Swaps                        0
      Voluntary Context Switches        18
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           104

210
211
212
213
214
215
216
217        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
228
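
The proc iml step in this log evaluates a = sF/(n-p+sF) twice, and the log does not show the printed results. As a check on the arithmetic, the same computation is written out below using only the values typed into that step (n = 100, p = 5, s = 2, F = 6.90 and F = 115.16). The log does not define the symbols; reading n as the sample size, p as the number of regression coefficients including the intercept, and s as the number of coefficients set to zero by the test is an assumption based on the four-predictor model. The subscript labels are hypothetical names added for illustration, and align* assumes the amsmath package.

```latex
\begin{align*}
a_{\text{country}}
  &= \frac{sF}{n-p+sF}
   = \frac{2(6.90)}{100-5+2(6.90)}
   = \frac{13.80}{108.80}
   \approx 0.127, \\[4pt]
a_{\text{wgt,len}}
  &= \frac{sF}{n-p+sF}
   = \frac{2(115.16)}{100-5+2(115.16)}
   = \frac{230.32}{325.32}
   \approx 0.708 .
\end{align*}
```

On this reading, country accounts for roughly 13% of the variation left unexplained by weight and length, while weight and length account for roughly 71% of the variation left unexplained by country.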