Two sample t-test with SAS

The two sample t-test compares two population means using two independent samples. A common experimental design has a test condition and a control condition, with each subject randomly assigned to one of them. One variable is then measured and compared between the two conditions (samples).

Suppose a study compares two study methods to see whether they improve grades differently. There is a new method (treatment, or t) and a standard method (control, or c). Users are randomly assigned to one of the two methods. After they are trained with the method, their performance is measured as a grade. The data set is "reading.csv". The problem is to test whether the two methods make a difference. The model you can set up for this problem is

Grade (continuous) ~ method (categorical: 2 levels)

Read the data set into SAS.

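data read;
    infile "H:\sas\data\reading.csv" dlm=',' firstobs=2;  /* firstobs=2 skips the header row */
    length method $ 10;  /* the default length of 8 would truncate "treatment" */
    input method $ grade;
run;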
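Before testing, it can help to confirm that the file was read as expected. The sketch below assumes the data set is named read and has the variables method and grade, as in the data step above; it prints the rows and summarizes grade by method.

```sas
/* List the imported rows to check that method and grade were read correctly */
proc print data=read;
run;

/* Per-group sample size, mean, standard deviation, and range of grade */
proc means data=read n mean std min max;
    class method;
    var grade;
run;
```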

Checking assumptions

There is one continuous dependent variable and one categorical independent variable (with 2 levels);
The two samples are independent;
Each sample follows a normal distribution, which can be verified with a normality check.
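One possible way to run the normality check is PROC UNIVARIATE with the NORMAL option, which reports the Shapiro-Wilk test (among others) for each group. This is a minimal sketch, assuming the read data set from above; the Q-Q plot is an optional visual check and is not part of the original walkthrough.

```sas
/* Normality tests for grade, computed separately for each method */
proc univariate data=read normal;
    class method;
    var grade;
    /* Optional: Q-Q plot against a normal with estimated mean and std dev */
    qqplot grade / normal(mu=est sigma=est);
run;
```

In the "Tests for Normality" table, a p-value above the chosen significance level means there is no evidence against normality for that group. With samples as small as five observations per group, the test has little power, so the plot is worth a look as well.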

If these assumptions do not hold, consider an alternative test:

Two dependent samples that follow a normal distribution: use the paired t-test;
Two independent samples that do not follow a normal distribution: use the Wilcoxon-Mann-Whitney (WMW) test;
Two dependent samples that do not follow a normal distribution: use the Wilcoxon signed rank test.
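The alternatives listed above could be run roughly as follows. The WMW example reuses the read data set. The paired examples assume a hypothetical data set named paired_scores with one row per subject and two hypothetical variables, before and after; neither the data set nor the variables come from this example.

```sas
/* WMW test: two independent samples, normality not assumed */
proc npar1way data=read wilcoxon;
    class method;
    var grade;
    exact wilcoxon;  /* exact p-value, reasonable for small samples */
run;

/* Paired t-test: two dependent, normally distributed samples.
   paired_scores, before, and after are hypothetical names. */
proc ttest data=paired_scores;
    paired before*after;
run;

/* Signed rank test: two dependent samples, normality not assumed.
   PROC UNIVARIATE reports it in "Tests for Location" for the differences. */
data paired_diff;
    set paired_scores;
    diff = after - before;
run;

proc univariate data=paired_diff;
    var diff;
run;
```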

In this demo example, the two samples (control and treatment) are independent and pass the normality check, so we continue with the two sample t-test. Note that the test is two-sided (sides=2), the significance level is 0.05 (alpha=0.05), and the test compares the difference between the two means (mu1 - mu2) against 0 (h0=0).

Compare two independent samples

proc ttest data=read sides=2 alpha=0.05 h0=0;
    title "Two sample t-test example";
    class method;  /* grouping variable with 2 levels */
    var grade;     /* continuous response to compare */
run;

Reading the output

two sample t example

The TTEST Procedure

Variable: Grade

Method        N      Mean   Std Dev   Std Err   Minimum   Maximum
control       5   88.6000    7.3007    3.2650   80.0000   98.0000
treatment     5     101.6    2.0736    0.9274   99.0000     104.0
Diff (1-2)       -13.0000    5.3666    3.3941

Method       Method              Mean       95% CL Mean       Std Dev     95% CL Std Dev
control                       88.6000    79.5350   97.6650     7.3007    4.3741   20.9789
treatment                       101.6    99.0252     104.2     2.0736    1.2424    5.9587
Diff (1-2)   Pooled          -13.0000   -20.8268   -5.1732     5.3666    3.6249   10.2811
Diff (1-2)   Satterthwaite   -13.0000   -21.9317   -4.0683

Method          Variances       DF   t Value   Pr > |t|
Pooled          Equal            8     -3.83     0.0050
Satterthwaite   Unequal     4.6412     -3.83     0.0141

Equality of Variances

Method     Num DF   Den DF   F Value   Pr > F
Folded F        4        4     12.40   0.0318

Note that the results include both a "Pooled" row and a "Satterthwaite" row; which one to read depends on whether the two sample variances are equal. The test on Equality of Variances is given at the end, and is repeated below.

Equality of Variances

Method     Num DF   Den DF   F Value   Pr > F
Folded F        4        4     12.40   0.0318

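When the p-value (shown under "Pr > F") is greater than 0.05, the variances can be treated as equal, so read the "Pooled" row of the result;
When the p-value (shown under "Pr > F") is no more than 0.05, the variances are unequal, so read the "Satterthwaite" row of the result.

In this example the p-value is 0.0318, which is no more than 0.05, so read the "Satterthwaite" row: t = -3.83 with Pr > |t| = 0.0141. Since 0.0141 is below the significance level of 0.05, the null hypothesis of equal means is rejected, and the two methods make a difference in the mean grade.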
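To see where the Pooled, Satterthwaite, and Folded F numbers come from, they can be recomputed from the group summary statistics. The data step below is only a sketch: it hard-codes the N, Mean, and Std Dev values from the PROC TTEST output, applies the standard textbook formulas, and writes the results to the log. The variable names are chosen for illustration. The results should agree with the output up to rounding.

```sas
data _null_;
    /* Summary statistics copied from the PROC TTEST output */
    n1 = 5; m1 = 88.6;  s1 = 7.3007;   /* control   */
    n2 = 5; m2 = 101.6; s2 = 2.0736;   /* treatment */

    diff = m1 - m2;        /* Diff (1-2) */
    v1 = s1**2 / n1;       /* squared standard error of mean 1 */
    v2 = s2**2 / n2;       /* squared standard error of mean 2 */

    /* Pooled: assumes equal variances */
    df_pooled = n1 + n2 - 2;
    sp        = sqrt(((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / df_pooled);
    t_pooled  = diff / (sp * sqrt(1/n1 + 1/n2));
    p_pooled  = 2 * (1 - probt(abs(t_pooled), df_pooled));

    /* Satterthwaite: does not assume equal variances */
    t_satt  = diff / sqrt(v1 + v2);
    df_satt = (v1 + v2)**2 / (v1**2/(n1 - 1) + v2**2/(n2 - 1));
    p_satt  = 2 * (1 - probt(abs(t_satt), df_satt));

    /* Folded F: larger variance over smaller, two-sided p-value.
       Both groups have n = 5 here, so the two df are the same. */
    f   = max(s1, s2)**2 / min(s1, s2)**2;
    p_f = 2 * (1 - probf(f, n1 - 1, n2 - 1));

    put sp= t_pooled= df_pooled= p_pooled=;
    put t_satt= df_satt= p_satt=;
    put f= p_f=;
run;
```

The two t values coincide in this example because the group sizes are equal, which makes the pooled and unpooled standard errors identical; only the degrees of freedom, and therefore the p-values, differ.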