Solutions to Chapter 4 Practice Problems
1.
a. Without using SAS, provide a one-sided permutation p-value.
The mean in group 1 is 2.25. The mean in group 2 is 112 (rounded). The difference between the means is 112 - 2.25 = 109.75.
We need to find the total number of ways of dividing the 18 observed numbers into a group of 4 and a group of 14 numbers where
the mean of the group of 14 exceeds the mean of the group of 4 by 109.75 or more. An equivalent (and much
simpler) plan is to count the ways of picking, out of the 18 observed numbers, 4 numbers whose sum is
1 + 1 + 2 + 5 = 9 or less. Because the total of all 18 numbers is fixed, a group of 4 with a sum of 9 or less leaves a group of 14
whose sum is at least as large as the observed one, so the difference between the two means will be 109.75 or more.
Below are the different choices for the four numbers along with the number of ways that the 4 numbers could be picked from the 18
observed numbers. For example, there are 4 x 3=12 ways to get 1,1,1,2 because
i) there are four ways to choose three 1s from the four 1s in the eighteen observed data values, and
ii) for each of the choices in (i), there are three ways to choose one 2 from the three 2s in the eighteen observed data values.
Data Set    Number of Ways
1,1,1,1     (4 nCr 4) = 1
1,1,1,2     (4 nCr 3)(3 nCr 1) = 12
1,1,1,3     (4 nCr 3)(1 nCr 1) = 4
1,1,1,4     (4 nCr 3)(1 nCr 1) = 4
1,1,1,5     (4 nCr 3)(1 nCr 1) = 4
1,1,2,2     (4 nCr 2)(3 nCr 2) = 18
1,1,2,3     (4 nCr 2)(3 nCr 1)(1 nCr 1) = 18
1,1,2,4     (4 nCr 2)(3 nCr 1)(1 nCr 1) = 18
1,1,2,5     (4 nCr 2)(3 nCr 1)(1 nCr 1) = 18
1,1,3,4     (4 nCr 2)(1 nCr 1)(1 nCr 1) = 6
1,2,2,2     (4 nCr 1)(3 nCr 3) = 4
1,2,2,3     (4 nCr 1)(3 nCr 2)(1 nCr 1) = 12
1,2,2,4     (4 nCr 1)(3 nCr 2)(1 nCr 1) = 12
2,2,2,3     (3 nCr 3)(1 nCr 1) = 1
Total Number of Ways = 132
The total number of ways to choose 4 data values from the 18 observed values is
(n1 + n2) nCr n1 = 18 nCr 4 = 3060.
Thus our one-sided p-value is 132/3060 = 0.0431.
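The hand count above can be checked by brute force. The following is a minimal Python sketch, not part of the original solution (which was to be done without software); the variable names are illustrative. It enumerates every way of choosing 4 of the 18 observations and counts those with a sum of 9 or less.

```python
from itertools import combinations
from math import comb

# The 18 observed values (group 1 is 1, 1, 2, 5).
values = [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 7, 15, 32, 41, 77, 107, 299, 976]
observed_sum = 1 + 1 + 2 + 5  # sum of the four group 1 values

# combinations() picks by position, so tied values count as separate
# observations, which matches the "number of ways" column in the table.
favorable = sum(1 for group in combinations(values, 4)
                if sum(group) <= observed_sum)
total = comb(len(values), 4)
p_value = favorable / total

print(favorable, total, round(p_value, 4))  # 132 3060 0.0431
```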
b. Provide an approximate one-sided p-value for the rank sum test by computing a Z-statistic and comparing its value to the standard normal distribution.
First, we will list all values and rank them. (Group 1 consists of the values 1, 1, 2, and 5, marked with *.)
Value  1*   1*   1    1    2*   2    2    3    4    5*   7    15   32   41   77   107  299  976
Rank   2.5  2.5  2.5  2.5  6    6    6    8    9    10   11   12   13   14   15   16   17   18
Note that for the ties we use average ranks:
For the set of 1’s : (1 + 2 + 3 + 4) / 4 = 2.5
For the set of 2’s : (5 + 6 + 7) / 3 = 6
T = 2.5 + 2.5 + 6 + 10 = 21 (the sum of the ranks in group 1)
Mean(T) = n1(n1 + n2 + 1) / 2 = 4(19)/2 = 38
S_R = 5.300 (the standard deviation of all 18 ranks: 2.5, 2.5, 2.5, 2.5, 6, 6, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
SD(T) = 5.300 * sqrt[4*14/(4+14)] = 9.348
Z = (21 - 38) / 9.348 = -1.819
Our one-sided p-value is between 0.025 and 0.05. (If you use the standard normal table, you get 0.0344.)
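The rank-sum arithmetic above can be reproduced with a short Python sketch. This is an illustration only; the helper `average_ranks` and the variable names are assumptions introduced here, and no continuity correction is applied, matching the calculation in the text.

```python
from math import sqrt
from statistics import NormalDist, stdev

values = [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 7, 15, 32, 41, 77, 107, 299, 976]
group1 = [1, 1, 2, 5]

def average_ranks(data):
    """Map each distinct value to its average rank (ties share the mean rank)."""
    positions = {}
    for rank, v in enumerate(sorted(data), start=1):
        positions.setdefault(v, []).append(rank)
    return {v: sum(r) / len(r) for v, r in positions.items()}

rank_of = average_ranks(values)
all_ranks = [rank_of[v] for v in values]

n1 = len(group1)
n2 = len(values) - n1
T = sum(rank_of[v] for v in group1)           # rank sum for group 1
mean_T = n1 * sum(all_ranks) / len(all_ranks)  # n1 times the average rank
s_r = stdev(all_ranks)                         # sample SD of all 18 ranks
sd_T = s_r * sqrt(n1 * n2 / (n1 + n2))
z = (T - mean_T) / sd_T
p_one_sided = NormalDist().cdf(z)              # lower tail, since Z is negative

print(T, mean_T, round(s_r, 3), round(sd_T, 3), round(z, 3))
```

Because the software uses the unrounded Z rather than a table lookup at -1.82, the p-value it reports may differ from 0.0344 in the last digit.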
2.
a.
(Diff = Prep. 1 - Prep. 2; Rank is the rank of |Diff|.)
Leaf  Prep. 1  Prep. 2  Diff  Rank
1     2        12       -10   5
2     0        1        -1    1
3     7        3        4     2
4     9        2        7     3
5     13       5        8     4
6     15       4        11    6
7     14       2        12    7
8     22       7        15    8
9     20       4        16    9
10    27       10       17    10
11    32       14       18    11
12    21       2        19    12
13    44       20       24    13
14    47       20       27    14
15    51       20       31    15
16    47       12       35    16
17    67       22       45    17
18    72       24       48    18
19    69       15       54    19
20    71       16       55    20
S is the sum of the ranks that belong to positive differences:
S = 2+3+4+6+7+8+9+10+11+12+13+14+15+16+17+18+19+20 = 204
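The table and the value of S can be rebuilt from the raw counts. The sketch below is a minimal Python illustration with assumed variable names; its simple ranking step assumes there are no tied or zero differences, which holds for these data.

```python
prep1 = [2, 0, 7, 9, 13, 15, 14, 22, 20, 27, 32, 21, 44, 47, 51, 47, 67, 72, 69, 71]
prep2 = [12, 1, 3, 2, 5, 4, 2, 7, 4, 10, 14, 2, 20, 20, 20, 12, 22, 24, 15, 16]

diffs = [a - b for a, b in zip(prep1, prep2)]

# Rank the absolute differences from smallest (rank 1) to largest (rank 20).
# Assumption: no ties and no zeros, so plain integer ranks are enough.
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
ranks = [0] * len(diffs)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank

S = sum(r for d, r in zip(diffs, ranks) if d > 0)             # positive ranks
negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)  # negative ranks

print(S, negative_sum)  # 204 6
```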
b.
Mean(S) = n(n+1)/4 = 20(21)/4 = 105
SD(S) = sqrt[n(n + 1)(2n + 1)/24] = sqrt(717.5) = 26.786
(This formula works only because there are no ties. In general, use
SD(S) = sqrt[(sum of the squares of all the ranks)/4].)
Z = (204 - 105) / 26.786 = 3.696
c.
A Z of 3.696 is beyond the range of the standard normal table, so we can safely say that the two-sided p-value is less than or equal to 0.001.
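Parts (b) and (c) can be checked numerically. The following Python sketch (variable names are illustrative) computes SD(S) both with the no-ties shortcut and with the general sum-of-squared-ranks formula, then converts Z to a two-sided normal p-value.

```python
from math import sqrt
from statistics import NormalDist

n = 20
S = 204
ranks = list(range(1, n + 1))  # no ties here, so the ranks are simply 1..20

mean_S = n * (n + 1) / 4
sd_no_ties = sqrt(n * (n + 1) * (2 * n + 1) / 24)
sd_general = sqrt(sum(r * r for r in ranks) / 4)  # also valid with average ranks

z = (S - mean_S) / sd_no_ties
p_two_sided = 2 * (1 - NormalDist().cdf(z))

print(mean_S, round(sd_no_ties, 3), round(sd_general, 3), round(z, 3))
```

With no ties the two standard deviations agree, which is why the shortcut formula is acceptable for this problem.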
d.
The sum of the ranks associated with negative differences is 1 + 5 = 6.
We want to find how many other ways we can assign signs to the observed ranks and still have the sum
of the negative ranks less than or equal to 6. (This is equivalent to finding all the ways of assigning + and
- signs to the ranks that will give us S>=204, the value observed in the original data.)
RANKS
         11111111112
12345678901234567890 sum of negative ranks
++++++++++++++++++++ 0
-+++++++++++++++++++ 1
+-++++++++++++++++++ 2
++-+++++++++++++++++ 3
+++-++++++++++++++++ 4
++++-+++++++++++++++ 5
+++++-++++++++++++++ 6
--++++++++++++++++++ 3
-+-+++++++++++++++++ 4
-++-++++++++++++++++ 5
-+++-+++++++++++++++ 6
+--+++++++++++++++++ 5
+-+-++++++++++++++++ 6
---+++++++++++++++++ 6
No other configurations give a sum of 6 or less.
Total number of qualifying outcomes = 14
Possible sign assignments = 2^20 (2 raised to the power 20) = 1,048,576
One-sided p-value = 14/2^20 = 0.00001335
Two-sided p-value = 2(0.00001335) = 0.0000267
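The count of 14 can be confirmed without listing the configurations by hand. The sketch below is one possible check in Python, using a small dynamic program over subset sums; the approach and names are illustrative and are not taken from the original solution.

```python
n = 20
observed_negative_sum = 1 + 5  # ranks of the two negative differences

# ways[s] = number of subsets of the ranks seen so far whose sum is exactly s.
# Sums above the observed value are irrelevant, so they are not tracked.
ways = [1] + [0] * observed_negative_sum
for rank in range(1, n + 1):
    # Go downward so each rank is used at most once per subset.
    for s in range(observed_negative_sum, rank - 1, -1):
        ways[s] += ways[s - rank]

favorable = sum(ways)  # sign assignments with negative rank sum <= 6
total = 2 ** n         # every rank can be + or -
p_one_sided = favorable / total
p_two_sided = 2 * p_one_sided

print(favorable, total)  # 14 1048576
```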
e.
Sign Test
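Z = (k - (n/2)) / sqrt(n/4), where k is the number of positive differences and n is the number of pairs (there are no zero differences)
k = 18
n = 20
Z = (18 - 10) / sqrt(5) = 3.578
This is beyond the range of the standard normal table, so we can safely say that the two-sided p-value is less than 0.001.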
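The sign test calculation can be sketched in Python as follows. The differences are copied from the table in part (a); the variable names are illustrative, and the normal approximation is used without a continuity correction, as in the formula above.

```python
from math import sqrt
from statistics import NormalDist

# Differences (Prep. 1 - Prep. 2) for the 20 leaves, from part (a).
diffs = [-10, -1, 4, 7, 8, 11, 12, 15, 16, 17,
         18, 19, 24, 27, 31, 35, 45, 48, 54, 55]

n = len(diffs)                      # no zero differences to discard
k = sum(1 for d in diffs if d > 0)  # number of positive differences

z = (k - n / 2) / sqrt(n / 4)
p_two_sided = 2 * (1 - NormalDist().cdf(z))

print(k, n, round(z, 3))  # 18 20 3.578
```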