Problem Set 8

Exercise 4.5

This problem involves a single mean, and we are considering the following competing hypotheses:

\[ \begin{align} H_{0}:~\mu &= 8~\text{hours} \\ H_{A}:~\mu &\neq 8~\text{hours} \end{align} \]

Since we have a sample size of 25 and we do not know the population standard deviation, \(\sigma\), we will use a t-test. \[ \begin{align} t &= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}} \\ &= \frac{7.73-8}{\frac{0.77}{\sqrt{25}}} \\ &= -1.75 \end{align} \] Given that there are \(n=25\) observations, the degrees of freedom are \(df = n-1 = 24\).
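As a quick check of the arithmetic, the same statistic can be computed in R from the summary values quoted above. This is only a sketch; the variable names (`x_bar`, `mu_0`, `s`, `n`) are chosen here for illustration and do not come from the exercise.

```r
# Summary statistics quoted in the solution above
x_bar <- 7.73  # sample mean (hours)
mu_0  <- 8     # mean under the null hypothesis (hours)
s     <- 0.77  # sample standard deviation (hours)
n     <- 25    # sample size

se     <- s / sqrt(n)          # standard error of the sample mean
t_stat <- (x_bar - mu_0) / se  # one-sample t-statistic
df     <- n - 1                # degrees of freedom

round(t_stat, 2)  # should agree with the hand calculation
df
```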
Let us start by drawing a picture.

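library(ggplot2)

# Density of the t-distribution with 24 degrees of freedom
custom <- function(x) {dt(x, 24)}
# Plot the density on [-4, 4] and mark the observed t-score
p1 <- ggplot(data.frame(x = c(-4, 4)), aes(x = x)) +
    stat_function(fun = custom) + geom_vline(xintercept = -1.75)
p1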

As we can see from the visualization, the t-score is getting close to the tail but isn’t that far into it. We can determine the p-value two ways: looking it up in a table or using R. If we consult the table in the book, we see that for \(df=24\) this t-score gives a two-tailed p-value between 0.05 and 0.1. Using R:

pt(-1.75,df=24)*2 # times 2 since the default is to return P(X <= x), i.e. one-tail

## [1] 0.09289509

So the result is somewhat suggestive, but not enough to reject the null hypothesis at the 0.05 level.
Given that we failed to reject the null hypothesis, we should expect 8 hours to be in a \(95\%\) confidence interval around our point estimate of 7.73 hours. Let’s check. \[ \begin{align} \bar{x} &\pm t^{*} \times SE \\ &\pm t^{*} \times \frac{s}{\sqrt{n}} \\ &\pm 2.06 \times \frac{0.77}{\sqrt{25}} \\ &\pm 0.32 \end{align} \] or \[ (7.41,8.05) \] So, yes, just barely.
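The critical value \(t^{*}\) does not have to come from a table. A minimal sketch of the same interval in R, assuming the summary values above and using `qt()` for the critical value (the variable names are illustrative):

```r
# Summary statistics from the exercise
x_bar <- 7.73  # sample mean (hours)
s     <- 0.77  # sample standard deviation (hours)
n     <- 25    # sample size

# Critical t-value leaving 2.5% in each tail, with n - 1 degrees of freedom
t_star <- qt(0.975, df = n - 1)

# Margin of error and the resulting 95% confidence interval
margin <- t_star * s / sqrt(n)
ci     <- x_bar + c(-1, 1) * margin

round(t_star, 2)
round(ci, 2)
```

Because `qt()` returns the unrounded critical value, the endpoints may differ from a table-based calculation in the last digit.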

Exercise 4.10

a. No, there is not a clear difference in the average reading and writing scores.
b. The scores of each student should be independent of the scores of other students. Here we have a random sample of 200 students out of a national survey (of unknown but presumably large size). We should not expect the reading and writing scores of the same student, however, to be independent, hence the matched pairs t-test.
c. The two competing hypotheses are: \[ \begin{align} H_{0}:~{\rm diff}_{read-write} &= 0 \\ H_{A}:~{\rm diff}_{read-write} &\neq 0 \end{align} \]
d. As discussed in part (b), independence between the observations is a reasonable assumption. Further, the distribution of differences in score, as shown in the histogram accompanying the problem, looks quite normal. We thus can use either a t-test with \(199\) degrees of freedom or, given that the degrees of freedom are so large, a z-test.

\[ \begin{align} t &= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}} \\ &= \frac{-0.545-0}{\frac{8.887}{\sqrt{200}}} \\ &= -0.87 \end{align} \]

e. We can get the p-value from a t-distribution table. Examining the table in the back of the book, we get a two-tailed p-value greater than 0.2. Thus we would fail to reject the null hypothesis. If we used a z-score, we would get a p-value (again checking a table) of \(0.38\). Again, we would fail to reject the null hypothesis.

Using R.

pt(-0.87,df=199)*2 # times 2 since the default is to return P(X <= x), i.e. one-tail

## [1] 0.3853486

pnorm(-0.87)*2 # again, times 2 for two-tail

## [1] 0.3843004

f. Since we are failing to reject the null hypothesis, there is the possibility of a Type II error, i.e. failing to reject the null when the alternative is true. In this context, it would mean that there actually is a difference between the average reading and writing scores of the students even though our test is consistent with no difference.
g. Since confidence intervals give a plausible range of values for the parameter of interest, and we failed to reject a difference of 0, yes, we would expect the \(95\%\) confidence interval to include 0. Let’s find out (using a critical z-value since there is little difference between the critical z- and t-values here).

\[ \begin{align} \bar{x} &\pm z^{*} \times SE \\ &\pm z^{*} \times \frac{s}{\sqrt{n}} \\ &\pm 1.96 \times \frac{8.887}{\sqrt{200}} \\ &\pm 1.23 \end{align} \] Which gives us a confidence interval of \[ (-1.78,0.69) \]
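To see how little the choice between \(z^{*}\) and \(t^{*}\) matters at this sample size, the interval can be computed both ways in R. This is a sketch from the summary statistics quoted above; the variable names are illustrative.

```r
# Summary statistics for the paired differences (read - write)
d_bar <- -0.545  # mean difference
s_d   <- 8.887   # standard deviation of the differences
n     <- 200     # number of students

se <- s_d / sqrt(n)  # standard error of the mean difference

# 95% critical values from the normal and t distributions
z_star <- qnorm(0.975)
t_star <- qt(0.975, df = n - 1)

# Confidence intervals based on each critical value
ci_z <- d_bar + c(-1, 1) * z_star * se
ci_t <- d_bar + c(-1, 1) * t_star * se

round(ci_z, 2)
round(ci_t, 2)
```

Both intervals contain 0, which is consistent with failing to reject the null hypothesis.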

Exercise 4.14

Not paired. No natural correspondence between cases in the samples.
Paired. There is a natural correspondence between cases in the samples.
Not paired. No natural correspondence between cases in the samples.

Exercise 4.30

We want to test the competing hypotheses: \[ \begin{align} H_{0} &: \mu_{Automatic} = \mu_{Manual} \\ H_{A} &: \mu_{Automatic} \neq \mu_{Manual} \end{align} \]

Since we have small sample sizes, \(n_{Automatic}=26\) and \(n_{Manual}=26\), and we do not know the population standard deviations, we will use a t-test. We will calculate a t-statistic: \[ \begin{align} t &= \frac{(\bar{x}_{1} - \bar{x}_{2}) - (\mu_{1} - \mu_{2})}{SE} \\ &= \frac{\bar{x}_{1} - \bar{x}_{2}}{ \sqrt{ \frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}} } } \\ &= \frac{16.12 - 19.85}{ \sqrt{ \frac{3.58^{2}}{26} + \frac{4.51^{2}}{26} } } \\ &= \frac{-3.73}{1.13} \\ &= -3.3 \end{align} \] For a two-tailed t-test with 25 degrees of freedom (the smaller of \(n_{Automatic}-1\) and \(n_{Manual}-1\)), a t-score greater than 2.79 in absolute value allows one to reject the null hypothesis at the 0.01 level. With our t-score of -3.3 we are safely in the tail, so we reject the null hypothesis. Note that we can also calculate the tail probability for this t-score with R.

pt(-3.3,25) # lower-tail probability; double it for the two-tailed p-value

## [1] 0.00145261
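The whole two-sample calculation can also be reproduced in R from the summary statistics. The sketch below assumes the conservative degrees of freedom used above (the smaller sample size minus one) rather than the Welch approximation that `t.test()` applies to raw data. The variable names are illustrative.

```r
# Summary statistics for the two groups
x1 <- 16.12; s1 <- 3.58; n1 <- 26  # automatic transmissions
x2 <- 19.85; s2 <- 4.51; n2 <- 26  # manual transmissions

# Standard error of the difference in sample means
se <- sqrt(s1^2 / n1 + s2^2 / n2)

# t-statistic under the null hypothesis of equal population means
t_stat <- (x1 - x2) / se

# Conservative degrees of freedom: smaller sample size minus one
df <- min(n1, n2) - 1

# Two-tailed p-value and the critical value at the 0.01 level
p_two  <- 2 * pt(-abs(t_stat), df)
t_crit <- qt(0.995, df)

round(c(se = se, t = t_stat, t_crit = t_crit), 2)
p_two
```

Here `p_two` doubles the one-tail probability, so it can be compared directly against the 0.01 significance level.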

Exercise 4.35

This problem is asking that each treatment be compared to a null hypothesis of no effect \[ \begin{align} H_{0}:~{\rm diff} &= 0 \\ H_{A}:~{\rm diff} &\neq 0 \end{align} \]

Since the subjects are chosen randomly and are a small subset of the prison population (42 out of presumably many more), independence is a reasonable assumption. Inspecting the Q-Q plots, however, shows the distributions have some skew. Given that we have small samples and do not know the population standard deviation, we will use a t-test.

For trial 1 \[ \begin{align} t_{tr1} &= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}} \\ &= \frac{6.21-0}{\frac{12.3}{\sqrt{14}}} \\ &= 1.89 \end{align} \] For trial 2 \[ \begin{align} t_{tr2} &= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}} \\ &= \frac{2.86-0}{\frac{7.94}{\sqrt{14}}} \\ &= 1.35 \end{align} \]

For trial 3 \[ \begin{align} t_{tr3} &= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}} \\ &= \frac{-3.21-0}{\frac{8.57}{\sqrt{14}}} \\ &= -1.40 \end{align} \]
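Since the three calculations share the same form, they can be done in one vectorized step in R. This is a sketch using the summary statistics above; the vector names are illustrative.

```r
# Mean and standard deviation of the differences for each trial
d_bar <- c(tr1 = 6.21, tr2 = 2.86, tr3 = -3.21)
s_d   <- c(tr1 = 12.3, tr2 = 7.94, tr3 = 8.57)
n     <- 14  # subjects per trial

# One-sample t-statistics against a null difference of 0
t_stat <- (d_bar - 0) / (s_d / sqrt(n))

round(t_stat, 2)  # should agree with the three hand calculations
```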

With 13 degrees of freedom, the (one-tailed) p-values are: \[ \begin{align} {\rm p-value}_{tr1} &=0.04 \\ {\rm p-value}_{tr2} &=0.10 \\ {\rm p-value}_{tr3} &=0.09 \end{align} \]

If we use a one-tailed p-value (since the question is asking about a reduction in psychopathic deviant T scores), we see by inspecting the table that only the first trial shows a significant difference (p-value < 0.05). In R:

pt(1.89,13,lower.tail = FALSE) # This calculates the p-value for just the upper tail

## [1] 0.04063037

pt(1.35,13, lower.tail = FALSE) # upper-tail p-value for trial 2

## [1] 0.1000268

pt(-1.40,13) # lower tail, since this t-score is negative

## [1] 0.09246203