## Statistical Inference

As in the previous labs, we have the same question: *How often would we get data like we observed when the null hypothesis is true?*

As in Lab 4, when we carry out a randomized experiment and observe a difference between treatment groups, we need to consider that the difference may have arisen from an unlucky random assignment rather than from a true treatment effect. Because we now have a quantitative response variable, we need to decide whether the observed *difference in sample means* (our statistic) is large enough that it cannot plausibly be explained by the random assignment process alone (luck of the draw).

Instead of assuming, as we did in Lab 4, that we have a certain number of successes and failures to randomly assign to the treatment groups, this time we will **assume that each student would have received the same improvement score regardless of which explanatory variable group** (restricted or unrestricted sleep) he or she was assigned to; this is the null hypothesis. We will then look at the differences in sample means that *could have been observed* when the null hypothesis is true and random assignment is used to create the two groups. This shows us the pattern in the differences in sample means from one random assignment to the next when we know there is no actual treatment effect. Then we can see where the actual observed research result (15.92) falls in this null distribution.

It is very important to keep in mind that this simulation assumes there is no difference between the two treatments. We examine the could-have-been null distribution of the differences in sample means, under the null hypothesis that there is no difference in the population means, to see whether the difference in sample means actually observed in the study is a surprising outcome under that hypothesis. (If so, we reject the null hypothesis.)

**Notes:**

- We can apply the same process whether the explanatory variable groups are formed by random assignment, by random sampling, or by neither.
- You will worry about generalizability issues at the end of this analysis, but for now you can define the populations to be all people like those in this study who are sleep deprived on night one and all people like those in this study who have unrestricted sleep on night one.

Let $\mu_{\text{unrestricted}} - \mu_{\text{sleep deprived}}$ represent the true underlying difference in population mean improvement scores for which we have observed an estimate in this study. (So keep in mind that the parameter is a number describing the population or process, not the sample.)

(e) State your null and alternative hypotheses for this study in terms of $\mu_{\text{unrestricted}} - \mu_{\text{sleep deprived}}$. (*Hint*: Should the alternative be one-sided or two-sided? How are you deciding?)

**Note:** This null hypothesis is equivalent to saying there is no association between whether or not the subjects were sleep deprived and their improvement score. The only change is that one variable is now quantitative and the other is categorical.