Math 1530

Unit 4 Hypothesis Testing

Helper or Hinderer?

Part One - Introduction

Sociology Study from the Yale Infant Cognition Center:

If 16 infants participated in this study, how many do you think chose the helper toy? Why? What factors do you think might be at play when the infants make their choice?

Possible response: Maybe 12 to 15 of the infants would choose the helper toy and the rest would choose the hinderer toy.

What are some possible hypotheses we could make for this situation regarding infants and their choice of a toy?

Example: The infants chose the brighter toy.

Example: The infants chose the softer toy.

If infants really do NOT have a preference for the helper or the hinderer toy, what would be the most likely outcome (number of infants choosing the helper toy) when this study is conducted on 16 infants?

Half of the infants would choose the helper toy and half of the infants would choose the hinderer toy.

Still assuming that infants show NO preference between the helper and hinderer, what kind of results (for number of infants choosing the helper toy) would NOT surprise you when this study is conducted on 16 infants? How far from your guess in number 3 could the result be and still be “okay?”

Possible response: Maybe nine infants would choose the helper toy and 7 infants would choose the hinderer toy.

The researchers actually found that ________ of the 16 infants in the study selected the helper toy. If it is REALLY the case that infants show NO preference between the helper and hinderer toy, do you find the researchers’ results surprising? Why or why not?

Yes, because 14 is far higher than half of the number of infants.

Part Two - Exploration

A key question is, “How surprising is the observed result under the assumption that infants have NO real preference for the helper toy or the hinderer toy?”
We will call this assumption of no real preference the null hypothesis.
Let’s simulate this situation using coin flips. If children truly have no preference, they will have a 50/50 chance of picking the helper toy.

Flip a coin 16 times. If getting heads represents that the child chose the helper toy, count how many of your 16 hypothetical infants chose the helper toy and write that number in the first blank below. (Count the total number of heads from your 16 coin flips.) Repeat this process 3 more times and write the number of heads for each round of 16 in the remaining blanks.
___________

___________

___________

___________
Combine your results with your classmates. Do this by producing a dot plot (of the number of infants who choose the helper toy) on the board, where you contribute 4 dots corresponding to your 4 simulations from number 1 above. Copy the class dot plot from the board below.

a number line that is numbered from 0 to 16, counting by one. The title of the number line is Dot Plot: Number of heads from 16 coin flips

Does it seem like the results actually obtained by these researchers (see number 5 on Part One) would be surprising under the null hypothesis that infants do NOT have a genuine preference for either toy? Explain.
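Yes. A coin flip has no preference for heads or tails, just as the null hypothesis says an infant has no preference for the helper or hinderer toy, so most rounds of 16 flips land near 8 heads. A result as far from 8 as the one the researchers observed should appear rarely, if at all, on the class dot plot.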

Now, we will use technology to simulate completing this experiment many, many times (100, 500, 1000 times) under the assumption that the null hypothesis is true – that infants show no preference of choice over the Helper or Hinderer toy. Based on this simulation, how surprising are the actual results of this study? (Refer back to Part One, number 5). Explain your reasoning.
The results are still quite surprising. The simulation shows that in a typical round about 50% of the infants choose the helper toy and the other half choose the hinderer toy, while a result as lopsided as the researchers' occurs in very few of the simulated rounds.
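The technology simulation described above can be sketched in a few lines of Python. This is only an illustration: the function name `simulate_helper_counts`, the seed value, and the choice of 1000 rounds are assumptions made for this sketch, and the observed count of 14 is taken from the answer to number 5 in Part One.

```python
import random

def simulate_helper_counts(num_trials, num_infants=16, seed=1530):
    """Simulate the study under the null hypothesis.

    Each simulated infant picks the helper toy with probability 0.5,
    exactly like a fair coin landing heads.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    counts = []
    for _ in range(num_trials):
        helpers = sum(rng.random() < 0.5 for _ in range(num_infants))
        counts.append(helpers)
    return counts

counts = simulate_helper_counts(num_trials=1000)
observed = 14  # result reported in Part One, number 5

# How many simulated rounds were at least as extreme as the real study?
as_extreme = sum(c >= observed for c in counts)
average = sum(counts) / len(counts)
proportion_as_extreme = as_extreme / len(counts)

print("Average number choosing the helper toy:", average)
print("Proportion of rounds with", observed, "or more:", proportion_as_extreme)
```

The average should land close to 8, and the proportion of rounds with 14 or more helpers should be very small. That proportion is a simulation-based estimate of how surprising the actual result is under the null hypothesis.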

Part Three - Analysis and Conclusions

There is variability between what we expected to happen, assuming the null hypothesis is true (8 infants choosing helper toy), and the actual results. The question is, “Can the random process of choosing explain this variability, or is there another explanation for this variability?” Many statisticians say that the field of statistics is primarily about explaining variability. This is what we are attempting to do in this investigation, and we will continue to explore these ideas throughout this course.

TERMINOLOGY:

The probability of an event is the long-run proportion of times the event happens when its random process is repeated indefinitely. It has to do with the likelihood that an event occurs.
The p-value is the probability that randomness would produce data as extreme as (or more extreme than) the data in an actual study, assuming the null hypothesis to be true.
- A small p-value indicates that the observed data would be surprising if they occurred by randomness alone, assuming the null hypothesis were true.
- Results with a small p-value are said to be statistically significant.
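For the coin-flip model, the p-value can also be computed exactly rather than estimated by simulation. The sketch below assumes the number of infants choosing the helper toy follows a binomial distribution with n = 16 and probability 0.5 under the null hypothesis, and that "more extreme" means "more infants choosing the helper toy"; the function name `binomial_upper_tail` is made up for this illustration.

```python
from math import comb

def binomial_upper_tail(observed, n=16, p=0.5):
    """Return P(X >= observed) when X counts successes in n trials."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(observed, n + 1)
    )

p_value_14 = binomial_upper_tail(14)  # 14 or more of 16 choose the helper
p_value_9 = binomial_upper_tail(9)    # 9 or more of 16 choose the helper

print(f"{p_value_14:.4f}")  # 0.0021
print(f"{p_value_9:.4f}")   # 0.4018
```

A result of 14 or more out of 16 has a p-value of about 0.002, which is small, while 9 or more out of 16 has a p-value of about 0.40, which is not.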

Based on our simulations from Part Two, what conclusion should the researchers draw? Justify your conclusions and use the above terminology in your justification.
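We would reject the null hypothesis. In the simulations, a result as extreme as 14 of 16 almost never occurred by randomness alone, so the p-value is small and the result is statistically significant. There is sufficient evidence to reject the claim that there is no preference in the choice of toy.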
If the actual study had instead found that 9 of the 16 infants chose the Helper toy, then what decision should the researchers make based on this result? Justify your conclusions, and use the above terminology in your justification.
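We would fail to reject the null hypothesis. A result of 9 of 16 is close to the 8 expected under the null hypothesis and occurs often by randomness alone, so the p-value is large and the result is not statistically significant. There is not sufficient evidence to reject the claim that there is no preference in the choice of toy.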

7.1 Hypothesis Testing with One Sample

Basic Steps of a P-value Hypothesis Test:

Step 1: Write the claim as a mathematical statement.

Step 2: Identify the Null Hypothesis and the Alternative Hypothesis.

Null Hypothesis:

What we assume is true about the population parameter

Will be the same as the claim if the claim contains an equal sign.

\( \left[H_{0}\right] \quad \leq \quad \geq \quad =\)

Always Contains Equality
Alternative Hypothesis:

The complement of the null hypothesis

Will be the same as the claim if the claim does not contain an equal sign.

\( \left[H_{A}\right] \quad < \quad > \quad \neq\)

Never Contains Equality

Step 3: Determine the type of test and shade the graph.

A normal curve with a critical value marked with a vertical line alpha units from the left end of the graph. The area to the left of the critical value is shaded and labeled critical region. The area to the right of the critical value is not shaded and is labeled non-critical region.

Left-tail: HA contains the following symbol: <

A normal curve with a critical value marked with a vertical line alpha units from the right end of the graph. The area to the right of the critical value is shaded and labeled critical region. The area to the left of the critical value is not shaded and is labeled non-critical region.

Right-tail: HA contains the following symbol: >

A normal curve with two critical values marked with vertical lines alpha divided by 2 units from each end of the graph. The area to the left and right of the 2 critical values are shaded to the ends of the curve and labeled critical regions. The area between the critical values is not shaded and is labeled non-critical region.

2-tail: HA contains the following symbol: ≠

The rejection area (critical region) is the area beyond the critical value. Its total area is α (the significance level of the hypothesis test). In a left-tail or right-tail test, that area lies in one tail. In a two-tail test, it is split between the two tails, with area α/2 in each.
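As an illustration of where the critical value sits, the sketch below finds it for each type of test. It assumes the test statistic follows the standard normal distribution (a z test); a t test would need the t distribution instead, which Python's standard library does not provide. The function name `critical_values` and the tail labels are hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the critical value(s) of a z test as a tuple.

    tail is "left", "right", or "two".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for tail in ("left", "right", "two"):
    rounded = [round(value, 3) for value in critical_values(0.05, tail)]
    print(tail, rounded)
```

With α = 0.05, the one-tail critical values are about ±1.645 and the two-tail critical values are about ±1.960.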

Step 4: Calculate the test statistic from sample data.

A Test Statistic is the value computed from sample data and used in making a decision about rejection of the Null Hypothesis.

Test statistic for proportion:

p = population proportion
q = 1 - p
n = sample size
\(\hat{p}=\frac{x}{n}\) sample proportion
\(z=\frac{\hat{p}-p}{\sqrt{\frac{p q}{n}}}\)
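The formula for the proportion test statistic can be checked with a short calculation. The numbers here are hypothetical and not from any study in these notes: 130 of 200 people in a sample have the characteristic, and the null hypothesis assumes p = 0.7. The function name `proportion_z_statistic` is also just for illustration.

```python
from math import sqrt

def proportion_z_statistic(x, n, p):
    """Compute z = (p_hat - p) / sqrt(p * q / n)."""
    p_hat = x / n  # sample proportion
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

# Hypothetical sample: x = 130 successes out of n = 200, with p = 0.7 assumed
z = proportion_z_statistic(x=130, n=200, p=0.7)
print(round(z, 2))  # -1.54
```

The negative sign shows that the sample proportion (0.65) is below the assumed population proportion (0.7).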

Test statistic for mean:

s = sample standard deviation
n = sample size
\(\mu\) = population mean
\(\overline{x}\) = sample mean
\(t=\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\)
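The test statistic for a mean can be computed the same way. The wait times below are invented purely to show the arithmetic, and the function name `mean_t_statistic` is hypothetical; `statistics.stdev` returns the sample standard deviation s.

```python
from math import sqrt
from statistics import mean, stdev

def mean_t_statistic(sample, mu):
    """Compute t = (x_bar - mu) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)  # sample mean
    s = stdev(sample)     # sample standard deviation
    return (x_bar - mu) / (s / sqrt(n))

# Hypothetical delivery wait times in minutes, with mu = 40 assumed
wait_times = [38, 45, 42, 50, 41, 47, 39, 44]
t = mean_t_statistic(wait_times, mu=40)
print(round(t, 2))
```

For this made-up sample the mean is 43.25 minutes, so the statistic is positive: the sample mean is above the assumed population mean of 40.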

Step 5: Calculate the p-value for your test statistic.

P-value: the probability of obtaining a result as extreme as, or more extreme than, your data, assuming the null hypothesis is correct.

Left tail test: \(H_A\) contains “<”. P-value = area to the left of the test statistic.
Right tail test: \(H_A\) contains “>”. P-value = area to the right of the test statistic.
Two tail test: \(H_A\) contains “≠”. P-value = twice the area in the tail beyond the test statistic.
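The three rules above can be sketched as one function. This sketch assumes a z test statistic, so the areas come from the standard normal curve; a t statistic would need the t distribution with n - 1 degrees of freedom instead. The function name `z_test_p_value` and the tail labels are hypothetical.

```python
from statistics import NormalDist

def z_test_p_value(z, tail):
    """Return the p-value for a z test statistic.

    tail is "left", "right", or "two".
    """
    cdf = NormalDist().cdf  # area to the left under the standard normal curve
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # twice the area beyond the statistic
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Hypothetical test statistic z = -1.54
print(round(z_test_p_value(-1.54, "left"), 4))
print(round(z_test_p_value(-1.54, "two"), 4))
```

For the same test statistic, the two-tail p-value is double the one-tail p-value on the side where the statistic falls.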

Step 6: Determine the rejection criteria.

Using probability: Reject \(H_0\) if the p-value \(\leq\alpha\) .
Using test statistics: Reject \(H_0\) if the test statistic is in the rejection region.

Step 7: Make One of Two Decisions About the Null Hypothesis

Reject the Null:

If P-value \(\leq\alpha\) , reject \(H_0\).
Type I error: rejecting \(H_0\) when it is actually true

Fail to Reject the Null:
- If P-value \(> \alpha\), fail to reject \(H_0\).
- Type II error: failing to reject \(H_0\) when it is actually false

Step 8: Make A Statement About The Claim Based On Your Decision About the Null Hypothesis.

	Null is the Claim	Alternate is the Claim
Reject the Null	“There is sufficient sample evidence to reject the claim that…”	“There is sufficient sample evidence to support the claim that…”
Fail to Reject the Null	“There is not sufficient sample evidence to reject the claim that…”	“There is not sufficient sample evidence to support the claim that…”
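Steps 7 and 8 together amount to a small lookup: compare the p-value with α, then choose the wording from the table based on which hypothesis is the claim. A minimal sketch, with the hypothetical function name `state_conclusion`:

```python
def state_conclusion(p_value, alpha, claim_is_null):
    """Return (decision, statement) following Steps 7 and 8.

    claim_is_null is True when the claim is the null hypothesis
    and False when the claim is the alternative hypothesis.
    """
    reject_null = p_value <= alpha
    decision = "Reject the Null" if reject_null else "Fail to Reject the Null"
    sufficiency = "sufficient" if reject_null else "not sufficient"
    action = "reject" if claim_is_null else "support"
    statement = f"There is {sufficiency} sample evidence to {action} the claim."
    return decision, statement

# Hypothetical p-values compared against alpha = 0.05
print(state_conclusion(0.002, 0.05, claim_is_null=True))
print(state_conclusion(0.40, 0.05, claim_is_null=False))
```

Note that the decision is always about the null hypothesis; only the final wording depends on which hypothesis holds the claim.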

Introduction to Hypothesis Testing: Steps 1-3

Remember to use the correct symbols.

Mean: \(\mu\)
Proportion: p
Standard deviation: \(\sigma\)

The mean IQ of statistics students is at least 110.
1. Claim: \( \mu \geq 110 \)
2. \(H_0\): \( \mu \geq 110 \) (Assume \(H_0: \mu = 110 \))
3. \(H_A\): \( \mu < 110 \)
4. The significant area (rejection area) is located where \( \mu \) is significantly less than 110; therefore, this is a Left-Tailed test.

The mean wait time for a GrubHub delivery is more than 40 minutes.
1. Claim: \( \mu > 40 \)
2. \(H_0\): \( \mu \leq 40 \) (Assume \(H_0: \mu = 40 \))
3. \(H_A\): \( \mu > 40 \)
4. The significant area (rejection area) is located where the mean wait time is significantly greater than 40 minutes; therefore, this is a Right-Tailed test.

The percentage of people who prefer milk chocolate over dark chocolate is 70% as claimed by Madison advertising agency.
1. Claim: \(p=0.7\)
2. \(H_0\): \(p=0.7\) (Assume \(H_0: p=0.7 \))
3. \(H_A\): \(p \neq 0.7\)
4. The significant area (rejection area) is located where \(p\) is significantly different from 0.7; therefore, this is a Two-Tailed test.

The percentage of all students who eat sushi at least once per week is less than 27%.
1. Claim: \(p < 0.27\)
2. \(H_0\): \(p \geq 0.27\) (Assume \(H_0: p=0.27 \))
3. \(H_A\): \(p < 0.27\)
4. The significant area (rejection area) is located where the proportion of all students who eat sushi is significantly less than 27%; therefore, this is a Left-Tailed test.

The mean IQ of statistics students is at most 110.
1. Claim: \( \mu \leq 110 \)
2. \(H_0\): \( \mu \leq 110 \) (Assume \(H_0: \mu = 110 \))
3. \(H_A\): \( \mu > 110 \)
4. The significant area (rejection area) is located where \( \mu \) is significantly greater than 110; therefore, this is a Right-Tailed test.

The percentage of all men who prefer milk chocolate over dark chocolate is different from 70% as claimed by Madison advertising agency.
1. Claim: \(p \neq 0.7\)
2. \(H_0\): \(p = 0.7\)
3. \(H_A\): \(p \neq 0.7\) (this can be translated as \(p<0.7\) or \(p>0.7\))
4. The significant area (rejection area) is located where \(p\) is significantly different from 0.7; therefore, this is a Two-Tailed test.
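The pattern in the examples above (the hypothesis with equality becomes the null, its complement becomes the alternative, and the symbol in the alternative decides the tail) can be sketched as a small lookup. The text symbols such as "<=" and "!=" and the function name `set_up_test` are conventions chosen for this illustration only.

```python
# Complement of each comparison symbol
COMPLEMENT = {"<=": ">", ">=": "<", "=": "!=", "<": ">=", ">": "<=", "!=": "="}

# The symbol in the alternative hypothesis determines the type of test
TAIL = {"<": "left", ">": "right", "!=": "two"}

def set_up_test(claim_symbol):
    """Return (null symbol, alternative symbol, tail) for a claim."""
    opposite = COMPLEMENT[claim_symbol]
    if claim_symbol in ("<=", ">=", "="):
        # The claim contains equality, so the claim is the null hypothesis
        null, alternative = claim_symbol, opposite
    else:
        # The claim has no equality, so the claim is the alternative hypothesis
        null, alternative = opposite, claim_symbol
    return null, alternative, TAIL[alternative]

print(set_up_test(">="))  # mean IQ is at least 110
print(set_up_test(">"))   # mean wait time is more than 40 minutes
print(set_up_test("!="))  # proportion is different from 0.7
```

Running the three calls reproduces the setups worked out by hand: a left-tailed test for "at least", a right-tailed test for "more than", and a two-tailed test for "different from".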