STAT 2000 – Assignment 1

Question 2

Part (a) [3 marks] Repeat Part (f) from the previous question, but assume now that the population standard deviation is unknown, and must be estimated from the dataset.

Use qt to determine the critical value (t∗) for this confidence interval, and save it as an object named TSTAR. Print this value out as well. In particular, the code qt(p, df) returns the value x such that P(X < x) = p, where X is a random variable with a t-distribution with df degrees of freedom.
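As an illustration of how qt behaves, here is a minimal sketch using made-up inputs. The confidence level and sample size below are assumptions chosen for the example and are not taken from the assignment's dataset; the names conf_level, n, alpha, and t_crit are hypothetical.

```r
# Hypothetical inputs, for illustration only (not the assignment's values).
conf_level <- 0.90   # assumed confidence level
n <- 12              # assumed sample size
df <- n - 1          # degrees of freedom for a one-sample t interval

# A two-sided interval leaves alpha / 2 in each tail,
# so the critical value is the quantile at 1 - alpha / 2.
alpha <- 1 - conf_level
t_crit <- qt(1 - alpha / 2, df)
print(t_crit)

# Sanity check: pt() undoes qt(), so this recovers 1 - alpha / 2 = 0.95.
print(pt(t_crit, df))
```

Note that the probability passed to qt is a lower-tail probability, which is why a 90% two-sided interval uses 0.95 rather than 0.90.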

#Delete this line (including the # symbol) and place your code here.

Now, use TSTAR to calculate the margin of error for this confidence interval. Save this as an object named MOE, and print it out as well.
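For reference, the margin of error of a one-sample t interval is usually computed as t∗ × s / √n, where s is the sample standard deviation. The sketch below applies that formula to a made-up vector; the data, the 95% confidence level, and the names x, s, t_crit, moe, and ci are all assumptions for illustration.

```r
# Hypothetical sample, for illustration only.
x <- c(10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7)
n <- length(x)
s <- sd(x)                       # sample standard deviation estimates sigma
t_crit <- qt(0.975, df = n - 1)  # assumes a 95% two-sided interval

moe <- t_crit * s / sqrt(n)      # margin of error
ci <- mean(x) + c(-1, 1) * moe   # lower and upper limits
print(moe)
print(round(ci, 2))
```

The interval is symmetric about the sample mean, so the two limits are simply the mean minus and plus the margin of error.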

#Delete this line (including the # symbol) and place your code here.

Finally, use MOE to calculate the confidence interval. Type out the interval below, to two decimal places.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Also, provide an interpretation of this confidence interval.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Part (b) [4 marks] Also, repeat Part (g) from the previous question, but again assume that the population standard deviation is unknown and must be estimated from the dataset.

First, write out your hypothesis statements by augmenting the LaTeX code below. You may replace the {} brackets with either =, <, >, or any number.

H0: μ {} {} vs. Ha: μ {} {}

Next, calculate the test statistic for this test. Name the result TSTAT. Also, print the value out.

#Delete this line (including the # symbol) and place your code here.

Now, use the pt function to determine the p-value of this test. In particular, by using the code pt(x, df), you will be given the probability P(X < x), where X is a random variable with a t-distribution with df degrees of freedom.
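Because pt returns a lower-tail probability, the way it is used depends on the direction of the alternative hypothesis. The sketch below shows the three common cases with a made-up test statistic and degrees of freedom; the values and the names t_stat, p_lower, p_upper, and p_two are assumptions for illustration.

```r
# Hypothetical test statistic and degrees of freedom, for illustration only.
t_stat <- -1.8
df <- 14

p_lower <- pt(t_stat, df)                      # P(X < t_stat): lower-tailed test
p_upper <- pt(t_stat, df, lower.tail = FALSE)  # P(X > t_stat): upper-tailed test
p_two   <- 2 * pt(-abs(t_stat), df)            # two-tailed test

print(c(lower = p_lower, upper = p_upper, two_sided = p_two))
```

Using -abs(t_stat) in the two-tailed case gives the same result whether the observed statistic is positive or negative.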

#Delete this line (including the # symbol) and place your code here.

Finally, based on the p-value, provide a fully-worded conclusion to this test.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Part (c) [1 mark] Provide an interpretation of the p-value calculated in Part (b).

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Part (d) [1 mark] Could the interval calculated in Part (a) have been used to conduct the hypothesis test in Part (b)? If so, why and what would the conclusion be? If not, why not?

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Part (e) [2 marks] Use the qt function to calculate the critical value of the test in Part (b).

#Delete this line (including the # symbol) and place your code here.

Explain what your conclusion would be by using the critical value method.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Part (f) [2 marks] Use the qnorm function to calculate the critical value of the test in Part (g) from the previous question.

#Delete this line (including the # symbol) and place your code here.

Compare this critical value to the critical value calculated in Part (e) of this question.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and do not remove the asterisks.

Part (g) [3 marks] You can also use the function t.test to easily conduct a t-test in R and calculate a confidence interval. The syntax of this function is t.test(x, alternative = …, mu = …). In this function:

The vector x is the dataset we provide.

The argument alternative allows us to specify whether the test is supposed to be upper-tailed (alternative = "greater"), lower-tailed (alternative = "less"), or two-tailed (alternative = "two.sided").

The value mu is the value for μ that we see in the null hypothesis.
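The arguments described above can be seen in action in the following minimal sketch. The data vector, the null value mu0, and the choice of a lower-tailed alternative are all assumptions made for illustration; they are not the assignment's values.

```r
# Hypothetical data and null value, for illustration only.
x <- c(10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7)
mu0 <- 10.3

result <- t.test(x, alternative = "less", mu = mu0)
print(result)

# Individual pieces can be extracted from the returned object.
result$statistic   # t statistic
result$parameter   # degrees of freedom
result$p.value     # p-value

# The statistic matches the formula (xbar - mu0) / (s / sqrt(n)).
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))
print(t_manual)
```

When a one-sided alternative is specified, the confidence interval that t.test reports is also one-sided; a two-sided interval requires alternative = "two.sided".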

Use the t.test function to repeat your work from Part (b).

#Delete this line (including the # symbol) and place your code here.

Question 3

A consumer protection board would like to conduct a hypothesis test to determine whether or not the true weight of a brand of potato chips is below 200 grams. The population standard deviation is assumed to be 1.5g, and we will be using a sample of size 17, assuming that the weights are Normally distributed.

Part (a) [2 marks] Explain what a Type I error and a Type II error represent in this scenario.

Do this work separately on paper. Refer to the assignment instructions for details on how to format your work. Show all of your work for all written questions.

Part (b) [2 marks] Is a Z-test or a t-test the appropriate test in this situation? Explain why.

Part (c) [3 marks] What is the rejection region of this test if we want a significance level of 1%? Instead of using tables, use pnorm or qnorm, and write out the function you use in your written work.

Part (d) [3 marks] Assuming that the true weight of this brand of potato chips is 198.4g, what is the power, and the probability of a Type II error in this test? Instead of using tables, use pnorm or qnorm, and write out the function you use in your written work.

Part (e) [2 marks] Suppose that we increase our sample size. What effect will this have on our Type I and Type II error rates?

Part (f) [1 mark] Suppose that the true mean is 198.2g, and not 198.4g. What effect will this have on the power of the test?

Part (g) [2 marks] Instead of assuming a significance level of 1%, suppose that our rejection rule is to reject if x̄ < 198. What is the Type I error rate of this test?

Question 4

The developer of a website would like to test whether the average daily users of his website differs from the historical average, which he knows to be 5000 users per day.

Part (a) [2 marks] Assuming a standard deviation of 95, what is the rejection region of this test if we want a significance level of 5%? Instead of using tables, use pnorm or qnorm, and write out the function you use in your written work.

Part (b) [2 marks] Assuming that the true new average daily users is 5400, what is the power, and the probability of a Type II error in this test? Instead of using tables, use pnorm or qnorm, and write out the function you use in your written work.

Part (c) [2 marks] Assuming that the standard deviation of users per day is 95, what sample size is needed to obtain a 95% confidence interval with a margin of error of 50 people?

Part (d) [1 mark] Without repeating all of the calculations in the previous part, what sample size is needed to obtain a 95% confidence interval with a margin of error of 25 people?