% 260s20Assignment.tex Tests: Part One Continued
\documentclass[12pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
%\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{comment}
%\usepackage{fullpage}
\oddsidemargin=0in % Good for US Letter paper
\evensidemargin=0in
\textwidth=6.3in
\topmargin=-1in
\headheight=0.2in
\headsep=0.5in
\textheight=9.4in
%\pagestyle{empty} % No page numbers

\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 260s20 Assignment Seven: More hypothesis testing, still Part One}}\footnote{Copyright information is at the end of the last page.}
%\vspace{1 mm}
\end{center}

\noindent The following homework problems are not to be handed in. They are preparation for Quiz 7 (Week of March 16th) and Term Test 2. \textbf{Please try each question before looking at the solution}. Use the formula sheet.

\begin{enumerate}

\item Based on a random sample from a Bernoulli distribution, testing a null hypothesis about $\theta$ yields $Z=-1.82$.
\begin{enumerate}
\item For $H_0: \theta=\theta_0$ versus $H_1: \theta \neq \theta_0$,
\begin{enumerate}
\item What is the $p$-value?
\item Do you reject $H_0$ at $\alpha = 0.05$?
\item What do you conclude? Pick one: $\theta<\theta_0$ ~~ $\theta>\theta_0$ ~~ $\theta=\theta_0$.
\end{enumerate}
\item For $H_0: \theta \leq \theta_0$ versus $H_1: \theta > \theta_0$,
\begin{enumerate}
\item What is the $p$-value?
\item Do you reject $H_0$ at $\alpha = 0.05$?
\item What do you conclude? Pick one: $\theta \leq \theta_0$ ~~ $\theta>\theta_0$.
\end{enumerate}
\item For $H_0: \theta \geq \theta_0$ versus $H_1: \theta < \theta_0$,
\begin{enumerate}
\item What is the $p$-value?
\item Do you reject $H_0$ at $\alpha = 0.05$?
\item What do you conclude? Pick one: $\theta \geq \theta_0$ ~~ $\theta<\theta_0$.
\end{enumerate}
\end{enumerate}
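
If you want to check your arithmetic, standard normal probabilities are easy to get from R's \texttt{pnorm} function. Here is a minimal sketch; which tail (or tails) to use follows from the alternative hypothesis.
\begin{verbatim}
# Checking the p-values for Z = -1.82
z = -1.82
2 * pnorm(-abs(z))  # Two-sided test: H1 is theta not equal to theta0
1 - pnorm(z)        # Right-tailed test: H1 is theta > theta0
pnorm(z)            # Left-tailed test: H1 is theta < theta0
\end{verbatim}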

\item For $i=1, \ldots, n$, let $Y_i = \beta x_i + E_i$, where
\begin{itemize}
\item[] $x_1, \ldots, x_n$ are fixed, known constants
\item[] $E_1, \ldots, E_n$ are independent and identically distributed Normal(0,$\sigma^2$) random variables. The parameter $\beta$ is unknown, but we will pretend that somehow the variance $\sigma^2$ is \emph{known}.
\end{itemize}
% For example, the $x_i$ values could be kilograms of bananas on display at store $i$ at the beginning of the day, and the $Y_i$ could be kilograms of bananas sold.
\begin{enumerate}
\item What is the distribution of $Y_i$? Include the parameters. Are the $Y_i$ independent?
\item Derive a formula for the MLE of $\beta$. Don't bother with the second derivative test.
\item What is the distribution of $\widehat{\beta}$? Include the parameters. Show some work.
\item Standardizing $\widehat{\beta}$, obtain a $Z$ statistic for testing $H_0: \beta=\beta_0$ against $H_1: \beta \neq \beta_0$.
\item Suppose $\sigma^2=4$ and the null hypothesis is $\beta=0$. Calculate the test statistic for these data:
\begin{center}
\begin{tabular}{lcccccc}
$x$ & 1.00 & 1.00  & 2.00 & 2.00 & 3.00 & 3.00 \\
$y$ & 4.29 & -0.59 & 4.66 & 1.71 & 2.50 & 2.98
\end{tabular}
\end{center}
\item What is the $p$-value? Show some work, including a picture.
\item Do you reject $H_0$ at $\alpha=0.01$? Answer Yes or No.
\item What do you conclude? Pick one: $\beta=0$ ~~~ $\beta>0$ ~~~ $\beta<0$.
\end{enumerate} % End of regression question
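
For a numerical check once the derivation is done, here is a minimal R sketch. It assumes your answer comes out to $\widehat{\beta} = \frac{\sum_{i=1}^n x_i Y_i}{\sum_{i=1}^n x_i^2}$, with $\widehat{\beta}$ normally distributed and $Var(\widehat{\beta}) = \sigma^2 / \sum_{i=1}^n x_i^2$.
\begin{verbatim}
# Test statistic for H0: beta = 0 with sigma^2 = 4 known
x = c(1.00,  1.00, 2.00, 2.00, 3.00, 3.00)
y = c(4.29, -0.59, 4.66, 1.71, 2.50, 2.98)
sigma = 2                       # sigma^2 = 4 is given
betahat = sum(x*y)/sum(x^2)     # Assumed form of the MLE
Z = (betahat - 0)*sqrt(sum(x^2))/sigma
Z                               # Test statistic
2*(1 - pnorm(abs(Z)))           # Two-sided p-value
\end{verbatim}
Note that this $Z$ test with $\sigma^2$ known is not the $t$ test that \texttt{lm} carries out, so the two $p$-values need not agree.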

\pagebreak
\item Here is an example from an earlier assignment. In a taste test, customers are choosing between two coffees. We adopt a Bernoulli model for the probability of choosing Coffee $A$, and we plan to test $H_0: \theta = \frac{1}{2}$ against $H_1: \theta \neq \frac{1}{2}$ at the 0.05 significance level using the test statistic
\begin{displaymath}
Z_n = \frac{\sqrt{n}(\overline{X}_n-\theta_0)}{\sqrt{\theta_0(1-\theta_0)}}.
\end{displaymath}
\begin{enumerate}
\item Derive a general formula for the power of the test of $H_0: \theta=\theta_0$ when the true parameter value is $\theta$. Use $\Phi(\cdot)$ to denote the cumulative distribution function of a standard normal.
\item Verify that your formula yields $\alpha$ when $\theta=\theta_0$.
\item Suppose that in the taste test, $\theta_0=0.5$ and the true value of $\theta$ is $0.45$, so that in the population of consumers being sampled, 45\% would prefer Coffee $A$. What is the approximate power of the test for $n=100$ participants? % 0.17
\item What is the approximate power for $n=783$? One of the numbers is off the charts; do your best. % 0.8002392
\end{enumerate}

\item Before actually carrying out a study, it's a good idea to decide on the sample size on some rational basis. For all statistical tests used in practice, the power gets closer and closer to one as the sample size increases, as long as the null hypothesis is false. That is, the test becomes more and more sensitive, and more able to detect the truth. This is the key to using statistical power to decide on a sample size. Choose a particular way the null hypothesis might be wrong, and also choose a desired power -- that is, a desired probability of correctly rejecting the null hypothesis for that particular scenario. Then increase the sample size until the desired power is attained.

Accordingly, please return to the appetizing example of the rat hairs in peanut butter. We are confident that the number of rat hairs in a jar of peanut butter follows a Poisson distribution, and there is a government standard that says the expected number of rat hairs in a 500g jar may be no more than 8. So using the traditional $\alpha=0.05$ significance level, we will test $H_0: \lambda \leq 8$ against $H_1: \lambda>8$ with the test statistic
\begin{displaymath}
Z_n = \frac{\sqrt{n}(\overline{X}_n-\lambda_0)}{\sqrt{\lambda_0}}.
\end{displaymath}
Suppose the true expected number of rat hairs is nine. What sample size is required for the probability of rejecting $H_0$ to be at least 0.80? The answer is an integer. Show your work. Hint: You can directly solve for $n$, and then increase your answer to the next integer.
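
After you have solved for $n$ by hand, a numerical check is easy. This minimal sketch assumes the usual normal approximation, under which $\overline{X}_n$ is approximately Normal$(\lambda, \lambda/n)$ when the true mean is $\lambda$, so that the power is $1-\Phi\left(\frac{z\sqrt{\lambda_0} - \sqrt{n}(\lambda-\lambda_0)}{\sqrt{\lambda}}\right)$, where $z$ is the critical value.
\begin{verbatim}
# Solving power = 0.80 for n, then rounding up
lambda0 = 8; lambda = 9
z = qnorm(0.95)           # Critical value of the one-sided Z test
sqrtn = (qnorm(0.80)*sqrt(lambda) + z*sqrt(lambda0))/(lambda - lambda0)
n = ceiling(sqrtn^2); n   # Increase to the next integer
\end{verbatim}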

\item Let $X_1, \ldots, X_n$ be a random sample from a Normal$(\mu,1)$ distribution. We plan to test $H_0:\mu \geq 0$ against $H_1: \mu<0$ with a $Z$ test, using the usual $\alpha=0.05$ significance level. We want to be able to reject $H_0$ with probability 0.80 when the true value of $\mu$ equals minus one. What sample size is required? That is, what is the smallest sample size that will get the job done?

\pagebreak
\item Let $X_1, \ldots, X_n$ be a random sample from a Normal$(\mu,\sigma^2)$ distribution, and suppose we are testing $H_0: \sigma^2=\sigma^2_0$ with a chi-squared test. Prove that the null hypothesis is rejected at significance level $\alpha$ with a two-sided test if and only if the $(1-\alpha)100\%$ confidence interval does not contain $\sigma^2_0$.

This example shows that, as one student pointed out after lecture, the correspondence between tests and confidence intervals does not depend on the distribution of the test statistic being symmetric.

\item Suppose that a null hypothesis is to be rejected when some test statistic $T$ exceeds a critical value $c$. Also suppose that $T$ is a continuous random variable with cumulative distribution function $F(t)$ when the null hypothesis is true. The distribution of $T$ is supported on a single interval (which might be infinite), guaranteeing that $F(t)$ has a unique inverse.
\begin{enumerate}
\item The $p$-value is a random variable, a function of $T$. Write a formula for the $p$-value.
\item Show that when the null hypothesis is true, the $p$-value has a uniform distribution on the interval from zero to one.
\end{enumerate}
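
The conclusion can also be seen in a simulation. Here is a minimal R sketch; the chi-squared null distribution is an arbitrary choice, just for illustration.
\begin{verbatim}
# When H0 is true and T is continuous, the p-value 1 - F(T)
# should be Uniform(0,1). Illustrated with T ~ chi-squared(5).
set.seed(9999)
Tstat = rchisq(10000, df = 5)     # Simulate T under H0
pval = 1 - pchisq(Tstat, df = 5)  # p-value of a right-tailed test
hist(pval)                        # Histogram should look flat
mean(pval < 0.05)                 # Should be close to 0.05
\end{verbatim}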

\item The objective of \emph{meta-analysis} is to pool the results of several studies on the same or related topics, and come to a single conclusion. For example, there might be multiple clinical trials of a new drug, with some results positive, and some inconclusive or negative. Usually the original data are not available, just the published statistical results. Suppose we have a collection of $m$ studies carried out by different investigators, each of which did some statistical test and came up with a $p$-value. Our null hypothesis is that the null hypothesis was true in \emph{all} of the studies.
\begin{enumerate}
\item What is the joint distribution of $P_1, \ldots, P_m$ under our null hypothesis that no effect was present in any of the studies?
\item What is the distribution of $Y_i = -2\ln(P_i)$? Show your work.
\item What is the distribution of $W=\sum_{i=1}^m Y_i$? No proof is required. Just write down the answer.
\item Why would a large value of $W$ be surprising if all $m$ null hypotheses were true?
\item What is the critical value of $W$? It's a quantile of the chi-squared distribution.
\item Suppose that 6 studies on a particular topic yield the following $p$-values: \texttt{0.255, 0.065, 0.268, 0.044, 0.080, 0.135}. Notice that only one test rejected the null hypothesis at $\alpha=0.05$, but all the $p$-values are on the small side. It's really hard to judge whether these results provide support for the idea being tested. Do the meta-analysis test. What do you conclude?
\end{enumerate}

\pagebreak
\item Here is the standard model for testing equality of $k$ means. There are $k$ ``groups'' of observations (data values), maybe from $k$ different experimental treatments. There are $n_j$ observations in group $j$, for $j = 1, \ldots, k$. The total number of observations is $n = \sum_{j=1}^k n_j$. Independently for $j = 1, \ldots, k$ and $i = 1, \ldots, n_j$, $X_{i,j} \sim N(\mu_j,\sigma^2)$. We seek to test $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ at significance level $\alpha$. We will use $F = \frac{W_1/\nu_1}{W_2/\nu_2}$.
\begin{enumerate}
\item Let $W_2 = \sum_{j=1}^k \frac{(n_j-1)S^2_j}{\sigma^2}$.
\begin{enumerate}
\item Just to establish that you understand the notation, give a formula for $S^2_j$.
\item What is the distribution of $W_2$, and why?
\item Verify that $W_2 = \frac{1}{\sigma^2}\sum_{j=1}^k\sum_{i=1}^{n_j}(X_{i,j}-\overline{X}_j)^2$.
\end{enumerate}
\item \label{decomp} Show $\sum_{j=1}^k\sum_{i=1}^{n_j}(X_{i,j}-\overline{X}_{\mbox{.}})^2 = \sum_{j=1}^k n_j(\overline{X}_j - \overline{X}_{\mbox{.}})^2 + \sum_{j=1}^k \sum_{i=1}^{n_j} (X_{i,j} - \overline{X}_j )^2$, where $\overline{X}_{\mbox{.}} = \sum_{j=1}^k\left(\frac{n_j}{n}\right) \overline{X}_j$.
\item Verify that $\overline{X}_{\mbox{.}}$ is the sample mean of all the observations.
\item If $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ is true, say all the $\mu_j=\mu$, what is the distribution of $\frac{1}{\sigma^2}\sum_{j=1}^k\sum_{i=1}^{n_j}(X_{i,j}-\overline{X}_{\mbox{.}})^2$? Why?
\item Let $W_1 = \frac{1}{\sigma^2}\sum_{j=1}^k n_j(\overline{X}_j - \overline{X}_{\mbox{.}})^2$. What is the distribution of $W_1$ if $H_0$ is true? Why?
\item How do you know that $W_1$ and $W_2$ are independent?
\item Write a formula for the $F$ statistic. What are the degrees of freedom?
\item Here's some vocabulary and notation.
\begin{itemize}
\item The \emph{Total Sum of Squares}, denoted by $SSTO = \sum_{j=1}^k\sum_{i=1}^{n_j}(X_{i,j}-\overline{X}_{\mbox{.}})^2$, is the sum of squared deviations of all the observations from the overall mean.
\item The \emph{Sum of Squares Within Groups} is denoted by $SSW = \sum_{j=1}^k \sum_{i=1}^{n_j} (X_{i,j} - \overline{X}_j )^2$. It's the amount of variation still unexplained once you know group membership.
\item The \emph{Sum of Squares Between Groups} is denoted by $SSB = \sum_{j=1}^k n_j(\overline{X}_j - \overline{X}_{\mbox{.}})^2$. It's the (weighted) sum of squared deviations of the group means from the overall mean. Since $SSTO$ is total variation and $SSW$ is variation that is still unexplained after you know group membership, $SSB$ must be the amount of variation that is explained by group membership.
\item $R^2 = \frac{SSB}{SSTO}$ is the \emph{proportion} of variation in the observations that is explained by group membership. It's an index of how strongly the $X_{i,j}$ values are related to group membership.
\end{itemize}
Finally, here is a nice easy question about all this. Show $F = \left(\frac{n-k}{k-1}\right)\left(\frac{R^2}{1-R^2}\right)$.

\pagebreak
\item Frequently, reports will give a table of $n$'s, means and standard deviations, but no hypothesis test. Fortunately, the information in the table is all you need. You may have wondered which U~of~T campus hands out the highest marks. Here are some sample data collected long ago. Normality is not a bad assumption.
\begin{center}
\begin{tabular}{lccc}
\hline
Campus & $n$ & Mean & Standard Deviation \\ \hline
SG   & 117 & 2.81 & 0.589 \\
UTM  & ~32 & 2.70 & 0.478 \\
UTSC & ~51 & 2.61 & 0.553 \\ \hline
\end{tabular}
\end{center}
\begin{enumerate}
\item Carry out the $F$ test using the $\alpha = 0.05$ significance level. The critical value is 3.04. I got it using R, not the awful tables of the $F$ distribution. What do you conclude?
\item What is $R^2$? Give a number. Does this seem like a strong relationship between campus and GPA?
\end{enumerate}
\end{enumerate}
\end{enumerate}
\end{enumerate} % End of all the questions

% Some ideas
% CI <=> test
% p ~ U(0,1)
% Meta-analysis of p-values.
% Independent exponentials, exact F test for H0: lambda1 = lambda2 Done
% Check the HW problems of course.

\vspace{90mm}
\vspace{3mm} \hrule
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\vspace{3mm}
\noindent This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Mathematical and Computational Sciences, University of Toronto. It is licensed under a
\href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}
{Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website:
\begin{center}
\href{http://www.utstat.toronto.edu/~brunner/oldclass/260s20}
{\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/260s20}}
\end{center}

\end{document}

\begin{verbatim}

# Regression question
> rbind(x,y)
  [,1]  [,2] [,3] [,4] [,5] [,6]
x 1.00  1.00 2.00 2.00  3.0 3.00
y 4.29 -0.59 4.66 1.71  2.5 2.98
> summary(lm(y~0+x))

Call:
lm(formula = y ~ 0 + x)

Residuals:
      1       2       3       4       5       6
 3.1157 -1.7643  2.3114 -0.6386 -1.0229 -0.5429

Coefficients:
  Estimate Std. Error t value Pr(>|t|)
x   1.1743     0.3771   3.114   0.0264 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.996 on 5 degrees of freedom
Multiple R-squared:  0.6598,    Adjusted R-squared:  0.5917
F-statistic: 9.695 on 1 and 5 DF,  p-value: 0.02644

# Power of the coffee taste test.
pow = function(n,theta0=1/2,theta=0.45,alpha=0.05)
    {
    z = qnorm(1-alpha/2)
    left = sqrt(n)*(theta0-theta)/sqrt(theta*(1-theta))
    right = z*sqrt(theta0*(1-theta0)/(theta*(1-theta)))
    value = 1 - pnorm(left+right) + pnorm(left-right)
    return(value)
    }
n = 100
1 - pnorm(sqrt(n)*0.1005+1.9699) + pnorm(sqrt(n)*0.1005-1.9699)

> # Meta-analysis
> qchisq(0.95,12)
[1] 21.02607
> y = c(0.255, 0.065, 0.268, 0.044, 0.080, 0.135)
> W = sum(-2*log(y)); W
[1] 26.13681

# Tri-campus ANOVA
n = c(117,32,51); N=sum(n); k = length(n)
xbar = c(2.81,2.70,2.61)
s2 = c(0.589,0.478,0.553)^2
xbardot = sum(n*xbar)/N; xbardot
SSW = sum((n-1)*s2)
SSB = sum(n*(xbar-xbardot)^2)
Fstat = (SSB/(k-1)) / (SSW/(N-k)); Fstat

> n = c(117,32,51); N=sum(n); k = length(n)
> xbar = c(2.81,2.70,2.61)
> s2 = c(0.589,0.478,0.553)^2
>
> xbardot = sum(n*xbar)/N; xbardot
[1] 2.7414
> SSW = sum((n-1)*s2)
> SSB = sum(n*(xbar-xbardot)^2)
> Fstat = (SSB/(k-1)) / (SSW/(N-k)); Fstat
[1] 2.337599

aggregate(UNIVGPA,by=list(Campus),FUN=mean)
aggregate(UNIVGPA,by=list(Campus),FUN=sd)
aggregate(UNIVGPA,by=list(Campus),FUN=length)
summary(lm(UNIVGPA~Campus))

> aggregate(UNIVGPA,by=list(Campus),FUN=mean)
  Group.1        x
1      SG 2.813248
2     UTM 2.695312
3    UTSC 2.613333
> aggregate(UNIVGPA,by=list(Campus),FUN=sd)
  Group.1         x
1      SG 0.5886730
2     UTM 0.4779120
3    UTSC 0.5532582
> aggregate(UNIVGPA,by=list(Campus),FUN=length)
  Group.1   x
1      SG 117
2     UTM  32
3    UTSC  51
> summary(lm(UNIVGPA~Campus))

Call:
lm(formula = UNIVGPA ~ Campus)

Residuals:
     Min       1Q   Median       3Q      Max
-1.32325 -0.38980  0.04072  0.37520  1.18667

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.81325    0.05211  53.989   <2e-16 ***
CampusUTM   -0.11794    0.11244  -1.049   0.2955
CampusUTSC  -0.19991    0.09457  -2.114   0.0358 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5636 on 197 degrees of freedom
Multiple R-squared:  0.02352,   Adjusted R-squared:  0.01361
F-statistic: 2.373 on 2 and 197 DF,  p-value: 0.09588
\end{verbatim}
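
One more check on the tri-campus numbers: verifying the identity F = ((n-k)/(k-1)) * (R^2/(1-R^2)) from the ANOVA question, continuing with the objects defined in the tri-campus code above (where N plays the role of the question's n).

\begin{verbatim}
SSTO = SSB + SSW                 # Sum of squares decomposition
Rsq = SSB/SSTO; Rsq              # R-squared
((N-k)/(k-1)) * Rsq/(1-Rsq)      # Should equal Fstat = 2.337599
\end{verbatim}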