Semester 4: Testing of Statistical Hypothesis
Statistical Hypothesis: Null and Alternative, Simple and Composite, Critical region, Type-I and II errors, Most Powerful and Uniformly Most powerful tests, Neyman Pearson Lemma
Null and Alternative Hypothesis
The null hypothesis is a statement suggesting no effect or no difference and serves as the default position. The alternative hypothesis proposes the existence of an effect or difference. Testing involves deciding whether to reject the null hypothesis in favor of the alternative.
Simple and Composite Hypothesis
A simple hypothesis specifies the value of a parameter exactly (for example, H0: μ = 0), while a composite hypothesis covers a range of values (for example, H1: μ > 0). The choice of hypothesis affects the tools and approaches used in statistical testing.
Critical Region
The critical region is the set of all sample outcomes that lead to rejection of the null hypothesis. Its size is fixed by the significance level α, the probability of making a Type I error.
Type-I and Type-II Errors
A Type-I error occurs when a true null hypothesis is rejected; its probability is the significance level α. A Type-II error occurs when a false null hypothesis is not rejected; its probability is denoted β, and 1 − β is the power of the test. Balancing these two errors is crucial in hypothesis testing.
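As a concrete illustration (a minimal simulation sketch; NumPy and SciPy are assumed available, and the sample size, level, and alternative mean below are arbitrary choices), both error rates of a one-sided z-test for a normal mean can be estimated empirically:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n, alpha, mu_alt = 25, 0.05, 0.5         # assumed sample size, level, alternative mean
    z_crit = norm.ppf(1 - alpha)             # one-sided critical value for H0: mu = 0

    def rejection_rate(mu, reps=100_000):
        # Fraction of simulated samples whose z-statistic falls in the critical region
        samples = rng.normal(mu, 1.0, size=(reps, n))
        z = samples.mean(axis=1) * np.sqrt(n)    # sigma = 1 assumed known
        return np.mean(z > z_crit)

    print("Type-I error rate :", rejection_rate(0.0))         # close to alpha
    print("Type-II error rate:", 1 - rejection_rate(mu_alt))  # beta; power = 1 - beta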
Most Powerful and Uniformly Most Powerful Tests
A most powerful test maximizes the power, the probability of correctly rejecting the null hypothesis, at a given significance level against a specific simple alternative. A uniformly most powerful (UMP) test retains this property simultaneously for every parameter value under a composite alternative hypothesis.
Neyman-Pearson Lemma
This lemma provides the method for deriving the most powerful test of a simple null hypothesis against a simple alternative. It states that the most powerful test of a given size rejects the null hypothesis when the likelihood ratio of the alternative to the null, f1(x)/f0(x), exceeds a constant k chosen so that the test attains that size.
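As a worked instance (a standard textbook example; the normal model and the particular hypotheses are chosen purely for illustration): for X1, ..., Xn i.i.d. N(θ, 1), testing H0: θ = 0 against H1: θ = 1, the likelihood ratio is

    \[
    \Lambda(\mathbf{x}) = \frac{\prod_{i=1}^{n} f_1(x_i)}{\prod_{i=1}^{n} f_0(x_i)}
                        = \exp\!\left( n\bar{x} - \tfrac{n}{2} \right),
    \]

which is increasing in the sample mean, so rejecting for Λ(x) > k is equivalent to rejecting for x̄ > c; since x̄ ~ N(0, 1/n) under H0, a size-α test takes c = z_α/√n.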
Likelihood ratio test, Tests of mean and variance of normal populations, Equality of variances
Likelihood Ratio Test
The likelihood ratio test is a statistical method for comparing the fit of two models. It is based on the ratio of the maximized likelihoods of the two models, summarized by the statistic −2 log Λ, which for nested models is asymptotically chi-square distributed under the null hypothesis (Wilks' theorem). It is commonly used in hypothesis testing to evaluate complex models against simpler ones.
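A minimal sketch (the data and the known-variance normal model are illustrative assumptions): for H0: μ = 0 against an unrestricted mean with σ = 1 known, −2 log Λ simplifies to n·x̄² and is compared with a chi-square distribution on one degree of freedom:

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(1)
    x = rng.normal(0.3, 1.0, size=50)      # illustrative data, sigma = 1 known

    n, xbar = len(x), x.mean()
    # -2 log Lambda for H0: mu = 0 vs H1: mu unrestricted (sigma = 1 known)
    lr_stat = n * xbar**2
    p_value = chi2.sf(lr_stat, df=1)       # one free parameter difference
    print(f"-2 log Lambda = {lr_stat:.3f}, p = {p_value:.4f}")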
Tests of Mean for Normal Populations
In statistics, testing the mean of a normal population uses the z-test or the t-test. The z-test applies when the population variance is known, while the t-test applies when the population variance is unknown and must be estimated from the sample; the distinction matters most for small samples. The fundamental goal is to determine whether the sample mean differs significantly from a hypothesized population mean.
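A minimal sketch of both tests (SciPy assumed; the data, hypothesized mean, and the "known" σ are illustrative):

    import numpy as np
    from scipy import stats

    x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9])   # illustrative sample
    mu0 = 5.0

    # z-test: population standard deviation assumed known (hypothetical sigma)
    sigma = 0.3
    z = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))
    p_z = 2 * stats.norm.sf(abs(z))            # two-sided p-value

    # t-test: variance unknown, estimated from the sample
    t, p_t = stats.ttest_1samp(x, mu0)
    print(f"z = {z:.3f} (p = {p_z:.4f}),  t = {t:.3f} (p = {p_t:.4f})")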
Tests of Variance for Normal Populations
Testing the variance of a normal population generally uses the chi-square test, which assesses whether a sample variance differs from a hypothesized population variance σ0². Under the null hypothesis the statistic (n − 1)s²/σ0² follows a chi-square distribution with n − 1 degrees of freedom. Such tests matter because the spread or variability of the data affects many downstream statistical analyses.
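A minimal sketch (illustrative data and hypothesized variance; SciPy assumed):

    import numpy as np
    from scipy.stats import chi2

    x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9])   # illustrative sample
    sigma0_sq = 0.04                             # hypothesized population variance

    n = len(x)
    stat = (n - 1) * x.var(ddof=1) / sigma0_sq   # ~ chi-square(n-1) under H0
    # Two-sided p-value: double the smaller tail probability
    p = 2 * min(chi2.cdf(stat, n - 1), chi2.sf(stat, n - 1))
    print(f"chi-square = {stat:.3f}, p = {p:.4f}")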
Equality of Variances
Testing for equality of variances, often done with the F-test, compares two sample variances via their ratio to judge whether the samples come from populations with the same variance. This comparison is important in many statistical applications, especially in ANOVA, where equal variances across groups is a critical condition for valid results.
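A minimal sketch (illustrative samples; SciPy assumed):

    import numpy as np
    from scipy.stats import f

    x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])        # illustrative samples
    y = np.array([4.2, 5.9, 5.5, 4.1, 6.0, 4.4, 5.8])

    F = x.var(ddof=1) / y.var(ddof=1)                   # ratio of sample variances
    dfn, dfd = len(x) - 1, len(y) - 1
    p = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))  # two-sided p-value
    print(f"F = {F:.3f}, p = {p:.4f}")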
Chi-square tests, Distribution of quadratic forms, Analysis of Variance (ANOVA), Correlation and Regression testing
Chi-square tests
Chi-square tests are statistical methods used to determine whether there is a significant association between categorical variables. The chi-square statistic measures how far observed counts deviate from the counts expected under the null hypothesis. Common applications include the chi-square test for independence and the goodness-of-fit test, with the chi-square distribution used to interpret the results in hypothesis testing frameworks.
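A minimal sketch of the test for independence (the contingency table is illustrative; SciPy assumed):

    import numpy as np
    from scipy.stats import chi2_contingency

    # Illustrative 2x3 contingency table of observed counts
    observed = np.array([[20, 30, 25],
                         [30, 20, 35]])
    stat, p, dof, expected = chi2_contingency(observed)
    print(f"chi-square = {stat:.3f}, df = {dof}, p = {p:.4f}")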
Distribution of quadratic forms
The distribution of quadratic forms concerns expressions of the form x'Ax, where x is a random vector and A a symmetric matrix; sums of squares of random variables are the leading special case. Quadratic forms underpin ANOVA and regression analysis, where their distributions justify the partitioning of variance and tests for equality of means.
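A central special case (a standard theorem, stated here without proof) connects quadratic forms to the chi-square distribution:

    \[
    \mathbf{x} \sim N_n(\mathbf{0}, I),\quad A = A^{\top} = A^{2},\quad
    \operatorname{rank}(A) = r \;\Longrightarrow\; \mathbf{x}^{\top} A \mathbf{x} \sim \chi^{2}_{r}.
    \]

Cochran's theorem builds on this to show that the quadratic forms in an ANOVA sum-of-squares decomposition are independent chi-square variables.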
Analysis of Variance (ANOVA)
ANOVA is a statistical method used to compare means among three or more groups to determine whether at least one group mean is significantly different from the others. It partitions total variance into components attributable to different sources, assessing them through F-statistics. Variants include one-way ANOVA, two-way ANOVA, and repeated measures ANOVA.
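A minimal sketch of a one-way ANOVA (the three groups are illustrative data; SciPy assumed):

    import numpy as np
    from scipy.stats import f_oneway

    # Illustrative measurements from three groups
    g1 = np.array([4.1, 5.0, 4.8, 5.2, 4.6])
    g2 = np.array([5.5, 6.1, 5.8, 6.0, 5.4])
    g3 = np.array([4.9, 5.2, 5.1, 4.7, 5.3])

    F, p = f_oneway(g1, g2, g3)   # one-way ANOVA F-statistic
    print(f"F = {F:.3f}, p = {p:.4f}")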
Correlation and Regression testing
Correlation analysis quantifies the degree of association between two variables, providing correlation coefficients like Pearson's r. Regression analysis extends this by modeling the relationship between a dependent variable and one or more independent variables, allowing predictions. The assumptions and diagnostics of regression models are crucial for valid inference.
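A brief sketch (the simulated linear relationship and seed are illustrative; SciPy is assumed):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.normal(size=40)
    y = 2.0 * x + rng.normal(scale=0.5, size=40)   # illustrative linear relation

    r, p_r = stats.pearsonr(x, y)                  # correlation test (H0: rho = 0)
    res = stats.linregress(x, y)                   # slope test (H0: slope = 0)
    print(f"r = {r:.3f} (p = {p_r:.4g}); slope = {res.slope:.3f} (p = {res.pvalue:.4g})")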
Exact tests based on t distribution for one sample and two samples, Variance known and unknown
Introduction to t Distribution
The t distribution is a family of distributions that are symmetric and bell-shaped, similar to the standard normal distribution, but have heavier tails. It is used in situations where sample sizes are small, and the population standard deviation is unknown.
One Sample t Test
The one sample t test is used to determine whether the mean of a single sample differs from a hypothesized value (usually a population mean μ0). It applies when the population variance is unknown. The test statistic is t = (x̄ − μ0) / (s / √n): the difference between the sample mean and the hypothesized mean divided by the estimated standard error.
Two Sample t Test (Independent Samples)
The independent two sample t test compares the means of two independent groups. There are two scenarios based on the variance assumption: equal variances and unequal variances. When the variances are assumed equal, a pooled variance estimate is used in the test statistic; when they are not, Welch's approximation with separate variance estimates and adjusted degrees of freedom is used.
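A minimal sketch of both variants (illustrative groups; SciPy's ttest_ind handles both via the equal_var flag):

    import numpy as np
    from scipy.stats import ttest_ind

    a = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])        # illustrative groups
    b = np.array([4.2, 4.5, 4.9, 4.4, 4.8, 4.3, 4.6])

    t_pooled, p_pooled = ttest_ind(a, b, equal_var=True)    # pooled-variance t-test
    t_welch,  p_welch  = ttest_ind(a, b, equal_var=False)   # Welch's t-test
    print(f"pooled: t = {t_pooled:.3f}, p = {p_pooled:.4f}")
    print(f"Welch : t = {t_welch:.3f}, p = {p_welch:.4f}")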
Paired Sample t Test
The paired sample t test, also known as the dependent t test, is used when there are two measurements taken on the same group (like before and after treatment). The differences between pairs are calculated, and the t test is performed on these differences, assessing if the mean difference is significantly different from zero.
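A minimal sketch (illustrative before/after measurements; SciPy assumed):

    import numpy as np
    from scipy.stats import ttest_rel

    before = np.array([7.2, 6.8, 8.1, 7.5, 6.9, 7.8])   # illustrative paired data
    after  = np.array([6.8, 6.5, 7.6, 7.4, 6.4, 7.2])

    t, p = ttest_rel(before, after)   # t-test on the pairwise differences
    print(f"t = {t:.3f}, p = {p:.4f}")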
Variance Known
When population variance is known, a z test is typically used instead of a t test. This scenario is less common in practice as population variances are often unknown. The z test uses the normal distribution.
Variance Unknown
When the population variance is unknown, the t test is applied for both one sample and two sample tests. This is more common in practical applications as sample data is used to estimate unknown population parameters.
Conclusion
The choice between one sample, two sample, and paired sample t tests is driven by the study design and the nature of data. Understanding when to apply each test is critical in hypothesis testing.
Nonparametric methods: Confidence intervals for quantiles, Tolerance limits, Sign test, Wilcoxon test
Confidence Intervals for Quantiles
Nonparametric methods estimate confidence intervals for quantiles without assuming a specific distribution. The classical approach uses order statistics, whose coverage probabilities follow from the binomial distribution; the bootstrap, which resamples the data to build an empirical distribution, is a common alternative.
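A minimal sketch of the order-statistic interval for the median (the exponential data, seed, and the choice of order statistics j and k are illustrative):

    import numpy as np
    from scipy.stats import binom

    rng = np.random.default_rng(3)
    x = np.sort(rng.exponential(size=30))   # illustrative skewed sample
    n, p = len(x), 0.5                      # p = 0.5 targets the median

    # Interval (X_(j), X_(k)), 1-indexed; its coverage probability is
    # P(j <= B <= k-1) for B ~ Binomial(n, p), independent of the distribution
    j, k = 10, 21
    coverage = binom.cdf(k - 1, n, p) - binom.cdf(j - 1, n, p)
    print(f"CI for median: ({x[j-1]:.3f}, {x[k-1]:.3f}), coverage = {coverage:.3f}")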
Tolerance Limits
Tolerance limits provide bounds within which a specified proportion of the population is expected to fall, with a stated level of confidence. Nonparametric methods give these limits when the data do not satisfy normality assumptions; the limits are typically based on order statistics of the sample.
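For the one-sided case, for example, the sample maximum is an upper tolerance limit: the probability that it exceeds at least a proportion p of the population is 1 − p^n, which gives the smallest sample size achieving confidence γ (a minimal sketch; p and γ are illustrative choices):

    import math

    p, gamma = 0.95, 0.95          # proportion covered, confidence level
    # Smallest n with 1 - p**n >= gamma, i.e. n >= log(1 - gamma) / log(p)
    n = math.ceil(math.log(1 - gamma) / math.log(p))
    print(f"n = {n} observations needed")   # n = 59 for p = gamma = 0.95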
Sign Test
The sign test is a nonparametric test that compares the median of a single sample to a hypothesized value. It works by analyzing the signs (+ or -) of the differences between observed values and the hypothesized value, making it useful for ordinal data or when the assumptions of normality are not met.
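A minimal sketch using the exact binomial distribution of the positive signs (the data and hypothesized median are illustrative; SciPy's binomtest is assumed available):

    import numpy as np
    from scipy.stats import binomtest

    x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9, 5.4, 4.6])
    m0 = 5.0                                  # hypothesized median

    signs = np.sign(x - m0)
    n_pos = int(np.sum(signs > 0))
    n = int(np.sum(signs != 0))               # ties with m0 are discarded
    result = binomtest(n_pos, n, p=0.5)       # under H0, P(+) = 0.5
    print(f"positive signs: {n_pos}/{n}, p = {result.pvalue:.4f}")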
Wilcoxon Test
The Wilcoxon test includes the Wilcoxon signed-rank test for paired observations and the Wilcoxon rank-sum test for independent samples. These tests assess whether there are differences between groups without the assumption of normally distributed data. They utilize ranks instead of raw data, making them robust against outliers.
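A minimal sketch of both variants (illustrative data; SciPy is assumed, with mannwhitneyu as the equivalent of the rank-sum test):

    import numpy as np
    from scipy.stats import wilcoxon, mannwhitneyu

    before = np.array([7.2, 6.8, 8.1, 7.5, 6.9, 7.8, 7.1, 7.4])
    after  = np.array([6.8, 6.5, 7.6, 7.4, 6.4, 7.2, 7.0, 6.9])
    w, p_w = wilcoxon(before, after)          # signed-rank test for paired data

    a = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])
    b = np.array([4.2, 4.5, 4.9, 4.4, 4.8, 4.3])
    u, p_u = mannwhitneyu(a, b)               # rank-sum test for independent samples
    print(f"signed-rank: W = {w}, p = {p_w:.4f}; rank-sum: U = {u}, p = {p_u:.4f}")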
