Semester 4: M.Sc. Biotechnology Syllabus 2023-2024

  • Statistics: scope; collection, classification, and tabulation of statistical data; diagrammatic representation; graphs, graph drawing, graph paper, plotted curves; sampling methods and standard errors; random sampling, use of random numbers; expectation of sample estimates; means, confidence limits, standard errors, variance

    Statistics Scope and Collection
    • Collection of Statistical Data

      The process of gathering quantitative and qualitative data from various sources. Techniques include surveys, experiments, and observational studies. Effective collection ensures accurate analysis.

    • Classification of Data

      Data is organized into categories for analysis. The two main types are qualitative (categorical) and quantitative (numerical). Classification enhances data understanding and aids in comparison.

    • Tabulation of Data

      The systematic arrangement of data in tables to summarize and facilitate analysis. Tables can be one-way or two-way, and they provide a clear overview of the data.

    • Diagrammatic Representation

      Visual tools like graphs and charts are used to illustrate data relationships and trends. This method simplifies complex information for better comprehension.

    • Graphs and Graph Drawing

      Graphs such as bar graphs, histograms, and pie charts visually represent data. Proper scaling and labeling are essential for accuracy.

    • Graph Paper and Plotted Curves

      Graph paper is used for accurately plotting data points. Plotted curves help in visualizing trends within the data.

    • Sampling Methods

      Sampling techniques allow data collection from a subset of the population. Major types include random sampling, systematic sampling, and stratified sampling.

    • Standard Errors

      Standard error measures the accuracy of sample estimates. It indicates the extent to which a sample statistic is expected to vary from the population parameter.

    • Random Sampling

      A technique where every individual has an equal chance of being selected. This method reduces bias and improves the representativeness of the sample.

    • Use of Random Numbers

      Random numbers are utilized in sampling to ensure that selections are unbiased. Tools and software can generate random numbers for various applications.
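
      As a brief illustration (a minimal sketch; the numbered "subjects" are hypothetical), simple random sampling with computer-generated random numbers can be carried out in Python as follows:

      import random

      random.seed(42)                            # fix the seed so the draw is reproducible
      population = list(range(1, 101))           # 100 hypothetical subjects, numbered 1-100
      sample = random.sample(population, k=10)   # draw 10 subjects without replacement
      print(sample)                              # every subject had an equal chance of selection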

    • Expectation of Sample Estimates

      Estimation involves inferring a population parameter from sample data. The expectation of a sample estimate is its average value over repeated samples; when this expectation equals the population parameter, the estimator is said to be unbiased.

    • Means and Confidence Limits

      The mean is a measure of central tendency, while confidence limits provide a range within which the true population parameter lies, based on sample data.

    • Variance

      Variance measures the dispersion of a set of values. It is the average of the squared deviations of the observations from their mean, so a larger variance indicates that the data are more spread out.
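
      The following minimal sketch (with made-up measurements) shows how the sample mean, variance, standard error, and approximate 95% confidence limits relate to one another in Python:

      import math
      import statistics

      data = [4.2, 5.1, 4.8, 5.6, 4.9, 5.3, 4.7, 5.0]

      mean = statistics.mean(data)          # measure of central tendency
      var = statistics.variance(data)       # sample variance (divides by n - 1)
      sd = math.sqrt(var)                   # standard deviation
      se = sd / math.sqrt(len(data))        # standard error of the mean

      # Approximate 95% confidence limits using the large-sample normal multiplier 1.96.
      lower, upper = mean - 1.96 * se, mean + 1.96 * se
      print(f"mean={mean:.2f}, variance={var:.3f}, SE={se:.3f}, 95% limits=({lower:.2f}, {upper:.2f})")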

  • Correlation and regression: correlation table, coefficient of correlation, Z transformation, regression, relation between regression and correlation. Probability: Markov chains and applications. Probability distributions: binomial, Gaussian (normal), negative binomial, compound and multinomial distributions, Poisson distribution

    Correlation and Regression in Biostatistics
    • Correlation Coefficient

      The correlation coefficient measures the strength and direction of the linear relationship between two variables. It ranges from -1 to +1: a value close to +1 indicates a strong positive correlation, a value close to -1 indicates a strong negative correlation, and a value near 0 indicates little or no linear relationship.

    • Coefficient of Correlation (r)

      The coefficient of correlation (denoted as r) quantifies the degree of correlation between two variables. It is calculated using the formula r = cov(X,Y) / (σX * σY), where cov is covariance and σ is the standard deviation.
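
      As a check on the formula above, here is a minimal sketch (with made-up paired observations) computing r from the covariance and standard deviations and comparing it with numpy's built-in routine:

      import numpy as np

      x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
      y = np.array([1.5, 3.8, 5.9, 8.4, 9.7])

      cov_xy = np.cov(x, y, ddof=1)[0, 1]                       # sample covariance of X and Y
      r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))      # r = cov(X,Y) / (sigma_X * sigma_Y)

      print(round(r, 4))
      print(round(np.corrcoef(x, y)[0, 1], 4))                  # same value from np.corrcoef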

    • Z Transformation in Correlation

      In correlation analysis, Fisher's z-transformation converts a sample correlation coefficient r into z = (1/2) ln[(1 + r) / (1 - r)]. The transformed value has an approximately normal sampling distribution with standard error 1/√(n - 3), which makes it possible to set confidence limits for r and to compare correlation coefficients from different samples.
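
      A minimal sketch of this transformation in Python (the correlation value and sample size are hypothetical); numpy's arctanh implements (1/2) ln[(1 + r) / (1 - r)] directly:

      import numpy as np

      r, n = 0.85, 20                        # hypothetical sample correlation and sample size
      z = np.arctanh(r)                      # Fisher's z; approximately normally distributed
      se_z = 1 / np.sqrt(n - 3)              # standard error of z
      lower = np.tanh(z - 1.96 * se_z)       # back-transform the limits to the r scale
      upper = np.tanh(z + 1.96 * se_z)
      print(round(z, 3), (round(lower, 3), round(upper, 3)))    # z and 95% limits for r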

    • Regression Analysis

      Regression analysis is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. It helps in predicting outcomes and understanding relationships.
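
      A minimal sketch of simple linear regression in Python using scipy.stats.linregress (the x and y values are made up):

      import numpy as np
      from scipy import stats

      x = np.array([1, 2, 3, 4, 5, 6], dtype=float)      # independent variable
      y = np.array([2.1, 4.3, 6.2, 7.9, 10.1, 12.2])     # dependent variable

      result = stats.linregress(x, y)
      print(f"y = {result.intercept:.2f} + {result.slope:.2f} x,  r = {result.rvalue:.3f}")

      x_new = 7.0                                        # predict the response for a new x
      print(round(result.intercept + result.slope * x_new, 2))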

    • Relation between Correlation and Regression

      While correlation measures the strength of a linear relationship, regression describes how one variable changes with another and is used for prediction. Neither technique by itself establishes causation. The two are linked numerically: the regression coefficient of Y on X equals r multiplied by the ratio of the standard deviations, b(YX) = r * (σY / σX), and the product of the two regression coefficients equals r².

    • Probability Concepts

      Probability is the measure of the likelihood that an event will occur. It forms the basis for statistical inference and decision-making.

    • Markov Chains

      Markov chains are stochastic processes involving transitions from one state to another on a state space. They are used in various applications such as queueing theory and genetics.
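
      A minimal sketch of a two-state Markov chain in Python (the transition probabilities are hypothetical), showing how a state distribution evolves step by step:

      import numpy as np

      P = np.array([[0.9, 0.1],        # P[i, j] = probability of moving from state i to state j
                    [0.4, 0.6]])
      state = np.array([1.0, 0.0])     # start with certainty in state 0

      for _ in range(5):               # five one-step transitions
          state = state @ P
      print(state.round(3))            # state distribution after five steps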

    • Probability Distributions

      Probability distributions describe how the values of a random variable are distributed. They are crucial in understanding the behavior of random variables.

    • Binomial Distribution

      The binomial distribution models the number of successes in a fixed number of independent Bernoulli trials. It is characterized by parameters n (number of trials) and p (probability of success).
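
      A minimal sketch using scipy.stats.binom (the parameters n = 10 and p = 0.3 are hypothetical):

      from scipy import stats

      n, p = 10, 0.3
      print(stats.binom.pmf(3, n, p))    # P(exactly 3 successes)
      print(stats.binom.cdf(3, n, p))    # P(at most 3 successes)
      print(stats.binom.mean(n, p))      # expected number of successes, n * p = 3.0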

    • Gaussian Distribution

      The Gaussian distribution, or normal distribution, is symmetric and defined by its mean and standard deviation. Many biological phenomena and measurement errors follow this distribution.

    • Negative Binomial Distribution

      The negative binomial distribution represents the number of failures before a specified number of successes occurs in a sequence of Bernoulli trials.

    • Compound Distributions

      Compound distributions arise when a parameter of one distribution is itself treated as a random variable, or when a random number of random variables is summed. They allow modelling of more complex phenomena, such as overdispersed count data, where the total result depends on multiple sources of variation.

    • Multinomial Distribution

      The multinomial distribution generalizes the binomial distribution for scenarios with more than two possible outcomes. It is used in categorical data analysis.

    • Poisson Distribution

      The Poisson distribution models the number of events that occur in a fixed interval of time or space. It is useful for count-based data, especially in biology and ecology.
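
      A minimal sketch using scipy.stats.poisson (the mean rate of 2 events per interval is hypothetical):

      from scipy import stats

      mu = 2.0                               # mean number of events per interval
      print(stats.poisson.pmf(0, mu))        # P(no events)
      print(stats.poisson.pmf(3, mu))        # P(exactly 3 events)
      print(1 - stats.poisson.cdf(4, mu))    # P(more than 4 events)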

  • Normal distribution: graphic representation; frequency curve and its characteristics; measures of central value, dispersion, coefficient of variation and methods of computation. Basis of statistical inference: sampling distribution, standard error, testing of hypothesis, null hypothesis, Type I and Type II errors

    Normal distribution graphic representation
    • Normal Distribution Overview

      Normal distribution is a continuous probability distribution characterized by its bell-shaped curve. It is defined by two parameters: mean and standard deviation.

    • Graphic Representation

      The graphic representation of a normal distribution shows a symmetric curve centered around the mean. The area under the curve represents the total probability of the distribution.

    • Frequency Curve

      A frequency curve is a smooth curve that represents the frequency of data points in a dataset, allowing for visual identification of trends within a normal distribution.

    • Characteristics of the Frequency Curve

      The normal frequency curve is unimodal, bell-shaped, and symmetric about the mean, and its tails approach the horizontal axis without touching it. About 68% of observations lie within one standard deviation of the mean and about 95% within two standard deviations.

    • Measures of Central Value

      In a normal distribution, the mean, median, and mode are all equal and located at the center of the distribution.

    • Measures of Dispersion

      Standard deviation is the key measure of dispersion in a normal distribution, indicating how much the data vary from the mean.

    • Coefficient of Variation

      The coefficient of variation is the ratio of the standard deviation to the mean, useful for comparing variability between different datasets.

    • Methods of Computation

      Methods include calculating mean and standard deviation directly from data, or using software for larger datasets.
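
      A minimal sketch of these computations in Python (the measurements are made up):

      import statistics

      data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
      mean = statistics.mean(data)
      sd = statistics.stdev(data)         # sample standard deviation
      cv = 100 * sd / mean                # coefficient of variation, expressed in percent
      print(f"mean={mean:.2f}, SD={sd:.3f}, CV={cv:.2f}%")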

    • Basis of Statistical Inference

      Normal distribution forms the foundation for many statistical inference methods and is crucial for hypothesis testing.

    • Sampling Distribution

      A sampling distribution is the probability distribution of a statistic (such as the sample mean) over repeated samples of the same size. By the central limit theorem, the sampling distribution of the mean can often be approximated by a normal distribution even when the parent population is not normal.

    • Standard Error

      The standard error measures how far the sample mean of the data is likely to be from the true population mean, decreasing as sample size increases.
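
      A minimal simulation sketch (the population parameters are hypothetical) showing that the spread of the sample means, the standard error, shrinks roughly as 1/√n:

      import numpy as np

      rng = np.random.default_rng(0)
      population = rng.normal(loc=50, scale=10, size=100_000)   # hypothetical population

      for n in (10, 40, 160):
          means = [rng.choice(population, size=n, replace=False).mean() for _ in range(2000)]
          # empirical SD of the sample means versus the theoretical sigma / sqrt(n)
          print(n, round(float(np.std(means)), 3), round(10 / np.sqrt(n), 3))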

    • Testing of Hypothesis

      Statistical hypothesis testing utilizes normal distribution to determine the validity of a null hypothesis based on sample data.

    • Null Hypothesis

      The null hypothesis is a statement that there is no effect or difference, serving as the basis for statistical testing.

    • Type I and Type II Errors

      A Type I error occurs when a true null hypothesis is rejected; its probability is the significance level α. A Type II error occurs when a false null hypothesis is not rejected; its probability is β, and 1 - β is the power of the test.

  • Tests of significance for large and small samples based on normal, t, and z distributions with regard to mean, variance, proportions, and correlation coefficient; chi-square test of goodness of fit; contingency tables; χ² test for independence of two attributes; Fisher-Behrens d test; 2 × 2 table; testing heterogeneity; r × c table; chi-square test in genetic experiments; partition of χ²; Emerson's method

    Tests of significance for large and small samples
    • Normal Distribution

      The normal distribution is a continuous probability distribution characterized by its bell-shaped curve. Tests of significance using this distribution involve z-tests for large sample sizes (n > 30) to determine if sample means significantly differ from population means.

    • t Distribution

      The t distribution is used for smaller sample sizes (n ≤ 30) and is characterized by its heavier tails compared to the normal distribution. t-tests are employed to compare sample means and assess significance in mean differences.
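
      A minimal sketch of a two-sample t-test on small, made-up samples using scipy.stats.ttest_ind:

      from scipy import stats

      group_a = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8]
      group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 5.4]

      t_stat, p_value = stats.ttest_ind(group_a, group_b)    # assumes equal variances by default
      print(round(t_stat, 3), round(p_value, 4))
      # If p_value < 0.05, the null hypothesis of equal means is rejected at the 5% level.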

    • Z Distribution

      Z tests are applicable when the population variance is known. The method involves calculating the z statistic to test hypotheses about population means or proportions.

    • Variance Testing

      Tests such as the F-test assess whether the variances of two populations are equal. The F ratio is also central to analysis of variance (ANOVA), where it is used to judge whether the means of several groups differ significantly.

    • Proportions

      Proportion tests, such as z-tests for proportions, evaluate if the observed proportion in a sample significantly differs from a hypothesized proportion.

    • Correlation Coefficient

      The significance of the correlation coefficient (r) can be evaluated using statistical tests that determine if the observed correlation reflects a true relationship in the population.

    • Chi-square Test of Goodness of Fit

      This test evaluates how well observed categorical data fit an expected distribution. It compares the frequencies of observed categories to those expected.

    • Contingency Tables

      Contingency tables summarize the relationship between two categorical variables. The chi-square test assesses independence between these variables.

    • Chi-square Test for Independence

      This procedure determines whether two categorical variables are independent by comparing observed frequencies with expected frequencies.

    • Fisher's Exact Test

      Utilized for small sample sizes, Fisher's Exact Test provides a method for determining the significance of the association between two categorical variables.
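
      A minimal sketch on a hypothetical 2 × 2 contingency table, applying the chi-square test of independence and, since the counts are small, Fisher's exact test:

      from scipy import stats

      table = [[12, 8],      # rows: treatment, control; columns: response, no response
               [5, 15]]

      chi2, p, dof, expected = stats.chi2_contingency(table)
      print(round(chi2, 3), round(p, 4), dof)                # chi-square statistic, p-value, df

      odds_ratio, p_exact = stats.fisher_exact(table)
      print(round(odds_ratio, 3), round(p_exact, 4))         # Fisher's exact test on the same table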

    • Behrens-Fisher Problem

      The Behrens-Fisher problem arises when comparing the means of two populations whose variances are unknown and possibly unequal. Approximate solutions include the Fisher-Behrens d test and related adjustments to the ordinary t-test.

    • Heterogeneity Testing

      Utilized to assess whether observed differences across groups or studies are greater than would be expected by chance. Relevant for meta-analysis.

    • X² Test in Genetic Experiments

      The chi-square test is widely used in genetics to compare observed phenotype frequencies with those expected under a genetic hypothesis, for example Mendelian ratios such as 3:1 or 9:3:3:1.
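
      A minimal sketch testing hypothetical F2 phenotype counts against the expected 9:3:3:1 dihybrid ratio with scipy.stats.chisquare:

      from scipy import stats

      observed = [90, 28, 33, 9]                                # made-up counts for the four phenotypes
      total = sum(observed)
      expected = [total * f for f in (9/16, 3/16, 3/16, 1/16)]

      chi2, p = stats.chisquare(observed, f_exp=expected)
      print(round(chi2, 3), round(p, 3))
      # A large p-value indicates the observed counts are consistent with the 9:3:3:1 ratio.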

    • Emerson's Method

      A statistical method for genetic data analysis that accounts for complex inheritance patterns and provides a framework for hypothesis testing.

  • Tests of significance: t tests, F tests; analysis of variance: one-way classification, two-way classification; CRD, RBD, LSD. Spreadsheets: data entry, mathematical functions, statistical functions, graphics display, printing spreadsheets, use as a database; word processors, databases, statistical analysis packages, graphics/presentation packages

    Tests of Significance, ANOVA, and Computer Applications
    • t-Tests

      Used to determine whether there is a significant difference between the means of two groups. Variants include the independent samples t-test and the paired samples t-test, commonly used in medical research, psychology, and other fields to compare groups.

    • F-Tests

      Used to compare variances between two populations and to assess the overall fit of a model. Frequently used in ANOVA and regression analysis.

    • Analysis of Variance (ANOVA)

      A statistical method used to determine whether there are significant differences between the means of three or more groups (a small worked sketch follows this list).

    • One-Way Classification

      Involves a single independent variable or factor; used when comparing means across multiple groups.

    • Two-Way Classification

      Involves two independent variables or factors, enabling the study of interaction effects between the factors.

    • Completely Randomized Design (CRD)

      An experimental design in which subjects are randomly assigned to the different treatments, ensuring that treatment effects can be attributed to the treatments themselves.

    • Randomized Block Design (RBD)

      An experimental design that groups similar experimental units into blocks and randomizes treatments within each block, reducing variability among experimental units.

    • Least Significant Difference (LSD)

      A post-hoc test used after ANOVA to determine which group means are significantly different; it helps in identifying specific group differences.

    • Spreadsheets: Data Entry

      Input of data into a spreadsheet application such as Excel or Google Sheets.

    • Mathematical Functions

      Functions that perform basic arithmetic and calculations, for example SUM, AVERAGE, and COUNT.

    • Statistical Functions

      Functions that provide statistical analyses, for example AVERAGE, STDEV, and TTEST.

    • Graphics Display

      Visualization tools within spreadsheets: charts, graphs, and plots.

    • Printing Spreadsheets

      Outputting spreadsheet data in printed form for reports and presentations.

    • Use as a Database

      Using spreadsheets for database functions: storing, sorting, and analyzing data.

    • Word Processors

      Software for creating and editing text documents such as reports, documentation, and essays.

    • Databases

      Structured collections of data used for data management and retrieval.

    • Statistical Analysis Packages

      Software specifically designed for statistical analysis, for example SPSS, R, and SAS.

    • Graphics and Presentation Packages

      Tools for creating visual representations of data, for example PowerPoint and Canva.
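
      A minimal sketch of a one-way analysis of variance on three made-up treatment groups using scipy.stats.f_oneway:

      from scipy import stats

      group1 = [20.1, 19.8, 20.5, 20.3]
      group2 = [21.2, 21.5, 20.9, 21.4]
      group3 = [19.0, 19.4, 18.8, 19.2]

      f_stat, p_value = stats.f_oneway(group1, group2, group3)
      print(round(f_stat, 3), round(p_value, 5))
      # A small p-value suggests at least one group mean differs; a post-hoc test such as
      # LSD would then identify which pairs of means are responsible.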

Core Paper-11 BIOSTATISTICS, M.Sc. Biotechnology, Periyar University