When analyzing a variation series, it is important to know how closely the empirical distribution of the characteristic corresponds to the normal one. To do this, the frequencies of the actual distribution must be compared with the theoretical frequencies characteristic of a normal distribution. This means that, based on the actual data, it is necessary to calculate the theoretical frequencies of the normal curve, which are a function of the normalized deviations.
In other words, the empirical distribution curve needs to be aligned with the normal distribution curve.
Objective characteristics of the agreement between theoretical and empirical frequencies can be obtained using special statistical indicators called goodness-of-fit criteria.
A goodness-of-fit criterion is a criterion that allows one to determine whether the discrepancy between the empirical and theoretical distributions is random or significant, i.e. whether the observational data agree with the proposed statistical hypothesis or not. The distribution that the population has according to the proposed hypothesis is called theoretical.
There is therefore a need for a criterion (rule) that would allow one to judge whether the discrepancy between the empirical and theoretical distributions is random or significant. If the discrepancy turns out to be random, the observational data (the sample) are considered consistent with the proposed hypothesis about the distribution law of the general population, and the hypothesis is accepted; if the discrepancy turns out to be significant, the observational data do not agree with the hypothesis and it is rejected.
Empirical and theoretical frequencies usually differ. This can happen because:
- the discrepancy is random and due to the limited number of observations;
- the discrepancy is not accidental and is explained by the fact that the statistical hypothesis that the population is normally distributed is erroneous.
Thus, goodness-of-fit criteria make it possible to reject or confirm the hypothesis put forward, when fitting the series, about the nature of the distribution in the empirical series.
Empirical frequencies are obtained as a result of observation; theoretical frequencies are calculated using formulas.
For the normal distribution law they can be found as follows:

f_T = (Σƒ_i · h / σ) · φ(t),

where
- Σƒ_i – the sum of the empirical frequencies (the total number of observations, i.e. the last accumulated frequency)
- h – the difference between two neighboring variants (the class width)
- σ – the sample standard deviation
- t – the normalized (standardized) deviation, t = (x_i − x̄)/σ
- φ(t) – the probability density function of the standard normal distribution (found for the corresponding value of t)
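As a sketch of this calculation for grouped data (the function name is illustrative; class midpoints and a common class width h are assumed):

```python
import math

def normal_theoretical_frequencies(midpoints, freqs, h):
    """Theoretical frequencies of the normal curve for a grouped series:
    f_T = (N * h / sigma) * phi(t), with t = (x - mean) / sigma and
    phi the standard normal density."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    var = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
    sigma = math.sqrt(var)
    result = []
    for x in midpoints:
        t = (x - mean) / sigma
        phi = math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
        result.append(n * h / sigma * phi)
    return result

# hypothetical symmetric grouped series with class width 1
f_theoretical = normal_theoretical_frequencies(
    [1, 2, 3, 4, 5], [5, 20, 50, 20, 5], 1)
```

For a symmetric empirical series the theoretical frequencies come out symmetric as well, with the maximum at the central class.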
There are several goodness-of-fit tests, the most common being the chi-square (Pearson) test, the Kolmogorov test, and the Romanovsky test.
Pearson's goodness-of-fit test χ² is one of the main ones. It can be represented as the sum of the ratios of the squared differences between the theoretical (f_T) and empirical (f) frequencies to the theoretical frequencies:

χ² = Σ (f_i − f_T)² / f_T, i = 1, …, k,

where
- k – the number of groups into which the empirical distribution is divided,
- f_i – the observed frequency of the characteristic in the i-th group,
- f_T – the theoretical frequency.
For the χ² distribution, tables have been compiled that give the critical value of the χ² goodness-of-fit criterion for a selected significance level α and number of degrees of freedom df (or ν).
The significance level α is the probability of erroneously rejecting the proposed hypothesis, i.e. the probability that a correct hypothesis will be rejected; P = 1 − α is the probability of accepting a correct hypothesis. In statistics, three significance levels are most often used:
- α = 0.10, then P = 0.90 (in 10 cases out of 100 the correct hypothesis may be rejected)
- α = 0.05, then P = 0.95 (in 5 cases out of 100)
- α = 0.01, then P = 0.99 (in 1 case out of 100)
The number of degrees of freedom df is defined as the number of groups in the distribution series minus the number of constraints: df = k − z. A constraint is an indicator of the empirical series used in calculating the theoretical frequencies, i.e. an indicator connecting the empirical and theoretical frequencies. For example, when fitting a normal curve there are three such constraints, so the number of degrees of freedom is df = k − 3. To assess significance, the calculated value χ²_calc is compared with the table value χ²_tab.
With complete coincidence of the theoretical and empirical distributions χ² = 0; otherwise χ² > 0. If χ²_calc > χ²_tab, then for the given significance level and number of degrees of freedom we reject the hypothesis that the discrepancies are insignificant (random). If χ²_calc < χ²_tab, we accept the hypothesis, and with probability P = 1 − α it can be argued that the discrepancy between the theoretical and empirical frequencies is random. There is then reason to assert that the empirical distribution obeys the normal law. Pearson's goodness-of-fit test is used when the population size is large enough (N > 50) and the frequency of each group is at least 5.
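A minimal sketch of this decision rule, with hypothetical empirical and theoretical frequencies and the table value χ²(0.05; df = 4) hard-coded:

```python
# hypothetical frequencies for k = 7 groups (equal totals of 116)
observed = [6, 12, 25, 30, 25, 12, 6]    # empirical frequencies f_i
expected = [5, 13, 26, 28, 26, 13, 5]    # theoretical frequencies f_T

# Pearson statistic: chi2 = sum((f_i - f_T)^2 / f_T)
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 3   # k - 3 constraints when fitting a normal curve
crit = 9.488             # table value chi2(alpha = 0.05, df = 4)

accept_normality = chi2 < crit   # True -> discrepancies look random
```

Here every group frequency is at least 5 and the total exceeds 50, so the conditions for applying the test are met.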
The Kolmogorov criterion is based on determining the maximum discrepancy between the accumulated empirical and theoretical frequencies:

λ = D / √N (or λ = d · √N),

where D and d are, respectively, the maximum absolute difference between the cumulative frequencies and between the cumulative relative frequencies of the empirical and theoretical distributions, and N is the number of observations.
Using the distribution table of the Kolmogorov statistic, the probability P(λ) is determined; it can vary from 0 to 1. At P(λ) = 1 there is complete coincidence of the frequencies; at P(λ) = 0, complete divergence. If the probability P corresponding to the found value of λ is large, then the discrepancies between the theoretical and empirical distributions can be considered insignificant, that is, random.
The main condition for using the Kolmogorov criterion is a large number of observations.
Kolmogorov goodness-of-fit test
Let us consider how the Kolmogorov criterion (λ) is applied when testing the hypothesis that the general population is normally distributed. Fitting the actual distribution to the normal curve consists of several steps:
- Based on the actual data, the theoretical frequencies of the normal curve, which is a function of the normalized deviation, are determined.
- The actual and theoretical frequencies are compared.
- The degree to which the distribution of the characteristic corresponds to the normal one is checked.
For column IV of the table:
In MS Excel, the normalized deviation t is calculated with the STANDARDIZE function. Select a range of free cells equal in number to the variants (rows of the spreadsheet); without removing the selection, call the STANDARDIZE function, and in the dialog box that appears indicate the cells that contain, respectively, the observed values (X_i), the mean (X̄) and the standard deviation σ. The operation must be completed by pressing Ctrl+Shift+Enter simultaneously (entering it as an array formula).
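Outside Excel the same step is a one-line calculation; a sketch in Python (STANDARDIZE simply computes t = (x − mean)/σ; the values below are illustrative):

```python
def standardize(x, mean, sigma):
    """Python equivalent of Excel's STANDARDIZE: t = (x - mean) / sigma."""
    return (x - mean) / sigma

# hypothetical observed values with mean 30 and standard deviation 5
t_values = [standardize(x, 30.0, 5.0) for x in [20, 25, 30, 35, 40]]
```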
For column V of the table:
The probability density φ(t) of the normal distribution is found from the table of values of the local Laplace function for the corresponding value of the normalized deviation t.
For column VI of the table:
The Kolmogorov goodness-of-fit statistic λ is determined by dividing the modulus of the maximum difference between the empirical and theoretical cumulative frequencies by the square root of the number of observations:

λ = D / √N, where D is the maximum absolute difference.
Using a special probability table for the goodness-of-fit criterion λ, we find that the value λ = 0.59 corresponds to a probability P(λ) ≈ 0.88.
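The table probability can be checked against the series expansion of the Kolmogorov distribution; a sketch (the function name and the truncation at 100 terms are choices of this example):

```python
import math

def kolmogorov_p(lam, terms=100):
    """Probability of exceeding lam under the Kolmogorov distribution:
    P(lambda) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lambda^2)."""
    return 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                   for k in range(1, terms + 1))

p = kolmogorov_p(0.59)   # close to the table value 0.88 quoted in the text
```

Large λ gives a probability near 0, i.e. an essentially certain divergence between the distributions.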
(Figure: distribution of empirical and theoretical frequencies; probability density of the theoretical distribution)
When applying goodness-of-fit tests to check whether the observed (empirical) distribution corresponds to the theoretical one, one should distinguish between testing simple and composite hypotheses.
The one-sample Kolmogorov-Smirnov normality test is based on the maximum difference between the cumulative empirical distribution of the sample and the assumed (theoretical) cumulative distribution. If the Kolmogorov-Smirnov statistic D is significant, the hypothesis that the distribution is normal should be rejected.
See also
- Criteria for testing randomness and assessing outlier observations
Using Goodness-of-Fit Criteria

Introduction
In the practice of statistical analysis of experimental data, the main interest is usually not the calculation of particular statistics itself, but the answers to questions of the following type: Is the population mean really equal to a certain number? Is the correlation coefficient significantly different from zero? Are the variances of two samples equal? Many such questions may arise, depending on the specific research problem. Accordingly, many criteria have been developed to test the proposed statistical hypotheses. We will consider some of the most common of them, relating mainly to means, variances, correlation coefficients and frequency distributions.
All criteria for testing statistical hypotheses are divided into two large groups: parametric and non-parametric. Parametric tests are based on the assumption that the sample data are drawn from a population with a known distribution, and the main task is to estimate the parameters of this distribution. Nonparametric tests do not require any assumptions about the nature of the distribution, other than the assumption that it is continuous.
Let us first look at parametric criteria. The testing sequence will include: formulating the null and alternative hypotheses; stating the assumptions to be made; determining the sample statistic used in the test and the sampling distribution of that statistic; determining the critical regions for the selected criterion; and constructing a confidence interval for the sample statistic.
1 Goodness-of-fit criteria for means
Let the hypothesis being tested be that the population mean μ equals a given value a. The need for such a check may arise, for example, in the following situation. Suppose that, based on extensive research, the mean diameter of the shell of a fossil mollusk in sediments from some fixed location has been established. Let us also have at our disposal a certain number of shells found in another place, and we make the assumption that the specific place does not affect the diameter of the shell, i.e. that the mean value of the shell diameter for the entire population of mollusks that once lived in the new place is equal to the known value obtained earlier when studying this type of mollusk in the first habitat.
If this known value is a, then the null and alternative hypotheses are written as follows: H0: μ = a, H1: μ ≠ a. Let us assume that the variable x in the population under consideration has a normal distribution and that the population variance is unknown.
We will test the hypothesis using the statistic

t = (x̄ − a) / (s / √n), (1)

where x̄ is the sample mean and s is the sample standard deviation.
It has been shown that if H0 is true, then t in expression (1) has a Student t-distribution with n − 1 degrees of freedom. If we choose the significance level (the probability of rejecting a correct hypothesis) equal to α, then, in accordance with what was discussed in the previous chapter, we can define the critical values for testing H0: μ = a.
In this case, since the Student distribution is symmetric, the fraction 1 − α of the area under the curve of this distribution with n − 1 degrees of freedom is contained between the points −t_{α/2} and t_{α/2}, which are equal in absolute value. Therefore, all values less than −t_{α/2} and greater than t_{α/2} for the t-distribution with the given number of degrees of freedom at the chosen significance level constitute the critical region. If the sample t value falls within this region, the alternative hypothesis is accepted.
The confidence interval for μ is constructed according to the previously described method and is determined from the expression

x̄ − t_{α/2} · s/√n < μ < x̄ + t_{α/2} · s/√n. (2)
So, suppose we know that in the first habitat the mean diameter of the shell of the fossil mollusk is 18.2 mm. We have at our disposal a sample of 50 newly found shells, for which x̄ = 18.9 mm and s = 2.18 mm. Let us test H0: μ = 18.2 against H1: μ ≠ 18.2. We have t = (18.9 − 18.2)/(2.18/√50) ≈ 2.27.
If the significance level is chosen as α = 0.05, then the critical value is t = 2.01. Since 2.27 > 2.01, H0 can be rejected in favor of H1 at the significance level α = 0.05. Thus, for our hypothetical example it can be stated (with some probability, of course) that the diameter of the shell of fossil mollusks of this type depends on the place in which they lived.
Because the t-distribution is symmetric, only positive values of t are tabulated for the selected significance levels and numbers of degrees of freedom; moreover, both the share of the area under the distribution curve to the right of t and that to the left of −t are taken into account. This is because in most cases, when testing hypotheses, we are interested in the significance of the deviations themselves, regardless of whether they are positive or negative, i.e. we test against H1: μ ≠ a, not against H1: μ > a or H1: μ < a. Let us return now to our example. The 100(1 − α)% confidence interval for μ is 18.9 ± 2.01 · 2.18/√50, i.e. approximately 18.28 mm < μ < 19.52 mm.
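The whole shell example can be sketched numerically, assuming the sample mean 18.9 mm and the table value t = 2.01 implied by the confidence interval quoted in the text:

```python
import math

# one-sample t-test with the shell example's figures
n, xbar, s, a = 50, 18.9, 2.18, 18.2

t = (xbar - a) / (s / math.sqrt(n))   # formula (1)
t_crit = 2.01                          # two-sided table value t(0.05; 49 df)
reject = abs(t) > t_crit               # True -> H0: mu = 18.2 is rejected

# confidence interval (2) for the population mean
half_width = t_crit * s / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)
```

Note that the hypothesized value 18.2 lies outside the resulting interval, which is consistent with rejecting H0.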
Let us now consider the case when it is necessary to compare the means of two general populations. The hypothesis being tested looks like this: H0: μ1 − μ2 = 0, H1: μ1 − μ2 ≠ 0. It is assumed that x has a normal distribution with mean μ1 and variance σ², and y a normal distribution with mean μ2 and the same variance σ². In addition, we assume that the samples from which the general populations are estimated are drawn independently of each other and have sizes n1 and n2, respectively. From the independence of the samples it follows that if we take a large number of them and calculate the mean values for each pair, then the set of these pairs of means will be completely uncorrelated. The null hypothesis is tested using the statistic

t = (x̄ − ȳ) / √( ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2) · (1/n1 + 1/n2) ), (3)
where s1² and s2² are the variance estimates for the first and second samples, respectively. It is easy to see that (3) is a generalization of (1). It has been shown that statistic (3) has a Student t-distribution with n1 + n2 − 2 degrees of freedom. If n1 and n2 are equal, i.e. n1 = n2 = n, formula (3) simplifies to

t = (x̄ − ȳ) / √((s1² + s2²)/n). (4)
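A sketch of the pooled statistic (3), assuming equal population variances (the function name is illustrative):

```python
import math

def pooled_t(x1bar, x2bar, s1sq, s2sq, n1, n2):
    """Two-sample Student's t with a pooled variance estimate (formula (3)).
    Assumes both populations are normal with a common variance and the
    samples are independent."""
    sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
    return (x1bar - x2bar) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
```

With the lake-frog figures used later in the text (means 2.34 and 2.08, variances 0.21 and 0.35, sample sizes 49 and 27) this gives t ≈ 2.13, which exceeds the tabulated 1.995 quoted there.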
Let us look at an example. Suppose that when measuring the stem leaves of the same plant population over two seasons, the results shown in the table were obtained. We assume that the conditions for using Student's t-test are satisfied: the populations from which the samples are drawn are normal, they have an unknown but common variance, and the samples are independent. Let us test at the significance level α = 0.01. The table value is t = 2.58; the calculated value exceeds it, so the hypothesis of equality of the mean stem-leaf lengths for the plant population over the two seasons should be rejected at the chosen level of significance.

Attention! The null hypothesis in mathematical statistics is the hypothesis that there are no significant differences between the compared indicators, regardless of whether we are talking about means, variances or other statistics. In all these cases, if the empirical (calculated) value of the criterion is greater than the theoretical (table) value, the null hypothesis is rejected; if the empirical value is less than the table value, it is accepted.

In order to construct a confidence interval for the difference between the means of the two populations, note that Student's test, as can be seen from formula (3), evaluates the significance of the difference between the means relative to the standard error of this difference; using the previously discussed relationships and the assumptions made, it is easy to verify that the denominator in (3) is exactly this standard error. Indeed, we know that in the general case, if x and y are independent, the variance of their difference is the sum of their variances. Taking the sample means x̄ and ȳ instead of x and y, and recalling the assumption that both populations have the same variance σ², we obtain

σ²_{x̄−ȳ} = σ²(1/n1 + 1/n2). (5)
An estimate of the common variance can be obtained from the relation

s² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2). (6)
(We divide by n1 + n2 − 2 because two quantities are estimated from the samples and, therefore, the number of degrees of freedom must be reduced by two.) If we now substitute (6) into (5) and take the square root, we obtain the denominator in expression (3). After this digression, let us return to constructing a confidence interval for μ1 − μ2.

Let us make some comments on the assumptions used in constructing the t-test. First of all, it has been shown that violations of the assumption of normality have an insignificant effect on the significance level and power of the test for samples of about 30 or more. Violations of the assumption of homogeneity of the variances of the two populations from which the samples are drawn are also insignificant, but only when the sample sizes are equal. If the variances of the two populations differ from each other, then the probabilities of errors of the first and second kind will differ significantly from those expected. In this case, the statistic to be used is

t = (x̄ − ȳ) / √(s1²/n1 + s2²/n2), (7)
with the number of degrees of freedom

df = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]. (8)
As a rule, df turns out to be a fractional number; therefore, when using t-distribution tables, one takes the table values for the nearest integer values and interpolates to find the t corresponding to the obtained df.

Let us look at an example. When studying two subspecies of the lake frog, the ratio of body length to tibia length was calculated. Two samples were taken, of sizes n1 = 49 and n2 = 27. The means and variances of the ratio of interest turned out to be, respectively, x̄1 = 2.34, x̄2 = 2.08, s1² = 0.21, s2² = 0.35. If we test the hypothesis using formula (3), we find that at a significance level of α = 0.05 we must reject the null hypothesis (the table value is t = 1.995) and conclude that there are statistically significant differences, at the selected significance level, between the mean values of the measured parameter for the two subspecies of frogs. When using formulas (7) and (8), however, for the same significance level α = 0.05 the table value is t = 2.015, and the null hypothesis is accepted.

This example clearly shows that neglecting the conditions adopted when deriving a particular criterion can lead to results directly opposite to those that actually hold. Of course, in this case, having samples of different sizes and no previously established fact that the variances of the measured indicator in the two populations are statistically equal, it was necessary to use formulas (7) and (8), which showed the absence of statistically significant differences. It is worth repeating that checking compliance with all the assumptions made when deriving a particular criterion is an absolutely necessary condition for its correct use.

A constant requirement in both of the above modifications of the t-test was that the samples be independent of each other. However, in practice there are often situations when this requirement cannot be met for objective reasons.
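Formulas (7) and (8) can be sketched as follows; with the frog figures this reproduces the conclusion above (t ≈ 1.98 with df ≈ 43, below the tabulated 2.015):

```python
import math

def welch_t(x1bar, x2bar, s1sq, s2sq, n1, n2):
    """Statistic (7) for unequal population variances, together with the
    approximate (usually fractional) degrees of freedom (8)."""
    se2 = s1sq / n1 + s2sq / n2
    t = (x1bar - x2bar) / math.sqrt(se2)
    df = se2 ** 2 / ((s1sq / n1) ** 2 / (n1 - 1)
                     + (s2sq / n2) ** 2 / (n2 - 1))
    return t, df

t, df = welch_t(2.34, 2.08, 0.21, 0.35, 49, 27)
```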
For example, some indicator is measured on the same animal or plot of territory before and after the action of an external factor, etc. In these cases we may be interested in testing the hypothesis H0: μ1 = μ2 against H1: μ1 ≠ μ2. We continue to assume that both samples are drawn from normal populations with the same variance. In this case, we can take advantage of the fact that differences between normally distributed quantities are also normally distributed, and therefore use Student's t-test in form (1). Thus, we test the hypothesis that the n differences are a sample from a normally distributed population with mean zero. Denoting the i-th difference by d_i, we have

t = d̄ / (s_d / √n), (9)

where d̄ is the mean of the differences and s_d their sample standard deviation.

Let us look at an example. Suppose we have data on the number of impulses of an individual nerve cell during a certain time interval before and after the action of a stimulus. Keeping in mind that (9) has a t-distribution, and choosing a significance level of α = 0.01, from the corresponding table in the Appendix we find that the critical value of t for n − 1 = 10 − 1 = 9 degrees of freedom is 3.25. A comparison of the theoretical and empirical t values shows that the null hypothesis of no statistically significant difference between the firing rates before and after the stimulus should be rejected: the stimulus statistically significantly changes the frequency of impulses.

In experimental studies, as mentioned above, dependent samples appear quite often. Nevertheless, this fact is sometimes ignored and the t-test is incorrectly used in form (3). The inappropriateness of this can be seen by considering the standard errors of the difference between uncorrelated and correlated means.
In the first case σ²_{x̄−ȳ} = σ²_x̄ + σ²_ȳ, while in the second the standard error of the difference takes the correlation into account: s²_d̄ = s²_x̄ + s²_ȳ − 2r·s_x̄·s_ȳ, and this is what appears in the denominator of (9). Now note that the numerators of expressions (4) and (9) coincide, since d̄ = x̄ − ȳ; therefore the difference in the value of t depends on the denominators. Thus, if formula (3) is used in a problem with dependent samples, and the samples are positively correlated, then the resulting t values will be smaller than they should be when using formula (9), and a situation may arise in which the null hypothesis is accepted when it is false. The opposite situation may arise when there is a negative correlation between the samples: in that case, differences will be recognized as significant that in fact are not.

Let us return to the example with impulse activity and calculate the t value for the given data using formula (3), ignoring the fact that the samples are related. For the number of degrees of freedom equal to 18 and the significance level α = 0.01, the table value is t = 2.88, and at first glance it seems that nothing bad happened even when using a formula unsuitable for the given conditions: the calculated t value again leads to rejection of the null hypothesis, i.e. to the same conclusion that was made using formula (9), which is correct in this situation.

However, let us rearrange the existing data and present them in a different pairing: these are the same values, and they could well have been obtained in one of the experiments. Since all the values in both samples are preserved, using Student's t-test in formula (3) gives the previously obtained value t = 3.32 and leads to the same conclusion as before. Now let us calculate the value of t using formula (9), which should be used in this case. The critical value of t at the selected significance level and nine degrees of freedom is 3.25.
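A sketch of the paired statistic (9), with illustrative before/after counts (the real impulse data are in the text's table and are not reproduced here):

```python
import math

def paired_t(before, after):
    """Paired Student's t (formula (9)): t = dbar / (s_d / sqrt(n)),
    computed on the per-object differences d_i = before_i - after_i."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    dbar = sum(d) / n
    s_d = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
    return dbar / (s_d / math.sqrt(n))

# hypothetical impulse counts before and after a stimulus
t = paired_t([10, 12, 11, 13, 9], [12, 15, 13, 16, 11])
```

Swapping the two samples only flips the sign of t, so the two-sided decision is unchanged.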
Consequently, we have no reason to reject the null hypothesis; we accept it, and this conclusion is directly opposite to the one made when using formula (3). This example shows once again how important it is, in order to obtain correct conclusions when analyzing experimental data, to comply strictly with all the requirements underlying the derivation of a particular criterion.

The considered modifications of Student's test are intended for testing hypotheses about the means of two samples. However, situations arise when it is necessary to draw conclusions about the equality of k means simultaneously. For this case a statistical procedure has also been developed; it will be discussed later, in connection with analysis of variance.

2 Goodness-of-fit tests for variances

Testing statistical hypotheses about population variances is carried out in the same sequence as for means. Let us briefly recall this sequence.
1. A null hypothesis is formulated (about the absence of statistically significant differences between the compared variances).
2. Assumptions are made about the sampling distribution of the statistic with which the parameter in the hypothesis is to be estimated.
3. The significance level for testing the hypothesis is selected.
4. The value of the statistic of interest is calculated, and a decision is made about the truth of the null hypothesis.

Let us begin by testing the hypothesis that the population variance equals a given value a, i.e. H0: σ² = a against H1: σ² ≠ a. If we assume that the variable x has a normal distribution and that a sample of size n is drawn randomly from the population, then the null hypothesis is tested with the statistic

χ² = (n − 1)s² / a. (10)
Recalling the formula for calculating the variance, we rewrite (10) as follows:

χ² = Σ(x_i − x̄)² / a. (11)
From this expression it is clear that the numerator is the sum of squared deviations of normally distributed values from their mean, and each of these deviations is itself normally distributed. Therefore, in accordance with the distribution known to us for sums of squares of normally distributed values, statistics (10) and (11) have a χ²-distribution with n − 1 degrees of freedom. By analogy with the use of the t-distribution, when testing at the selected significance level α, critical points corresponding to the probabilities α/2 and 1 − α/2 of accepting the null hypothesis are found from the χ²-distribution table. The confidence interval for σ² at the selected α is constructed as follows:

(n − 1)s² / χ²_{α/2} < σ² < (n − 1)s² / χ²_{1−α/2}. (12)
Let us look at an example. Suppose it is known, on the basis of extensive experimental research, that the variance of the alkaloid content of one plant species from a certain area is 4.37 conventional units. A specialist has at his disposal a sample of n = 28 such plants, presumably from the same area. The analysis showed that for this sample s² = 5.01, and it is necessary to make sure that this variance and the previously known one are statistically indistinguishable at the significance level α = 0.1. According to formula (10) we have χ² = 27 · 5.01 / 4.37 ≈ 30.95. This value must be compared with the critical values for α/2 = 0.05 and 1 − α/2 = 0.95. From the Appendix table for 27 degrees of freedom these are 40.1 and 16.2, respectively; since 16.2 < 30.95 < 40.1, the null hypothesis can be accepted. The corresponding confidence interval for σ² is 3.37 < σ² < 8.35.
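The alkaloid example can be reproduced directly from formula (10) and interval (12), with the two table χ² values hard-coded:

```python
# chi-square test of a variance against a fixed value (formula (10)):
# n = 28, s^2 = 5.01, hypothesized variance 4.37
n, s2, a = 28, 5.01, 4.37

chi2 = (n - 1) * s2 / a

lo, hi = 16.2, 40.1        # table chi2 values for 27 df at 0.95 and 0.05
accept = lo < chi2 < hi    # True -> null hypothesis accepted

# confidence interval (12) for sigma^2
ci = ((n - 1) * s2 / hi, (n - 1) * s2 / lo)
```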
In contrast to testing hypotheses about sample means with Student's test, where the probabilities of errors of the first and second kind did not change significantly when the assumption of normally distributed populations was violated, in the case of hypotheses about variances the errors change significantly when the normality conditions are not met. The problem considered above, of the equality of a variance to some fixed value, is of limited interest, since situations in which the population variance is known are quite rare. Of much greater interest is the case when it is necessary to check whether the variances of two populations are equal, i.e. to test H0: σ1² = σ2² against the alternative H1: σ1² ≠ σ2². It is assumed that samples of sizes n1 and n2 are randomly drawn from general populations with variances σ1² and σ2². To test the null hypothesis, Fisher's variance-ratio test is used:

F = s1² / s2². (13)
Since the sums of squared deviations of normally distributed random variables from their mean values have a χ²-distribution, both the numerator and the denominator of (13) are χ²-distributed values divided by n1 − 1 and n2 − 1, respectively, and therefore their ratio has an F-distribution with n1 − 1 and n2 − 1 degrees of freedom. It is generally accepted, and F-distribution tables are constructed accordingly, that the larger of the variances is taken as the numerator in (13), so only one critical point is determined, corresponding to the selected significance level.

Suppose we have two samples of sizes n1 = 11 and n2 = 28 from populations of common and oval pond snails, for which the height-to-width ratios have variances s1² = 0.59 and s2² = 0.38. It is necessary to test the hypothesis of the equality of the variances of these indicators for the populations under study at a significance level of α = 0.05. We have F = 0.59/0.38 ≈ 1.55.

In the literature one can sometimes find the statement that testing the hypothesis of equality of means using Student's test should be preceded by testing the hypothesis of equality of variances. This is a poor recommendation; moreover, it can lead to mistakes that can be avoided if it is not followed. Indeed, the results of testing the hypothesis of equality of variances using Fisher's test depend strongly on the assumption that the samples are drawn from normally distributed populations. At the same time, Student's test is insensitive to violations of normality, and if samples of equal size can be obtained, the assumption of equality of variances is also not essential. In the case of unequal n, formulas (7) and (8) should be used for the check.

When testing hypotheses about the equality of variances, some peculiarities arise in calculations associated with dependent samples. In this case, the hypothesis H0: σ1² = σ2² is tested against the alternative H1: σ1² ≠ σ2² using the statistic

t = (F − 1)√(n − 2) / (2√(F(1 − r²))), (14)

where F is the ratio of the larger sample variance to the smaller one and r is the correlation coefficient between the paired measurements.
If the null hypothesis is true, statistic (14) has a Student t-distribution with n − 2 degrees of freedom.

When measuring the gloss of 35 coating samples, a variance of 134.5 was obtained. Repeated measurements two weeks later gave 199.1, and the correlation coefficient between the paired measurements turned out to be r = 0.876. If we ignore the fact that the samples are dependent and use Fisher's test, we get F = 1.48. At the significance level α = 0.05 the null hypothesis would then be accepted, since the critical value of the F-distribution for 34 and 34 degrees of freedom is 1.79. If instead we use formula (14), which is suitable for this case, we obtain t = 2.35, while the critical value of t for 33 degrees of freedom and the selected significance level α = 0.05 is 2.03. Therefore, the null hypothesis of equal variances in the two samples should be rejected. Thus, this example shows that, as with testing the hypothesis of equality of means, using a criterion that does not take into account the specifics of the experimental data leads to error.

In the recommended literature one can find Bartlett's test, which is used to test the hypothesis of the simultaneous equality of k variances. Apart from the fact that calculating the statistic of this criterion is quite laborious, its main disadvantage is that it is unusually sensitive to deviations from the assumption that the sampled populations are normally distributed. Thus, when using it, one can never be sure that the null hypothesis was rejected because the variances really differ statistically significantly, and not because the samples are non-normal. Therefore, if the problem of comparing several variances arises, one should look for a formulation in which Fisher's criterion or its modifications can be used.
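A sketch of the dependent-samples variance test, assuming the common Pitman-Morgan form of statistic (14); with the gloss data it reproduces the t = 2.35 quoted above:

```python
import math

def pitman_morgan_t(s1sq, s2sq, r, n):
    """Dependent-samples variance comparison (a common form of (14)):
    t = (F - 1) * sqrt(n - 2) / (2 * sqrt(F * (1 - r^2))),
    where F is the larger-to-smaller sample variance ratio and r is the
    correlation between the paired measurements; df = n - 2."""
    f = max(s1sq, s2sq) / min(s1sq, s2sq)
    return (f - 1) * math.sqrt(n - 2) / (2 * math.sqrt(f * (1 - r * r)))

t = pitman_morgan_t(134.5, 199.1, 0.876, 35)   # gloss example from the text
```

Note how the statistic grows as the paired correlation r approaches 1: for strongly correlated measurements even a modest variance ratio becomes significant, which is exactly why the plain F-test misses the difference here.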
3 Goodness-of-fit criteria for proportions

Quite often it is necessary to analyze populations in which objects can be classified into one of two categories: for example, by sex in a certain population, by the presence of a certain trace element in the soil, by the dark or light color of eggs in some species of birds, etc. We denote the proportion of elements possessing the quality of interest by P, where P is the ratio of the number of objects with that quality to the total number of objects in the population.
Where