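Let the random sample $X_1, X_2, \dots, X_n$ be generated by the observed random variable ξ, whose mathematical expectation $m = M\xi$ and variance $\sigma^2 = D\xi$ are unknown. It was proposed to estimate these characteristics by the sample mean

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

and the sample variance

$S^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$. (3.14)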

Let us consider some properties of estimates of mathematical expectation and dispersion.

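1. Calculate the mathematical expectation of the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$:

$M\bar{X} = M\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} MX_i = \frac{1}{n}\cdot nm = m$.

Therefore, the sample mean is an unbiased estimator for $m = M\xi$.

2. Recall that the observation results $X_1, \dots, X_n$ are independent random variables, each of which has the same distribution law as ξ, which means $MX_i = m$, $DX_i = \sigma^2$, $i = 1, \dots, n$. We will assume that the variance $\sigma^2$ is finite. Then, according to Chebyshev’s theorem on the law of large numbers, for any ε > 0 the equality

$\lim_{n\to\infty} P\left(\left|\frac{1}{n}\sum_{i=1}^{n} X_i - m\right| < \varepsilon\right) = 1$

holds, which can be written like this:

$\lim_{n\to\infty} P\left(|\bar{X} - m| < \varepsilon\right) = 1$. (3.16)

Comparing (3.16) with the definition of the consistency property (3.11), we see that $\bar{X}$ is a consistent estimate of the mathematical expectation.

3. Find the variance of the sample mean:

$D\bar{X} = D\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} DX_i = \frac{\sigma^2}{n}$. (3.17)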

Thus, the variance of the mathematical expectation estimate decreases in inverse proportion to the sample size.
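The inverse proportionality in (3.17) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the observations are taken to be standard normal (so the variance of one observation is 1), and the function name, trial count and seed are hypothetical choices.

```python
import random
import statistics

def variance_of_sample_mean(n, trials=20000, seed=0):
    """Empirical variance of the mean of n standard-normal observations."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
             for _ in range(trials)]
    return statistics.pvariance(means)

# Assumed setting: variance of one observation is 1, so theory gives 1 / n.
results = {n: variance_of_sample_mean(n) for n in (1, 4, 16)}
for n, v in results.items():
    print(n, round(v, 3), 1 / n)
```

Each fourfold increase of the sample size should reduce the empirical variance of the mean by roughly a factor of four.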

It can be proven that if the random variable ξ is normally distributed, then the sample mean is an efficient estimate of the mathematical expectation, that is, its variance is the smallest among all unbiased estimates of the mathematical expectation. For other distribution laws of ξ this may not be the case.

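The sample variance $S^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is a biased estimate of the variance $\sigma^2$ because

$MS^2 = \frac{n-1}{n}\sigma^2$. (3.18)

Indeed, using the properties of the mathematical expectation and formula (3.17), we find

$MS^2 = M\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - m)^2 - (\bar{X} - m)^2\right] = \sigma^2 - D\bar{X} = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2$.

To obtain an unbiased estimate of the variance, estimate (3.14) must be corrected, that is, multiplied by $\frac{n}{n-1}$. Then we get the unbiased sample variance

$S_0^2 = \frac{n}{n-1}S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$. (3.19)

Note that formulas (3.14) and (3.19) differ only in the denominator, and for large $n$ the sample and unbiased variances differ little. However, with a small sample size, relation (3.19) should be used.

To estimate the standard deviation of a random variable, the so-called “corrected” standard deviation is used, which is equal to the square root of the unbiased variance: $S_0 = \sqrt{S_0^2}$.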
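As a minimal sketch of estimates (3.14) and (3.19) and the corrected standard deviation, the following code computes them for a small made-up data set. The function names and the data are hypothetical and serve only to show how the two denominators differ.

```python
import math

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Biased estimate (3.14): squared deviations divided by n."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def unbiased_variance(xs):
    """Corrected estimate (3.19): the biased estimate times n / (n - 1)."""
    n = len(xs)
    return sample_variance(xs) * n / (n - 1)

def corrected_std(xs):
    """Square root of the unbiased variance."""
    return math.sqrt(unbiased_variance(xs))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations
print(sample_mean(data))        # 5.0
print(sample_variance(data))    # 4.0
print(unbiased_variance(data))  # 32 / 7, about 4.571
print(corrected_std(data))
```

With only eight observations the two variance estimates differ by about 14 percent, which is why the corrected form is preferred for small samples.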

Interval estimates

In statistics, there are two approaches to estimating unknown parameters of distributions: point and interval. Point estimation, which was discussed in the previous section, indicates only the point around which the estimated parameter is located. It is desirable, however, to know how far this parameter may actually be from the possible realizations of the estimates in different series of observations.

The answer to this question, which is also approximate, is given by another method of estimating parameters: interval estimation. With this method, an interval is found that covers the unknown numerical value of the parameter with a probability close to one.

The concept of interval estimation

The point estimate $\hat{\theta}$ is a random variable and, for possible sample realizations, takes values only approximately equal to the true value of the parameter $\theta$. The smaller the difference $|\hat{\theta} - \theta|$, the more accurate the estimate. Thus, a positive number δ for which $|\hat{\theta} - \theta| < \delta$ characterizes the accuracy of the estimate and is called the estimation error (or marginal error).

The confidence probability (or reliability) is the probability β with which the inequality $|\hat{\theta} - \theta| < \delta$ holds, i.e.

$P(|\hat{\theta} - \theta| < \delta) = \beta$. (3.20)

Replacing the inequality $|\hat{\theta} - \theta| < \delta$ by the equivalent double inequality $-\delta < \hat{\theta} - \theta < \delta$, or $\hat{\theta} - \delta < \theta < \hat{\theta} + \delta$, we get

$P(\hat{\theta} - \delta < \theta < \hat{\theta} + \delta) = \beta$. (3.21)

The interval $(\hat{\theta} - \delta, \hat{\theta} + \delta)$, which covers the unknown parameter $\theta$ with probability β, 0 < β < 1, is called the confidence interval (or interval estimate) corresponding to the confidence probability β.

Not only the estimate $\hat{\theta}$ but also the error δ is a random variable: its value depends on the probability β and, as a rule, on the sample. Therefore, the confidence interval is random, and expression (3.21) should be read as follows: “The interval will cover the parameter with probability β”, and not like this: “The parameter will fall into the interval with probability β”.

The meaning of the confidence interval is that, when samples of size $n$ are drawn many times, the confidence interval corresponding to the confidence probability β covers the true value of the estimated parameter in a relative proportion of cases equal to β. Thus, the confidence probability β characterizes the reliability of the confidence estimate: the larger β is, the more likely it is that a realization of the confidence interval contains the unknown parameter.
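The frequency interpretation of the confidence probability can be checked by simulation. The sketch below is one possible illustration, not a general method: it assumes normal observations with a known standard deviation, so that the error is δ = z·σ/√n with z the (1 + β)/2 quantile of the standard normal law. All names and numeric settings are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def coverage(beta=0.95, n=10, sigma=2.0, m=5.0, trials=4000, seed=1):
    """Fraction of simulated samples whose interval covers the true mean m.

    Assumption: normal observations with known sigma, so that
    delta = z * sigma / sqrt(n), where z is the (1 + beta) / 2 quantile.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf((1 + beta) / 2)
    delta = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        x_bar = fmean(rng.gauss(m, sigma) for _ in range(n))
        if x_bar - delta < m < x_bar + delta:
            hits += 1
    return hits / trials

cov95 = coverage(0.95)
cov80 = coverage(0.80)
print(cov95)  # expected to be near 0.95
print(cov80)  # expected to be near 0.80
```

The true mean stays fixed in every trial; only the interval moves from sample to sample, which matches the reading "the interval covers the parameter".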

PURPOSE OF THE LECTURE: introduce the concept of estimating an unknown distribution parameter and give a classification of such estimates; obtain point and interval estimates of mathematical expectation and dispersion.

In practice, in most cases the distribution law of a random variable $X$ is unknown, and from the results of observations $x_1, x_2, \dots, x_n$ it is necessary to estimate numerical characteristics (for example, mathematical expectation, variance or other moments) or an unknown parameter $\theta$ that determines the distribution law (distribution density) $f(x, \theta)$ of the random variable being studied. Thus, for an exponential distribution or Poisson distribution it is enough to estimate one parameter, but for a normal distribution two parameters must be estimated: the mathematical expectation and the variance.

Types of estimates

A random variable $X$ has probability density $f(x, \theta)$, where $\theta$ is an unknown distribution parameter. As a result of the experiment, the values $x_1, x_2, \dots, x_n$ of this random variable were obtained. To make an estimate essentially means to associate the sample values of the random variable with a certain value of the parameter $\theta$, i.e. to construct some function of the observation results $\hat{\theta}_n = \hat{\theta}_n(x_1, x_2, \dots, x_n)$, the value of which is taken as the estimate of the parameter $\theta$. The index $n$ indicates the number of experiments performed.

Any function that depends on the results of observations is called a statistic. Since the results of observations are random variables, a statistic is also a random variable. Therefore, the estimate $\hat{\theta}_n$ of the unknown parameter $\theta$ should be considered as a random variable, and its value calculated from experimental data of size $n$ as one of the possible values of this random variable.

Estimates of distribution parameters (numerical characteristics of a random variable) are divided into point and interval. A point estimate of the parameter $\theta$ is determined by one number $\hat{\theta}_n$, and its accuracy is characterized by the variance of the estimate. An interval estimate is an estimate determined by two numbers, $\theta_1$ and $\theta_2$, the ends of the interval covering the estimated parameter $\theta$ with a given confidence probability.

Classification of point estimates

For a point estimate $\hat{\theta}_n$ of an unknown parameter $\theta$ to be the best in terms of accuracy, it must be consistent, unbiased and efficient.

An estimate $\hat{\theta}_n$ of the parameter $\theta$ is called consistent if it converges in probability to the estimated parameter, i.e. for any ε > 0

$\lim_{n\to\infty} P(|\hat{\theta}_n - \theta| < \varepsilon) = 1$. (8.8)

Based on Chebyshev’s inequality, it can be shown that a sufficient condition for the fulfillment of relation (8.8) is the equality

$\lim_{n\to\infty} M[(\hat{\theta}_n - \theta)^2] = 0$.

Consistency is an asymptotic characteristic of the estimate as $n \to \infty$.

An estimate $\hat{\theta}_n$ is called unbiased (an estimate without systematic error) if its mathematical expectation is equal to the estimated parameter, i.e.

$M[\hat{\theta}_n] = \theta$. (8.9)

If equality (8.9) is not satisfied, then the estimate is called biased. The difference $M[\hat{\theta}_n] - \theta$ is called the bias, or systematic error, of the estimate. If equality (8.9) is satisfied only as $n \to \infty$, then the corresponding estimate is called asymptotically unbiased.

It should be noted that if consistency is an almost mandatory condition for all estimates used in practice (inconsistent estimates are used extremely rarely), then the property of unbiasedness is only desirable. Many frequently used estimates do not have the unbiased property.

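In general, the accuracy of the estimate $\hat{\theta}_n$ of some parameter $\theta$, obtained on the basis of experimental data $x_1, x_2, \dots, x_n$, is characterized by the mean squared error

$M[(\hat{\theta}_n - \theta)^2]$,

which can be reduced to the form

$M[(\hat{\theta}_n - \theta)^2] = D[\hat{\theta}_n] + (M[\hat{\theta}_n] - \theta)^2$,

where $D[\hat{\theta}_n]$ is the variance of the estimate and $(M[\hat{\theta}_n] - \theta)^2$ is the squared bias of the estimate.

If the estimate is unbiased, then

$M[(\hat{\theta}_n - \theta)^2] = D[\hat{\theta}_n]$.

For finite $n$, estimates may differ in their mean squared error. Naturally, the smaller this error, the more closely the values of the estimate are grouped around the estimated parameter. Therefore, it is always desirable that the estimation error be as small as possible, i.e., that the condition

$M[(\hat{\theta}_n - \theta)^2] = \min$ (8.10)

be satisfied.

An estimate $\hat{\theta}_n$ satisfying condition (8.10) is called an estimate with minimum squared error.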
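The decomposition of the mean squared error into variance plus squared bias can be checked numerically. The sketch below uses a deliberately biased estimator, 0.9 times the sample mean of normal observations; this estimator, the parameter values and the function name are assumptions made only for illustration.

```python
import random
from statistics import fmean, pvariance

def mse_decomposition(theta=3.0, n=5, trials=10000, seed=2):
    """Return (mse, variance, bias) for the biased estimator 0.9 * sample mean."""
    rng = random.Random(seed)
    estimates = [0.9 * fmean(rng.gauss(theta, 1.0) for _ in range(n))
                 for _ in range(trials)]
    mse = fmean((t - theta) ** 2 for t in estimates)   # mean squared error
    variance = pvariance(estimates)                    # spread around own mean
    bias = fmean(estimates) - theta                    # systematic error
    return mse, variance, bias

mse, variance, bias = mse_decomposition()
print(mse, variance + bias ** 2)  # the two numbers agree up to rounding
print(bias)                       # expected to be near -0.3 for theta = 3
```

For the empirical quantities the identity holds exactly (up to floating-point rounding), because it is an algebraic property of averages rather than a limit statement.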

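An estimate $\hat{\theta}_n$ is called efficient if its mean squared error is not greater than the mean squared error of any other estimate, i.e.

$M[(\hat{\theta}_n - \theta)^2] \le M[(\tilde{\theta}_n - \theta)^2]$,

where $\tilde{\theta}_n$ is any other estimate of the parameter $\theta$.

It is known that the variance of any unbiased estimate of one parameter $\theta$ satisfies the Cramer-Rao inequality

$D[\hat{\theta}_n] \ge \frac{1}{M\left[\left(\frac{\partial \ln f(x_1, \dots, x_n \mid \theta)}{\partial \theta}\right)^2\right]}$,

where $f(x_1, \dots, x_n \mid \theta)$ is the conditional probability density of the obtained values of the random variable given the true value of the parameter $\theta$.

Thus, an unbiased estimate $\hat{\theta}_n$ for which the Cramer-Rao inequality becomes an equality is efficient, i.e., such an estimate has minimal variance.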

Point estimates of expectation and variance

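If a random variable $X$ with mathematical expectation $m$ and variance $D$ is considered, both of these parameters are regarded as unknown. Therefore, $n$ independent experiments are performed on the random variable $X$, giving the results $x_1, x_2, \dots, x_n$. It is necessary to find consistent and unbiased estimates of the unknown parameters $m$ and $D$.

As the estimates $\hat{m}$ and $\hat{D}$, the statistical (sample) mean and the statistical (sample) variance are usually chosen, respectively:

$\hat{m} = \frac{1}{n}\sum_{i=1}^{n} x_i$; (8.11)

$\hat{D} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{m})^2$. (8.12)

The estimate of the mathematical expectation (8.11) is consistent according to the law of large numbers (Chebyshev’s theorem):

$\lim_{n\to\infty} P(|\hat{m} - m| < \varepsilon) = 1$.

The mathematical expectation of the random variable $\hat{m}$ is

$M[\hat{m}] = \frac{1}{n}\sum_{i=1}^{n} M[x_i] = m$.

Therefore, the estimate $\hat{m}$ is unbiased.

The variance of the estimate of the mathematical expectation is

$D[\hat{m}] = \frac{1}{n^2}\sum_{i=1}^{n} D[x_i] = \frac{D}{n}$.

If the random variable $X$ is distributed according to the normal law, then the estimate $\hat{m}$ is also efficient.

The mathematical expectation of the variance estimate is

$M[\hat{D}] = M\left[\frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{m})^2\right]$.

At the same time,

$\sum_{i=1}^{n} (x_i - \hat{m})^2 = \sum_{i=1}^{n} (x_i - m)^2 - n(\hat{m} - m)^2$.

Since $M[(x_i - m)^2] = D$ and $M[(\hat{m} - m)^2] = D[\hat{m}] = \frac{D}{n}$, we get

$M[\hat{D}] = \frac{1}{n}(nD - D) = \frac{n-1}{n}D$. (8.13)

Thus, $\hat{D}$ is a biased estimate, although it is consistent and efficient.

From formula (8.13) it follows that, to obtain an unbiased estimate $\hat{D}_0$, the sample variance (8.12) should be modified as follows:

$\hat{D}_0 = \frac{n}{n-1}\hat{D} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \hat{m})^2$,

which is considered “better” compared to estimate (8.12), although for large $n$ these estimates are almost equal to each other.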
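The bias described by (8.13) becomes visible when the variance estimate is averaged over many small samples. The following sketch assumes normal observations with true variance 4 and samples of size 5; these settings, the seed and the function names are hypothetical.

```python
import random

def biased_variance(xs):
    """Estimate (8.12): squared deviations from the sample mean divided by n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def average_variance_estimates(n=5, true_var=4.0, trials=20000, seed=3):
    """Average the biased and the corrected estimate over many samples."""
    rng = random.Random(seed)
    sigma = true_var ** 0.5
    total_biased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_biased += biased_variance(xs)
    mean_biased = total_biased / trials
    mean_corrected = mean_biased * n / (n - 1)   # correction factor n / (n - 1)
    return mean_biased, mean_corrected

mean_biased, mean_corrected = average_variance_estimates()
print(mean_biased)     # expected near (n - 1) / n * 4.0 = 3.2
print(mean_corrected)  # expected near 4.0
```

With n = 5 the biased estimate undershoots the true variance by about 20 percent on average, which the factor n / (n - 1) removes.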

Methods for obtaining estimates of distribution parameters

Often in practice, an analysis of the physical mechanism that generates the random variable $X$ allows a conclusion to be drawn about the distribution law of this random variable. However, the parameters of this distribution are unknown and must be estimated from the experimental results, usually presented in the form of a finite sample $x_1, x_2, \dots, x_n$. To solve this problem, two methods are most often used: the method of moments and the maximum likelihood method.

Method of moments. The method consists in equating theoretical moments with corresponding empirical moments of the same order.

The empirical initial moments of order $k$ are determined by the formula

$\hat{\alpha}_k = \frac{1}{n}\sum_{i=1}^{n} x_i^k$,

and the corresponding theoretical initial moments of order $k$ by the formulas

$\alpha_k = \sum_{j} x_j^k\, p(x_j, \theta)$ for discrete random variables,

$\alpha_k = \int_{-\infty}^{\infty} x^k f(x, \theta)\, dx$ for continuous random variables,

where $\theta$ is the estimated distribution parameter.

To obtain estimates of the parameters of a distribution containing two unknown parameters $\theta_1$ and $\theta_2$, a system of two equations is compiled:

$\alpha_1(\theta_1, \theta_2) = \hat{\alpha}_1$, $\mu_2(\theta_1, \theta_2) = \hat{\mu}_2$,

where $\mu_2$ and $\hat{\mu}_2$ are the theoretical and empirical central moments of the second order.

The solution to the system of equations is the estimates $\hat{\theta}_1$ and $\hat{\theta}_2$ of the unknown distribution parameters $\theta_1$ and $\theta_2$.

Equating the theoretical and empirical initial moments of the first order, we find that the estimate of the mathematical expectation of a random variable $X$ with an arbitrary distribution is the sample mean, i.e. $\hat{m} = \frac{1}{n}\sum_{i=1}^{n} x_i$. Then, equating the theoretical and empirical central moments of the second order, we find that the estimate of the variance of the random variable $X$ with an arbitrary distribution is determined by the formula

$\hat{D} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{m})^2$.

In a similar way, one can find estimates of theoretical moments of any order.

The method of moments is simple and does not require complex calculations, but the estimates obtained by this method are often inefficient.
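As an illustration of a two-equation moment system, consider a uniform distribution on [a, b]; this choice of distribution is an assumption made for the example and is not taken from the text. Its theoretical mean is (a + b)/2 and its variance is (b − a)²/12, so equating them to the empirical mean and second central moment gives closed-form estimates. The function and variable names are hypothetical.

```python
import random
from statistics import fmean, pvariance

def uniform_moment_estimates(xs):
    """Method-of-moments estimates of (a, b) for a uniform law on [a, b].

    Solves  (a + b) / 2 = m_hat  and  (b - a)^2 / 12 = d_hat.
    """
    m_hat = fmean(xs)
    d_hat = pvariance(xs, m_hat)        # empirical second central moment
    half_width = (3.0 * d_hat) ** 0.5   # (b - a) / 2 = sqrt(3 * d_hat)
    return m_hat - half_width, m_hat + half_width

rng = random.Random(4)
sample = [rng.uniform(2.0, 6.0) for _ in range(5000)]  # hypothetical data
a_hat, b_hat = uniform_moment_estimates(sample)
print(a_hat, b_hat)  # expected to be near 2 and 6
```

For this particular distribution the moment estimates are known to be less accurate than estimates based on the sample minimum and maximum, which is an instance of the inefficiency mentioned above.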

Maximum likelihood method. The maximum likelihood method of point estimation of unknown distribution parameters comes down to finding the maximum of a function of one or more estimated parameters.

Let $X$ be a continuous random variable that took the values $x_1, x_2, \dots, x_n$ in $n$ trials. To obtain an estimate of the unknown parameter $\theta$, it is necessary to find the value $\hat{\theta}$ at which the probability of obtaining the observed sample is maximal. Since $x_1, x_2, \dots, x_n$ are mutually independent quantities with the same probability density $f(x, \theta)$, the likelihood function is defined as the following function of the argument $\theta$:

$L(x_1, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i, \theta)$.

The maximum likelihood estimate of the parameter $\theta$ is the value $\hat{\theta}$ at which the likelihood function reaches its maximum, i.e., a solution of the equation

$\frac{\partial L(x_1, \dots, x_n; \theta)}{\partial \theta} = 0$,

which clearly depends on the trial results $x_1, x_2, \dots, x_n$.

Since the functions $L$ and $\ln L$ reach their maximum at the same value of $\theta$, the logarithmic likelihood function is often used to simplify calculations, and the root of the corresponding equation

$\frac{\partial \ln L(x_1, \dots, x_n; \theta)}{\partial \theta} = 0$,

which is called the likelihood equation, is sought.

If several parameters $\theta_1, \dots, \theta_k$ of the distribution $f(x; \theta_1, \dots, \theta_k)$ need to be estimated, then the likelihood function depends on all of these parameters. To find the estimates $\hat{\theta}_1, \dots, \hat{\theta}_k$ of the distribution parameters, it is necessary to solve the system of $k$ likelihood equations

$\frac{\partial \ln L}{\partial \theta_j} = 0$, $j = 1, \dots, k$.

The maximum likelihood method provides consistent and asymptotically efficient estimates. However, estimates obtained by the maximum likelihood method may be biased, and, in addition, finding them often requires solving rather complex systems of equations.
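A minimal sketch of the likelihood equation for one parameter follows, assuming an exponential density f(x, λ) = λ·exp(−λx); the choice of distribution is an assumption for the example. Here ln L = n·ln λ − λ·Σx_i, and the likelihood equation n/λ − Σx_i = 0 has the root λ̂ = n/Σx_i. The function names, seed and sample size are hypothetical.

```python
import math
import random

def log_likelihood(lam, xs):
    """ln L for the exponential density f(x, lam) = lam * exp(-lam * x)."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def mle_exponential(xs):
    """Root of the likelihood equation n / lam - sum(xs) = 0."""
    return len(xs) / sum(xs)

rng = random.Random(5)
sample = [rng.expovariate(1.5) for _ in range(2000)]  # hypothetical data, true lam = 1.5
lam_hat = mle_exponential(sample)
print(lam_hat)

# The log-likelihood at lam_hat is not smaller than at nearby parameter values.
is_maximum = all(log_likelihood(lam_hat, sample) >= log_likelihood(lam_hat * k, sample)
                 for k in (0.8, 0.9, 1.1, 1.2))
print(is_maximum)
```

This estimate is the reciprocal of the sample mean; it is consistent but slightly biased for finite n, which illustrates the remark about bias above.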

Interval parameter estimates

The accuracy of point estimates is characterized by their variance. However, this gives no information about how close the obtained estimates are to the true values of the parameters. In a number of problems, it is necessary not only to find a suitable numerical value for the parameter $\theta$, but also to assess its accuracy and reliability. One needs to find out what errors may result from replacing the parameter $\theta$ by its point estimate $\hat{\theta}$, and with what degree of confidence these errors can be expected not to exceed known limits.

Such problems are especially relevant when the number of experiments $n$ is small, when the point estimate $\hat{\theta}$ is largely random and the approximate replacement of $\theta$ by $\hat{\theta}$ can lead to significant errors.

A more complete and reliable way to estimate distribution parameters is to determine not a single point value, but an interval that, with a given probability, covers the true value of the estimated parameter.

Suppose that an unbiased estimate $\hat{\theta}$ of the parameter $\theta$ has been obtained from the results of $n$ experiments. It is necessary to assess the possible error. Some sufficiently large probability β is selected, such that an event with this probability can be considered practically certain, and a value ε is found for which

$P(|\hat{\theta} - \theta| < \varepsilon) = \beta$. (8.15)

In this case, the range of practically possible values of the error arising when $\theta$ is replaced by $\hat{\theta}$ is $\pm\varepsilon$, and errors that are larger in absolute value appear only with the low probability $\alpha = 1 - \beta$.

Expression (8.15) means that with probability β the unknown value of the parameter $\theta$ is covered by the interval

$I_\beta = (\hat{\theta} - \varepsilon, \hat{\theta} + \varepsilon)$. (8.16)

The probability β is called the confidence probability, and the interval $I_\beta$, which covers the true value of the parameter with probability β, is called the confidence interval. Note that it is incorrect to say that the parameter value lies within the confidence interval with probability β. The formulation used (covers) reflects the fact that although the parameter being estimated is unknown, it has a constant value and therefore has no spread, since it is not a random variable.
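For a single observed sample, an interval of the form (8.16) for the mathematical expectation might be computed as in the sketch below. It relies on a normal approximation with the corrected standard deviation in place of the unknown one, which is an assumption of this example; for small samples from a normal law, Student's t quantiles give a more accurate interval. The data and function name are hypothetical.

```python
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(xs, beta=0.95):
    """Approximate interval (m_hat - eps, m_hat + eps) for the mean.

    Assumption: m_hat is treated as approximately normal, so that
    eps = z * s / sqrt(n) with z the (1 + beta) / 2 normal quantile.
    """
    n = len(xs)
    m_hat = fmean(xs)
    z = NormalDist().inv_cdf((1 + beta) / 2)
    eps = z * stdev(xs) / n ** 0.5   # stdev uses the n - 1 denominator
    return m_hat - eps, m_hat + eps

data = [10.1, 9.8, 10.4, 10.0, 9.7, 10.3, 10.2, 9.9]  # hypothetical measurements
low, high = mean_confidence_interval(data)
low99, high99 = mean_confidence_interval(data, beta=0.99)
print(low, high)
print(low99, high99)  # a larger beta gives a wider interval
```

The trade-off is visible in the output: demanding a higher confidence probability β for the same data increases ε and widens the interval.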

Mathematical expectation of a random variable


Mathematical expectation: definition

Mathematical expectation is one of the most important concepts in mathematical statistics and probability theory, characterizing the distribution of values or probabilities of a random variable. It is typically expressed as a weighted average of all possible values of a random variable. It is widely used in technical analysis, the study of number series, and the study of continuous and long-running processes. It is important in assessing risks and predicting price indicators when trading in financial markets, and is used in developing strategies and methods of gaming tactics in the theory of gambling.

Mathematical expectation is the average value of a random variable; the probability distribution of a random variable is studied in probability theory.

Mathematical expectation is a measure of the average value of a random variable in probability theory. The mathematical expectation of a random variable x is denoted by M(x).

Mathematical expectation is, in probability theory, a weighted average of all possible values that a random variable can take.

Mathematical expectation is the sum of the products of all possible values of a random variable and the probabilities of these values.

Mathematical expectation is the average benefit from a particular decision, provided that such a decision can be considered within the framework of the law of large numbers and over a long run.

Mathematical expectation is, in gambling theory, the amount a player can win or lose, on average, on each bet. In gambling parlance, this is sometimes called the "player's edge" (if it is positive for the player) or the "house edge" (if it is negative for the player).

Mathematical expectation is the probability of a win multiplied by the average profit, minus the probability of a loss multiplied by the average loss.

Mathematical expectation of a random variable in mathematical theory

One of the important numerical characteristics of a random variable is its mathematical expectation. Let us introduce the concept of a system of random variables. Consider a set of random variables $X_1, \dots, X_n$ that are the results of the same random experiment. If $(x_1, \dots, x_n)$ is one of the possible values of the system, then the event $\{X_1 = x_1, \dots, X_n = x_n\}$ corresponds to a certain probability that satisfies Kolmogorov’s axioms. A function defined for all possible values of the random variables is called a joint distribution law. This function allows the probability of any event determined by these random variables to be calculated. In particular, the joint distribution law of random variables $X$ and $Y$, which take values from the sets $\{x_i\}$ and $\{y_j\}$, is given by the probabilities $p_{ij} = P(X = x_i, Y = y_j)$.

The term “mathematical expectation” was introduced by Pierre Simon Marquis de Laplace (1795) and comes from the concept of “expected value of winnings,” which first appeared in the 17th century in the theory of gambling in the works of Blaise Pascal and Christiaan Huygens. However, the first complete theoretical understanding and assessment of this concept was given by Pafnuty Lvovich Chebyshev (mid-19th century).

The distribution law of random numerical variables (distribution function and distribution series or probability density) completely describes the behavior of a random variable. But in a number of problems, it is enough to know some numerical characteristics of the quantity under study (for example, its average value and possible deviation from it) in order to answer the question posed. The main numerical characteristics of random variables are the mathematical expectation, variance, mode and median.

The mathematical expectation of a discrete random variable is the sum of the products of its possible values and their corresponding probabilities. Sometimes the mathematical expectation is called a weighted average, since it is approximately equal to the arithmetic mean of the observed values of a random variable over a large number of experiments. From the definition of mathematical expectation it follows that its value is no less than the smallest possible value of a random variable and no more than the largest. The mathematical expectation of a random variable is a non-random (constant) variable.

The mathematical expectation has a simple physical meaning: if a unit mass is distributed along a straight line, either by placing certain masses at some points (for a discrete distribution) or by “smearing” it with a certain density (for an absolutely continuous distribution), then the point corresponding to the mathematical expectation is the coordinate of the “center of gravity” of the line.

The average value of a random variable is a certain number that is, as it were, its “representative” and replaces it in roughly approximate calculations. When we say: “the average lamp operating time is 100 hours” or “the average point of impact is shifted relative to the target by 2 m to the right,” we are indicating a certain numerical characteristic of a random variable that describes its location on the numerical axis, i.e. "position characteristics".

Of the characteristics of a position in probability theory, the most important role is played by the mathematical expectation of a random variable, which is sometimes called simply the average value of a random variable.

Consider the random variable X, having possible values x1, x2, …, xn with probabilities p1, p2, …, pn. We need to characterize with some number the position of the values of the random variable on the x-axis, taking into account the fact that these values have different probabilities. For this purpose, it is natural to use the so-called “weighted average” of the values xi, where each value xi is taken into account during averaging with a “weight” proportional to the probability of this value. Thus, we calculate the average of the random variable X, which we denote M[X]:

$M[X] = \frac{x_1 p_1 + x_2 p_2 + \dots + x_n p_n}{p_1 + p_2 + \dots + p_n} = \sum_{i=1}^{n} x_i p_i$,

since $p_1 + p_2 + \dots + p_n = 1$.

This weighted average is called the mathematical expectation of the random variable. Thus, we have introduced one of the most important concepts of probability theory: the concept of mathematical expectation. The mathematical expectation of a random variable is the sum of the products of all possible values of the random variable and the probabilities of these values.

The mathematical expectation M[X] is connected by a peculiar dependence with the arithmetic mean of the observed values of the random variable over a large number of experiments. This dependence is of the same type as the dependence between frequency and probability, namely: with a large number of experiments, the arithmetic mean of the observed values of a random variable approaches (converges in probability to) its mathematical expectation. From the connection between frequency and probability, one can deduce as a consequence a similar connection between the arithmetic mean and the mathematical expectation. Indeed, consider the random variable X, characterized by the distribution series

x_i:  x1  x2  …  xn
p_i:  p1  p2  …  pn

Suppose N independent experiments are performed, in each of which X takes a certain value. Suppose the value x1 appeared m1 times, the value x2 appeared m2 times, and in general the value xi appeared mi times. Let us calculate the arithmetic mean of the observed values of X, which, in contrast to the mathematical expectation M[X], we denote M*[X]:

$M^*[X] = \frac{x_1 m_1 + x_2 m_2 + \dots + x_n m_n}{N} = \sum_{i=1}^{n} x_i \frac{m_i}{N} = \sum_{i=1}^{n} x_i p_i^*$,

where $p_i^* = m_i / N$ is the frequency of the value $x_i$.

As the number of experiments N increases, the frequencies $p_i^*$ will approach (converge in probability to) the corresponding probabilities $p_i$. Consequently, the arithmetic mean M*[X] of the observed values of the random variable will approach (converge in probability to) its mathematical expectation M[X] as the number of experiments increases. The connection between the arithmetic mean and mathematical expectation formulated above constitutes the content of one of the forms of the law of large numbers.
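The argument above, that the arithmetic mean of N observations equals the sum of the values weighted by their observed frequencies, can be reproduced in a few lines. The sketch below uses simulated rolls of a fair six-sided die as a hypothetical experiment; exact fractions are used so that the equality of the two forms is not blurred by rounding.

```python
import random
from collections import Counter
from fractions import Fraction

rng = random.Random(6)
N = 600
observations = [rng.randint(1, 6) for _ in range(N)]  # hypothetical experiment: die rolls

counts = Counter(observations)   # m_i: how many times each value x_i appeared
frequencies = {x: Fraction(m, N) for x, m in counts.items()}  # p_i* = m_i / N

weighted = sum(x * p for x, p in frequencies.items())  # sum of x_i * p_i*
arithmetic = Fraction(sum(observations), N)            # plain arithmetic mean

print(weighted == arithmetic)  # True: both expressions are the same quantity
print(float(weighted))         # expected to be near 3.5 for a fair die
```

As N grows, the frequencies approach 1/6 each and the weighted sum approaches the mathematical expectation 3.5.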

We already know that all forms of the law of large numbers state the fact that some averages are stable over a large number of experiments. Here we are talking about the stability of the arithmetic mean from a series of observations of the same quantity. With a small number of experiments, the arithmetic mean of their results is random; with a sufficient increase in the number of experiments, it becomes “almost non-random” and, stabilizing, approaches a constant value - the mathematical expectation.

The stability of averages over a large number of experiments can be easily verified experimentally. For example, when weighing a body in a laboratory on precise scales, as a result of weighing we obtain a new value each time; To reduce observation error, we weigh the body several times and use the arithmetic mean of the obtained values. It is easy to see that with a further increase in the number of experiments (weighings), the arithmetic mean reacts to this increase less and less and, with a sufficiently large number of experiments, practically ceases to change.

It should be noted that the most important characteristic of the position of a random variable - the mathematical expectation - does not exist for all random variables. It is possible to compose examples of such random variables for which the mathematical expectation does not exist, since the corresponding sum or integral diverges. However, such cases are not of significant interest for practice. Typically, the random variables we deal with have a limited range of possible values and, of course, have a mathematical expectation.

In addition to the most important characteristics of the position of a random variable - the mathematical expectation - in practice, other characteristics of the position are sometimes used, in particular, the mode and median of the random variable.

The mode of a random variable is its most probable value. The term "most probable value" strictly speaking applies only to discrete quantities; for a continuous quantity, the mode is the value at which the probability density is maximum. The figures show the mode for discrete and continuous random variables, respectively.

If the distribution polygon (distribution curve) has more than one maximum, the distribution is called "multimodal".

Sometimes there are distributions that have a minimum in the middle rather than a maximum. Such distributions are called “anti-modal”.

In the general case, the mode and mathematical expectation of a random variable do not coincide. In the particular case, when the distribution is symmetrical and modal (i.e. has a mode) and there is a mathematical expectation, then it coincides with the mode and center of symmetry of the distribution.

Another position characteristic is often used: the so-called median of a random variable. This characteristic is usually used only for continuous random variables, although it can be formally defined for a discrete variable. Geometrically, the median is the abscissa of the point at which the area enclosed by the distribution curve is divided in half.

In the case of a symmetric modal distribution, the median coincides with the mathematical expectation and mode.
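The relation between the three position characteristics can be illustrated on data, using sample analogues of the mean, median and mode from Python's standard `statistics` module. Both data sets below are made up for the example.

```python
from statistics import fmean, median, mode

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical symmetric, unimodal data
skewed = [1, 1, 1, 2, 2, 3, 4, 7, 15]    # hypothetical right-skewed data

summary = {name: (fmean(data), median(data), mode(data))
           for name, data in (("symmetric", symmetric), ("skewed", skewed))}
for name, values in summary.items():
    print(name, values)
```

For the symmetric set all three characteristics coincide at 3, while for the skewed set the mean (4.0), median (2) and mode (1) are all different.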

The mathematical expectation is the average value of a random variable, a numerical characteristic of its probability distribution. In the most general form, the mathematical expectation of a random variable X(ω) is defined as the Lebesgue integral with respect to the probability measure P on the original probability space Ω:

$M[X] = \int_{\Omega} X(\omega)\, P(d\omega)$.

The mathematical expectation can also be calculated as the Lebesgue integral of x with respect to the probability distribution $P_X$ of the variable X:

$M[X] = \int_{\mathbb{R}} x\, P_X(dx)$.

The concept of a random variable with infinite mathematical expectation can be defined in a natural way. A typical example is the return times of some random walks.

Using the mathematical expectation, many numerical and functional characteristics of a distribution are determined (as the mathematical expectation of the corresponding functions of a random variable), for example, the generating function, characteristic function, moments of any order, in particular dispersion, covariance.

The mathematical expectation is a characteristic of the location of the values of a random variable (the average value of its distribution). In this capacity, the mathematical expectation serves as a “typical” distribution parameter, and its role is similar to the role of the static moment, the coordinate of the center of gravity of a mass distribution, in mechanics. The mathematical expectation differs from other location characteristics that describe a distribution in general terms, such as the median and the mode, in the greater importance that it and the corresponding scattering characteristic, the variance, have in the limit theorems of probability theory. The meaning of the mathematical expectation is revealed most fully by the law of large numbers (Chebyshev's inequality) and the strong law of large numbers.

Expectation of a discrete random variable

Let there be some random variable that can take one of several numerical values (for example, the number of points when throwing a die can be 1, 2, 3, 4, 5 or 6). Often in practice, the question arises for such a variable: what value does it take “on average” over a large number of trials? What will be our average income (or loss) from each risky transaction?

Let's say there is some kind of lottery. We want to understand whether it is profitable or not to participate in it (or even participate repeatedly, regularly). Let’s say that every fourth ticket is a winner, the prize will be 300 rubles, and the price of any ticket will be 100 rubles. With an infinitely large number of participations, this is what happens. In three quarters of cases we will lose, every three losses will cost 300 rubles. In every fourth case we will win 200 rubles. (prize minus cost), that is, for four participations we lose on average 100 rubles, for one - on average 25 rubles. In total, the average rate of our ruin will be 25 rubles per ticket.

We throw a die. If it is fair (no shifted center of gravity, etc.), how many points will we get on average per throw? Since each outcome is equally likely, we simply take the arithmetic mean and get 3.5. Since this is an AVERAGE, there is no reason to be indignant that no specific roll will give 3.5 points: the die simply has no face with that number!

Now let's summarize our examples.

Suppose the distribution of a random variable is given by a table. The value X can take one of n possible values (shown in the top row); there can be no other values. Under each possible value, its probability is written:

X:  x1  x2  …  xn
p:  p1  p2  …  pn

The mathematical expectation M(X) is then given by the formula

$M(X) = x_1 p_1 + x_2 p_2 + \dots + x_n p_n$.

The meaning of this value is that with a large number of trials (with a large sample), the average value will tend to this mathematical expectation.

Let's return to the same die. The mathematical expectation of the number of points per throw is 3.5 (calculate it yourself using the formula if you don’t believe it). Say you threw it a couple of times and got 4 and 6. The average is 5, which is far from 3.5. You throw it once more and get 3, so the average is (4 + 6 + 3)/3 = 4.3333... Still far from the mathematical expectation. Now do a crazy experiment: roll the die 1000 times! Even if the average is not exactly 3.5, it will be close to it.
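The die experiment described here is easy to repeat on a computer. The sketch below first reproduces the averages after the three rolls mentioned in the text (4, 6, 3) and then simulates 1000 rolls of a fair die; the helper name and the random seed are hypothetical.

```python
import random

def running_means(rolls):
    """Average of the first k rolls for k = 1, 2, ..., len(rolls)."""
    total, means = 0, []
    for k, r in enumerate(rolls, start=1):
        total += r
        means.append(total / k)
    return means

first_three = running_means([4, 6, 3])
print(first_three)  # averages after one, two and three rolls: 4.0, 5.0, about 4.33

rng = random.Random(7)
rolls = [rng.randint(1, 6) for _ in range(1000)]
final_mean = running_means(rolls)[-1]
print(final_mean)   # typically close to 3.5
```

Plotting the running means against the number of rolls would show large swings at the start that settle down around 3.5.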

Let's calculate the mathematical expectation for the lottery described above. The table looks like this:

X:  −100   200
p:  0.75   0.25

Then the mathematical expectation is, as we established above:

$M(X) = -100 \cdot 0.75 + 200 \cdot 0.25 = -25$.

Another thing is that doing it “on the fingers”, without a formula, would be difficult if there were more options. Well, let's say there would be 75% losing tickets, 20% winning tickets and 5% especially winning ones.
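With more outcomes, the formula is easier to apply in code than "on the fingers". The sketch below computes the expectation from a value-to-probability table, using exact fractions. The lottery and die tables follow the text; the net prize of 1000 rubles for the "especially winning" ticket is a made-up figure, since the text gives only the probabilities for that case.

```python
from fractions import Fraction

def expectation(table):
    """M(X) = sum of x_i * p_i for a distribution given as {value: probability}."""
    assert sum(table.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in table.items())

# Net result per ticket: lose 100 with probability 3/4, win 200 with probability 1/4.
lottery = {-100: Fraction(3, 4), 200: Fraction(1, 4)}
print(expectation(lottery))  # -25

die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expectation(die))      # 7/2

# Three outcomes; the 1000-ruble net prize is a hypothetical value for illustration.
bigger = {-100: Fraction(75, 100), 200: Fraction(20, 100), 1000: Fraction(5, 100)}
print(expectation(bigger))   # 15
```

Under the assumed prize the lottery would become profitable on average, which shows how strongly a rare large outcome can shift the expectation.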

Now some properties of mathematical expectation.

It's easy to prove:

The constant factor can be taken out as a sign of the mathematical expectation, that is:

This is a special case of the linearity property of the mathematical expectation.

Another consequence of the linearity of the mathematical expectation:

that is, the mathematical expectation of the sum of random variables is equal to the sum of the mathematical expectations of random variables.

Let X and Y be independent random variables. Then: M(XY) = M(X)·M(Y).

This is also easy to prove. The product XY is itself a random variable, and if the original variables could take n and m values respectively, then XY can take up to nm values. The probability of each value is calculated from the fact that the probabilities of independent events are multiplied. As a result, the sum over all pairs of values factors into the product of the two expectations.
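The argument can be written out as follows. The notation is an assumption made for illustration: X takes the values x_i with probabilities p_i (i = 1, ..., n), and Y takes the values y_j with probabilities q_j (j = 1, ..., m).

```latex
% Independence: P(X = x_i,\ Y = y_j) = p_i q_j
\begin{aligned}
M(XY) &= \sum_{i=1}^{n} \sum_{j=1}^{m} x_i y_j \, p_i q_j \\
      &= \Bigl( \sum_{i=1}^{n} x_i p_i \Bigr) \Bigl( \sum_{j=1}^{m} y_j q_j \Bigr) \\
      &= M(X)\, M(Y).
\end{aligned}
```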

Expectation of a continuous random variable

Continuous random variables have such a characteristic as distribution density (probability density). It essentially characterizes the situation that a random variable takes some values from the set of real numbers more often, and some less often. For example, consider this graph:

Here X is the random variable itself and f(x) is its distribution density. Judging by this graph, in experiments the value of X will often be a number close to zero. The chances of it exceeding 3 or falling below -3 are rather purely theoretical.

Let, for example, there be a uniform distribution:

This is quite consistent with intuition. If we draw many random real numbers uniformly distributed on the segment [0; 1], their arithmetic mean should be about 0.5.

The properties of the mathematical expectation that hold for discrete random variables (linearity and so on) also hold here.
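For reference, here is the standard calculation for a uniform distribution, written under the assumption that the segment is [a, b] and the density is constant on it; the symbols a and b are introduced here for illustration.

```latex
M(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx,
\qquad
f(x) =
\begin{cases}
\dfrac{1}{b - a}, & a \le x \le b, \\[4pt]
0, & \text{otherwise},
\end{cases}
\]
\[
M(X) = \int_{a}^{b} \frac{x}{b - a} \, dx
     = \frac{b^{2} - a^{2}}{2 (b - a)}
     = \frac{a + b}{2}.
```

For the segment from 0 to 1 this gives 1/2, the midpoint of the segment.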

Relationship between mathematical expectation and other statistical indicators

In statistical analysis, along with the mathematical expectation, there is a system of interdependent indicators that reflect the homogeneity of phenomena and the stability of processes. Variation indicators often have no independent meaning and are used for further data analysis. The exception is the coefficient of variation, which characterizes the homogeneity of the data, which is a valuable statistical characteristic.

The degree of variability or stability of processes in statistical science can be measured using several indicators.

The most important indicator characterizing the variability of a random variable is the variance (dispersion), which is most closely and directly related to the mathematical expectation. This parameter is actively used in other types of statistical analysis (hypothesis testing, analysis of cause-and-effect relationships, etc.). Like the average linear deviation, variance reflects the extent of the spread of data around the mean value.

It is useful to translate the language of symbols into the language of words. It turns out that the variance is the average square of the deviations. That is, the average value is calculated first, then the difference between each original value and the average is taken, squared, summed, and divided by the number of values in the population. The difference between an individual value and the average reflects the measure of deviation. It is squared so that all deviations become non-negative numbers and positive and negative deviations do not cancel each other when summed. Then, given the squared deviations, we simply calculate the arithmetic mean. Average, square, deviations: the whole answer to the magic word “variance” lies in just three words.

However, unlike the arithmetic mean or an index, variance is not used in its pure form. It is rather an auxiliary, intermediate indicator used in other types of statistical analysis. It does not even have a normal unit of measurement: judging by the formula, its unit is the square of the unit of the original data.
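The verbal recipe (average, then deviations, then squares, then average again) can be sketched in a few lines. The data set below is hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
def variance(values):
    """Population variance: the average of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data; the mean is 5
print(variance(data))  # 4.0
```

Note that this divides by the number of values, as the text describes; it is the population variance, not the corrected sample variance.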

Let us measure a random variable N times, for example, we measure the wind speed ten times and want to find the average value. How is the average value related to the distribution function?

Or we roll a die a large number of times. The number of points that comes up on each throw is a random variable and can take any natural value from 1 to 6. The arithmetic mean of the points over all throws is also a random variable, but for large N it tends to a very specific number, the mathematical expectation Mx. In this case Mx = 3.5.

How is this value obtained? Suppose that in N trials 1 point came up n1 times, 2 points came up n2 times, and so on. Then the number of outcomes in which one point fell:

Similarly for outcomes when 2, 3, 4, 5 and 6 points are rolled.
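One way to write out this reasoning, assuming the die is fair so that each relative frequency n_k/N approaches 1/6 for large N:

```latex
\bar{x}
= \frac{1 \cdot n_1 + 2 \cdot n_2 + \dots + 6 \cdot n_6}{N}
= \sum_{k=1}^{6} k \, \frac{n_k}{N}
\;\longrightarrow\;
\sum_{k=1}^{6} k \cdot \frac{1}{6}
= \frac{21}{6}
= 3.5
\quad (N \to \infty).
```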

Let us now assume that we know the distribution law of the random variable x, that is, we know that the random variable x can take values x1, x2, ..., xk with probabilities p1, p2, ..., pk.

The mathematical expectation Mx of a random variable x is equal to: Mx = x1·p1 + x2·p2 + ... + xk·pk.

The mathematical expectation is not always a reasonable estimate of a random variable. For example, to estimate the average salary it is more reasonable to use the median, that is, the value such that the number of people receiving a salary below it equals the number receiving a salary above it.

The probability p1 that the random variable x will be less than the median x_{1/2}, and the probability p2 that x will be greater than x_{1/2}, are the same and equal to 1/2. The median is not uniquely determined for every distribution.
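A small illustration of why the median can describe salaries better than the mean. The salary figures are invented for the example: nine ordinary employees and one very highly paid manager.

```python
import statistics

# Hypothetical monthly salaries (not taken from any real data).
salaries = [900, 950, 1000, 1000, 1050, 1100, 1100, 1150, 1200, 15000]

mean_salary = statistics.mean(salaries)      # pulled upward by the single outlier
median_salary = statistics.median(salaries)  # middle of the sorted list

print("mean:", mean_salary)
print("median:", median_salary)
```

Here the mean is more than twice what nine of the ten people actually earn, while the median stays among the typical values.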

The root-mean-square, or standard, deviation in statistics is the degree to which observational data or sets deviate from the AVERAGE value. It is denoted by the letters s or σ. A small standard deviation indicates that the data cluster around the mean, while a large standard deviation indicates that the data lie far from it. The standard deviation is equal to the square root of a quantity called the variance, which is the average of the squared differences between the data and their mean value. The standard deviation of a random variable is the square root of its variance:

Example. Under test conditions when shooting at a target, calculate the dispersion and standard deviation of the random variable:

Variation is the fluctuation or changeability of the value of a characteristic among the units of a population. The individual numerical values of a characteristic found in the population under study are called variants. Because the average value alone is insufficient to characterize the population fully, averages have to be supplemented with indicators that assess how typical these averages are by measuring the variability (variation) of the characteristic being studied. The coefficient of variation is calculated using the formula:

The range of variation (R) is the difference between the maximum and minimum values of the characteristic in the population being studied. This indicator gives only the most general idea of the variability of the characteristic, since it shows the difference only between its extreme values. Dependence on the extreme values gives the range of variation an unstable, random character.

The average linear deviation is the arithmetic mean of the absolute values of the deviations of all values of the analyzed population from their average value:
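The indicators described in this part of the text can be computed together. This is a sketch under stated assumptions: the data are hypothetical, the standard deviation uses the population form (division by n), and the coefficient of variation is taken in its common form, the standard deviation divided by the mean, expressed in percent.

```python
import math

def variation_indicators(values):
    """Return range, average linear deviation, standard deviation and CV (%)."""
    n = len(values)
    mean = sum(values) / n
    value_range = max(values) - min(values)                # R = x_max - x_min
    mean_abs_dev = sum(abs(v - mean) for v in values) / n  # average linear deviation
    std_dev = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    coeff_var = std_dev / mean * 100                       # assumed definition, in percent
    return value_range, mean_abs_dev, std_dev, coeff_var

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5
print(variation_indicators(data))  # (7, 1.5, 2.0, 40.0)
```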

Mathematical expectation in gambling theory

Mathematical expectation is the average amount of money a gambler can win or lose on a given bet. This is a very important concept for the player because it is fundamental to the assessment of most gaming situations. Mathematical expectation is also the best tool for analyzing basic card layouts and gaming situations.

Let's say you're playing a coin game with a friend, betting $1 each time at even money, no matter what comes up. Tails means you win, heads means you lose. The odds of heads are one to one, and you bet $1 against $1. Thus, your mathematical expectation is zero, because from a mathematical point of view you cannot expect to be either ahead or behind after two tosses or after 200.

Your hourly gain is zero. Hourly winnings are the amount of money you expect to win in an hour. You can toss a coin 500 times in an hour, but you will neither win nor lose, because your chances are neither positive nor negative. From the point of view of a serious player, this betting system is not bad in itself, but it is simply a waste of time.

But let's say someone wants to bet $2 against your $1 on the same game. Then you immediately have a positive expectation of 50 cents on each bet. Why 50 cents? On average, you win one bet and lose the other. Bet the first dollar and you lose $1; bet the second and you win $2. You have bet $1 twice and are ahead by $1, so each of your one-dollar bets has earned you 50 cents.

If the coin is tossed 500 times in one hour, your hourly winnings will be $250, because on average you lose one dollar 250 times and win two dollars 250 times. $500 minus $250 equals $250, which is the total winnings. Note that the expected value, which is the average amount you win per bet, is 50 cents: you won $250 by betting a dollar 500 times, which equals 50 cents per bet.
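The coin-game arithmetic can be checked directly from the definition of expectation as a probability-weighted sum of payoffs. The helper name and the representation of a bet as (probability, payoff) pairs are illustrative assumptions; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of probability * payoff over (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

half = Fraction(1, 2)
even_money_bet = [(half, 1), (half, -1)]  # win $1 or lose $1
two_to_one_bet = [(half, 2), (half, -1)]  # win $2 or lose $1

print(expected_value(even_money_bet))        # 0
print(expected_value(two_to_one_bet))        # 1/2
print(500 * expected_value(two_to_one_bet))  # 250
```

The last line is the expected hourly result for 500 tosses at $2 against $1.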

Mathematical expectation has nothing to do with short-term results. Your opponent, who decided to bet $2 against you, could beat you on the first ten rolls in a row, but you, having a 2 to 1 betting advantage, all other things being equal, will earn 50 cents on every $1 bet in any circumstances. It makes no difference whether you win or lose one bet or several bets, as long as you have enough cash to comfortably cover the costs. If you continue to bet in the same way, then over a long period of time your winnings will approach the sum of the expectations in individual throws.

Every time you make a favorable bet (one that is profitable in the long run), with the odds in your favor, you have earned something on it, whether or not you win it in the given hand. Conversely, if you make an unfavorable bet (one that is unprofitable in the long run), with the odds against you, you have lost something regardless of whether you win or lose the hand.

Your bet is favorable if your expectation is positive, and it is positive if the odds are on your side. When your bet is unfavorable, your expectation is negative, which happens when the odds are against you. Serious players bet only when the bet is favorable; when it is unfavorable, they fold. What does it mean to have the odds in your favor? It means you may win more than the real odds warrant. The real odds of landing heads are 1 to 1, but you are getting 2 to 1 on your bet. In this case, the odds are in your favor. You definitely have a favorable bet, with a positive expectation of 50 cents per bet.

Here is a more complex example of mathematical expectation. A friend writes down a number from one to five and bets $5 against your $1 that you won't guess it. Should you agree to such a bet? What is the expectation here?

On average you will be wrong four times out of five. So the odds against your guessing the number are 4 to 1, and on any single attempt you are likely to lose a dollar. However, you are paid 5 to 1 while the odds of losing are only 4 to 1. So the odds are in your favor, and you can take the bet and hope for the best. If you make this bet five times, on average you will lose $1 four times and win $5 once. Over all five attempts you will therefore earn $1, a positive mathematical expectation of 20 cents per bet.
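Written as a single probability-weighted sum, with E denoting the expectation per bet in dollars (the symbol is introduced here for convenience):

```latex
E = \frac{1}{5} \cdot (+5) + \frac{4}{5} \cdot (-1)
  = 1 - 0.8
  = 0.2
```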

A player who stands to win more than he bets, as in the example above, is taking the odds. Conversely, he is laying the odds when he stands to win less than he bets. A bettor can have either a positive or a negative expectation whether he is taking the odds or laying them.

If you bet $50 to win $10 with a 4 to 1 chance of winning, you have a negative expectation of $2 per bet, because on average you will win $10 four times and lose $50 once, a net loss of $10 over five bets. But if you bet $30 to win $10, with the same 4 to 1 odds of winning, you have a positive expectation of $2 per bet, because you again win $10 four times and lose $30 once, for a net profit of $10 over five bets. These examples show that the first bet is bad and the second is good.
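The two bets can be compared with a short calculation. This sketch reads "a 4 to 1 chance of winning" as four wins for every one loss, that is, a win probability of 4/5; the function and parameter names are illustrative.

```python
from fractions import Fraction

def expectation_per_bet(stake, prize, wins, losses):
    """Expectation of one bet when the chances are `wins` to `losses` in your favor."""
    p_win = Fraction(wins, wins + losses)
    return p_win * prize - (1 - p_win) * stake

print(expectation_per_bet(stake=50, prize=10, wins=4, losses=1))  # -2
print(expectation_per_bet(stake=30, prize=10, wins=4, losses=1))  # 2
```

Multiplying either result by five bets gives the net loss or profit of $10 over five bets.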

Mathematical expectation is at the center of any gaming situation. When a bookmaker encourages football fans to bet $11 to win $10, he has a positive expectation of 50 cents on every $10. If the casino pays even money on the pass line in craps, the casino's positive expectation is approximately $1.40 for every $100, because the game is structured so that a pass-line bettor loses 50.7% of the time on average and wins 49.3% of the time. Undoubtedly, it is this seemingly minimal positive expectation that brings enormous profits to casino owners around the world. As Vegas World casino owner Bob Stupak noted, “a one-thousandth of one percent negative probability over a long enough distance will ruin the richest man in the world.”

Expectation when playing Poker

Poker is the most illustrative example of how the theory and properties of mathematical expectation are used.

Expected value in poker is the average benefit from a particular decision, provided that the decision can be considered within the framework of the law of large numbers and the long run. Successful poker play means always choosing moves with positive expected value.

The mathematical meaning of expectation in poker is that we constantly face random variables when making decisions (we don't know what cards the opponent holds or what cards will come in subsequent betting rounds). We must consider each decision from the point of view of the law of large numbers, which states that with a sufficiently large sample, the average value of a random variable will tend to its mathematical expectation.

Among the particular formulas for calculating the mathematical expectation, the following is most applicable in poker:

When playing poker, the expected value can be calculated for both bets and calls. In the first case, fold equity should be taken into account; in the second, the pot odds. When assessing the mathematical expectation of a particular move, remember that a fold always has zero expectation. Thus, folding will always be a more profitable decision than any move with negative expectation.

Expectation tells you what you can expect (profit or loss) for every dollar you risk. Casinos make money because the mathematical expectation of all games played in them is in favor of the casino. With a long enough series of games, you can expect the client to lose his money, since the “odds” are in favor of the casino. However, professional casino players limit their games to short periods of time, thereby stacking the odds in their favor. The same goes for investing. If your expectation is positive, you can make more money by making many trades in a short period of time. Expectation is your probability of winning multiplied by your average win, minus your probability of losing multiplied by your average loss.

Poker can also be considered from the standpoint of mathematical expectation. You may assume that a certain move is profitable, but in some cases it may not be the best one because another move is more profitable. Let's say you hit a full house in five-card draw poker. Your opponent makes a bet. You know that if you raise, he will call, so raising seems to be the best tactic. But if you do raise, the two remaining players will definitely fold, whereas if you just call, you are fully confident that the two players behind you will call as well. When you raise you gain one unit, and when you just call you gain two. Thus, calling gives you a higher positive expected value and is the better tactic.

The mathematical expectation can also give an idea of which poker tactics are less profitable and which are more profitable. For example, if you play a certain hand and you think your loss will average 75 cents including ante, then you should play that hand because this is better than folding when the ante is $1.

Another important reason to understand expected value is that it gives you peace of mind whether you win the bet or not: if you made a good bet or folded at the right time, you know that you have earned or saved a certain amount of money that a weaker player could not have. It is much harder to fold if you are upset because your opponent drew a stronger hand. Still, the money you save by folding instead of betting is added to your winnings for the night or the month.

Just remember that if your hands had been switched, your opponent would have called you, and as you will see in the Fundamental Theorem of Poker article, this is just one of your advantages. You should be happy when this happens. You can even learn to enjoy losing a hand because you know that other players in your position would have lost much more.

As mentioned in the coin game example at the beginning, the hourly rate of profit is closely tied to the mathematical expectation, and this concept is especially important for professional players. When you sit down to play poker, you should mentally estimate how much you can win in an hour of play. In most cases you will need to rely on your intuition and experience, but you can also use some math. For example, suppose you are playing draw lowball and you see three players bet $10 and then draw two cards, which is a very bad tactic. You can figure that every time they bet $10, they lose about $2. Each of them does this eight times per hour, which means that the three of them together lose approximately $48 per hour. You are one of the remaining four players, who are approximately equal, so these four players (you among them) split the $48, each making a profit of $12 per hour. Your hourly rate in this case is simply your share of the money lost by the three bad players in an hour.
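The hourly-rate estimate from the lowball example is plain arithmetic, sketched here with the figures from the text; the variable names are illustrative.

```python
bad_players = 3
loss_per_bad_bet = 2    # dollars lost, on average, each time a bad player bets $10
bad_bets_per_hour = 8   # how often each bad player makes this play per hour
good_players = 4        # roughly equal players who share the losses, you included

total_lost_per_hour = bad_players * loss_per_bad_bet * bad_bets_per_hour
your_hourly_rate = total_lost_per_hour / good_players

print(total_lost_per_hour)  # 48
print(your_hourly_rate)     # 12.0
```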

Over a long period of time, the player's total winnings are the sum of his mathematical expectations in individual hands. The more hands you play with positive expectation, the more you win, and conversely, the more hands you play with negative expectation, the more you lose. As a result, you should choose a game that can maximize your positive expectation or negate your negative expectation so that you can maximize your hourly winnings.

Positive mathematical expectation in gaming strategy

If you know how to count cards, you can have an advantage over the casino, as long as they don't notice and throw you out. Casinos love drunk players and do not tolerate card counters. An advantage lets you win more often than you lose over time. Good money management based on expected value calculations can help you extract more profit from your edge and reduce your losses. Without an advantage, you're better off giving the money to charity. In stock trading, the advantage comes from a trading system that generates profits greater than the losses, spreads and commissions. No amount of money management can save a bad system.

A positive expectation is defined as a value greater than zero. The larger this number, the stronger the statistical expectation. If the value is less than zero, the mathematical expectation is negative, and the larger the absolute value of the negative number, the worse the situation. If the result is zero, the expectation is break-even. You can only win when you have a positive mathematical expectation and a reasonable playing system. Playing by intuition leads to disaster.

Mathematical expectation and stock trading

Mathematical expectation is a fairly widely used and popular statistical indicator when carrying out exchange trading in financial markets. First of all, this parameter is used to analyze the success of trading. It is not difficult to guess that the higher this value, the more reasons to consider the trade being studied successful. Of course, analysis of a trader’s work cannot be carried out using this parameter alone. However, the calculated value, in combination with other methods of assessing the quality of work, can significantly increase the accuracy of the analysis.

The mathematical expectation is often calculated in trading account monitoring services, which allows you to quickly evaluate the work performed on the deposit. The exceptions include strategies that use “sitting out” unprofitable trades. A trader may be lucky for some time, and therefore there may be no losses in his work at all. In this case, it will not be possible to be guided only by the mathematical expectation, because the risks used in the work will not be taken into account.

In market trading, the mathematical expectation is most often used when predicting the profitability of any trading strategy or when predicting a trader’s income based on statistical data from his previous trading.

With regard to money management, it is very important to understand that when making trades with negative expectations, there is no money management scheme that can definitely bring high profits. If you continue to play the stock market under these conditions, then regardless of how you manage your money, you will lose your entire account, no matter how large it was to begin with.

This axiom is true not only for games or trades with negative expectation, it is also true for games with equal chances. Therefore, the only time you have a chance to profit in the long term is if you take trades with positive expected value.

The difference between negative expectation and positive expectation is the difference between life and death. It doesn't matter how positive or how negative the expectation is; all that matters is whether it is positive or negative. Therefore, before considering money management, you should find a game with positive expectation.

If you don't have that game, then all the money management in the world won't save you. On the other hand, if you have a positive expectation, you can, through proper money management, turn it into an exponential growth function. It doesn't matter how small the positive expectation is! In other words, it doesn't matter how profitable a trading system is based on a single contract. If you have a system that wins $10 per contract per trade (after commissions and slippage), you can use money management techniques to make it more profitable than a system that averages $1,000 per trade (after deduction of commissions and slippage).

What matters is not how profitable the system was, but how certain the system can be said to show at least minimal profit in the future. Therefore, the most important preparation a trader can make is to ensure that the system will show a positive expected value in the future.

In order to have a positive expected value in the future, it is very important not to limit the degrees of freedom of your system. This is achieved not only by eliminating or reducing the number of parameters to be optimized, but also by reducing as many system rules as possible. Every parameter you add, every rule you make, every tiny change you make to the system reduces the number of degrees of freedom. Ideally, you need to build a fairly primitive and simple system that will consistently generate small profits in almost any market. Again, it is important for you to understand that it does not matter how profitable the system is, as long as it is profitable. The money you make in trading will be made through effective money management.

A trading system is simply a tool that gives you a positive expected value so that you can use money management. Systems that work (show at least minimal profits) in only one or a few markets, or have different rules or parameters for different markets, will most likely not work in real time for long. The problem with most technically oriented traders is that they spend too much time and effort optimizing the various rules and parameter values of the trading system. This gives completely opposite results. Instead of wasting energy and computer time on increasing the profits of the trading system, direct your energy to increasing the level of reliability of obtaining a minimum profit.

Knowing that money management is just a numbers game that requires the use of positive expectations, a trader can stop searching for the "holy grail" of stock trading. Instead, he can start testing his trading method, find out how logical this method is, and whether it gives positive expectations. Proper money management methods, applied to any, even very mediocre trading methods, will do the rest of the work themselves.

For any trader to succeed in his work, he needs to solve three most important tasks: ensure that the number of successful transactions exceeds the inevitable mistakes and miscalculations; set up the trading system so that there is an opportunity to earn money as often as possible; and achieve stable positive results from operations.

And here, for us working traders, mathematical expectation can be of great help. This term is one of the key ones in probability theory. With its help, you can give an average estimate of some random value. The mathematical expectation of a random variable is similar to a center of gravity, if you imagine all its possible values as points whose masses are their probabilities.

In relation to a trading strategy, the mathematical expectation of profit (or loss) is most often used to evaluate its effectiveness. This parameter is defined as the sum of the products of the given levels of profit and loss and the probabilities of their occurrence. For example, the developed trading strategy assumes that 37% of all transactions will bring profit, and the remaining 63% will be unprofitable. At the same time, the average income from a successful transaction will be $7, and the average loss will be $1.4. Let's calculate the mathematical expectation of trading using this system: 0.37 × 7 - 0.63 × 1.4 = 2.59 - 0.882 = 1.708.

What does this number mean? It says that, following the rules of this system, on average we will receive $1.708 from each closed transaction. Since the resulting efficiency rating is greater than zero, such a system can be used for real work. If, as a result of the calculation, the mathematical expectation turns out to be negative, this indicates an average loss, and such trading will lead to ruin.

The amount of profit per transaction can also be expressed as a relative value in the form of %. For example:

– percentage of income per 1 transaction - 5%;

– percentage of successful trading operations - 62%;

– percentage of loss per 1 transaction - 3%;

– percentage of unsuccessful transactions - 38%;

That is, the average trade will bring 1.96%.
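Both trading examples follow the same rule: the probability of a win times the average win, minus the probability of a loss times the average loss. A minimal sketch, with an illustrative function name and the figures quoted in the text:

```python
def trade_expectation(p_win, avg_win, avg_loss):
    """Average result per trade: p_win * avg_win - (1 - p_win) * avg_loss."""
    return p_win * avg_win - (1 - p_win) * avg_loss

# 37% winners averaging $7, 63% losers averaging $1.4 (result in dollars)
print(round(trade_expectation(0.37, 7.0, 1.4), 3))  # 1.708

# 62% winners averaging 5%, 38% losers averaging 3% (result in percent)
print(round(trade_expectation(0.62, 5.0, 3.0), 2))  # 1.96
```

The first system is profitable on average even though most of its trades lose, because the average win is five times the average loss.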

It is possible to develop a system that, despite the predominance of unprofitable trades, gives a positive result, since its mathematical expectation is greater than zero.

However, expectation alone is not enough. It is difficult to make money if the system gives very few trading signals. In this case, its profitability will be comparable to bank interest. Suppose each operation yields only 0.5 dollars on average, but what if the system performs 1000 operations per year? That adds up to a very significant amount in a relatively short time. It logically follows that another distinctive feature of a good trading system is a short holding period for positions.

Distribution parameters and statistics

Any parameters of the distribution of a random variable, such as the mathematical expectation or variance, are theoretical quantities that cannot be directly measured, although they can be estimated. They are quantitative characteristics of the general population and can themselves be determined only in theoretical modeling, as hypothetical values, since they describe the distribution of the random variable in the general population itself. To determine them in practice, the researcher conducting the experiment estimates them from a sample. This estimation involves calculating statistics.

A statistic is a quantitative characteristic of the studied parameters of the distribution of a random variable, obtained from the sample values. Statistics are used either to describe the sample itself or, which is of paramount importance in fundamental experimental research, to estimate the parameters of the distribution of a random variable in the population under study.

Separating the concepts of “parameter” and “statistic” is very important, since it helps avoid a number of errors associated with incorrect interpretation of experimental data. The fact is that when we estimate distribution parameters from statistical data, we obtain values that are only to a certain extent close to the estimated parameters. There is almost always some difference between parameters and statistics, and we usually cannot say how big this difference is. Theoretically, the larger the sample, the closer the sample characteristics are to the estimated parameters. However, this does not mean that by increasing the sample size we will inevitably come closer to the estimated parameter and reduce the difference between it and the calculated statistic. In practice, everything can turn out to be much more complicated.

If, in theory, the expected value of the statistic coincides with the estimated parameter, the estimate is called unbiased. An estimate for which the expected value of the statistic differs from the parameter itself by a certain amount is called biased.
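The difference between the two kinds of estimate can be seen exactly on a tiny example. The sketch below assumes a fair die and samples of size 2, small enough that every possible sample can be listed; it compares the expected value of the sample variance computed with divisor n against the version with divisor n - 1. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
n = 2  # sample size, kept tiny so that all samples can be enumerated

# True parameters of a single roll of a fair die.
true_mean = Fraction(sum(faces), 6)
true_var = sum((f - true_mean) ** 2 for f in faces) / 6

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / divisor

samples = list(product(faces, repeat=n))  # 36 equally likely samples
expected_with_n = sum(sample_variance(s, n) for s in samples) / len(samples)
expected_with_n_minus_1 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)

print(true_var)                 # 35/12
print(expected_with_n)          # 35/24
print(expected_with_n_minus_1)  # 35/12
```

Dividing by n gives an estimate whose expected value is only (n - 1)/n of the true variance, so it is biased; dividing by n - 1 reproduces the true variance on average, so it is unbiased.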

It is also necessary to distinguish between point and interval estimates of distribution parameters. A point estimate is an estimate expressed as a single number. For example, if we say that the spatial threshold of tactile sensitivity for a given subject under given conditions and on a given area of skin is 21.8 mm, then this is a point estimate. In the same way, we get a point estimate when the weather report tells us that it is 25°C outside. Interval estimation uses a set or range of numbers. Assessing the spatial threshold of tactile sensitivity, we can say that it lies in the range from 20 to 25 mm. Similarly, weather forecasters may report that, according to their forecasts, the air temperature in the next 24 hours will reach 22–24°C. Interval estimation allows us not only to determine the desired value of the quantity, but also to state the possible accuracy of the estimate.

Mathematical expectation and its evaluation

Let's return to our coin toss experiment.

Let's try to answer the question: how many times should "heads" appear if we flip a coin ten times? The answer seems obvious. If the probabilities of each of two outcomes are equal, then the outcomes themselves must be equally distributed. In other words, when tossing an ordinary coin ten times, we can expect that one of its sides, for example, “heads,” will land exactly five times. Similarly, when tossing a coin 100 times, “heads” should appear exactly 50 times, and if the coin is tossed 4236 times, then the side of interest to us should appear 2118 times, no more and no less.

The theoretical value of a random variable is usually called its mathematical expectation. For the number of times an event occurs, the expected value can be found by multiplying the theoretical probability of the event by the number of trials. More formally, the mathematical expectation is defined as the first-order raw moment (the first moment about the origin). Thus, the mathematical expectation is the value around which a random variable varies and toward which its average theoretically tends in repeated trials.

It is clear that the theoretical value of the mathematical expectation, as a distribution parameter, is not always equal to the empirical value of the random variable of interest, expressed as a statistic. If we run a coin-tossing experiment, it is quite likely that out of ten tosses “heads” will come up only four or three times, or, on the contrary, eight times, or perhaps not at all. Clearly, some of these outcomes are more likely and some less. Using the normal distribution law, we can conclude that the more a result deviates from the theoretically expected one, given by the mathematical expectation, the less likely it is in practice.
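How likely each outcome is can be computed directly. The sketch below assumes a fair coin and independent tosses, in which case the number of heads in ten tosses follows the binomial distribution (used here in place of the normal approximation mentioned above). The function name `heads_distribution` is a hypothetical helper, not something from the text.

```python
from math import comb


def heads_distribution(n=10, p=0.5):
    """Binomial probabilities of getting k = 0..n heads in n independent tosses."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


pmf = heads_distribution()
# Mathematical expectation: each possible value weighted by its probability
expected_heads = sum(k * prob for k, prob in enumerate(pmf))

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: {prob:.4f}")
print("expected number of heads:", expected_heads)
```

Under these assumptions the mathematical expectation is 5, yet exactly five heads occurs with probability 252/1024, or about 0.246, and the probabilities fall off symmetrically as the result moves away from five, down to 1/1024 for no heads at all.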

Let us further assume that we have performed a similar procedure several times and have never observed the theoretically expected value. Then we may begin to doubt the fairness of the coin. We can assume that for our coin the probability of getting heads is not actually 50%. In this case, it may be necessary to estimate the probability of this event and, accordingly, the value of the mathematical expectation. This need arises whenever in an experiment we study the distribution of a continuous random variable, such as reaction time, without having any theoretical model in advance. As a rule, this is the first mandatory step in the quantitative processing of experimental results.

The mathematical expectation can be estimated in three ways, which in practice can give slightly different results, but in theory they should certainly lead us to the value of the mathematical expectation.

The logic of such an assessment is illustrated in Fig. 1.2. The expected value can be considered as the central tendency in the distribution of a random variable X, as its most probable and therefore most frequently occurring value and as a point dividing the distribution into two equal parts.

Fig. 1.2.

Let's continue our imaginary experiments with the coin and conduct three series of ten tosses. Suppose that “heads” came up four times in the first series, four times again in the second, and seven times in the third, that is, more than one and a half times as often. It is logical to assume that the mathematical expectation of the event we are interested in actually lies somewhere between these values.

The first and simplest way to estimate the mathematical expectation is to find the arithmetic mean. The estimate of the expected value based on the three measurements above is then (4 + 4 + 7)/3 = 5. Similarly, in reaction time experiments the expected value can be estimated by taking the arithmetic mean of all the obtained values of X. If we have made n measurements of reaction time X, we can use the following formula, which shows that to calculate the arithmetic mean of X we add up all the empirically obtained values and divide the sum by the number of observations:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad (1.2)$$

In formula (1.2), the estimate of the mathematical expectation is usually denoted $\bar{X}$ (read as "X bar"), although it is sometimes written as M (from the English word mean).

The arithmetic mean is the most commonly used estimate of the mathematical expectation. Its use assumes that the random variable is measured on a metric scale. It is clear that the result obtained may or may not coincide with the true value of the mathematical expectation, which we never know. It is important, however, that this method gives an unbiased estimate of the mathematical expectation. This means that the expected value of the estimate is equal to the mathematical expectation being estimated: $E[\bar{X}] = E[X]$.
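As a minimal sketch, the arithmetic mean for the three coin-tossing series discussed above (4, 4 and 7 heads) can be computed as follows. The function name `arithmetic_mean` is a hypothetical helper introduced only for this illustration.

```python
def arithmetic_mean(values):
    """Sum of the observed values divided by the number of observations."""
    values = list(values)
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


heads_counts = [4, 4, 7]  # heads observed in three series of ten tosses
x_bar = arithmetic_mean(heads_counts)
print(x_bar)  # 5.0
```

Note that the estimate 5 does not coincide with any of the observed values, which is typical: the arithmetic mean summarizes the sample rather than reproducing one of its elements.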

The second way to estimate the mathematical expectation is to take as its value the most frequently occurring value of the variable of interest. This value is called the mode of the distribution. For example, in the coin-tossing case just considered, “four” can be taken as the value of the mathematical expectation, since this value appeared twice in the three series; the mode of the distribution is therefore equal to four. Mode estimation is used mainly when the experimenter is dealing with variables that take discrete values measured on a non-metric scale.

For example, to describe the distribution of students' grades on an exam, one can construct a frequency distribution of the grades received. Such a frequency distribution is usually displayed as a histogram. In this case, the most common grade can be taken as the value of the central tendency (mathematical expectation). For variables that take continuous values, this measure is rarely used. If a frequency distribution of the results is nevertheless constructed, then, as a rule, it concerns not the individual experimentally obtained values of the characteristic being studied, but intervals of its values. For example, when studying people's height, one can count how many people are up to 150 cm tall, how many fall in the range from 150 to 155 cm, and so on. The mode then refers to an interval of values of the characteristic being studied, in this case height.

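It is clear that the mode, like the arithmetic mean, may or may not coincide with the actual value of the mathematical expectation. For a symmetric unimodal distribution, such as the normal distribution, the mode coincides with the mathematical expectation, so under this assumption the sample mode, like the arithmetic mean, can be treated as an unbiased estimate of it.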

Let us add that if two values in the sample are tied as the most frequent, the distribution is called bimodal. If three or more values in a sample occur equally often, the sample is said to have no mode. With a sufficiently large number of observations, such cases usually indicate that the data are drawn from a population whose distribution differs from normal.
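The rules above can be sketched in a few lines. The example relies on `statistics.multimode` from the Python standard library, which returns every value tied for the highest frequency. The helper `describe_modes` and the sample data are hypothetical; the "no mode" rule for three or more ties simply follows the convention stated in the text.

```python
from statistics import multimode


def describe_modes(values):
    """Classify a sample as unimodal, bimodal or having no mode."""
    modes = multimode(values)  # all values tied for the highest frequency
    if len(modes) == 1:
        return "unimodal", modes
    if len(modes) == 2:
        return "bimodal", modes
    # three or more tied values (or an empty sample): no mode by convention
    return "no mode", []


print(describe_modes([4, 4, 7]))            # the three coin-tossing series
print(describe_modes([3, 4, 4, 5, 5, 2]))   # made-up exam grades
print(describe_modes([1, 2, 3]))            # every value occurs once
```

For the coin-tossing series the mode is four, the made-up grades are bimodal with modes 4 and 5, and the last sample has no mode.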

Finally, the third way to estimate the mathematical expectation is to divide the sample exactly in half according to the parameter of interest. The value that marks this boundary is called the median of the distribution.

Suppose we are present at a skiing competition and, after it ends, we want to determine which athletes showed results above average and which below. If the field of participants is more or less even, it is logical to calculate the arithmetic mean when assessing the average result. Let us assume, however, that among the professional participants there are several amateurs. There are few of them, but their results are significantly worse than those of the others. In this case, it may turn out that out of 100 participants, for example, 87 showed results above average. Clearly, such an assessment of the central tendency cannot always satisfy us. It is then logical to assume that the average result was shown by the participants who finished in about 50th or 51st place. This is the median of the distribution. 49 participants finished before the 50th finisher, and 49 finished after the 51st. It is not clear, however, which of the two results should be taken as the average. Of course, it may turn out that they finished in the same time; then there is no problem. Nor does the problem arise when the number of observations is odd. In other cases, the average of the two participants' results can be used.
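The effect described in the skiing example can be reproduced on a small scale. The finishing times below are made-up numbers (four professionals and two much slower amateurs), chosen only for illustration. With an even number of observations, `statistics.median` averages the two middle results, which matches the rule given above.

```python
from statistics import fmean, median

# Hypothetical finishing times in minutes: four professionals, two amateurs
finish_times = [30.1, 30.5, 31.0, 31.2, 55.0, 58.0]

mean_time = fmean(finish_times)
median_time = median(finish_times)  # average of the two middle values
better_than_mean = sum(t < mean_time for t in finish_times)

print(f"mean:   {mean_time:.1f}")
print(f"median: {median_time:.1f}")
print(f"{better_than_mean} of {len(finish_times)} skiers beat the mean time")
```

With these numbers the mean is pulled toward the amateurs' times, so four of the six skiers are faster than the "average", while the median stays among the professionals' results.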

The median is a special case of a quantile of a distribution. A quantile is a value that cuts off a given fraction of the distribution. Formally, this fraction is defined as the integral of the probability density between two values of the variable X. Thus, a value x is the median of the distribution if the integral of the probability density from -∞ to x is equal to the integral from x to +∞, so that each equals 1/2. Similarly, the distribution can be divided into four, ten or 100 equal parts. The corresponding quantiles are called quartiles, deciles and percentiles. There are other types of quantiles.
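For a sample, quantiles can be computed with `statistics.quantiles` from the Python standard library, which returns the cut points dividing the data into n parts of equal probability. The data set below (the integers 1 to 99) is a made-up example chosen so that the cut points are easy to check by eye; the default interpolation method of the function is assumed.

```python
from statistics import median, quantiles

data = list(range(1, 100))  # hypothetical sample: 1, 2, ..., 99

quartiles = quantiles(data, n=4)    # 3 cut points -> 4 equal parts
deciles = quantiles(data, n=10)     # 9 cut points -> 10 equal parts

print("quartiles:", quartiles)
print("deciles:  ", deciles)
print("median:   ", median(data))
```

The second quartile and the fifth decile both coincide with the median, which illustrates that the median is simply the quantile splitting the distribution in half.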

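Like the two previous methods, the median can be treated as an unbiased estimate of the mathematical expectation when the distribution is symmetric, as the normal distribution is; for skewed distributions the median and the mathematical expectation generally differ.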

Theoretically, if we are really dealing with a normally distributed random variable, all three estimates of the mathematical expectation should give the same result, since all of them are unbiased estimates of the same distribution parameter (see Fig. 1.2). In practice, however, this rarely occurs. This may be due, in particular, to the fact that the analyzed distribution differs from normal. But the main reason for such discrepancies, as a rule, is that an estimate of the mathematical expectation can differ very significantly from its true value. However, as noted above, it has been proven in mathematical statistics that the more independent trials of the variable under consideration are carried out, the closer the estimate should be to the true value.

Thus, in practice, the choice of method for estimating the mathematical expectation is determined not by the desire to obtain a more accurate and reliable estimate of this parameter, but only by considerations of convenience. The measurement scale in which the observations of the random variable are expressed also plays a certain role in choosing the method.

Let there be a random variable X with mathematical expectation m and variance D, both of which are unknown. N independent experiments were performed on X, yielding a set of N numerical results $x_1, x_2, \dots, x_N$. As an estimate of the mathematical expectation, it is natural to propose the arithmetic mean of the observed values

$$\hat{m} = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (1)$$

Here the $x_i$ are the specific values (numbers) obtained in the N experiments. If we carry out another N experiments (independent of the previous ones), we will obviously obtain a different value of the estimate, and yet another series of N experiments will give yet another value. Let $X_i$ denote the random variable resulting from the i-th experiment; the realizations of $X_i$ are then the numbers obtained in these experiments. Obviously, the random variable $X_i$ has the same probability density function as the original random variable X. We also assume that the random variables $X_i$ and $X_j$ are independent for i ≠ j (different experiments are independent of each other). Therefore, we rewrite formula (1) in a different (statistical) form:

$$\hat{m} = \frac{1}{N}\sum_{i=1}^{N} X_i \qquad (2)$$

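Let us show that the estimate $\hat{m}$ given by formula (2) is unbiased:

$$E[\hat{m}] = E\left[\frac{1}{N}\sum_{i=1}^{N} X_i\right] = \frac{1}{N}\sum_{i=1}^{N} E[X_i] = \frac{1}{N}\cdot N m = m$$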

Thus, the mathematical expectation of the sample mean is equal to the true mathematical expectation m of the random variable. This is a fairly predictable and understandable fact. Consequently, the sample mean (2) can be taken as an estimate of the mathematical expectation of a random variable. Now the question arises: what happens to the variance of the mathematical expectation estimate as the number of experiments increases? Analytical calculations show that

$$D_{\hat{m}} = \frac{D}{N}$$

where $D_{\hat{m}}$ is the variance of the mathematical expectation estimate (2), and D is the true variance of the random variable X.

It follows that as N (the number of experiments) increases, the variance of the estimate decreases; that is, the more independent realizations we average, the closer the estimate is expected to be to the mathematical expectation.
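The decrease of the variance of the sample mean with N can be checked numerically. The following is only a sketch under stated assumptions: X is taken to be normal with m = 0 and D = 4, and the function name `variance_of_sample_mean`, the seed and the number of repetitions are hypothetical choices. For each N the experiment is repeated many times and the variance of the resulting sample means is compared with D/N.

```python
import random
import statistics


def variance_of_sample_mean(n, repeats=20000, m=0.0, sigma=2.0, seed=1):
    """Empirical variance of the mean of n independent draws of X."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [
        statistics.fmean([rng.gauss(m, sigma) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.pvariance(means)


D = 4.0  # true variance of X under the assumed model (sigma = 2)
results = {n: variance_of_sample_mean(n) for n in (1, 4, 16)}
for n, observed in results.items():
    print(f"N = {n:2d}: observed {observed:.3f}, theory D/N = {D / n:.3f}")
```

Under these assumptions the observed variances should be close to 4, 1 and 0.25, so quadrupling the number of experiments cuts the variance of the estimate by roughly a factor of four.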

Estimation of the variance

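At first glance, the most natural estimate seems to be

$$\hat{D} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \hat{m}\right)^2 \qquad (3)$$

where $\hat{m}$ is calculated using formula (2). Let us check whether this estimate is unbiased. Formula (3) can be written as follows:

$$\hat{D} = \frac{1}{N}\sum_{i=1}^{N} X_i^2 - \hat{m}^2$$

Let us substitute expression (2) into this formula:

$$\hat{D} = \frac{1}{N}\sum_{i=1}^{N} X_i^2 - \frac{1}{N^2}\left(\sum_{i=1}^{N} X_i\right)^2 = \frac{N-1}{N^2}\sum_{i=1}^{N} X_i^2 - \frac{2}{N^2}\sum_{i<j} X_i X_j$$

Let us find the mathematical expectation of the variance estimate:

$$E[\hat{D}] = \frac{N-1}{N^2}\sum_{i=1}^{N} E[X_i^2] - \frac{2}{N^2}\sum_{i<j} E[X_i X_j] \qquad (4)$$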

Since the variance of a random variable does not depend on what the mathematical expectation of the random variable is, let us take the mathematical expectation equal to 0, i.e. m = 0.