Graph of the uniform distribution of a continuous random variable. Uniform and exponential distribution laws of a continuous random variable

This problem has long been studied in detail, and the most widely used approach is the polar coordinate method, based on the transform published by George Box and Mervin Muller in 1958, with the polar variant later proposed by George Marsaglia. This method allows you to obtain a pair of independent normally distributed random variables with mathematical expectation 0 and variance 1 as follows:

z₀ = u·√(−2·ln s / s),   z₁ = v·√(−2·ln s / s),

where z₀ and z₁ are the desired values, s = u² + v², and u and v are random variables uniformly distributed on the interval (−1, 1), chosen so that the condition 0 < s < 1 is satisfied.
Many people use these formulas without even thinking, and many do not even suspect their existence, since they use ready-made implementations. But there are people who have questions: “Where did this formula come from? And why do you get a couple of quantities at once?” Next, I will try to give a clear answer to these questions.


To begin with, let me remind you what probability density, distribution function of a random variable and inverse function are. Suppose there is a certain random variable, the distribution of which is specified by the density function f(x), which has the following form:

This means that the probability that the value of the given random variable falls in the interval (A, B) equals the area of the shaded region. Consequently, the area of the entire region under the curve must equal one, since the value of the random variable always falls within the domain of definition of the function f.
The distribution function of a random variable is the integral of the density function. And in this case, its approximate appearance will be like this:

The meaning here is that the value of the random variable will be less than A with probability B. Consequently, the function never decreases, and its values lie in the interval [0, 1].

An inverse function is a function that returns the argument of the original function when the value of the original function is passed to it. For example, for the function x² the inverse is the square root function, for sin(x) it is arcsin(x), etc.

Since most pseudorandom number generators produce only a uniformly distributed output, there is often a need to convert it to some other distribution, in our case the normal (Gaussian) one:

The basis of all methods for transforming a uniform distribution into any other is the inverse transformation method. It works as follows. A function is found that is inverse to the function of the required distribution, and a random variable uniformly distributed on the interval (0, 1) is passed into it as an argument. At the output we obtain a value with the required distribution. For clarity, I provide the following picture.

Thus, the uniform segment is, as it were, smeared out in accordance with the new distribution, being projected onto another axis through the inverse function. But the problem is that the integral of the density of a Gaussian distribution is not easy to calculate, so the scientists named above had to use a trick.
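As a minimal illustration of the inverse transformation method (my own sketch, not part of the original article), here is how Math.random() can be turned into an exponentially distributed value by applying the inverse of the exponential distribution function F(x) = 1 − e^(−λx); the function name expFromUniform is hypothetical.

// Inverse transform sampling: pass a uniform (0, 1) value through
// the inverse of the target distribution function.
// For the exponential law F(x) = 1 - exp(-lambda * x),
// the inverse is F^-1(y) = -ln(1 - y) / lambda.
function expFromUniform(lambda) {
    var y = Math.random();            // uniform on [0, 1)
    return -Math.log(1 - y) / lambda; // exponentially distributed
}

// The average of many samples should approach 1 / lambda.
var sum = 0, n = 100000;
for (var i = 0; i < n; i++) sum += expFromUniform(2.0);
console.log(sum / n); // approximately 0.5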

There is the chi-square distribution (Pearson distribution), which is the distribution of the sum of squares of k independent standard normal random variables. In the case k = 2, this distribution is exponential (with parameter λ = 0.5).

This means that if a point in a rectangular coordinate system has random X and Y coordinates distributed normally, then after converting these coordinates to the polar system (r, θ), the square of the radius (the distance from the origin to the point) will be distributed according to the exponential law, since the square of the radius is the sum of the squares of the coordinates (by the Pythagorean theorem). The distribution density of such points on the plane looks like this:

f(x, y) = (1 / 2π) · e^(−(x² + y²) / 2)
Since this density is the same in all directions, the angle θ has a uniform distribution in the range from 0 to 2π. The converse is also true: if a point in the polar coordinate system is defined by two independent random variables (an angle distributed uniformly and a radius whose square is distributed exponentially), then the rectangular coordinates of this point will be independent normal random variables. And it is much easier to obtain an exponential distribution from a uniform one, using the same inverse transformation method. This is the essence of the polar Box-Muller method.
Now let's derive the formulas.

x = r·cos θ,   y = r·sin θ   (1)

To obtain r and θ, we need to generate two random variables uniformly distributed on the interval (0, 1) (let's call them u and v), and convert the distribution of one of them (say, v) to exponential in order to obtain the radius. The exponential distribution function looks like this:

F(x) = 1 − e^(−λx)

Its inverse function is:

F⁻¹(y) = −ln(1 − y) / λ

Since the uniform distribution is symmetric, the transformation works in the same way with the function

F⁻¹(y) = −ln(y) / λ

From the chi-square distribution formula it follows that λ = 0.5. Substituting λ and v into this function, we obtain the square of the radius, and then the radius itself:

r² = −2·ln v,   r = √(−2·ln v)

We obtain the angle by stretching the unit segment to 2π:

θ = 2πu
Now we substitute r and θ into formulas (1) and get:

x = cos(2πu)·√(−2·ln v),   y = sin(2πu)·√(−2·ln v)   (2)

These formulas are ready to use as they are. X and Y will be independent and normally distributed with variance 1 and mathematical expectation 0. To obtain a distribution with other characteristics, it is enough to multiply the result by the standard deviation and add the mathematical expectation.
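As a rough sketch (my own illustration, not code from the article), formulas (2) translate to JavaScript like this; the scaling by dev and mean follows the remark above, and the name boxMullerPair is hypothetical.

// Basic (trigonometric) Box-Muller transform, formulas (2).
// Returns a pair of independent N(mean, dev^2) values.
function boxMullerPair(mean, dev) {
    var u = Math.random();
    var v = Math.random();
    while (v === 0) v = Math.random(); // the logarithm is not defined at 0
    var r = Math.sqrt(-2.0 * Math.log(v)); // radius
    var a = 2.0 * Math.PI * u;             // angle
    return [r * Math.cos(a) * dev + mean,
            r * Math.sin(a) * dev + mean];
}

var pair = boxMullerPair(0, 1); // two independent N(0, 1) values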
But it is possible to get rid of the trigonometric functions by specifying the angle not directly but indirectly, through the rectangular coordinates of a random point inside the circle. From these coordinates the length of the radius vector can be calculated, and the cosine and sine are then found by dividing x and y by it, respectively. How and why does this work?
Let us choose a random point from those uniformly distributed in a circle of unit radius and denote the square of the length of the radius vector of this point by the letter s:

s = x² + y²
The selection is made by specifying random rectangular coordinates x and y, uniformly distributed in the interval (−1, 1), and discarding points that do not belong to the circle, as well as the central point, at which the angle of the radius vector is not defined. That is, the condition 0 < s < 1 must be satisfied. Then, as in the case of the Gaussian distribution on the plane, the angle θ will be distributed uniformly. This is obvious: the number of points in each direction is the same, so every angle is equally probable. But there is a less obvious fact: s will also have a uniform distribution. The resulting s and θ are independent of each other. Therefore, we can use the value of s to obtain the exponential distribution without generating a third random variable. Now substitute s into formulas (2) instead of v, and instead of the trigonometric functions use their calculation by dividing the coordinate by the length of the radius vector, which in this case is the square root of s:

z₀ = x·√(−2·ln s / s),   z₁ = y·√(−2·ln s / s)
We obtain the same formulas as at the beginning of the article. The disadvantage of this method is that it discards the points that do not fall inside the circle, that is, only about 78.5% (π/4) of the generated random variables are used. On older computers, the absence of trigonometric functions was a big advantage. Now, when a single processor instruction computes both sine and cosine in an instant, I think these methods can still compete with each other.
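A quick sanity check of that 78.5% figure (my own sketch): the fraction of points of the square (−1, 1) × (−1, 1) that land inside the unit circle is π/4 ≈ 0.785.

// Estimate the share of generated (u, v) pairs accepted by the
// condition 0 < s < 1, i.e. the area ratio pi / 4.
var accepted = 0, total = 1000000;
for (var i = 0; i < total; i++) {
    var u = 2.0 * Math.random() - 1.0;
    var v = 2.0 * Math.random() - 1.0;
    var s = u * u + v * v;
    if (s > 0.0 && s < 1.0) accepted++;
}
console.log(accepted / total); // approximately 0.785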

Personally, I still have two questions:

  • Why is the value of s distributed uniformly?
  • Why is the sum of the squares of two normal random variables distributed exponentially?
Since s is the square of the radius (for simplicity, I call the radius the length of the radius vector that defines the position of a random point), let us first find out how the radii are distributed. Since the circle is filled uniformly, it is obvious that the number of points with radius r is proportional to the circumference of the circle of radius r, and the circumference is proportional to the radius. This means that the distribution density of the radii grows linearly from the center of the circle to its edge, and the density function has the form f(x) = 2x on the interval (0, 1); the coefficient 2 makes the area under the graph equal to one. When such a random variable is squared, its distribution becomes uniform: under this transformation the density must be divided by the derivative of the transformation function (that is, of x²), which gives 2√s / (2√s) = 1 on (0, 1). Visually it happens like this:

If a similar transformation is made for a normal random variable, the density function of its square turns out to resemble a hyperbola, and adding the squares of two normal random variables is a much more involved process that requires double integration. The fact that the result is an exponential distribution I personally can only verify by a practical method (a short simulation sketch is given after the book list below) or accept as given. For those who are interested, I suggest taking a closer look at the topic using these books:

  • Wentzel E.S. Probability Theory
  • Knuth D.E. The Art of Computer Programming, Volume 2
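A minimal simulation sketch (my addition, not from the original article) that empirically checks both facts discussed above: that s = u² + v² is uniform on (0, 1) for an accepted point, and that the sum of squares of two standard normal values is exponentially distributed with λ = 0.5 (mean 2). The helper names sampleS and normalPair are hypothetical.

// 1) For points accepted into the unit circle, s = u^2 + v^2
//    should be uniform on (0, 1), i.e. have mean about 0.5.
function sampleS() {
    while (true) {
        var u = 2 * Math.random() - 1;
        var v = 2 * Math.random() - 1;
        var s = u * u + v * v;
        if (s > 0 && s < 1) return s;
    }
}

// 2) The sum of squares of two standard normal values should be
//    exponential with lambda = 0.5, i.e. have mean about 2.
function normalPair() { // basic Box-Muller, formulas (2)
    var u = Math.random();
    var v = Math.random() || 1e-12; // avoid log(0)
    var r = Math.sqrt(-2 * Math.log(v));
    return [r * Math.cos(2 * Math.PI * u), r * Math.sin(2 * Math.PI * u)];
}

var n = 100000, sumS = 0, sumSq = 0;
for (var i = 0; i < n; i++) {
    sumS += sampleS();
    var p = normalPair();
    sumSq += p[0] * p[0] + p[1] * p[1];
}
console.log(sumS / n);  // approximately 0.5, as for a uniform (0, 1) variable
console.log(sumSq / n); // approximately 2, as for an exponential variable with lambda = 0.5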

In conclusion, here is an example of implementing a normally distributed random number generator in JavaScript:

function Gauss() {
    this.ready = false;  // is the second value of the pair ready?
    this.second = 0.0;   // stored second value of the pair
    this.next = function(mean, dev) {
        mean = mean == undefined ? 0.0 : mean;
        dev = dev == undefined ? 1.0 : dev;
        if (this.ready) {
            // Return the second value of the previously generated pair.
            this.ready = false;
            return this.second * dev + mean;
        } else {
            var u, v, s, r;
            do {
                u = 2.0 * Math.random() - 1.0;
                v = 2.0 * Math.random() - 1.0;
                s = u * u + v * v;
            } while (s > 1.0 || s == 0.0);
            r = Math.sqrt(-2.0 * Math.log(s) / s);
            this.second = r * u; // store the second value of the pair
            this.ready = true;
            return r * v * dev + mean;
        }
    };
}

g = new Gauss(); // create an object
a = g.next();    // generate a pair of values and get the first one
b = g.next();    // get the second
c = g.next();    // generate a pair of values again and get the first one
The parameters mean (mathematical expectation) and dev (standard deviation) are optional. I draw your attention to the fact that the logarithm is natural.

As an example of a continuous random variable, consider a random variable X uniformly distributed over the interval (a; b). The random variable X is said to be uniformly distributed on the interval (a; b) if its distribution density is constant on this interval:

f(x) = c for a < x < b, and f(x) = 0 outside this interval.
From the normalization condition we determine the value of the constant c. The area under the distribution density curve should be equal to unity, and in our case it is the area of a rectangle with base (b − a) and height c (Fig. 1).

Fig. 1. Uniform distribution density
From here we find the value of the constant c:

c = 1/(b − a)

So, the density of a uniformly distributed random variable is equal to

f(x) = 1/(b − a) for a < x < b, and f(x) = 0 outside this interval.
Let us now find the distribution function using the formula F(x) = ∫ f(t) dt (from −∞ to x):
1) for x ≤ a: F(x) = 0;
2) for a < x ≤ b: F(x) = (x − a)/(b − a);
3) for x > b: F(x) = 0 + 1 + 0 = 1.
Thus,

F(x) = 0 for x ≤ a,   F(x) = (x − a)/(b − a) for a < x ≤ b,   F(x) = 1 for x > b.
The distribution function is continuous and does not decrease (Fig. 2).

Fig. 2. Distribution function of a uniformly distributed random variable

Let us find the mathematical expectation of a uniformly distributed random variable using the formula:

M(X) = ∫ x/(b − a) dx from a to b = (a + b)/2

The variance of the uniform distribution is calculated by the formula and equals

D(X) = ∫ x²/(b − a) dx from a to b − [M(X)]² = (b − a)²/12
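As an optional illustration (my own, not from the source text), a quick simulation can confirm these formulas; the values a = 2 and b = 7 are arbitrary example values.

// Compare the sample mean and variance of uniform (a, b) values
// with the theoretical (a + b) / 2 and (b - a)^2 / 12.
var a = 2, b = 7, n = 200000;
var sum = 0, sumSq = 0;
for (var i = 0; i < n; i++) {
    var x = a + (b - a) * Math.random(); // uniform on (a, b)
    sum += x;
    sumSq += x * x;
}
var mean = sum / n;
var variance = sumSq / n - mean * mean;
console.log(mean);     // approximately (a + b) / 2 = 4.5
console.log(variance); // approximately (b - a)^2 / 12 = 2.083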

Example No. 1. The scale division value of a measuring device is 0.2. Instrument readings are rounded to the nearest whole division. Find the probability that an error will be made when taking a reading: a) less than 0.04; b) greater than 0.02.
Solution. The rounding error is a random variable uniformly distributed over the interval between adjacent divisions. Let us take the interval (0; 0.2) as such a division (Fig. a). Rounding can be carried out both towards the left boundary, 0, and towards the right boundary, 0.2, so an error of less than 0.04 can occur near either boundary, which must be taken into account when calculating the probability:

P = P(0 < X < 0.04) + P(0.16 < X < 0.2) = 0.04/0.2 + 0.04/0.2 = 0.2 + 0.2 = 0.4

For the second case, the error exceeds 0.02 when the value is farther than 0.02 from both division boundaries, that is, when it is greater than 0.02 and less than 0.18.

Then the probability of such an error is:

P = P(0.02 < X < 0.18) = (0.18 − 0.02)/0.2 = 0.8
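A small simulation sketch (my own check, not part of the original solution) for both parts of this example:

// Rounding error example: X is uniform on (0, 0.2); the reading error
// is the distance to the nearest division boundary.
var n = 200000, lessThan004 = 0, moreThan002 = 0;
for (var i = 0; i < n; i++) {
    var x = 0.2 * Math.random();    // position within one division
    var err = Math.min(x, 0.2 - x); // distance to the nearest boundary
    if (err < 0.04) lessThan004++;
    if (err > 0.02) moreThan002++;
}
console.log(lessThan004 / n); // approximately 0.4
console.log(moreThan002 / n); // approximately 0.8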

Example No. 2. It was assumed that the stability of the economic situation in the country (absence of wars, natural disasters, etc.) over the past 50 years can be judged by the nature of the population distribution by age: in a calm situation it should be uniform. As a result of the study, the following data were obtained for one of the countries.

Is there any reason to believe that there was instability in the country?

We carry out the solution using a hypothesis-testing calculator. Table for calculating the indicators.

Groups | Midpoint of the interval, x_i | Quantity, f_i | x_i·f_i | Accumulated frequency, S | abs(x − x̄)·f | (x − x̄)²·f | Frequency, f_i/n
0 – 10 | 5 | 0.14 | 0.7 | 0.14 | 5.32 | 202.16 | 0.14
10 – 20 | 15 | 0.09 | 1.35 | 0.23 | 2.52 | 70.56 | 0.09
20 – 30 | 25 | 0.1 | 2.5 | 0.33 | 1.8 | 32.4 | 0.1
30 – 40 | 35 | 0.08 | 2.8 | 0.41 | 0.64 | 5.12 | 0.08
40 – 50 | 45 | 0.16 | 7.2 | 0.57 | 0.32 | 0.64 | 0.16
50 – 60 | 55 | 0.13 | 7.15 | 0.7 | 1.56 | 18.72 | 0.13
60 – 70 | 65 | 0.12 | 7.8 | 0.82 | 2.64 | 58.08 | 0.12
70 – 80 | 75 | 0.18 | 13.5 | 1 | 5.76 | 184.32 | 0.18
Total |  | 1 | 43 |  | 20.56 | 572 | 1
Distribution center indicators.
Weighted average:

x̄ = Σ x_i·f_i / Σ f_i = 43/1 = 43
Variation indicators.
Absolute variations.
The range of variation is the difference between the maximum and minimum values of the characteristic of the primary series.
R = X_max − X_min
R = 70 − 0 = 70
Dispersion characterizes the measure of scatter around the average value (a measure of spread, i.e. of deviation from the average):

D = Σ (x − x̄)²·f / Σ f = 572/1 = 572

Standard deviation:

σ = √D = √572 ≈ 23.92

Each value of the series differs from the average value of 43 on average by about 23.92.
Testing hypotheses about the type of distribution.
4. Testing the hypothesis that the general population is uniformly distributed.
In order to test the hypothesis that X is uniformly distributed, i.e. follows the law f(x) = 1/(b − a) on the interval (a, b), it is necessary to:
1. Estimate the parameters a and b, the ends of the interval in which the possible values of X were observed, using the formulas (the * sign denotes parameter estimates):

a* = x̄ − √3·σ,   b* = x̄ + √3·σ
2. Find the probability density of the assumed distribution f(x) = 1/(b* − a*).
3. Find the theoretical frequencies:
n_1 = n·P_1 = n·1/(b* − a*)·(x_1 − a*)
n_2 = n_3 = ... = n_(s−1) = n·1/(b* − a*)·(x_i − x_(i−1))
n_s = n·1/(b* − a*)·(b* − x_(s−1))
4. Compare the empirical and theoretical frequencies using the Pearson criterion, taking the number of degrees of freedom k = s − 3, where s is the number of initial sampling intervals; if small frequencies (and hence the corresponding intervals) were combined, then s is the number of intervals remaining after the combination.

Solution:
1. Find estimates of the parameters a* and b* of the uniform distribution using the formulas:

a* = x̄ − √3·σ = 43 − √3·23.92 = 1.58
b* = x̄ + √3·σ = 43 + √3·23.92 = 84.42
2. Find the density of the assumed uniform distribution:
f(x) = 1/(b* − a*) = 1/(84.42 − 1.58) = 0.0121
3. Let's find the theoretical frequencies:
n_1 = n·f(x)·(x_1 − a*) = 1 · 0.0121 · (10 − 1.58) = 0.1
n_8 = n·f(x)·(b* − x_7) = 1 · 0.0121 · (84.42 − 70) = 0.17
The remaining n_s will be equal to:
n_s = n·f(x)·(x_i − x_(i−1)) = 1 · 0.0121 · 10 = 0.12

i | n_i | n*_i | n_i − n*_i | (n_i − n*_i)² | (n_i − n*_i)²/n*_i
1 | 0.14 | 0.1 | 0.0383 | 0.00147 | 0.0144
2 | 0.09 | 0.12 | −0.0307 | 0.000943 | 0.00781
3 | 0.1 | 0.12 | −0.0207 | 0.000429 | 0.00355
4 | 0.08 | 0.12 | −0.0407 | 0.00166 | 0.0137
5 | 0.16 | 0.12 | 0.0393 | 0.00154 | 0.0128
6 | 0.13 | 0.12 | 0.0093 | 8.6E-5 | 0.000716
7 | 0.12 | 0.12 | −0.000701 | 0 | 4.0E-6
8 | 0.18 | 0.17 | 0.00589 | 3.5E-5 | 0.000199
Total | 1 |  |  |  | 0.0532
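A short sketch (my own, using the n_i and n*_i values from the table above, with n*_i rounded) that recomputes the observed Pearson statistic:

// Recompute the observed chi-square statistic from the table:
// K_obs = sum over i of (n_i - n*_i)^2 / n*_i.
var empirical   = [0.14, 0.09, 0.10, 0.08, 0.16, 0.13, 0.12, 0.18]; // n_i
var theoretical = [0.10, 0.12, 0.12, 0.12, 0.12, 0.12, 0.12, 0.17]; // n*_i (rounded)
var k = 0;
for (var i = 0; i < empirical.length; i++) {
    var d = empirical[i] - theoretical[i];
    k += d * d / theoretical[i];
}
// About 0.05, close to the 0.0532 in the table; the small difference
// comes from rounding the theoretical frequencies.
console.log(k);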
Let us determine the boundary of the critical region. Since the Pearson statistic measures the difference between the empirical and theoretical distributions, the larger its observed value K_obs, the stronger the argument against the main hypothesis. Therefore, the critical region for this statistic is always right-sided.

A random variable is said to be uniformly distributed on a segment if on this segment its probability distribution density is constant, that is, if the differential distribution function f(x) has the following form:

f(x) = c for a ≤ x ≤ b, and f(x) = 0 outside the segment [a, b]
This distribution is sometimes called the law of uniform density. Of a quantity that has such a distribution on a certain segment, we will say that it is uniformly distributed on this segment.

Let us find the value of the constant c. Since the area bounded by the distribution curve and the Ox axis equals 1,

c·(b − a) = 1,

whence c = 1/(b − a).

Now the function f(x) can be represented in the form

f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 outside the segment [a, b].

Let us construct the distribution function F(x); to do this, we find an expression for F(x) on the interval [a, b]:

F(x) = (x − a)/(b − a)

The graphs of the functions f(x) and F(x) look like this:


Let's find the numerical characteristics.

Using the formula for calculating the mathematical expectation of a continuous random variable, we have:

M(X) = ∫ x/(b − a) dx from a to b = (a + b)/2
Thus, the mathematical expectation of a random variable uniformly distributed on the interval [a, b] coincides with the middle of this segment.

Let us find the variance of a uniformly distributed random variable:

D(X) = ∫ x²/(b − a) dx from a to b − [M(X)]² = (b − a)²/12

from which it immediately follows that the standard deviation is:

σ(X) = (b − a)/(2√3)
Let us now find the probability that the value of a random variable with a uniform distribution falls in an interval (α, β) lying entirely within the segment [a, b]:

P(α < X < β) = (β − α)/(b − a)

Geometrically, this probability is the area of the shaded rectangle. The numbers a and b are called the distribution parameters and uniquely determine the uniform distribution.

Example 1. Buses on a certain route run strictly on schedule, with an interval of 5 minutes. Find the probability that a passenger arriving at the stop will wait less than 3 minutes for the next bus.

Solution:

The bus waiting time is a random variable uniformly distributed on the interval (0; 5). Then the required probability is:

P(0 < X < 3) = 3/5 = 0.6
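An optional simulation check (my addition): model the arrival moment as uniform on (0, 5) and count how often the wait is under 3 minutes.

// Waiting time is uniform on (0, 5) minutes.
var n = 200000, hits = 0;
for (var i = 0; i < n; i++) {
    if (5 * Math.random() < 3) hits++;
}
console.log(hits / n); // approximately 0.6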

Example 2. The edge of a cube, x, is measured approximately, with a ≤ x ≤ b.

Considering the edge of the cube as a random variable X distributed uniformly in the interval (a, b), find the mathematical expectation and variance of the volume of the cube.

Solution:

The volume of the cube is a random variable defined by the expression Y = X³. Then the mathematical expectation is:

M(Y) = ∫ x³/(b − a) dx from a to b = (b⁴ − a⁴)/(4(b − a))

Variance:

D(Y) = ∫ x⁶/(b − a) dx from a to b − [M(Y)]² = (b⁷ − a⁷)/(7(b − a)) − (b⁴ − a⁴)²/(16(b − a)²)
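A small numerical check (my own sketch, with example values a = 1 and b = 2) of these expressions:

// Monte Carlo check of M(Y) and D(Y) for Y = X^3, X uniform on (a, b).
var a = 1, b = 2, n = 500000;
var sum = 0, sumSq = 0;
for (var i = 0; i < n; i++) {
    var y = Math.pow(a + (b - a) * Math.random(), 3); // Y = X^3
    sum += y;
    sumSq += y * y;
}
var meanY = sum / n;
var varY = sumSq / n - meanY * meanY;
// Closed-form values from the formulas above:
var mTheory = (Math.pow(b, 4) - Math.pow(a, 4)) / (4 * (b - a));
var dTheory = (Math.pow(b, 7) - Math.pow(a, 7)) / (7 * (b - a)) - mTheory * mTheory;
console.log(meanY, mTheory); // both approximately 3.75
console.log(varY, dTheory);  // both approximately 4.08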


A distribution is considered uniform if all values of the random variable (within the region where it exists, for example, on the interval [a; b]) are equally probable. The distribution function for such a random variable has the form:

F(x) = (x − a)/(b − a) for a ≤ x ≤ b

Distribution density:

f(x) = 1/(b − a) for a ≤ x ≤ b

Fig. Graphs of the distribution function (left) and the distribution density (right).

