Measuring scales in statistics. Measurement scale in sociology

There are 4 main types of measuring scales.

A nominative scale is a scale that classifies by name: nomen (Lat.) means name, title. The name is not measured quantitatively; it only allows one to distinguish one object from another or one subject from another. A nominative scale is a way of classifying objects or subjects and distributing them into classification cells.

The simplest case of a nominative scale is a dichotomous scale, consisting of only two cells, for example: “has brothers and sisters - the only child in the family”; “foreigner - compatriot”; “voted for - voted against”, etc.

A trait that is measured on a dichotomous scale of names is called alternative. It can take only two values. At the same time, the researcher is often interested in one of them, and then he says that the feature “appeared” if it took on the meaning he was interested in, and that the feature “did not appear” if it took the opposite meaning. For example: “The sign of left-handedness appeared in 8 out of 20 subjects.” In principle, a nominative scale can consist of cells “the trait appeared - the trait did not appear.”

A more complex version of the nominative scale is a classification of three or more cells, for example: “extrapunitive - intrapunitive - impunitive reactions”, or “choice of candidate A - candidate B - candidate C - candidate D”, or “eldest - middle - youngest - only child in the family”, etc.

Having classified all objects, reactions or all subjects into classification cells, we get the opportunity to move from names to numbers, counting the number of observations in each cell.

As already indicated, an observation is one recorded reaction, one choice made, one action performed, or the result of one subject.

Suppose we determine that candidate A was chosen by 7 subjects, candidate B by 11, candidate C by 28, and candidate D by only 1. Now we can operate with these numbers, which represent the frequency of occurrence of different names, that is, the frequency with which the “choice” attribute takes each of its 4 possible values. Next, we can compare the resulting frequency distribution with a uniform or some other distribution.
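The comparison with a uniform distribution can be sketched as a Pearson chi-square calculation on the four choice counts 7, 11, 28 and 1 from the example. This is a minimal illustration, not the procedure of any particular textbook; the function name `chi_square_uniform` is an assumption introduced here.

```python
def chi_square_uniform(observed):
    """Pearson chi-square statistic against equal expected frequencies."""
    total = sum(observed)
    expected = total / len(observed)  # the same expected count in every cell
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1  # degrees of freedom for a goodness-of-fit test
    return statistic, df


counts = [7, 11, 28, 1]  # choices of the four candidates from the example
statistic, df = chi_square_uniform(counts)
print(f"chi-square = {statistic:.2f}, df = {df}")  # chi-square = 34.28, df = 3
```

The statistic is far above 7.815, the tabulated critical value for 3 degrees of freedom at the 0.05 level, so these choices would not be treated as uniformly distributed.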

Thus, the nominative scale allows us to count the frequencies of occurrence of different “names,” or meanings of a characteristic, and then work with these frequencies using mathematical methods.

The unit of measurement with which we operate is the number of observations (subjects, reactions, choices, etc.), or frequency. More precisely, the unit of measurement is one observation. Such data can be processed using the χ² test, the binomial test m, and Fisher's angular transformation φ*.

An ordinal (rank) scale is a scale that classifies according to the principle of “more - less”. Whereas in the nominative scale it did not matter in what order we arranged the classification cells, in the ordinal scale they form a sequence from the “smallest value” cell to the “largest value” cell (or vice versa). It is now more appropriate to call the cells classes, since classes are described as “low”, “medium” and “high”, or as 1st, 2nd, 3rd class, etc.

The ordinal scale must have at least three classes, for example, “positive reaction - neutral reaction - negative reaction” or “suitable for a vacant position - suitable with reservations - not suitable”, etc.

In an ordinal scale, we do not know the true distance between classes, but only that they form a sequence. For example, the classes “suitable for a vacant position” and “suitable with reservations” may actually be closer to each other than the class “suitable with reservations” is to the class “not suitable.”

It is easy to move from classes to numbers if we agree that the lowest class gets rank 1, the middle class gets rank 2, and the highest class gets rank 3, or vice versa. The more classes there are on the scale, the more opportunities we have to process the data mathematically and test statistical hypotheses.

For example, we can evaluate the differences between two samples of subjects based on the prevalence of higher or lower ranks in them, or we can calculate the rank correlation coefficient between two variables measured on an ordinal scale, say, between assessments of a manager’s professional competence given to him by different experts.
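A rank correlation between two experts' assessments can be sketched as follows. This is a minimal illustration of Spearman's coefficient computed as a Pearson correlation on ranks, with tied values sharing the mean of their ranks; the ratings and the function names are hypothetical and serve only to show the mechanics.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical competence ratings (3 classes) of six managers by two experts.
expert_1 = [3, 1, 2, 3, 2, 1]
expert_2 = [3, 2, 2, 3, 1, 1]
print(round(spearman(expert_1, expert_2), 3))
```

With only three classes many ratings are tied, which is why the ranks are averaged rather than assigned arbitrarily. The sketch does not handle the degenerate case where one expert gives everyone the same rating.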

All psychological methods that use ranking are based on an ordinal scale. If a subject is asked to order 18 values according to their significance for him, or to rank a list of personal qualities of a social worker or 10 applicants for this position according to their professional suitability, then in all these cases the subject performs so-called forced ranking, in which the number of ranks equals the number of ranked subjects or objects (values, qualities, etc.).

Regardless of whether we assign one of 3-4 ranks to each quality or subject or perform a forced ranking procedure, in both cases we obtain series of values measured on an ordinal scale. True, if we have only 3 possible classes and, therefore, 3 ranks, and at the same time, say, 20 ranked subjects, then some of them will inevitably receive the same ranks. All the diversity of life cannot fit into 3 gradations, so people who differ quite seriously from each other can fall into the same class. On the other hand, forced ranking, that is, the formation of a sequence of many subjects, can artificially exaggerate the differences between people. In addition, the data obtained in different groups may turn out to be incomparable, since the groups may initially differ in the level of development of the quality under study, and a subject who received the highest rank in one group would receive only an average rank in another, etc.

A way out of the situation can be found by specifying a fairly fine-grained classification system, say, of 10 classes, or gradations, of a characteristic. In essence, the overwhelming majority of psychological methods that use expert assessment are based on measuring different subjects in different samples with the same “yardstick” of 10, 20 or even 100 gradations.

So, the unit of measurement in the order scale is a distance of 1 class or 1 rank, while the distance between classes and ranks can be different (it is unknown to us). All criteria and methods described in this book apply to data obtained on an ordinal scale.

An interval scale is a scale that classifies according to the principle “more by a certain number of units - less by a certain number of units.” Each possible value of the attribute lies at an equal distance from the next.

It might seem that if we measure the time taken to solve a problem in seconds, then this is clearly an interval scale. In reality this is not so, since psychologically the difference of 20 seconds between subjects A and B may not be at all equal to the difference of 20 seconds between subjects C and D, if subject A solved the problem in 2 seconds, B in 22, C in 222, and D in 242.

Similarly, in an experiment measuring muscular volitional effort on a dynamometer with a moving pointer, each second after the first minute and a half may “cost” as much as 10 or even more seconds of the first half-minute. “A second lasts a year,” as one test subject once put it.

Attempts to measure psychological phenomena in physical units - will in seconds, abilities in centimeters, and the feeling of one’s own insufficiency in millimeters, etc., are, of course, understandable, because after all, these are measurements in units of “objectively” existing time and space. However, not a single experienced researcher deludes himself with the idea that he is making measurements on a psychological interval scale. These dimensions still belong to the scale of order, whether we like it or not.

We can only say with a certain degree of certainty that subject A solved the problem faster than B, B faster than C, and C faster than D.

Similarly, the scores that subjects obtain on any non-standardized method turn out to be measured only on an ordinal scale. In fact, only scales in standard deviation units and percentile scales can be considered equal-interval, and then only on condition that the distribution of values in the standardization sample was normal.

The construction of most interval scales rests on the well-known “three sigma” rule: with a normal distribution, approximately 99.7% of all values of a characteristic fall within the range M ± 3σ, where σ is the standard deviation. A scale can be built in units of fractions of a standard deviation that covers the entire possible range of a characteristic, provided the leftmost and rightmost intervals are left open (more on this will be said later).
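The share of a normal distribution lying within a given number of standard deviations of the mean can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption introduced for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```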

R. B. Cattell proposed, for example, the sten (“standard ten”) scale. The arithmetic mean of the “raw” scores is taken as the starting point. To the right and left of it, intervals equal to 1/2 standard deviation are laid off. Fig. 2 shows a scheme for calculating standard scores and converting “raw” scores into stens for factor N of R. B. Cattell's 16-factor personality questionnaire.

Fig. 2. Scheme for calculating standard scores (stens) for factor N of the 16-factor personality questionnaire of R. B. Cattell; below are intervals in units of 1/2 standard deviation

To the right of the mean lie the intervals corresponding to stens 6, 7, 8, 9 and 10, the last of which is open. To the left of the mean lie the intervals corresponding to stens 5, 4, 3, 2 and 1, and the extreme interval is also open. Now we go up to the raw-score axis and mark the interval boundaries in raw-score units. Since M = 10.2 and σ = 2.4, we lay off 1/2σ, i.e. 1.2 “raw” points, to the right. The interval boundary is therefore 10.2 + 1.2 = 11.4 “raw” points, so the interval corresponding to sten 6 extends from 10.2 to 11.4 points. In essence, only one “raw” value falls into it: 11 points. To the left of the mean we lay off 1/2σ and obtain the interval boundary 10.2 - 1.2 = 9. Thus the interval corresponding to sten 5 extends from 9 to 10.2. Two “raw” values fall into this interval: 9 and 10. A subject who scored 9 “raw” points is now assigned 5 stens; one who scored 11 “raw” points is assigned 6 stens, and so on.

We see that on the sten scale different “raw” scores sometimes receive the same number of stens. For example, 16, 17, 18, 19 and 20 points all receive 10 stens, and 14 and 15 points receive 9 stens, etc.
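The conversion described above can be sketched in a few lines, using the mean 10.2 and standard deviation 2.4 from the example. The function name `raw_to_sten` is hypothetical. Two conventions in the sketch are assumptions inferred from the worked values (9 points fall into the fifth interval and 15 points into the ninth): a score lying exactly on a boundary goes to the interval nearer the mean, and a score exactly equal to the mean, which cannot occur with these integer scores, is placed in the fifth interval.

```python
import math


def raw_to_sten(raw, mean, sd):
    """Convert a raw score to a sten using half-standard-deviation intervals."""
    half_sd = sd / 2
    # Distance from the mean in half-sd units; rounding guards against
    # floating-point noise for scores lying exactly on an interval boundary.
    z = round((raw - mean) / half_sd, 9)
    if z > 0:
        sten = 5 + math.ceil(z)  # intervals 6..10 lie to the right of the mean
    else:
        sten = 6 - max(1, math.ceil(-z))  # intervals 5..1 lie to the left
    return min(10, max(1, sten))  # the two extreme intervals are open


for raw in range(6, 17):
    print(raw, raw_to_sten(raw, mean=10.2, sd=2.4))
```

Running the loop reproduces the values given in the text: 9 and 10 points map to 5, 11 points to 6, 14 and 15 points to 9, and 16 points to 10.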

In principle, a sten scale can be constructed from any data measured on at least an ordinal scale, given a sample size of n > 200 and a normal distribution of the characteristic.

Another way to construct an equal-interval scale is to group intervals according to the principle of equality of accumulated frequencies. With a normal distribution of a characteristic, most of all observations are grouped in the vicinity of the average value, therefore, in this area of ​​the average value, the intervals are smaller, narrower, and as they move away from the center of the distribution, they increase (see Fig. 3). Therefore, such a percentile scale is equal interval only relative to the accumulated frequency.

Fig. 3. Percentile scale; at the top, for comparison, intervals are indicated in standard deviation units
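The narrowing of equal-frequency intervals near the centre of a normal distribution can be illustrated numerically. The sketch below is only an illustration under the assumption of a standard normal distribution: it computes decile boundaries with `statistics.NormalDist.inv_cdf` and prints the width of each interval in standard deviation units.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Decile boundaries: each interval between neighbours holds 10% of observations.
cut_points = [standard_normal.inv_cdf(p / 10) for p in range(1, 10)]
widths = [b - a for a, b in zip(cut_points, cut_points[1:])]

for p, width in zip(range(1, 9), widths):
    print(f"{p * 10}th to {(p + 1) * 10}th percentile: width {width:.3f} sd")
```

The intervals next to the median come out at roughly 0.25 standard deviations, while those between the 10th and 20th percentiles are about 0.44, which is the pattern the figure describes.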

Constructing equal interval scales from order scale data is reminiscent of the rope ladder trick referred to by S. Stevens. We first climb the ladder, which is not fixed to anything, and get to the ladder, which is fixed. However, how did we get there? We measured a certain psychological variable on an order scale, calculated means and standard deviations, and then finally obtained an interval scale. As Stevens noted, "A certain pragmatic justification can be given for such illegal use of statistics; in many cases it leads to fruitful results."

Many researchers do not check how well the empirical distribution they obtained agrees with the normal distribution, much less convert the obtained values into fractions of a standard deviation or percentiles, preferring to use “raw” data. “Raw” data often produce a skewed, truncated, or bimodal distribution. Fig. 4 shows the distribution of the muscular volitional effort indicator in a sample of 102 subjects. The distribution can be considered normal with satisfactory accuracy (χ² = 12.7 with ν = 9, M = 89.75, σ = 25.1).

Fig. 4. Histogram and smooth curve of distribution of muscle volitional effort (n=102)

Fig. 5 shows the distribution of the self-esteem indicator on the scale “The level of success that I should have achieved now” of the J. Menester - R. Corzini method (n = 356). The distribution differs significantly from normal (χ² = 58.8 with ν = 7; p < 0.01; M = 80.64; σ = 16.86).

Fig. 5. Histogram and smooth distribution curve of the expected success rate (n=356).

One encounters such “abnormal” distributions very often, more often, perhaps, than classical normal ones. And the point here is not some kind of flaw, but the very specificity of psychological traits. On some methods, from 10 to 20% of subjects receive a “zero” rating: for example, their stories contain not a single verbal formulation that would reflect the motive “hope for success” or “fear of failure” (Heckhausen method). It is normal for a subject to receive a “zero” rating, but the distribution of such ratings cannot be normal, no matter how much we increase the sample size.

Ratio scale

The ratio scale is also called the scale of equal ratios. A feature of this scale is the presence of a firmly fixed zero, which means the complete absence of a property or attribute. The ratio scale is the most informative scale, allowing any mathematical operations and the use of a variety of statistical methods.

The ratio scale is essentially very close to the interval scale: once the starting point is strictly fixed, any interval scale turns into a ratio scale.

It is on the ratio scale that precise and ultra-precise measurements are made in sciences such as physics, chemistry, microbiology, etc. Measurements on the ratio scale are also made in sciences close to psychology, such as psychophysics, psychophysiology, psychogenetics.

Obviously, all measurements must be carried out on specific material. Here we should dwell on the basic definitions related to the concept of a sample.

The population is the entire set of objects in relation to which a research hypothesis is formulated.

A sample is a group of objects limited in number (in psychology: subjects, respondents), specially selected from the general population to study its properties. Accordingly, studying the properties of a population using a sample is called sample-based research. Almost all psychological studies are sample-based, and their conclusions are extended to general populations.

The representativeness of a sample is its ability to represent the phenomena being studied fully enough in terms of their variability in the general population.

Stratified sampling, or selection based on the properties of the general population (dividing the sample into “strata”), involves the preliminary identification of those qualities that can influence the variability of the property being studied (this could be gender, level of income or education, etc.).
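A stratified draw can be sketched as follows. This is a minimal illustration assuming proportional allocation (the same fraction is taken from every stratum), which is only one of several possible schemes; the population, the function name `stratified_sample` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction of units from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # units to take from this stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 60 women and 40 men, stratified by gender.
population = [("f", i) for i in range(60)] + [("m", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0], fraction=0.1)
print(len(sample))  # 10
```

Because each stratum contributes in proportion to its size, the sample keeps the 60:40 split of the population on the stratifying variable.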

The statistical significance of a study's results is determined using statistical inference methods, which impose certain requirements on the sample size.

Theoretical validation in sociological research: Methodology and methods

Thanks to Stanley Stevens, in our research practice we operate with several types of scales. Some criticize this typology, but apparently no one has come up with anything better.

Regardless of the complexity of the questionnaire questions or test techniques you are considering, they can all be divided into three types depending on which measurement scale they belong to. In this case, we are not talking about specific methods for constructing measuring instruments (for example, the Guttman scale or the Thurstone scale), but about the classification of measurement scales proposed by Stanley Stevens in 1946. Knowledge of this classification is crucial for a quantitative approach, since the choice of methods of mathematical statistics depends, among other things, on the measurement scales in which the variables of interest to the researcher are expressed.

Learn more about the concept of "variable"
“Variable” is a frequently used concept within scientific research (not just in the social and behavioral sciences) and especially when we are talking about a quantitative approach and the use of statistical methods. In fact, a variable is any property of the objects being studied that changes from one observation to another. In this case, observations refer to the objects of study (people, organizations, countries, or anything else - it depends on the study itself).
If some property does not change from one observation to another, then it does not provide any valuable information in a mathematical sense (most methods will simply be unusable).
Thus, within the framework of the quantitative approach, the objects being studied are presented as a set of variables that are of interest and subject to study. It is not difficult to guess that variables are primarily divided depending on the scales in which they are expressed. Thus, we can distinguish, for example, nominal, ordinal and metric variables. At the same time, ordinal ones can be divided into collapsed and continuous ordinal ones. Continuous ordinal variables have many numerical values and look (at least at first glance) like metric ones. Collapsed ordinal variables have only a few categories or numerical values (no more than five or six). They can be obtained either by collecting data in collapsed form or by collapsing a continuous ordinal or metric scale.
Another important division of variables is the division into dependent and independent. Often in the process of analysis, hypotheses are put forward about the influence of some variables on others. In such cases, the influencing variables are called independent, and the influenced variables are called dependent. For example, if we are talking about the relationship between the gender of a student and the success of his studies, then gender will be an independent variable, and the success of his studies will be a dependent one.

According to Stevens's classification, in the most general form, three types of scales can be distinguished:
- nominal,
- ordinal,
- metric.

The nominal scale includes a class of variables whose values can be divided into groups, but cannot be ranked. Examples of such variables are gender, nationality, religion, etc. Let us consider nationality in more detail. Respondents can be divided into different groups depending on what nationality they consider themselves to be. However, this information does not allow us to sort respondents by a quantitative expression of the parameter we are interested in, because nationality is not a measurable property in the traditional sense of the word.
The ordinal scale includes a class of variables whose values can not only be divided into groups, but also ranked depending on the severity of the property being measured. A classic example of an ordinal scale is the Bogardus scale, designed to measure national distance. Below is a version adapted for the population of Ukraine (N. Panina, E. Golovakha):

Questionnaire task
For each nationality listed below, select the one position closest to how you personally would be willing to accept representatives of that nationality.
Response scale
1) as members of my family;
2) as close friends;
3) as neighbors;
4) as colleagues at work;
5) as residents of Ukraine;
6) as visitors to Ukraine;
7) would not allow them into Ukraine at all.

This scale allows you to order respondents depending on their attitude towards a particular nationality. However, it provides only approximate information, which does not make it possible to accurately assess the differences between the scale gradations. So, for example, we can argue that a respondent who is ready to accept Jews as members of his family has a better attitude towards them than one who is ready to accept them only as neighbors. At the same time, we cannot say by how much or by how many times the first respondent's attitude is better than the second's. In other words, we have no grounds for assuming equal intervals between scale items.
The metric scale includes a class of variables whose values can not only be divided into groups and ranked, but also determined in exact terms (the same “by how much?” and “by how many times?”). Typical examples of such variables are age, salary, number of children, etc. Each of them can be measured as accurately as possible: age in years, salary in hryvnia, number of children in... pieces ;)
Naturally, if a variable can potentially be expressed in a metric scale, then the same variable can be expressed in an ordinal scale.

For example, age can be expressed in age groups (youth, middle age, old age), which provide only approximate information about the respondent, despite the possibility of ranking them.
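Collapsing a metric variable into an ordinal one can be sketched as a simple mapping. The cut-offs of 30 and 60 years and the sample ages are illustrative assumptions; the text names the groups but does not define their boundaries.

```python
def age_group(age):
    """Collapse age in years (metric) into an ordered category (ordinal)."""
    # The cut-offs 30 and 60 are illustrative assumptions, not from the text.
    if age < 30:
        return "youth"
    if age < 60:
        return "middle age"
    return "old age"


ages = [19, 34, 61, 45, 72]  # hypothetical respondents
print([age_group(a) for a in ages])
# ['youth', 'middle age', 'old age', 'middle age', 'old age']
```

The mapping works in one direction only: the groups can still be ranked, but the exact ages cannot be recovered from them.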
Belonging to a metric scale opens up the possibility of using any statistical methods. In turn, belonging to an ordinal or nominal scale limits the choice of mathematical tools (in the case of an ordinal scale, to a lesser extent, and in the case of a nominal scale, to a greater extent). The classification of statistical methods is given.
In order to make the differences between the nominal, ordinal and metric scales even more obvious, I will give an additional example dedicated to the rating of professional heavyweight boxers according to boxrec.com (information current as of 01/31/2012). At the same time, we will look at data regarding the top ten boxers according to three variables: the ethnicity of the boxer, his place in the ranking and the number of rating points that he had on January 31, 2012.

A) Ethnicity (nominal scale). Three boxers (the Klitschko brothers and Dimitrenko) are Ukrainian, one (Povetkin) is Russian, one (Adamek) is Polish, two (Chambers and Thompson) are American, one (Fury) is British, one (Helenius) is Finnish, and one (Pulev) is Bulgarian. Thus, the variable “ethnicity” lets us divide all the boxers into 7 groups. With this data alone, a person far from boxing cannot say anything about how successful the listed boxers are, although he does learn the ethnic composition of the 10 best heavyweights (we will return to this hypothetical analyst below):
Ukrainians - 30%;
Americans - 20%;
Russians, Poles, British, Finns and Bulgarians - 10% each.
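The percentages above are plain frequency counts, which is all a nominal variable permits. A minimal sketch with `collections.Counter`, using the ethnicities of the ten boxers as stated in the text (the variable names are introduced here for illustration):

```python
from collections import Counter

# Ethnicity of the ten boxers as stated in the text.
ethnicity = [
    "Ukrainian", "Ukrainian", "Russian", "Polish", "American",
    "British", "Finnish", "American", "Ukrainian", "Bulgarian",
]

counts = Counter(ethnicity)
shares = {group: 100 * n / len(ethnicity) for group, n in counts.items()}

for group, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{group}: {share:.0f}%")
```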
B) Place in the ranking (ordinal scale) gives approximate information about the boxer's success. The situation is as follows:
1. Wladimir Klitschko
2. Vitali Klitschko
3. Alexander Povetkin
4. Tomasz Adamek
5. Eddie Chambers
6. Tyson Fury
7. Robert Helenius
8. Tony Thompson
9. Alexander Dimitrenko
10. Kubrat Pulev
Now our uninformed analyst knows the order of the top ten heavyweight boxers. And although the numbers from 1 to 10 are already present here, he still cannot perform any mathematical operations other than comparison. For example, he cannot say that Wladimir Klitschko is 4 units better than Eddie Chambers. The expression “5 minus 1” does not make sense in this case. Regarding these two boxers, he can only say that Wladimir Klitschko is a better boxer than Eddie Chambers (and than everyone else in the top ten). Mathematical operations are impossible because the intervals between places 1 to 10 are not equal. What the actual intervals between places are can be seen thanks to the last variable.
C) Number of rating points (metric scale). This indicator

The use of certain statistical methods is determined by which statistical scale the obtained material belongs to. S. Stevens proposed to distinguish between four statistical scales:

1. scale of names (or nominal);

2. order scale;

3. interval scale;

4. ratio scale.

Knowing the typical features of each scale, it is not difficult to determine which of them the material subject to statistical processing should be classified as.

Name scale. This scale includes materials in which the objects being studied differ from each other in their quality.

When processing such materials, there is no need to arrange these objects in any order based on their characteristics. In principle, objects can be arranged in any order.

Here is an example: the composition of an international scientific conference is being studied. Participants include French, English, Danes, Germans and Russians. Does the order in which participants are arranged matter when examining the composition of a conference? You can arrange them alphabetically, this is convenient, but it is clear that there is no fundamental significance in this arrangement. When translating these materials into another language (and therefore into another alphabet), this order will be disrupted. You can arrange national groups according to the number of participants. But when comparing this material with the material of another conference, we find that this order is unlikely to be the same. Objects assigned to the naming scale can be placed in any order depending on the purpose of the study.

When statistically processing this kind of material, one must take into account the number of units each object is represented by. There are very effective statistical methods that allow you to come to scientifically significant conclusions from these numerical data (for example, the chi-square method).

Order scale. If in the naming scale the order of the studied objects plays practically no role, then in the order scale - this is clear from its name - it is to this sequence that all attention is switched.

This scale in statistics includes such research materials in which objects are considered that belong to one or more classes, but differ when they are compared one with another - “more-less”, “higher-lower” - etc.

The easiest way to show the typical features of the order scale is to look at the published results of any sporting competition. These results sequentially list the participants who took first, second, third and next in order places, respectively. But in this information about the results of competitions, information about the actual achievements of athletes is often absent or fades into the background, and their ordinal places are put in the foreground.

Let's say chess player D. took first place in the competition. What are his achievements? It turns out he scored 12 points. Chess player E. took second place. His achievement is 10 points. Third place was taken by J. with eight points, fourth by Z. with six points, etc. In reports about the competition, the difference in achievements fades into the background, and the ordinal places come first. Giving the main importance to the ordinal place has its own logic. In our example, Z. scored six and D. scored 12 points. These are their absolute achievements, the games they won. If we tried to interpret this difference purely arithmetically, we would have to admit that Z. plays half as well as D. But we cannot agree with this. The circumstances of a competition are not always simple, nor is the way a particular participant played it. Therefore, refraining from treating the arithmetic as absolute, one limits oneself to stating that chess player Z. lags behind D., who took first place, by three ordinal places.
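The step from points to ordinal places can be sketched with the four results from the example (D. 12, E. 10, J. 8 and Z. 6 points). This is only an illustration of what is kept and what is dropped when scores are reduced to places.

```python
# Points scored by the four chess players in the example.
points = {"D": 12, "E": 10, "J": 8, "Z": 6}

# Ordinal places: sort by points in descending order and number from 1.
ordered = sorted(points, key=points.get, reverse=True)
places = {player: place for place, player in enumerate(ordered, start=1)}

print(places)                     # {'D': 1, 'E': 2, 'J': 3, 'Z': 4}
print(places["Z"] - places["D"])  # 3 (gap in ordinal places)
print(points["D"] / points["Z"])  # 2.0 (ratio of points, not of playing strength)
```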

One of the most common problems encountered in survey design and instrumentation is how to assign a single representative value or score to a complex attitude or behavior. As an example, consider how one might measure public bias against college students. Such bias can manifest itself in a variety of forms, depending on what characteristics of students the attention of a particular individual (respondent) is focused on. Thus, some people judge students by their clothes, others by their manners, others by their behavior in everyday life, by their socioeconomic status, and even by their level of personal hygiene. Others may have formed a stereotypical opinion based on just one or two meetings (pleasant or not) with some specific students; and some may be barely able to distinguish a student from other people at all. The elements of judgment can vary greatly in content, focus, and degree of evaluation, but each of them represents, at least potentially, a component of the broader concept of “bias.”

If it is necessary to take into account all these points, then we need to select an instrument that will be able to identify and measure as many of these constituent elements of concepts as possible and at the same time be accurate enough to allow us to meaningfully determine the degree of manifestation of a general concept in a single observation. In other words, we need a tool that would capture and display a concept like the concept of “bias” in all its details, and in addition, would show us how much (what proportion) of this concept is contained in a particular case or response of the respondent. One such means is called scaling.

Scaling is a procedure for combining a number of relatively narrow indicators (for example, survey items relating to individual characteristics of students noted by respondents) into a single summary measure, which is taken to reflect a broader underlying concept (in our case, prejudice), of which each individual characteristic is a part. Thus, one could measure the respondent's attitudes toward various types of student behavior (for example, how much alcohol they drink, or how noisy their parties are) or student manners (how swaggering, arrogant, or inconsiderate of other people they are), but we could not accept any of these signs separately as a full reflection of such a broad concept as prejudice. Rather, we should somehow bring all these measures together in order to be able to draw conclusions about the more general point of view that each of them in some way complements and reflects. Moreover, we need to solve this problem in such a way that we can compare the amount of bias (or whatever concept we are measuring) contained in one respondent's answer with the amount of it contained in another respondent's answer, and ultimately judge which of the respondents is more prejudiced.

A unifying measure that reflects a certain basic concept is called a scale. The particular value that expresses the degree to which the basic concept is manifested in a given case is called a scale score. Scaling, or scale construction, is the procedure by which the researcher constructs a scale and assigns scores on that scale to individual cases.
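The idea of combining several narrow indicators into one scale score can be sketched as follows. This is only an illustration under stated assumptions: the text does not prescribe how items are combined, so a simple summated scale (the mean of the answered items) is assumed here, and the item names and the 1-5 answer format are hypothetical.

```python
from statistics import mean

# Hypothetical survey items, each answered on a 1-5 agreement scale
# (1 = strongly disagree, 5 = strongly agree). Names are illustrative only.
ITEMS = ["clothes", "manners", "behavior", "hygiene"]

def scale_score(responses):
    """Combine item responses into one scale score (the mean of answered items).

    Items the respondent skipped (None or missing) are ignored; if nothing
    was answered, no score can be assigned and None is returned.
    """
    answered = [responses[i] for i in ITEMS if responses.get(i) is not None]
    return mean(answered) if answered else None

respondent_a = {"clothes": 4, "manners": 5, "behavior": 4, "hygiene": 3}
respondent_b = {"clothes": 2, "manners": 1, "behavior": 2, "hygiene": None}

score_a = scale_score(respondent_a)
score_b = scale_score(respondent_b)

# The two respondents can now be compared on a single number.
print(score_a, score_b, score_a > score_b)
```

Whether such a sum is meaningful depends on the validity and reliability of the items, not on the arithmetic itself.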

Scaling is a method of modeling real processes using scales.

Scaling is a method of assigning numerical values to individual attributes of a system.

Scaling allows you to break down the description of a complex process into a description of parameters on separate scales. As a result, when applied to economic problems, for example, one can get an idea of the consumer’s area of interest and explore the importance of each scale for him.

Scale (Latin scala - ladder) - a correspondence between the results of measuring a quantity and points on a number line.

A scale is a set of designations, the relationships between which reflect the relationships between objects of the empirical system. A scale can be called the measurement results obtained in a study, as well as a measurement tool (i.e., a system of questions, a questionnaire, a test).

1.2 Types of scales and types of scaling

Scales are divided by type, according to the relationships they reflect. In addition, each scale corresponds to the mathematical transformations acceptable for this scale. Types of scales are hierarchically ordered by complexity. In psychometrics, econometrics, and applied statistics, the following classification of scales, proposed in 1946 by Stanley Smith Stevens, is accepted:

– scale of names (nominal) – the simplest of scales. Numbers are used to distinguish objects. Displays those relationships through which objects are grouped into separate non-overlapping classes. The class number does not reflect its quantitative content. An example of a scale of this kind is the classification of subjects into men and women, the numbering of players on sports teams, etc. Phone numbers, passports, bar codes of goods, individual taxpayer numbers are measured in a scale of names;

– ordinal scale – displaying order relationships. Subjects in this scale are ranked. For this scale, a monotonic transformation is acceptable. Such a scale is crude because it does not take into account the differences between the subjects of the scale. An example of such a scale: academic performance scores (unsatisfactory, satisfactory, good, excellent), Mohs scale;

– interval scale – in addition to the relationships indicated for the name and order scales, displays the relationship of the distance (difference) between objects. The differences at all points on this scale are equal. A linear transformation is acceptable for it. This allows you to reduce test results to common scales and thus compare indicators. Example: Celsius scale.

– ratio scale – unlike the interval scale, it can reflect how many times one indicator is greater than another. The ratio scale has a zero point, which characterizes the absence of the quality being measured. This scale allows similarity transformation (multiplication by a positive constant). Determining the zero point is a difficult task for psychological research and imposes limitations on the use of this scale. Using such scales, mass, length, strength, and value (price) can be measured. Example: Kelvin scale (temperatures measured from absolute zero, with the unit of measurement chosen by agreement of experts: the kelvin, equal in size to one degree Celsius).

Difference scale – the starting point is arbitrary, the unit of measurement is specified. Acceptable transformations are shifts. Example: time measurement.

Absolute scale - it contains an additional feature - the natural and unambiguous presence of a unit of measurement. This scale has a single zero point. Example: number of people in the audience.

The issue of the type of scale is directly related to the problem of the adequacy of methods for mathematical processing of measurement results. In general, adequate statistics are those that are invariant with respect to admissible transformations of the measurement scale used.
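The notion of an adequate statistic can be illustrated with a small sketch. The data and the relabeling below are hypothetical: two groups are scored on an ordinal scale, and the codes are then relabeled by a strictly increasing (monotonic) function, which is an admissible transformation for an ordinal scale. The comparison of medians survives the relabeling, while the comparison of means does not.

```python
from statistics import mean, median

# Hypothetical ordinal codes (e.g. 1 = unsatisfactory ... 4 = excellent).
group_x = [1, 1, 4]
group_y = [2, 2, 3]

def relabel(code):
    """A strictly increasing relabeling: admissible for an ordinal scale."""
    return {1: 1, 2: 2, 3: 3, 4: 10}[code]

x_new = [relabel(v) for v in group_x]
y_new = [relabel(v) for v in group_y]

# Comparison by medians gives the same answer before and after relabeling.
median_order_before = median(group_x) < median(group_y)
median_order_after = median(x_new) < median(y_new)

# Comparison by means flips, so the mean is not adequate for ordinal data.
mean_order_before = mean(group_x) < mean(group_y)
mean_order_after = mean(x_new) < mean(y_new)

print(median_order_before, median_order_after)
print(mean_order_before, mean_order_after)
```

A conclusion that changes when the labels are changed in an admissible way says something about the labels, not about the objects being measured.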


Fig. 1. Classification of scaling methods

The scaling methods used in sociological research can be divided into comparative and non-comparative.

Comparative scales involve direct comparison of the objects in question. For example, respondents are asked whether they prefer Coke or Pepsi. Data from comparative scales are considered relative and have the properties of only ordinal and rank values. Therefore, comparative scaling is also called non-metric. As shown in Fig. 1, comparative scales include pairwise comparison, ordinal ranking, constant sum scales, Q-sorting and other operations.

Comparative scales are one of two scaling methods that involve direct comparison of the objects in question.
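A minimal sketch of how pairwise comparison yields a rank order is given below. The answers are hypothetical (one respondent, three brands), and counting "wins" is only one simple way of turning paired choices into a ranking; the result carries ordinal information only.

```python
from collections import Counter
from itertools import combinations

brands = ["Coke", "Pepsi", "RC Cola"]

# Hypothetical answers of one respondent: for each pair, the preferred brand.
preferred = {
    ("Coke", "Pepsi"): "Coke",
    ("Coke", "RC Cola"): "Coke",
    ("Pepsi", "RC Cola"): "RC Cola",
}

# Count how many paired comparisons each brand wins.
wins = Counter({b: 0 for b in brands})
for pair in combinations(brands, 2):
    wins[preferred[pair]] += 1

# More wins means a higher place; the gaps between places mean nothing.
ranking = sorted(brands, key=lambda b: wins[b], reverse=True)
print(ranking)
```

Adding a fourth brand requires three new comparisons from every respondent, which illustrates why the analysis is tied to the fixed set of objects under consideration.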

The main advantage of comparative scaling is the ability to recognize subtle differences between the objects under consideration. When comparing two objects, respondents have to choose between them. In addition, respondents complete the task based on the given preference scores. This makes comparative scales easy to understand and apply. Another advantage of these scales is the comparatively smaller number of theoretical assumptions used, as well as the elimination of the halo effect, or carryover effect, in which a strong preference for one product distorts the comparative assessment of others. The main disadvantage of comparative scales is their ordinal nature and the limitation of the analysis to the particular set of objects under consideration. For example, a new study would have to be conducted to compare RC Cola with Coke and Pepsi. These disadvantages are largely eliminated when using non-comparative scaling methods.

When using noncomparative scales, also called monadic or metric scales, each object in the original population under consideration is evaluated independently of the others. The obtained data are considered to be measured on an interval or ratio scale.

Noncomparative scales are one of two scaling methods; they involve an independent assessment of each object.

For example, respondents may be asked to rate Coke on a preference scale from 1 to 6 (1 = do not like at all, 6 = like very much). Pepsi and RC Cola are rated in the same way. Fig. 1 shows that non-comparative rating scales can be continuous or itemized. Itemized rating scales, in turn, are divided into Likert, semantic differential and Stapel scales. In marketing research, non-comparative scaling is most often used. This section discusses comparative scaling techniques.

1.3 Main problems in constructing scales

From the above, scaling may seem to be a fairly simple, straightforward procedure, in which the researcher’s task is simply to identify a number of components of the main concept, establish what indicator can be used to measure each of them, and then combine these indicators into a total assessment “...by uttering a few magic words or statistical spells, and - one-two! - it is done.” Unfortunately, this apparent simplicity is deceptive, because when selecting and interpreting the components of a scale, we may encounter a number of pitfalls that require special care. First, there are problems associated with the concepts of validity and reliability.

Validity is a property determined by the answer to the question: “Are we really measuring what we want to measure?” In our current context, this question can be somewhat transformed as follows: “Is there any reason to believe that each of the individual components of the scale (each of the specific questions) is indeed directly related to the main concept and that all components together fully capture this concept?” In other words, we must ask the question: “Is there any real sense in combining a number of partial indicators with each other, and - since we have already done this - is there any point in labeling this series of indicators with the label of the basic concept we have chosen?” Thus, turning again to the example of students, it is necessary to find out, firstly, whether a person’s opinion about the behavior of students is directly related to his opinion about the student’s clothing style or about the students’ manners, and secondly, whether all these opinions in the aggregate really reflect the degree to which a given person is prejudiced against students.

As for reliability, it is determined by the answer to the question: “Regardless of what exactly we measure, do we do it consistently?” When applied to scaling, this issue translates into a concern that the various indicators that make up the scale are related to each other in a consistent and meaningful way. What we are really interested in here is not whether a given set of questions or measures allows us to distinguish apples from oranges, but rather whether this set allows us to consistently sort the apples we have already identified by size, color, etc., according to some standard. If so, then combining the different measures will say more about apples than any single measure. But if our standards (color, size, etc.) are inconsistent or ambiguous, then the observations based on them may turn out to be false.
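One common way to put a number on this kind of internal consistency is Cronbach's alpha. The text does not name this statistic, so the sketch below is an assumption brought in for illustration, and the response data are hypothetical. Alpha compares the sum of the individual item variances with the variance of the total score: values close to 1 indicate that the items vary together.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent (k >= 2)."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the summed score across respondents.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: the three items rise and fall together.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
# Hypothetical data: the three items are unrelated to each other.
inconsistent = [[1, 4, 2], [2, 1, 4], [3, 3, 1], [4, 2, 3]]

alpha_consistent = cronbach_alpha(consistent)
alpha_inconsistent = cronbach_alpha(inconsistent)
print(alpha_consistent, alpha_inconsistent)
```

A high alpha says only that the items behave consistently; it says nothing about whether they measure the intended concept, which is the separate question of validity.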

Perhaps another example will help make these points clearer. Let's consider a certain scale designed for each respondent to express his agreement or disagreement with the following statements:

1. Cubans are bad and cannot be trusted.

2. The French are bad and cannot be trusted.

3. The Japanese are bad and cannot be trusted.

4. The Chinese are bad and cannot be trusted.

Let's imagine that we have a scale for measuring xenophobia, that is, fear and distrust of foreigners. Presumably, the more statements a respondent agrees with, the higher the level of xenophobia we can attribute to him. But will this be the case? A person who believes that only Cubans are bad and cannot be trusted asserts this more out of anti-communism than xenophobia. In turn, a person who believes that only the Japanese and Chinese are bad and cannot be trusted asserts this more out of racism than xenophobia. And even the respondent who believes that all four groups are bad and cannot be trusted may, upon closer examination, turn out to suffer not from xenophobia, but rather from the feeling that all people, or all governments (even that of the country where he lives), are bad and should not be trusted. Therefore, since we cannot confidently say that this scale measures xenophobia as such, the scale is invalid. And is it even reliable? Is it designed thoughtfully enough to measure the level of xenophobia at all? Fear and mistrust of the Chinese, for example, may be an indicator of at least two very different characteristics, one ideological and the other racially motivated, and two respondents might give the same answer for very different reasons. And will the feeling of xenophobia be the same for an anti-communist and a racist? Most likely not. Thus, the mechanical combination of these specific items for the purpose of comparison will at best be an exercise in futility, and at worst a source of erroneous conclusions.

Problems of this kind are not always easy to overcome, and because of this, when scaling you need to act very carefully, calculating everything in advance. However, the ability to represent a complex attitude or behavior as a single number or score, which is an undeniable advantage of scaling, serves as an incentive to use this technique in a wide variety of cases.

2. ROLE OF SCALES IN THE DATA ANALYSIS PROCESS

A measuring scale is an algorithm for assigning a number to an object, reflecting the presence or degree of expression of a certain property. There are four main types of measuring scales: name scale, order scale, interval scale and ratio scale. Scales of names and order allow an object to be classified into one of several non-overlapping classes and are called “qualitative”. Interval and ratio scales measure the “amount” or degree of expression of some property in an object and are called “quantitative”.

The naming scale (nominal scale) allows you to classify an object into one of several classes between which no order relationship has been established, i.e. classes to which comparisons such as “more - less”, “better - worse”, etc. do not apply. Such sociological indicators as gender, nationality or race, eye color, temperament, etc. are measured on nominal scales. When developing a nominal scale, a complete list of classes is compiled and numbered in arbitrary order. In this case, the numbers representing class numbers play the role of symbols or “labels”; no arithmetic operations can be applied to them. In other words, on the nominal scale only the relation of identity is defined: objects assigned to one class are considered identical, objects assigned to different classes are not identical. A special case of a nominal scale is a dichotomous scale, which records the presence or absence of a certain property in an object. The presence of the property is usually denoted by the number “1”, its absence by the number “0”.

The order scale is intended to assign an object to one of several disjoint classes, ordered according to some criterion. On the order scale, in addition to the identity relation, the order relation (“more - less”) is defined. Thus, about objects assigned to different classes, we can say that in one of them the measured property is expressed more strongly than in the other, but it is impossible to determine how much more strongly. Typical examples of an order scale are education, type of settlement, social status, military ranks, etc. When constructing an order scale, classes are numbered in ascending or descending order of the corresponding characteristic. Arithmetic operations on class numbers are not performed.

A special case of an order scale is a rank scale, used in cases where some attribute cannot be measured, but objects can be ordered according to the corresponding criterion, or when the order of objects is more important than the exact result of measurement - for example, places taken in sports competitions. Rank scales are also used in the study of preferences, value orientations, motives, attitudes, etc. In this case, the respondent is asked to order the proposed list of objects, concepts or judgments according to a certain criterion. Another special case of an order scale is a rating scale, with the help of which the properties of an object or the respondent’s attitude towards something are assessed with a certain number of points. For example, academic performance is assessed on a 5-point scale. Rating scales are often seen as an exception among order scales because they assume that there is approximately equal distance between points on the scale. For example, it is assumed that an “excellent” student knows the subject as much better than a “good” student as the “good” student knows it better than a “C” student. This property allows rating scales to be treated as quasi-interval in many cases and used accordingly, for example, to calculate the GPA or to determine the average performance in a class.

Interval and ratio scales are measuring scales in the literal sense of the word. They are characterized by the presence of a unit of measurement that allows one to determine by how much one object is larger or smaller than another according to the criterion being studied. The difference between these two types of scales is that the ratio scale has an “objective” zero, independent of the arbitrariness of the observer, which, as a rule, corresponds to the complete absence of the measured quality in the object. On the interval scale, zero is set arbitrarily or in accordance with some tradition or agreement. Thus, age is measured on a ratio scale, and chronology is measured on an interval scale, although both scales use the same unit of measurement - the year. On the interval scale, in addition to the relations of identity and order, the relation of difference is defined: for any pair of objects it is possible to determine by how many units of measurement one object is larger or smaller than the other. Interval scales are widely used in psychological tests and psychometrics, semantic differential techniques, and other methods of secondary measurement. Ratio scales measure indicators such as height, age, income, work experience, number of cigarettes smoked, etc. For such variables, not only the relations of identity, order and difference are defined, but also the relation of ratio, which makes it possible to determine how many times one object is larger or smaller than another.

Measurement is the mapping of an empirical system into a numerical system that preserves the order of relationships between objects. The classical concept of measurement distinguishes two ways of assigning variable values to objects. The first method is called assessment. The mapping of object properties onto the scale is carried out here in conventional units. For example, it is possible to determine with varying degrees of accuracy a person’s place on the “conservatism” scale. There is no unit of conservatism at the researcher’s disposal; gradations can change arbitrarily.

The measurement itself requires the definition of a unit - the standard of the scale. In this case, only spatial and temporal characteristics can be measured, as well as numbers—additive quantities. However, a broader view of measurement as the assignment of meanings to objects in accordance with a given system of relationships at various levels has gained acceptance in the social and behavioral sciences.

A variable is not the same as an actual attribute or property. This is a kind of ruler - a set of norms and operations that are necessary and sufficient to qualify an event, property, relationship, in a word, everything that is commonly understood as facts. For a ruler, it is not very important whether its divisions are applied to a wooden, plastic or metal plate. Much more important is the graduation of the scale, as well as the user’s ability to take measurements correctly. The situation is similar when measuring behavior, only the “ruler” in this case has the form of a questionnaire (or observation form), and “applying” them to an object is nothing more than an operational definition.

As a measurement instrument, a variable is constructed by the researcher by establishing a continuum of values (gradations). The minimum minimorum of the continuum, as we already know, is a dichotomy: “yes” and “no,” plus and minus, affirmation and negation. In fact, we almost always deal with trichotomies, since any variable includes a gradation of “no answer” (or “no data”).

Thus, the variable contains three components: 1) some not always clearly formulated concept of the attribute being measured, for example, “electoral preferences”, “family stability”, “education”, etc.; 2) scale - a set of values that define the criteria for classifying objects; 3) operational definition - a set of instructions regulating the process of identifying an object according to an established scale of values.

The elementary level of measurement is nominal. This level corresponds to a naming scale, which consists of characteristic values that are not ordered by increasing or decreasing degrees. Typical examples of a scale of names: nationality, profession, political beliefs. The naming scale values are constructed in accordance with logical classification rules. The first of these is the rule of non-contradiction. It states: “An object can be assigned to one and only one class, as specified by the value of the variable.” In other words, the researcher is obliged to call a spade a spade and avoid dialectics, in which an object simultaneously turns out to be both. This is not as easy to do as it seems - to call a thing by its proper name. Reactionaries sometimes seem to be liberals, stupid people seem smart, women seem like men. But even in the most difficult situations, the analyst is obliged to give an unambiguous qualification to the object. Much is allowed here. The only thing that is prohibited is to classify an object as white and black at the same time.

A consequence of this rule is that the frequencies of all gradations of the variable sum to one hundred percent. If the sum of frequencies exceeds one hundred percent, it means that at least some units fell into two classes simultaneously and were counted more than once. This happens when the questionnaire offers a multiple-choice list, where you can choose the first option, and the second, and the third. For example, it is asked: “What do you love most?” with answer options: matzo, shish kebab, liberal democratic freedoms... Here you can choose all the prompted options, and the total will not be 100% if at least one of the respondents falls into the class of those who love both matzo and liberal democratic freedoms. The reason for the distortion is that the given positions do not constitute a single variable; on the contrary, each of them is a “truncated” version of a variable. The full version requires the answers “Yes”, “No” and “I can’t say” for each position. A correctly constructed variable represents a one-dimensional continuum. Unlike composite measures, it does not require aggregation. Hence the second rule - the rule of a single basis for classification. You can’t divide people into smart people and red-haired people, because sometimes red-haired people turn out to be smart. You cannot mix two different variables in one question. Nor can one ignore the change in the meaning of a variable when it is moved to another context. For example, a question about the attitude towards intellectuals asked in Moscow and Chicago will turn out to be two different questions, because in the Russian tradition it is customary to attribute to the intellectual the role of a bearer of moral principles, while a resident of Chicago will not immediately guess who is meant by “intellectual”.
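The one-hundred-percent check described above can be sketched in a few lines. The answers below are hypothetical; the function simply computes the share of respondents who chose each option.

```python
from collections import Counter

def percentages(answers):
    """answers: one list of chosen options per respondent."""
    n = len(answers)
    counts = Counter(option for chosen in answers for option in chosen)
    return {option: 100 * count / n for option, count in counts.items()}

# Each respondent picks exactly one option: the classes do not overlap.
single_choice = [["matzo"], ["shish kebab"], ["freedoms"], ["matzo"]]
# The first respondent picks two options and is therefore counted twice.
multi_choice = [["matzo", "freedoms"], ["shish kebab"], ["freedoms"], ["matzo"]]

single_total = sum(percentages(single_choice).values())
multi_total = sum(percentages(multi_choice).values())
print(single_total, multi_total)
```

A total above 100 signals that the list of options is being treated as one variable when it is really several yes/no variables, one per option.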

The third rule is the rule of completeness. In the population under study there should not be a single object that cannot be identified by the given values. In other words, the object must be distributed on the continuum of the variable and receive its due place in one of the classes. If this does not happen, the measurement process “freezes” - there is simply nothing or no one to apply the ruler to. Note that the “No Data” position solves the problem of completeness when the scale does not cover the entire range of values. For example, a respondent's refusal to report his or her age does not mean that the age scale is irrelevant to the item. Examples of scales that have no relation to the object, in other words, are not relevant to it, are numerous. Sociologists often try to measure opinions, attitudes, and other personal characteristics, assuming that everyone has the property being studied. For example, the question: “What is your attitude towards Burbulis?”, asked by some public opinion research centers in 1992, was based on the belief that the property “Attitude towards Burbulis” is present in everyone who was included in the sample. The very possibility that a person had neither a positive nor a negative attitude towards Burbulis was excluded. The position “I can’t say” would seem to include this kind of respondents, but this includes not only those who do not have an opinion, but also those who do not have the attribute itself.

In sociological measurement, a type of artificially created emergent variable often arises—variables generated by the procedure itself. People who did not have any relationship to the characteristic being studied before the interview construct this relationship in the process of interpersonal communication with the interviewer, answering “positively,” “negatively,” or most often “neutral.” The causes of emergent variables are most related to the influence of the interviewer.

G. A. Pogosyan shows typical circumstances under which variables describe not so much the independent speech behavior of the respondent, but rather the situation of data collection. In particular, Pogosyan showed that prompting an answer significantly changes the frequency distribution.

The table shows that the “hint” significantly increases the number of people who believe that good specialists have the most favorable chances for promotion, and almost as much reduces the number of those who indicate obsequiousness. Assuming that open-ended questions provide greater opportunity for independent opinion expression, the prompt leads to an artifact: 62% chose the appropriate version of the answer rather than expressing their opinion.

When designing variables, the sociologist seeks to ensure that they correspond to the actual behavior of the object. At the same time, he is obliged to organize them in a logical manner, neglecting the fact that “life” is often illogical and ambiguous. Here a dilemma emerges: either describe life in all its inconsistency, or construct diagrams. In the first case, it is better for a sociologist to choose a career as a writer; in the second case, it is necessary to try to ensure that the logical scheme corresponds to reality.

The demands of one-to-one correspondence and a single basis contain a certain violence against “human” reality. In life, “yes” often turns into “no,” “democrats” call themselves communists, and a plus turns out to be a minus. It is best to work with nominal variables, which are assumed to be most consistent with the language of social interaction and behavior. Nominal measurements in sociological and socio-economic research are regarded as fundamental for understanding the very nature of social reality. S.V. Chesnokov bases this conclusion on the assumption that nominal variables are the final result of procedures for empirical verification of theoretical concepts whenever the object of research is, to one degree or another, people, their consciousness and behavior. “This is due to the fact,” writes S.V. Chesnokov, “that both the sociologist-researcher and the people who expressed good will to contact the sociologist in the role of respondents express their reactions, form and describe the social in images and concepts, the signs of which are words, not numbers.” This implies the assumption that the possibilities of numerical data analysis are limited. S.V. Chesnokov calls any act of naming a humanitarian measurement, and determinacy analysis is the establishment of rules of the form “if a, then b,” where a and b are names.

Without a doubt, nominal variables that record specific values lie at the foundation of the sociological vocabulary. However, this feature of them is rooted not so much in the “living language” of social communication, but in the equivalence of variable values to protocol fact-recording statements. These kinds of nominal “protocols,” regardless of their content, lie at the foundation of any scientific description. Scales (continuums) themselves are ways of organizing nominal values in idealized metrics, but in any case the requirement of one-to-one correspondence between the unit and the value of the variable must be met.

The requirements for nominal measurements (identifications) must also be met for scales of a higher level: ordered, interval and metric.

An ordered scale differs from a nominal one in that its gradations are arranged in a certain order relative to increasing or decreasing intensity of the property.

The ordered class includes rating scales, attitudes and preferences. In sociology, two types of ordered scales are used: ranks (ratings) and points. Ranks are established by assigning a place to an object such that the number of places is exactly equal to the number of objects. For example, you can distribute students by level of preparation and assign each one a place, from first to last. In other words, we rank them, knowing that regardless of the level of knowledge in the group there must be a first and a last. A similar system of production incentives, based on the idea of rewarding the first at the expense of the last, was used in the 1960s by V.M. Yakushev in an experiment at one of the design bureaus; the experiment became known as “Pulsar”. Since in any case someone will be the last, the group is placed in conditions of competition and struggle for survival.

Rating as a type of social assessment is the norm of a certain type of culture, based on the priority of individual interest over collective interests. Life and professional success is conceptualized here as victory over others. In this kind of game, it is considered stupid and even immoral to let a classmate cheat on a test - after all, this means losing to him in the competition. After all, they shoot horses, don't they? All this happens not only in studies, but also in business, family, communication, and religion. The theory of rational choice is based precisely on the idea of optimizing individual behavior with limited resources.

Point scales operate not with places, but with scale values. These values are independent of each other. In a sense, the point scale has egalitarian origins. All students, including the first and last, can get C's and be happy according to the theory of relative deprivation. However, the reliability of this kind of scale is very questionable, especially in cases where numbers are used to indicate marks. The distance from 4 to 5 is not the same as the distance from 2 to 3. Each teacher has his or her own preference for where on the continuum to place students. One gives 2s and 3s, another 4s and 5s. How can they be compared? There are no big difficulties here, since individual values can be normalized relative to the average score or standard deviation of scores for each teacher.
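The normalization mentioned above can be sketched as a standard score (z-score) computed separately for each teacher. The marks are hypothetical, and using both the mean and the standard deviation together is an assumption; the text mentions them only as possible reference points.

```python
from statistics import mean, pstdev

def standardize(marks):
    """Express each mark as a distance from the teacher's own mean,
    in units of that teacher's standard deviation.

    Assumes the marks are not all identical (otherwise the deviation is 0).
    """
    m, s = mean(marks), pstdev(marks)
    return [(x - m) / s for x in marks]

# Hypothetical marks given to five students by two teachers.
strict_teacher = [2, 3, 3, 2, 3]
lenient_teacher = [4, 5, 5, 4, 5]

z_strict = standardize(strict_teacher)
z_lenient = standardize(lenient_teacher)

# After standardization the two sets of marks tell the same story.
print([round(z, 2) for z in z_strict])
print([round(z, 2) for z in z_lenient])
```

Treating marks this way presupposes that they can be handled as at least quasi-interval data, which is exactly the assumption questioned in the paragraph above.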

Ordered rating scales involve a logical balancing of positions relative to a neutral center. This requirement reflects a more general rule for constructing scales: each category of the scale must be characterized by an equal probability of “hitting” the object, subject to random distribution. In other words, the number of gradations to the right of the center should be equal to the number of gradations to the left. Often the value “Can’t Tell” is used as the “center” of the scale. This creates obvious ambiguity in the interpretation of the data. “Can’t say” means that the respondent cannot choose any of the proposed positions; but if “Can’t say” is at the center of a balanced scale, it means “I find it difficult to prefer anything.”

When the values of an ordered rating scale do not have clearly defined boundaries, the scale becomes semi-ordered. In fact, semi-ordered scales are most often used in sociological and psychological research.

Interval scales are based on procedures that ensure equal or approximately equal distances between gradations of a variable. In this case, it is not the values of the variables that are compared, but the distances between the values. In other words, any two measurements of a given empirical system, carried out on an interval scale, are converted into each other using a linear function.

If on an ordinal scale the sequence of objects is established without much difficulty, the interval scale involves solving the problem of comparing distances between objects. This property of linear transformations, characteristic of interval scales, is demonstrated by a numerical example: (5 - 2) / (2 - 1) = (24 - 15) / (15 - 12) = 3. The ratio of the differences between scale values is constant in this case. If one of the objects of an interval scale is mapped to zero, we can talk about a ratio scale - a special case of an interval scale. In this case, the origin of the scale is fixed.
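The numbers in the example above are consistent with the linear transformation y = 3x + 9, which maps 1, 2, 5 to 12, 15, 24. This particular function is inferred from the values and is not stated in the text. The sketch below checks that the ratio of differences is preserved, while the ratio of the values themselves is not.

```python
def ratio_of_differences(a, b, c):
    """Ratio of the distance b..c to the distance a..b."""
    return (c - b) / (b - a)

def transform(x):
    """Assumed linear transformation, inferred from the example values."""
    return 3 * x + 9

original = (1, 2, 5)
rescaled = tuple(transform(x) for x in original)

ratio_original = ratio_of_differences(*original)
ratio_rescaled = ratio_of_differences(*rescaled)

# Ratios of the raw values are not preserved on an interval scale.
value_ratio_original = original[2] / original[0]
value_ratio_rescaled = rescaled[2] / rescaled[0]

print(rescaled, ratio_original, ratio_rescaled)
print(value_ratio_original, value_ratio_rescaled)
```

This is why statements of the form "five times as much" are meaningful only once the zero point is fixed, that is, on a ratio scale.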

You can build an interval scale using paired comparisons or, as L. Thurstone did, using judging procedures. First, an array of relevant judgments is created that describe the attribute being measured, for example, an attitude, disposition, or evaluation. Then the experts (judges) are asked to sort the judgments into categories from the greatest intensity of the attribute to the least. It is assumed that the distribution of the judges' ratings around scale values follows a normal law. Those judgments on which the judges' scores agree are selected. This is the method of “equal-appearing intervals.” The most well-known methods for constructing interval scales were developed by L. Thurstone, R. Likert, and L. Guttman. However, they are rarely used in modern sociology.
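The selection step of the judging procedure can be sketched as follows. The ratings, the eleven-category format and the cut-off are hypothetical. Taking the median as the scale value of a judgment and the interquartile range as the measure of disagreement among judges is the usual textbook reading of this method, assumed here rather than taken from the text.

```python
from statistics import median, quantiles

# Hypothetical category placements (1..11) by seven judges per statement.
judge_ratings = {
    "A": [9, 10, 10, 9, 10, 9, 10],   # judges agree: strong intensity
    "B": [1, 11, 6, 3, 9, 2, 10],     # judges disagree
    "C": [2, 2, 3, 2, 3, 2, 2],       # judges agree: weak intensity
}
MAX_IQR = 2.0  # assumed cut-off for acceptable disagreement

def summarize(ratings):
    """Return (scale value, spread) for one statement."""
    q1, _, q3 = quantiles(ratings, n=4)
    return median(ratings), q3 - q1

# Keep only statements on which the judges' placements agree.
selected = {}
for statement, ratings in judge_ratings.items():
    scale_value, spread = summarize(ratings)
    if spread <= MAX_IQR:
        selected[statement] = scale_value

print(selected)
```

Statement B is dropped because the judges place it almost anywhere on the continuum, so it cannot serve as a stable marker of any particular intensity.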

Metric, or absolute, scales meet all the requirements for scales of lower classes; they have not only a zero reference mark, but also a unit of measurement of time, distance or number of units. All transformations with numbers are allowed here.

Attribution of meaning to objects occurs in three forms: verbal, graphic and numerical. Verbal interpretation of variables is most common in mass surveys. The elements of the scale here are judgments indicating opinions, values, and states. How adequate this evidence is constitutes a separate problem. One thing is clear: the judgments themselves are nothing more than evidence of the reality that stands behind them. Therefore, the verbal interpretation of the scale plays the role of a kind of probe in the language of everyday life. Its fundamental difference from everyday speech lies in its clear conceptual structure, adapted to a variety of speech situations and contexts. Even an open-ended question, seemingly maximally focused on the respondent’s vocabulary, works only if there is unambiguous conceptual coding.

Verbally interpreted scale positions are perceived quite clearly if there are few of them. But difficulties begin already when choosing among five gradations. For example, the distinction between the categories “satisfied” and “rather satisfied than dissatisfied” is largely a matter of convention. In a seven-point scale, the possibilities of verbal interpretation are exhausted. Here a graphic design of the scale is preferable, since it makes a standard reading possible. Graphic interpretation of the scale is used in so-called cross-cultural studies, where the vocabulary of the instrument requires translation into the respondent’s language. Visualizing a variable in a drawing is supposed to create a universal “pattern” of the scale. Gestures are used in a similar way in international communication. One example of an instrument in graphic form is the set of pictures of the Thematic Apperception Test. Scales are often depicted as rulers and pictograms. Hadley Cantril developed the “ladder of happiness”: on a drawing of the ladder, the respondent must mark his current position relative to the best (top of the ladder) and worst (bottom of the ladder) combination of circumstances, and then indicate the direction of his intended movement along the “ladder of happiness”. In one of the early versions of the attitude scale, L. Thurstone proposed an eleven-point continuum in the form of a thermometer.

Numerical interpretation is sometimes mistakenly identified with verbal interpretation. The use of numerals as names does not introduce a metric. For example, for coding purposes, men might be designated as 1 and women as 2. In this case the numerals are used as labels, not as numbers. Numbers presuppose additivity and arithmetic operations. The range of numerical scales is limited to the interval and metric levels of measurement, where units of the intensity of a property are established.


Types of statistical scales: nominal scale, ordinal scale, interval scale, ratio scale.

The nominal scale represents the lowest level of measurement, which assumes only minimal prerequisites for measurement. When measuring at this level, practically no numbers are used. Here it is important to establish the similarity or difference of objects according to some characteristic, i.e., in this case we are dealing with qualitative data. Let's look at examples.

The distribution of students by class, by gender, by place of residence, by the types of sports they play, by the number of children in the family are examples of nominal scale values. In this case, it is possible to distribute students according to two or more characteristics (two-dimensional or multidimensional data).

By counting, you can establish the frequency of a particular category (the number of boys and girls in school; the number of students living in each microdistrict; the number of students in each class; the number of students involved in a particular sport; the number of companies involved in the production of buses, etc.). In this case, it is possible to determine the most frequently occurring value (the class in which the largest number of students study; the type of sport that is most popular among students; the type of car produced by the largest number of companies). Categories of data on a nominal scale are designated, as a rule, verbally.
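Counting frequencies and finding the most frequent category can be sketched in a few lines. The list of sports below is made-up illustrative data, not taken from any study.

```python
from collections import Counter

# Hypothetical nominal data: the sport each student plays.
sports = ["football", "chess", "football", "swimming", "football", "chess"]

frequencies = Counter(sports)                     # category -> number of students
mode, mode_count = frequencies.most_common(1)[0]  # most frequently occurring value

print(dict(frequencies))
print(mode, mode_count)
```

Only counts and the most frequent category are computed here; an average of the category labels would have no meaning on a nominal scale.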

An ordinal, or rank, scale indicates only the order of the carriers of a trait or the direction in which the trait becomes more pronounced.

For example, students can be ranked based on the number of test items they complete correctly. Let students A, B, C, D, E correctly complete 21, 16, 12, 9 and 3 tasks, respectively. Graphically it can be depicted like this

This ordinal scale has values from 1 to 5, and students are placed on it depending on the number of correctly completed tasks: A is first, E is fifth. The figure shows that the intervals separating the places in the row are different in size. For this reason, it is not meaningful to add, subtract, multiply, or divide ordinal values.
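The student example can be reproduced as a small sketch that assigns ranks and then looks at the score gaps between neighbouring places. The scores are those given in the text; the variable names are illustrative.

```python
scores = {"A": 21, "B": 16, "C": 12, "D": 9, "E": 3}

# Rank 1 goes to the student with the most correctly completed tasks.
ordered = sorted(scores, key=scores.get, reverse=True)
ranks = {student: place for place, student in enumerate(ordered, start=1)}

# Score gaps between neighbouring places: equal rank steps, unequal distances.
gaps = [scores[a] - scores[b] for a, b in zip(ordered, ordered[1:])]

print(ranks)
print(gaps)
```

Each step in rank is "one place", yet the underlying gaps differ, which is why arithmetic on the ranks themselves is not meaningful.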

On an interval scale, equal intervals reflect the same amount of the measured characteristic. For example, 1 cm between 3 and 4 centimeters on a length measurement scale has the same meaning as 1 cm between 82 and 83 centimeters. In other words, on an interval scale, the distances between adjacent divisions are equal. On an interval scale, the question “by how much?” is quite meaningful. But the question “how many times?” cannot always be asked when using an interval scale. The fact is that on the interval scale the reference point (scale zero), the unit of measurement and the reference direction are set arbitrarily. An example of an interval scale is the Celsius temperature scale. The difference between air temperatures of +30 and +20 °C is as great as between -10 and -20 °C. However, it cannot be said that at an air temperature of +30 °C it is one and a half times warmer than at a temperature of +20 °C. Even if the air temperature is 0 °C, it cannot be said that there is no heat at all: after all, the starting point is chosen arbitrarily.
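The temperature example can be made concrete by expressing the same two temperatures on the Kelvin scale, whose zero is not arbitrary. This is a minimal sketch; the helper name is illustrative.

```python
def celsius_to_kelvin(t_c):
    """Shift the arbitrary Celsius zero to absolute zero."""
    return t_c + 273.15


warm, cool = 30.0, 20.0

# "By how much?" gives the same answer on both scales.
diff_c = warm - cool
diff_k = celsius_to_kelvin(warm) - celsius_to_kelvin(cool)

# "How many times?" depends on where the zero point is placed.
ratio_c = warm / cool
ratio_k = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)

print(diff_c, round(diff_k, 6))
print(ratio_c, round(ratio_k, 3))
```

The difference survives the change of zero point, while the ratio of 1.5 computed from the Celsius values does not, so it says nothing about how much "warmer" it is.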

The scales on most physical instruments (ammeter, voltmeter, etc.) are interval. The IQ scale is an interval scale.

The interval scale is metric and can be used to perform addition and subtraction. It has significant advantages over the nominal and ordinal scales.

A ratio scale, or proportional scale, makes it possible to establish ratios between the values of the measured characteristic, because the scale value “0” corresponds to the absence of the measured characteristic. In other words, the origin on these scales is not chosen arbitrarily. Examples of ratio scales are measures of length (m, cm, etc.) and mass (kg, g, etc.). An object 100 cm long is twice as long as an object 50 cm long.

Sometimes data needs to be transformed. In particular, the need for this arises when one or more values in a data series greatly exceed the rest. If the data are clearly skewed, each value of the data set can be replaced with its logarithm in order to simplify statistical analysis. Taking logarithms converts skewed (asymmetrical) data into more symmetrical data: the scale “stretches” near zero, so small values that were grouped together are spread out along the scale, while large values at the right end of the scale are brought closer together. The most commonly used are decimal and natural logarithms. Equal distances on a logarithmic scale correspond on the original scale to equal percentage increases, rather than equal increases in value.
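The effect of a logarithmic transformation can be seen on a toy data set. The values below are hypothetical and chosen so that one value greatly exceeds the rest; decimal logarithms are used, though natural logarithms would behave the same way.

```python
import math

# Hypothetical right-skewed data: one value far exceeds the others.
values = [1, 2, 4, 8, 1000]
logs = [math.log10(v) for v in values]

# Equal percentage increases (here, doubling) become equal log distances.
steps = [b - a for a, b in zip(logs, logs[1:4])]

print([round(x, 3) for x in logs])
print([round(s, 3) for s in steps])
```

On the original scale the steps 1 to 2, 2 to 4 and 4 to 8 grow in size, while on the logarithmic scale they are all the same length, and the outlying value 1000 is pulled much closer to the rest.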

^ Checking for a normal distribution.

Numerous methods for processing interval scale variables are based on the hypothesis that their values follow a normal distribution. With this distribution, most of the values are grouped around a certain average value, on both sides of which the frequency of observations falls off symmetrically.

As an example, consider the age distribution constructed from the data of a hypertension study (file hyper.sav) using the Histogram... command of the Graphs menu (see Fig. 5.1).

The diagram shows a normal distribution curve (the Gaussian bell curve). The actual distribution deviates to a greater or lesser extent from this ideal curve. Samples that strictly obey the normal distribution, as a rule, do not occur in practice. Therefore, it is almost always necessary to find out whether the real distribution can be considered normal and how significantly the given distribution differs from normal.

Before applying any method that assumes a normal distribution, the presence of the latter must be checked first. A classic example of a statistical test that assumes a normal distribution is Student's t-test, which compares two independent samples. If the data do not follow a normal distribution, an appropriate nonparametric test should be used; in the case of two independent samples, this is the Mann-Whitney U test.
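To give a feel for what the nonparametric alternative computes, here is a minimal sketch of the Mann-Whitney U statistic written in plain Python. It computes only the statistic, not the error probability p, which in practice is taken from tables or from a statistics package. The two groups are hypothetical data and the function name is illustrative.

```python
def mann_whitney_u(sample_x, sample_y):
    """U statistic for sample_x: each pair with x > y counts 1, each tie 0.5."""
    u = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical measurements from two independent groups.
group_a = [12, 15, 11, 18]
group_b = [9, 10, 14]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(u_a, u_b)
```

The statistic depends only on the ordering of the values, not on their distances, which is why the test does not require normally distributed data. The two U values always add up to the number of pairs, here 4 × 3.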

If visual comparison of the actual histogram with the bell curve seems insufficient, you can apply the Kolmogorov-Smirnov test, which is found in the Analyze menu among the nonparametric tests (see Section 14.5).

Fig. 5.1: Age distribution

In our age distribution example, the Kolmogorov-Smirnov test does not show a significant deviation from the normal distribution.
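The idea behind the Kolmogorov-Smirnov test can be sketched as the largest gap between the empirical distribution function of the sample and a normal distribution function. The code below is an illustration under stated assumptions: the normal curve is fitted with the sample mean and standard deviation, only the test statistic D is computed (not the error probability), and the ages are made-up values, not the hyper.sav data.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    data = sorted(sample)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    d = 0.0
    for i, x in enumerate(data, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Hypothetical ages, roughly bell-shaped.
ages = [41, 47, 52, 55, 58, 60, 62, 65, 69, 75]
# Hypothetical strongly skewed sample for comparison.
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]

print(round(ks_statistic_normal(ages), 3))
print(round(ks_statistic_normal(skewed), 3))
```

A small D means the sample follows the fitted bell curve closely; the skewed sample produces a much larger D.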

^ Dependence and independence of samples.

Two samples depend on each other if each value of one sample can be assigned in a natural and unambiguous way to exactly one value of the other sample. The dependence of several samples is determined in the same way.

Most often, dependent samples occur when measurements are taken at multiple points in time. The values of a parameter of the process under study that correspond to different points in time then form dependent samples.

In SPSS, dependent (also related, paired) samples will be represented by different variables that are compared with each other in a corresponding test on the same set of observations.

If a regular and unambiguous correspondence between samples is not possible, these samples are independent. In SPSS, independent samples contain different observations (for example, from different respondents), which are usually distinguished by a group variable related to a nominal scale.
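The difference between the two data layouts can be illustrated outside SPSS with plain Python structures. All names and values below are hypothetical; the sketch only mirrors the description above, with paired samples stored as two variables on the same observations and independent samples stored as one variable plus a nominal group variable.

```python
# Dependent (paired) samples: each subject contributes one value to each variable.
paired = [
    {"subject": 1, "before": 150, "after": 142},
    {"subject": 2, "before": 160, "after": 155},
    {"subject": 3, "before": 145, "after": 147},
]
# The natural one-to-one correspondence allows a difference per subject.
differences = [row["after"] - row["before"] for row in paired]

# Independent samples: one measured variable, split by a nominal group variable.
records = [
    {"group": "treated", "value": 142},
    {"group": "control", "value": 158},
    {"group": "treated", "value": 150},
    {"group": "control", "value": 161},
]
samples = {}
for row in records:
    samples.setdefault(row["group"], []).append(row["value"])

print(differences)
print(samples)
```

In the second layout there is no meaningful way to pair an individual "treated" value with an individual "control" value, which is what makes the samples independent.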

^ A review of common tests for testing hypotheses about the mean.

In the most common situation, where different samples need to be compared with each other based on their means or medians, subject to the conditions described in Section 5.1, one of the following eight tests is usually used.

^ Variables related to the interval scale and subject to normal distribution

^ Variables that are on an ordinal scale or variables that are on an interval scale but are not normally distributed

^ Probability of error.

In analytical statistics, methods have been developed for calculating so-called test statistics, which are computed by certain formulas from the data contained in the samples or from characteristics derived from them. These test statistics follow certain theoretical distributions (the t-distribution, F-distribution, χ² distribution, etc.), which allow the so-called error probability to be calculated. This is the probability of making an error by rejecting the null hypothesis and accepting the alternative.

Probability is defined in mathematics as a value ranging from 0 to 1. In practical statistics, it is also often expressed as a percentage. Typically, probability is denoted by the letter p:

0 ≤ p ≤ 1

The probability of error at which it is acceptable to reject the null hypothesis and accept the alternative hypothesis depends on each specific case. To a large extent, this probability is determined by the nature of the situation being studied. The more certain one needs to be of avoiding an erroneous decision, the narrower the limits of the error probability at which the null hypothesis is rejected; these limits are the so-called confidence interval of the probability.

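There is generally accepted terminology that refers to probability confidence intervals. Statements with a probability of error p ≤ 0.05 are called significant, those with p ≤ 0.01 very significant, and those with p ≤ 0.001 maximally significant.

Probability of error    Significance            Designation
p > 0.05                Not significant         ns
p ≤ 0.05                Significant             *
p ≤ 0.01                Very significant        **
p ≤ 0.001               Maximum significance    ***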
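The terminology can be expressed as a small lookup function. This is a sketch under an assumption: the thresholds 0.05, 0.01 and 0.001 are the conventional cut-offs for the designations ns, *, ** and ***, and the function name is hypothetical.

```python
def significance_label(p):
    """Map an error probability p to the conventional designation."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p <= 0.001:      # assumed conventional threshold
        return "***"    # maximum significance
    if p <= 0.01:       # assumed conventional threshold
        return "**"     # very significant
    if p <= 0.05:
        return "*"      # significant
    return "ns"         # not significant


for p in (0.2, 0.03, 0.004, 0.0005):
    print(p, significance_label(p))
```

The checks run from the strictest threshold to the weakest, so each value of p receives the strongest designation it qualifies for.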

^ Confidence interval of probability.

A confidence interval is a term used in mathematical statistics for interval (as opposed to point) estimation of statistical parameters, which is preferable with a small sample size. A confidence interval is an interval that covers an unknown parameter with a given reliability.

The confidence interval of the parameter θ of the distribution of a random variable X with confidence level 100p%, generated by the sample (x1, …, xn), is the interval with boundaries l(x1, …, xn) and u(x1, …, xn), which are realizations of random variables L(X1, …, Xn) and U(X1, …, Xn) such that

P(L ≤ θ ≤ U) = p.

The boundary points of the confidence interval are called confidence limits.

An intuition-based interpretation of the confidence interval would be: if p is large (say 0.95 or 0.99), then the confidence interval almost certainly contains the true value θ .
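As an illustration, an approximate confidence interval for a mean can be computed with the Python standard library. This is a sketch under stated assumptions: it uses a normal quantile, which is reasonable for larger samples, whereas for small samples a Student's t quantile would usually be preferred; the data and the function name are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, p=0.95):
    """Approximate interval for the mean based on a normal quantile."""
    n = len(sample)
    centre = mean(sample)
    standard_error = stdev(sample) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + p / 2)  # two-sided quantile for level p
    return centre - z * standard_error, centre + z * standard_error


# Hypothetical measurements.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

lower, upper = mean_confidence_interval(sample, p=0.95)
lower99, upper99 = mean_confidence_interval(sample, p=0.99)
print(round(lower, 3), round(upper, 3))
print(round(lower99, 3), round(upper99, 3))
```

The two returned numbers play the role of the confidence limits, and raising p from 0.95 to 0.99 widens the interval.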

^ Descriptive analysis.

This type of analysis involves a descriptive presentation of individual variables. This includes creating a frequency table, calculating statistical characteristics, or graphical representation. Frequency tables are constructed for variables related to the nominal scale and for ordinal variables that do not have too many categories (see Chapters 6, 12 and 24).

For variables related to the nominal scale, no meaningful statistical characteristics can be calculated. For ordinal variables and for variables related to the interval scale but not subject to a normal distribution, the median and both quartiles are most often calculated (see section 6.2); if the number of categories is small, the option for concentrated data can be used (see section 6.3).

For variables on an interval scale and subject to a normal distribution, the mean and the standard deviation or standard error are most often calculated (see section 6.2). However, only one of these two measures of spread should be selected. For variables on all statistical scales, a wide variety of graphs can be constructed that present frequencies, means, or other characteristics.
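The characteristics named above can be computed with the Python standard library. The data are hypothetical; note that `statistics.quantiles` uses one particular interpolation rule by default, so other programs may report slightly different quartiles.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical observations.
data = [2, 3, 3, 4, 5, 5, 6, 7, 9, 16]

# For ordinal or non-normal interval data: median and both quartiles.
q1, q2, q3 = quantiles(data, n=4)

# For normally distributed interval data: mean with one measure of spread.
m = mean(data)
sd = stdev(data)
se = sd / len(data) ** 0.5  # standard error of the mean

print(median(data), q1, q3)
print(m, round(sd, 3), round(se, 3))
```

The standard deviation describes the spread of the individual values, while the standard error describes the uncertainty of the mean, which is why a report should state which of the two it gives.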

^ Analytical statistics.

Almost any statistical analysis, along with purely descriptive operations, includes certain analytical methods (significance tests), the application of which ultimately determines the probability of error p (see Section 5.3).

A large battery of tests is used to determine whether two or more different samples differ in their means or medians. This takes into account the difference between independent samples (different observations) and dependent samples (different variables; see section 5.1.3). Depending on the number of samples (two or more), whether the samples are dependent or not, whether the variables belong to an interval or ordinal scale, or whether they are subject to a normal distribution, specialized tests are used (see section 5.2).

A very common situation occurs when different groups of observations or values of variables related to a nominal scale are compared. In this case, contingency tables are built (see Chapter 11). Another group of tests concerns the study of relationships between two variables, that is, identifying correlations and fitting regressions (see Chapter 15, section 16.1).

In addition to these fairly simple statistical methods, there are also more complex methods of multivariate analysis, which usually use many variables at the same time. For example, if you want to reduce a large number of variables to a smaller number of “bundles of variables,” called factors, then factor analysis is performed (Chapter 19). If our goal is the opposite - to combine given observations, forming clusters from them, then cluster analysis is used (Chapter 20).

In a certain group of multivariate tests, a distinction is made between a dependent variable, also called the target, and several independent variables (influence or prediction variables).


Dependent variable      Independent variables            Multivariate method
Dichotomous             Any                              Binary logistic regression (Section 16.4); discriminant analysis (Chapter 18)
Dichotomous                                              Logit-log linear models
With nominal scale      With nominal or ordinal scale    Multinomial logistic regression (Section 16.5)
With ordinal scale      With nominal or ordinal scale    Ordinal regression (Section 16.6)
With interval scale     With nominal or ordinal scale    Analysis of variance (Section 17.1)
With interval scale     Any                              Analysis of covariance (Section 17.2); multiple regression analysis (Section 16.2)

Multinomial logistic regression and ordinal regression can also use interval scale covariates.

The independent variables related to the nominal scale in binary logistic regression, discriminant analysis, and multiple regression analysis must be dichotomous or decomposed into a set of dichotomous variables (see Section 16.2). Logit-log linear models are not discussed in this book; they are covered in the second volume, devoted to methods of market and public opinion research.


