Image and its orthogonal transformations. Algorithms for digital image compression using orthogonal transformations

Based on the DFT and the Walsh-Hadamard and Haar transforms, a number of other orthogonal transforms can be constructed. They can be defined either via the Kronecker product or as a sum of Kronecker products. For example, a hybrid Hadamard-Haar transform has been proposed, whose matrix of a given order is defined as

The paper gives a recursive definition of the so-called modified Hadamard transform

and its connection with the Haar transform is indicated.

We consider the matrix of the so-called generalized Walsh transform (the transform in Vilenkin-Chrestenson functions), defined as a Kronecker power of the matrix

The work also describes the so-called -transform, built from the Walsh-Hadamard transform by replacing each sum in expression (3.114) with its absolute value. This transform is not invertible.

Also worth mentioning are the slant transform, slant-Haar transform and discrete basis transform proposed for image encoding.

It can be shown that most of the unitary transforms currently used in image processing can be represented as sums of Kronecker products of elementary matrices, permutation matrices, and a few others. Such representations of the Haar, Hadamard, Walsh, and Walsh-Paley matrices, the modified Hadamard matrix, the Hadamard-Haar matrix, the DFT matrix, and the generalized Walsh matrix are shown in Table 3.5, using the following notation:

- a permutation matrix of the given dimension which, when multiplied by a vector, rearranges the vector's elements according to the bit-reversed binary code of their indices; - a permutation matrix that permutes vector elements according to the inverse Gray code of their indices; - the Kronecker product of matrices; - the Kronecker power of a matrix.

(see scan)

Continuation of Table 3.5 (see scan)

This representation provides a convenient basis for comparing transforms. For example, comparing the representations of the matrices, it is easy to notice that they differ only in the reversed order of the factor matrices in each term; the MHAD matrix differs from the Hadamard matrix HAD in the elementary matrices on which it is built, and so on. For all these matrices there exist fast algorithms for multiplying them by a vector when the transform is performed. This fact is directly related to the representability of the matrices as sums of Kronecker products (see Chapter 4).

Based on the one-dimensional transforms described, the corresponding two-dimensional separable transforms can be constructed as a double application of the one-dimensional ones:

where M is one of the transformation matrices described above, a is the two-dimensional discrete signal, and A is its transform.

Note that all unitary image transforms currently used in digital image processing are separable, i.e., they are performed separately along the columns and the rows of the two-dimensional signal. This reduces the number of operations required to compute them. Separable transforms can also be constructed by choosing different matrices for the row and column transforms:

In this way one obtains mixed transforms, which are used in specialized digital image-coding devices (see, for example,).

Applications of unitary transformations in image processing can be divided into three groups:

Image coding;

Feature extraction for image preparation and recognition;

Generalized filtering.

Image coding is currently the main application of transforms (apart from the DFT). Moreover, some transforms (for example, the slant transform and the discrete linear basis transform) were introduced specifically for use in coding.

The coefficients of the signal representation obtained as a result of its transform can be regarded as its features and used in image preparation (see Part II, Chapter 7) and in recognition. An example of a transform invented specifically for feature extraction in recognition is the -transform. The applications of transforms to coding and to recognition are related: as a rule, transforms that give better results for coding are also better for feature extraction.

The use of unitary transforms for signal filtering is based on a generalization of the concept of filtering in the frequency domain of the discrete Fourier transform. When filtering signals with the DFT, the following signal transformation is performed:

Here the matrices involved are the transition matrices from the transform T to the DFT and back.

This approach was proposed to generalize optimal linear (Wiener) filtering (see also).

Depending on the type of the transform T and the properties of the required filter, the complexity of the filtering operation (3.139), measured, say, by the number of operations, may vary. In particular, it may turn out to be more advantageous to use the faster Walsh-Hadamard transform instead of the DFT, despite the greater complexity of multiplication by the non-diagonal filter matrix in this case (see also § 6.5).

Orthogonal transforms are among the most common tools for processing both one-dimensional and multidimensional signals, including images. Their role is especially great in reducing the binary symbol rate in digital television and, consequently, the required frequency band of communication channels. The essence of an orthogonal transform is to represent the original signal as a sum of orthogonal basis functions.

Recall that functions x(t) and y(t) are called orthogonal on the interval (t1, t2) if their scalar product equals zero:

∫ from t1 to t2 of x(t)·y(t) dt = 0.

This definition can be extended to discrete signals represented by sequences of numbers. Discrete signals x(n) and y(n), each of N samples, are called orthogonal if the condition

x(0)·y(0) + x(1)·y(1) + ... + x(N−1)·y(N−1) = 0

is satisfied.
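As a quick numerical illustration (a plain-Python sketch, not part of the source text), the discrete orthogonality condition can be checked directly for two sampled harmonics with different integer frequencies:

```python
import math

def inner(x, y):
    """Scalar product of two discrete signals of equal length."""
    return sum(a * b for a, b in zip(x, y))

N = 8
# Two sampled cosines with different integer frequencies taken over a full period
x = [math.cos(2 * math.pi * 1 * n / N) for n in range(N)]
y = [math.cos(2 * math.pi * 2 * n / N) for n in range(N)]

print(abs(inner(x, y)) < 1e-12)  # True: the signals are orthogonal
print(inner(x, x))               # positive: a signal is not orthogonal to itself
```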

One of the best-known examples of an orthogonal transform is the expansion of a periodic signal x(t) into a Fourier series

where T is the repetition period of the signal x(t).

The real coefficients of the Fourier series are determined by the relations

In complex form, the Fourier series expansion has the form:

- the complex harmonic amplitudes;

j is the imaginary unit.

Not only a periodic signal with a period T, but also a signal that differs from 0 only over the time interval (-T/2, T/2) can be expanded into a Fourier series. In this case, a periodic continuation of the signal along the entire time axis with a period T is used.

Consider a discrete signal x(n) that is non-zero for n = 0, 1, ..., N-1. For such a signal one can also introduce an expansion in a basis of sinusoidal functions. Since the frequency spectrum of a sampled signal must be bounded from above in accordance with the condition of Kotelnikov's theorem, only a finite number of frequency components remain in the expansion of the discrete signal; these are discrete complex harmonic functions. This expansion, called the discrete Fourier transform (DFT), has the form

x(n) = Σ (k = 0 to N−1) X(k)·exp(j·2πkn/N),  n = 0, 1, ..., N−1,  (2.6)

where the DFT coefficients X(k) are determined by the relation

X(k) = (1/N)·Σ (n = 0 to N−1) x(n)·exp(−j·2πkn/N),  k = 0, 1, ..., N−1.  (2.7)

Recall that finding the coefficients X(k) from (2.7) is usually called the direct DFT, while recovering the signal from these coefficients according to (2.6) is the inverse DFT.

In these relations sums appear instead of integrals because the original signal is discrete rather than continuous. The frequency used in the expansion of analog signals, which has the dimension rad/s, corresponds in the DFT to the dimensionless quantity 2πk/N, where k = 0, 1, ..., N−1. The ratio k/N shows what fraction of the sampling frequency the frequency of a given discrete harmonic constitutes.
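A direct (slow) implementation of relations (2.6) and (2.7) can be sketched as follows; placing the 1/N factor in the direct transform is one common convention and is assumed here:

```python
import cmath

def dft(x):
    """Direct DFT per (2.7); the 1/N factor is placed in the direct
    transform here (it may equally be placed in the inverse one)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

def idft(X):
    """Inverse DFT per (2.6): the signal as a sum of discrete complex harmonics."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)
y = idft(X)
print(max(abs(a - b) for a, b in zip(x, y)))  # ~0: the transform pair is invertible
```

With this normalization X(0) is simply the mean value of the signal.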

The DFT coefficients X(k) and the exponential factors in (2.6), (2.7) are complex numbers. Each complex number is stored in digital memory as a pair of real numbers representing its real and imaginary parts. Adding two complex numbers requires two additions of real numbers: the real and imaginary parts are added separately. Multiplying two complex numbers requires four multiplications and two additions of real numbers. Thus, performing the DFT in complex form leads to a significant increase in the required memory and computation time.

To deal only with real numbers, the discrete cosine transform (DCT) is usually used instead; its expansion is described by the relation:

where the DCT coefficients are determined by the formulas

As in the case of DFT, finding the coefficients C(k) according to (2.9) is called direct DCT, and representing the signal in the form (2.8) is called inverse DCT.
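A minimal sketch of the direct/inverse DCT pair, assuming the widely used DCT-II/DCT-III convention with orthonormal scaling (the normalization in (2.8), (2.9) may differ):

```python
import math

def dct(x):
    """Direct DCT (DCT-II with orthonormal scaling assumed)."""
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
        out.append((math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)) * s)
    return out

def idct(C):
    """Inverse DCT (DCT-III with matching scaling): all arithmetic is real."""
    N = len(C)
    return [C[0] / math.sqrt(N) +
            sum(math.sqrt(2.0 / N) * C[k] *
                math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for k in range(1, N))
            for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(max(abs(a - b) for a, b in zip(x, idct(dct(x)))))  # ~0: the pair is invertible
```

Note that, unlike the DFT, no complex storage or complex multiplications are needed.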

Similarly, one can write the relations for the direct and inverse DFT and DCT in the two-dimensional case. A two-dimensional discrete signal, for example a single frame of a digital television signal, is represented by a matrix of samples x(m, n), where m = 0, ..., M−1 is the sample number within a line and n = 0, ..., N−1 is the line number within the frame.

The direct two-dimensional DFT has the form:

k = 0, ..., M−1;  l = 0, ..., N−1,

where X(k,l) are complex DFT coefficients that reflect the spatial-frequency spectrum of the image.

The inverse two-dimensional DFT represents the decomposition of the image into basis functions:

The coefficients of the direct two-dimensional DCT are determined by the formulas:

The inverse two-dimensional DCT has the form:

The quantities involved are the discrete spatial frequencies along the horizontal and vertical coordinates, respectively; they are dimensionless quantities with the same meaning as the discrete frequency in the one-dimensional case. Each discrete spatial frequency is proportional to the ratio of the spatial sampling period along the given coordinate to the spatial period of the corresponding frequency component. Spatial periods are measured in units of distance.

Fig. 2.3 shows the basis functions of the two-dimensional DCT for M = 8, N = 8 in the form of halftone pictures. Light areas correspond to positive values, dark areas to negative ones.

Fig. 2.3.

Examples shown:

  • a) k = 1, l = 0; b) k = 0, l = 1; c) k = 1, l = 1;
  • d) k = 0, l = 2; e) k = 1, l = 2; f) k = 2, l = 2;
  • g) k = 4, l = 2; h) k = 7, l = 1; i) k = 7, l = 7.

A remarkable property of decomposing a video signal in the DCT basis is that each basis function contains information about the entire image at once. The number of basis functions used to decompose the video signal determines the accuracy of the image representation.
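The basis pictures of Fig. 2.3 can be generated numerically. The sketch below (plain Python, unnormalized basis functions assumed) also verifies that distinct basis images are orthogonal, which is why each DCT coefficient carries independent information about the block:

```python
import math

M = N = 8

def basis(k, l):
    """Two-dimensional DCT basis function for an 8x8 block (unnormalized).
    Light/dark areas in Fig. 2.3 correspond to the sign of these values."""
    return [[math.cos(math.pi * (2 * m + 1) * k / (2 * M)) *
             math.cos(math.pi * (2 * n + 1) * l / (2 * N))
             for n in range(N)] for m in range(M)]

def dot(a, b):
    """Scalar product of two 8x8 basis images."""
    return sum(a[m][n] * b[m][n] for m in range(M) for n in range(N))

b10, b01 = basis(1, 0), basis(0, 1)   # the patterns of Fig. 2.3a and 2.3b
print(abs(dot(b10, b01)) < 1e-9)      # True: distinct basis images are orthogonal
```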

In accordance with , in general the cost of computing the direct and inverse DFT can be estimated as proportional to N². Similarly, it can be shown that computing the two-dimensional direct and inverse DFT requires a number of operations proportional to N²M².

For example, computing the DFT of a square image block of 8x8 elements (pixels) requires approximately 16·10³ multiplication and addition operations, while computing the DFT of a black-and-white television frame of the usual decomposition standard, containing 720x576 pixels, requires about 8·10¹¹ operations. If the computations are performed on a computer executing 10⁶ operations on real numbers per second, the DFT computation time is 8·10⁵ s, i.e., more than 200 hours. Obviously, to compute the DFT of television images in real time, i.e., within the frame scanning period, ways must be sought to reduce the number of required operations.

The most radical way to reduce the amount of computation is to use the fast DFT algorithms discovered in the 1960s, known as fast Fourier transform (FFT) algorithms. Fast algorithms for computing the DFT are described in detail in many sources and are not discussed here.

A two-dimensional FFT can be decomposed into a sequence of one-dimensional ones. The number of required operations turns out to be proportional to MN·log2(MN). For the above example of a television frame of 720x576 pixels this value is approximately 8·10⁶, which is 10⁵ times less than the number of operations required for direct computation of the DFT.
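These estimates are easy to reproduce, up to constant factors, with a few lines of arithmetic (the ~8·10¹¹ figure in the text includes the per-sample constant omitted here):

```python
import math

M, N = 720, 576                   # pixels per line, lines per frame
direct = (M * N) ** 2             # direct 2-D DFT: ~ (MN)^2 operations
fast = M * N * math.log2(M * N)   # 2-D FFT via 1-D FFTs: ~ MN*log2(MN)

print(f"direct: {direct:.1e} operations")
print(f"fast:   {fast:.1e} operations")
print(f"speed-up: {direct / fast:.0f}x")
```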

There are also fast algorithms for computing the DCT. As will be seen below, in digital television the main role is played by the DCT of 8x8-pixel blocks, which uses a fast algorithm for the one-dimensional DCT of a signal segment containing eight elements. In this case, the DCT is first computed for each column of the block of image elements, and then for each row of the resulting 8x8 matrix of numbers.
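The column-then-row procedure can be sketched as follows (plain Python, orthonormal DCT-II scaling assumed; a real codec would use a fast eight-point algorithm instead of this direct form):

```python
import math

def dct1(x):
    """One-dimensional DCT-II (orthonormal scaling assumed)."""
    N = len(x)
    return [(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) *
            sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def dct2d(block):
    """8x8 DCT computed separably: first each column, then each row."""
    cols = [dct1([block[m][n] for m in range(8)]) for n in range(8)]
    tmp = [[cols[n][m] for n in range(8)] for m in range(8)]  # transpose back
    return [dct1(row) for row in tmp]

flat = [[10.0] * 8 for _ in range(8)]  # a constant (featureless) block
spec = dct2d(flat)
print(round(spec[0][0], 3))  # only the DC coefficient is non-zero
```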

In modern equipment, including digital television equipment, the DFT and DCT are usually performed in real time using digital signal processors (DSPs) or special hardware, for example parallel computing devices.

The DCT underlies the most widely used coding methods today, JPEG, MPEG-1, and MPEG-2, which are described in Section 2.2.

Algorithms for digital image compression using orthogonal transformations

As a manuscript

Umnyashkin Sergey Vladimirovich

UDC 004.932: 004.421: 519.722

Mathematical methods and digital algorithms

image compression using

orthogonal transformations

Specialty 05.13.11 – “Mathematical and software support of computing machines, complexes, and computer networks”

Moscow – 2001

The work was carried out at the Moscow State Institute of Electronic Technology (Technical University)

Scientific consultant: Doctor of Physical and Mathematical Sciences, Professor Pospelov A.S.

Official opponents:

Doctor of Physical and Mathematical Sciences, Professor Ososkov G.A.

Doctor of Physical and Mathematical Sciences, Professor Selishchev S.V.

Doctor of Technical Sciences, Professor Koekin A.I.

Leading organization: Federal State Unitary Enterprise Research Institute Radio (Moscow)

The defense will take place on February 19, 2002 at 14:30 at a meeting of dissertation council D.212.134.02 at the Moscow State Institute of Electronic Technology, at the address: 103498, Moscow, Zelenograd, MIET (TU).

The dissertation can be found in the MIET (TU) library.

Scientific secretary of the dissertation council, Professor /Vorobiev N.V./

General description of the work

Relevance of the topic. Storing and transmitting images in a direct digital representation, as a matrix of pixels (image points), requires processing enormous amounts of data.

However, direct image representation is ineffective: due to the significant correlation of matrix elements, independent coding of pixels generates redundant codes. Therefore, among other problems of digital image processing, the problem of image compression, which consists in finding ways to implement effective encoding of visual data, is of particular relevance.

Goal of the work. The complexity of image compression algorithms is steadily growing; this concerns not only the volume of computation but also the conceptual foundations of the algorithms, most of which rely on discrete orthogonal transforms for data preprocessing. At the same time, the image compression problem is posed by practice, which requires constant attention to the capabilities of real equipment. The goal of the work was to study theoretical issues of efficient image coding using orthogonal transforms and to develop corresponding compression algorithms suitable for practical use on universal general-purpose computing hardware.

Direction of research. The research carried out in the dissertation included consideration of the following issues:

1. Research and development of methods for theoretical analysis and synthesis of discrete transformations for correlated data compression schemes;

2. Development of new fast algorithms for computing the discrete Chrestenson-Levy transform (DCLT) and of a halftone image compression algorithm based on statistical coding of DCLT spectra;

3. Study of the specifics and formalization of the general compression scheme based on block-by-block processing of image fragments by an orthogonal transform followed by quantization and statistical coding of the transform coefficients;

4. Development of wavelet image compression algorithms and studying the possibilities of fractal coding in the wavelet spectrum;

5. Development of an algorithm for compressing video sequences (dynamic images), suitable for use in the form of software implementation based on universal general-purpose computing tools (multimedia personal computers).

Research methods. The main theoretical tools were methods of mathematical and functional analysis, linear algebra, probability theory and mathematical statistics, and information theory. A significant part of the research consisted of computer experiments on processing real still and dynamic images, aimed at obtaining the necessary statistical data and determining the characteristics of the final compression algorithms. The experiments confirmed the correctness of the theoretical results and the effectiveness of the proposed compression algorithms.

Scientific novelty. The dissertation work produced new methods for analyzing the effectiveness of orthogonal transforms intended for compressing correlated data; specifically for data compression, the discrete pseudocosine transform (DPCT) was constructed and introduced for the first time. New fast algorithms for computing the DCLT were developed, on the basis of which a compression scheme for still images with characteristics similar to the JPEG method was obtained for the first time.

For processing still and dynamic images, both new algorithms and general theoretical approaches have been proposed that formalize procedures for the analysis and synthesis of digital image compression schemes based on discrete orthogonal transformations.

The following main results are submitted for the defense of the dissertation:

A method for assessing the decorrelating efficiency of orthogonal transforms and, constructed on its basis for compressing correlated data, the discrete pseudocosine transform (DPCT) together with a fast algorithm for its computation;

A new fast DCLT algorithm and its modification, an algorithm with incomplete computation; an algorithm of combined DCLT computations for processing real arrays in the basis (1, exp(−2πi/3));

An image compression method based on a special technique of arithmetic coding of the DCLT spectra of image blocks;

Deterministic and probabilistic estimates of discrete cosine transform (DCT) coefficients;

Algorithm for contextual coding of DCT image spectra;

General scheme for image compression based on adaptive vector quantization in the field of orthogonal transformations;

Wavelet compression algorithms for static images;

An algorithm for searching for displaced image blocks;

An experimental technique for constructing partitions of spectra into independently coded regions;

Video compression algorithm.

Practical value. In general, the content of the work is applied, so the theoretical results obtained also serve to achieve goals related to the development of specific algorithms and schemes for digital image compression. The use of the obtained image compression algorithms is possible for a wide class of systems for storing and transmitting visual information, primarily in multimedia and network computer applications. The developed algorithms, as confirmed by experiments, have high characteristics in terms of speed, quality of processing and data compression, which correspond to the modern world level.

Implementation of work results. The theoretical results of the work and algorithms for compressing video images were introduced at the State Research and Production Complex “Technological Center” of MIET (http://www.tcen.ru) and used in the scientific and production activities of the Scientific and Production Enterprise “Technology” (Moscow).

Approbation of the work. The main results of the work were reported and discussed at the following scientific conferences and meetings:

1. VII Saratov winter school on the theory of functions and approximations (SSU, January 1994).

2. Int. conference on the theory of functions and approximations, dedicated to the 90th anniversary of Acad. S.M. Nikolsky (Moscow, MI RAS, May 1995).

3. All-Russian scientific and technical conferences “Electronics and Informatics” (Moscow, MIET, 1995-2000).

4. Int. conference on the theory of approximation of functions, dedicated to the memory of prof. P.P. Korovkina (Kaluga, KSPU, June 26-29, 1996).

5. International conferences “Methods for optimization of calculations” (Kyiv, 1997, 2001).

6. International conference “Problems of Mathematical Education”, dedicated to the 75th anniversary of Corresponding Member of RAS Prof. L.D. Kudryavtsev (1998).

7. International conference “Approximation theory and harmonic analysis” (Tula, May 26-29, 1998).

8. International conference “Information technologies in innovative projects” (Izhevsk, April 20-22, 1999).

9. VII International Conference “Mathematics. Economy. Ecology. Education” – International Symposium “Fourier Series and Their Applications”

(Novorossiysk, 1999).

10. VII International Conference “Mathematics. Computer. Education”

11. International conference dedicated to the 80th anniversary of the birth of S.B. Stechkin (Ekaterinburg, February 28 - March 3, 2000).

Publications. The main content of the dissertation is reflected in 30 works.

Structure and scope of the dissertation. The dissertation contains pages (26 of which are appendices) and consists of an introduction, six chapters, a conclusion, and 6 appendices. The bibliography includes 178 titles. The appendices present numerical results of a number of image-processing experiments, as well as reference information and copies of documents on the use of the results of the dissertation work.

The introduction (28 pages) substantiates the relevance, scientific novelty, and practical value of the research, and briefly summarizes the contents of the chapters.

The first chapter (35 pages) provides preliminary information necessary for the subsequent presentation and gives a brief overview and classification of the main approaches to effective image coding.

Lossy compression of halftone images is widespread: by allowing an error in the reconstructed image, one can achieve a much higher level of data compression. Most often the quality of processing is assessed by comparing X = (xi,j), the matrix of the original image, with the matrix of the image obtained after processing (data compression and reconstruction). As a logarithmic measure of the standard deviation, the generally accepted PSNR (peak signal-to-noise ratio) measure is used. It is convenient to consider image compression methods as a general scheme consisting of three main stages: reduction of inter-element data correlation, quantization of data elements, and statistical coding. Quantization is the main tool of lossy data compression: in essence, quantization extracts the most significant part of the information from the input data, while the less significant part is discarded.

Both scalar and vector quantization are used.

Converting an image to the generalized spectral domain by a linear transform F can significantly reduce the inter-element correlation in the transform matrix Y = F(X) compared to the correlation of the elements of the discrete image matrix X. Independent component-wise coding of the matrix Y then becomes more efficient than coding the matrix X. One can also give an energetic interpretation of the purpose of transforms: to concentrate the maximum share of the energy of the original discrete signal (matrix X) in the minimum number of spectral coefficients (elements of matrix Y). There is a definite connection between the energy distribution in the generalized spectrum and the decorrelating properties of transforms. Studying the decorrelating efficiency is therefore an important task when choosing a transform for a compression scheme.

Real photographic images are two-dimensional signals with inhomogeneities (features) in the regions of object contours; therefore, the basis functions used for the expansion must be well localized in the image plane. In background areas, however, the image can be considered a realization of a stationary signal, which makes a frequency-localized basis preferable for the expansion (it is well known that the Fourier coefficients of the trigonometric expansion of a stationary signal are uncorrelated).

It is impossible to achieve simultaneously high resolution in the frequency and time domains because of the Heisenberg uncertainty principle. The solution is to use wavelet bases, which have variable time-frequency resolution. Wavelet-based approaches are currently dominant in still-image processing, gradually replacing the traditional decorrelation tool, the discrete cosine transform.

The first chapter notes that to optimize lossy compression algorithms an approach based on minimizing the Lagrangian RD function is often used. Let X be an input data set to which the compression-reconstruction procedure assigns an output data set of the same nature, Y = F(X, u), where u = (u1, ..., un) is the set of control parameters of the compression algorithm F. We consider X and Y elements of some space with metric D(X, Y); the set of all admissible values of the control vector u is denoted by U. The coding optimization problem is to find parameters u* = (u1, ..., un) of the algorithm F such that, for the given input data X and the maximum allowable bit budget Rb, the coding error D(X, Y) = D(X, F(X, u)) takes its minimum value. That is, where R(X, u) is the number of bits required to encode the data set X with parameters u.

Finding a solution of problem (1) in most cases reduces to cumbersome iterative numerical procedures. If the constraint R(X, u) ≤ Rb is not imposed, then to determine the optimal coding parameters u* corresponding to the solution of problem (1) for some (previously unknown) value of Rb, a simplified version is used: minimization of the Lagrangian RD function

where λ is a non-negative externally specified parameter. The parameter λ in the function J(u) sets the balance between quality and compression level. The value λ = 0 corresponds to the smallest coding error; as λ increases, optimizing the parameters of algorithm F according to (2) yields a smaller code length but a larger error. Thus the coding algorithm F can be tuned to the required characteristics. To solve problem (1), minimization (2) is repeated iteratively with different values of λ; this procedure is called RD optimization¹.
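As a toy illustration of procedure (2) (not the dissertation's algorithm), the sketch below minimizes J(q) = D(q) + λ·R(q) over a grid of uniform quantization steps q for a synthetic Gaussian source, with the rate estimated by the empirical entropy of the quantizer indices:

```python
import math
import random

random.seed(1)
data = [random.gauss(0, 10) for _ in range(1000)]  # synthetic source

def rd_for_step(q):
    """Distortion D (MSE) and rate R (empirical index entropy, bits/sample)
    for uniform scalar quantization with step q."""
    idx = [round(v / q) for v in data]
    rec = [i * q for i in idx]
    D = sum((a - b) ** 2 for a, b in zip(data, rec)) / len(data)
    hist = {}
    for i in idx:
        hist[i] = hist.get(i, 0) + 1
    R = -sum(c / len(idx) * math.log2(c / len(idx)) for c in hist.values())
    return D, R

def best_step(lam, steps=(0.5, 1, 2, 4, 8, 16)):
    """Minimize the Lagrangian J(q) = D(q) + lam * R(q) over a grid of steps."""
    def J(q):
        D, R = rd_for_step(q)
        return D + lam * R
    return min(steps, key=J)

for lam in (0.0, 5.0, 500.0):
    q = best_step(lam)
    D, R = rd_for_step(q)
    print(f"lambda={lam:6.1f}: step={q:4}, D={D:8.3f}, R={R:5.2f} bit/sample")
```

Increasing λ shifts the optimum toward coarser quantization: a shorter code at the price of a larger error, exactly the trade-off described above.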

The first chapter also briefly notes the specifics of processing (compression and reconstruction) dynamic images. The main transform used for video compression is still the DCT, since it requires less computation than wavelet transforms.

As with still-image compression, video encoding algorithms are usually more complex than the decoding algorithms.

Implementation of software video compression in real time thus imposes significant restrictions on the permissible complexity of the computations.

¹ Berger T. Rate Distortion Theory. – Englewood Cliffs, NJ: Prentice Hall, 1971.

Chapter two (52 pages) is devoted to the study of the effectiveness and the synthesis of orthogonal transforms intended for data compression. The proposed new method of efficiency analysis is based on the following reasoning. Let the covariance matrix KX of the original data vector X = (x0, x1, ..., xN−1)T be known, and let the spectrum vector Y be obtained by an orthogonal transform with matrix W: Y = WX.

The average unconditional entropy of the spectrum-vector coefficients can be written as:

where fk(mk, σk, x) is the probability density of the spectral coefficient yk (the k-th component of the vector Y), mk is its expected value, σk its standard deviation, and fk0(x) = fk(0, 1, x). The lower the average entropy (3), the more effective the subsequent independent coding of the spectrum components. As a measure of decorrelating efficiency relative to the optimal Karhunen-Loeve transform (for which the matrix W = Wopt is composed of the eigenvectors of KX and the matrix KY = WKXWT is diagonal), we consider the average excess entropy ΔH(W, KX) = Hcp(W, KX) − Hcp(Wopt, KX), which is expressed through the elements of the matrices KX = (cov(xi, xj)), i, j = 0, ..., N−1, and W = (wi,j), i, j = 0, ..., N−1, as follows:

The larger the value of ΔH(W, KX), the lower the decorrelating efficiency of the transform with matrix W. Numerical calculations of quantity (4) for various transforms and types of covariance matrices gave results fully consistent with known data obtained by other methods, for example by Pearl (Pearl J. On coding and filtering stationary signals by discrete Fourier transforms // IEEE Trans. Inf. Theory. – 1973. – Vol. IT-19. – P. 229-232).

Of great interest for analysis is the model of a discrete signal (vector X) having the statistics of a first-order Markov process, when the covariance matrix has the following form:

This model is often used to describe inter-row and inter-column correlation in discrete images. When ρ = 1, i.e., when all components of the original vector X coincide (the correlation coefficient of any two samples equals one), the introduced criterion (4) cannot be computed for matrix (5), since in this case det KX = 0. At the same time, in background areas of an image ρ ≈ 1. In Chapter 2 the following theorem was proven.

Theorem 2.1. For any orthogonal matrix W (N×N) such that for j = 0, 1, ..., N−1 the entries w0,j coincide (the basis function with index zero is the normalized constant component) and for the covariance matrix (5)

Various studies, including those carried out in Chapter 2, show that among discrete transforms possessing fast computational algorithms (implemented for dimension N in ~N·log N arithmetic operations), the decorrelation characteristics closest to the optimal Karhunen-Loeve transform for the Markov process (5) are obtained with the DCT.
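The decorrelating effect of the DCT on model (5) can be illustrated numerically. The sketch below (plain Python, orthonormal DCT-II matrix assumed, ρ = 0.95) transforms the covariance matrix and compares the off-diagonal energy before and after:

```python
import math

N, rho = 8, 0.95

# Covariance matrix (5) of a first-order Markov process: K[i][j] = rho^|i-j|
K = [[rho ** abs(i - j) for j in range(N)] for i in range(N)]

# Orthonormal DCT-II matrix W (rows are the basis vectors)
W = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) *
      math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
     for k in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

T = [list(col) for col in zip(*W)]   # W transposed
KY = matmul(matmul(W, K), T)         # covariance of the spectrum Y = W X

def offdiag(Mx):
    """Total squared off-diagonal (residual correlation) energy."""
    return sum(Mx[i][j] ** 2 for i in range(N) for j in range(N) if i != j)

print(offdiag(K), offdiag(KY))  # off-diagonal energy drops sharply after the DCT
```

The near-diagonal form of KY is exactly the property that makes independent coding of the spectrum components efficient.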

However, despite the availability of well-developed fast algorithms, the DCT fundamentally requires multiplication operations and is noticeably inferior in computation volume to, for example, the Haar, Walsh, and Chrestenson-Levy transforms. A separate issue, given considerable attention in Chapter 2, is the construction of a new transform that combines high decorrelation characteristics for model (5) with computation algorithms much faster than those of the DCT. The resulting discrete pseudocosine transform (DPCT) is defined for vectors of dimension N admitting the factorization N = N1·...·Nn with Nk ∈ {2, 3, 4} for all k. The representation N = N1·...·Nn must be written with the minimum number of factors Nk, arranged in non-decreasing order, i.e., for all k and m > k: Nk ≤ Nm. For example, for N = 8 we have N1 = 2, N2 = 4 (but not N1 = 4, N2 = 2, and not N1 = N2 = N3 = 2). The DPCT matrix WN (the subscript indicates the dimension of the transform) is then constructed as the tensor (direct) product² WN = WN1 ⊗ ... ⊗ WNn of elementary DPCT matrices WNk ∈ {W2, W3, W4}, k = 1, ..., n, which are orthogonal and are obtained by certain modifications of the DCT matrices of the corresponding dimension. Each elementary matrix can be represented as the product of a diagonal matrix D and a matrix C whose structure allows multiplication by an arbitrary vector to be implemented using only addition and subtraction of numbers (multiplication by 2 is equivalent to an addition: 2x = x + x). Namely:

From the properties of the tensor product follows the representation W_N = D_N C_N, where C_N = C_{N1} ⊗ ... ⊗ C_{Nn} and D_N is diagonal. The matrices C2, C3, C4, D2, D3, D4 are given above. Thus, computing the DPCT Y = W_N X = D_N C_N X consists of multiplying the matrix C_N by the vector, Y' = C_N X, followed by normalization of the resulting vector, Y = D_N Y'. To calculate the DPCT it is convenient to use fast algorithms based on a factorized representation of the matrices: (see scan)

where I_{Nj} is the identity matrix of dimension Nj x Nj. Since the matrices T^(j)_N are composed in a certain way of sparse blocks C_{Nj}, multiplying T^(j)_N by a vector likewise reduces to additions and subtractions only. Fast algorithms for the inverse DPCT are constructed similarly, owing to the orthogonality of the factors. Note that the normalization (multiplication by the matrix D_N) required when computing the DPCT and the inverse DPCT does not complicate the calculations in a compression scheme with scalar quantization of the transform coefficients.
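Because W_N is a Kronecker product, it can be applied without ever forming the full N x N matrix, which is the essence of the fast factorized algorithms mentioned above. A minimal sketch (the 2-point matrix W2 = (1/sqrt(2))·[[1, 1], [1, -1]] is used purely for illustration; the dissertation's elementary matrices W3 and W4 are not reproduced in this extract):

```python
import numpy as np

# Applying a Kronecker-product transform (A kron B) @ x via two small
# matrix products: reshape x into an n1 x n2 matrix X (row-major order
# matches the ordering of np.kron), then (A kron B) x == vec(A @ X @ B.T).
def kron_apply(A, B, x):
    n1, n2 = A.shape[0], B.shape[0]
    X = x.reshape(n1, n2)
    return (A @ X @ B.T).reshape(-1)

# 2-point elementary matrix (equal to the 2-point DCT).
W2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

x = np.arange(4, dtype=float)
assert np.allclose(kron_apply(W2, W2, x), np.kron(W2, W2) @ x)
```

For N = N1·...·Nn the same trick applies recursively, which is what yields the ~N log N operation count quoted earlier.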

During data compression, normalization can be combined with the scalar quantization stage by assigning to each element of the vector Y' an individual quantization step q_j = q/d_jj (where d_jj is an element of the diagonal normalization matrix D_N). When dequantizing, ŷ_j = m_j·ỹ_j, the multiplier for the element ỹ_j should be chosen as m_j = q·d_jj.
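This folding of the normalization into the quantizer can be sketched as follows. The concrete 2-point matrices below (C = [[1, 1], [1, -1]], diagonal of D equal to 1/sqrt(2)) are our illustrative assumption; the dissertation's C_N and D_N are analogous:

```python
import numpy as np

# Sketch: quantize the un-normalized coefficients y'_j = (C x)_j with the
# individual steps q_j = q / d_jj, and fold both normalizations (forward
# and inverse) into the dequantization multipliers m_j = q * d_jj.
C = np.array([[1.0, 1.0], [1.0, -1.0]])
d = np.array([1.0, 1.0]) / np.sqrt(2.0)   # diagonal elements d_jj

def dpct_encode(x, q):
    y_unnorm = C @ x                      # multiplication by C: add/sub only
    return np.round(y_unnorm / (q / d))   # individual steps q_j = q / d_jj

def dpct_decode(idx, q):
    # m_j * idx_j approximates d_jj^2 * y'_j, so x ≈ C^T (m * idx):
    return C.T @ ((q * d) * idx)          # multipliers m_j = q * d_jj
```

The decoder thus never performs a separate normalization pass: both diagonal factors are absorbed into the per-element multipliers.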

As shown by calculations of the average excess entropy (4) and of the residual correlation according to Pearl, for data with the statistics of the first-order Markov process (5) the DPCT decorrelates more efficiently than the other fast transforms whose implementation likewise reduces only to additions and subtractions.

Chapter Three (48 pages) is devoted to the use of the discrete Chrestenson-Levy transform (DCLT) for image compression and develops the research of the author's PhD thesis.

For a substantiation of this idea see pp. 84-85 of the monograph: Varichenko L.V., Labunets V.G., Rakov M.A. Abstract Algebraic Systems and Digital Signal Processing. - Kyiv: Naukova Dumka, 1986. - 248 p.

The idea behind the compression algorithm proposed in Chapter 5 is based on the work of Lewis-Knowles (LK) and Xiong-Ramchandran-Orchard (XRO).

LK: Lewis A.S., Knowles G. Image Compression Using the 2-D Wavelet Transform // IEEE Trans. Image Proc. - 1992. - Vol. 1. - No. 2. - P. 244-250.

When developing the topology of the tree S in XRO, a binary map (n_i) was used for all tree nodes except the leaves: if n_i = 0, the tree is pruned at this node, and if n_i = 1, at least its immediate descendants are preserved; this is illustrated in Fig. 2A. The XRO algorithm does not encode the features (n_i) statistically, even though the features (n_i) of neighboring nodes (neighboring by position within the subband) are correlated quantities. To take this correlation into account, the developed algorithm groups the features of neighboring nodes (n_i), i ∈ C_j, into single data elements, so that the pruning map (tree topology) is described by a new data alphabet with symbols N_i = (n_i1, n_i2, n_i3, n_i4) = n_i1 + 2·n_i2 + 4·n_i3 + 8·n_i4, which are also encoded statistically. The new extended feature N_j is associated with node j of the higher level, see Fig. 2B.
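The packing of the four sibling flags into one alphabet symbol can be sketched as follows (helper names are ours):

```python
# Sketch of the extended topology symbol N_i described above: four binary
# pruning flags of sibling nodes packed into one symbol in {0, ..., 15},
# which the arithmetic coder then models jointly instead of bit-by-bit.
def pack_flags(n1, n2, n3, n4):
    return n1 + 2 * n2 + 4 * n3 + 8 * n4

def unpack_flags(symbol):
    return tuple((symbol >> k) & 1 for k in range(4))
```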

Figure 2. Method of pruning branches during layer-by-layer node traversal: A - the XRO algorithm, B - the proposed topology coding; C_i = (i1, i2, i3, i4).

An idea that goes back to the work of LK and is used in the same form in XRO is the following: the greater the absolute value of the wavelet coefficient w_i (or the energy w_i^2) of parent node i, the less likely a zero (i.e., pruned) branch is to appear at this node. A more accurate prediction of the occurrence of a zero branch can be made by using as

XRO: Xiong Z., Ramchandran K., Orchard M.T. Space-frequency Quantization for Wavelet Image Coding // IEEE Trans. Image Proc. - V. 6. - May 1997. - P. 677-693.

the predicted value P_i a sum that includes, in addition to w_i^2, also the values of the neighbors of node i. For the predicted value of node i it is proposed to use the following sum of absolute values of wavelet coefficients: (see scan)

where the index sets for summation are determined among the neighbors of node i in the subband in accordance with Fig. 3. The weighting coefficients entering the sum were obtained by statistical processing of a number of test images so as to maximize the sample correlation coefficient between the predicted value (10) and the energy of the quantized wavelet coefficients of the immediate descendants C_i: ρ(P_i, w^2) → max.

The proposed wavelet compression algorithm uses several statistical models. Let H(k, c) denote the function of the arithmetic encoder that estimates, using its internal statistical models, the number of bits required to encode the symbol c in the k-th stream. The designation Hspec refers to streams in which quantized wavelet coefficients are encoded, Hmap to streams in which the flags marking the beginning of a zero branch are encoded. The spectrum trees are processed sequentially; after the topology of the next tree has been optimized, it is encoded, thereby adapting the statistical models of the arithmetic encoder.

Step 0. /* initialization */
Step 1. ∀ i ∈ L_{n-1}: /* all nodes of the penultimate level are visited */
1.1. /* adjustment of quantized coefficients */
1.2. /* calculation of the RD functions for keeping and for pruning the leaves */
Step 2. ∀ i ∈ L_l: /* visiting the current level with an attempt to prune branches */
If i ≠ 0 then /* the root of the tree has not yet been reached */
/* determining the optimal topology of the branches */
/* adjusting the quantized coefficients */
/* preparing to visit the next level */
otherwise /* i = 0, the root of the tree is reached */
/* determining the optimal topology of the tree */
Step 3. /* generating and outputting the result */
End

The tree nodes are visited from the leaves to the root. At the first (preparatory) step, the nodes i that have only immediate descendants (C_i = U_i) are visited, and arrays are formed from the values of the RD functions corresponding to the options of pruning and of keeping the leaves. Beforehand, at step 1.1, the possibility of an additional Lagrangian minimization, which characterizes the scalar quantization of the wavelet coefficients, is analyzed for each leaf node. This procedure is based on the known properties of the probability distribution of wavelet coefficients, according to which the bit cost R of coding a coefficient is a priori the smaller, the smaller its absolute value (for this reason the additional minimization is not applied in the case w = 0). At the second step, performed for all nodes i of the subsequent levels, i ∈ L_l (l = n-2, ..., 0), the RD-optimal way of pruning the branches starting from the nodes j ∈ C_i is selected (steps 2.1-2.2). Steps 2.3 and 2.4 have the same meaning as steps 1.1 and 1.2. Note that the introduction of the additional optimization of scalar quantization (steps 1.1 and 2.3) makes it possible, at the same compression level, to increase the PSNR by a further 0.02-0.03 dB.
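The per-node keep-vs-prune comparison at the heart of the algorithm can be sketched as follows (a minimal sketch: names are ours, and the rate term of the pruned variant is reduced to the topology map bits):

```python
# Sketch of the RD pruning decision for one node: pruning zeroes the whole
# subtree, so its distortion is the subtree energy, while the rate shrinks
# to the bits needed for the topology map symbol.
def prune_decision(j_keep, subtree_energy, r_map_bits, lam):
    """Return (pruned?, J) for one node; j_keep is D + lam*R of keeping."""
    j_prune = subtree_energy + lam * r_map_bits
    return (True, j_prune) if j_prune <= j_keep else (False, j_keep)
```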

For all nodes except the root, the choice of the model for encoding the features N_i (at step 2.1) is made using the rule defined by the function IndMap(i). To encode the feature N_0 associated with the root of the tree, a separate data stream is used, conventionally assigned the number 0 (see step 2.6).

As follows from the above description of the optimization algorithm, a vital role in its operation is played by the functions IndMap(P_i) and IndSpec(i), which set the rules for selecting the streams for data encoding. The first function selects the coding model for the feature N_i = (n_i1, n_i2, n_i3, n_i4) based on the average of the predicted values P_i (10), i = i1, i2, i3, i4, and has the following form: (see scan)

The thresholds (t1, t2, t3) = (0.3; 1.1; 4.0) for selecting the models were found by minimizing the length of the output bit code R(t1, t2, t3) on the well-known test images Lena, Barbara and Goldhill. Experiments also showed that introducing a larger number of models for the features makes no sense.
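With the thresholds quoted above, the IndMap rule can be sketched as follows (the use of `bisect` and the boundary handling are our implementation choices; the thresholds are from the text):

```python
import bisect

# Sketch of IndMap: the mean predicted value of the four sibling nodes is
# compared against (t1, t2, t3) = (0.3, 1.1, 4.0), selecting one of four
# statistical models for coding the topology symbol N_i.
THRESHOLDS = (0.3, 1.1, 4.0)

def ind_map(p_siblings):
    p_mean = sum(p_siblings) / len(p_siblings)
    return bisect.bisect_right(THRESHOLDS, p_mean)   # model index 0..3
```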

The second key function of the wavelet compression algorithm, IndSpec(i), is the model selection rule for encoding the wavelet coefficients that do not fall into zero branches. To encode the spectral coefficients, it is more effective to use a predicted value obtained not only from the parent node, but also from the wavelet coefficients of the nodes in the same subband adjacent to the one being processed. In the algorithm under consideration, the order of encoding and decoding of the wavelet spectrum nodes is determined by the diagram in Fig. 4 (the numbers indicate the order of subband processing). For use in the function IndSpec(j), the predicted value of the current node j (marked in black) is formed as s_j = 0.36·P_i + 1.06·w_jy + w_jx + 0.4·w_jd, where jy is the vertical neighbor node, jx the horizontal neighbor node, and jd the diagonal neighbor node (all of which have already been processed, see Fig. 4), and P_i is determined for the parent node i (j ∈ C_i) by (10). The weights were obtained from the results of processing test images.

The idea of using the context of neighboring wavelet coefficients was proposed in: Chrysafis C., Ortega A. Efficient Context-Based Entropy Coding for Lossy Wavelet Image Compression // Proc. Data Compression Conference. - Snowbird (Utah), 1997. - P. 241-250.
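The context predictor can be sketched directly from the weights quoted in the text (taking absolute values of the neighbor coefficients is our assumption, by analogy with the magnitude-based predictor (10)):

```python
# Sketch of the prediction s_j for the current node j, combining the
# parent's predicted value P_i with the already-decoded vertical,
# horizontal and diagonal neighbors in the same subband.
def spectral_context(p_parent, w_vert, w_horiz, w_diag):
    return (0.36 * p_parent + 1.06 * abs(w_vert)
            + abs(w_horiz) + 0.4 * abs(w_diag))
```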

The zero model refers to the encoding of the spectral coefficients under the scaling functions and is again kept separate. The first model covers the lowest-frequency wavelet coefficients (level L1), as well as the coefficients for which the prediction s_j is largest.

The PSNR values (in dB) obtained when compressing the standard test images Lena, Barbara and Goldhill with the proposed algorithm are given in Table 1. In the experiments, five-step wavelet transforms were used from the work: Wei D., Pai H.-T., Bovik A.C. Antisymmetric Biorthogonal Coiflets for Image Coding // Proc. IEEE International Conference on Image Processing. - Chicago, 1998. - V. 2. - P. 282-286. Comparison of the achieved characteristics with the results of applying other well-known algorithms shows that the proposed algorithm has very high performance.

The final studies of Chapter 5 concern the construction of a hybrid coding scheme for the wavelet spectrum, in which, in addition to the branch-pruning method described above, vector "self-quantization" of branches is also allowed; this can be interpreted as fractal coding in the wavelet transform domain. The resulting hybrid algorithm requires significantly more computation, while the fractal component of the coding turned out to be almost completely suppressed by the basic branch-pruning wavelet compression scheme. It should be noted, however, that the two approaches were combined in the hybrid algorithm in a simple way, and a vast field for further research remains here.

The sixth chapter (45 pages) is devoted to the study of dynamic image compression algorithms with the aim of constructing a video compression scheme suitable for software implementation that provides real-time processing on personal computers.

A frame of a video sequence is a matrix of pixels with M1 rows and M2 columns: B = (b_{k,l}), k = 0, 1, ..., M1-1, l = 0, 1, ..., M2-1; by a video sequence we mean an ordered set of frames B0, B1, ..., Bi, .... We call a (y, x)-block of frame B (y, x are integer coordinates) the submatrix B_{y,x} = (b_{k,l}), where k = y, y+1, ..., y+N1-1, l = x, x+1, ..., x+N2-1. In the developed algorithm, each frame of the video sequence is divided during processing into adjacent blocks (B_{m,n}) of size 8x8, m, n = 0, 8, 16, .... If some block B^{i-1}_{y,x} of the previous frame turns out to be in a certain sense "similar" to the block B^i_{m,n} being processed, we assume that B^i_{m,n} is a displaced fragment of the previous frame, and to encode the (m, n)-block of the image it suffices to transmit the coordinates y and x of the block in the previous frame, or rather the coordinate differences y-m and x-n. A special case of a displaced block is a stationary block, when m = y, n = x. If no similar block can be found for B^i_{m,n} in the previous frame, the block must be encoded as new. To select the encoding method for the next processed block B^i_{m,n} we are again guided by the minimum of the Lagrange function J(b) = D(b) + R(b). Let the argument b = 0 correspond to encoding a displaced (stationary) block and b = 1 to a new one; i.e., if J(1) > J(0), the block is coded as displaced, and as new otherwise.

See, for example: Davis G.M. A wavelet-based analysis of fractal image compression // IEEE Trans. Image Proc. - 1998. - V. 7. - No. 2. - P. 141-154.
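The mode decision can be sketched as follows (the text writes J(b) = D(b) + R(b); an explicit multiplier lam is kept here for generality, with lam = 1 reproducing the text's form, and the names are ours):

```python
# Sketch of the Lagrangian mode decision: b = 0 codes the block as moved
# (or stationary), b = 1 as new; the mode with the smaller J(b) wins,
# with ties going to "new" as in the text.
def choose_mode(d_moved, r_moved, d_new, r_new, lam=1.0):
    j_moved = d_moved + lam * r_moved   # J(0): moved / stationary block
    j_new = d_new + lam * r_new         # J(1): new block
    return "moved" if j_new > j_moved else "new"
```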

When using RD optimization, the problem of searching for displaced blocks is formulated as follows: for a given (m, n)-block B^i_{m,n} of the i-th frame, find in the previous reconstructed frame a (y, x)-block B^{i-1}_{y,x} such that the Lagrange RD function takes its minimum value. Here it is taken into account that the coordinates of the found block will be encoded relatively, i.e., as the displacement vector r = (y-m, x-n). For the search to be feasible in real time, only points (v, u) sufficiently close to the point (m, n) can be considered as the search region. Increasing the search efficiency by expanding the region is achieved by using various directed search algorithms aimed at minimizing the representation error of the displaced block B^i_{m,n} ≈ B^{i-1}_{y,x}, which corresponds to the special case of (11) with λ = 0. To take the contribution of the bit costs to the RD function J* into account, minimization of (11) is carried out in stages, at each stage approximately assuming that the displacement vectors under consideration entail the same statistical coding costs. In addition, to increase the universality of the iterative search algorithms, small displacements are searched more accurately. Indeed, if some block of the image has moved a considerable distance relative to the previous frame, the corresponding fragment is perceived by the human eye as blurred, and there is no need to determine the displacement vector precisely. Small displacements of blocks are not only predominant, but also must be tracked accurately owing to this specific feature of visual perception. The proposed RD search algorithm has the following form.

Step 0. Calculation of the value of the RD function for the stationary block, r = (0, 0): (see scan)

Step 1. Near-exhaustive, exact search for small displacements.
1.1. Among the nine (y, x)-blocks of the previous frame, ... (see scan)
1.2. Among the nine (y, x)-blocks of the previous frame, ... (see scan)
1.3. Calculation of the values of the RD function: (see scan)

Step 2. Rough search for large displacements.
2.1. Among the eight (y, x)-blocks of the previous frame, ... (see scan)
2.2. Among the nine (y, x)-blocks of the previous frame with offsets (Δy, Δx) ∈ {(0, 0), (±2, ±2), (±3, 0), (0, ±3)} from the current best point, ... (see scan)
2.3. Calculation of the value of the RD function: (see scan)

Step 3. Selection of the best variant of block displacement.

End

To calculate the value of the function J0, the bit cost of encoding the displaced-block flag must be taken into account: J0 = J* - log2(ν_mov), where ν_mov is the frequency of occurrence of displaced blocks in the already processed data.

The main volume of calculations in the above algorithm relates to computing the deviations B^i_{m,n} - B^{i-1}_{y,x}. During the calculations, one test point is carried over from step 0 to step 1.1, one from step 1.1 to 1.2, and one from step 2.1 to 2.2. As a result, the deviation calculation needs to be performed 33 times.

To increase the efficiency of the above search algorithm, a forecast of the vector (y, x) of the processed block should be made using the vectors of the two already processed neighboring blocks (the vertical neighbor and the horizontal neighbor, respectively). The forecast itself consists of the relative coordinates y0, x0, which translate the center of the search region from the point (m, n) to the point (m + y0, n + x0). Experiments show that the number of new image blocks is reduced by 5-25% if the following rule for making the forecast is adopted: (see scan)

Following the ideology of the MPEG standards, new blocks are also processed using quantization followed by statistical coding of two-dimensional DCT coefficients. Let S = F(B_{m,n}) be the result of the DCT of block B_{m,n}. Denote the quantized spectrum by S~_Q = (s~_{k,l} = round(s_{k,l}/q_{k,l})) and the dequantized spectrum by S^_Q = (s^_{k,l} = q_{k,l}·s~_{k,l}), k, l = 0, ..., 7, where Q = (q_{k,l}) is one of the JPEG quantization matrices. To statistically encode the matrix S~_Q, the contextual coding algorithm discussed in Chapter 4 is used, supplemented by an RD-optimization stage. Let Z_Q = (z_0, ..., z_63) again be the vector obtained by zigzag reading of the matrix S~_Q according to the rule defined by the JPEG standard (S~_Q → Z_Q), and let G(Z_Q) = {g_1, ..., g_C} ⊆ {0, ..., 63} be the set of indices of the nonzero elements of Z_Q, i.e., z_{g_k} ≠ 0 if g_k ∈ G(Z_Q), and z_k = 0 if k ∉ G(Z_Q). RD optimization of the statistical coding is possible by lengthening the zero runs, i.e., by reducing the number of elements of the set G (additionally zeroing components of the vector Z_Q). To avoid noticeably complicating the coding algorithm, only the possibility of lengthening the final zero run is analyzed, since it makes the greatest contribution to minimizing the function J(Z_Q) = D(Z_Q) + R(Z_Q). Let Z'_Q be the vector obtained from Z_Q by zeroing its last components (z_k), k ≥ g_m + 1, i.e.

G(Z'_Q) = {g_1, ..., g_m} ⊂ G(Z_Q), m = 1, ..., C. The simplest RD optimization then consists of searching for the index g_m* ∈ G(Z_Q) whose truncated vector minimizes the Lagrange function. The procedure for encoding a new image block considered above assumes the use of a given quantization matrix Q; a universal encoding algorithm must operate with a set of quantization matrices {Q_j} and be able to select the one required for the specific conditions.
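The trailing-run optimization just described can be sketched as follows (the distortion and rate estimators are passed in as callbacks standing in for the squared error and the arithmetic coder's bit estimate; all names are ours):

```python
# Sketch of the simplest RD optimization: candidate vectors are obtained
# by zeroing everything after each nonzero zigzag position g_m, and the
# candidate with the smallest J = D + lam*R is kept.
def optimize_tail(zq, distortion, rate, lam):
    nonzero = [k for k, z in enumerate(zq) if z != 0]
    best, best_j = list(zq), distortion(zq) + lam * rate(zq)
    for g in nonzero:
        cand = list(zq[: g + 1]) + [0] * (len(zq) - g - 1)
        j = distortion(cand) + lam * rate(cand)
        if j < best_j:
            best, best_j = cand, j
    return best
```

With a toy rate model, only the small trailing coefficient gets zeroed; raising lam lengthens the final zero run further.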

If the set is large enough, then selecting the quantization matrix Q by minimizing the function JQ (12) turns into a cumbersome procedure that cannot be implemented in real time by standard means. In addition, with a large range of possible values of the index j, encoding it separately for each new block entails unacceptably high additional bit costs. Therefore, only a few matrices from the set recommended by JPEG were selected as the initial set, corresponding to the best, worst, and some intermediate quality levels; in the experiments, the number of matrices |{Q_j}| = 4 was chosen. To speed up the division operations needed for quantization, the elements of the matrices {Q_j} were rounded to the nearest power of two, 2^k, k = 0, 1, .... This approach makes it possible to replace integer division and multiplication with bit shifts of the binary representation of numbers, which real hardware usually performs much faster.
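The power-of-two speedup can be sketched as follows (non-negative inputs are assumed, and the exact rounding rule is our choice; the dissertation's rule is not reproduced in this extract):

```python
# Sketch: with a quantizer entry q = 2**k, round(s / q) is computed with
# one add and one shift, and dequantization with one shift.
def quant_shift(s, k):
    return (s + (1 << (k - 1))) >> k if k > 0 else s

def dequant_shift(idx, k):
    return idx << k
```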

Taking into account the bit cost that is required to encode the quantization matrix index, the Lagrange function corresponding to encoding a new image block is defined as follows:

J = min_Q (J_Q - log2 ν_Q) - log2 ν_new, where J_Q is in accordance with (12), ν_Q is the frequency of occurrence of the matrix Q, and ν_new is the frequency of appearance of new blocks in the previous processing.

When studying the characteristics of the final video compression algorithm, the encoding error in the reconstructed sequence B^0, B^1, ..., B^{K-1} was estimated by the peak signal-to-noise ratio, defined as follows: (see scan)

where M1 and M2 set the frame size in pixels. The well-known test sequences News, Container ship, Hall monitor, Akiyo and Claire were chosen for the experiments; all have a frame size of 144x176 pixels. Before the experiments the original sequences were decimated in time: only every third frame, B_{3k} (k = 0, ..., 24), was used to form the 25-frame video sequences that were then processed. This temporal decimation was meant to simulate a video capture rate of 10 frames per second instead of the original 30 frames per second (the same for all the above sequences). Only the luminance Y-component was processed, and only it was used for the error analysis. Software compression of the video sequences was achieved in real time.
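Formula (13) is not reproduced in this extract; sketched below is the standard sequence-level PSNR with peak value 255 that it presumably denotes:

```python
import numpy as np

# PSNR over a reconstructed sequence: the squared error is accumulated
# over all K frames of size M1 x M2 and compared with the peak 255.
def sequence_psnr(frames, rec_frames):
    err = sum(np.sum((f.astype(float) - r.astype(float)) ** 2)
              for f, r in zip(frames, rec_frames))
    n = sum(f.size for f in frames)
    return 10.0 * np.log10(255.0 ** 2 * n / err)
```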

The results of the numerical experiments, obtained by graduate student F.V. Strelkov, are shown in Table 2, together with the PSNR values (13) achieved on the same decimated 25-frame test sequences by the publicly available MPEG-2 encoder (http://www.mpeg.org/MSSG). The length of the compressed data file obtained in each experiment is exactly the bit-stream rate shown in the table multiplied by a factor of 2.5 (the duration, in seconds, of a 25-frame sequence at 10 frames per second). In all tests the proposed compression scheme gives good results, surpassing the characteristics of the specified MPEG-2 encoder, even though the mpeg2encode codec used an exhaustive search over a 23x23-pixel area to find displaced image blocks.

Table 2. Compression characteristics of the proposed algorithm.

The described video compression algorithm was implemented in software as part of the work carried out at the State Research and Production Complex "Technological Center" of the Moscow State Institute of Electronic Engineering and at NPP "Technology".

The developed video compression libraries were integrated into a number of software systems, among which the Visual Security video monitoring and registration system (see http://www.tcen.ru/vs) is of greatest practical interest.

The results of the research carried out in the dissertation are summarized in the final section, "Main findings and conclusion" (3 pages).

The presented dissertation examines various aspects of the use of discrete orthogonal transformations for digital image compression, both from the standpoint of formal theoretical analysis and from that of the requirements and limitations that practice imposes on specific computational schemes and algorithms. Overall, the content of the work is application-oriented, so most theoretical results are supported by computational experiments whose results, in turn, not only served as an illustration or test of the theory, but often gave impetus and provided source material for further research. The research conducted in the dissertation supports the following conclusions.

1. Orthogonal transformations are the main tool used for data decorrelation in image compression. When the mathematical model of a discrete signal is given by a covariance matrix, the average excess entropy criterion proposed in the work is an appropriate measure of the effectiveness of decorrelating processing.

2. The discrete pseudocosine transform (DPCT) was obtained and introduced for the first time, specifically for the compression of correlated data. In a compression scheme that includes scalar quantization of the transform coefficients, among the considered fast transforms whose implementation reduces only to addition-subtraction operations (Walsh, Haar, pseudocosine), the DPCT gives the best decorrelation results for a discrete signal described by the Markov model.

3. Using the obtained fast DCLT algorithms, which take into account the specifics of processing real-valued arrays, the proposed image compression scheme based on arithmetic coding of DCLT coefficients achieves characteristics close to the DCT-based JPEG scheme in both processing quality and computational complexity.

4. When statistical coding of DCT coefficients is performed according to the JPEG method, "jumps" in the discrete signal are least desirable in the central region of the processed fragments.

5. The multi-model (multi-stream) arithmetic coding method is highly effective in various data compression schemes and algorithms, and one of the key points in developing compression schemes is determining the rules for selecting the current coding model from the context of the already processed data. Thus, using the multi-model contextual arithmetic coding of DCT coefficients proposed in Chapter 4 within the JPEG scheme increases data compression efficiency by 10%.

6. When compressing images using multi-model coding of the tree structures of wavelet spectra, the model selection rule should be based on a combined context that takes into account both the neighborhood of the wavelet coefficient itself within its subband and the neighborhood of the "parent" coefficient. The new, effective lossy image compression algorithm built on this basis, developed from a study of the statistical properties of discrete wavelet transform spectra, shows high compression characteristics with an implementation complexity acceptable for a wide range of applications.

7. To eliminate inter-frame (temporal) redundancy of video data, the most suitable for practical use among the studied block motion compensation algorithms is the proposed hybrid directed search algorithm, in which small displacements are searched accurately and carefully, and large displacements more roughly.

8. The proposed video compression algorithm, developed on the basis of an RD-optimization approach with regard to the requirements and specifics of software implementation, achieves real-time video compression and reconstruction on modern personal computers with high processing quality.

In general, the dissertation work obtained new scientific results, the theoretical provisions of which made it possible to significantly develop and formalize procedures for the analysis and synthesis of digital image compression schemes based on the use of discrete orthogonal transformations. The developed approaches and recommendations led to the construction of specific compression schemes and algorithms, many of which were implemented in software and experimentally confirmed the effectiveness of their application.

List of main works on the topic of the dissertation

1. Efimov A.V., Umnyashkin S.V. Fast algorithms for calculating the discrete Chrestenson-Levy transform and estimating its spectral characteristics // Theory of Functions and Approximations: Proc. 7th Saratov Winter School (1994). Part 2. - Saratov: SSU Publishing House, 1995. - P. 9-20.

2. Efimov A.V., Pospelov A.S., Umnyashkin S.V. Application of the Chrestenson-Levy transform in problems of digital information processing // Intern. conf. "Function Spaces, Approximation Theory, Nonlinear Analysis", dedicated to the 90th anniversary of Academician S.M. Nikolsky (April 27 - May 3, 1995): Abstracts. - M.: MIPT Publishing House, 1995. - P. 124-125.

3. Umnyashkin S.V. Application of the discrete Chrestenson-Levy transform (DCLT) for image encoding: comparison with the discrete Fourier transform (DFT) // All-Russian sci.-tech. conf. "Electronics and Computer Science", November 15-17, 1995: Abstracts. - M.: MGIET (TU), 1995. - P. 265-266.

4. Umnyashkin S.V. Estimation of the dispersion of the spectrum elements of the discrete cosine transform of a stationary first-order Markov process // Intern. conf. on approximation theory dedicated to the memory of Prof. P.P. Korovkin (Kaluga, June 26-29, 1996): Abstracts. - V. 2. - Tver: TSU, 1996. - P. 217-218.

5. Umnyashkin S.V. Assessing the effectiveness of using unitary transformations for encoding discrete signals // Informatics and Communications: Collection of scientific works, ed. V.A. Barkhotkin. - M.: MIET, 1997. - P. 73-78.

6. Umnyashkin S.V. Assessing the effectiveness of using discrete transformations for data compression // "Electronics and Informatics - 97": Second All-Russian sci.-tech. conf. with international participation (Zelenograd, November 25-26, 1997): Abstracts. Part 2. - P. 79.

7. Efimov A.V., Pospelov A.S., Umnyashkin S.V. Theoretical foundations and some features of the application of discrete multiplicative transformations in problems of digital image compression // Proceedings of the international conference "Optimization of Computations" (June 6-8, 1997, Kiev). - Kiev: V.M. Glushkov Institute of Cybernetics, 1997. - P. 108-112.

8. Efimov A.V., Pospelov A.S., Umnyashkin S.V. Some properties of multiplicative orthonormal systems used in digital signal processing // Proceedings of the V.A. Steklov Mathematical Institute of RAS. - V. 219. - 1997. - P. 137-182.

9. Efimov A.V., Umnyashkin S.V. On some properties of the generalized Haar system and assessing the effectiveness of using orthogonal transformations for data compression // Function Spaces. Differential Operators. Problems of Mathematical Education: Proceedings of the intern. conf. dedicated to the 75th anniversary of Corresponding Member of RAS Prof. L.D. Kudryavtsev. - Volume 1. - M.: Peoples' Friendship University of Russia, 1998. - P. 70-73.

10. Umnyashkin S.V. On the modification of the discrete cosine transform // Approximation Theory and Harmonic Analysis: Abstracts of the intern. conf.

11. Umnyashkin S.V. On modification of the discrete cosine transform // Izv. Tula State University. Ser. Mathematics. Mechanics. Informatics. - Tula: TulGU, 1998. - V. 4. - Issue 1. - P. 143-147.

12. Umnyashkin S.V., Kochetkov M.E. Analysis of the effectiveness of using discrete orthogonal transformations for digital coding of correlated data // News of universities. Electronics. - No. 6. - 1998. - P. 79-84.

13. Umnyashkin S.V. On clustering of correlated data // Information Technologies in Innovative Projects: Intern. conf. (Izhevsk, April 20-22, 1999): Proceedings. - Izhevsk: IzhSTU, 1999. - P. 59-65.

14. Umnyashkin S.V. Algorithm for clustering correlated data // VII Intern. conf. "Mathematics. Economics. Ecology. Education", Intern. symp. "Fourier Series and Their Applications": Abstracts. - Rostov: Rostov State Economic Academy, 1999. - P. 211-212.

15. Efimov A.V., Pospelov A.S., Umnyashkin S.V. Some issues of the application of multiplicative systems and transformations in problems of digital image processing // VII Intern. conf. "Mathematics. Economics. Ecology. Education", Intern. symp. "Fourier Series and Their Applications": Abstracts. - Rostov: Rostov State Economic Academy, 1999. - P. 154-155.

16. Umnyashkin S.V. Pseudocosine transform for compression of discrete signals // Information Technologies and Problems of Microelectronics: Collection of scientific works. - M.: MIET, 1999. - P. 158-170.

17. Umnyashkin S.V. Algorithm for compression of still images based on the discrete wavelet transform // VII International Conference "Mathematics. Computer. Education" (Dubna, JINR, January 24-29, 2000): Abstracts. - Moscow: Progress-Traditsiya, 2000. - P. 327.

18. Efimov A.V., Umnyashkin S.V. On the structure of some direct and inverse wavelet transforms // Theory of Approximation of Functions and Operators: Abstracts of the intern. conf. dedicated to the 80th birthday of S.B. Stechkin (Ekaterinburg, February 28 - March 3, 2000). - Ekaterinburg: UrSU, 2000. - P. 74-75.

19. Umnyashkin S.V., Strelkov F.V., Zhukov V.G. Three-step algorithms for searching for displaced image blocks // Information Technologies and Control Systems: Collection of scientific works, ed. V.A. Barkhotkin. - M.: MIET, 2000. - P. 47-55.

20. Umnyashkin S.V. Digital image compression using the discrete Chrestenson-Levy transform // Interindustry scientific and technical journal “Defense complex - scientific and technical progress of Russia”, No. 2, 2000. P.28-39.

21. Umnyashkin S.V., Kosmach M.V. Optimization of digital image encoding using the JPEG method // Izv. universities. Electronics. - 2000. - No. 4-5. - P. 139-141.

22. Umnyashkin S.V. Compensation for the movement of objects when compressing video data // “Electronics and computer science - XXI century” Third international scientific and technical conference: Proc. doc. – M.: MIET, 2000. – P. 365-366.

23. Umnyashkin S.V. Algorithm for searching moved blocks for encoding digital video images // Interbranch. n.-t. magazine “Defense complex - scientific and technological progress of Russia”, No. 3, 2001. – pp. 38-41.

24. Umnyashkin S.V. Using contextual arithmetic coding to increase data compression using the JPEG scheme // News of universities. Electronics. - No. 3. - 2001. – P. 96-99.

25. Umnyashkin S.V. Wavelet compression of digital images with prediction of statistical models // Izv. universities Electronics. - No. 5. - 2001. P.86-94.

26. Umnyashkin S.V. Algorithm for fractal coding of images in the field of wavelet transforms // Computer Mathematics. Optimization of calculations: Collection of scientific works. – Volume 1. – Kiv: Institute of Cybernetics im.

V.M. Glushkova, 2001. – P. 385-391.

27. Umnyashkin S.V. Image compression based on mixed predictive modeling of wavelet trees // Reports from Vxj University (Sweden) – Mathematics, natural sciences and technology. – Nr.11 (September), 2001. – 18 pages.

28. Umnyashkin S.V., Strelkov F.V. An RD-optimized scheme for real-time video compression // Reports from Vxj University (Sweden) – Mathematics, natural sciences and technology. – Nr.12 (September), 2001. – 15 pages.

29. Development of algorithms and software for the implementation of digital analysis and compression of video images in real time based on a general-purpose hardware and software system: Research report (final) / NPP “Technology”; hands – Umnyashkin S.V. – “Guardian”; State No.

reg. 01990011697; Inv. No. 01077. – Moscow, 1999. – 74 p.

30. Research and development of lossy software data compression algorithms for digital video processing: Research report (final) / NPP “Technology”; hands – Umnyashkin S.V. – “Watch”; State No. reg.

01200004624; Inv. No. 100704. – Moscow, 2000. – 48 p.

Signed for publication: December 25, 2001. Order No. 332. Circulation: 100 copies. Publisher's sheets: 2.4. Format 60×84 1/

The transformations used to compress images must be fast and, if possible, easy to implement on a computer. This primarily means that such transformations should be linear: each transformed value c_i is a linear combination (a sum with certain factors, or weights) of the original quantities (pixels) d_j, and each factor w_ij is a fixed number (a transform coefficient). That is, c_i = Σ_j d_j·w_ij, where i, j = 1, 2, …, n. For n = 4, for example, this transformation can be written in matrix form, which in the general case reads C = W·D. Each row vector of the matrix W is called a "basis vector".
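As an illustrative sketch (the particular 4×4 matrix here anticipates the worked example later in this section and is an assumption at this point), the relation C = W·D is simply a matrix-vector product:

```python
import numpy as np

# Linear transform: each c_i is a weighted sum of the pixels d_j,
# c_i = sum_j w_ij * d_j, written compactly as C = W @ D.
# The matrix below is the one used in the worked example that follows.
W = np.array([[1,  1,  1,  1],
              [1,  1, -1, -1],
              [1, -1, -1,  1],
              [1, -1,  1, -1]])

D = np.array([4, 6, 5, 2])   # original pixel values
C = W @ D                    # transformed values: [17, 3, -5, 1]
print(C)
```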

An important task is determining the transform coefficients w_ij. The main requirement is that after the transformation the value c_1 should be large, while all the other values c_2, c_3, … become small. The basic relation c_i = Σ_j d_j·w_ij implies that c_i will be large if the weights w_ij reinforce the corresponding values d_j. This happens, for example, when the components w_ij and d_j have similar magnitudes and identical signs. Conversely, c_i will be small if the weights are small, or if half of them have a sign opposite to that of the corresponding d_j. Thus a large c_i means that the basis vector (w_i1, …, w_in) is similar to the original vector (d_1, …, d_n), while a small c_i means that its components differ strongly from the d_j. The basis vectors can therefore be interpreted as a tool for extracting characteristic features of the original vector.

In practice, the weights w_ij should not depend on the source data; otherwise they would have to be added to the compressed file for use by the decoder. This consideration, together with the fact that the input data are pixels, i.e. non-negative quantities, determines how the basis vectors are chosen. The first vector, the one that generates c_1, must consist of close (possibly identical) numbers: it reinforces the non-negative pixel values. All the other basis vectors should consist half of positive numbers and half of negative numbers; multiplying nearly equal pixel values by such weights and summing the products yields a small number. (This is especially true when the source values are close to one another, and we know that neighboring pixels tend to have close values.) Recall that the basis vectors are a tool for extracting features from the source data. A good choice is therefore a set of basis vectors that are very different from one another and can thus extract different features. This leads to the idea that the basis vectors should be mutually orthogonal; if the transform matrix W consists of orthogonal vectors, the transformation is called orthogonal. A further observation guiding the choice of basis vectors is that they should have progressively higher frequencies of sign changes, so as to extract what may be called the high-frequency characteristics of the data being compressed.
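Mutual orthogonality is easy to verify numerically: all dot products between distinct basis vectors must vanish. A minimal sketch, assuming the +1/−1 matrix used in the example that follows:

```python
import numpy as np

# Rows of W are the basis vectors. W is orthogonal when every pair of
# distinct rows has zero dot product, i.e. W @ W.T is a diagonal matrix.
W = np.array([[1,  1,  1,  1],
              [1,  1, -1, -1],
              [1, -1, -1,  1],
              [1, -1,  1, -1]])

gram = W @ W.T                        # dot products of all row pairs
is_orthogonal = np.all(gram == np.diag(np.diag(gram)))
print(bool(is_orthogonal))            # the off-diagonal entries are all zero
```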

Consider, for example, the matrix

W = | 1  1  1  1 |
    | 1  1 -1 -1 |
    | 1 -1 -1  1 |
    | 1 -1  1 -1 |

The first basis vector (the top row of W) consists of all ones, so its frequency is zero. Each of the other vectors has two +1s and two −1s, so they will produce small transformed values, and their frequencies (measured by the number of sign changes along the row) increase. This matrix is similar to the Hadamard-Walsh transform matrix (see equation (3.11)). As an example, let us transform the initial vector (4, 6, 5, 2):

W·(4, 6, 5, 2)ᵀ = (17, 3, −5, 1)ᵀ.

The result is quite encouraging: the number c_1 = 17 is large (compared with the original data), while the other numbers are small. Let us compute the energies of the original and transformed data. The initial energy is 4² + 6² + 5² + 2² = 81, while after the transformation the energy is 17² + 3² + (−5)² + 1² = 324, four times greater. Energy can be preserved by multiplying the transform matrix W by the factor 1/2. The new product (1/2)W·(4, 6, 5, 2)ᵀ equals (17/2, 3/2, −5/2, 1/2)ᵀ. Now the energy is conserved and is concentrated in the first component, which accounts for 8.5²/81 ≈ 89% of the total energy of the original data, whereas in the original data the first component accounted for only 20%.
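The energy bookkeeping can be checked directly; a short sketch, assuming the scaled matrix (1/2)W described in the text:

```python
import numpy as np

# Scaling the +1/-1 matrix by 1/2 makes the transform energy-preserving.
W = np.array([[1,  1,  1,  1],
              [1,  1, -1, -1],
              [1, -1, -1,  1],
              [1, -1,  1, -1]]) / 2.0

d = np.array([4.0, 6.0, 5.0, 2.0])
c = W @ d                          # [8.5, 1.5, -2.5, 0.5]

energy_in = np.sum(d ** 2)         # 81.0
energy_out = np.sum(c ** 2)        # 81.0: energy is conserved
share = c[0] ** 2 / energy_out     # ~0.89 of the energy sits in c[0]
print(energy_in, energy_out, round(float(share), 2))
```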

Another advantage of the matrix W is that it also performs the inverse transformation: the scaled matrix (1/2)W is its own inverse, so the original data (4, 6, 5, 2) are restored by the product (1/2)W·(17/2, 3/2, −5/2, 1/2)ᵀ.

We are now in a position to appreciate the merits of this transformation. Quantizing the transformed vector (8.5, 1.5, −2.5, 0.5) by rounding to integers gives (9, 1, −3, 0); the inverse transformation then yields (3.5, 6.5, 5.5, 2.5), which is close to the original (4, 6, 5, 2). In a similar experiment we simply drop the two smallest numbers, obtaining (8.5, 0, −2.5, 0); the inverse transformation of this coarsely quantized vector gives the reconstructed data (3, 5.5, 5.5, 3), also quite close to the original. Our conclusion: even this simple, intuitive transformation is a good tool for "squeezing" redundancy out of the original data. More sophisticated transformations produce results that allow the data to be recovered with a high degree of similarity even under very coarse quantization.
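Both experiments can be reproduced in a few lines; since the scaled matrix is its own inverse, the same product performs the reconstruction (a sketch, assuming the 4×4 matrix from the example):

```python
import numpy as np

# Scaled transform matrix: orthogonal and its own inverse (W @ W == I).
W = np.array([[1,  1,  1,  1],
              [1,  1, -1, -1],
              [1, -1, -1,  1],
              [1, -1,  1, -1]]) / 2.0

d = np.array([4.0, 6.0, 5.0, 2.0])
c = W @ d                                 # [8.5, 1.5, -2.5, 0.5]

# Experiment 1: the text's integer-rounded coefficients, then invert.
q = np.array([9.0, 1.0, -3.0, 0.0])
rec1 = W @ q                              # [3.5, 6.5, 5.5, 2.5]

# Experiment 2: zero the two smallest coefficients, then invert.
coarse = np.array([8.5, 0.0, -2.5, 0.0])
rec2 = W @ coarse                         # [3.0, 5.5, 5.5, 3.0]

print(rec1, rec2)                         # both are close to the original d
```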


