Machine translation. Cars - retired

Contents:
Introduction
1.1 What is machine translation?
1.2 Start of machine translation
1.3 Stages of development of machine translation
1.4 Modern machine translation
1.5 Machine translation on the Internet
Conclusion
Literature

Introduction.
The mechanization of translation is one of humanity's oldest dreams, and in the 20th century it became a reality. This is largely due to society's constant drive toward globalization, and even to ethnic conflicts and political cataclysms, to the strengthening of socio-economic ties between states, and to the integration of many previously “closed” countries into the world community. Knowledge of foreign languages is not only a useful everyday skill but also one of the basic requirements when applying for a job, and the need to know one or even several foreign languages is becoming increasingly urgent. A foreign language (English or German, for example) is needed not only when traveling abroad on vacation, but also when receiving business partners from abroad and in everyday life, when reading the news or watching films. As a result, many routine, everyday operations that previously required no knowledge of a foreign language are becoming increasingly difficult for anyone who relies on a single language. In this regard, the services of translators who produce professional translations into English, German and other languages are increasingly in demand. However, knowledge of foreign languages alone is no longer enough, since the volume of information that needs to be translated every day has grown significantly. Even so, the task is being solved: anyone can translate a contract or the content of a foreign website in just a few seconds, because the translation is carried out by a translation program. The person does not even have time to blink, and the translation is ready.
But today, as before, reality is not perfect. There is not a single machine translation system that, with the click of just a few buttons, can produce a flawless translation of any text in any language without human intervention or at least editing. For now, these are only plans for the distant future, if such an ideal can be achieved at all, since many question this assumption.

1.1 What is machine translation?

Machine translation is a translation process performed by a special computer program that allows you to convert text in one natural language into equivalent text in another language. This is also the name of the direction of scientific research related to the construction of such systems.
Modern machine (automatic) translation can be classified by how the computer program interacts with a person:

      With post-editing, when the source text is processed by a machine, and a human editor corrects the result.
      With pre-editing, when a person adapts the text for processing by a machine, for example, eliminating possible ambiguous readings, simplifying and marking up the text, after which software processing begins.
      With inter-editing, in which a person intervenes in the operation of the translation system, resolving difficult cases.
      Mixed systems, including, for example, simultaneous pre- and post-editing.
The main goal of machine translation as a science is to develop an algorithm that completely automates the translation process.
To carry out machine translation, a special program that implements the translation algorithm is loaded into the computer. The algorithm is understood as a sequence of unambiguous, strictly defined operations on the text that find translation correspondences in a given pair of languages L1 - L2 for a given direction of translation (from one specific language to another). The machine translation system includes bilingual dictionaries equipped with the necessary grammatical information (morphological, syntactic and semantic) to ensure the transmission of equivalent, variant and transformational translation correspondences, as well as algorithmic grammatical analysis tools that implement one of the formal grammars accepted for automatic text processing. There are also machine translation systems designed to translate among three or more languages, but these are currently experimental.
The most common is the following sequence of formal operations that provide analysis and synthesis in a machine translation system:
1. At the first stage, text is entered and a search for input word forms (words in a specific grammatical form, for example, the dative plural) is carried out in the input dictionary (dictionary of the language from which the translation is made) with accompanying morphological analysis, during which it is established that the given word form belongs to a certain lexeme (a word as a unit of vocabulary). In the process of analysis, information related to other levels of organization of the language system can also be obtained from the form of a word.
2. The next stage includes the translation of idiomatic phrases, phraseological units or cliches of a given subject area; the determination of the basic grammatical (morphological, syntactic, semantic and lexical) characteristics of the elements of the input text, carried out within the framework of the input language; the resolution of homography (conversion homonymy of word forms: for example, English "round" can be a noun, adjective, adverb, verb or preposition); and lexical analysis and translation of lexemes. Typically, at this stage, single-valued words are separated from polysemous words (those having more than one translation equivalent in the target language). Single-valued words are then translated using lists of equivalents, while polysemous words are translated using so-called contextual dictionaries, whose entries are algorithms that query the context for the presence or absence of contextual determinants of meaning.
3. Final grammatical analysis, during which the necessary grammatical information is determined taking into account the data of the target language (for example, with the Russian nouns for "sled" and "scissors" the verb must be in the plural form, although the original may have a singular form).
4. Synthesis of output word forms and sentences as a whole in the target language.
Depending on the characteristics of the morphology, syntax and semantics of a particular language pair, as well as the direction of translation, the general translation algorithm may include other stages, as well as modifications of these stages or the order of their occurrence, but variations of this kind in modern systems are usually insignificant. Analysis and synthesis can be carried out both phrase by phrase and for the entire text entered into the computer memory; in the latter case, the translation algorithm provides for the identification of so-called anaphoric connections.
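The four stages listed above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the algorithm of any real system: the English-French language pair, all dictionary entries, the names INPUT_DICT, CONTEXTUAL, TARGET_FORMS and the function translate are invented for the example, homography is ignored, and all nouns are assumed to be masculine so that the article only has to agree in number.

```python
# Toy English-to-French pipeline; every dictionary entry and rule is illustrative.

# Stage 2 resource: idiomatic phrases translated as a whole (two-word idioms only).
IDIOMS = {("at", "last"): "enfin"}

# Stage 1 resource: input dictionary, word form -> (lexeme, part of speech, number).
# Homography is ignored here: "sleep" is simply listed as a plural verb form.
INPUT_DICT = {
    "the": ("the", "det", None),
    "cat": ("cat", "noun", "sg"), "cats": ("cat", "noun", "pl"),
    "spring": ("spring", "noun", "sg"), "springs": ("spring", "noun", "pl"),
    "sleeps": ("sleep", "verb", "sg"), "sleep": ("sleep", "verb", "pl"),
    "breaks": ("break", "verb", "sg"), "break": ("break", "verb", "pl"),
}

# List of equivalents for single-valued lexemes.
EQUIVALENTS = {"the": "le", "cat": "chat", "sleep": "dormir", "break": "casser"}

# Contextual dictionary for polysemous lexemes: each entry is a list of
# (required context lexemes, translation); the empty set acts as the default.
CONTEXTUAL = {"spring": [({"break"}, "ressort"), (set(), "printemps")]}

# Stage 4 resource: target lexeme -> word form for each grammatical number.
TARGET_FORMS = {
    "le": {"sg": "le", "pl": "les"},
    "chat": {"sg": "chat", "pl": "chats"},
    "ressort": {"sg": "ressort", "pl": "ressorts"},
    "printemps": {"sg": "printemps", "pl": "printemps"},
    "dormir": {"sg": "dort", "pl": "dorment"},
    "casser": {"sg": "casse", "pl": "cassent"},
}


def translate(sentence):
    words = sentence.lower().split()
    units, i = [], 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in IDIOMS:
            # Stage 2: an idiomatic phrase is translated as a single unit.
            units.append(("idiom", IDIOMS[pair]))
            i += 2
        else:
            # Stage 1: dictionary lookup with morphological analysis
            # (an unknown word form raises KeyError in this sketch).
            units.append(("word", INPUT_DICT[words[i]]))
            i += 1

    lexemes = {data[0] for kind, data in units if kind == "word"}
    # Stage 3: the noun's number governs the article and the verb in the target.
    number = next(
        (data[2] for kind, data in units if kind == "word" and data[1] == "noun"),
        "sg",
    )

    output = []
    for kind, data in units:
        if kind == "idiom":
            output.append(data)
            continue
        lexeme = data[0]
        if lexeme in CONTEXTUAL:
            # Stage 2: query the context for determinants of meaning.
            target = next(t for ctx, t in CONTEXTUAL[lexeme] if ctx <= lexemes)
        else:
            target = EQUIVALENTS[lexeme]
        # Stage 4: synthesis of the output word form.
        output.append(TARGET_FORMS[target][number])
    return " ".join(output)
```

With these toy dictionaries, translate("At last the cats sleep") returns "enfin les chats dorment", and translate("The spring breaks") returns "le ressort casse", because the context lexeme "break" selects the mechanical sense of "spring".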
Modern machine translation should be distinguished from the use of computers to assist human translators. In the latter case, we mean an automatic dictionary that helps a person quickly select the desired translation equivalent. Although in both cases the computer works together with a person (translator or editor), the term “machine translation” implies that the main part of the work of translating and finding translation equivalents and correspondences is carried out by the machine itself, leaving the person only to check and correct mistakes. A computer dictionary, by contrast, is a purely auxiliary tool for quickly finding translation matches, although dictionaries of this kind can implement, to a limited extent, some functions inherent in machine translation systems.

1.2 Start of machine translation.

Machine translation, as a scientific field, has a history of almost a century, although the first ideas for automating the translation process appeared in the 17th century.
As is generally accepted, machine translation emerged because of the rapidly growing flow of information in the languages of different countries and continents since the second half of the 20th century, the need to assimilate it for the sake of scientific and technological progress, the shortage of qualified translators (especially in certain fields), and the high cost of training them.
The English inventor Charles Babbage, who in the late 1830s proposed a design for the first computer in history, was the first to think about new methods of translation. The essence of his idea was to use the machine's memory to store dictionaries: a memory of 1000 fifty-digit decimal numbers (50 gears in each register) could be used for this purpose. However, Babbage never succeeded in bringing his idea to life.
The theoretical basis of the initial period of work on machine translation was the view of language as a code system. The pioneers of machine translation were mathematicians and engineers. Descriptions of their first experiments using the newly emerging computers to solve cryptographic problems were published in the USA in the late 1940s. The birth date of machine translation as a research field is usually considered to be March 1947. It was then that Warren Weaver, director of the natural sciences division of the Rockefeller Foundation, wrote a letter to Norbert Wiener in which he first posed the problem of machine translation, comparing it to the problem of decryption and identifying translation from one language to another as a further area of application for decryption technology.
This was followed by a heated discussion of the idea of automated translation and the theoretical development of the first technologies. It was suggested that human translators would be completely replaced by electronic systems, and many professional translators feared being unemployed in the near future. Weaver's ideas formed the basis of an approach to machine translation based on the concept of an interlingua: the transfer of information is divided into two stages. At the first stage, the source sentence is translated into an intermediary language (created on the basis of simplified English), and at the second, the result of this translation is rendered in the target language.
The same Warren Weaver, after a series of discussions, drew up a memorandum in 1949 in which he theoretically substantiated the fundamental possibility of creating machine translation systems. The machines used for translation in those years were quite different from modern systems: they were very large and expensive computers that occupied entire rooms and required a large staff of engineers, operators and programmers for their maintenance. These computers were mainly used to carry out mathematical calculations for the needs of military institutions, as well as the mathematics and physics departments of universities (the latter were also closely tied to the military sphere). Therefore, in the early stages, the development of machine translation was actively supported by the military; in the USA the main attention was paid to the Russian-English direction, and in the USSR to the English-Russian direction.
In addition to obvious practical needs, an important role in the development of machine translation was played by the famous test of intelligence (the “Turing test”), proposed in 1950 by the English mathematician A. Turing. It effectively replaced the question of whether a machine can think with the question of whether a machine can communicate with a person in natural language in such a way that the person cannot distinguish it from a human interlocutor. Thus, for decades, issues of computer processing of natural language messages became the focus of research in cybernetics (and subsequently in artificial intelligence), and productive cooperation was established between mathematicians, programmers and computer engineers, on the one hand, and linguists, on the other.
Soon, funding for research began, and in 1952 the first conference on machine translation, organized by the logician and mathematician Y. Bar-Hillel, was held at the Massachusetts Institute of Technology.
In 1954, the first results were presented to the public: IBM, together with Georgetown University (USA), successfully carried out the first experiment. It went down in history as the so-called Georgetown experiment, in which the first version of an electronic translator was presented. The experiment demonstrated fully automatic translation of more than 60 sentences from Russian to English. The presentation had a positive impact on the development of machine translation over the next 12 years.
The experiment was designed and prepared to attract public and government attention. Paradoxically, it was based on a rather simple system: it used only 6 grammar rules, and the dictionary included 250 entries. The system was specialized: organic chemistry was chosen as the subject area for translation. The program ran on an IBM 701 mainframe.
In the same 1954, the first experiment on machine translation was carried out in the USSR by I.K. Belskaya (linguistic part) and D.Yu. Panov (software part) at the Institute of Precision Mechanics and Computer Science of the USSR Academy of Sciences, and the first industrially suitable machine translation algorithm and a machine translation system from English into Russian on a universal computer were developed by a team led by Yu.A. Motorin. After this, work began in many information institutes, scientific and educational organizations in the country. The work in this area by domestic linguists such as I.A. Melchuk and Yu.D. Apresyan (Moscow), which resulted in the linguistic processor ETAP, deserves special mention. In 1960, an experimental machine translation laboratory was organized as part of the Research Institute of Mathematics and Mechanics in Leningrad, which was later transformed into the Laboratory of Mathematical Linguistics of Leningrad State University.
The Georgetown experiment demonstration was widely reported in the mass media and was perceived as a success. It influenced the decisions of some governments, above all that of the USA, to invest in computational linguistics. The organizers of the experiment assured that within three to five years the problem of machine translation would be solved. The idea of machine translation stimulated the development of research in theoretical and applied linguistics around the world. Theories of formal grammars appeared, and much attention was paid to the modeling of language and its individual aspects, linguistic and mental activity, issues of linguistic form and the quantitative distributions of linguistic phenomena. New areas of linguistic science emerged: computational, mathematical, engineering, statistical and algorithmic linguistics, and a number of other branches of applied and theoretical linguistics. During the 1950s, departments of applied linguistics and machine translation were opened in educational centers around the world. In the USSR, such departments were created in Moscow (MSU named after M.V. Lomonosov and Moscow State Pedagogical Institute named after M. Thorez, now MSLU), in Minsk (Minsk State Pedagogical Institute of Foreign Languages), in Yerevan, in Makhachkala, at Leningrad University, at the universities of Kyiv, Kharkov and Novosibirsk, and in a number of other cities. Research and development in machine translation has also taken place in France, England, the USA, Canada, Italy, Germany, Japan, the Netherlands, Bulgaria, Hungary and other countries, as well as in international organizations that handle a large volume of translations from various languages. Currently, research is also being conducted in countries such as Malaysia, Saudi Arabia and Iran.

1.3 Stages of development of machine translation.

As a result of such a successful start to the development of machine translation, it seemed that the creation of high-quality automatic translation systems was quite achievable within a few years. At the same time, the emphasis was on the development of fully automatic systems providing high-quality translations; human involvement in the post-editing phase was seen as a temporary compromise. Professional translators seriously feared that they would soon be left without work...
However, machine translation research has experienced both ups and downs throughout its history. In the 1950s, significant investments were made in research, but the results quickly disappointed investors. One of the main reasons for the low quality of machine translation in those years was the limited capabilities of hardware: a small amount of memory with slow access to the information contained in it, and the inability to make full use of high-level programming languages. Another reason was the lack of the theoretical framework necessary to solve linguistic problems. As a result, the first machine translation systems were reduced to word-by-word translation of texts without any syntactic, much less semantic, integrity.
In 1959, the philosopher Y. Bar-Hillel argued that high-quality, fully automatic translation could not be achieved in principle. He proceeded from the fact that the choice of one translation or another is determined by knowledge of extra-linguistic reality, and this knowledge is too extensive and diverse to be entered into a computer. However, Bar-Hillel did not reject the idea of machine translation as such; he considered the development of machine systems designed for use by a human translator (a kind of “human-machine symbiosis”) to be a promising direction. Nevertheless, his statement had a most unfavorable impact on the development of machine translation in the United States. In the early 1960s, the initial euphoric stage in the development of MT ended. This was greatly facilitated by the publication of the so-called “Black Book of Machine Translation”, a report by the Automatic Language Processing Advisory Committee (ALPAC) of the US National Academy of Sciences, which stated the impossibility of creating universal high-quality machine translation systems in the foreseeable future. The committee came to the conclusion that machine translation was unprofitable: the ratio of cost to quality was clearly not in its favor, and there were sufficient human resources for the needs of translating technical and scientific texts. The consequence of this publication was a reduction in funding and a general decline in interest in the problems of machine translation, but research, especially theoretical research, was not curtailed completely. The first translation systems also continued to be used in military and scientific institutions of the USSR and the USA.
A new stage in the development of machine translation technologies began in the 1970s. This rise was associated with advances in computing technology: the emergence of microcomputers, the development of networks, and an increase in memory resources. Programmers abandoned the idea of creating an “ideal” translating machine: new systems were developed with the goal of greatly increasing the speed of translation, but with the obligatory participation of a person at various stages of the translation process to achieve the best quality.
The revival of machine translation in the 1970s and 1980s is indicated by the following facts. The Commission of the European Communities (CEC) bought the English-French version of Systran, as well as a translation system from Russian into English (the latter was developed further after the ALPAC report and continued to be used by the US Air Force and NASA); in addition, the CEC commissioned the development of French-English and Italian-English versions. At that time, thanks to the CEC, the foundations of the EUROTRA project were laid, based on the developments of the SUSY and GETA groups. At the same time, work on machine translation systems expanded rapidly in Japan; in the USA, the Pan American Health Organization (PAHO) ordered the development of a Spanish-English system (SPANAM); the US Air Force funded the development of an MT system at the Linguistics Research Center at the University of Texas at Austin; and the TAUM group in Canada made significant progress in developing its METEO system (used primarily for translating weather reports). A number of projects started in the 1970s and 1980s subsequently developed into full-fledged commercial systems. In our country, the development of the fundamentals of machine translation technology was continued by a group of specialists at VINITI under the leadership of Professor G. G. Belonogov. As a result, in 1993, an industrial version of the RETRANS system for phraseological machine translation from Russian into English and vice versa was created, which was used in the ministries of defense, railways, science and technology, as well as in the All-Russian Scientific Information Center.
The next stage of research in the field of machine translation was the 90s of the last century. This is, of course, connected with the colossal progress of modern personal computers, the emergence of high-quality scanners and effective optical text recognition programs accessible to the mass user and, of course, with the advent of the global computer network Internet. All this gave new impetus to work on machine translation, attracted new significant investments into this area and resulted in serious practical results. Namely, quite effective machine translation systems and computer dictionaries have appeared for working on a personal computer; machine translation systems were combined with optical text recognition and spell checking systems. Special machine translation tools have been created for working on the Internet, providing either translation of texts on the servers of relevant companies, or online translation of Web pages, allowing one to overcome the language barrier and navigate through foreign sites.

1.4 Modern machine translation.

Today's translation programs have much broader capabilities and are based on more advanced translation technologies. Translation systems are actively used all over the world in cases where it is necessary to quickly understand the meaning of a text or to translate large amounts of information on a regular basis. Some developers have managed to achieve quite acceptable translation quality for certain language pairs.
In translation practice and in information technology, there are two main approaches to machine translation. On the one hand, machine translation results can be used to briefly familiarize yourself with the content of a document in an unknown language. In this case, it can be used as signal information and does not require careful editing. Another approach involves using machine translation instead of regular human translation. This involves careful editing and customization of the translation system for a specific subject area. The completeness of the dictionary, its focus on the content and set of linguistic means of the translated texts, the effectiveness of methods for resolving lexical ambiguity, the effectiveness of algorithms for extracting grammatical information, finding translation correspondences and synthesis algorithms play a role here. In practice, translation of this type becomes cost-effective if the volume of translated texts is large enough, if the texts are sufficiently homogeneous, the system dictionaries are complete and allow further expansion, and the software is convenient for post-editing. This kind of machine translation systems is used in organizations whose needs for prompt and high-quality translations are quite large.
Within the framework of machine translation technology, there are two approaches: traditional (rule-based) and statistical (based on statistical processing of dictionary databases). The traditional MT method is used by most translation system developers. The work of such a program includes several stages and, in essence, consists of using linguistic rules (algorithms). Accordingly, the creation of such an electronic translator includes the development of rules and replenishment of the system’s dictionary databases. The quality of the output translation depends on the development of the necessary algorithms. The rich vocabulary of the system also allows you to cope with the translation of a wide variety of texts. The statistical method operates on a completely different principle. It is based on mathematical methods for obtaining translation. More precisely, the entire operating principle of such a system is based on statistical calculation of the probability of matches of phrases from the source text with phrases that are stored in the translation system database.
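The statistical principle described above can be illustrated with a minimal sketch. It assumes a hypothetical list of aligned phrase pairs (ALIGNED_PAIRS), as if extracted from a parallel corpus, and estimates the probability of each translation as its relative frequency; the function names are invented for the example, and a real statistical system would combine such estimates with other models.

```python
from collections import Counter, defaultdict

# Hypothetical aligned phrase pairs, as if extracted from a parallel corpus.
ALIGNED_PAIRS = [
    ("good morning", "bonjour"),
    ("good morning", "bonjour"),
    ("good morning", "bon matin"),
    ("thank you", "merci"),
]


def build_phrase_table(pairs):
    """Estimate P(target | source) as a relative frequency of observed pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def best_translation(table, phrase):
    """Return the most probable (translation, probability), or None if unseen."""
    candidates = table.get(phrase)
    if not candidates:
        return None
    return max(candidates.items(), key=lambda item: item[1])


PHRASE_TABLE = build_phrase_table(ALIGNED_PAIRS)
```

Here best_translation(PHRASE_TABLE, "good morning") selects "bonjour", which was seen in two of the three aligned pairs, while a phrase absent from the database yields None. Unlike the rule-based approach, nothing in this sketch encodes grammar: quality depends entirely on the size and quality of the stored phrase database.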
In Russia, the software products of the PROMT company, the only manufacturer of translation programs in our country, are developed using the traditional method of machine translation. Currently, PROMT is a leading developer of automated translation systems and has enormous technological expertise, which allows it to develop translation systems with different functionality. Unique technologies for constructing translation systems and original algorithms for working with texts in natural languages became the basis on which all the company's software products were created, and made it possible to develop a wide range of solutions for automated translation from one language to another. PROMT software products are equally useful for solving business problems and for home use. Recently, PROMT has been paying special attention to the creation of special tools and technologies for professional translators. Currently, PROMT systems perform translations for 24 language directions. The general dictionary for one language pair contains from 40 to 200 thousand dictionary entries, which in turn contain a structured description of the various linguistic information necessary for the system's complex text analysis and synthesis algorithms. Dictionaries by topic contain specific words and expressions characteristic of the subject area; their volume can vary from 5 to 50 thousand dictionary entries. For example, specialized dictionaries covering more than 50 different topics have been developed for the English-Russian and Russian-English systems.

1.5 Machine translation on the Internet.

Online translation of information on the Internet is becoming increasingly popular. The Internet is rapidly transforming from a predominantly English-speaking environment into a multilingual one, forcing Web site owners to provide information in multiple languages. Most often, information and search sites that seek to attract multilingual users to their pages resort to MT services. For example, a new translation service has opened on the Canadian information retrieval portal InfiniT (http://www.infiniT.com). The website now offers online translation of text from English and German into French and vice versa. The increase in the number of visitors to the portal is attributed to the possibility of online translation of Web pages. To use it, the user just needs to indicate the address of the Web page, select the direction of translation and click the translation button. As a result, in a few seconds the user receives a fully translated Web page with formatting preserved.
The new service helps to eliminate the language problem on the Canadian Internet, where, for historical reasons, two languages are widely used: English and French. In addition, the online translator provides access to sites in German to those residents of Canada who do not speak foreign languages. The service runs on the PROMT Internet server solution called PROMT Internet Translation Server version 2.0. The project was implemented jointly with the Softissimo company, which promotes PROMT products under the REVERSO brand. An interesting feature of Web sites introducing MT programs, electronic dictionaries and other linguistic support programs is that you can get acquainted with the work of many software products interactively, using the version installed on the server and accessed remotely through a Web interface. On the server of the Web publishing house "InfoArt" (http://www.infoart.ru/misc/dict), an interactive demonstration of the Lingvo and MultiLex dictionaries was organized. You can enter a word or phrase and instantly get a translation, interpretation, examples of use and common phrases.
The most universal is PROMT Internet. By purchasing this package, you will receive several programs for translating Web pages, and not only them. It is safe to say that the capabilities of this set of applications are quite sufficient for full-fledged work with documents in English, French and German. If you plan to use the universal translation program WebTranSite 98 or the WebView browser more than other parts of the PROMT Internet package, and at the same time want to save some money, you can purchase these products separately. In this case, WebTranSite 98 will appeal to those who often translate small fragments of text not only from the Internet, but also from office, email and other programs, as well as from online help systems.
WebTranSite 98 is suitable for more than just translating Web pages. It is quite universal and allows you to process fragments
etc.................

Product overview

With the advent of writing, people received a powerful tool for preserving knowledge and for communication. The first writings that have come down to us on the walls of temples and tombs tell about the deeds of kings and generals that took place many centuries ago. In addition, people recorded the results of economic activities in order to successfully trade, collect taxes, etc.

To facilitate written communication between peoples, the first dictionaries were created.

One of these dictionaries was written by Sumerian priests on clay tablets.

Each tablet was divided into two equal parts. On one side, a Sumerian word was written, and on the other, a word of similar meaning in another language, sometimes with a brief explanation. From those times to the present day, the structure of dictionaries has remained virtually unchanged.

With the advent of the personal computer, electronic dictionaries began to be created, making it easier to find the right word and offering many new useful functions (voicing the word, searching for synonyms, etc.).

Recently, the possibilities and prospects of machine translation (MT) technologies have been actively discussed. Both professional translators and MT system manufacturers take part in the discussions. Let's try to evaluate the capabilities of MP, based on the experience of using real systems.

To be fair, it should be noted that in the foreseeable future, machine technology will not be able to completely replace the human translator. In terms of translation quality, MT programs cannot compete with humans. However, such programs can significantly increase the efficiency of a translator's work.

Based on the formal description of languages, the program analyzes text in one language and then synthesizes a phrase in another. Analysis and synthesis algorithms, as a rule, are quite complex and are controlled by dictionary information assigned to lexical units in the system dictionaries for both the language of the source text and the language of its translation.

Where are MT systems used? Firstly, translation programs can be used to quickly translate text in order to understand its meaning.

Of course, the quality of machine translation cannot compare with a translation made by a person, but the user receives an answer “here and now.” In addition, with the help of MT systems, you can read information posted on foreign websites, as well as understand the text of a letter written in French, German, Japanese or another language.

What benefits does the use of MT systems provide, and where is it most appropriate? MT systems, by using a common vocabulary base for translation, significantly reduce the cost of maintaining uniform terminology and, consequently, of editing. In this case, the technical editor receives from the MT system a translation made in a single style. Thus, machine translation systems are most effective for organizing the translation of large arrays of similar documents in a short time, ensuring uniformity of terminology and style throughout the entire array.

Whether an MT system can be used is determined by its ability to adapt to documents on various topics. The quality of the resulting translation largely depends on the settings. In addition to the general lexical dictionary, specialized dictionaries should be used that reflect both the topic of translation and the particulars of the documents being translated. In addition, the quality of translations depends on the ability of the translator to create custom dictionaries, which should include terminology specific to the documentation, as well as frequently occurring phrases and expressions (micro-segments) that cannot be translated formally.

Such a setting guarantees the quality at which the use of MT becomes effective for solving “industrial” translation problems.
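The effect of layering a custom dictionary and a specialized dictionary over the general one can be illustrated with a minimal sketch. It is not how any of the reviewed products is implemented: the three toy dictionaries, the function names `lookup` and `translate_terms`, and the greedy longest-match strategy for micro-segments are all assumptions made for illustration.

```python
# Hypothetical toy dictionaries (English -> Russian); real MT dictionaries
# also carry grammatical information for each lexical unit.
GENERAL = {"plant": "растение", "line": "линия"}
SPECIALIZED = {"plant": "завод"}                          # industrial topic
CUSTOM = {"production line": "производственная линия"}    # user micro-segment


def lookup(term, layers):
    """Return the translation from the first dictionary that contains term."""
    key = term.lower()
    for dictionary in layers:          # layers are ordered by priority
        if key in dictionary:
            return dictionary[key]
    return None


def translate_terms(text, layers, max_len=3):
    """Greedy longest match: multi-word entries win over single words."""
    words = text.split()
    result = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            hit = lookup(" ".join(words[i:i + n]), layers)
            if hit is not None:
                result.append(hit)
                i += n
                break
        else:
            result.append(words[i])    # unknown word is left untranslated
            i += 1
    return result
```

With the layers ordered as `[CUSTOM, SPECIALIZED, GENERAL]`, the word "plant" is rendered with the industrial term and "production line" is taken as a whole phrase; with `[GENERAL]` alone, the same text falls back to the general meaning and leaves "production" untranslated.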

To evaluate the effectiveness of using MT systems, PROMT provided its PROMT 2000 Translation Office system to the LONIIS translation center. The experiment showed that the use of MT can roughly halve the total project completion time.

It should be noted that there are a number of restrictions on the use of MT systems. It makes no sense to translate literary texts, proverbs and sayings using a translator program. Small texts on various topics are also better translated in the traditional way.

PROMT Translation Office 2000

PROMT Translation Office 2000 (hereinafter referred to as PROMT), priced at $300, is a set of professional tools that provides translation from major European languages ​​into Russian and vice versa. With its help, you can not only translate, but also edit the translation and work with dictionaries of all language areas at the same time.

PROMT includes the following collections of dictionaries:

  • "Light industry" ($180);
  • "Heavy Industry" ($180);
  • Commerce ($99);
  • Science ($120).

To ensure high quality translation, the PROMT system provides the ability to configure the translation of a specific text - by connecting specialized subject dictionaries, supplied separately, as well as creating your own custom dictionaries. A convenient means of setting up the system is also the ability to select the subject of the document: which dictionaries to connect, which words to leave without translation, and how to process special constructions such as email address, date and time.
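The idea of reserving words and leaving special constructions such as email addresses untranslated can be sketched as a preprocessing step that hides those fragments behind placeholders before translation and restores them afterwards. This is only an assumed illustration, not PROMT's actual mechanism: the placeholder format `__KEEP0__`, the simplified email pattern, and the function names `protect` and `restore` are hypothetical.

```python
import re

# Simplified email pattern, sufficient for the illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PLACEHOLDER_RE = re.compile(r"__KEEP(\d+)__")


def protect(text, reserved_words):
    """Hide emails and reserved words; return the masked text and saved items."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"__KEEP{len(saved) - 1}__"

    masked = EMAIL_RE.sub(stash, text)
    if reserved_words:
        # Longer words first so that a longer reserved word is not split.
        alternatives = sorted(reserved_words, key=len, reverse=True)
        pattern = r"\b(?:" + "|".join(re.escape(w) for w in alternatives) + r")\b"
        masked = re.sub(pattern, stash, masked)
    return masked, saved


def restore(text, saved):
    """Put the protected fragments back after translation."""
    return PLACEHOLDER_RE.sub(lambda m: saved[int(m.group(1))], text)
```

A translation engine would be called on the masked text between `protect` and `restore`; since the placeholders are not dictionary words, they pass through unchanged.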

The PROMT system includes the following modules:

  • PROMT - professional environment for translation;
  • Dictionary Editor - a tool for replenishing and editing dictionaries of machine translation systems of the PROMT family;
  • PROMT Electronic Dictionary is an electronic dictionary that provides the user with ample access to lexical and grammatical information collected in specialized dictionaries of the PROMT family. Can be used for any work with texts (for example, to quickly obtain information about translation equivalents of a given word or phrase);
  • WebView is a browser that allows you to get simultaneous translation of HTML pages when navigating the Internet. WebView contains two windows for displaying HTML pages: the top one displays the original page received from the Internet, the bottom one displays its translation, saving links, pictures, inserted objects, etc. You can follow links both in the upper window containing the source text and in the lower window containing the translation;
  • SmarTool is a tool that implements translation functions in Microsoft Office 97 (Word, Excel) and Microsoft Office 2000 (Word, Excel, PowerPoint, FrontPage, Outlook) applications. The translation menu and toolbar are built into all major Microsoft Office 2000 and Microsoft Office 97 applications, which allows you to get a translation of an open document directly in these applications;
  • QTrans is a program designed for quickly translating unformatted text. With its help you can easily and quickly translate typed text, a text file or the contents of the clipboard. To improve the quality of translation, you can select a suitable topic, connect specialized dictionaries and reserve words;
  • Clipboard Translator is a program designed to quickly translate text previously copied to the clipboard. The text can be copied from any Windows application (Help, Notepad, Word, Word Perfect, PageMaker, etc.);
  • “Integrator” is a means of access to all applications of the package.

Translation of a document in the PROMT system

The label marks the current paragraph of the source text and the translation of this paragraph (the current one is the one in which the cursor is currently positioned).

All documents with which the PROMT program works appear in document windows.

Several documents can be open at the same time, each in its own window (Fig. 4).

The completed translation can be refined using electronic dictionaries developed by other companies (if they are installed on your computer, of course). The following electronic dictionaries can be used:

  • Lingvo 6.0 (ABBYY program);
  • “Context 3.0” (program from the Informatik company);
  • "MultiLex 1.0, 2.0, 3.0" (program of the company "MediaLingua");
  • PROMT Electronic Dictionary 1.0 (program from PROMT).

When translating, the PROMT system does not use electronic dictionaries from other manufacturers. Therefore, if a word is not in the PROMT system dictionaries or you are not satisfied with the translation of a word or phrase, you can call up the electronic dictionary and use it as a reference.

The WebView browser is included in the package for translating HTML documents.

Sequence of actions when performing a translation

  1. Open the file with the source text or create a new document (new text can be typed directly in the PROMT window).
  2. Check the text breakdown into paragraphs (after translation, the paragraph formatting will remain the same).
  3. Check spelling and edit the source text if necessary.
  4. Select a topic template suitable for translating the given text (a topic template for a given translation direction is a set of dictionaries and a list of reserved words; it is installed to improve the quality of the translation).
  5. Specify the subject of the document by customizing its components:
    • connect the dictionaries that will be used when translating the text (if no dictionary is connected, only the general dictionary will be used for translation);
    • reserve words that should remain in the source language in the translation text;
    • connect the preprocessor if you want to cancel the translation of certain structures, such as email addresses and file names, and to choose the form in which date and time are represented in the translation text.
  6. Save the translation results.

System requirements

  • IBM PC-compatible computer with a P166 processor or higher;
  • 32 MB of RAM;
  • approximately 160 MB of hard disk space (for a system with all components);
  • SVGA or better resolution video adapter;
  • CD-ROM reader (for installation);
  • mouse or compatible device;
  • OS: Windows 98 (Russian version or pan-European with Russian language support and Russian regional settings), or Windows NT 4.0 SP3 (or higher) with Russian language support and Russian regional settings, or Windows 2000 Professional (with Russian language support and Russian regional settings);
  • Microsoft Internet Explorer 5.x (included in delivery).
  • IBM PC-compatible computer with a PII-300 processor or higher;
  • 64 MB RAM

Translation of a document in the Socrates Personal system

The main window of the program is shown in Fig. 6.

When you launch it for the first time, the main program window opens by default on the “Translator” tab. Translation of text typed in the program window: by typing text in the upper window of the “Translator” tab and clicking the “Translate” button on the toolbar or in the “Translation” menu, you will receive a translation of the text in the lower window of the tab.

In order to use the dictionary (Fig. 7), just click on the corresponding tab. In addition, the dictionary window can be called up using hot keys.

Using a dictionary, you can get the translation of the word you are looking for in the following ways:

  • type the word in the input field located in the upper right window of the dictionary. Navigation through the dictionary database is carried out as letters are entered, until the maximum possible match is obtained;
  • paste a word into the input field from the clipboard. In this case, a quick transition will be made to the word that most closely matches the entered one;
  • select a previously translated word from the input field history window, after which a quick transition will be made to the word that has the maximum possible match with the entered one;
  • select a word in another application and, while holding down the Shift key, right-click on the selection. The translation of the selected word will appear in a pop-up window;
  • use a hotkey combination after placing the required word on the clipboard.
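Navigation "as letters are entered, until the maximum possible match is obtained" can be illustrated with a binary search over a sorted list of headwords. The sketch below is an assumption about how such a lookup might work, not a description of the Socrates implementation; the function name `best_match` and the sample word list are hypothetical.

```python
import bisect


def best_match(headwords, typed):
    """Return the headword sharing the longest prefix with the typed text.

    headwords must be a sorted list of lowercase strings.
    """
    typed = typed.lower()
    # Try the full input first, then drop characters from the end
    # until some headword starts with what remains.
    for n in range(len(typed), 0, -1):
        prefix = typed[:n]
        i = bisect.bisect_left(headwords, prefix)
        if i < len(headwords) and headwords[i].startswith(prefix):
            return headwords[i]
    return None


HEADWORDS = sorted(["table", "tablet", "take", "translate", "translation", "tree"])
```

Calling `best_match(HEADWORDS, "transl")` jumps to "translate", while a mistyped "tablo" falls back to the closest entry that still shares a prefix, "table".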

The Socrates Personal 4.0 system provides the ability to work with a translator and dictionary in other applications without leaving them. The translation is carried out in a pop-up window.

In order to get a translation of text from another application (for example, a text editor), you need to select the text to be translated and, while holding down the Shift key, right-click on the selection. A pop-up window will appear containing the translation of the selected fragment.

In order to get a translation of a word from another application, you need to select the word you are interested in and, while holding down the Shift key, right-click on the selection. The pop-up window that appears will contain the translation of the highlighted word.

If necessary, from this window you can go directly to the “Dictionary” tab using the hyperlink of the pop-up window.

System requirements

Minimum computer configuration:

  • IBM PC-compatible computer with a Pentium 90 processor or higher;
  • Operating system Windows 98/Me or Windows NT/2000;
  • 32 MB of RAM;
  • 16 MB of free hard disk space.

Test results for PROMT Translation Office 2000 and Socrates Personal 4.0

To compare the quality and speed of translation of the two systems, several fragments of texts in Russian and English were selected: individual phrases, news from companies, passages from the Bible, Murphy's Laws, technical, medical, legal texts. Ratings were given on a 10-point scale. After this, a comparison was made of the results of translation from English into Russian and vice versa (Table 1).

It should be noted that PROMT Translation Office 2000 and Socrates Personal 4.0 are products designed to solve different problems. PROMT Translation Office 2000 is a professional translation system that makes it much more efficient to translate large volumes of information. In addition, the PROMT system correctly implements the grammatical rules of a particular language. Therefore, the quality of the translation is very high. The disadvantages of the PROMT system are high requirements for hardware resources and significant translation time when connecting several additional dictionaries.

"Socrates Personal 4.0" is an automatic translation system that helps you quickly and easily get a translation of an unclear phrase or term. Its main purpose is to always be at hand.

Translating a short letter or phrase from a text using Socrates Personal 4.0 is much easier and faster than using the PROMT system. However, to translate a large amount of text, it is advisable to use PROMT Translation Office 2000.

Lingvo 7.0

Lingvo 7.0 is a powerful professional dictionary that is very user-friendly.

Press a hotkey in any Windows application - and the most complete translation of the word from all dictionaries connected to the system will appear on the screen. Grammar comments on any word, pronunciation of the most important words, checking spelling, the ability to create your own dictionaries - all this is offered by ABBYY Lingvo 7.0 (Fig. 9). Lingvo 7.0 contains more than 1.2 million words and phrases in 18 general and specialized dictionaries.

When Lingvo starts, the main window appears on the screen (Fig. 10). The user can type the desired word in the input line. As you type, the program will search for the most suitable word. By pressing the enter key or the “Translate text” icon, the user will see a card window containing the dictionary entry of the selected (found during search) word (Fig. 11).

If you are reading the help section of a program, working with a text editor, browser, or any other Windows application, select a word or several words in the text and press Ctrl+Ins+Ins. Or simply drag-and-drop the word into the input line. This will activate the main Lingvo window and open a card with the translation of the selected word. If there are many such cards, the “Translation” window will appear containing words and phrases from the request.

To insert a translation into the edited text, select the translation in the card and press Ctrl+Ins. Switch to the text editor window and perform the “Paste” operation. You can also drag the translation onto your text editor window.

When translating from Russian into English, highlighting combinations and grammatical structures is not difficult, and if these combinations are not in the dictionary, you can immediately turn to the full-text search function. The search results allow you to evaluate how the expression you are interested in is translated in real examples.

Main features of Lingvo:

  • translation of 1.2 million words and phrases;
  • 18 general and specialized dictionaries (2 medical and 2 legal dictionaries in Lingvo 7.0 - new);
  • modern vocabulary;
  • calling the dictionary from any Windows application;
  • perfect search system;
  • 5 thousand English words were voiced by an announcer from Oxford;
  • the ability to create your own custom dictionaries;
  • 23 free user dictionaries at http://www.lingvo.ru/;
  • detailed interpretations and explanations of the use of words;
  • modern linguistic technologies;
  • new updated versions of general and specialized dictionaries.

System requirements

Minimum computer configuration:

  • IBM PC-compatible computer with a Pentium 133 processor or higher;
  • operating system Windows 95/98/Me, Windows 2000/Windows NT 4.0 (SP3 or higher);
  • 16 MB of RAM for Windows 95/98/Me, 32 MB of RAM for Windows 2000/Windows NT 4.0;
  • from 85 to 265 MB of free hard disk space;
  • 3.5” disk drive and CD-ROM drive, mouse;
  • Microsoft Internet Explorer 5.0 and higher (the ABBYY Lingvo 7.0 distribution includes Microsoft Internet Explorer 5.5 - installing it will require an additional 27 to 80 MB);
  • sound card compatible with the operating system; headphones or speakers (recommended).

Context 4.0

"Context 4.0" is a system of electronic dictionaries that includes a developed software shell and an extensive set of dictionaries - both general vocabulary and specialized ones.

Context dictionaries are two-way. The program translates from one language to another and back without any special settings. The translation search can be carried out both in all dictionaries included in the kit, and in a specific dictionary. At the same time, the set of active (participating in the search) dictionaries, as well as the search order for them, can be easily changed.

You can work with “Context” by typing a word or phrase of interest to the user into a special input field (Fig. 12).

It is convenient to work with “Context” from Windows applications. Translation is carried out using the drag-and-drop method or via the clipboard. In the settings, you can specify a hotkey or enable the option to start translation when placing text on the clipboard.

For users working in the MS Word editor, the ability to call “Context” from the editor itself has been implemented. To do this, click on the “Context” icon located on the MS Word toolbar, and the user does not need to select a word or phrase in the text. “Context” will translate the word the cursor is on and at the same time check several words on the right and left to see if they are part of the phrase.
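Checking the words to the left and right of the cursor for membership in a phrase can be sketched as a search for the longest dictionary phrase that covers the current word. The window size of two words, the function name `phrase_at`, and the sample phrase set are assumptions; the source does not say how “Context” actually performs this check.

```python
def phrase_at(words, pos, phrases, window=2):
    """Return the longest known phrase covering words[pos], else the word itself.

    words   - the sentence split into words
    pos     - index of the word under the cursor
    phrases - set of lowercase dictionary phrases
    window  - how many neighbours to inspect on each side
    """
    best = (pos, pos + 1)                       # default: the single word
    for start in range(max(0, pos - window), pos + 1):
        last_end = min(len(words), pos + window + 1)
        for end in range(pos + 1, last_end + 1):
            candidate = " ".join(words[start:end]).lower()
            if candidate in phrases and end - start > best[1] - best[0]:
                best = (start, end)
    return " ".join(words[best[0]:best[1]])


PHRASES = {"take care", "take care of"}
SENTENCE = "Please take care of the dog".split()
```

With the cursor on "care" (index 2), `phrase_at(SENTENCE, 2, PHRASES)` returns the whole expression "take care of"; with the cursor on "dog", only the single word is returned.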

“Context” is completed with dictionaries upon request of the user. If the user has purchased a shell and some dictionaries, he can purchase any other dictionaries he needs.

The 4th version of Context has a number of interesting features that were not present in previous versions. For example, the dictionary searches partial phrases.

In this case, all phrases whose relevance coefficient in relation to the search string is greater than a specified threshold value are displayed in the translation window (Fig. 13).
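The source does not say how the relevance coefficient is computed. As a purely illustrative assumption, the sketch below uses the share of common words between the query and a dictionary phrase (Jaccard similarity) and keeps only phrases that score above the threshold; the names `relevance` and `search` are hypothetical.

```python
def relevance(query, phrase):
    """Jaccard similarity of the word sets: an assumed relevance measure."""
    q = set(query.lower().split())
    p = set(phrase.lower().split())
    if not q or not p:
        return 0.0
    return len(q & p) / len(q | p)


def search(query, phrases, threshold=0.5):
    """Return phrases scoring above the threshold, most relevant first."""
    scored = [(relevance(query, p), p) for p in phrases]
    hits = [(score, p) for score, p in scored if score > threshold]
    hits.sort(key=lambda item: (-item[0], item[1]))
    return [p for _, p in hits]


PHRASE_LIST = [
    "machine translation",
    "machine translation system",
    "translation memory",
    "washing machine",
]
```

Lowering the threshold widens the result list: at 0.5 the query "machine translation" returns only the two phrases that contain both words, while at 0.3 the phrases sharing a single word with the query are shown as well.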

There is a new fast typing function (Fast Typing). When entering a word, the user receives hints of similar words from the current dictionary, taking into account the characters already entered (Fig. 15). The user can then select a word from the list or continue typing.

The new version has the ability to add and edit dictionary entries, which makes the dictionary system more flexible. In the previous version of Context, the ability to work with a single user dictionary was implemented. The new version of the Context program allows you to create several dictionaries and edit them. User dictionaries and standard dictionaries are equal in the Context dictionary system. The format of a user dictionary entry is close to the format of the standard dictionary, that is, to the usual book format.

The article may include both words and expressions and examples of the use of words as part of set expressions and interpretation (

MultiLex 3.5

"MultiLex 3.5" is an electronic dictionary, which includes electronic versions of well-known printed dictionaries. A variety of English-Russian and Russian-English dictionaries are produced in the “MultiLex 3.5 English” shell (New English-Russian Dictionary by V.K. Muller, English-Russian/Russian-English Dictionary by O.S. Akhmanova, Russian-English Dictionary ed. A.I. Smirnitsky). It is planned to release technical, medical-biological, economic-legal and other collections.

"MultiLex 3.5 English" allows the user to gradually select for himself the optimal set of dictionaries that will work together.

Features of the MultiLex dictionary:

  • convenience and ease of use;
  • voicing a large number of dictionary entries;
  • quick access to important articles: using bookmarks, you can mark dictionary entries that are important to you, and then access them directly;
  • “quick typing” function: when typing a word, a list of similar words appears, from which the user can select a word for translation without typing the whole word;
  • translating a word or phrase and transferring translation results to a Windows application via the clipboard or drag-and-drop;
  • entering notes: when working collaboratively, it is important to maintain uniform terminology. This is where the notes mechanism comes to the rescue: you can write your own notes for any dictionary entry;
  • user dictionary.

The left panel contains a list of the headings of the entries in the dictionary that is marked in the dictionary panel with an open-book icon (it is used to view the headings of dictionary entries). The right panel always shows the dictionary entry corresponding to the heading highlighted in the left panel. A dictionary entry begins with a heading, followed by its transcription. Next, the part of speech is indicated, and possible translations, explanations, and examples are given.

The dictionary panel allows you to select the desired dictionary. Each dictionary has its own icon, which takes three different states: closed book, half-open book, or open book. The shape of the icons shows which dictionary is currently open and in which dictionaries the last search found something.

If the dictionary icon depicts an open book, this dictionary is now open; a half-open book means the dictionary is not currently open but contains information relevant to your request; and a closed book means the dictionary is closed and does not contain the information you need.

In July 2001, a new version of the dictionary “MultiLex 3.5 English Popular” (English-Russian, Russian-English dictionary of general vocabulary edited by O.S. Akhmanova and E.A.M. Wilson) was released. It contains more than 40 thousand dictionary entries.

Version 3.5 has a number of advantages that you will not find in the previous version:

  • possibility of additional installation of dictionaries. By purchasing any English dictionary (version no lower than 3.5), you can easily integrate it into your MultiLex. It is planned to release technical, medical-biological, economic-legal and other collections;
  • pop-up translation. MultiLex 3.5 provides support for translation using hot keys from any application that supports Clipboard. To do this, simply highlight the word, press the corresponding function key (F10 by default) - and a translation window will appear on the screen. The translation in the window is a hyperlink. If you need more complete information on the word you are interested in, click on the left mouse button to call up “MultiLex” with ready-made translation options for the requested word. The pop-up translation window can be installed on top of all windows by selecting the appropriate item in the context menu, which becomes available when you right-click on the “MultiLex” icon (in the lower right corner of the screen). A similar function is performed by the button on the left side of the “pop-up translation” window. Using this button you can “pin” the resulting translation anywhere on your screen;
  • Sound card compatible with the operating system, headphones or speakers (recommended).

Summary

In conclusion, a few words about personal experience in using machine translation systems and dictionaries.

Three years ago, I used a machine translation system to prepare a report for a Western employer. Several people who were involved in offshore programming wrote a program for the navigation receiver. Unfortunately, few of the group spoke enough English to describe the results of their work in the customer’s language.

In this regard, there was a need to translate reports compiled in Russian. It was then that the idea came to me to try out the Stylus machine translation system (the first versions of PROMT systems were called that way). This attempt turned out to be very successful: I translated the 140-page document three times faster than planned. Of course, the translation performed by the program was not perfect. I had to edit it a lot and for a long time. But the gain is obvious.

Since then, when translating texts of more than 10 pages, I always use machine translation systems.

I became familiar with electronic dictionaries even earlier, when I had a need to read foreign books and magazines on technical disciplines with specific vocabulary. Technical electronic dictionaries, dictionaries on telecommunications and computer science allowed me to save a lot of time and effort. Thanks Lingvo!

I hope that my story about new machine translation systems and dictionaries will help you organize your work effectively and ultimately achieve success.

The editors would like to thank the following for their assistance in preparing the article: Alexandra Andreeva, PROMT company; Andrey Sokolov, Informatics company; Anastasia Savina, ABBYY company; Konstantin Konin and Natalya Talpa, MediaLingua company; Alexey Bukhanov, Arsenal company.

ComputerPress 9'2001

Machine Translation: A Brief History

As early as the 19th century, the outstanding mathematician Charles Babbage tried to convince the British government of the need to finance his research on the development of a “computing machine.” Among other benefits, he promised that one day this machine would be able to automatically translate spoken language. However, this idea remained unrealized [Shalyapina 1996: 105].

The birth date of machine translation as a research field is usually considered to be March 1947. It was then that cryptography specialist Warren Weaver, in his letter to Norbert Wiener, first posed the problem of machine translation, comparing it with the problem of decryption.

The same Weaver, after a series of discussions, drew up a memorandum in 1949 in which he theoretically substantiated the fundamental possibility of creating machine translation systems. W. Weaver wrote: “I have a text in front of me which is written in Russian but I am going to pretend that it is really written in English and that it has been coded in some strange symbols. All I need to do is strip off the code in order to retrieve the information contained in the text” [Slocum 1989: 56-58]. Weaver's ideas formed the basis of an approach to MT based on the concept of an interlingua.

In those days, the few computers that were available were used mainly for solving military problems, so it is not surprising that in the USA the main attention was paid to Russian-English, and in the USSR - to English-Russian translation. By the early 50s, a number of research groups were struggling with the problem of automatic translation.

In 1952, the first conference on MT was held at the Massachusetts Institute of Technology, and in 1954, the first full-fledged machine translation system was presented - IBM Mark II, developed by IBM together with Georgetown University (this event went down in history as the Georgetown Experiment). The very limited system perfectly translated 49 specially selected sentences from Russian into English using a 250-word dictionary and six grammatical rules.

One of the new developments of the 70-80s was the TM (translation memory) technology, which works on the principle of accumulation: during the translation process, the original segment (sentence) and its translation are saved, resulting in the formation of a linguistic database; If an identical or similar segment to the original is found in the newly translated text, it is displayed along with the translation and an indication of the percentage match. The translator then makes a decision (edit, reject or accept the translation), the result of which is stored by the system.
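The accumulation principle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the design of any commercial TM tool: the class name `TranslationMemory`, the 70 percent default threshold, the sample segment, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all choices made for the example.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Stores source segments with their translations and finds similar ones."""

    def __init__(self):
        self.segments = {}                     # source segment -> translation

    def add(self, source, translation):
        """Save a segment once the translator has accepted or edited it."""
        self.segments[source] = translation

    def lookup(self, segment, min_percent=70):
        """Return (source, translation, percent) for the best match, or None."""
        best = None
        for source, translation in self.segments.items():
            ratio = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            percent = round(100 * ratio)
            if percent >= min_percent and (best is None or percent > best[2]):
                best = (source, translation, percent)
        return best


tm = TranslationMemory()
tm.add("Press the start button.", "Нажмите кнопку запуска.")
exact = tm.lookup("Press the start button.")   # identical segment
fuzzy = tm.lookup("Press the stop button.")    # similar segment
missing = tm.lookup("xyz")                     # nothing comparable stored
```

An identical segment comes back with a 100 percent match, a similar one with a lower percentage that the translator can then accept, edit or reject, and the decision is stored again with `add`.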

Since the early 80s, when personal computers began their confident conquest of the world, computer time has become cheaper and access to the machines has become possible at any moment. MT has become economically profitable. In addition, in these and subsequent years, the improvement of programs made it possible to translate many types of texts quite accurately, but some problems of MT remain unresolved to this day.

The 90s can be considered a true renaissance in the development of MT, which is associated not only with the high level of capabilities of personal computers, but also with the spread of the Internet, which created a real demand for MT. It has once again become an attractive investment area for both private investors and government agencies.

Since the early 1990s, Russian developers have been entering the MT systems market.

In July 1990, at the PC Forum exhibition in Moscow, Russia's first commercial machine translation system, called PROMT (PROgrammer's Machine Translation), was presented. In 1991, PROJECT MT CJSC was created, and already in 1992 the company PROMT won the NASA competition for the supply of MT systems (PROMT was the only non-American company in this competition) [Kulagin 1979: 324].

As for the machine translation systems themselves, it should be noted that they went through three stages of their development:

  1. "Electronic translators" of the first generation, direct transfer systems (DTS), were combined software and hardware systems that analyzed the text “word by word” (semantic connections and nuances were practically not taken into account). The capabilities of DTS were determined by the available size of the dictionaries, which depended directly on the amount of computer memory. The IBM Mark II, which made the Georgetown experiment possible in principle, belonged to the DTS category.
  2. Over time, DTS were replaced by T-systems (from the English Transfer, “transformation”), in which translation was carried out at the level of syntactic structures (this is how language is taught in high school). They performed a set of operations that made it possible, by analyzing the phrase being translated, to determine its syntactic structure according to the grammar rules of the input language, then transform it into the syntactic structure of the output sentence and synthesize a new phrase, substituting the necessary words from the dictionary of the output language. Work in this direction is no longer being carried out: practice has shown that the real system of correspondences is more complex and that adequate translation requires a fundamentally different algorithm.
  3. A little later, the increasingly numerous machine translation systems began to be divided, depending on the principle of their operation, into MT programs (from Machine Translation, "machine translation") and TM complexes (from Translation Memory, “translation memory”). As a truly successful example of an MT program, let's name the famous Canadian METEO system, which translates weather forecasts from French into English and back (it was created almost thirty years ago and is still in use today). The METEO developers bet on the fact that truly automated machine translation is only possible with an artificially limited language (in both vocabulary and grammar). And they succeeded. The world's most popular professional TM tool is the Translator's Workbench package from TRADOS. Such programs are used mainly by professional translators who have realized the benefits of partially automating their work when translating repetitive texts that are similar in theme and structure.
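The difference between the first two generations described above can be shown with a toy sketch. It is an illustration under stated assumptions, not a reconstruction of any historical system: the three-word English-French lexicon, the part-of-speech tags, and the single reordering rule (an adjective follows the noun in the output) are chosen only to make the contrast visible.

```python
# Hypothetical lexicon: source word -> (target word, part of speech).
LEXICON = {
    "the": ("la", "DET"),
    "red": ("rouge", "ADJ"),
    "car": ("voiture", "NOUN"),
}


def direct_translate(sentence):
    """First generation: substitute word by word, keep the source order."""
    words = sentence.lower().split()
    return " ".join(LEXICON.get(w, (w, "X"))[0] for w in words)


def transfer_translate(sentence):
    """Transfer approach: analyze, restructure, then synthesize."""
    # Analysis: attach a part of speech to every word.
    tagged = [LEXICON.get(w, (w, "X")) for w in sentence.lower().split()]
    # Transfer: rewrite the structure ADJ NOUN as NOUN ADJ.
    output = []
    i = 0
    while i < len(tagged):
        if (i + 1 < len(tagged)
                and tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN"):
            output += [tagged[i + 1][0], tagged[i][0]]
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    # Synthesis: here simply joining the target words.
    return " ".join(output)
```

For "the red car", word-by-word substitution keeps the English order and yields "la rouge voiture", while the structural rule produces "la voiture rouge". A real transfer system would need a large set of such rules plus agreement handling, which is part of why the approach proved harder than expected.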

The main idea of Translation Memory is not to translate the same text twice. This technology is based on comparing the document that needs to be translated with data stored in a pre-created “input” database. When the system finds a fragment that meets predetermined criteria, its translation is taken from the “output” database. The resulting text is subject to intensive human post-editing [Marchuk 1997: 21-22].

Chapter 1 Conclusions

In Chapter 1 we looked at what translation is and identified its types, forms and genres. We also examined machine translation: its brief history and the place it occupies in the general classification of translation. We found out how a translator program works.

Kontsevoy Daniil Sergeevich,
Private educational institution of higher education "Omsk Law Academy", Omsk

A translator in the field of professional communications is a person who has an active command of the foreign language of a professional sphere, can construct oral and written speech in that language logically, coherently and clearly, and, most importantly, has mastered the use of machine translation systems, because even professionals cannot do without electronic translators.

Machine translation is a process, performed on a computer or other electronic device, of converting text in one language into equivalent text in another language, as well as the result of that process. Since no fully automated electronic translator can yet translate a text accurately and correctly, a specialist translator must either prepare the text beforehand or correct the errors and omissions in the machine-processed text.

There are four forms of organizing interaction between a computer and a person when performing machine translation:

  • pre-editing: a person prepares the text for computer processing (simplifying the meaning of the text, eliminating ambiguous readings, marking up the text), after which machine translation is performed;
  • inter-editing: a person directly intervenes in the operation of the translation system, resolving problematic issues;
  • post-editing: the entire source text is subjected to machine processing, and a person corrects the result by editing the translated text;
  • mixed system.

Modern electronic translators can produce an acceptably adequate translation of individual phrases and sentences; they serve to make a human translator's work easier by relieving them of the routine of looking up the meanings of words and phrases in dictionaries.

To master machine translation systems, it is necessary to at least have a general understanding of electronic translation technologies. There are several of them in machine translation:

1) Direct machine translation

Direct machine translation is the oldest machine translation approach. With this method of translation, the text in the source language is not subject to structural analysis beyond morphology. This translation uses a large number of dictionaries and is word-by-word, except for minor grammatical adjustments, for example regarding word order and morphology. The direct translation system is designed for specific language pairs. The lexicon is a repository of information about the specifics of words. These systems depend on the quality of dictionary preparation, morphological analysis and text processing software. An example of a direct translation system is Systran.
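The word-by-word procedure described above can be illustrated with a minimal sketch. It assumes a tiny hypothetical English-French dictionary and a single word-order rule; the names `DICTIONARY`, `ADJECTIVES`, `reorder` and `translate_direct` are invented for this illustration and do not describe how Systran or any other real system is built.

```python
# Toy bilingual dictionary (hypothetical entries, for illustration only).
DICTIONARY = {
    "the": "le",
    "cat": "chat",
    "black": "noir",
    "sleeps": "dort",
}

# Words treated as adjectives by the single reordering rule below.
ADJECTIVES = {"black"}


def reorder(words):
    """Minor grammatical adjustment: move an adjective after the word that follows it."""
    result = list(words)
    i = 0
    while i < len(result) - 1:
        if result[i] in ADJECTIVES:
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return result


def translate_direct(sentence):
    """Translate word by word; unknown words are passed through unchanged."""
    words = reorder(sentence.lower().split())
    return " ".join(DICTIONARY.get(word, word) for word in words)


print(translate_direct("The black cat sleeps"))  # le chat noir dort
```

Because there is no structural analysis, the quality of the result depends entirely on the dictionary and on the few hand-written adjustments, which is the dependence the paragraph above points out.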

2) Rule-based machine translation uses a large store of linguistic rules and bilingual dictionaries for each language pair. Types of rule-based machine translation include the Interlingua principle and Transfer machine translation.

  • Machine translation Interlingua

In machine translation based on the Interlingua principle, translation is carried out through an intermediate (semantic) model of the source language text. Interlingua is a language-independent model from which translations into any language can be generated. The Interlingua principle allows for the possibility of transforming text in the source language into a model common to several languages.

  • Transfer machine translation is based on the idea of Interlingua and uses a comparative analysis of the two languages. The three stages of this process are analysis, transfer and generation. First, the source-language text is converted into an abstract, intermediate model of the source language; this is then transformed into a target-language model, which is finally rendered as text in the target language. This principle is simpler than Interlingua, but it is more difficult to avoid ambiguity.
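The three stages named above (analysis, transfer, generation) can be sketched as three small functions. This is only an illustration under strong assumptions: the lexicon, the part-of-speech tags and the single transfer rule are hypothetical, and a real transfer system works with full syntactic structures rather than flat tagged word lists.

```python
# Hypothetical toy lexicon: source word -> (part of speech, target word).
LEXICON = {
    "the": ("DET", "le"),
    "green": ("ADJ", "vert"),
    "book": ("NOUN", "livre"),
}


def analyze(sentence):
    """Analysis: build a simple source-language structure (a list of tagged words).

    Words missing from the toy lexicon raise KeyError.
    """
    return [(LEXICON[word][0], word) for word in sentence.lower().split()]


def transfer(source_structure):
    """Transfer: map the source structure to a target structure.

    Single illustrative rule: ADJ NOUN -> NOUN ADJ.
    """
    target = list(source_structure)
    i = 0
    while i < len(target) - 1:
        if target[i][0] == "ADJ" and target[i + 1][0] == "NOUN":
            target[i], target[i + 1] = target[i + 1], target[i]
            i += 2  # skip the pair that was just reordered
        else:
            i += 1
    return target


def generate(target_structure):
    """Generation: substitute target-language words and produce the output text."""
    return " ".join(LEXICON[word][1] for _, word in target_structure)


def translate_transfer(sentence):
    return generate(transfer(analyze(sentence)))


print(translate_transfer("The green book"))  # le livre vert
```

Keeping the stages separate means that the transfer rules are the only part tied to a specific language pair, while analysis and generation can in principle be reused.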

3) Machine translation on text corpora

The corpus approach in machine translation uses a collection (corpus) of parallel bilingual texts. The main advantage of corpus-based machine translation systems is their self-tuning, i.e. they are able to remember the terminology and even the style of phrases from the texts of previous translations. Statistical machine translation and example-based machine translation are variants of the corpus approach.

  • Statistical machine translation

This is a type of machine translation based on comparing large volumes of texts in a language pair. The approach uses statistical translation models; one of the approaches used is based on Bayes' theorem. Building statistical translation models is a fairly fast process, but the technology depends heavily on the availability of a multilingual text corpus: at least 2 million words are needed for a specific domain, and even more for the language as a whole. Statistical machine translation also requires specialized hardware in order to “average” the translation models. An example of statistical machine translation is Google Translate.
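The role of Bayes' theorem mentioned above is usually written in the so-called noisy-channel form. The notation is introduced here for illustration and does not come from the text: f is the source sentence, e is a candidate translation, and ê is the translation the system selects.

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
        = \arg\max_{e} P(f \mid e)\, P(e)
```

The denominator P(f) can be dropped because it does not depend on e. The factor P(f | e) is the translation model, estimated from the parallel corpus, and P(e) is a language model of the target language, which is why the quality of the result depends so strongly on the amount of text available.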

  • Machine translation with examples

Example-based machine translation systems are based on the principle of a parallel bilingual corpus of texts, which contains pairs of sentences as examples. Each sentence is duplicated in a different language. Statistical machine translation has a "learning" property. The more texts (examples) you have at your disposal, the better the machine translation result.

Every translator in the field of professional communication will face the problem of choosing the appropriate translation program. Excluding paid services, we consider it necessary to analyze the most well-known systems.

The electronic translator Google Translate, which was developed by Google in the mid-2000s, is very popular. This service is designed for translating texts and translating websites on the fly. The translator uses a self-learning machine translation algorithm based on language analysis of texts.

Unlike most machine translators, which use SYSTRAN technology, Google uses its own software. Google Translate is currently the most popular translator thanks to its simplicity and versatility, and to the backing of its developer, Google. As a result, this machine translation system is developing very quickly and is optimized to meet users' needs. Its functions now include: translation of an entire web page; simultaneous search for information with translation into another language; translation of text in images; translation of spoken phrases; handwriting translation; translation of dialogue.

The features of this machine translation system include:

  1. Translation options are controlled by a statistical algorithm.

Users can always suggest their own translations of particular words and/or select one of the offered options as the most suitable. The drawback of such an algorithm is that users can submit deliberately incorrect translations, including obscene words.

  2. Coverage of world languages.

The program now works with more than a hundred languages, including Swahili, Chinese and Welsh. Google Translate can translate from any supported language into any other, but in most cases the translation is performed through English. The drawback of this mechanism is obvious: the quality of the translation suffers.

PROMT, developed in 1991, occupies a leading position in the Russian machine translator market.

PROMT, like Google Translate, uses its own software, which was significantly updated in 2010. Since then, PROMT has translated using hybrid technology. Its essence is that instead of a single translation, the program produces about a hundred translations of the same sentence, depending on the polysemy of words and constructions and on statistical results. The machine then selects the most probable of the proposed translations. Thus, the translator is able to learn quickly, but has the same disadvantages as all translators based on statistical methods of text processing.

The translator's capabilities include: translation of words, phrases and texts, including using hot keys; translation of a selected area of ​​the screen with graphic text; translation of documents of various formats: doc(x), xls(x), ppt(x), rtf, html, xml, txt, ttx, pdf (including scanned ones), jpeg, png, tiff; use, editing and creation of specialized dictionaries and translation profiles; connection of Translation Memory databases and glossaries; integration into office applications, web browsers, corporate portals and websites.

The disadvantages of the translator are: a small number of language pairs with which the program works; a complex interface; inaccuracies in the translation of professional vocabulary (which, however, can be remedied by connecting thematic dictionaries).

However, PROMT was recognized as the best English-Russian translator at the annual workshop on statistical machine translation under the auspices of the Association for Computational Linguistics (ACL) in 2013 and 2014.

There are many other machine translation systems, but they, one way or another, copy various features of the domestic PROMT translator or the American Google Translate.

Thus, a translator in the field of professional communication who knows machine translation technologies and can choose the right electronic translator for a given purpose will be well equipped for successful professional work, because at this stage of the development of computer technology it is too early to think about fully automatic machine translation. A human translator thinks in images and proceeds from the goal: to convey a specific thought to the listener or reader. It is still difficult to imagine a computer program with such capabilities. Modern machine translators play a supporting role. They are designed to save a person from routine work during the translation process. The age of paper dictionaries is over, and machine translation systems are coming to the aid of professional translators (and not only them).



40s: first steps

The history of machine translation as a scientific and applied field began in the late 40s of the last century (not counting the mechanized translation device of P.P. Smirnov-Troyansky, a kind of linguistic adding machine, invented in 1933). In March 1947, Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation, in correspondence with Andrew D. Booth and Norbert Wiener, first formulated the concept of machine translation, which he developed somewhat later (in 1949) in a memorandum addressed to the Foundation.

W. Weaver wrote: "I have a text in front of me which is written in Russian but I am going to pretend that it is really written in English and that it has been coded in some strange symbols. All I need to do is strip off the code in order to retrieve the information contained in the text." The analogy between translation and decryption was natural in the context of the post-war era, given the advances that cryptography made during World War II.

After a series of discussions, Weaver drew up a memorandum in 1949 in which he theoretically substantiated the fundamental possibility of creating machine translation systems [Slocum 1989: 56-58]. Weaver's ideas formed the basis of an approach to MT in which the transfer of information is divided into two stages: first, the source sentence is translated into an intermediary language (created on the basis of simplified English), and then the result of this translation is rendered in the target language.

Weaver's memorandum aroused great interest in the problem of MT. In 1948, A. Booth and Richard Richens carried out some preliminary experiments (for example, Richens developed rules for dividing word forms into stems and endings).

The computers of those years were quite different from today's. They were very large and expensive machines that occupied entire rooms and required a large staff of engineers, operators and programmers for their maintenance. These computers were mainly used to carry out mathematical calculations for the needs of military institutions, as well as of the mathematics and physics departments of universities (the latter were also closely tied to the military sphere). Therefore, in the early stages, the development of MT was actively supported by the military; in the USA the main attention was paid to the Russian-English direction, and in the USSR to the English-Russian direction.

In 1952, the first conference on MT was held at the Massachusetts Institute of Technology, and in 1954 the first MT system, developed by IBM together with Georgetown University and run on an IBM 701 computer, was presented in New York (this event went down in history as the Georgetown experiment). The program was very limited in its capabilities (it had a dictionary of 250 entries and 6 grammar rules) and translated from Russian into English. It seemed that the creation of high-quality automatic translation systems was quite achievable within a few years (the emphasis was on fully automatic systems providing high-quality translations; human participation at the post-editing stage was regarded as a temporary compromise). Professional translators seriously feared that they would soon be left without work...

50s: first disappointment

By the early 1950s, a number of research groups in the United States and Europe were working in the field of MT. Significant funds were invested in these studies, but the results very soon disappointed investors. One of the main reasons for the low quality of MT in those years was the limited capability of the hardware: a small amount of memory with slow access to the information in it, and the inability to make full use of high-level programming languages. Another reason was the lack of the theoretical basis needed to solve linguistic problems, as a result of which the first MT systems were reduced to word-by-word translation of texts without any syntactic (let alone semantic) integrity.

In 1959, the philosopher Yehoshua Bar-Hillel claimed that fully automatic high-quality MT (FAHQMT) cannot be achieved in principle. As an example, he cited the problem of finding the correct translation for the word "pen" in the following context: "John was looking for his toy box. Finally he found it. The box was in the pen. John was very happy." Here "pen" must be translated not as a writing instrument but as a playpen. The choice of one translation or another, in this case and in many others, is determined by knowledge of extra-linguistic reality, and this knowledge is too extensive and diverse to be entered into a computer. However, Bar-Hillel did not reject the idea of MT as such; he regarded the development of machine systems intended for use by a human translator (a kind of “human-machine symbiosis”) as a promising direction.

This statement had a most unfavorable impact on the development of MT in the United States. In 1966, the Automatic Language Processing Advisory Committee (ALPAC), specially created by the National Academy of Sciences, relying among other things on Bar-Hillel's findings, concluded that machine translation was unprofitable: the ratio of cost to quality was clearly not in MT's favor, and there were enough human translators to meet the need for translating technical and scientific texts. The ALPAC report was followed by a reduction in US government funding for MT research, even though at that time at least three different MT systems were in regular use by a number of military and scientific organizations (including the US Air Force, the US Atomic Energy Commission, and the Euratom center in Italy).

60s: low start

For the next ten years, MT systems were developed in the USA at Brigham Young University in Provo, Utah (the early commercial systems WEIDNER and ALPS), with funding from the Mormon Church, which was interested in translating the Bible; in Canada by research groups including TAUM in Montreal with its METEO system; and in Europe by the groups GETA (Grenoble) and SUSY (Saarbrücken). The work in this area of domestic linguists such as I.A. Melchuk and Yu.D. Apresyan (Moscow), which resulted in the ETAP linguistic processor, deserves special mention. In 1960, an experimental machine translation laboratory was organized as part of the Research Institute of Mathematics and Mechanics in Leningrad; it was later transformed into the laboratory of mathematical linguistics of Leningrad State University.

70-80s: new impulse

With the development of computer technology in the late 70s (the emergence of microcomputers, the development of networks, an increase in memory resources), machine translation entered a renaissance. At the same time, the emphasis shifted somewhat: researchers now aimed at developing “realistic” MT systems that assumed human participation at various stages of the translation process. MT systems turned from an “enemy” and “competitor” of the professional translator into an indispensable assistant that helps save time and human resources.

The revival of MT in the 70s and 80s is indicated by the following facts: the Commission of the European Communities (CEC) buys the English-French version of Systran, as well as a system for translation from Russian into English (the latter was developed after the ALPAC report and continued to be used by the US Air Force and NASA); in addition, the CEC commissions the development of French-English and Italian-English versions. At that time, thanks to the CEC, the foundations are laid for the EUROTRA project, based on the work of the SUSY and GETA groups. At the same time, activity on building MT systems expands rapidly in Japan (mainly based on the interlingua technology proposed by Weaver in the late 40s); in the USA, the Pan American Health Organization (PAHO) commissions the development of a Spanish-English system (SPANAM); the US Air Force funds the development of an MT system at the Linguistic Research Center of the University of Texas at Austin; and the TAUM group in Canada achieves notable success with its METEO system (used mainly for translating weather reports). A number of projects started in the 70s and 80s subsequently developed into full-fledged commercial systems.

Between 1978 and 1993, 20 million dollars were spent on MT research in the USA, 70 million in Europe, and 200 million in Japan.

One of the new developments was TM (translation memory) technology, which works on the principle of accumulation: during translation, each source segment (sentence) and its translation are saved, gradually forming a linguistic database. If a segment identical or similar to a stored one is found in a newly translated text, it is displayed together with its translation and the percentage match. The translator then decides whether to edit, reject or accept the translation, and the system stores the result. Ultimately, “there is no need to translate the same sentence twice!” Today the developer of a well-known commercial TM-based system is the company TRADOS (founded in 1984).
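The accumulate-and-match cycle described above can be sketched with Python's standard `difflib` module. This is a minimal illustration, not how TRADOS or any other commercial tool computes matches: the class name `TranslationMemory`, the 75% threshold and the character-based similarity measure are all assumptions made for the example.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores (source segment, translation) pairs."""

    def __init__(self):
        self.segments = {}

    def store(self, source, translation):
        """Save a segment together with its confirmed translation."""
        self.segments[source] = translation

    def lookup(self, source, threshold=0.75):
        """Return (stored source, translation, match percent) for the best match,
        or None if no stored segment reaches the threshold."""
        best = None
        for stored, translation in self.segments.items():
            ratio = SequenceMatcher(None, source.lower(), stored.lower()).ratio()
            if ratio >= threshold and (best is None or ratio > best[2]):
                best = (stored, translation, ratio)
        if best is None:
            return None
        return best[0], best[1], round(best[2] * 100)


tm = TranslationMemory()
tm.store("Press the red button.", "Appuyez sur le bouton rouge.")

# An identical segment is a 100% match; a similar one is offered for editing.
print(tm.lookup("Press the red button."))
print(tm.lookup("Press the green button."))
```

After the translator accepts or edits a suggestion, calling `store` again with the new segment is what makes the database grow with use.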

From the 90s to the 21st century

The 90s brought the rapid development of the PC market (from desktop to pocket-sized machines) and of information technology, along with the widespread use of the Internet (which is becoming increasingly international and multilingual). All this made the further development of MT systems possible and, most importantly, in demand. New technologies are emerging based on neural networks, the concept of connectionism, and statistical methods.

Currently, several dozen companies are developing commercial MT systems, including Systran, IBM, L&H (Lernout & Hauspie), Transparent Language, Cross Language, Trident Software, Atril, Trados, Caterpillar Co., LingoWare, Ata Software, Lingvistica b.v., and others.

It is now possible to use the services of automatic translators directly on the Internet: alphaWorks; PROMT's Online Translator; LogoMedia.net; Yahoo! Babel Fish; InfiniT.com.

Since the early 1990s, domestic developers have been entering the market for PC systems.

In July 1990, Russia's first commercial machine translation system, PROMT (PROgrammer's Machine Translation), was presented at the PC Forum exhibition. In 1991, CJSC PROJECT MT was created, and as early as 1992 the PROMT company won a NASA tender for the supply of MT systems (PROMT was the only non-American company in the competition).

In 1992, PROMT released a whole family of systems under the new name STYLUS for translation from English, German, French, Italian and Spanish into Russian and from Russian into English, and in 1993 the world's first MT system for Windows was created on the basis of STYLUS. STYLUS 2.0 for Windows 3.X/95/NT was released in 1994, and in 1995-1996 the third generation of machine translation systems was introduced: the fully 32-bit STYLUS 3.0 for Windows 95/NT. At the same time, the development of completely new Russian-German and Russian-French MT systems, the first in the world, was successfully completed.

In 1997, an agreement was signed with the French company Softissimo on the creation of translation systems from French into German and English and vice versa, and in December of that year the world's first German-French translation system was released. In the same year, PROMT released a system built on Gigant technology, which supports several language directions in one shell, as well as WebTranSite, a special translator for working on the Internet.

In 1998, a whole constellation of programs was released under the new name PROMT 98. A year later, PROMT released two new products: PROMT Internet, a unique software package for working on the Internet, and PROMT Mail Translator, a translator for corporate email systems. Special server solutions were also developed for corporate clients: the corporate translation server PROMT Translation Server (PTS) and the Internet solution PROMT Internet Translation Server (PITS). In 2000, PROMT updated its entire line of software products, releasing a new generation of MT systems: PROMT Translation Office 2000, PROMT Internet 2000 and Magic Gooddy 2000.

Online translation with the support of the PROMT system is used on a number of domestic and foreign sites: PROMT's Online Translator, InfiniT.com, etc.

Software products of the PROMT company have received a number of domestic and foreign awards.

The past and future of machine translation. Key dates
First published on Wired

1966 ALPAC publishes a report on machine translation concluding that years of research in this area have not produced the expected results. This leads to the end of government funding for MT development programs.

1982 Janet and Jim Baker founded Dragon Systems in Newton, Massachusetts.

1983 The Automated Language Processing System (ALPS) is presented, the first MT program for microcomputers.

1988 Scientists at IBM's Thomas J. Watson Research Center return to statistical methods of MT, which compare parallel texts and calculate the probability that words correspond.

1990 Dragon Systems releases DragonDictate, the first speech-to-text system, capable of recognizing 30,000 words.

DARPA launches the Spoken Language Systems (SLS) program with the goal of developing applications that enable voice interaction between man and machine.

1991 The first translator workstations appear, including STAR's Transit, IBM's TranslationManager, Canadian Translation Services' PTT and Eurolang's Optimizer.

1992 ATR-ITL founds the Consortium for Speech Translation Advanced Research (C-STAR), which organizes the first public demonstration of telephone translation between English, German and Japanese.

1993 In Germany, work is underway on the Verbmobil project. Researchers have focused on portable systems for providing translation of business conversations from English into German and Japanese.

2264 “Humans are as stupid as a bag of sawdust,” said Device 296. “Only utterly naive scientists could think of developing a technology to understand what these untidy lumps of protoplasm say. The noise they make from the holes in their heads is decidedly much less meaningful than cosmic radiation.”

Compiled by: Christine Demos and Mark Frauenfelder. 1629-2000: K. D.; 2001-2264: M.F.

Andreeva Elena Vladimirovna
