The robot has passed the Turing test. Computer program simulating a psychotherapist

The Turing test is an empirical experiment in which a person communicates with a computer program that simulates human responses.

The test is considered passed if the person, while communicating with a machine, believes that he is communicating with a person and not a machine.

In 1950, British mathematician Alan Turing devised this experiment by analogy with the imitation game, in which two people go into different rooms and a third person must work out who is where by communicating with them in writing.

Turing proposed playing such a game with a machine: if the machine could mislead an expert, this would mean that the machine could think. The classic test therefore follows this scenario:

A human expert communicates via chat with a chatbot and with other people. At the end of the conversation, the expert must determine which of the interlocutors was human and which was a bot.

Nowadays the Turing test exists in many modified forms. Let's consider some of them:

Reverse Turing test

The test consists of performing some action to confirm that you are a person. For example, websites often ask us to type the numbers and letters shown in a distorted image into a special field. This protects the site from bots. A machine that passed this test would demonstrate the ability to perceive complex distorted images, but no such machine exists yet.

Immortality test

The test consists of reproducing a person’s individual characteristics as closely as possible. It is believed that if a person’s character is copied so accurately that the copy cannot be distinguished from the original, the immortality test has been passed.

Minimum Intelligent Signal Test

The test assumes a simplified form of answering questions - only yes and no.

Meta Turing Test

The test assumes that a machine “can think” if it can create something that it itself wants to test for intelligence.

The first passing of the classical Turing test was recorded on June 7, 2014 by the chatbot “Eugene Goostman”, developed in St. Petersburg. The bot convinced experts that they were communicating with a 13-year-old teenager from Odessa.

In general, machines are already capable of a lot. Many specialists are now working in this direction, and more and more interesting variations of the test, and attempts to pass it, await us.

"Eugene Goostman" managed to pass the Turing test and convince 33% of the judges that it was not a machine communicating with them. The program posed as a thirteen-year-old boy named Eugene Goostman from Odessa and was able to convince the people talking to it that the answers it produced belonged to a person.

The test took place at the Royal Society of London and was organized by the University of Reading, UK. The authors of the program are Russian engineer Vladimir Veselov, who currently lives in the United States, and Ukrainian Evgeniy Demchenko, who now lives in Russia.

How did the program "Eugene Goostman" pass the Turing test?

On Saturday June 7, 2014, a program named Eugene tried to recreate the intelligence of a thirteen-year-old teenager, Eugene Goostman.

Five programs participated in the testing, organized by the School of Systems Engineering at the University of Reading (UK). The test consisted of a series of five-minute written dialogues.

The program's developers managed to prepare the bot for a very wide range of questions and even trained it to collect examples of dialogues via Twitter. In addition, the engineers gave the character a vivid personality. Pretending to be a 13-year-old boy, the virtual “Eugene Goostman” did not raise doubts among the experts. They accepted that the boy might not know the answers to many questions, because the average child’s level of knowledge is significantly lower than an adult's. At the same time, his correct and accurate answers were attributed to unusual erudition.

The test involved 25 “hidden” people and 5 chatbots. Each of the 30 judges conducted five chat sessions, trying to determine the real nature of the interlocutor. For comparison, in the traditional annual Loebner Prize competition for artificial intelligence programs, only 4 programs and 4 hidden people participate.

The first version of the program with the “young Odessa resident” appeared back in 2001. However, only in 2012 did it show a truly serious result, convincing 29% of the judges.

This suggests that programs able to pass the Turing test without difficulty will appear in the near future.

September 15, 2009 at 08:44 pm

Turing test


So today we will talk about the most famous test for evaluating a talking bot - the Turing test.

The Turing test is an empirical test, the idea of which was proposed by Alan Turing in the article "Computing Machinery and Intelligence", published in 1950 in the philosophical journal Mind. Turing set out to determine whether a machine could think.
The standard formulation of the test: “If a computer can operate in such a way that a person is unable to determine whether he is communicating with another person or with a machine, it is considered to have passed the Turing test.”

Intelligent, human-like machines have been a major theme in science fiction for many decades. Since the birth of modern computing technology, people's minds have been occupied by the question: is it possible to build a machine that could in some way replace a person. An attempt to create solid empirical ground for resolving this issue was the test developed by Alan Turing.
The first version of the test, published in 1950, was somewhat confusing. The modern version of the Turing test is the following task. A group of experts communicates with an unknown entity. They do not see their interlocutor and can communicate with it only through some kind of isolating system, for example a keyboard. They are allowed to ask their interlocutor any questions and conduct a conversation on any topic. If at the end of the experiment they cannot tell whether they were talking to a person or a machine, and if in fact they were talking to a machine, the machine can be considered to have passed the Turing test.
There are at least three main versions of the Turing test, two of which were proposed in the article "Computing Machinery and Intelligence", and the third version, in Saul Traiger's terminology, is the standard interpretation.

While there is some debate as to whether the modern interpretation corresponds to what Turing described or is the result of a misinterpretation of his work, the three versions are not considered equivalent, and their strengths and weaknesses differ.
Imitation game

Turing, as we already know, described a simple party game that involves a minimum of three players. Player A is a man, Player B is a woman, and Player C, who acts as the interrogator, is of either gender. According to the rules of the game, C sees neither A nor B and can communicate with them only through written messages. By asking players A and B questions, C tries to determine which of them is the man and which is the woman. Player A's job is to confuse player C so that he draws the wrong conclusion. At the same time, player B's task is to help player C make the right judgment.

In what S. G. Sterrett calls the Original Imitation Game Test, Turing proposes that the role of Player A be played by a computer. Thus, the computer's task is to pretend to be a woman in order to confuse player C. The success of such a task is assessed by comparing the outcomes of the game when player A is a computer with the outcomes when player A is a man. If, in Turing's words, the interrogator decides "wrongly as often when the game is played like this as he does when the game is played between a man and a woman," then the computer can be said to be intelligent.
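The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the tallies are hypothetical, "as often" is read here as "at least as often", and no statistical significance testing is attempted.

```python
from fractions import Fraction


def error_rate(wrong_decisions: int, games: int) -> Fraction:
    """Share of games in which the interrogator (player C) decided wrongly."""
    if games <= 0:
        raise ValueError("games must be positive")
    return Fraction(wrong_decisions, games)


def computer_matches_man(wrong_vs_computer: int, games_vs_computer: int,
                         wrong_vs_man: int, games_vs_man: int) -> bool:
    """Assumed reading of Turing's criterion: the interrogator is wrong at
    least as often when player A is a computer as when player A is a man."""
    return (error_rate(wrong_vs_computer, games_vs_computer)
            >= error_rate(wrong_vs_man, games_vs_man))


# Hypothetical tallies: 12 wrong decisions in 40 games against the computer,
# 10 wrong decisions in 40 games against a man.
print(computer_matches_man(12, 40, 10, 40))
```

Exact fractions are used instead of floating-point division so that equal rates such as 3/10 and 12/40 compare as equal.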

The second option was proposed by Turing in the same article. As in the Original Imitation Game Test, the role of Player A is played by the computer. The difference is that the role of Player B can be played by either a man or a woman.

“Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?” - Turing, 1950, p. 442.

In this variation, both players A and B try to persuade the interrogator to make the wrong decision.

The main idea of this version is that the purpose of the Turing test is not to answer the question of whether a machine can fool the interrogator, but whether a machine can imitate a person. Although there is some debate as to whether Turing intended this option, Sterrett believes that he did and thus merges the second option with the third. Others, including Traiger, disagree. Nevertheless, this led to what might be called the “standard interpretation.” In this version, player A is a computer and player B is a person of either gender. The interrogator's task is now to determine not which of them is the man and which is the woman, but which is the computer and which is the human.

Turing in 2012

A special committee has been created to organize events celebrating the centenary of Turing's birth in 2012. Its task is to convey Turing's message about the intelligent machine, reflected in Hollywood films such as Blade Runner, to the general public, including children. Members of the committee include Kevin Warwick (Chair), Huma Shah (Coordinator), Ian Bland, Chris Chapman, Marc Allen, Rory Dunlop, and Loebner Prize winners Robby Garner and Fred Roberts. The committee is supported by Women in Technology and Daden Ltd.

Text
Artyom Luchko

Britain's University of Reading announced with great fanfare that a "major milestone in the history of computing" had been passed and a computer had passed the Turing test correctly for the first time, misleading judges into believing it was communicating with a 13-year-old Ukrainian boy. Look At Me figured out what really lies behind this event.

What was the experiment


The University of Reading, which conducted the first successful Turing test

The chatbot trial was organized by the School of Systems Engineering at the University of Reading to mark the 60th anniversary of Alan Turing's death. The experts communicated simultaneously with a live person and with the program, which were in different rooms. At the end of the test, each judge had to declare which of the two interlocutors was a person and which was a program. For the purity of the experiment, five computers and 30 judges were used, each of whom conducted a series of 10 written dialogues lasting 5 minutes. By comparison, the annual Loebner Prize competition for artificial intelligence programs (in which programs compete to pass the Turing test for a prize of $2000) usually involves only 4 chatbots and 4 people. As a result of the experiment, the Eugene Goostman program managed to convince 33% of the jury of its “humanity,” which happened for the first time in history. Robert Llewellyn, one of the judges, a British actor and technology enthusiast, said:

The Turing test was amazing. There were 10 sessions of 5 minutes each, 2 screens, 1 person and 1 machine. I guessed correctly only 4 times. This robot turned out to be a smart guy...

The chatbot Eugene Goostman was developed by Russian native Vladimir Veselov (he now lives in the USA) and Ukrainian Evgeniy Demchenko, who lives in Russia. The first version appeared back in 2001. The age of the teenager was not chosen by chance: at 13, a child already knows a lot, but not everything, which complicates the judges' task. In 2012, the chatbot already came quite close to success: 29% of the judges believed in the “humanity” of the Ukrainian schoolboy. During the latest improvements, the programmers were able to prepare the virtual interlocutor for a very wide range of questions and even teach it to select example answers on Twitter.

What is the Turing Test, and what are its disadvantages


Alan Turing aged 16

The Turing test was first proposed by British mathematician Alan Turing in his paper "Computing Machinery and Intelligence", published in the journal Mind in 1950. In it, the scientist asked a simple question: “Can machines think?” In its simplest form, the test is as follows: a judge interacts with one computer and one person. Based on the answers to his questions, he must determine who he is talking to: a person or a computer program. The purpose of the computer program is to mislead the judge into making the wrong choice. The test involves a five-minute text conversation, during which at least 30% of the judges must believe that they are dealing with a person and not a machine. Of course, the test participants do not see each other.
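The 30% criterion is simple arithmetic, and a minimal Python sketch makes it concrete. The function name and the judge tallies below are hypothetical; only the 30% figure comes from the text.

```python
def passes_threshold(judges_fooled: int, judges_total: int,
                     threshold_percent: int = 30) -> bool:
    """Return True if the share of judges who took the program for a human
    reaches the threshold (30% by default, as stated in the text)."""
    if judges_total <= 0:
        raise ValueError("judges_total must be positive")
    if not 0 <= judges_fooled <= judges_total:
        raise ValueError("judges_fooled must be between 0 and judges_total")
    # Integer comparison avoids floating-point rounding at the boundary.
    return judges_fooled * 100 >= threshold_percent * judges_total


# Hypothetical tallies for a panel of 30 judges.
print(passes_threshold(10, 30))  # 10 of 30 is about 33%
print(passes_threshold(8, 30))   # 8 of 30 is about 27%
```

Comparing `judges_fooled * 100` with `threshold_percent * judges_total` keeps the check exact when the share lands precisely on 30%.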


John Searle, American philosopher

There are many different versions of this test (in some variations the judge knows that one of the interlocutors being tested is a computer, in others he does not), but many scientists and philosophers criticize it to this day. American philosopher John Searle challenged the test with a thought experiment known as the “Chinese Room.” He suggested that the ability of a computer to carry on a conversation and answer questions convincingly is far from the same as having a mind and thinking like a person. “Suppose I were locked in a room and [...] that I did not know a single word of Chinese, either written or spoken,” Searle wrote in 1980. He imagined that he was receiving questions written in Chinese through a crack in the wall. He was not able to read these symbols, but had a set of instructions in English that allowed him to respond to "one set of formal symbols with another set of formal symbols." Thus, Searle would theoretically be able to answer questions simply by following the English instructions and choosing the correct Chinese characters. And his interlocutors would be convinced that he could speak Chinese.

Most critics of the Turing test as a way to evaluate artificial intelligence are of a similar opinion. They argue that computers can only use sets of rules and huge databases programmed to answer questions to appear intelligent.

How the program deceived the jury


Reading University Professor Kevin Warwick

Eugene Goostman had two factors that helped him pass the test. Firstly, grammatical and stylistic errors that the machine makes in imitation of a teenager’s writing, and secondly, a lack of knowledge about specific cultural and historical facts, which can also be attributed to the age of the student.

“The success of the program is likely to raise some concerns about the future of information technology,” said University of Reading professor Kevin Warwick. “There is no more iconic or controversial stage in the development of artificial intelligence than passing the Turing test, when a computer convinces enough judges to believe that it is not a machine, but a person, communicating with them. The very existence of a computer that can trick a person into thinking it is a human is a red flag for cybercrime.” The Turing test is still an important tool in combating this threat, and experts now need to understand more fully how the emergence of such advanced chatbots can affect online communication.

Judging by the logs that can be found on the Internet (it is not yet possible to try the bot yourself; the site probably could not handle the traffic caused by the hype and went down), the chatbot is quite primitive and, at first glance, not very different from similar developments available online. One of the interesting dialogues with “Eugene” was presented by journalist Leonid Bershidsky, who asked it uncomfortable questions about a high-profile event that could not have passed the young Odessa resident by.

Even taking into account the well-developed character and biography, and the mistakes and typos that a real teenager might make, the persuasiveness of the bot is questionable. In fact, it too reacts to keywords, and when it is stumped, it produces pre-prepared and not very original placeholder answers. If the program could use search engines to place itself in the context of current world events, we might see a much more impressive result. This will probably take time. The famous futurist Raymond Kurzweil, a director of engineering at Google, has previously stated that computers will easily pass the Turing test by 2029. By his estimate, by that time they will have mastered human language and surpassed humans in intelligence.



There is probably not a person today who has not at least once heard of such a concept as the Alan Turing test. Most people are probably far from understanding what such a testing system is. Therefore, let us dwell on it in a little more detail.

What is the Turing Test: Basic Concept

Back in the late 40s of the last century, many scientific minds were engaged in the problems of the first computer developments. It was then that one of the members of a certain non-governmental group Ratio Club, engaged in research in the field of cybernetics, asked a completely logical question: is it possible to create a machine that would think like a person, or at least imitate his behavior?

Do I need to say who invented the Turing test? Apparently not. The initial basis of the entire concept, which is still relevant today, was the following principle: will a person, after some time communicating with an invisible interlocutor on arbitrary topics, be able to determine who is in front of him, a real person or a machine? In other words, the question is not only whether a machine can imitate the behavior of a real person, but also whether it can think for itself. This issue still remains controversial.

History of creation

In general, if we consider the Turing test as a kind of empirical system for determining the “human” capabilities of a computer, it is worth saying that the indirect basis for its creation were the curious statements of the philosopher Alfred Ayer, which he formulated back in 1936.

Ayer himself compared, so to speak, the life experiences of different people, and on the basis of this expressed the opinion that a soulless machine would not be able to pass any test, since it could not think. At best, this will be pure imitation.

In principle, this is how it is. Imitation alone is not enough to create a thinking machine. Many scientists cite the example of the Wright brothers, who built the first airplane, abandoning the tendency to imitate birds, which, by the way, was characteristic of such a genius as Leonardo da Vinci.

History is silent on whether Turing himself (1912-1954) knew about these postulates. However, in 1950 he compiled a whole system of questions that could determine the degree of “humanization” of a machine. This development is still one of the fundamental ones, although mainly for testing things such as computer bots. In practice, only a few programs have managed to pass the Turing test. Even then, “pass” is a stretch, since the test result has never reached 100 percent; at best it has been a little more than 50.

At the very beginning of his research, the scientist relied on his own invention, known as the Turing machine. It works with symbols printed on a tape, and the scientist defined several basic instructions for it, such as moving the tape to the left or right, printing a specific character, and so on.

Programs ELIZA and PARRY

Over time, the programs became more complex, and two of them, in situations where the Turing test was applied, showed stunning results at that time. These were ELIZA and PARRY.

As for ELIZA, created in 1966: the machine had to identify a keyword in the question and build its reply around it. This is what made it possible to deceive real people. If there was no such word, the machine returned a generalized answer or repeated one of the previous ones. However, whether ELIZA passed the test is still in doubt, since the real people who communicated with the program had been led to believe in advance that they were talking to a person and not to a machine.

The PARRY program is somewhat similar to Eliza, but was created to simulate the communication of a paranoid person. What’s most interesting is that real clinic patients were used to test it. After recording the transcripts of the conversations via teletype, they were assessed by professional psychiatrists. Only in 48 percent of cases were they able to correctly assess where the person was and where the machine was.

In addition, almost all programs of that time worked taking into account a certain period of time, since a person in those days thought much faster than a machine. Now it's the other way around.

Supercomputers Deep Blue and Watson

The developments of the IBM corporation looked quite interesting; they not only thought, but had incredible computing power.

Many people probably remember how in 1997 the supercomputer Deep Blue won a six-game chess match against the then world champion Garry Kasparov. Strictly speaking, the Turing test is only very loosely applicable to this machine, because it initially contained many game templates covering an enormous number of possible developments. The machine could evaluate about 200 million board positions per second!

The Watson computer, consisting of 360 processors and 90 servers, won the American television quiz show Jeopardy! against two other participants, outperforming them in all respects and receiving a $1 million prize. Again, the question is moot, because the machine was loaded with incredible amounts of encyclopedic data and simply analyzed each question for keywords, synonyms, or general matches, and then gave the correct answer.

Eugene Goostman emulator

One of the most interesting developments in this area was the Eugene Goostman program, created by Ukrainian Evgeniy Demchenko and Russian engineer Vladimir Veselov, now living in the United States, which imitated the personality of a 13-year-old boy from Odessa.

On June 7, 2014, the Eugene program demonstrated its full capabilities. Interestingly, 5 bots and 30 human judges took part in testing. In 33% of cases the jury failed to determine that it was talking to a computer. The task was complicated by the fact that a child has less knowledge than an adult.

The Turing test questions were of the most general kind. However, Eugene was also asked some specific questions about events in Odessa that could not have gone unnoticed by any resident. Even so, the answers made the jury think that it was talking to a child. For example, the program answered the question about its place of residence immediately. When asked whether it had been in the city on such and such a date, the program stated that it did not want to talk about it. When the interlocutor insisted on discussing what exactly had happened that day, Eugene deflected by saying, “You yourself should know, why ask?” In general, the child emulator turned out to be extremely successful.

However, this is still an emulator, not a thinking creature. So the machine uprising will not happen for a very long time.

The other side of the coin

Finally, it remains to add that so far there are no prerequisites for creating thinking machines in the near future. Nevertheless, while recognition used to be a problem for machines, now almost every one of us has to prove that he is not a machine. Just think of entering a captcha on the Internet to gain access to some action. So far, it is believed that no electronic device has been created that can recognize distorted text or a set of characters as a person can. But who knows, everything is possible...

Standard interpretation of the Turing test

The Turing test is an empirical test, the idea of which was proposed by Alan Turing in the article “Computing Machinery and Intelligence”, published in 1950 in the philosophical journal Mind. Turing set out to determine whether a machine could think.

The standard interpretation of this test is as follows: “A person interacts with one computer and one person. Based on the answers to the questions, he must determine who he is talking to: a person or a computer program. The purpose of the computer program is to mislead the person into making the wrong choice.”

All test participants cannot see each other. If the judge cannot say for sure which of the interlocutors is human, then the machine is considered to have passed the test. To test the intelligence of the machine, and not its ability to recognize spoken language, the conversation is conducted in “text only” mode, for example, using a keyboard and a screen (an intermediary computer). Correspondence should occur at controlled intervals so that the judge cannot draw conclusions based on the speed of responses. In Turing's time, computers were slower than humans. Now this rule is necessary because they react much faster than humans.
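One way to picture the "controlled intervals" rule is a relay that holds every finished reply until the next fixed boundary, so that the judge sees replies arrive on the same schedule whoever wrote them. The sketch below is a hypothetical illustration: the function name and the 30-second interval are assumptions, not part of any documented test protocol.

```python
import math


def release_delay(elapsed_seconds: float, interval_seconds: float = 30.0) -> float:
    """Seconds to hold a finished reply so that it is delivered on the next
    interval boundary counted from the moment the question was sent."""
    if elapsed_seconds < 0 or interval_seconds <= 0:
        raise ValueError("elapsed must be >= 0 and interval must be > 0")
    # A reply always waits for at least the first boundary, so even an
    # instant answer from a program is held for one full interval.
    boundaries = max(1, math.ceil(elapsed_seconds / interval_seconds))
    return boundaries * interval_seconds - elapsed_seconds


# A program that answers in 2 seconds and a person who needs 45 seconds
# are both delivered on a 30-second grid.
print(release_delay(2.0))
print(release_delay(45.0))
```

With this scheme the only timing information that reaches the judge is the number of intervals a reply took, not the raw typing or computing speed.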

History

Philosophical background

Although research into artificial intelligence began in 1956, its philosophical roots go deep into the past. The question of whether a machine can think or not has a long history. It is closely related to the differences between dualistic and materialistic views. From the point of view of dualism, thought is not material (or at least does not have material properties), and therefore the mind cannot be explained using physical concepts alone. On the other hand, materialism states that minds can be explained physically, thus leaving open the possibility of artificially created minds.

Alan Turing

By 1956, British scientists had been researching “machine intelligence” for 10 years. The question was a common subject of discussion among members of the Ratio Club, an informal group of British cyberneticists and electronics researchers that included Alan Turing, after whom the test was named.

Turing had been particularly concerned with the problem of machine intelligence since at least 1941. One of his earliest references to “computer intelligence” was made in 1947. In his report “Intelligent Machinery,” Turing explored the question of whether a machine could exhibit intelligent behavior, and as part of this study he proposed what could be considered a precursor to his later test: “It is not difficult to design a machine that can play chess tolerably well. Now let's take three people as subjects of the experiment: A, B and C. Let A and C play chess poorly, and B be a machine operator. […] Two rooms are used, as well as some mechanism for communicating moves. Participant C plays either with A or with the machine. Participant C may find it difficult to tell who he is playing with.”

Thus, by the time he published his paper “Computing Machinery and Intelligence” in 1950, Turing had already been considering the possibility of artificial intelligence for many years. However, this paper was Turing's first paper to deal exclusively with this concept.

Turing begins his article with the statement: "I propose to consider the question 'Can machines think?'." He emphasizes that the traditional approach to this issue is to first define the concepts of “machine” and “intelligence.” Turing, however, chose a different path; instead, he replaced the original question with another "that is closely related to the original and is stated in relatively unambiguous terms." Essentially, he proposes replacing the question “Do machines think?” with the question “Can machines do what we (as thinking creatures) can do?” The advantage of the new question, Turing argues, is that it draws “a clear line between human physical and intellectual capabilities.”

To demonstrate this approach, Turing proposes a test inspired by a party game known as the "imitation game." In this game, a man and a woman are sent to separate rooms, and guests try to tell them apart by asking them a series of written questions and reading the typed answers. According to the rules of the game, both the man and the woman try to convince the guests that they are the other. Turing proposes to remake the game as follows: "We now ask the question, 'What will happen when a machine takes the part of A in this game?' Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, 'Can machines think?'"

Later in the same paper, Turing proposes an "equivalent" alternative formulation involving a judge who converses only with a computer and a human. While neither of these formulations corresponds exactly to the version of the Turing test that is best known today, in 1952 the scientist proposed a third. In this version of the test, which Turing discussed on BBC Radio, a jury asks questions of a computer, and the computer's role is to make a large proportion of the jury believe that it is in fact human.

Turing's paper addresses nine anticipated objections, which include all of the major arguments against artificial intelligence raised since the paper was first published.

Eliza and PARRY

Blay Whitby points to 4 major turning points in the history of the Turing test: the publication of the paper "Computing Machinery and Intelligence" in 1950, the announcement of Joseph Weizenbaum's creation of the ELIZA program in 1966, Kenneth Colby's creation of the PARRY program, which was first described in 1972, and the Turing Colloquium in 1990.

The principle of Eliza's work is to examine user-entered comments for the presence of keywords. If a keyword is found, a rule is applied that transforms the user's comment, and the resulting sentence is returned. If no keyword is found, Eliza either returns a general response to the user or repeats one of the previous comments. In addition, Weizenbaum programmed Eliza to imitate the behavior of a client-centered therapist. This allows Eliza to "pretend that she knows almost nothing about the real world." By using these methods, Weizenbaum's program was able to mislead some people into thinking they were talking to a real person, and some were "very difficult to convince that Eliza […] was not human." On this basis, some argue that Eliza is one of the programs (perhaps the first) that was able to pass the Turing test. However, this statement is very controversial, since the people "asking the questions" were instructed to think that they would be talking to a real therapist, and were not aware that they could be talking to a computer.
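The keyword mechanism described above can be illustrated with a minimal Python sketch. It is not Weizenbaum's original script: the three rules, the generic reply, and the function name `respond` are hypothetical and chosen only to show the keyword, transformation, and fallback steps.

```python
import re

# Hypothetical keyword rules: a pattern to look for and a reply template
# that reuses part of the user's own comment.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "What makes you feel {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
GENERIC_REPLY = "Please go on."


def respond(comment: str, history: list[str]) -> str:
    """Reply to one user comment and record it in the history."""
    text = comment.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Keyword found: transform the comment into a reply.
            reply = template.format(match.group(1))
            break
    else:
        # No keyword: repeat an earlier comment if there is one,
        # otherwise fall back to a generic response.
        reply = f"Earlier you said: {history[-1]}" if history else GENERIC_REPLY
    history.append(text)
    return reply


history: list[str] = []
print(respond("I am tired.", history))
print(respond("The weather is nice", history))
```

Even this toy version shows why the approach can seem convincing for a few exchanges: every reply is built from the user's own words, so the program never has to know anything about the world.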

Colloquium on Conversational Systems, 2005

In November 2005, the University of Surrey hosted a one-day meeting of ACE developers, which was attended by the winners of the practical Turing tests as part of the Loebner Prize competition: Robby Garner, Richard Wallace, Rollo Carpenter. Guest speakers included David Hamill, Hugh Loebner and Huma Shah.

AISB Society Symposium on the Turing Test, 2008

In 2008, alongside the regular Loebner Prize competition held at the University of Reading, the Society for the Study of Artificial Intelligence and Simulation of Behaviour (AISB) held a one-day symposium at which the Turing test was discussed. The symposium was organized by John Barnden, Mark Bishop, Huma Shah and Kevin Warwick. Speakers included the director of the Royal Institution, Baroness Susan Greenfield, Selmer Bringsjord, Turing biographer Andrew Hodges and scientist Owen Holland. No agreement on a canonical Turing test emerged, but Bringsjord suggested that a larger prize would encourage the Turing test to be passed more quickly.

The year of Alan Turing and Turing-100 in 2012

The centenary of Alan Turing's birth will be celebrated in 2012, with many major events taking place throughout the year. Many of them will be held in places that were significant in Turing's life: Cambridge, Manchester and Bletchley Park. The Year of Alan Turing is overseen by the TCAC (Turing Centenary Advisory Committee), which provides professional and organizational support for the 2012 events. The events are also supported by ACM, ASL, SSAISB, BCS, BCTCS, Bletchley Park, BMC, BLC, CCS, Association CiE, EACSL, EATCS, FoLLI, IACAP, IACR, KGS and LICS.

A special committee has been created to organize events celebrating the centenary of Turing's birth in June 2012. Its task is to convey Turing's message about the intelligent machine, reflected in Hollywood films such as Blade Runner, to the general public, including children. Members of the committee include Kevin Warwick (chair), Huma Shah (coordinator), Ian Bland, Chris Chapman, Marc Allen, Rory Dunlop, and Loebner Prize winners Robby Garner and Fred Roberts. The committee is supported by Women in Technology and Daden Ltd.

At this competition, a team of Russians whose names were not disclosed presented the program “Eugene”. The 150 tests conducted (in effect, five-minute conversations) involved five new programs, which were “hidden” among 25 ordinary people. “Eugene”, which portrays a 13-year-old boy living in Odessa, was the winner, misleading the examiners in 29.2% of cases. The program thus fell only 0.8 percentage points short of the 30% mark needed to pass the test.
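
The arithmetic behind the 0.8 figure can be checked with a few lines of Python. The sketch assumes that the pass mark is 30% of misled examiners, which is what the figures in the text imply; the function name and constant are hypothetical.

```python
PASS_MARK = 30.0  # assumed pass mark, in percent of misled examiners


def shortfall(fooled_percent, pass_mark=PASS_MARK):
    """Percentage points by which a program misses the pass mark (0.0 if it passes)."""
    return max(0.0, round(pass_mark - fooled_percent, 1))


print(shortfall(29.2))  # 0.8
```

The gap is measured in percentage points, not as a percentage of the program's own result.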

Variants of the Turing Test

The imitation game as described by Turing in the article "Computing Machinery and Intelligence". Player C, by asking a series of questions, tries to determine which of the other two players is a man and which is a woman. Player A, a man, tries to confuse Player C, while Player B tries to help C.

The original test based on the imitation game, in which the computer plays in place of Player A. The computer must now confuse Player C, while Player B continues to try to help C.

There are at least three main versions of the Turing test: two were proposed in the article "Computing Machinery and Intelligence", and the third is what Saul Traiger calls the standard interpretation.

While there is some debate as to whether the modern interpretation corresponds to what Turing described or is the result of a misinterpretation of his work, the three versions are not considered equivalent, and their strengths and weaknesses differ.

Imitation game

Turing, as we already know, described a simple party game involving at least three players. Player A is a man, Player B is a woman, and Player C, who acts as the interrogator, may be of either gender. Under the rules of the game, C can see neither A nor B and can communicate with them only through written messages. By putting questions to players A and B, C tries to determine which of them is the man and which is the woman. Player A's job is to confuse player C into drawing the wrong conclusion, while player B's task is to help player C reach the right one.

In what S. G. Sterrett calls the Original Imitation Game Test, Turing proposes that the role of Player A be played by a computer. The computer's task is thus to pretend to be a woman in order to confuse player C. Success is assessed by comparing the outcomes of the game when player A is a computer with the outcomes when player A is a man.
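
The comparison of outcomes can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the round results are invented, and Turing's paper does not prescribe a particular statistical procedure.

```python
def wrong_rate(outcomes):
    """Fraction of rounds in which interrogator C identified the players wrongly.

    `outcomes` is a list of booleans: True means C decided wrongly in that round.
    """
    if not outcomes:
        raise ValueError("at least one round is required")
    return sum(outcomes) / len(outcomes)


def compare_oig(rounds_with_man, rounds_with_computer):
    """Compare how often C is misled when A is a man and when A is a computer."""
    man_rate = wrong_rate(rounds_with_man)
    computer_rate = wrong_rate(rounds_with_computer)
    return {
        "man": man_rate,
        "computer": computer_rate,
        # The computer does as well as the man if it misleads C at least as often.
        "computer_does_as_well": computer_rate >= man_rate,
    }


# Invented results for illustration only: four rounds in each setting.
print(compare_oig([True, False, False, True], [True, False, False, False]))
```

The criterion is a comparative frequency over many rounds, so a single lucky round in which the computer misleads C decides nothing.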

The second version was proposed by Turing in the same article. As in the Original Imitation Game Test, the role of Player A is played by the computer. The difference is that the role of Player B can be played by either a man or a woman.

“Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?” - Turing, 1950, p. 442.

In this variation, both players A and B try to lead the interrogator (player C) to the wrong decision.

Standard interpretation

The main idea of this version is that the purpose of the Turing test is not to answer the question of whether a machine can fool the interrogator, but whether a machine can imitate a person. There is some debate as to whether Turing intended this version: Sterrett believes that he did and thus combines the second version with the third, while a group of opponents, including Traiger, disagrees. Nevertheless, this led to what might be called the “standard interpretation.” In this version, player A is a computer and player B is a person of either gender. The interrogator's task is no longer to determine which of them is the man and which the woman, but which is the computer and which the human.

The Imitation Game Compared to the Standard Turing Test

There is disagreement about which version Turing had in mind. Sterrett insists that Turing's paper implies two different versions of the test which, according to Turing, are not equivalent to each other. The test that uses the party game and compares success rates is called the Original Imitation Game Test, while the test based on a judge conversing with a human and a machine is called the Standard Turing Test. Note that Sterrett equates the latter with the standard interpretation, not with the second version of the imitation game.

Sterrett agrees that the Standard Turing Test (STT) has the shortcomings that its critics point out, but argues that the Original Imitation Game Test (OIG Test) is free of many of them because of a key difference: unlike the STT, it does not make similarity to human behavior the criterion, although it does use human performance in setting the criterion for machine intelligence. A man can fail the OIG test, but this is argued to be a merit of an intelligence test, since failure indicates a lack of resourcefulness: the OIG test requires the resourcefulness associated with intelligence, not merely "imitation of a person's conversational behavior." The general structure of the OIG test could even be used in non-verbal versions of the imitation game.

Still other writers have interpreted Turing as proposing that the imitation game itself is the test. They do not explain, however, how this position squares with Turing's statement that the test he proposed on the basis of the party game rests on a criterion of comparative frequency of success in the imitation game, not on the ability to win a single round.

Should the judge know about the computer?

In his writings, Turing never makes clear whether the judge knows that one of the participants in the test is a computer. For the OIG test, Turing says only that player A is to be replaced by a machine, not whether player C is aware of this. When Colby, F.D. Hilf, and A.D. Kramer tested PARRY, they decided that the judges did not need to know that one or more of the interlocutors would be computers. As A. Saygin and other experts note, this makes a significant difference to how the test is implemented and to its results.

Advantages of the test

Breadth of subject matter

The strength of the Turing test is that the conversation can be about anything. Turing wrote that "the question and answer method seems to be suitable for introducing almost any one of the fields of human endeavour that we wish to include." John Haugeland added that “understanding the words is not enough; you have to understand the topic as well.” To pass a well-designed Turing test, a machine must use natural language, reason, have knowledge, and learn. The test can be made harder by adding video input or, for example, a hatch through which objects can be passed: the machine would then also have to demonstrate vision and robotics skills. Together, these tasks reflect the main problems facing artificial intelligence research.

Tractability and simplicity

The power and appeal of the Turing test come from its simplicity. Philosophy of mind, psychology and modern neuroscience have been unable to provide definitions of “intelligence” and “thinking” that are precise and general enough to be applied to machines. Without such definitions, the central questions of the philosophy of artificial intelligence cannot be answered. The Turing test, while imperfect, at least provides something that can actually be measured. As such, it is a pragmatic solution to a difficult philosophical question.

Disadvantages of the test

Despite all its merits and popularity, the test is criticized on several grounds.

The human mind and the mind in general

Human behavior and rational behavior

The Turing test is explicitly oriented towards humans (anthropomorphism). It tests only the ability of a machine to resemble a person, not the intelligence of the machine in general. The test is unable to assess the general intelligence of a machine for two reasons:

  • Some human behavior is not intelligent. The Turing test nevertheless requires a machine to imitate all kinds of human behavior, whether intelligent or not. It even tests for behavior that a person would not consider intelligent at all, such as susceptibility to insults, the temptation to lie, or simply a high frequency of typos. If a machine cannot imitate human behavior in this detail, typos included, it fails the test, however intelligent it may be.
  • Some intelligent behavior is not human. The Turing test does not test for highly intelligent behavior, such as the ability to solve difficult problems or come up with original ideas. In effect, the test requires the machine to deceive: however smart the machine is, it must pretend not to be too smart in order to pass. If a machine can quickly solve a computational problem that is beyond human ability, it will, by definition, fail the test.

Impracticality

Extrapolating from the exponential growth of technology over several decades, futurist Raymond Kurzweil suggested that machines capable of passing the Turing test would be available around 2020. This reasoning echoes Moore's law.

The Long Bet Project includes a $20,000 bet between Mitch Kapor (the pessimist) and Raymond Kurzweil (the optimist) on whether a computer will pass the Turing test by 2029. The detailed conditions of the bet are also specified.

Variations of the Turing Test

Numerous versions of the Turing test, including those described earlier, have been discussed for quite some time.

Reverse Turing test and CAPTCHA

A modification of the Turing test in which the objective, or one or more of the roles of machine and human, is reversed is called a reverse Turing test. An example is found in the work of psychoanalyst Wilfred Bion, who was particularly fascinated by the way mental activity is stirred by an encounter with another mind.

Expanding on this idea, R. D. Hinshelwood described the mind as a “mind recognition apparatus,” noting that this could be considered an “add-on” to the Turing test. The computer’s task would then be to determine whether it was talking to a person or to another computer. This extends the question Turing was trying to answer, and it arguably sets a sufficiently high standard for deciding whether a machine can “think” in the way we normally attribute to a human being.

CAPTCHA is a type of reverse Turing test. Before being allowed to perform some action on a site, the user is shown a distorted image containing a set of numbers and letters and is asked to type that set into a special field. The purpose is to prevent automated systems from attacking the site. The rationale is that so far there are no programs powerful enough to recognize and accurately reproduce text from a distorted image (or they are not available to ordinary users), so a system that can do this is most likely a human. The conclusion (although not a necessary one) is that artificial intelligence has not yet been created.
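
The challenge-and-response part of a CAPTCHA can be sketched in Python as below. This is a simplified illustration under stated assumptions: the function names, alphabet and challenge length are hypothetical, and the step that actually stops bots, rendering the challenge as a distorted image, is left out entirely.

```python
import hmac
import secrets
import string

# Hypothetical alphabet and length; real CAPTCHAs often drop look-alike symbols.
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Generate a random string of letters and digits to show to the user.

    In a real CAPTCHA this string would be rendered as a distorted image;
    that step is omitted from this sketch.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def verify(challenge, response):
    """Check the user's typed response against the challenge, ignoring case."""
    typed = response.strip().upper()
    # Compare as bytes in constant time.
    return hmac.compare_digest(challenge.encode("utf-8"), typed.encode("utf-8"))


challenge = new_challenge()
print(verify(challenge, challenge.lower()))  # True: case is ignored
print(verify(challenge, challenge[:-1]))     # False: one character is missing
```

The server keeps the challenge string and sends only the image, so a program that cannot read the distorted text has nothing to copy.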

Turing test with an expert

This variation of the test is described as follows: the machine's answer must be indistinguishable from that of an expert, a specialist in a particular field of knowledge. As technology for scanning the human body and brain develops, it will become possible to copy the necessary information into a computer.

Immortality test

The Immortality Test is a variation of the Turing test that determines whether a person's character has been adequately conveyed, namely whether the copied character can be distinguished from the character of the person who served as its source.

Minimum Intelligent Signal Test (MIST)

MIST was proposed by Chris McKinstry. In this variation of the Turing test, only two types of answers are allowed - “yes” and “no”. Typically, MIST is used to collect statistical information that can be used to measure the performance of programs that implement artificial intelligence.
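
One way to picture the statistics MIST collects is the minimal Python sketch below. It assumes that each question has a known reference answer and that random guessing is right half the time; the function name and the sample answers are hypothetical, not part of McKinstry's proposal.

```python
CHANCE_LEVEL = 0.5  # expected score of a program that answers at random


def mist_score(answers, expected):
    """Fraction of yes/no answers that agree with the reference answers."""
    if not answers or len(answers) != len(expected):
        raise ValueError("answers and expected must be non-empty and of equal length")
    if not set(answers) | set(expected) <= {"yes", "no"}:
        raise ValueError("only 'yes' and 'no' are allowed")
    hits = sum(given == reference for given, reference in zip(answers, expected))
    return hits / len(answers)


# Invented answers for illustration only.
score = mist_score(["yes", "no", "yes", "yes"], ["yes", "no", "no", "yes"])
print(score, score > CHANCE_LEVEL)  # 0.75 True
```

Because every answer is a single bit, the results of different programs can be compared directly with each other and with the chance level.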

Meta Turing Test

In this variation of the test, a subject (say, a computer) is considered intelligent if it has created something that it itself wants to test for intelligence.

Hutter Prize

The Hutter Prize organizers believe that compressing natural language text is a difficult task for artificial intelligence, equivalent to passing the Turing test.

The information compression test has certain advantages over most of the variants and variations of the Turing test:

  • Its result is a single number by which one can judge which of the two machines is “more intelligent”.
  • The computer is not required to lie to the judge - teaching computers to lie is considered a bad idea.

The main disadvantages of this test are:

  • It is impossible to test a person with its help.
  • It is unknown what result (if any) is equivalent to passing the Turing test (at the human level).
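
The idea of scoring a program by a single compression figure can be illustrated with Python's standard `zlib` module. This is only a sketch: `zlib` is a general-purpose compressor used here as a stand-in, the function name is hypothetical, and the prize's actual corpus and scoring rules are not reproduced.

```python
import zlib


def compression_ratio(text):
    """Compressed size divided by original size, in bytes (lower is better)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    packed = zlib.compress(raw, 9)  # 9 is the highest zlib compression level
    return len(packed) / len(raw)


sample = "the cat sat on the mat. " * 200
print(compression_ratio(sample))
```

Two compressors can be ranked simply by comparing their ratios on the same text, which is the "single number" property mentioned above.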

Other intelligence tests

There are many intelligence tests that are used to test people. It is possible that they can be used to test artificial intelligence. Some tests (such as the C test) derived from Kolmogorov complexity are used to test people and computers.


