Another look at artificial unintelligence

Paper given by John Higgins at the Man and the Media conference held in London in 1989.

The subtitle of my book Language, Learners and Computers (1988) is "Human Intelligence and Artificial Unintelligence". Some people have got the impression from this phrase that I am hostile to Artificial Intelligence. Far from it. I am fascinated by it. The real point I wanted to make was rather that the Artificial Unintelligence of which the computer is so magnificently capable is something we can and should exploit, in addition to any intelligence we are striving to engineer. Regard unintelligence as a potential asset, not a liability. After all, it has long been the quality that military leaders have most wanted from their foot soldiers: 'Ours not to reason why, ours but to do and die.' Do you want to be guarded by an intelligent sentry, or by one who woodenly demands that everybody approaching the barracks be challenged and searched, no matter how much they look like the commander-in-chief?

Intelligence tests

To try and illustrate one difference between intelligence and unintelligence, I invite you to consider the following test item:

AB is to dc as SR is to ...
	(1) qp
	(2) pq
	(3) ut

The correct answer is pq.

The question actually comes from a work assessment test battery published by NFER-Nelson. I don't quite know what it means if you get this and the other reasoning questions in the battery right, but I do find it interesting to think of the processes we go through in finding this answer or indeed one of the other answers. In the first place we have to look at the two pairs of letters in the lead and decide how they are related. Probably you made some kind of formulation like this: I see a consecutive pair of upper case letters, then their two successors in the alphabet in reverse order and in lower case. Next you have to apply this formula to the prompt. Given SR, what are the successors? Either TU, if we are considering a forward alphabet, or QP if we are looking at a reversed alphabet. Now apply the rest of the formula; if TU then the answer would be ut. But wait a minute. Since the SR pair is actually RS reversed, shouldn't the TU pair be unreversed? OK, let's look at the other possibility. If the successors of SR are QP, then we reverse them and get the lower case pq, and that is one of the answers offered. Gotcha!

Two processes

If you got the answer wrong, I expect that the mistake you made was in the second part of that process, applying the algorithm, rather than in the first part, deducing the algorithm. You might have re-reversed the pq pair or not reversed it in the first place. You might not have noticed that SR are in reversed order, since we are highly accustomed to normal alphabetical order and mentally screen out what we perceive as blips. What is interesting is that if a machine had to solve this particular puzzle, it would have no problem at all with this second stage but would find the first stage extremely difficult. To write a computer program that would solve such problems, you would probably need to devote dozens of pages of code to the process of making the deductions from data, but only a couple of lines to handling the generation and output of the answer. Of course, once you had done it, it would become much more efficient than we are at solving all puzzles of this type. For instance, it would take no longer to find the answer to one like this:

AD is to jg as SP is to ...
	(1) jm
	(2) mj
	(3) yv

This is exactly the same problem as the first but dealing with pairs which are three apart in the alphabet rather than adjacent. Your computer program, however, would be limited to considering the problem purely as one of dealing with ordered symbols. It would be incapable of doing what a human might do with the first puzzle, namely looking at it and thinking, 'Well, AB means Able Seaman, dc stands for direct current, SR is a brand of toothpaste, pq suggests "minding one's ps and qs" so it could have something to do with good manners; can I make any sense out of those meanings and their relationships?' That line of thought would be wasted in this case but might well be the right one in another context, e.g.

AD is to BC as NT is to ...
	(1) OP
	(2) OT
	(3) PR

In this case the correct answer is (2) OT, since the relationship AD (Anno Domini) to BC (Before Christ) parallels that of NT (New Testament) to OT (Old Testament).
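
The contrast between the two stages can be made concrete. The sketch below, in Python rather than the BASIC used later in this paper, covers only the second stage: applying a rule once it has been deduced. The rule itself is written in by hand, and the function name apply_rule is an invention for illustration; the hard first stage, deducing the rule from the lead pair, is not attempted. The sketch also assumes the letters never run off either end of the alphabet.

```python
def apply_rule(pair):
    """Apply the hand-coded rule: continue the sequence, reverse, lower-case."""
    first, second = ord(pair[0]), ord(pair[1])
    step = second - first      # +1 for AB, -1 for SR, +3 for AD, -3 for SP
    third = second + step      # next letter in the same direction
    fourth = third + step      # and the one after that
    # The two successors are written in reverse order and in lower case.
    return (chr(fourth) + chr(third)).lower()


for lead in ("AB", "SR", "AD", "SP"):
    print(lead, "->", apply_rule(lead))
```

Under this reading of the rule the four pairs give dc, pq, jg and jm, so the second puzzle is answered by jm. The third puzzle is beyond it, since the function treats the letters purely as ordered symbols.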

Language teaching

What, you may be asking, has this got to do with artificial intelligence in language teaching? Perhaps nothing very much, except to remind us that many things that humans do effortlessly are done at best laboriously by machines. The main benefit we derive from doing them with machines is to come to understand better the nature of our own abilities. This is what could be described as the computer-as-mirror metaphor.

Consider just what is involved in computerising acts of teaching. When you observe teaching happening, you see and hear people engaged in activity, most of which is verbal. One of the people is usually bigger than the others, or is in a high place, or is standing while the others sit. The big person gives some instructions, and the little people do something. The big person asks a question; one of the little people answers it; the big person makes some form of acknowledgement such as 'Right' and then asks another question. Or perhaps a little person asks a question, the big person answers it, and the little person may make some acknowledgement such as 'I see'.

The machine's role

Of course this is a caricature, but it corresponds to the parts of the process which people seem to want to computerise. They want the machine to do the big person's job. The machine can give instructions. Easy. It can ask questions and, to a great extent, respond appropriately when it 'hears' the answers. (It doesn't usually hear answers; it feels them.) This quizzing component of teaching is now quite well understood and embodied in programs; analysing and providing feedback on quiz answers, once the bugbear and disgrace of CALL, can now be done relatively well.

Can the computer answer questions? That takes one into very elaborate areas of programming and is where much AI effort is directed, including all the efforts to generate database queries in SQL (Structured Query Language). Obviously it can only be done well if the machine in some sense knows the subject matter (a very difficult notion) and knows enough about the enquirer to decide why the question was being asked and hence what is relevant to the answer. It would be nice, would it not, if we could computerise grammar and usage in such a way that the computer could answer unanticipated questions intelligently. But wait a minute. When did you last hear anyone explaining to a student why you have to say 'I have been living in London since 1980' rather than 'I am living in London since 1980'? Was it a good explanation? Did the student get a flash of insight from it? Could the machine do better? Could the machine explain why the past tense is used in 'I wish I had a million pounds'? Could it explain why 'danger' is apparently uncountable in 'Danger stalks the streets of Ibiza' and countable in 'A new danger has arisen from the use of unpasteurised milk'? Can it explain why 'information' is uncountable in English but countable in French? Can it explain why we say 'myself' and 'yourself' but not 'hisself'?

The expert explainer

If you want to create an expert explainer, you have to do exactly what you would do in creating any other kind of expert system, such as one to analyse moon rocks or hurricanes. Go along to the human experts and observe what they do, what information they require first, what inferences they draw, what confidence they place in their inferences. So what we have to do is to study expert explainers. How do we find them? It is straightforward enough to find people who are good at explaining things at large, in books, in public lectures. The criterion for their success is the popularity of their public manifestations. But how do we find people who are expert at disentangling individuals' misunderstandings? Can we hold a competition? If so, what tasks would we set and how would we assess the performances of the competitors?

Insight

The trouble is that nobody really knows, not even the participants, when an act of explanation has been successful. I went through mathematics courses at school which taught me a good deal of trigonometry. I could manipulate sines and cosines and emerge with answers to problems that got big ticks put beside them in my exercise book. I got high scores on maths exams. I must, presumably, have asked for explanations and made the right acknowledgements when I got them. To all outward appearances I understood trigonometry. And yet I don't think I understood it at all until, 25 years later, I read a paragraph in the Sinclair Spectrum manual commenting on the trigonometric functions in BASIC (Vickers, 1983, pp. 67-69). 'Imagine the tip of the minute hand of a clock face,' it said. 'It would be useful, as the hand turns, to have a way of measuring how far above the horizontal it is and how far off the vertical.' Sine and cosine! Suddenly these terms began to make sense, to have relevance. Suddenly it was clear why they fluctuated between the values +1 and -1 and why they were out of phase with each other. I was even able to dredge some of my almost forgotten school trigonometry out of my memory and apply it to tasks such as putting wavy eyebrows and moustaches on computer-drawn faces. Can I blame my school for never having explained to me what trigonometry is all about? Or is it just that there can never be a total match between what is taught and what is learned, that all learning is sporadic and serendipitous, the product of good fortune?

Experiment and discovery

These are interesting questions and the computer may well be one means of getting closer to the answers. We discover things about ourselves when we try to replicate our own activities in another medium. But if we meanwhile try to jump straight in and create teaching machines with the intelligence to conduct tutorial dialogue, to mix tasks with explanations, then I doubt whether we yet know enough about the nature of that dialogue to make the thing work. The danger is that we will do what the eighteenth-century inventors wanted to do when they tried to fly by strapping feathered wings on to a man's arms. Birds can fly, they reasoned, and birds have feathers; if we apply feathers to ourselves, then we will be able to fly. It sounds rather plausible. Good teachers give clear explanations, and they supply relevant feedback when pupils answer test questions, and pupils learn in the presence of good teachers; if machines give clear explanations and relevant feedback, pupils will learn in the presence of computers. Plausible? Or is it that we are as ignorant of the dynamics of learning as the old inventors were of the aerodynamics of flight? We know it happens, but we don't really know the conditions which make it possible. I am not for one moment suggesting that we stop trying to do it. That would be arrogant on my part, and it would cut us off from insights and discoveries that can come from the effort. Even failure can be instructive. Sooner or later, as we continue to try and teach with machines, we hope to be brought up short by something which fails so abjectly (or perhaps something that succeeds so perfectly) that we are forced to ask why.

Variability

Meanwhile, intelligence is, in a funny way, related to variability. The most completely unintelligent conversation one could have would be with an interlocutor who always responded with exactly the same words. Imagine a silly computer program which simply gets input and responds 'Hello'.

10 INPUT a$
20 PRINT "Hello"
70 GOTO 10

Humorists, of course, can exploit such a conversation, as the Two Ronnies did in their splendid take-off of the Mastermind TV Quiz. This, as far as I can remember, went along these lines:

What do we call a road between two mountains?
Pass.
Correct. What must you show to get a cheap fare on a bus?
Pass.
Correct. Which word means 'succeed in an examination'?
Pass.
Correct. What does one car do to another which travels more slowly?
Pass.

and so on. (For those who have never seen Mastermind, PASS is of course what the competitors say when they don't know the answer.) The conversation would be slightly more interesting, and more apparently intelligent, if it varied according to some input, like the classic beginner's BASIC program:

10 INPUT "Your name, please: ",name$
20 PRINT "Hello ";name$;". ";
60 PRINT
70 GOTO 10

Now the machine keeps printing out 'Hello Mary' and 'Hello Marmaduke' for as long as it receives input. The next stage in adding to variability is to add a condition. Let's add one line to our program as follows:

30 IF LEN(name$) > 5 THEN PRINT "That's a long name. "; ELSE PRINT "That's a short name. ";

Now the computer dialogue will have the form

Your name, please:
Marmaduke
Hello Marmaduke. That's a long name.

Your name, please:
Mary
Hello Mary. That's a short name.

This is not very interesting yet, but we have come some way. What we can do next is add a random element. Let's put these lines into the program:

40 maybe% = (RND < .5) ' TRUE half the time
50 IF maybe% THEN PRINT "I like it. " ELSE PRINT "I don't like it. "

Now the dialogue consists of our inputs and responses like

Your name, please:
Peter
Hello Peter. That's a short name. I don't like it.

Your name, please:
Anne
Hello Anne. That's a short name. I like it.

and so on. Still not very intelligent, you may think. However, a great deal of human/computer interaction is simply built up from elaborations of these three processes, using input data, conditionality and randomness. Whether that is intelligent or not, what has to be said is that it gives the appearance of intelligence. The ELIZA-type programs use no other mechanisms. Funnily enough, it is the randomness that is the most human-like attribute. No computer actually wants to talk to us; it does not initiate conversation in order to satisfy an internal need for social contact. The only way a computer program can simulate a desire to start a conversation or change the subject is through randomness, and so randomness supplies a great deal of what between communicating beings would be called sociability. While computers possess randomness, they can to some extent do without intelligence.
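
The same three ingredients scale up without changing in kind. The following Python sketch is not ELIZA's actual script; the keyword table, the stock replies and the name reply are all invented for illustration. It takes input, tests it against a condition (does it contain a known keyword?) and then picks one of several replies at random.

```python
import random

# Hypothetical keyword table, made up for this sketch.
RULES = {
    "mother": ["Tell me more about your family.",
               "How do you get on with your mother?"],
    "computer": ["Do machines worry you?",
                 "Why do you mention computers?"],
}
# Used when no keyword matches: a random change of subject.
FALLBACKS = ["Please go on.", "I see.", "Let's talk about something else."]


def reply(user_input, rng=random):
    """Return a canned response chosen by keyword (condition) and chance."""
    # Input data: split into lower-case words and drop trailing punctuation.
    words = [w.strip(".,!?;:") for w in user_input.lower().split()]
    # Conditionality: the first keyword found decides the pool of replies.
    for keyword, responses in RULES.items():
        if keyword in words:
            # Randomness: any reply from the pool may come back.
            return rng.choice(responses)
    return rng.choice(FALLBACKS)


print(reply("My mother phoned yesterday."))
print(reply("It rained all day."))
```

Because the choice is random, two runs with the same input need not produce the same conversation, which is much of what makes the exchange feel sociable.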

Things to do with unintelligence

There is a glorious amount we can do with the limited intelligence at the machine's disposal. We can simulate conversation; ELIZA, SHRDLU (Terry Winograd's program for interacting with a world of blocks) and all their descendants are alive and kicking. We can play very challenging games, games in which learners have to call on a great deal of intelligence and make discoveries. In these games, the machine's greatest asset is often its stupidity, its refusal to bend the rules or to use 'common sense', to decide not to challenge the new arrival because he looks like the commander-in-chief. That stupidity is what forces us to explore, to find ways round when direct access is refused. Even programs which just say Right or Wrong may be of enormous value if Right or Wrong is what the user needs to know. (Consider text rebuilding programs, for instance, in which 'Right - that word is present' and 'Wrong - that word is not present' are the only relevant answers.)
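
A text rebuilding program of this kind needs very little machinery. The Python sketch below is an assumption about how such a program might work, not a description of any published one; the function names guess and display and the sample sentence are invented. The text starts fully masked, and each guessed word is either revealed wherever it occurs or rejected.

```python
def display(words, revealed):
    """Show revealed words in full and mask the rest with dashes."""
    return " ".join(w if i in revealed else "-" * len(w)
                    for i, w in enumerate(words))


def guess(words, revealed, attempt):
    """Reveal every occurrence of the guessed word and report Right or Wrong."""
    hits = {i for i, w in enumerate(words) if w.lower() == attempt.lower()}
    if not hits:
        return "Wrong - that word is not present"
    revealed.update(hits)   # record the positions now visible
    return "Right - that word is present"


words = "the cat sat on the mat".split()
revealed = set()
print(display(words, revealed))
print(guess(words, revealed, "the"))
print(guess(words, revealed, "dog"))
print(display(words, revealed))
```

The program never hints and never relents; all the inference about what the hidden words might be is left to the learner.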

Eleusis

One area which interests me especially is the area of logic puzzles and what one could call the Eleusis family of games, those in which the computer selects some organising principle and challenges the user to deduce it from clues. If you recall what I said about the intelligence tests, you will see just how well this fits the talents of the participants. The computer randomises and then applies algorithms faultlessly. The human studies apparently random data and tries to fit a theory to them.

Eleusis itself is a card game in which one player writes down a rule; the remaining players place cards from their hands on to a central stock, and the rule-writer declares the cards legal or illegal. If legal, they stay where they are, and if illegal they have to be taken back. The object is to get rid of all one's cards. A rule might be, for instance, 'if the previous card was a court card, now play a black card, otherwise a red card'. (The game is fully described in Gardner, 1966.) The principle applies to programs like my PRINTER'S DEVIL and even has relevance to conversation simulators like ELIZA and JOHN AND MARY, where part of the motivation is to find out what the machine will do to your input, to deduce the mechanisms it uses to understand and act on your input.
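
The rule-writer's job in Eleusis is exactly the kind of thing a machine does faultlessly. As a minimal sketch in Python, the sample rule quoted above can be written as a single test. The card representation (a rank and a suit letter) and the name is_legal are assumptions made for this example, and court cards are taken to be jack, queen and king.

```python
COURT_RANKS = {"J", "Q", "K"}    # assumed meaning of 'court card'
BLACK_SUITS = {"S", "C"}         # spades and clubs


def is_legal(previous, card):
    """Sample rule: after a court card play black, otherwise play red.

    Each card is a (rank, suit) pair such as ("K", "H") or ("5", "S").
    """
    previous_rank, _ = previous
    _, suit = card
    is_black = suit in BLACK_SUITS
    if previous_rank in COURT_RANKS:
        return is_black
    return not is_black


print(is_legal(("K", "H"), ("5", "S")))   # black after a court card
print(is_legal(("7", "C"), ("2", "C")))   # black after a plain card
```

The players see only the verdicts, legal or illegal, and have to work back from them to the rule.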

The banquet

By the standards of the early eighties the software available now is extremely 'clever', both in terms of design, i.e. making sure that the screen is attractive and contains only relevant information, and in terms of ingenuity in embodying what is known about language and languages so that the program never displays language errors. We have not, however, seen any programs which know the mind of the user, and can therefore select just that form of explanation or hint which will lead to a flash of insight, to a dawn of enlightenment. We shall not be seeing them for some time yet, since we do not know ourselves how we generate that insight in others, on the rare occasions we do. Meanwhile we can use computers to be playful, slavish, meticulous and comprehensive, as scribes, concordances or encyclopaedias. Computers in language learning supply us with a banquet, not a medically prescribed diet. It is up to us to eat sensibly from the table piled with tasty dishes.