For several years I have been playing with the idea of using simple computer microworlds as language learning aids (Higgins, 1984, 1985). These consist of screen displays capable of modification, and some means by which the learner can interrogate the machine or command it to re-draw the display. The learner has to explore, by trial and error, what the machine knows both about its world and about English, and I have maintained that the opportunities for such linguistic exploration are a valuable component of learning although largely missing from conventional classroom tuition.
In a recent paper, “Reading, Writing and Pointing” (Higgins, 1986), I described a workshop held in Lancaster in 1984 on ways of communicating with the computer. This covered the limitations of menu selection, whether by choosing letters or numbers or by stepping through a list, using the arrow keys to highlight each item in turn and RETURN to select the one you want. One alternative to menu-driven systems is the command-driven system with word matching, which can be found in many educational simulations and, with different effect, in adventure games: GO NORTH, TAKE SWORD, etc. If the commands are elaborate, they can give rise to the frustrations of slow keyboard entry, but the article points out that this is not so much the fault of the keyboard as of the more general problem that we talk, think and read far faster than we write, so that our thoughts and ideas always run ahead of our ability to write them down.
A solution has been offered by Robert Ward in the rapid whole-phrase entry method used in his logic problems (which, sadly, do not seem to have been published). The group attending the workshop was even able to develop, on the spot, a BASIC routine emulating the Ward method; the listing is given in the article.
Thanks to a grant from the Leverhulme Foundation and the hospitality of Lancaster University, I have been able to return to this area, concentrating on the input of coherent sentences and how the machine is to understand them. Language is, of course, divergent: one cannot predict every possible form that a 'sensible' input can take in the context of a dialogue, let alone a senseless or zany one. To cope with even part of the range of sensible utterances, the machine needs a fairly robust parser, not just a matching routine. (The matching routine of ELIZA works after a fashion, but it produces no real understanding and gives the user no real control over the course of the dialogue.) My target this year is to construct limited-domain parsers as 'front ends' for several of my existing programs, and to see whether I can develop this into a kind of parsing utility which can generate a whole series of such front ends, or can be turned into an experimental learning tool in its own right. Ideally the 'parser-generator' should be straightforward enough to be usable by teachers and learners with no programming knowledge.
This may seem to be a tall order, but there are several factors which encourage me to think it can be done. The first concerns the semantics. Within the context of a simulation, a game, or a practice activity, there are only a very few things the machine can do: perhaps change the value of a variable, print a message, or erase and redraw part of a display. This simplifies the semantics of the transaction. In JOHN AND MARY, for instance, the machine's complete knowledge of the world is stored in three flag variables, and, once the machine has identified some input as a command, all it needs to understand is which of these variables has to be flipped (Higgins, 1985). In general, there are only three things we ever want computers to do. We want them to store data. We want them to retrieve data. We want them to process data (and perhaps display the results graphically). These happen to correspond neatly to the three sentence archetypes of STATEMENT, QUESTION and COMMAND.
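The scale of such a transaction can be shown in a minimal Python sketch. The flag names below are my own illustrative guesses, not the actual variables of JOHN AND MARY; the point is simply that, once the parser has decided which variable a command refers to, execution is a single flip.

```python
# A microworld whose complete knowledge is three flags; the flag names are
# illustrative guesses rather than the actual JOHN AND MARY variables.
world = {"door_open": False, "window_open": False, "light_on": True}

def execute(flag):
    """Carry out a parsed command by flipping one flag of the world state."""
    world[flag] = not world[flag]

# Once "Open the door" has been recognised as a command about the door:
execute("door_open")
print(world)  # {'door_open': True, 'window_open': False, 'light_on': True}
```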
The kind of natural language understanding system I hope to create has, therefore, as its first objective to decide whether it has been told a fact, asked a question or given an order. The machine itself needs very little further in the way of pragmatics; it implicitly knows the context of the utterance from the values in its variables (in JOHN AND MARY, for instance, it knows it cannot close the door if the door is already closed), and it does not put itself into an emotional relationship with the user (though the converse may apply). It must, of course, try to co-operate with the user and try to understand as much of the message as possible, even if parts of it are corrupt. It does not need, however, to make wild guesses; if the data is too corrupt (more likely because it is beyond the machine's parsing range than because it is actually ill-formed), then the machine can legitimately admit its failure with an "I don't understand" message. Co-operation is two-way: in the learning situation we can demand that the learner also try to co-operate in making himself or herself understood. This is the second factor in my favour: given the situation of the foreign learner, it is quite legitimate for the machine to be set to handle only well-formed and relevant language, for the machine to be a little bit stupid, hidebound and literal-minded.
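A very crude first pass at that three-way decision might look like the following Python sketch. The trigger-word lists are placeholders of my own; in the real program the decision would fall out of the part-of-speech labels described later, not from fixed lists.

```python
def classify(utterance):
    """Decide, very roughly, whether an utterance is a question, a command
    or a statement.  The word lists below are placeholders, not a real
    grammar; a working front end would use part-of-speech labels instead."""
    words = utterance.strip().rstrip("?.!").lower().split()
    if not words:
        return None
    if utterance.rstrip().endswith("?") or words[0] in (
            "is", "are", "does", "can", "who", "what", "where", "why"):
        return "QUESTION"          # retrieve data
    if words[0] in ("open", "close", "draw", "make", "give", "take"):
        return "COMMAND"           # process data / change the world
    return "STATEMENT"             # store data

print(classify("Is the door open"))   # QUESTION
print(classify("Open the door."))     # COMMAND
print(classify("The door is open."))  # STATEMENT
```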
Parsing in corpus linguistics, for which Lancaster is famous, deals with pre-existing text, and therefore permits a choice of strategies: top-down or bottom-up, depth-first or breadth-first, left-to-right or right-to-left. I could give myself the same flexibility by using something like the BASIC input routine, so that the machine is given a complete utterance to work on. (This is how JOHN AND MARY works at the moment.) The alternative is immediate letter-by-letter parsing, which cuts short an unparsable input before too much time is wasted. This was the approach used in a British Council demonstration program called FINDER (in which missing suitcases had to be described, and the computer would draw and colour them as the description proceeded). I am hoping to blend both approaches, and to include Robert Ward's whole-phrase entry method, using immediate input but storing and displaying unparsable elements in some marked form, such as inverse colour, so that the learner has a clue to whether the machine is understanding the input and, if not, where the problems lie. At the end of the message, if the machine has failed on the first attempt to parse, it can make a second attempt using a different strategy. This will mean using my own pseudo-input routines, but they are not too difficult to write and they give me much more control over the display and the timing.
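The display idea can be sketched as follows. ANSI escape codes stand in here for the original's inverse colour, the whole utterance is processed at once rather than keystroke by keystroke (a genuine immediate version would need the raw pseudo-input routines just mentioned), and the recogniser passed in is a stand-in for the vocabulary check described in the next paragraph.

```python
# Echo an utterance with any word the machine could not parse shown in
# inverse video.  ANSI escape codes stand in for the original's colour
# inversion; `recognises` is a stand-in for the vocabulary check.
INVERSE, NORMAL = "\x1b[7m", "\x1b[0m"

def echo_with_marking(utterance, recognises):
    marked = [word if recognises(word) else INVERSE + word + NORMAL
              for word in utterance.split()]
    print(" ".join(marked))

# 'widnow' is not in the vocabulary, so it appears in inverse video:
echo_with_marking("open the widnow", lambda w: w in {"open", "the", "window"})
```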
Letter-by-letter parsing can be carried out in a fairly simple brute-force fashion by using a non-recursive transition network which holds the machine's complete recognition vocabulary in a readily searchable form. Surprisingly, I have found that the memory costs of this are not high. You need five bytes for each letter in the whole vocabulary, plus a small overhead for the array, which means that a 100-word vocabulary (ample for most of the applications I am considering) can be stored in about 3K of memory. I sort the whole vocabulary into alphabetical order, and then create an array which stores the position we have reached in the word, the ASCII code of the current letter, the number of possible successors to that letter, and the position in the array where the first of these successors is to be found. What the program has to do is accept a letter at the keyboard, search the possible successors to the last letter to see whether the new one is legal, and, if so, make the new letter the current letter. If it fails, it inverts the colour of the current word and continues accepting letters until the next space, punctuation mark or carriage return. It might later be able to say something like "I DON'T KNOW WHAT A WIDNOW IS", or I might give the program a fuzzy matcher to help it deduce WINDOW from WIDNOW. It depends on the nature of the activity.
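In a modern language the same letter-transition network can be sketched with a nested dictionary. This is my own reconstruction of the technique, not the actual program: the byte arithmetic of the packed array does not carry over, but the letter-by-letter acceptance logic is the same.

```python
def build_network(vocabulary):
    """Build a letter-transition network (a trie).  Each dictionary level
    plays the role of one block of successors in the packed byte array."""
    root = {}
    for word in sorted(vocabulary):       # sorted, as in the original scheme
        node = root
        for letter in word + " ":         # trailing space marks the word end
            node = node.setdefault(letter, {})
    return root

def accept(network, word):
    """Letter-by-letter check: fail at the first letter that is not a legal
    successor, the point where the display would switch to inverse colour."""
    node = network
    for letter in word + " ":
        if letter not in node:
            return False
        node = node[letter]
    return True

net = build_network(["window", "door", "open", "close"])
print(accept(net, "window"))   # True
print(accept(net, "widnow"))   # False: fails at the 'd' after 'wi'
```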
If the program receives a succession of letters which make up one of its pre-taught words, with a space, punctuation mark or carriage return as an end marker, then it will reach a point in the array where there are no legal successors. This leaves me two spare bytes with which to describe the word. One of these is just an index number, the word's 'meaning', which the program will be able to interpret as appropriate. The other can be used for a part-of-speech label, and with a whole byte I can distinguish (if I want to) 255 different parts of speech. I am still playing around with different labelling patterns, looking for the ones which will give me the greatest flexibility later in using comparison tests. I might, for instance, be able to use odd and even numbers above a given range to mark present and past, and thus store all the strong verbs I want to.
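The two spare bytes might be laid out as in the sketch below. The odd/even convention for past and present is the one suggested above, but the specific numeric values and ranges are invented for illustration.

```python
# Sketch of the two spare bytes at a word end: an index number giving the
# word's 'meaning', and a part-of-speech byte.  All numeric values and
# ranges here are invented for illustration.
VERB_RANGE_START = 200    # POS codes from 200 up are verbs:
                          # even = present tense, odd = past tense

end_nodes = {
    #  word      (meaning, pos)
    "open":     (1, 200),   # present tense of meaning 1
    "opened":   (1, 201),   # past tense of the same meaning
    "window":   (7, 10),    # a noun, labelled below the verb range
}

def is_past_tense(pos):
    """The odd/even trick: parity within the verb range marks tense."""
    return pos >= VERB_RANGE_START and pos % 2 == 1

print(is_past_tense(201))   # True
print(is_past_tense(200))   # False
```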
However, the end product of this immediate word-level parsing is nothing more than a string of numbers. The machine still has to decide whether the grammar of the string is legal, and then to determine its meaning. The grammar parse could be carried out in exactly the same brute-force fashion as the vocabulary check, but that would be either very limiting or very expensive in memory; in either case it would be highly inelegant. What I have to do at this point is to create recursive transition networks: in other words, groups of these arrays with the ability of one array to call another (or even call itself). That is as far as the work has gone.
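In sketch form, a recursive transition network reduces to named networks whose steps are either part-of-speech labels to consume or the names of other networks to call. The toy grammar below is invented for illustration, not one of my working grammars; note that ADJP calls itself.

```python
# Toy recursive transition networks over the part-of-speech labels produced
# by the word-level parse.  A step is either a terminal label to consume or
# the name of another network to call.  The grammar is invented.
NETWORKS = {
    "COMMAND": [["VERB", "NP"]],
    "NP":      [["DET", "ADJP", "NOUN"], ["DET", "NOUN"], ["NOUN"]],
    "ADJP":    [["ADJ", "ADJP"], ["ADJ"]],      # a network calling itself
}

def match(name, tags, pos=0):
    """Return the position after a successful parse of `name`, else None."""
    for path in NETWORKS[name]:
        i, ok = pos, True
        for step in path:
            if step in NETWORKS:                 # call another network
                result = match(step, tags, i)
                if result is None:
                    ok = False
                    break
                i = result
            elif i < len(tags) and tags[i] == step:
                i += 1                           # consume one terminal
            else:
                ok = False
                break
        if ok:
            return i
    return None

tags = ["VERB", "DET", "ADJ", "ADJ", "NOUN"]     # "draw the big red nose"
print(match("COMMAND", tags) == len(tags))       # True
```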
Which programs will I put this parser into? The first will be PHOTOFIT, a game in which the player studies a face and then gives orders to a notional 'police artist' to redraw it. In this activity all the inputs can be assumed to have the form of commands. There is no need to cope with questions and statements, which simplifies the grammar considerably; that is why I am beginning with this one (see the sketch below). My next project is to put a parser into a new program of mine, tentatively called TIGLET, a logic exercise in which you have to offer food to a fussy tiger and try to work out why he accepts some kinds but not others. Here, too, the grammar is simplified, since all the user's inputs are offers. With this experience behind me, I hope to be able to tackle a proper GRAMMARLAND scenario of the kind described in an earlier paper (Higgins, 1985); the one I have in mind is the corner shop, in which the computer plays shopkeeper to the learner's customer. In this case the learner can ask questions and demand products, perhaps even attempt to bargain.
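The simplification PHOTOFIT permits shows up directly in such a network: the top level needs only command-shaped paths, so questions and statements never have to enter the grammar at all. The paths below are invented for illustration.

```python
# Because every PHOTOFIT input can be treated as a command, the top level
# of the network needs only command-shaped paths; nothing recognising
# questions or statements has to be written.  The paths are invented.
PHOTOFIT_NETWORKS = {
    "COMMAND": [["VERB", "NP"],                   # "redraw the nose"
                ["VERB", "NP", "COMPARATIVE"]],   # "make the eyes bigger"
    "NP":      [["DET", "NOUN"], ["NOUN"]],
}
```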
The question I have not mentioned so far, but which is clearly central, is: what use can this be to a foreign learner? I always invert this question and put it first into the form: would I want to use one of these programs if I were learning [Lithuanian]? My honest answer is yes; I would relish the opportunity of playing around, trying to manipulate a microworld and knowing that I would succeed if I could handle the limited, idealised but acceptable forms of the language which the machine had been taught to recognise. Unsimplified, authentic native speakers' language is not what I would want to handle as an elementary learner; the protected environment of the limited-domain parser would give me a sense of security. But I am not necessarily representative, and so I have had to consider what kind of evidence would determine whether the programs have any effect on students' grammatical skills. One could look at three areas: motivation, perceptions of rules and regularities, and reduction of errors in spontaneous speech and writing. As an indirect pointer to the last of these, one might add reduction of error in grammar-based tests, which is the easiest kind of evidence to gather but the hardest to explain and justify.
Motivation can be measured in a variety of ways, by questionnaire, by observations of behaviour, or simply by measuring time spent voluntarily on one activity rather than another where there is real choice. I have enough evidence of this kind already to believe that the programs are enjoyable. If I can also show that the activity demanded by the program is linguistically relevant to the elementary learner, then there is a fair presumption that some learning will occur. The affective filter (in Krashen's phrase) will not be up, so experience will be assimilated.
Perceptions of grammatical rules are, on the whole, not used in measuring language ability; we do not often ask learners in examinations to state the rules they think they use when they generate language. We would not, in general, expect them to be able to do it well, and prefer to look at the product rather than the rationalisation which the learner believes to underlie it. What I believe the GRAMMARLAND approach can do is to confront learners with their misconceptions and force them to reconsider when the faulty rules demonstrably fail. The experience can 'de-bug' those rules. If I am right, I may find observational evidence, but I plan also to look at possible forms of before-and-after questionnaire which may reveal changed formulations.
I will also be looking for evidence that using the programs affects performance directly by reducing errors in speech and spontaneous writing. I very much doubt whether, within the limitations of time and resources available to me, I will be able to produce significant figures, but I should at least be able to see where evidence is likely to be found.
Learning of rule-based productive systems can be seen as either deductive, from rule to examples (RULEG), or inductive, from examples to rules (EGRUL). Learning can also be focussed and conscious, or subliminal and incidental. If we make a grid of these categories, we can imagine four learning paradigms: unaware RULEG, aware RULEG, unaware EGRUL and aware EGRUL. Structural or behaviourist drill-and-practice approaches fit the first paradigm: the learner is deducing or analogising usage from patterns, but is doing so unawares, since the first task is to establish the pattern as a habit. Cognitive code learning seems to belong to the second paradigm, while Krashen's 'comprehensible input' approach matches the third. The fourth is the neglected paradigm in language learning, and matches the conjectural or exploratory approach of GRAMMARLAND. Linguistic play and linguistic discovery are hard to build into ordinary classwork; perhaps the use of microworlds and input parsers can give them a place in foreign language learning.
Higgins, John (1984). "JOHN AND MARY: the pleasures of manipulation." Zielsprache Englisch (Max Hueber Verlag), 1/1984, pp. 1-5.
Higgins, John (1985). "GRAMMARLAND: a non-directive use of the computer in language learning." ELT Journal, 39/3.
Higgins, John (1986). "Reading, writing and pointing: communicating with the computer." In Leech and Candlin (eds.), Computers in ELT and Research. Longman, pp. 46-54.