What I tell my students about Noam Chomsky and Seymour Papert

An overview, written in 1981, of different versions of the developing "Cognitive Paradigm", with special reference to issues in developmental psychology, learning theory and including emphasis on the concepts of competence & performance; underdetermination & abduction; learnability & accessibility. (Chomsky, Papert, Piaget, Charles Sanders Peirce)

Althusser, L. (1969) For Marx. London: Allen Lane.

Althusser, L. and Balibar, E. (1970) Reading Capital. London: New Left Books.

Anderson, P. (1981) Arguments Within English Marxism. London: New Left Books.

Bereiter, C. and Engelmann, S. (1966) Teaching Disadvantaged Children in the Preschool. Englewood Cliffs, N.J.: Prentice Hall.

Boden, M. (1977) Piaget. London: Fontana.

Bresnan, J. (1978) A Realistic Transformational Grammar. In M. Halle, J. Bresnan and G.A. Miller, eds., Linguistic Theory and Psychological Reality, pp 1-59. Cambridge, Mass: M.I.T. Press.

Bruner, J. (1957) On going beyond the information given. In H. Gruber et al, eds., Contemporary Approaches to Cognition. Cambridge, Mass: Harvard University Press.

Chomsky, N. (1957) Syntactic Structures. The Hague: Mouton.

Chomsky, N. (1959) Review of B.F. Skinner's Verbal Behavior. Language, v.35, n.1, pp 26-58.

Chomsky, N. (1968) Language and Mind. New York: Harcourt, Brace and World.

Chomsky, N. (1975) Reflections on Language. London: Fontana.

Chomsky, N. (1980) Rules and Representations. New York: Columbia University Press.

Cooper, D. (1975) Knowledge of Language. London: Prism Press.

Corder, S.P. and Roulet, E., eds. (1977) Actes du 5ème Colloque de Linguistique Appliquée de Neuchâtel. Geneva: Librairie Droz.

Culler, J. (1975) Structuralist Poetics. London: Routledge and Kegan Paul.

Dennett, D. (1979) Brainstorms. Hassocks: Harvester Press.

Dittmar, N. (1976) Sociolinguistics. London: Edward Arnold.

Donaldson, M. (1978) Children's Minds. London: Fontana.

Driver, C. (1971) The Exploding University. London: Hodder and Stoughton.

Edgley, R. (1970) Innate Ideas. In G. Vesey, ed., Knowledge and Necessity. London: Macmillan.

Fodor, J.A. (1975) The Language of Thought. Hassocks: Harvester Press.

Fodor, J.A. (1981) Representations. Brighton: Harvester Press.

Fodor, J.D., Fodor, J.A. and Garrett, M. (1975) The Psychological Unreality of Semantic Representations. Linguistic Inquiry 6: 515-32.

Frege, G. (1894) Review of E. Husserl, Philosophie der Arithmetik. Translated in Mind, v.LXXXI, n.323, 1972, pp 321-37.

Freire, P. (1972) Pedagogy of the Oppressed. Harmondsworth: Penguin Books.

Gazdar, G. (1981) On Syntactic Categories. In The Psychological Mechanisms of Language. London: The Royal Society.

Goodman, N. (1965) Fact, Fiction and Forecast. Second ed. Indianapolis: Bobbs-Merrill.

Habermas, J. (1970) Toward a Theory of Communicative Competence. In H.P. Dreitzel, ed., Recent Sociology No.2. London: Collier-Macmillan.

Hamlyn, D. (1978) Experience and the Growth of Understanding. London: Routledge and Kegan Paul.

Hirst, P. (1974) Knowledge and the Curriculum. London: Routledge and Kegan Paul.

Hymes, D. (1972) On Communicative Competence. In J. Pride and J. Holmes, eds., Sociolinguistics, pp 269-293. Harmondsworth: Penguin Books.

Itkonen, E. (1978) Grammatical Theory and Metascience. Amsterdam: John Benjamins B.V.

Johnson-Laird, P., Legrenzi, P. and Legrenzi, M. (1972) Reasoning and a sense of reality. British Journal of Psychology, v.63, pp 395-400.

Katz, J. (1981) Language and Other Abstract Objects. Oxford: Basil Blackwell.

Labov, W. (1977) Language in the Inner City. Oxford: Basil Blackwell.

Leach, E. (1966) The Legitimacy of Solomon. European Journal of Sociology, v.VII, pp xxx-101.

Lévi-Strauss, C. (1966) La Pensée Sauvage. Paris: Plon.

Linell, P. (1980) On the Similarity Between Skinner and Chomsky. In T. Perry, ed., Evidence and Argumentation in Linguistics, pp 190-99. Berlin: W. De Gruyter.

Lyons, J. (1981) Language and Linguistics. Cambridge: Cambridge University Press.

Makins, V. (1981) Turning Turtle. Times Educational Supplement, 10 July, p.27.

Macaulay, T. (1829) Utilitarian Logic and Politics. Edinburgh Review, v.XLIX, pp 159-89.

Mehler, J. and Bever, T. (1967) Cognitive Capacities of Young Children. Science, v.158, n.3797, pp 141-142.

Newmeyer, F. (1980) Linguistic Theory in America. New York: Academic Press.

O'Shea, T. and Young, R. (1978) A production rule account of errors in children's subtraction. Proceedings of the AISB/GI Conf. on Artificial Intelligence, Hamburg 18-20 July, pp 229-37.

Papert, S. (1980) Mindstorms. Brighton: Harvester Press.

Pateman, T. (1980) Can Schools Educate? Lewes: Jean Stroud. Reprinted in Journal of the Philosophy of Education, v.14, n.2, 1980, pp 139-48.

Pateman, T. (1981) Communicating with Computer Programs. Language and Communication, v.1, n.1, pp 3-12.

Pateman, T. (forthcoming) Review of Katz (1981).

Peirce, C. (1940) The Philosophy of Peirce: Selected Writings, ed. J. Buchler. London: Routledge and Kegan Paul.

Piattelli-Palmarini, M. (1980) Language and Learning. London: Routledge and Kegan Paul.

Pylyshyn, Z. (1973) The Role of Competence Theories in Cognitive Psychology. Journal of Psycholinguistic Research, v.2, n.1, pp 21-50.

Quine, W. (1972) Methodological Reflections on Current Linguistic Theory. In D. Davidson and G. Harman, eds., Semantics of Natural Language, pp 442-54. Dordrecht: Reidel.

Selinker, L. (1972) Interlanguage. IRAL, v.10, n.3, pp 219-31.

Slobin, D. (1977) Language Change in Childhood and History. In J. MacNamara, ed., Language Learning and Thought, pp 185-214. New York: Academic Press.

Sperber, D. (1973) Le Structuralisme en Anthropologie. Paris: Editions du Seuil.

Sperber, D. (1975) Rethinking Symbolism. Cambridge: Cambridge University Press.

Steinberg, D. (1975) Chomsky: From Formalism to Mentalism and Psychological Invalidity. Glossa, v.9, n.2, pp 218-58.

Thompson, E. (1979) The Poverty of Theory. London: Merlin Press.

Valian, V. (1976) The Relationship Between Competence and Performance. A Theoretical Review. CUNY Forum, 1, pp 64-101.

Wason, P. (1977) The Theory of Formal Operations: A Critique. In B. Geber, ed., Piaget and Knowing, pp 119-135. London: Routledge and Kegan Paul.

White, L. (1981) The Responsibility of Grammatical Theory to Acquisitional Data. In N. Hornstein and D. Lightfoot, eds., Explanation in Linguistics, pp 241-71. London: Longman.


Originally a lecture given to the Education Area, University of Sussex, Autumn 1981. Published in Papers of the Annual Conference of the Philosophy of Education Society of Great Britain April 1982, pp 19 - 39. Very minor corrections only to this 2003 Website version - it would not make sense to try to update this period piece!

In the past quarter of a century one of the most exciting developments in the human sciences has been the emergence and spectacular growth of a cognitive psychology which sets itself no less a task than understanding how the human mind works. The revolution in linguistics begun by Noam Chomsky's Syntactic Structures (1957) is one parent of contemporary cognitive psychology, much of which is itself an elaboration and development of the fresh understanding of language and mind achieved by Chomsky, and the rest in large measure an extension of his approach to other domains of human cognition. The other parent, and the ancestors of cognitive psychology, are less easily named, but they certainly include the pioneers of cybernetics, information theory and artificial intelligence, along with the Gestalt psychologists.

All this may sound terribly forbidding to you; it does to me. The strategy I shall adopt in order to gain an entry to cognitive psychology and to allow me to present a few of its leading ideas and discuss their educational relevance will be to focus on the work of two men, Noam Chomsky and Seymour Papert, both Professors at the Massachusetts Institute of Technology, the former in Linguistics, the latter in Mathematics and Education. About Chomsky, you will already know something. Of Papert, I will say at this point that he was a student and collaborator of Piaget's, has brought children and computers together in an original way which has nothing in common with computer assisted instruction (CAI), and is the author of a book which every teacher should read, Mindstorms (Papert 1980). Now to business.

Central to cognitive psychology is the study of human learning, and a theory of learning for humans will give an account of what can be learnt by humans and of how it can be learnt. What crucially differentiates current approaches to these questions from previous ones is, first, that the conception of what is meant by 'what is learnt' is different - indeed, it is the biggest difference - and, second, it is no longer assumed that how humans learn will be pretty much like the way pigeons or rats learn (which may in turn not be at all like the way experimental psychologists have said pigeons and rats learn; but that's another issue). The important thing is to indicate the nature of the reconceptualisation of the domain of what can be learnt which has occurred, and here Chomsky's work is central.

For Chomsky, there can be no linguistics until a clear distinction is drawn between linguistic performance, people's speech actions, and linguistic competence - the knowledge of a language or, more narrowly, of a grammar, which makes those speech actions possible, or, again more narrowly, is responsible for their grammatical structure. The proper object of linguistics is competence and not performance, and its highest aim the modelling of linguistic competence (Endnote 2). Linguistics is a branch of cognitive psychology, since what it aims to model is something which is mentally represented. Of course, performance can be studied, too, but, first, it's a different kind of study and, second, in performance not just one but numerous competences are always in play, so that the explanation of any given performance (speech action) will make reference to multiple, interacting and conflicting causes, both psychological and social.

Though there is disagreement about the significance of Chomsky's distinction, as well as, of course, about its validity (Note 3), two things of relevance here seem clear. First, that Chomsky intended to counterpose his competence oriented linguistics to a structuralist linguistics which he saw as oriented to the study of a corpus of speech from which linguistics sought to infer, inductively, a grammar which would account for that corpus. In contrast, Chomsky took the view that grammars should not account for attested performance but rather be capable of generating all and only the infinite set of sentences which would be recognised as grammatical by speakers of the relevant languages (Note 4). Second, that any linguistics purporting to study a mentally represented competence was incompatible with all those tendencies in mainstream psychology which either denied that the mind (and hence anything mentally represented) existed, or else said that it was unnecessary to study the mind or competences because human behaviour (performances) could be exhaustively explained in terms of a direct connexion between environmental stimuli and the organism's behavioural responses to them. It was therefore no diversion from his vocation as a linguist when Chomsky sat down to demolish mainstream American psychology, which he did in his 1959 review of B.F. Skinner's Verbal Behavior - perhaps the most scientifically influential book review this century, and one which undeniably shows Skinner's claims to be false or vacuous (Chomsky 1959).

The demolition of Skinnerian behaviourism accomplished, the way is clear not only for the pursuit of a linguistics of competence, but for the introduction of a competence/performance distinction and an emphasis on competence into the domain of every other human accomplishment, including obviously and relevantly human learning, which Skinnerians had sought to define over classes of stimuli and responses. And this is indeed what has happened, to the great benefit of psychology. As the ex-behaviourist psychologist, G.A. Miller, expressed his conversion in 1962, "I now believe that mind is something more than a four letter Anglo-Saxon word - human minds exist and it is our job as psychologists to study them" (quoted in Newmeyer 1980, p.44), where the study of mind is here synonymous with the study of competences. Outside psychology, the competence/performance distinction has had its impact and concepts of communicative competence (Habermas 1970; Hymes 1972) and even literary competence (Culler 1975) have been fruitfully developed.

Competences of all kinds connect in two directions with performances: in one direction, they are put to use in generating actions, and in this direction lies the domain of performance theories; in the other direction, they connect to the performances from which competences develop, the domain of learning theories, which seek to explain how what is learnt is learnt. The rise of the competence approach has brought into focus and prominence some extremely interesting questions about human learning, occluded by behaviourism, and it is to the most fundamental among those questions that I now turn.

Except in the case of rote learning, any competences we develop necessarily go beyond the performances - the data or input - on which they are based. In the case of language, for example, the child emerges with a grammar (or grammars) with infinite generative power after exposure to a finite and, Chomsky would say, small and often degenerate corpus of speech which is addressed to it or which it overhears. Even more obviously, knowing how to multiply is knowing how to operate on an infinite domain of numbers to produce an infinite set of results, but this knowledge arises from contact with a small number of examples or illustrations of how to do multiplications. To put it in a phrase of Jerome Bruner's, competences necessarily go beyond the information given (Bruner 1957). How is this possible? This is the central question of competencist learning theory, and it derives its peculiar fascination from the awareness which cognitive psychologists have that there are always an infinite number of logically possible ways of going beyond the information given which are nonetheless logically compatible with that information. Chomsky has become blasé about this awareness, remarking that 'the general thesis is so obviously true that it is even not worth discussing' (in Piattelli-Palmarini 1980, p.261). But the thesis is far from obvious to anyone who does not share Chomsky's intellectual biography, and certainly worth discussing here. I shall introduce the discussion with an example (Note 7).

Suppose, then, that our input or data or information given, consists of a list of values for two continuous variables x and y, and that we plot these values on a graph (Cartesian coordinate geometry). Suppose this yields us the distribution of points shown in Figure 1.

Figure 1

Now to fit a curve to these points - or in other words, to propose a formula relating x and y - just is to go beyond the information given and generate a hypothesis about the relation of x and y which generates predictions for currently unknown further values of x and y. Suppose we join up the points with a straight line at 45 degrees to the horizontal axis, that is, propose the formula x = y as our hypothesis about the relation between x and y. Perhaps this hypothesis strikes you as the obvious one to propose, perhaps even as necessarily true. Such natural responses are extremely interesting, because there is absolutely nothing in the data presented which selects or determines the formula x = y from among the infinite number of formulae compatible with the data represented by points in Figure 1. That there are an infinite number of compatible formulae can be grasped most easily simply by seeing that, Cartesian space being infinitely divisible, an infinite number of wavy lines could be drawn to pass through the points in Figure 1. Furthermore, if we attempt to confirm or falsify our hypothesis by determining intermediate or higher values for x and y, then though a value for x and y which apparently confirms the x = y hypothesis will simultaneously rule out an infinite number of wavy line hypotheses, it will still leave an infinite number of wavy line hypotheses which are compatible with the data. This results directly from the nature of Cartesian space. Values for x and y which falsify the x = y hypothesis do not select a new hypothesis in its place; and though knowing that x = y is false may be very important, for our practical purposes we need more than a knowledge of false hypotheses; we need to adopt a hypothesis upon which we can act. This is true not least for the child developing a grammar (linguistic competence) on the basis of an input of speech data.
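The argument of this paragraph can be made concrete in a few lines of Python. What follows is only a sketch under stated assumptions: Figure 1 is not reproduced here, so the data points are hypothetical values lying on the line x = y, and the family of sine curves is just one convenient way of producing "wavy line" rivals. Any amplitude gives a different hypothesis that fits the same data.

```python
import math

# Hypothetical stand-in for the points of Figure 1 (assumed, not from the source).
data = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]


def straight_line(x):
    """The 'obvious' hypothesis: y equals x."""
    return x


def wavy_line(amplitude, frequency=1):
    """Build a rival hypothesis y = x + amplitude * sin(frequency * pi * x).

    With frequency=1 the sine term vanishes at every whole number, so the
    curve passes through all of the whole-number data points above.
    """
    def hypothesis(x):
        return x + amplitude * math.sin(frequency * math.pi * x)
    return hypothesis


def fits(hypothesis, points, tolerance=1e-9):
    """True if the hypothesis reproduces every observed point."""
    return all(abs(hypothesis(x) - y) <= tolerance for x, y in points)


# Every amplitude yields a distinct curve compatible with the same data.
for amplitude in (0.5, 1.0, 2.0):
    print(amplitude, fits(wavy_line(amplitude), data))

# A new intermediate observation that 'confirms' x = y ...
extended = data + [(0.5, 0.5)]
print(fits(straight_line, extended))
# ... rules out the rivals above ...
print(fits(wavy_line(1.0), extended))
# ... but leaves other wavy lines (here, doubled frequency) still standing.
print(fits(wavy_line(1.0, frequency=2), extended))
```

The surviving rival still disagrees with the straight line between the observed points (at x = 0.25, for instance), so each new observation prunes the set of rivals without ever reducing it to one.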

Now the problem which my example illustrates goes under various names in the literature. The philosopher Nelson Goodman, whose student Chomsky was, is responsible for a very important formulation of a version of the problem, which Goodman calls 'the new riddle of induction' (Goodman 1965, ch. III; the old riddle was the problem of justifying our expectation that the future will be like the past) (Note 8). Others refer to it as the problem of the 'underdetermination of theories by data', where 'theories' are in all relevant respects indistinguishable in character from the competences I am discussing. I shall speak simply of 'underdetermination'; and whether in science or in learning, the fact of universal underdetermination means that the human subject does indeed add something in going beyond the information given, that something being the theory, hypothesis, rule, formula, etc. which accounts for the data and allows us to project or generate new data. This addition is, I think, half of what is meant when the 'active' as opposed to 'passive' role of the learner is stressed; the other half being the learner's (or equally the scientist's) role in manipulating and experimenting on the environment to produce new data (Note 9).

But there is more to this activity than the fact that it adds an underdetermined hypothesis to the data. For the striking thing is that in scientific research, in language learning and mathematical learning - indeed, quite generally - of the infinite number of logically possible and admissible hypotheses, human beings actually consider very few, and often converge independently on those very few or even on one unique hypothesis. In an important sense, it could not be otherwise; life is necessarily too short to permit a search through an infinite domain of hypotheses. C. S. Peirce (another, more distant, influence on Chomsky) put the point this way in reflecting on our scientific achievements: 'Unless man have a natural bent in accordance with nature's, he has no chance of understanding nature at all' (Peirce 1940, p.156).

What is true of the scientist's encounter with nature is equally true of the child's encounter with language; without some 'natural bent' towards it, the child would have no chance of developing a linguistic competence. If, following Peirce and the earlier Chomsky (Chomsky 1968), we call the hypotheses humans put forward to account for data abductions, then fundamental questions of learning theory concern the character of the natural bent which places prior constraints on the class of abductions entertained (and entertainable) by humans, and, specifically, consideration of whether this natural bent operates independently of the subject matter about which abductions are being made, guiding the mind towards the simplest abductions, for instance, or whether its character varies from domain to domain. These are the questions I want now to say something about, illustrating their pedagogic relevance.

Wherever corrigible error occurs, as it does in most learning, we can at least be sure that abduction is not subject to rigid prior constraints dictating the selection of a unique hypothesis to account for data. This is no more than a tautology. However, what the competence-by-abduction approach suggests, which is worth elaborating, is a particular way of approaching children's errors (Note 10). It suggests that we should carefully distinguish errors of performance, due to such local factors as lapses of attention, memory failure and so on, from abductive errors, where the child is employing the wrong rule or an incomplete set of procedures in a given domain. Developments in Artificial Intelligence make it possible to set out in an intelligible form models of such bugged procedures, and to simulate on the computer the production of errors in domains like arithmetic and second language learning. It seems to me that it would deepen teachers' understandings of children's errors if their training included practice in writing or at least playing around with computer programs which will produce characteristic learners' errors, and where the program is so constructed that the source of the errors in a faulty set of procedures (simulating a defective competence) is easy to see. An example of such a production system approach to children's errors is provided by the work of O'Shea and Young (1978), who set up a production system for modelling children's errors in subtraction when they are using the method of decomposition, claiming that 'many of the incorrect strategies used by schoolchildren can be modelled as the consequences of simple changes to this production system, such as the omission of individual rules or the addition of rules appropriate for other arithmetical tasks' (O'Shea and Young 1978, p.229). Though it would take too long to summarise their work here, I refer to it as a short and accessible paper which mathematics students especially would benefit from reading.
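To give a flavour of what such a program might look like, here is a deliberately small Python sketch of a production system for column subtraction by decomposition. It is not O'Shea and Young's rule set: the two rules, their names (`borrow`, `take_difference`) and the interpreter are hypothetical and chosen only for illustration. The point it demonstrates is the one quoted above: omitting a single rule turns a correct procedure into one that produces a characteristic error, here subtracting the smaller digit from the larger in every column.

```python
def needs_borrow(state):
    """Condition: the top digit in the current column is too small."""
    col = state["col"]
    return state["top"][col] < state["bottom"][col]


def borrow(state):
    """Action: decompose the nearest non-zero digit to the left."""
    top, col = state["top"], state["col"]
    k = col - 1
    while top[k] == 0:      # zeros on the way become nines
        top[k] = 9
        k -= 1
    top[k] -= 1
    top[col] += 10


def always(state):
    return True


def take_difference(state):
    """Action: write the difference of the two digits, move one column left."""
    col = state["col"]
    state["answer"].append(abs(state["top"][col] - state["bottom"][col]))
    state["col"] -= 1


# Rules are tried in order; the first whose condition holds is fired.
CORRECT_RULES = [("borrow", needs_borrow, borrow),
                 ("take_difference", always, take_difference)]
# The 'bugged' competence: the same system with the borrow rule omitted.
BUGGY_RULES = [rule for rule in CORRECT_RULES if rule[0] != "borrow"]


def subtract(rules, minuend, subtrahend):
    """Run the production system; assumes 0 <= subtrahend <= minuend."""
    if not 0 <= subtrahend <= minuend:
        raise ValueError("sketch only handles 0 <= subtrahend <= minuend")
    width = len(str(minuend))
    state = {"top": [int(d) for d in str(minuend)],
             "bottom": [int(d) for d in str(subtrahend).zfill(width)],
             "col": width - 1,          # start at the units column
             "answer": []}              # digits are collected units first
    while state["col"] >= 0:
        for _name, condition, action in rules:
            if condition(state):
                action(state)
                break
    return int("".join(str(d) for d in reversed(state["answer"])))


print(subtract(CORRECT_RULES, 52, 17))  # 35
print(subtract(BUGGY_RULES, 52, 17))    # 45: smaller digit taken from larger
```

Because the faulty answer follows from a visible gap in the rule list rather than from a slip, the sketch keeps apart the two kinds of error distinguished above: the bug lies in the competence, not in the performance.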

Returning to the case of the child's first language, it is clear that children make false and corrigible abductions; at its simplest, they overgeneralize rules to which there are 'exceptions', producing such forms as "sheeps" as the plural of "sheep" and "teached" as the past tense of "teach" (Note 11). Such examples are, however, good enough to knock out at least one version of the view that children learn by imitation or conditioning, but probably no one holds that view now, at least not in such a vulnerable form. More importantly, whilst Chomsky in his earlier work was happy to use the term 'abduction' and to think of the process as involving the formation of corrigible hypotheses (see Chomsky 1968, pp 76-79), in his more recent work he questions this picture of language learning (see Chomsky 1980, esp. pp 13-40). What impresses him now is the rigidity of the constraints on our so-called abductions: the fact that as far as some grammatical rules are concerned, we don't ever make any errors in formulating them. Thus, in his debate with Piaget, he says of the grammatical rule he calls the Specified Subject Condition (SSC) (Note 12), 'No one ever makes mistakes to be corrected' (in Piattelli-Palmarini 1980, p.42). And he takes the view that most logically possible alternative grammatical rules to those found in natural languages would be quite unlearnable or very inaccessible to humans operating under the normal conditions of language acquisition. He draws the conclusion that the SSC and similar rules, far from being abductions, are properties of the initial state of the human mind. In short, they are innate.

Chomsky's nativism, his belief in innate ideas or, more accurately, in a universal grammar which places rich prior constraints on the class of learnable grammars, is not only a substantive psychological theory; it is his solution, in the domain of language learning, to Goodman's new riddle of induction, to the problem of underdetermination. Out of the infinite number of hypotheses which would account for our linguistic experience, we select the one or few which we do select because it is in our nature to do so; Peirce's 'natural bent' becomes in Chomsky's hands an elaborate natural endowment. And Chomsky's co-worker at M.I.T., Jerry Fodor, has generalised his nativism to make the claim that all our concepts are innate (Fodor 1975 and 1981, chapter 10; he summarises his views in Piattelli-Palmarini 1980, pp 142-62).

I am not, I am sorry to say, going to discuss whether Chomsky's or Fodor's nativism is true(Note 13). Having tried to show how nativism emerges in the context of a particular way of thinking about human learning, I want now to indicate how Chomsky, in a new and interesting Gestalt switch in his image of the mind, turns nativism back on its context in order to question the very concept of learning itself. The switch comes about like this: the more our so-called abductions are constrained by an innate program, the less it seems accurate to speak of the mind, let alone the human subject, doing anything as active as 'learning'; rather, things are pre-programmed to happen in the mind, under what Chomsky now calls the 'triggering' and 'shaping' impact of environmental stimuli (see Chomsky 1980, especially Chapter 1). When so much is given in advance, it is misleading to speak of 'learning', and a better metaphor is that of 'growth': just as we do not say that an acorn learns to be an oak, but grows into one, in virtue of the fact that the acorn already contains the formal cause of the oak, so perhaps we should say that a grammar grows in the child as one of its mental organs.(Note 14) Now while this view may justifiably appear to be a simple antithesis of behaviourism (but not behaviourism itself: cf. Linell 1980; Lyons 1981, p.248), with nature performing everything behaviourism assigns to nurture, in the context of education it can be seen as a variant of a romanticism associated with the names of Rousseau, Pestalozzi and Froebel, and which guides students of education when they write essays in the horticultural mode about 'the growth of the child'. What Chomsky has done which is different is to extend romanticism from our affective to our cognitive life in a particularly striking fashion. 
What qualifies the romanticism is the clear demarcation he draws between language and other domains of human cognition; just because language grows in the mind it does not mean that physics or chemistry or history grow in the mind, too. And, indeed, the view that there are rich innate constraints on our abductions is virtually equivalent to saying that they are domain specific. It is this question of domain specificity which I now wish to consider.

Though no one inside cognitive psychology is as bold as P.H. Hirst (see Hirst 1974), in telling us how many domains of knowledge there are, Chomsky insists that grammar is one such domain, which means that when he proposes, for example, the innateness of the Specified Subject Condition (SSC) in order to account for the presence of a corresponding rule of grammar, he does not suppose that the SSC will help us to learn anything other than SSC related rules of grammar. SSC is a special purpose feature of human intelligence. Much of the debate between Chomsky and Piaget and their respective allies recorded in the pages of Language and Learning (Piattelli-Palmarini, 1980) concerns the correctness of regarding human intelligence as essentially special purpose rather than general purpose, with Hilary Putnam expressing the challenge of the friends of general purpose intelligence in the memorable question, 'Why should [God] pack our heads with a billion different 'mental organs' rather than just making us smart?' (in Piattelli-Palmarini 1980, p.298).(Note 15)

To this challenge, there are more and less sophisticated responses. Commonsensically, we are happy to view differential intelligence as domain specific and we don't expect mathematical, musical or chess geniuses to be geniuses in every other domain, though we may believe abilities in maths, music and chess to be interrelated. More importantly, in arguing for the specificity of the intelligence brought to bear in language growth, Chomsky uses a number of arguments, of which two are especially worth singling out. First, that ability to acquire a language is not differentially distributed, unlike other abilities. Unless there is gross defect, everyone gets to have an internal representation of an extremely complicated grammar very early on in life.(Note 16) The same cannot be said for, say, mathematics. Second, grammatical rules turn out not to be the simplest imaginable. There could be perfectly useable languages with much simpler rules than actual languages manifest (for example, rules like those of the artificial languages of logic). Insofar as it seems that a general purpose theory of learning would have to show how knowledge can be built up by modular operations, it would have considerable if not insuperable difficulty in explaining the complexity apparently shown in actual grammars.

Neither of these arguments is decisive. In relation to the first, radicals will say that differential abilities in mathematics are the product of the institutions which transmit maths, not of anything innate; if children were able to learn maths like they learn their native language, differences would tend to disappear. This is the line Seymour Papert takes; and it is perfectly coherent, though I happen to believe it to be false. In relation to the second argument, the complexity Chomsky attributes to the mind's linguistic operations is just a mirror image of the complexity of the grammars he has devised. If those grammars prove to be needlessly complex; if, for example, transformational rules are unnecessary and the sentences of a language can be generated by a context-free phrase structure grammar(Note 17) (a view defended by a growing number of linguists; see, for example, Gazdar 1981), then the case for general purpose learning theory could turn out to be stronger than Chomsky imagines it.


The question of the domain specificity of the mind's endowments is distinct from, but entangled with, the question of the context dependent operation of the mental abilities we develop. Though contextual constraints on performance abilities have been investigated by psychologists whose orientation or background is more often Piagetian than Chomskyan, it is perhaps worth saying a little about this work. There have been numerous experiments which show how variation in the context or setting of classical Piagetian conservation tasks considerably affects children's performance on those tasks (for a summary, see Donaldson 1978). Similar work has shown how adults often cannot solve 'logical' puzzles presented abstractly but can solve them when they are given a 'concrete' interpretation. Consider, for example, Figure 2.

Figure 2

Figure 2 shows the faces of four cards on one side of which are letters of the alphabet and on the other numbers. Two cards are shown with the letters face up, and two with the numbers face up. Your task is to determine which cards you would have to turn over in order to determine the truth or falsity of the following statement:

(1) If there is a vowel on one side of a card, then there is an even number on the other side.

The correct answer is given in footnote 18; around 80% of University students give incorrect answers (as reported by the experiment's inventor, Peter Wason; see Wason 1977 as the most relevant exposition).

Now consider Figure 3.

Figure 3

This shows the back of two envelopes and the front of two others; one envelope is visibly sealed, the other visibly open; one envelope visibly bears a 5p stamp, the other visibly bears a 4p stamp. With respect to these envelopes your task is to determine whether the following postal regulation has been infringed:

(2) If an envelope is sealed, then it has a 5p stamp on it.

The correct answer is given in footnote 19; around 80% of University students get the answer right (see Johnson-Laird, Legrenzi and Legrenzi 1972, whose experiment it is). Yet this puzzle has the same logical form as the first one; it differs simply in being presented in a more real world context - only the cost of sending a letter is unreal.
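The claim that the two puzzles share one logical form can be made concrete with a short sketch. A rule of the form 'if P then Q' can only be falsified by a case of P together with not-Q, so the only faces worth turning are those that visibly show P (the hidden side might be not-Q) and those that visibly show not-Q (the hidden side might be P). The code below is my own illustration, not part of either experiment. The envelope faces are as described in the text; for the cards, only A and 7 are named in the notes, so the other two faces ('D' and '4') and the function name faces_to_turn are assumptions.

```python
def faces_to_turn(faces, p_holds, q_holds):
    """Return the labels that must be turned over to test 'if P then Q'.

    `faces` is a list of (label, side) pairs, where side is "p" if the
    visible face is the kind that settles P, or "q" if it settles Q.
    """
    chosen = []
    for label, side in faces:
        if side == "p" and p_holds(label):
            chosen.append(label)      # P is showing: hidden side might be not-Q
        elif side == "q" and not q_holds(label):
            chosen.append(label)      # not-Q is showing: hidden side might be P
    return chosen


# Statement (1): vowel on one side -> even number on the other.
# 'D' and '4' are assumed stand-ins for the two faces not named in the notes.
cards = [("A", "p"), ("D", "p"), ("4", "q"), ("7", "q")]
card_answer = faces_to_turn(
    cards,
    p_holds=lambda letter: letter in "AEIOU",
    q_holds=lambda number: int(number) % 2 == 0,
)

# Regulation (2): sealed envelope -> 5p stamp.
envelopes = [("sealed", "p"), ("open", "p"), ("5p", "q"), ("4p", "q")]
envelope_answer = faces_to_turn(
    envelopes,
    p_holds=lambda state: state == "sealed",
    q_holds=lambda stamp: stamp == "5p",
)

if __name__ == "__main__":
    print(card_answer)
    print(envelope_answer)
```

The same function, unchanged, handles both puzzles, and its selections agree with the answers given in notes 18 and 19; all that differs between the two calls is the content supplied for P and Q.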

From these results, Wason draws the following anti-Piagetian conclusion: 'Our results suggest that reasoning is radically affected by content in a systematic way, and this is incompatible with the Piagetian view that in formal operational thought the content of a problem has at last been subordinated to the form of relations in it' (Wason 1977, p.132). The general conclusion I want to suggest is closely related. If we imagine cognitive theories classified on the two dimensions domain specific/nonspecific and content dependent/independent (see Figure 4), and if we think Chomsky's arguments for domain specificity hold up, then the only cognitive theories which are plausible are those which allow for both the domain and content relatedness of our mental operations.(Note 20)

Figure 4


Arguments about general and special purpose intelligence aside, I think the most important point for teachers to notice about recent cognitive psychology is that it very strongly suggests that it is unfruitful to think of pedagogy in terms of notions of objective simplicity or logical progression in the material to be learnt by someone (Note 21) and much more fruitful to ask what is learnable at all, and what is more or less accessible to the learner in terms both of the learner's innate endowment and previous experience. Here there is no disagreement between Chomskyans and Piagetians, and it is to the work of a self-proclaimed Piagetian, Seymour Papert, that I now turn with a view to exploring the idea of the accessibility of knowledge to the child. Though Papert sees himself as in substantial disagreement with theorists like Chomsky and Fodor (see the many exchanges in Piattelli-Palmarini 1980) I suspect he is much closer in outlook to the mainstream of cognitive psychology than many other Piagetians, not least because of the fact that the field he works in is Artificial Intelligence (Note 22).

Papert's main work consists in devising computer-rich environments in which children can learn powerful ideas of physics, mathematics and music in the course of programming the computer to execute their ideas. That he gets children to program computers, rather than using computers to program children, is what differentiates his work from mainstream computer assisted instruction. In his experiments, which are now being repeated in Britain (both in schools and at University level; see Note 23), Papert uses small, mobile computer-controlled robots (called Turtles) and computer-controlled screen displays (Screen Turtles), both controlled by means of programs written in the high level computer language, LOGO. LOGO has the advantage over such computer languages as BASIC that it is much easier to learn enough of it to do interesting things; Papert is opposed to the use of BASIC as a first computer language for children or adults (see Papert 1980, pp 33-37).

However, it is not only by its form that Papert has made LOGO an accessible computer language; LOGO's interpretations have been developed in the context of a very sophisticated Piagetian concern with the accessibility of ideas. Papert takes seriously the part played by sensorimotor constructions in children's learning - constructions established, according to Piaget, in the first two years of prelinguistic life.(Note 24) What this means is that LOGO is so designed that it permits a sensorimotor 'interpretation'. The best example of this is provided by LOGO geometry as it contrasts with Euclidean and Cartesian coordinate geometry. In normal Euclidean geometry, points have no dimensions, no position and no direction; in Cartesian geometry, points acquire position (coordinates); finally in LOGO (or Turtle) geometry, a state of the Turtle is a point with a position and a heading. For Papert, the importance of this is that it means that states of the Turtle and their movement can be modelled by the human body which necessarily points somewhere whereas such modelling cannot be performed for Euclidean or Cartesian points. In Papert's neo-Freudian terminology, Turtle Geometry is 'ego-syntonic' (ego-compatible); in Piagetian terms, it is assimilable.
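The contrast between a Cartesian point and a Turtle state can be sketched in a few lines. What follows is an illustration in Python rather than LOGO, and the names TurtleState, forward, right and square are my own; the convention that heading 0 means 'facing up the page' and that turning right increases the heading is likewise an assumption made for the sketch. The point to notice is that every command is relative to where the Turtle already is and which way it is facing, which is what allows a child to act the commands out with his or her own body.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TurtleState:
    """A Cartesian point has only x and y; a Turtle state adds a heading."""
    x: float = 0.0
    y: float = 0.0
    heading: float = 0.0  # degrees; 0 = facing up the page (assumed convention)


def forward(state, distance):
    """Move `distance` units in the direction the Turtle is facing."""
    angle = math.radians(state.heading)
    return TurtleState(
        state.x + distance * math.sin(angle),
        state.y + distance * math.cos(angle),
        state.heading,
    )


def right(state, degrees):
    """Turn clockwise on the spot; position is unchanged."""
    return TurtleState(state.x, state.y, (state.heading + degrees) % 360)


def square(state, side):
    """A Turtle Trip: four equal steps, each followed by a quarter turn."""
    for _ in range(4):
        state = right(forward(state, side), 90)
    return state


if __name__ == "__main__":
    print(square(TurtleState(), 100))
```

Nothing in the square procedure mentions coordinates: the figure is described entirely from the Turtle's own point of view, and the Turtle ends the trip where it began, facing the way it started.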

Programming the Turtle in LOGO to execute Turtle Trips is to learn Turtle Geometry. But why, it may be asked, should anyone want to learn Turtle Geometry rather than classical Euclidean or Cartesian Geometry? (Note 25) This question raises quite fundamental issues, both in learning theory and educational politics. Papert himself uses two rather different lines of argument in defence of his geometry (and his physics). First, he argues that LOGO geometry and physics are conceptually richer than much of the conventional geometry and physics taught to children of the same age as those programming in LOGO. Thus, Turtle Geometry can introduce pupils to the idea of differentiation long before they could encounter it in conventional maths; and Papert contrasts school physics unfavourably with Turtle physics. The former, he says, is ersatz physics; the latter, 'offers a Piagetian learning path into Newtonian laws of motion' (Papert 1980, p.123). Second, however, LOGO is defended as offering not substitutes for conventional forms of knowledge, but rather transitional routes to them, transitional forms which are accessible to children because they take account of the Piagetian way in which children naturally learn.

To accept this second argument one has to overcome the undoubted prejudice that transitional forms of anything must be imperfect, failed, erroneous or otherwise defective copies of the real forms, whether Hirst's forms of knowledge or Chomsky's steady states of linguistic competence.(Note 26) To accept that there can be transitional forms of knowledge, however, is no more than an acceptance of the view that children genuinely develop, and do not just learn to make fewer mistakes. Such a shift in perspective can have quite major consequences, and has had (to take an example outside Papert's concerns but very relevant to them) in the field of second language teaching and learning, where it has been increasingly recognized that in learning a second language children (and adults, too, in this case) do not just make mistakes, but systematically construct intermediate models of the 'target' language (Note 27), making use of the information available to them, their native language, and universal grammar. These interlanguages, as they have been called (the term is due to Selinker 1972), which have proper structures in their own right, just like the child's first grammars and pidgins and creoles (to which interlanguages have been compared: see Corder and Roulet 1977), seem likely to provide a valuable index of human learning strategies, and further evidence for the characteristics of universal grammar.

It is one thing to recognize the existence of transitional forms of knowledge and interlanguages and another to promote them - to design them, in Papert's case. If they are to be designed for use, then it seems to me that one important criterion for their selection must be that the transitional forms are such that transition does, indeed, occur, and fixation on a transitional form is avoided. In second language learning, fixation is discussed under the name of 'fossilization', where the problem is roughly this: since it is possible as an English person to get by, communicatively speaking, with a pidginized version of French (what would truly be Franglais if the word had not been preempted for something else), the learner may have no motive to depidginize his or her French, to gain fluency, for example, in French intonation. Likewise, it may be that fixation on Turtle Geometry is as real a possibility as fixation on Space Invaders.(Note 28)

This problem aside, a further issue - and the last one I shall discuss in relation to Papert - concerns the accessibility to reflexive monitoring and control of the kind of knowledge acquired in Papert's computer-rich environments. This is an issue which is, of course, centrally important for educationists, but often treated in a cavalier fashion by cognitive psychologists (notably Fodor 1975, pp 52-53). In Papert's case, the issue arises partly because of his emphasis on Piagetian learning, what he calls 'Learning without being taught' (Papert 1980, p.7). But he explicitly rejects the criticism that Piagetian learning is less available to reflexive monitoring. First, he argues that learning which is often thought to be nonverbalizable is not in fact so; we just haven't thought hard enough about how to verbalize it - a position he elaborates and illustrates by spelling out a verbalized procedure for learning the physical skills of juggling (Papert 1980, pp 105-12). Second, the fact that the LOGO computer programs children write are inevitably error or 'bug' ridden (just like those of adults) means that children have to reflexively monitor their programs in a way which will increase their self-conscious understanding of what they are doing. It is in this connexion that Papert believes the teacher plays his or her key role.

To sum up so far. For all their differences, Chomsky and Papert are both theorists of human nature, but of a human nature which is not one of our handicaps - as it is in classical political theory and in Freud - but an advantage without which we would be both stupid and manipulable: stupid because we would only be able to derive weak or divergent inferences from the information given, manipulable because our thinking would be under stimulus control, as empiricists have always imagined it. Both Chomsky and Papert are also theorists of learnability and accessibility. For Chomsky, our complex languages are in order as they are because they are natural to us; any interference with the way they grow in us would be both unnecessary and misguided (like the speech enrichment programs of Operation Headstart).(Note 29)

For Papert, forms of knowledge can be made more accessible to young humans, and the business of the educationist is 'to think in a fundamental way about science in relation to the way people think and learn it' (Papert 1980, p.188). Though Papert actually thinks of himself as an environmentalist - a nurture rather than a nature man - and is surely more of an environmentalist than Chomsky, nonetheless his ideas of accessibility depend on a rich and complex theory of operations universal to humans, especially sensorimotor constructions. From his standpoint, as much as from Chomsky's, the kind of myopic nature/nurture debate which has nourished itself on Sir Cyril Burt's studies of nonexistent monozygotic twins is sheer evasion of both scientific and pedagogic responsibility to understand the minds of children.


I am conscious that so far I have drawn my illustrations of the concepts I have introduced (competence & performance, underdetermination & abduction, learnability & accessibility) from a limited range of the subjects taught in school. In the final part, I shall consider in an outline fashion what happens when the attempt is made to use the cognitive psychologist's concepts to think about another school subject, history. In this I shall be following my own nose, and can claim no authority for what I have to say. History happens to have been the subject I taught in my brief stint as a secondary school teacher.

First of all, I take the view that the development of a historical sense ought to be the chief aim of history teaching. And a historical sense seems to be something to which the concept of competence, and the competence/performance distinction, is at least loosely applicable. Of course, it's a difficult and contested matter to say what a historical sense is. I suppose it to include a sense of one's own position as a historical agent; of the dialectic of will and circumstance, subject and structure, in history (on this, see Thompson 1979 and the critique by Anderson 1981); of the multiple causality (or overdetermination) of historical events (Althusser 1969 and 1970); of the contrast between historical continuity and discontinuity. The list shows the influence on me of Marxist theories of history, but not, I think, in any tendentious way. A historical sense is put to work in concrete acts of historical judgement and understanding: that is the form performance takes.

Granted that a historical sense can be meaningfully spoken of in this kind of way, the learning theorist's questions concern how it develops and what sort of relationship exists between it and the facts of history as they are or could be presented. Of one thing I feel sure, and that is that a historical sense can no more be developed in children by straight off telling them in what a historical sense consists than linguistic competence can be developed by teaching grammar. Rather, a historical sense grows in responses, initially unreflected, to the texture of the historical facts and narratives encountered, only gradually being brought within the scope of reflexive illumination.

As for the relation between the sense of history and its basis in fact and narrative, I think that two negative theses can be sustained: that the facts and narratives presented to children underdetermine the development of a specifically historical sense; and that they fail to do so in two characteristic ways. First, it happens that the facts and narratives are merely learnt by heart, as rote: no leap at all is made beyond the information given, and the facts and narratives simply gather dust in the museum of memory. Aware that this has happened to a great deal of what they learnt in school, many wonder what was the point of it all. And wonder they should. Second, however, the historical record becomes the material out of which the child, or the adult still working over that material, fashions a myth, bricoleur fashion (see Levi-Strauss 1966 for the concept of bricolage, adopted, interestingly enough, by Papert 1980, p. 173 ff) (Note 30). The transformation of historical fact into historical myth is nicely illustrated by the anthropologist, Edmund Leach, who also provides his own explanation for what happens. This is what he says:

"For ordinary men, as distinct from professional scholars, the significance of history lies in what is believed to have happened, not in what actually happened. And belief, by a process of selection, can fashion even the most incongruent stories into patterned (and therefore memorable) structures. For a contemporary English schoolboy, the really memorable facts about English sixteenth century history are details such as the following:

a) Henry VIII was a very successful masculine King who married many wives and murdered several of them.

b) Edward VI was a very feeble masculine King who remained a virgin until his death.

c) Mary Queen of Scots was a very unsuccessful female King who married many husbands and murdered several of them.

d) Queen Elizabeth was a very successful female King who remained a virgin until her death.

e) Henry VIII enhanced his prestige by divorcing the King of Spain's daughter on the grounds that she had previously been married to his elder brother who had died a virgin.

f) Queen Elizabeth enhanced her prestige by going to war with Spain having previously declined to marry the King of Spain's son who had previously been married to her elder sister (Queen Mary of England).

It is not only in the pages of the Old Testament [the subject of Leach's article] that the "facts of history" come to be remembered as systems of patterned contradiction!" (Leach 1966, p.100; compare John Birtwhistle interviewed in Driver 1971, pp 312-13).

Now I doubt that myth is just the product of a drive to make memory more efficient; though the quotation suggests that, I don't actually think Leach believes it either. Myth also has what might be called its hermeneutic motivation: its own standards of intelligibility and ways of creating meaning. Levi-Strauss is only the most recent in a long line of researchers to tell us that this is rooted in human nature as a structured and structuring mythopoeic power, an analogue of universal grammar.(Note 31) But the power and virtues of myth are not the self-same as those of history, and when the school pupil transforms historical material in the way Leach suggests, the aim of history teaching is frustrated even though, of course, the data licence the mythic abductions made from them.

If these two claims about the fate of history teaching stand up, what kind of explanation should we give them? Why do children either leave historical data unanalysed or analyse it into myth? And how might a historical sense have been made more accessible to such children? Well, explanation of rote learning can surely be sought at fairly obvious cultural levels; for the schools where history is taught are actually dominated by what Freire calls the 'banking approach' to education (Freire 1972), and history examinations are notorious for their demands for Facts, Facts and still more Facts. Children can be forgiven for thinking that History is about historical Facts, for what else are they taught? The aims of education are buried beneath the organization of schooling (Pateman 1980).

The explanation of the mythic transformation of history is surely going to be more complex and obscure. On the side of the organisation of cultural reproduction, the content as well as the context of history teaching is not ideologically innocent; it is heavy with its own myths. Consequently, it is not a radical transformation to rewrite the contrast of 'Bad Queen: Good Queen' as that between 'Promiscuous Queen: Virgin Queen', nor to combine this with the opposite, but ideology-preserving, rewrite for Kings: 'Bad King: Good King' becoming 'Virgin King: Promiscuous King'. But is that all there is to it? My nativist sympathies incline me to the view that in some sense, myth is more accessible to humans than history, and that the development of a historical sense may even be something which goes against the grain of our minds. That can't be ruled out. Chomsky himself argues forcefully that humans may well be ill-adapted to certain kinds of knowing (see Chomsky 1975, especially Chapter 1 and 1980, Chapter 6 and passim). But why might that be so? My suggestion - and it is no more than that - is this: there is no intelligible order in the whole canvas of human history, and there is nothing in the historical sense, as I depicted it, which requires or anticipates such order. There is no Christian or Marxist revelation. Rather, as I depicted it, the historical sense is alert both to the openness of history - to the fact that it could always have been different - and to its closure - history cannot be unmade and remade. None of this is congenial to the order humans long for, especially the order of justice. But there is no justice in history; to borrow Hegel's expression, History is merely a slaughterbench. In contrast, myths display both order and an abundance of justice. Hence their attraction.

How then sustain the possibility of a sense of history, against the grain? How make this uncongenial knowledge more accessible? If this is more than a matter of changing the priorities of schools and the ideologies of this society, and is also a question of pedagogy, is there anything in the very different domain of history to be learnt from a Chomsky or a Papert? Or do we here re-encounter human nature in its negative form, as a limit on, rather than a condition of our enquiries? Perhaps we do. For consider two pedagogic alternatives to Kings and Queens history. First, suppose we introduce children to history through local or family history. Though we surely create the conditions for a triggering of some concept of historical agency, I doubt that we free ourselves from the mythic embrace. Local and family history is just as much in need of myth as national history; and family myths are likely to have an emotional importance much greater than national ones. So no easy answer to all the dilemmas here. Second, perhaps children could invent history and write invented histories, and get their historical sense that way: a sort of learning by pretending to do. But what would these histories be like? They would possess all the order of Utopias, and none of the chaos of History.

If I worked my imagination harder, I could perhaps come up with something which looked right to make a historical sense more accessible to children than I now believe it is.(Note 32) But I prefer to end with the suggestion that there are no obviously more accessible approaches to history; no transitional forms to real historical understanding; no interhistories. For in my general enthusiasm for cognitive psychology I don't want to suggest that it is a panacea. What I would like to see happen is for people outside the hard areas like maths and physics to play around with some of the cognitivists' ideas, just to see if our understanding of how to teach and learn painting, singing, considerateness for others, human biology, and so on and so forth, could benefit at all from contact with the ideas of competence and performance; underdetermination and abduction; learnability and accessibility.


1 I am grateful to Maggie Boden, Gerald Gazdar and Aaron Sloman for comments which prompted most of the revisions made, and more generally to the Cognitive Studies group at Sussex University from whose activities I have, I hope, learnt a great deal.

2 This conception of linguistics is, of course, disputed. See, for example, Cooper 1975, Itkonen 1978 and Katz 1981; and for a history and critique of Chomsky's positions, Steinberg 1975.

3 For a good discussion of the competence/performance distinction, see Pylyshyn 1973. See also Valian 1976. For some linguists and psychologists, Chomsky's competencism is not psychological enough and this has motivated approaches generically known as 'performancism'. See Bresnan 1978 and Fodor, Fodor and Garrett 1975.

4 The first major theoretical difference which Chomsky's approach makes is to introduce recursive rules as necessary components of any grammar of a natural language. See Chomsky 1957.

5 In my own mind, I have to go back to Macaulay's critique of James Mill's Government (Macaulay 1829) and to Frege's review of Husserl's Philosophy of Arithmetic (Frege 1894) to think of anything comparable - except that Frege's review changed Husserl's mind!

6 Aaron Sloman observes of this formulation that the infinity of possible projections is not important; all that matters is that the number of possible projections should be very large. He further observes that what is logically compatible may not be compatible with general constraints on intelligent systems: the need for speed, for economising on memory space, for coping with noisy and incomplete data, etc.

7 The example is due to Carl Hempel, and is used by J.A. Fodor in Piattelli-Palmarini 1980, pp 259-60.

8 The new and the old riddles are, of course, related, and Goodman's riddle belongs in a framework developed by Hume, Kant and C.S. Peirce.

9 Papert stresses this second half against what he sees as a residual passive behaviourism in Chomsky's account of language development (see Papert's remarks in Piattelli-Palmarini 1980, pp 96-97. See also Linell 1980).

10 What follows is indebted to a lecture by Margaret Boden on 'Piaget and Children's Errors', University of Sussex Education Area, Spring Term 1981.

11 Here I use 'abduction' for the genus of hypotheses, of which inductions are a species.

12 The specified subject condition (SSC) 'concerns rules that relate x and y in a structure such as (8), where the bracketed embedded structure is a sentence or a noun phrase: (8) ... x ... [ ... y ... ] ... SSC asserts roughly that no rule [of grammar - TP] can apply to x and y if the embedded phrase contains a subject distinct from y' (Chomsky in Piattelli-Palmarini 1980, p.41).

13 There are critical discussions of Chomsky's nativism in Edgley 1970 (Chomsky's response in Chomsky 1980); Quine 1972; Cooper 1975; Piattelli-Palmarini 1980; Katz 1981. For Fodor's nativism, see Dennett 1979, Chapter 6. See also, in general, Hamlyn 1978.

14 Cf. Roy Edgley, 'in one particular area empiricism is conspicuously parsimonious and imposes severe constraints on the allocation of innate structure as an explanatory factor: namely, in the explanation of learning or acquisition of knowledge. Why is this? .... The only plausible answer from the empiricist point of view seems to be not that empiricists have a distaste for innate structure as such, but that in their view any richer innate component would be incompatible with characterizing the process as one of learning'. (Unpublished letter to Noam Chomsky, 13 October 1969, pp 3-4. Quoted by permission.)

15 A theologizing echo of Galileo's 'Nature does not act by means of many things when it can do so by means of a few' (Dialogue Concerning the Two Chief World Systems).

16 This is not inconsistent with the very interesting fact that the structures of different languages pose children varying degrees of difficulty, with structurally different features of one language being learnt at an earlier or later age than their functional equivalents in others. See Slobin 1977 for some examples.

17 The difference between transformational and phrase structure grammars is described in every introductory linguistics textbook; such textbooks present phrase structure grammars as inadequate. The new wave in syntactic theory simply argues that they can be made entirely adequate. Some of the post-transformational grammarians share with Chomsky the view that linguistics is a branch of psychology and that grammars must be evaluated for 'psychological reality', but this has led some of them in the direction of 'performancist' grammars (e.g. Bresnan 1978). Other linguists tend to see linguistics as a branch of Mathematics, a view defended philosophically in Katz 1981 (for a critique, see Pateman (forthcoming)).

18 Turn over A and 7.

19 Turn over the sealed envelope and the envelope bearing the 4p stamp.

20 An alternative conclusion is suggested by Aaron Sloman, 'mental processes, like history, may be overdetermined - for example, most of our sensory information is very redundant, and we can cope with different kinds of degradation. Similarly our reasoning processes have access to general principles and domain specific information. Often the latter yields faster, more reliable solutions to problems. We fall back on general methods when we have to. Often they involve a bigger memory load, a more complex search space, longer sequences of inference steps. They are best avoided where possible - like trying to do arithmetic from Peano's axioms instead of memorised tables'.

21 Aaron Sloman puts it like this, 'If mature competence is made of many mutually recursive competences, then learning can't be a 'logical progression'. Various incomplete or incorrect structures need to be built temporarily, then modified through links to others, possibly constructed later.'

22 For example, Papert's account of how conservation might be achieved (in Papert 1980, p.169f) strikes me as close to the approach of Mehler and Bever 1967 which is cheerfully recognised by Chomsky as compatible with his own views in Chomsky 1968 p.80.

23 See Makins 1981. For a theory of human-program interaction based on my own experience of following a Papert-inspired programming course in the Cognitive Studies program at the University of Sussex, taught by the late Max Clowes, see Pateman 1981.

24 For an introduction to Piaget from a cognitivist standpoint, see Boden 1977.

25 Formally, Turtle geometry, like Cartesian geometry, is a representation of a subset of Euclidean geometry.

26 For Chomsky, language growth rapidly progresses to a steady state which, apart from vocabulary change, remains fairly constant through an individual's life. See, for example, Chomsky 1975.

27 Aaron Sloman observes that this is a necessary consequence of the need to learn systems of mutually dependent (mutually recursive), concepts, skills, 'axioms', etc. Cf. note 21 above.

28 This raises the interesting question of why fossilization doesn't occur in first language acquisition: children's grammars go on changing long after they can 'get by, communicatively speaking'. One possible answer is that fossilization doesn't occur, because children's grammars develop by a process quite different from that of restructuring (transformation of one system into another): for this idea, see White 1981.

29 See, for example, Bereiter and Engelmann 1966 and, for criticism, Labov 1977, especially Chapter 5, and Dittmar 1976.

30 I leave aside that the material may well have been a myth already; that does not affect my argument, which concerns the 'other face' of the theory of ideology: what subjects do with ideas, rather than what ideas do to them.

31 However, see Sperber 1973 and 1975 for relevant criticism.

32 Aaron Sloman comes up with the possibility of simulating historical processes on a computer, and observes that 'A Computational Modelling approach could help to bring out the stupidity of questions like, "what were the causes of X?" in a complex system with multiple feedback loops. The overdetermination of events might actually be understood'. Some work on these lines, designed for schoolchildren, is being conducted by Richard Ennals at Imperial College, London.