[to be published in Theory and Practice in Functional-Cognitive Space, edited by María de los Ángeles Gómez González, Francisco José Ruiz de Mendoza Ibáñez and Francisco Gonzálvez-García (John Benjamins, 2014+)]

Cognitive functionalism in language education

Richard Hudson

University College London, United Kingdom

Abstract

Functional pressures on language are always cognitive, and cognitive pressures are always functional, so cognitivism and functionalism combine to explain the structure of lexicogrammar - the continuum of lexicon and grammar - and also the statistics of language usage. As an example, the paper shows how Word Grammar explains the difficulty of centre-embedding in terms of dependency syntax combined with a general cognitive principle of binding, and also the benefits of non-canonical word orders (such as extraposition) in the lexicogrammar. These reordering options are part of the formal academic language that children learn through education, and education should be guided by linguistic research. This is a research area that calls for far more effort and collaboration with other disciplines.

Keywords

Word Grammar, word order, education, syntax, children

1. Cognitive functionalism

The terms cognitive and functional are often combined, as in ‘functional-cognitive space’ (Gonzálvez-García and Butler 2006), ‘usage-based functionalist-cognitive models’ (Butler 2006) or ‘cognitive-functional linguistics’ (espoused by a number of university departments). This is a healthy development, but it is important to remember that each term names a distinct set of assumptions. In linguistics, cognitivism applies the insights of cognitive science, including cognitive psychology, to the study of language, on the assumption that language is subject to the same constraints and principles as other areas of cognition. Functionalism, on the other hand, seeks functional explanations for language in terms of general assumptions such as the principle of contrast (minimize ambiguity). Cognitivism need not seek functional explanations, and functionalism need not seek cognitive underpinnings. Nevertheless, it makes perfect sense to combine them because (as I shall argue below) functional pressures on language are always cognitive pressures, and the effects of cognition on language are always functional. This dual perspective is one of the attractions for me of Chris Butler’s work, along with his unflagging determination to listen to, learn from and understand his colleagues.

Functional pressures must always be cognitive for three reasons: it is only through cognition that they apply to language, it is only because language is an example of cognition that they apply at all, and they cover the full range of cognitive processes as applied to language. To show the significance of these three claims, imagine a functional analysis which is completely divorced from cognition, such as a branch of the mathematical theory of communication. This would analyse the elements of any communication, such as a message, a medium, a sender, a receiver and a code, and the properties that any code would have to have in order to allow efficient communication. There would be nothing in the analysis about the code’s users, its history or its social significance. The only questions would involve efficient communication: how to measure it, and how to design a code so as to maximize it.
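
To make this contrast concrete, here is a minimal sketch of the only kind of question such a purely formal analysis would ask: how efficient is a code, measured against the entropy of its message source? The message probabilities and the code are invented for the example.

```python
import math

# Toy illustration of a purely formal question about communication:
# how efficient is a code, measured against the entropy of the source?
# The message probabilities and the code are invented for the example.

probabilities = {"A": 0.5, "B": 0.25, "C": 0.25}
code = {"A": "0", "B": "10", "C": "11"}  # a simple prefix code

entropy = -sum(p * math.log2(p) for p in probabilities.values())
average_length = sum(probabilities[m] * len(code[m]) for m in probabilities)

print(f"source entropy:      {entropy:.2f} bits per message")  # 1.50
print(f"average code length: {average_length:.2f} bits")       # 1.50, optimal here
```

Nothing in such a calculation mentions the code’s users, its history or its social significance; it measures efficiency and nothing else.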

In contrast, as soon as we bring cognition into the discussion the questions multiply. How easy is the code to learn? How does it change diachronically? What is its social significance as an important badge of group membership? How does it balance the needs of the speaker (e.g. for brevity) against those of the hearer (e.g. for explicitness)? Butler puts the complexities well in the following passage (Butler 2006:1):

“If we are to study language as communication, then we will need to take into account the properties both of human communicators and of the situations in which linguistic communication occurs. Indeed, a further important claim of functionalism is that language systems are not self-contained with respect to such factors, and therefore autonomous from them, but rather are shaped by them and so cannot be properly explained except by reference to them. Linguists who make this claim ... undoubtedly form the largest and most influential group of functional theorists. The main language-external motivating factors are of two kinds: the biological endowment of human beings, including cognition and the functioning of language processing mechanisms, and the sociocultural contexts in which communication is deeply embedded. We might also expect that a functionalist approach would pay serious attention to the interaction between these factors and the ways in which languages change over time, although in practice this varies considerably from one model to another.

The question of motivation for linguistic systems is, of course, not a simple one. Much of the formalist criticism of functionalist positions has assumed a rather naïve view of functional motivation, in which some linguistic phenomenon is explicable in terms of a single factor. Functionalists, however, have never seen things this way, but rather accept that there may be competing motivations, pulling in different directions and often leading to compromise solutions.”

This complex and sophisticated view of the pressures that shape languages has been expressed recently as ‘stable engineering solutions satisfying multiple design constraints, reflecting both cultural-historical factors and the constraints of human cognition.’ (Evans and Levinson 2009:1). For Evans and Levinson, the most significant property of language is its enormous diversity, which they hope to explain in relation to the multiple (and competing) design constraints. My only disagreement – a minor quibble about terminology – concerns their contrast between ‘cultural-historical’ and ‘the constraints of human cognition’: cultural-historical facts are themselves ultimately facts about human cognition. If the English word for ‘cat’ is CAT, this is only true because English speakers know it, act upon it and transmit it to the next generation. This is a very different kind of cognitive fact from the fact that working memory is limited, but cognitive it is nevertheless. I should therefore like to reword the quotation: ‘stable engineering solutions satisfying multiple cognitive design constraints, reflecting both variable cultural-historical knowledge and the permanent and universal constraints of human cognition.’ Similarly, Butler’s ‘sociocultural contexts’ are only relevant to the extent that they are part of speakers’ cognition.

If it is true that functional pressures are always cognitive, it is equally true that cognitive pressures are always functional, in the sense that they push language towards a better solution for one of the many competing design constraints. This claim is hard to test in the absence of a closed list of design constraints, so we might treat it as a premise to guide us in the search for design constraints: whenever we find a fact about language which seems to relate to cognition, we must find a design constraint to mediate between language and cognition. To take an elementary example, why does English rank the speaker above the addressee in the pronoun system, so that the presence of the speaker in a group forces the choice of we regardless of who else is in it? Even more interestingly, why do so many other languages do the same? True, some languages distinguish inclusive and exclusive pronouns for ‘we’, but (so far as I know) no language has a word for ‘you’ which may or may not include the speaker. Presumably the explanation lies in cognition, but it must include a design constraint such as the paramount importance of talking about oneself – a sad comment on human nature, perhaps, but apparently true.

If language is subject to functional pressures, what effects do these pressures have? If their effects are always cognitive, as I am suggesting, they must affect our minds first and foremost, and it is only via our minds that they affect our behaviour; so if I choose the word we rather than you to refer to a group including my addressee as well as myself, this is because my mind contains a ‘lexicogrammar’ which assigns each of these words a meaning which dictates this choice. (The term lexicogrammar is a very useful one from Systemic Functional Grammar for the continuum of lexicon and grammar, which has more recently been rediscovered by cognitive linguists – Butler and Taverniers 2008). The pressure shapes the lexicogrammar, which in turn affects our behaviour. But is it only via the lexicogrammar that functional pressures can affect our behaviour? The answer depends on how we define ‘lexicogrammar’, but there are some functional pressures whose effects clearly fall outside any familiar definition.

For example, if you and I are talking, we are more likely to understand each other if only one of us is talking at a time, for the simple reason that listening and talking compete for the same mental resources of attention. As with any pressure, this comes with a cost – a competing pressure that has to be balanced against it. If you are talking, and I have something to say, not only do I have to wait, but I also may have to take my place in a queue along with others who also have something to say. Consequently different communities develop different behavioural norms, ranging from complete anarchy to the rigid rules of committee meetings; and these norms affect our speaking behaviour in a striking way (Hudson 1996:133). But they cannot be part of the language system if this simply controls the ways in which words are combined, pronounced and interpreted. On the other hand, the rules for speaking or staying silent are equally clearly related to the language system, because they govern its use – when to use language and when not.

Some functional pressures clearly do affect the content of the language system, and others clearly don’t. But in between these two extremes, we find ‘weak’ pressures, where some kind of language behaviour is not actually dictated by the system, but is nevertheless typical throughout the community. An example that comes to mind is the use of directional expressions in English. If my wife is downstairs and asks me to join her, I believe I would say I’ll come down in a minute rather than simply I’ll come in a minute, even though the down is completely optional, and, in the situation concerned, completely uninformative. And I believe the same is true of any English speaker describing almost any movement or position which could be related to the deictic ‘here’. So in all the following examples, the bracketed expression is grammatically optional and situationally predictable, but nevertheless expected:

(1) I went (over) to Ben’s place the other day.

(2) It’s (up) in the spare bedroom.

(3) I’m driving (down) to Cardiff tomorrow.

I have no research evidence to support this claim, but my hunch is that the bracketed words are much more likely to be uttered than omitted. What is supported by research is the idea that our learning of language is ‘usage-based’ (Barlow and Kemmer 2000, Bybee 2010, Hudson 2007b, Tomasello 2003), which means that we maintain a mental record of the statistical patterns in other people’s behaviour; so a statistical tendency in other speakers’ usage may become part of my own behaviour (with the obvious feedback effects on other speakers).

But why should English speakers show this particular pattern? It might be just an arbitrary pattern which we reinforce in each other, like the pronunciation patterns which are so well documented in quantitative sociolinguistics (Hudson 1996: chapter 5). But it is much more likely that we have created our own local ‘functional pressure’ to specify deictic locations and directions, regardless of the hearer’s needs. If so, this would be an example of a functional pressure being created by collective linguistic behaviour, and then being learned and applied by every novice speaker. It would be reflected in the lexicogrammar by the particles which are tailor-made for this precise purpose, but their use is not governed by categorial rules. How, then, do we decide whether or not to use them?

This question is very similar to the one that arises in quantitative dialectology. For example, given that we all have a choice between a velar and an alveolar nasal in the suffix -ing (as in walking or walkin’), how do we choose between them? Labov and his colleagues and followers have shown very clearly that each speaker’s choices reflect rather precisely the choice-patterns of the speakers who have served as their models, but there is no agreed cognitive model for the mechanism of choosing. What I have suggested elsewhere is that a model should take the form of a cognitive network with dynamic activation levels which trigger choices (Hudson 2007a). Once a model is in place, it could be extended to non-categorial functional pressures such as the one discussed above. This is a major research challenge because it isn’t at all obvious how to build the network needed, but the project would certainly reveal a lot about the cognitive architecture behind human language.
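
By way of illustration only – this is a hedged sketch, not the network model of Hudson (2007a) – the following fragment lets stored usage counts set activation levels, adds a contextual boost, and makes the choice probabilistically. All names and numbers are hypothetical.

```python
import random

# A hedged sketch (not Hudson's 2007a model itself): stored experience of each
# variant sets a base activation level, context adds a boost, and the choice
# between variants is made probabilistically. All figures are hypothetical.

class Variant:
    def __init__(self, form, usage_count):
        self.form = form                # e.g. "walking" or "walkin'"
        self.usage_count = usage_count  # tokens heard from model speakers

def choose(variants, context_boost):
    # Activation = stored usage + contextual boost (e.g. formality of setting).
    activations = [v.usage_count + context_boost.get(v.form, 0) for v in variants]
    forms = [v.form for v in variants]
    return random.choices(forms, weights=activations, k=1)[0]

velar = Variant("walking", usage_count=70)     # heard 70% of the time
alveolar = Variant("walkin'", usage_count=30)  # heard 30% of the time

# In a formal setting the standard variant gets an extra boost.
print(choose([velar, alveolar], context_boost={"walking": 20}))
```

The point of the sketch is simply that no categorical rule is needed: the same mechanism could cover variable pronunciations and the non-categorial preference for deictic particles discussed above.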

The general challenge that linguistic theory faces is to relate functions to structures: how to build a model of language structure which takes account of functional pressures. The current proliferation of theories, including theories whose names contain the word functional, testifies to the difficulty of this project. One basic question is whether the functions might be so closely integrated into the system that they become part of it. Some theories do merge functions and structures in this way, but in my opinion it is a mistake; I shall consider two very different theories: Optimality Theory and Systemic Functional Grammar.

Optimality Theory is the extreme case because each functional pressure is represented directly as either a faithfulness constraint or a markedness constraint within the system (Newmeyer 2010); for instance, the process that inserts an epenthetic vowel in horses is triggered by the difficulty of pronouncing two adjacent sibilants. The trouble with building pressures into the system in this way is that it turns the pressures into concepts, so they only apply to the extent that speakers have the relevant concepts; but the fact is that adjacent sibilants (for instance) are hard to pronounce whether or not we ‘know’ this conceptually.
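
For readers unfamiliar with the mechanism, the toy fragment below shows the logic in question: a markedness constraint (*SS, ‘no adjacent sibilants’) is ranked above a faithfulness constraint (DEP, ‘no epenthesis’), so the epenthesised candidate wins. The transcriptions and the two-constraint grammar are simplified illustrations, not a serious OT analysis.

```python
# Toy sketch of Optimality Theory evaluation: the markedness constraint *SS
# (no adjacent sibilants) outranks the faithfulness constraint DEP (no
# epenthesis). Transcriptions and the constraint set are simplified.

SIBILANTS = set("sz")

def star_ss(candidate):
    """*SS: one violation per pair of adjacent sibilants."""
    return sum(1 for a, b in zip(candidate, candidate[1:])
               if a in SIBILANTS and b in SIBILANTS)

def dep(candidate, underlying):
    """DEP: one violation per segment inserted relative to the input."""
    return max(0, len(candidate) - len(underlying))

def optimal(candidates, ranked_constraints):
    # Tuple comparison is lexicographic, so constraints earlier in the list
    # take strict priority over later ones, as in OT.
    def profile(candidate):
        return tuple(constraint(candidate) for constraint in ranked_constraints)
    return min(candidates, key=profile)

underlying = "horsz"              # schematic input: 'horse' + plural -s
candidates = ["horsz", "horsiz"]  # faithful output vs epenthesised output
ranked = [star_ss, lambda c: dep(c, underlying)]

print(optimal(candidates, ranked))  # -> horsiz
```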

Systemic Functional Grammar keeps the functional pressures outside the system, but analyses the structure so that it reflects the functions closely. Both the paradigmatic system-networks and the syntagmatic structures of syntax are organised into a small number of ‘metafunctions’ – ideational, interpersonal and textual – each of which is responsible for a different set of functional pressures. This means that a clause has three different syntactic structures: an ideational structure for the basic referential meaning, an interpersonal structure showing how the speaker and addressee relate to this meaning, and a textual structure showing how it relates to what has been said already (Butler 1985, Halliday 1994). My objection in this case is that the analysis misrepresents the relation between functions and structures by concealing the tensions and conflicts. In my opinion, it would be much nearer to the truth to say that we try to use a single structure to perform a number of very different jobs at the same time, so there is no sense in which a single clause can dedicate one entire structure to each job. For example, the clause Does she love me? uses she love me to describe a situation (ideational), uses me and does she to relate it to the speaker and the hearer (interpersonal), and uses she to relate it to the previous discourse (textual); but these words are all closely integrated in a single structure, where the redundant does is the price we pay for this particular ‘engineering solution’ to the problem of satisfying these conflicting pressures.

But even if some attempts to relate structures to functions have been unsuccessful, we can all celebrate the twentieth century’s strong movement towards functionalism. Whatever we may think of specific theories, they are all trying to go beyond the mere analysis and description of language structures by looking for explanations. More recently, we have a separate movement towards cognitive analyses of language structures which explain how these structures relate to the rest of cognition. If we can marry the two strands, functional and conceptual, into a single cognitive-functional linguistics, then we have some hope of really understanding how language works.

2. Syntactic structure: Word order and dependency geometry

One area of language structure which has generated some particularly promising functional explanations is word order. Why are some basic orders so much more common than others? And why do languages provide so many alternative orders? Cognitive explanations have always been prominent in the sense that terms such as ‘given’ and ‘new’ have been used to capture some kind of mental reality, but it is only recently that these analyses have been able to build on work in cognitive science. One especially promising link relates word order to limitations on working memory; perhaps the best-known exponent of this link is Hawkins, who argues that basic word orders evolve so as to minimize demands on working memory (Hawkins 1994, Hawkins 1999, Hawkins 2001). I find his evidence and arguments compelling, and agree with his general conclusions.

However, any discussion of the effects of functional pressures on syntactic structure presupposes some general theory of syntactic structure, and I believe Hawkins’s case would be even stronger under a different set of assumptions. For him, syntactic structure is phrase structure, so words are related to each other only via shared ‘mother’ nodes; even if two words are adjacent, there is no direct syntactic relation between them. This analysis is not a helpful basis for explaining why syntax favours adjacency; nor is it promising as a basis for a cognitive theory of syntax, because it raises the obvious question: why can’t we link words directly to one another, using the same mental apparatus that we use in relating events or objects in other areas of life? For example, if we can represent the members of our family as individuals with direct relations to other individuals, why can’t we do the same with the words in a sentence?
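
To make the alternative concrete, the sketch below links each word directly to another word (its head) and sums the linear distances between words and their heads as a crude proxy for the load on working memory. The sentences, the dependency analyses and the measure itself are illustrative assumptions, not Hawkins’s phrase-structure metric or a full Word Grammar analysis.

```python
# Minimal sketch of direct word-to-word links (dependencies), using total
# head-dependent distance as a crude proxy for working-memory load. The
# analyses and the measure are illustrative assumptions only.

def total_dependency_distance(heads):
    """heads[i] is the index of word i's head, or None for the root."""
    return sum(abs(i - h) for i, h in enumerate(heads) if h is not None)

# "She looked up the number of the hotel": the particle stays next to the verb.
words_a = ["she", "looked", "up", "the", "number", "of", "the", "hotel"]
heads_a = [1, None, 1, 4, 1, 4, 7, 5]

# "She looked the number of the hotel up": the particle is far from the verb.
words_b = ["she", "looked", "the", "number", "of", "the", "hotel", "up"]
heads_b = [1, None, 3, 1, 3, 6, 4, 1]

print(" ".join(words_a), "->", total_dependency_distance(heads_a))  # 10
print(" ".join(words_b), "->", total_dependency_distance(heads_b))  # 14
```

On this kind of measure, orders that keep heads and their dependents close together come out cheaper, which is the intuition behind the working-memory explanations discussed above.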