Neurocognitive structure in the interplay of language and thought

To appear in:
Explorations in Linguistic Relativity
M. Puetz and M. Verspoor, eds.

Neuro-Cognitive Structure in the
Interplay of Language and Thought

Sydney M. Lamb
Rice University, Houston, Texas, U.S.A.

We see ... that language ... consists of a peculiar symbolic relation -- physiologically an arbitrary one -- between all possible elements of consciousness on the one hand and certain selected elements localized in the auditory, motor, and other cerebral and nervous tracts on the other .... Hence, we have no recourse but to accept language as a fully formed functional system within man's psychic or "spiritual" constitution.

--Edward Sapir (1921)

The call for papers for this symposium invited participants to focus on new research findings "that throw a special light on the links between language, culture and thought" and mentioned some interdisciplinary fields from which applicable new findings might be emerging. It is possible that new findings and contributions from external disciplines may not only shed light on questions raised in the past but also suggest reformulations of those questions. New knowledge, along with what we have known all along but have been failing to use imaginatively, may permit us to raise different questions about Whorf's assertions than those which have so often defined the issues in the past.

Among the several interrelated theories and hypotheses that Whorf proposed, the one that especially appeals to me and invites further inquiry is summed up in this oft-quoted statement:

We dissect nature along lines laid down by our native languages. The categories and types that we isolate from the world of phenomena we do not find there because they stare every observer in the face; on the contrary, the world is presented in a kaleidoscopic flux of impressions which has to be organized by our minds -- and this means largely by the linguistic systems in our minds. We cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way -- an agreement that holds throughout our speech community and is codified by the patterns of our language. The agreement is, of course, an implicit and unstated one, but its terms are absolutely obligatory; we cannot talk at all except by subscribing to the organization and classification of data which the agreement decrees.

I must confess that to me this statement is so self-evidently true — in all but one respect — that I find it hard to understand how anyone could disagree. Yet disagree they do, some people. For example, Steven Pinker finds it almost outrageously mistaken and even calls it "this radical position" (1994).

Now, I just said "in all but one respect". What is that one respect? It is Whorf's provision that the "agreement ... holds throughout our speech community". But I think we can show that people have different thought systems even within the same language-culture system. This position has been convincingly demonstrated, for example, by Deborah Tannen in her You Just Don't Understand (1990): Why do husbands and wives so often fail to understand each other? Because they are operating with different systems of concepts and with different interconnections of concepts with lexemes. They are thus able to use the same expressions, in the same language, to arrive at quite different thoughts.

Even more: We can find divergence of conceptualization within a single person’s cognitive system — a conventional metaphor for a concept which conflicts, for example, with a visual or motor image connected to the same concept; or even different and conflicting metaphors attached to the same concept.

This paper aims to explicate this statement of Whorf, to show not only that it is valid but why it has to be valid, since what Whorf proposes in this passage can be shown to follow as an inevitable consequence of the structure and operation of the human neurocognitive system. Before introducing the neurocognitive perspective, I shall attempt an informal sketch of some properties of our mental systems and then consider just what question(s) we are asking in connection with the interrelationships of language and thought.

Five Basic Properties of Mental Models

We can start with the observation that our thinking works along with our senses and the reports that we get through our linguistic systems to provide us with ‘pictures’ of the world. We humans are all model builders, building models of the world and of ourselves within that world. This modeling process, largely unconscious, begins in infancy, perhaps even before birth, and continues into adulthood, to some extent even to old age, subject to the limitations of senility. We think we have abundant knowledge of the world, and perhaps we do, but to the extent that it is not accurate what we have is illusions rather than knowledge.

It seems that the mental system automatically engages in certain basic strategies that are indispensable to its operation yet which necessarily involve simplification, hence imperfect representation. These basic modeling strategies result in the (usually unconscious) formation of four kinds of assumptions about the world. That is, the mental system, by its nature, assumes

(1) the existence of boundaries,

(2) the existence of enduring objects,

(3) a basic difference between objects and processes, and

(4) the existence of categories of objects and of processes and relationships.

Without such assumptions it can't operate at all. They are consequences of built-in properties of our perceptual and conceptual systems. They are thus involved in all our efforts to understand anything.

To these we can add a fifth property, the tendency to build semantic mirages. This tendency makes use of the twin processes of reification and what might be called the one-lexeme-one-thing fallacy, a process of conflating different concepts that are connected to the same lexeme, simply because of that shared lexical connection. This is the source of some of the intra-person variation in thought pattern mentioned above. Some pertinent semantic mirages of English are ‘thought’, ‘language’, and ‘consciousness’. For example, by reification of the term ‘language’ we are led to believe that there is such a thing as language, and by the one-lexeme-one-thing fallacy we are led to suppose that this term stands for just one thing, even though when we look closely we can see that it is used for a number of quite distinct collections of phenomena selected from the kaleidoscopic flux, including especially these three: (1) language as a set of sentences (e.g. Chomsky) or utterances (Bloomfield); (2) language as the system that lies behind such productions; (3) language as linguistic processes, as in the title of Winograd's book Language as a Cognitive Process (1980). Let us call these language₁, language₂, and language₃. Our cognitive systems are evidently tempted to conflate these three since the same term is being used interchangeably for all. Still other phenomena are labeled by this lexeme from time to time, providing the opportunity for further conflation. For example, we find Steven Pinker using the term for certain cognitive phenomena associated with language₂, namely the propensity and ability of children to learn languages (1994). 
Why this set of properties should be called ‘language’ is something you would have to ask Pinker about; perhaps he believes that this propensity and ability is explained as the operation of an innate cognitive foundation on which language₂ can be built, and since there is no readily available term for this notion he adapts the term ‘language’ by a kind of metonymy based on language₂. Whatever the explanation, having indulged in this semantic exercise he goes on to conflate this new sense of the term -- we can call it 'language₄' -- with language₂. By thus stretching the term ‘language’ along with the term ‘instinct’, which he uses to draw attention to the fact that language₄ is evidently innate, he gets the title of his book (The Language Instinct) and finds himself justified in making, and evidently believing, such statements as,

...some cognitive scientists have described language as a psychological faculty, a mental organ, a neural system, and a computational module. But I prefer the admittedly quaint term "instinct." It conveys the idea that people know how to talk in more or less the sense that spiders know how to spin webs. (p. 18)

Notice that the first part of this quotation and the passage "people know how to talk" work for language₂, but the term 'instinct' can only be justified, and that by a stretch, for language₄. It can't apply at all to language₂, since a French child raised in a Mandarin speaking environment will speak Mandarin but not French. I don't think any realistic appraisal of the phenomena can find any reason for considering language₂ and language₄ to be one and the same. This is not even to mention that a spider raised in isolation will nevertheless spin webs. It is not just by coincidence that this is the same Steven Pinker who, a little later in the same book, strenuously objects to Whorf's idea that language can influence thought. Those who doubt that language can influence thinking are unlikely to be vigilant for the effects of language on their own thinking.

Another semantic mirage related to the one-lexeme-one-thing fallacy is the unity fallacy, the illusion that a concept represents a unified object, which must be either present or absent as a whole in a given situation, rather than a (sometimes haphazard) collection of phenomena of the kaleidoscopic flux. In the case of language, we see this fallacy in questions like, "How many languages do you speak?". It leaves no provision for the case in which a person knows a little bit (say a few dozen lexemes) of, let us say, Swedish. The same fallacy leads to questions about the evolution of language. If earlier people either had language as a whole unit or didn't, there are serious problems in understanding the evolution of language. How do you get from no language to language fully formed in one generation? What did the first language mutant talk about, given that nobody else could understand him? These are questions that can arise only from a semantic mirage.

In general, whether it is within space or time or in more abstract conceptual dimensions, our mental systems impose boundaries on a world which does not itself have boundaries. Why? If they did not do so, it would not be possible to talk to one another about the world or to think about the things of the world. Although everything is connected in various ways to other things, hence ultimately to everything else, we can't talk or think about the whole world at once. Thus we have to cut up the kaleidoscopic flux, to segment it by imposing boundaries; and since those boundaries are imposed by our minds and are not really there, they can be regarded as illusory.

A pertinent example is words themselves. In ordinary speech they do not occur in isolation; rather, we get phonological phrases with no gaps corresponding to word boundaries. Yet our perceptual systems, seemingly without effort, extract words as units -- if we are hearing speech in a language we know, but not otherwise -- and treat them as separate units in the process of comprehension, just as if there were boundaries there. The boundaries are supplied by our mental systems.

Categorization goes hand-in-hand with segmentation. The world, infinitely complex with no natural boundaries and no two things completely alike, is modeled by our minds by means of these two tools: (1) segmentation, achieved by mentally imposing boundaries; and (2) the classification of the segments into categories on the basis of shared properties. But those shared properties do not include all the properties of the items categorized, only some of them. It would be impossible to use all of them since everything in the world is indefinitely complex, and so recognizing all or even too many of them would render categorization impossible.

It follows that all imposition of structure in our mental models is made at the cost of ignoring some properties of the phenomena modeled.
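
The point can be made concrete with a toy sketch in Python. Everything in it is assumed for illustration: the items, their properties, and the function name `categorize` are invented and come from no cited source. It shows only that grouping by a subset of properties necessarily ignores the rest, and that two systems attending to different subsets carve the same items into different categories.

```python
from collections import defaultdict

# Hypothetical items, each described by a few properties (invented data).
ITEMS = {
    "robin":   {"flies": True,  "lives_in_water": False, "lays_eggs": True},
    "penguin": {"flies": False, "lives_in_water": True,  "lays_eggs": True},
    "bat":     {"flies": True,  "lives_in_water": False, "lays_eggs": False},
    "trout":   {"flies": False, "lives_in_water": True,  "lays_eggs": True},
}

def categorize(items, attended):
    """Group items by the attended properties only; all others are ignored."""
    groups = defaultdict(list)
    for name, props in items.items():
        key = tuple(props[p] for p in attended)
        groups[key].append(name)
    # Sort for a stable, order-independent result.
    return sorted(sorted(group) for group in groups.values())

# Two 'systems' that attend to different properties of the same items.
by_flight = categorize(ITEMS, ["flies"])
by_eggs = categorize(ITEMS, ["lays_eggs"])
```

Here `by_flight` puts the bat with the robin, while `by_eggs` sets the bat apart from the other three; neither grouping uses all the properties of any item.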

Of course, these two fundamental sources of (often useful) illusion do not just operate in that order, first segmentation, then categorization. For the segmentation is done partly on the basis of properties of the segments which result; it is thus influenced by considerations of categorization. We do not ask which comes first; it is like the chicken and the egg. Similar considerations apply to the case of Pinker’s conflation of language₂ and language₄: of the two cognitive operations we can distinguish — the metonymic creation of a new sense for the term language and the application of the one-lexeme-one-thing fallacy — we don't want to suppose that they took place separately and in sequence; things like this seem to happen all at once.

Now, is there any reason to expect that all people, regardless of their different cultures and languages, share the same system of illusions? Would that not be a preposterous supposition? If we reject that unlikely possibility, we are accepting the proposal of Benjamin Lee Whorf. To understand how it is that, as Whorf pointed out, different peoples of differing linguistic and cultural backgrounds have different mental models of the world, we have only to appreciate the fact that any mental model is necessarily a simplified model — hence a distorted model — of what it is attempting to represent, and the rest follows. It is then easy to appreciate that the systems of different cultures are different, simply because they are imperfect in different ways. To verify this conclusion, we can find abundant evidence, and of course many examples were provided by Whorf and many more by others, including Chafe (this volume) and other participants in this symposium.

It is not just that our minds are mistaken about the world when they impose these structures, for they couldn't operate at all without doing so. They enhance our ability to cope with the world by building on our experience, including the indirect experience provided by linguistic inputs — that is, by hearsay. But the only way they can do so is to simplify — and that means to oversimplify — since without segmentation and categorization — processes of oversimplification — they couldn't organize our worlds at all. Thus it is inevitable that our representations of reality are necessarily filled with illusion. Although we can get convincing evidence of this fact by observing how different cultures, and even different people in our own culture, structure their projected worlds, we don't have to depend just on such evidence to reach the conclusion that our projected worlds are full of illusions, since we can deduce that fact just from consideration of the structural properties of the system we use for our knowing.

What Are We Asking?

In this exploration I am placing more emphasis on categories expressed by lexemes like nouns and verbs than upon grammatical categories, even though most of the literature on the ‘Whorf hypothesis’ dwells on grammatical categories, as if he wrote about only those. But some of his best examples, such as his well-known passage about how to express the notion of cleaning a gun-barrel in English and Shawnee (Figure 1), concern lexical more than grammatical phenomena. And I think that the passage quoted above about the kaleidoscopic flux makes more sense and is more powerful if interpreted in the context of concepts associated with nominal and verbal lexemes than grammatical categories.

FIGURE 1 ABOUT HERE

The distinction between lexical and grammatical is one of several dimensions of contrast to be found among the various proposals in the "complex of interweaving theoretical strands" of what Penny Lee calls the Whorf Theory Complex (1996:xiv). Besides this contrast, we have several other choices available when considering what questions to investigate. It might be a good idea to be clear about just what question or questions we would like to ask. Are we asking about

Language influencing thought?
Language influencing conceptualization?
Language influencing world view?
Language influencing perception?
Language influencing behavior?
All of the above?

Or should we be asking, along with John Lucy (1997:291), in terms of a two-step process: "how languages interpret experiences and how those interpretations influence thought"?

Another approach would have it that none of these formulations is quite right. It is easy to think in some such terms — one or more of these possibilities — taking these concepts, like language, thought, perception, behavior, as actual objects or entities of some kind, as if they had existence apart from human beings; to be more exact, as if they had some life of their own, apart from the human mind. But I'd like to suggest that thinking in such terms is in itself an example of just the kind of phenomenon Whorf was talking about, an example of language influencing thought -- in this case, through the process of reification, in which we are reifying ‘language’, ‘thought’, and so forth, and treating them as independent objects.

Penny Lee proposes one way of getting beyond this mode of thinking (1996: xiv):

In the realm of linguistic thinking there is little point in arguing about whether language influences thought or thought influences language for the two are functionally entwined to such a degree in the course of individual development that they form a highly complex, but nevertheless systematically coherent, mode of cognitive activity which is not usefully described in conventionally dichotomizing terms as either ‘thought’ or ‘language’.

This way of looking at the relationship seems fair enough as far as it goes; yet it isn't quite robust enough to satisfy some people, and I find myself among them. For we do seem to find in Whorf’s assertions a suggestion that in some way our linguistic systems are playing some kind of causal role. I am not ready to give up on this intriguing possibility.

I would like to propose an alternative way of looking at the situation. Instead of starting with elusive disembodied abstractions like ‘thought’ and ‘language’, we could start by talking about something relatively real, the human brain, and about language in relation to the brain. I will try to show how that perspective might reframe the questions for us.

The Cortical Information System

Each of us has an information system which we use to interact with the world, our personal information system. That world is of course not just external to the body, since it also includes information about the body itself: feelings of hunger and other sorts of feelings, knowing where our hands and feet are and what condition they are in, and so on. The system also includes information about the past, both external and internal events, both experienced and reported happenings, both true memories and false memories, both physical and mental events. To a limited extent for most people, the system also includes some information about itself.

This information system, implemented mainly in the cerebral cortex and associated white matter, which provides cortico-cortical connections, includes the linguistic system together with conceptual, perceptual, and other systems. Because of its extensive interconnections with these various other systems, the linguistic system enables us humans to report and think about experiences and imaginings of many different kinds, represented by activations in different modalities all over our brains. Figure 2, while it is highly simplified in relation to the actual information system of the human brain, provides some idea of the kind of structure involved.

FIGURE 2 ABOUT HERE

It appears from numerous theoretical and empirical studies (cf. Makkai and Lockwood 1973, Lamb 1999c) that most or all of these mental modalities are organized in the form of networks with multiple layers of structure, and this hypothesis is supported by neuroanatomical evidence (Kandel et al. 1991, Lamb 1999c:307-369). Of course, as they are all interconnected, these several systems are all portions of one large network.

Figure 2 summarizes in highly oversimplified form numerous hypotheses concerning our neurocognitive systems, some of which are easily taken for granted but none of which should properly be accepted without evidence. It identifies certain specific functional subsystems, and (without making specific locational proposals) suggests that each of them might occupy a relatively contiguous area. It also identifies connections between subsystems and shows many of them as bidirectional, by means of lines with arrowheads at both ends (cf. Lamb 1999a). It also includes some hypotheses concerning the relative locations of the different subsystems -- for example, the position of Phonological Recognition close to Auditory Perception. Whether these are actual properties of our cortical information systems or just matters of diagramming convenience is a question we will consider briefly. These and other questions relating to these hypotheses and the evidence supporting them are treated more extensively elsewhere (Lamb 1999c). For purposes of the present exploration it will be pertinent to consider the locations of just a few of the subsystems relative to one another along with a few of the cases of bidirectional connectivity. We shall consider the questions of localization after looking at the learning process.

The hypothesis of bidirectional connectivity in our systems is perhaps most readily supported by experience with our own perceptual systems. Most people can visualize objects and scenes -- cats, dogs, waterfalls, our bedrooms. And most of us can ‘hear’ the voices of friends or relatives speaking, or we can listen mentally to the opening lines of Beethoven's 5th symphony; and most of us engage in inner speech, during which we hear our own voices -- to be sure, not with the clarity that is present when actual sounds are being received through the ears. (It is reported that a significant percentage of people have little or no visualizing ability; they find statements like that just given about visualizing hard to believe.) Now, what is going on here? To me, the most likely explanation (in fact, the only likely one) for such ‘inner seeing’ or ‘inner hearing’ is that we are activating some of those same connections in our perceptual systems that get activated when we are getting actual sensory input. If I ask you to visualize a cat and you do so, you are activating those connections in your visual system as a consequence of linguistic rather than sensory input. If my suggestion to do so had been spoken rather than written, then, in terms of Figure 2, the pathway of activation would go from Auditory Perception to Phonological Recognition to Lexis (the grammatical recognition and production subsystems have been omitted from Figure 2 just to keep it from being too cluttered) to a location in Object Categories to Hi-Level Vision to Mid-Level Vision to Lo-Level Vision. Yes, all the way to low-level vision, for it is here that you have the actual visual features (which you can conjure up to the extent you care to work at it, unless you are one of those who lack the necessary connections) which are needed to make up the pointy ears, the whiskers, the yellow eyes, etc. (The diagram arbitrarily distinguishes just three layers for visual perception and in doing so presents a highly oversimplified picture; actually there are many more layers than three.)
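
The pathway just described can be traced in a small Python sketch. It is an illustration under stated assumptions, not a rendering of Figure 2: only the subsystems named in the passage are included, the set of links is assumed, and the helper names `build_graph` and `activation_path` are hypothetical. Because every link is stored in both directions, the same structure yields a route from hearing a word down to low-level visual features and a route from seeing something up to Lexis.

```python
from collections import deque

# Bidirectional links between subsystems named in the text. This is a small
# assumed subset of the connections in Figure 2, not the full diagram.
LINKS = [
    ("Auditory Perception", "Phonological Recognition"),
    ("Phonological Recognition", "Lexis"),
    ("Lexis", "Object Categories"),
    ("Object Categories", "Hi-Level Vision"),
    ("Hi-Level Vision", "Mid-Level Vision"),
    ("Mid-Level Vision", "Lo-Level Vision"),
]

def build_graph(links):
    """Store each link in both directions (feed-forward and feed-backward)."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def activation_path(graph, start, goal):
    """Breadth-first search for the shortest chain of subsystems."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route between the two subsystems

GRAPH = build_graph(LINKS)
# Hearing the word 'cat' and visualizing one.
spoken_to_image = activation_path(GRAPH, "Auditory Perception", "Lo-Level Vision")
# Seeing a cat and reaching its lexical node.
image_to_word = activation_path(GRAPH, "Lo-Level Vision", "Lexis")
```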

And so what we seem to have are perceptual pathways going in the reverse direction from that of ordinary perception. The kind of network structure needed to support this ability has to consist of both feed-forward and feed-backward connections -- from a given layer of structure to both upper and lower layers, and both to and from other subsystems. That is, these feed-forward and feed-backward connections can exist not only between immediately neighboring layers of the same subsystem but also between different subsystems, for example between the systems for Vision and Object Categories. This subject has been treated in greater detail elsewhere (Lamb 1997, 1999a, 1999c:132-136; cf. Damasio 1989a,b,c, Kosslyn 1983, Kosslyn and Koenig 1995).

Concepts are centrally important to this inquiry. A node for a conceptual category seems to have connections to/from a large number of nodes representing its properties, both to/from other conceptual nodes and to/from other subsystems. For example, concepts for categories of visible objects need connections to nodes in the visual area, those for audible objects to/from auditory nodes, and so forth. Taking the concept ^Ccat, for example, we have visual connections comprising what a cat looks like, auditory connections for the 'meow' and other sounds made by a cat, tactile connections for what a cat feels like to the touch; as well as connections to other concepts representing information about cats in the information system of the person in whose system these connections have been formed (Figure 3). And so a person's knowledge of cats is represented in the information system by a little network, actually comprising hundreds or thousands of nodes, including a visual subnetwork for the visual features, an auditory network for the 'meow', and so forth, all ‘held together’ by a central coordinating node, to which we can give the label ‘^Ccat’.

FIGURE 3 ABOUT HERE

The current impression that we have in our conscious awareness of a scene or a situation or a person results from a widely distributed representation of many nodes, usually of multiple subsystems; and it is the lower-level nodes whose activation gives us our conscious experience, while the function of higher-level ones is to provide coordination of those lower-level nodes, so that they are kept active in concert. This is important evidence of the need for distributed representations to be supported by higher-level local representations: It is those higher-level local nodes that provide, by means of their feed-backward connections, the coordinated activation of the nodes comprising the low-level distributed representations. They also make possible the coordinated spread of activation from one subsystem to another. The function of this central coordinating node, and the need to posit its presence in the system, are addressed in detail elsewhere (Lamb 1999b, 1999c:329-343, 366-369; cf. Damasio 1989a,b,c).
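
A minimal Python sketch may help to show what such a coordinating node does. The feature lists, the class name `CoordinatingNode`, and the rule that a fixed count of active features satisfies the node are all assumptions made for illustration; the actual system is described as having graded connection strengths and hundreds or thousands of nodes. The sketch shows only the two directions of flow: a few active low-level features can satisfy the central node (feed-forward), and the satisfied node then activates the rest of the distributed representation in concert (feed-backward).

```python
# Hypothetical feature nodes for the concept 'cat', grouped by subsystem.
CAT_FEATURES = {
    "visual": ["pointy ears", "whiskers", "yellow eyes"],
    "auditory": ["meow"],
    "tactile": ["soft fur"],
}

class CoordinatingNode:
    """A local higher-level node holding a distributed representation together."""

    def __init__(self, label, features, threshold):
        self.label = label
        self.features = features
        # Number of active features needed to satisfy the node (an assumption;
        # a fuller model would sum graded connection strengths instead).
        self.threshold = threshold

    def all_features(self):
        return {f for group in self.features.values() for f in group}

    def is_satisfied(self, active):
        """Feed-forward: do enough connected feature nodes send activation?"""
        return len(self.all_features() & set(active)) >= self.threshold

    def feed_back(self, active):
        """Feed-backward: a satisfied node activates all its feature nodes."""
        if self.is_satisfied(active):
            return set(active) | self.all_features()
        return set(active)

cat = CoordinatingNode("cat", CAT_FEATURES, threshold=2)
completed = cat.feed_back({"whiskers", "meow"})   # enough evidence
unchanged = cat.feed_back({"meow"})               # not enough evidence
```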

To get a handle on the question of the integrity and relative locations of the various neurocognitive subsystems, it is necessary to consider learning. This we shall do next.

To summarize the argument so far, the first point is: Let's be more realistic about concepts like thought and language and stop treating them as independent disembodied entities with lives of their own. The second point is: Consider the brain. Next, we consider the third point: learning. If the cortical information system is a network, its information is in the connectivity of the system rather than in the form of symbols or any such objects that would have to be stored somewhere. Therefore, learning has to consist of building connections.

Learning Looms Large

Relational networks as portrayed in most of the literature (e.g. Copeland and Davis 1980; Lamb 1966, 1970, 1984, 1994; Lockwood 1972; Makkai and Lockwood 1973; Schreyer 1976) describe, however imperfectly, parts of a typical cognitive system as it might exist at the end of a long series of learning steps. It is natural to ask how that network structure gets formed. How does the system get those seemingly ‘hard-wired’ connections that are seen in linguistic network diagrams? The preliminary answer, considered in more recent literature (Lamb 1997, 1999a,c), comes in two parts: first, there must be some genetically built-in structure that provides the potential for all of the connections that will eventually get formed; second, there must be many steps of building and adjusting connections to get from that initial state to the functioning state that represents an adult's capabilities. The abundant connections of that initial state need to be both local and long-distance: local for building connections within a subsystem, like higher-level phonological nodes for integrating lower-level phonological elements, and long-distance to allow for connections between different subsystems, such as between lexical and conceptual, between conceptual and visual, etc.

We need not suppose that all of the connections of a system actually get built as part of the learning process. And in fact such a supposition would create needless problems for the learning theory, for in that case the hypothesized learning mechanism would have to be endowed with some way of ‘knowing’ where to build the new connections needed for each particular aspect of a skill, and a means of ‘knowing’ would demand far more complexity than we actually need. There is a simpler alternative: to suppose that the genetically provided state of the network includes abundant connections proliferated by a built-in program, most of which connections will never become operative -- just as hundreds of eggs are laid by a turtle or insect, only a few of which will produce surviving organisms. We can suppose that those abundant latent connections, from each node to many nodes of other levels, start out very weak, in effect with near-zero strength. We can hypothesize, in harmony with Hebb (1949), that the fundamental learning process might consist of strengthening a connection when it is active while the node to which it is connected has its threshold satisfied by virtue of also receiving activation from other connections.
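
The learning rule hypothesized here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation drawn from the cited literature: the function name `hebbian_step`, the numeric strengths, the learning rate, and the saturating form of the increase are all invented for the example. What it keeps from the text is the condition for strengthening: a connection gains strength only when it is active while the receiving node's threshold is satisfied by activation from its other connections.

```python
def hebbian_step(weights, active, threshold, rate=0.2):
    """Return updated connection strengths for one node after one event.

    weights   -- dict mapping each incoming connection to its strength (0..1)
    active    -- set of incoming connections that are active right now
    threshold -- activation the node needs from its other connections
    rate      -- fraction of remaining headroom gained per event (assumption)
    """
    updated = dict(weights)
    for conn in active:
        # Activation arriving over the other active connections.
        support = sum(weights[other] for other in active if other != conn)
        if support >= threshold:
            # Saturating increase: the strength approaches but never exceeds 1.0.
            updated[conn] = weights[conn] + rate * (1.0 - weights[conn])
    return updated

# Two established connections and one latent connection of near-zero strength.
weights = {"established_a": 0.6, "established_b": 0.5, "latent": 0.01}

# The latent connection is active together with both established ones.
after_joint = hebbian_step(
    weights, {"established_a", "established_b", "latent"}, threshold=1.0
)
# The latent connection is active on its own.
after_alone = hebbian_step(weights, {"latent"}, threshold=1.0)
```

In the joint case only the latent connection is strengthened, since only it finds the node already satisfied by the others; active alone, it stays latent. No part of the procedure has to 'know' in advance which connection will be needed.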

This simple learning hypothesis eliminates the need for the system to ‘know’ how to build the precise connections that it must build for linguistic performance. It doesn't need to know at all; it just proliferates possibilities in advance and the learning process is one of selection. This is a Darwinian process like that which leads to the origin of species and to complex biological structures like eyes and brains and the elephant’s trunk (compare Edelman 1987). Nature didn’t have to know in advance how to construct an eye or a brain. At each of many steps in the process it proliferated possibilities, and those which succeeded passed their genetic material to the next generation.

The Darwinian features of this learning mechanism are in harmony with a bottom-up direction of learning -- in perception, for example, from the level of sensory input to successively higher levels of integration, leading up to conceptual structures. For language, bottom-up learning implies that a child learns to speak in single words before producing multi-word utterances, etc. This bottom-up hypothesis is supported by neurological evidence in that the progress of myelination of cortical nerve fibers begins with the primary cortical levels and moves successively higher. The development of species is also bottom-up, as is the development of complex biological systems like eyes and the mammalian brain. In the process of network structure building, latent connections get selected for specific functions first at lower levels, and it is only after nodes of a lower level have been recruited for specific functions that they can serve as ‘parent’ nodes for the next generation of nodes which will build upon them. That is, higher-level nodes cannot get recruited until a few of their incoming connections are able to be activated; and they cannot become consistently activated until the nodes from which these connections are coming have been recruited.

And so it makes sense to call the process Darwinian in that learning is not so much a building process as a process of selection. At every stage of learning we make selections from the abundant potential that has been provided in the form of latent connections. These abundant latent connections, proliferated and thus available throughout the system, also provide the enormous flexibility which our mental systems enjoy, their ability to learn about new things later in life which could never have been foreseen during childhood, their adaptability to novel conditions, their ability in many cases to compensate for damage to brain tissues, etc.

Conceptual nodes occupy upper levels of the cognitive system. The process of learning a concept is a matter of recruiting a node which can integrate information from perceptual as well as other conceptual locations. In the initial stages of learning a concept there may be only a few such connections, representing the properties present in awareness at the time of first learning. The activation of the properties that become connected to the concept node, either initially or later on (see below), can come either from direct experience, i.e. via the sense organs and perceptual cortices, or, very commonly, as a result of linguistic activation. In the latter case we are talking about activation of conceptual properties coming from phonological representations via lexical nodes.

The same process of strengthening connections applies both to the initial recruitment of a node and to its later refinement to adjust to new information coming in after the initial recruitment. Such fine-tuning operations are of two kinds: (1) adding 'new' connections, for properties of new exemplars that were not present at the time of initial learning of the concept; and (2) strengthening already established connections, for properties repeatedly associated with the concept. In keeping with the Darwinian features of the process as described, the adding of 'new' connections is not literally a matter of adding connections but of strengthening latent ones, just as in the initial recruitment process. The second of these two processes is one of adding further strength. We have to recognize that connections can vary in strength not just between the two values of latent and established but along a continuous scale from very weak to very strong. After a sufficient amount of experience (direct and through hearsay), those properties that are most frequently associated with a concept will have acquired great strength, while those only occasionally present will have remained relatively weak.
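
The strengthening process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the starting strength of a latent connection, the update rule (closing a fixed fraction of the remaining gap to a maximum of 1.0), the property names, and the exposure data are all invented for the example and are not part of the hypothesis itself.

```python
# Illustrative sketch only. LATENT, LEARNING_RATE, the update rule and the
# property names below are assumptions made for this example.
LATENT = 0.05         # assumed strength of a not-yet-selected (latent) connection
LEARNING_RATE = 0.2   # assumed fraction of the remaining gap closed per exposure


def strengthen(strength, rate=LEARNING_RATE):
    """Move a connection strength part of the way toward the maximum of 1.0."""
    return strength + rate * (1.0 - strength)


def learn(exposures, candidates):
    """Start every candidate connection as latent, then strengthen those whose
    property is present in each exposure; the others are left unchanged."""
    weights = {prop: LATENT for prop in candidates}
    for present in exposures:
        for prop in present & candidates:
            weights[prop] = strengthen(weights[prop])
    return weights


candidates = {"barks", "four_legs", "fur", "wears_collar", "has_wings"}
exposures = [
    {"barks", "four_legs", "fur"},
    {"barks", "four_legs", "fur", "wears_collar"},
    {"four_legs", "fur"},
    {"barks", "four_legs", "fur"},
]
weights = learn(exposures, candidates)
for prop, strength in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{prop:13s} {strength:.3f}")
```

No connection is created in this sketch; every one exists from the start and is only strengthened, so strengths end up on a continuous scale that tracks how often each property accompanied the concept.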

For all this to work we must also hypothesize that each such node has a threshold function such that a greater amount of incoming activation leads to a greater amount of threshold satisfaction, causing the node to send varying degrees of activation out to other nodes: strong activation if the threshold is strongly satisfied, weak if only slightly satisfied, none if the incoming activation doesn't reach the threshold at all. It follows that a part of the learning process has to consist of adjustments in the threshold so that the node will be neither too easily satisfied nor too stringent in its demands.
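
A graded threshold function of this kind might be sketched as follows. The linear form, the ceiling on output, and the fixed-step adjustment rule are assumptions chosen for simplicity; the text requires only that output grow with the degree of threshold satisfaction and that the threshold be adjustable during learning. The `should_fire` signal is a hypothetical stand-in for whatever feedback tells the system that the node responded wrongly.

```python
# Illustrative sketch only: the linear output, the ceiling and the fixed-step
# adjustment rule are assumptions, not claims about the actual mechanism.

def node_output(incoming, threshold, ceiling=1.0):
    """Return no activation at or below threshold; above it, activation
    grows with the excess, up to a ceiling."""
    if incoming <= threshold:
        return 0.0
    return min(ceiling, incoming - threshold)


def adjust_threshold(threshold, incoming, should_fire, step=0.1):
    """Nudge the threshold after one experience.

    should_fire is a hypothetical feedback signal saying whether the node
    ought to have responded to this input.
    """
    fired = node_output(incoming, threshold) > 0.0
    if fired and not should_fire:
        return threshold + step            # node was too easily satisfied
    if should_fire and not fired:
        return max(0.0, threshold - step)  # node was too stringent
    return threshold                       # response was appropriate


for incoming in (0.5, 1.2, 1.8, 5.0):
    print(incoming, round(node_output(incoming, threshold=1.0), 3))
```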

Although the first step of learning a concept may result from a single exemplar, so that the node for that moment responds to a single object, the strengthened connections, representing the perceived properties of that object, would rarely be specific to that one exemplar, and so would immediately allow for recognition of multiple similar objects comprising, with that initial exemplar, a category rather than just that one object. And as the process of fine-tuning progresses, as a result of further experience, the node and its connections will progressively refine, in effect, the definition of the membership of the category based on properties experienced as associated with it, giving greater weight to those experienced as more important. The node's threshold will then be satisfied by any member of the category defined by its connections. It will have learned to be satisfied by a sufficient amount of activation from among all of the nodes representing its properties, and it will automatically exhibit prototype effects, since it will respond more strongly to prototypical exemplars than to peripheral ones. Why? Because the prototypical ones are those with the strongest and the most connections from the properties associated with the category.
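
The prototype effect follows directly from weighted connections plus a threshold, as the following sketch suggests. The category, the property names, the connection strengths, and the threshold value are all invented for illustration; only the mechanism (summed weighted input compared against a threshold) is taken from the text.

```python
# Illustrative sketch only: the category, weights and threshold are invented.
WEIGHTS = {          # assumed learned connection strengths for one concept node
    "flies": 0.9,
    "feathers": 0.9,
    "sings": 0.6,
    "small": 0.5,
    "swims": 0.1,
}
THRESHOLD = 0.8      # assumed threshold of the concept node


def response(properties):
    """Sum the weighted activation from the properties present and return
    the amount by which it exceeds the threshold (zero if it does not)."""
    incoming = sum(WEIGHTS.get(prop, 0.0) for prop in properties)
    return max(0.0, incoming - THRESHOLD)


prototypical = {"flies", "feathers", "sings", "small"}   # many strong connections
peripheral = {"feathers", "swims"}                       # few, one of them weak
non_member = {"small"}                                   # below threshold

for name, props in [("prototypical", prototypical),
                    ("peripheral", peripheral),
                    ("non-member", non_member)]:
    print(f"{name:13s} {response(props):.2f}")
```

Both exemplars satisfy the threshold and so count as members, but the prototypical one drives the node much harder because it activates more connections, and stronger ones.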

Another consequence of the learning process according to this hypothesis is that each concept ends up as highly selective in relation to the potential range that was available to it before learning occurred. We can see this selectivity and the range of the potentials in two ways. First, the possibilities which the world presents are indefinitely varied -- it is, after all, a kaleidoscopic flux. The system of categories that a person ends up with is the result of many individual processes of selection of certain features of that kaleidoscopic flux for representation in the system among the indefinitely many other possibilities which remain more or less ignored. Second, the means by which all this is accomplished is also a matter of selection: it is the selection of certain connections for strengthening while others remain latent, and of the further strengthening of selected connections among those strengthened earlier.

Moreover, this highly selective structuring imposed upon the kaleidoscopic flux is not a consequence of limitations in our sense organs, in our ability to receive inputs from the world -- even though such limitations do of course exist. The child who is building an information system has no problem with being able to discriminate or to learn to discriminate myriad visual and other perceptual properties. The possibilities available for the child's sensory appreciation are abundant beyond measure. But the process of constructing the information system is compelled by inner necessity to be selective. And what guides the selection? It is other members of the community in which the child is growing up. The child learns to associate certain selected perceptual properties with every concept being learned (except for the abstract concepts, which are even more heavily dependent on language), and ends up with a system of conceptual categories very much like that of the rest of the community. And how does the child learn which perceptual properties to emphasize and which ones to ignore? Through language. Not because someone instructs the child by saying that property p is important for concept C, but just by naming exemplars of categories, either directly or indirectly. If an older sibling says, "here, doggie!" to a newly encountered creature, that is enough information to allow the younger one to reinforce the connections from the perceptual features of this creature to the node for the developing conceptual category for ^Cdog, activated from the linguistic system. The system continues its fine-tuning operations in order to become like those of others in the community, in order to be able to communicate with them: "We cannot talk at all except by subscribing to the organization and classification of data which the agreement decrees".

To sum up, what the child does is to learn, by means of language, to make the distinctions that others have been making.

The Proximity Principle

Although the learning hypothesis assumes the availability of abundant latent connections, it seems altogether unlikely that the cortex has connections from every location to every other one, and in fact that possibility really has to be ruled out, even locally. This statement is supported by clear neuroanatomical evidence (e.g. Abeles 1991). But it is perhaps reasonable to assume that the latent connections are abundantly proliferated locally and that, as a result of a long process of 'evolutionary learning', sufficient long-distance connections are available to non-local areas. But for the latter, the long-distance connections, it is reasonable to suppose that they are relatively limited in comparison to the local ones. They could be of two kinds: to relatively nearby areas and to distant areas. The latter would be provided for if the brain’s genetic endowment includes long-distance ‘cables’ from certain areas to certain other areas. And we know from neuroanatomy that such cables do exist, the most important for language being the arcuate fasciculus, which connects the Phonological Recognition area to that for Phonological Production.

In any case, it would be a reasonable prediction from this learning hypothesis that if the system needs to connect nodes of two subsystems which are distant from each other, the most likely location for a node that would have latent connections available from both would be in an area intermediate between them. Why? Because a system with this property makes fewer demands on the amount of latent connections that need to be provided by the genetic endowment. This is the general situation for learning of the type which integrates information from more than one subsystem. This situation includes the learning of concepts, which must integrate perceptual information from more than one perceptual modality along with lexical information, and it includes also the nodes for lexemes, each of which has to provide a bridge from a phonological location to a conceptual or other location. The other situation is that in which a node being recruited for a new function is only integrating features from one subsystem, as when a complex phonological expression is learned as a composite of two simpler ones. In this situation it is perhaps even more reasonable to hypothesize that the newly recruited node is likely to be close to the nodes for the properties being integrated, and for the same general reason: such a scenario requires far less extensive latent connections in the system than one which would allow such an integrating node to be farther away.

As a result, it will generally turn out that, other things being equal, integrating nodes will tend to be maximally close to the nodes for the features which they integrate. This consequence may be called the proximity hypothesis. This hypothesis relates function to location. It comes in two varieties:

(1) A node being recruited to integrate a combination of properties whose nodes are close to each other will tend to be maximally close to the nodes for those properties.

(2) A node being recruited to integrate a combination of properties whose nodes are not close to each other will tend to be in an intermediate location between the nodes for those properties.

An incidental consequence of this hypothesis is that close competitors -- that is, nodes for similar functions -- will tend to be physically close to one another. It follows that nodes which are physically close to one another will tend to have similar functions.
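
The two varieties of the proximity hypothesis can be illustrated with a toy geometric sketch. Everything here is an assumption made for the example: the cortex is reduced to a flat 11 x 11 grid of candidate locations, and the economy argument is modeled by recruiting the candidate that needs the shortest reach, i.e. the one whose longest connection to any of the property nodes is smallest. The hypothesis itself says only that integrating nodes tend to be as close as possible to what they integrate.

```python
import math

# Illustrative sketch only: the flat grid and the "shortest longest connection"
# criterion are simplifying assumptions, not part of the hypothesis.


def required_reach(candidate, sources):
    """Length of the longest latent connection this candidate would need
    in order to receive input from every source node."""
    return max(math.dist(candidate, source) for source in sources)


def recruit(candidates, sources):
    """Pick the candidate location that makes the least demand on reach."""
    return min(candidates, key=lambda c: required_reach(c, sources))


grid = [(float(x), float(y)) for x in range(11) for y in range(11)]

# Variety (1): property nodes close to one another.
near_sources = [(2.0, 2.0), (4.0, 2.0), (3.0, 4.0)]
near_node = recruit(grid, near_sources)

# Variety (2): property nodes in two distant areas.
far_sources = [(0.0, 5.0), (10.0, 5.0)]
far_node = recruit(grid, far_sources)

print("local integration:       ", near_node)
print("long-distance integration:", far_node)
```

Under these assumptions the node integrating nearby properties is recruited in their midst, and the node integrating two distant areas is recruited midway between them.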

The Language Cortex

Based on the proximity hypothesis we can now interpret Figure 2 (above) as more than a functional description of the various subsystems and their interconnections. We can also support two principles suggested by the figure that up to now may have seemed intuitively acceptable but for which we really had no supporting argument: (1) to a large extent each of the subsystems may be subserved by a geographically coherent area of the cortex; (2) areas which are connected to two or more other areas should, other things being equal, be roughly intermediate in location between the areas they are connected to. So, for example, the hypothesis predicts that lexical nodes ought to be in intermediate locations between conceptual nodes and phonological nodes, and that conceptual nodes for objects which are both visible and audible should be in an area intermediate between the visual and auditory areas of the cortex. The figure was drawn following these two principles in the first place because to do otherwise would have resulted in a far more complex diagram. But now we have a theory to justify the policy followed and to support an interpretation of the figure that is more than just an abstract functional one.

The proximity hypothesis also permits us to formulate hypotheses of likely locations in the cortex of the different neurocognitive subsystems, starting from the primary areas, whose locations have been well known for decades. It allows us to predict that the Phonological Recognition area ought to be relatively close to the primary auditory area, and intermediate between that area and the lexical area, and so forth. And since conceptual nodes for objects which are both visible and audible should be in an area intermediate between the visual and auditory areas of the cortex, we can propose that they are likely to be in the posterior temporal lobe. In short, the proximity hypothesis and its corollaries allow us to make various predictions about likely locations of subsystems in the cortex, including nodes like those of Figure 3. We can test and refine such predictions against what is known about localizations in the cerebral cortex using results from aphasiology (cf. Goodglass 1993, Benson and Ardila 1996) and other areas of neuroscience, including brain imaging (cf. H. Damasio 1991). Such checking provides encouraging confirmation as well as adjustments to preliminary guesses (Lamb 1999c:349-365). In fact we are able with some degree of assurance to propose hypothetical localizations like those shown in Figures 4 and 5.

FIGURE 4 ABOUT HERE

FIGURE 5 ABOUT HERE

Top-Down Effects in Perception

To sum up what we have so far, our information about a concept is widely distributed, and the distributed representation is held together by localized integrative or ‘convergence’ nodes at higher levels, which provide potentially multiregional retroactivation of lower-level nodes by virtue of bidirectional connections. Feed-backward activation from a category node to the nodes for its relevant properties provides heightened activation to that subset of nodes currently receiving activation from the senses, resulting in increased attention to the properties relevant to that category; and it also triggers inferences, as activation of properties normally associated with the category but not currently receiving sensory input -- for example, a portion of a cat's body which is obscured from sight by an intervening object. When we see a cat's head emerging from behind a sofa, we don't say, "Oh, look, there's a cat’s head!" No, we assume that a whole cat is there as our perception system fills in predicted features of the rest of the body by means of top-down activation. Some such inferences may be unwarranted in the particular instance; this is the source of errors in thinking associated with ‘thinking in categories’.
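
The cat-behind-the-sofa example can be traced through a minimal sketch of bidirectional activation. The property names, connection strengths, threshold, and feedback gain are all invented for illustration, and a single pass up and back down is a deliberate simplification of what would be a continuous process.

```python
# Illustrative sketch only: weights, threshold, feedback gain and the single
# up-and-down pass are assumptions made for this example.
CAT_CONNECTIONS = {   # assumed strengths between the category node and its properties
    "cat_head": 0.8,
    "whiskers": 0.6,
    "cat_body": 0.8,
    "tail": 0.7,
}
THRESHOLD = 1.0       # assumed threshold of the category node
FEEDBACK_GAIN = 0.5   # assumed scaling of feed-backward activation


def perceive(sensory):
    """Return property activations after one bottom-up and one top-down pass.

    sensory maps property names to activation arriving from the senses;
    properties not mentioned are receiving no sensory input.
    """
    bottom_up = sum(strength * sensory.get(prop, 0.0)
                    for prop, strength in CAT_CONNECTIONS.items())
    category = max(0.0, bottom_up - THRESHOLD)
    # The same connections carry activation back down: sensed properties are
    # heightened, and unsensed properties of the category are filled in.
    return {prop: sensory.get(prop, 0.0) + FEEDBACK_GAIN * strength * category
            for prop, strength in CAT_CONNECTIONS.items()}


head_only = perceive({"cat_head": 1.0, "whiskers": 1.0})
faint_cue = perceive({"tail": 0.5})
print(head_only)
print(faint_cue)
```

When the head and whiskers are seen, the category node is satisfied and the unseen body and tail receive inferred activation; when the input is too weak to satisfy the threshold, nothing is fed back and nothing is filled in.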

Together, these properties provide top-down effects in perception: a heavy influence of the system, representing information and beliefs already present in it as a result of accumulated previous experience, upon the interpretation of new sensory input. The model would thus appear to account for how it is that, to a large extent, we see what we are looking for and what we expect to find, as much as or even more than what is actually there. Moreover, that previous experience which has built our cognitive systems includes not only the results of our direct experience (as mediated by earlier stages of our perceptual-conceptual systems) but also the results of information received from others via the linguistic system, which has influenced the construction of our conceptual systems.

The Basic Puzzle and a Solution

We are now ready to return to the questions raised at the outset of this paper. First, we have the question of just what we are asking, and then we have the problem of coming up with an answer to the question(s) we choose to ask. Our basic questions we can now consider in the context of the structure of the neurocognitive system, and I would like to propose that there are two of them: (1) How can language influence thought? (2) How can language influence perception?

First, the influence of language on thought. Here we need to distinguish two subtypes. First, the cases involving semantic mirage. These are the ones which rely on reification and the one-lexeme-one-thing fallacy. For these it is quite easy to see an influence of language on thinking (as in the example given above), and we don't need to be detained further by them. Second, we have the type of thinking which is driven primarily by the concepts involved rather than by their lexical connections. This being the case it is not so obvious how language could be influencing the thinking. As this question is similar to but less complex and less intriguing than that of how language could influence perception, let us turn to the latter. The answer to it will apply also here.

Second, then, is the question of how language can influence perception. This is the one I find the most interesting. And such influence I take to be implied by Whorf's statement: "The categories and types that we isolate from the world of phenomena we do not find there because they stare every observer in the face; on the contrary, the world is presented in a kaleidoscopic flux of impressions which has to be organized by our minds -- and this means largely by the linguistic systems in our minds."

If we look at perception in connection with Figures 2, 4, and 5, it is not at all apparent how language could influence perception. A perceptual process, say seeing, starts from the eyes, goes through the several layers of visual structure and from there to conceptual structure and only from there to the linguistic subsystems if the subject is motivated to engage in linguistic activity as a result of what has been perceived -- perhaps to say "Henry, do you know that your cat is clawing your oriental rug?". The activation of linguistic subsystems would appear to come only after that of the perceptual areas. So how can language influence perception? It may seem that some mysterious -- even mystical -- process is involved, or maybe just an imaginary process.

In thinking about the possibility that language may influence perception or thought, it is easy to suppose, if we are letting our thinking about this question be influenced by the words we are using, not only that these abstract objects -- language and thought and perception -- have a life of their own, apart from the minds of human beings, but that any such operation of language upon thinking or perception must be taking place at the same time as the thinking (or perception) being affected. But it needn't be so. And in fact the only way to take the mystery out of the process -- to solve the puzzle -- is to recognize that it isn’t so.

We need to recognize two different time periods. In the later one, at the actual time of the thinking and perceiving we are interested in, two important factors are operating:

(1) the mutual activation of the conceptual categories and perceptual distinctions which are present in the system at this time of operation;

(2) top-down effects in perception, from conceptual structure to high-level perceptual layers and from higher-level to lower-level perceptual layers.

The other time period is an earlier one, actually several earlier periods in the usual case, often going back to the childhood of the individual involved. At these earlier periods, the conceptual and perceptual structures are being built and fine-tuned, largely through the operation of linguistic inputs to the system. Here is where the most important role of language comes in, during the construction and refinement of the conceptual and perceptual systems -- during the learning processes.

Thus there is a long time delay between the time of linguistic influence and the time of the thinking and perception being influenced.

And so I’d like to propose that the process works roughly as follows: Our thinking is largely the operation of our conceptual systems, and therefore it depends upon the structure of those systems. Also, our perception is dependent upon the structure of the perceptual networks and is affected by our conceptual system through the operation of top-down effects in perception. And, our conceptual systems were built and our perceptual systems shaped, mostly in childhood, under the heavy influence of language. Therefore, it is not the case that, in some mysterious way, language is influencing thought and perception at the time the thinking and perceiving are occurring; rather it is the influence of languaging during childhood that is affecting thinking and perceiving throughout later life.

When we were children we accepted the illusions of our parents and older siblings and friends and teachers, knowing no better than to trust them. And by what means did we do this? Of course, it was largely through language. They told us, in effect, what to believe about the world. Here, then, we have a clear causal relationship: It is largely through language that each generation learns the system of boundaries and categories and semantic mirages projected onto the world by its culture.

References

Abeles, M. 1991. Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge University Press.

Benson, D. Frank, & Alfredo Ardila. 1996. Aphasia: A Clinical Perspective. New York: Oxford University Press.

Copeland, James, & Philip Davis (eds). 1980. Papers in Cognitive-Stratificational Linguistics. (Rice University Studies 66:2).

Damasio, Antonio. 1989a. Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition 33.25-62.

----------. 1989b. The brain binds entities and events by multiregional activation from convergence zones. Neural Computation 1.123-132.

----------. 1989c. Concepts in the brain. Mind and Language 4.24-28.

Damasio, Hanna. 1991. Neuroanatomical correlates of the aphasias. Acquired Aphasia, ed. by Martha Taylor Sarno, 2nd edition, 45-71. San Diego: Academic Press.

Edelman, Gerald M. 1987. Neural Darwinism: The Theory of Neuronal Group Selection. New York: Basic Books.

Goodglass, Harold. 1993. Understanding Aphasia. San Diego: Academic Press.

Hebb, D. O. 1949. The Organization of Behavior. New York: Wiley.

Kandel, Eric R., James H. Schwartz & Thomas M. Jessell. 1991. Principles of Neural Science (3rd ed.). New York: Elsevier.

Kosslyn, Stephen M. 1983. Ghosts in the Mind's Machine. New York: Norton.

---------- & Oliver Koenig. 1995. Wet Mind. New York: The Free Press.

Lamb, Sydney M. 1966. Outline of Stratificational Grammar. Washington: Georgetown University Press.

----------. 1970. Linguistic and cognitive networks. Cognition: A Multiple View, ed. by Paul Garvin, 195-222. New York: Spartan Books. Reprinted in Makkai & Lockwood 1973.

----------. 1984. Semiotics of Language and Culture. Semiotics of Culture and Language, ed. by Fawcett, Halliday, Lamb, & Makkai, 71-100. London: Frances Pinter.

----------. 1994. Relational network linguistics meets neural science. LACUS Forum XX.151-178.

----------. 1997. Bidirectional processing and expanded relational network notation. LACUS Forum XXIII.109-124.

----------. 1999a. Bi-directional processing in language and related cognitive systems. Usage-Based Models of Language, ed. by Michael Barlow & Suzanne Kemmer. Stanford: CSLI Publications.

----------. 1999b. Local and Distributed Representation. LACUS Forum XXV.

----------. 1999c. Pathways of the Brain: The Neurocognitive Basis of Language. Amsterdam: Benjamins.

Lee, Penny. 1996. The Whorf Theory Complex. Amsterdam: Benjamins.

Lockwood, David G. 1972. Introduction to Stratificational Linguistics. New York: Harcourt Brace Jovanovich.

Makkai, Adam & David Lockwood (eds.). 1973. Readings in Stratificational Linguistics. University of Alabama Press.

Pinker, Steven. 1994. The Language Instinct. New York: Morrow.

Sapir, Edward. 1921. Language. New York: Harcourt Brace.

Schreyer, Rüdiger. 1977. Stratifikationsgrammatik, Eine Einführung. Niemeyer.

Tannen, Deborah. 1990. You Just Don't Understand: Women and Men in Conversation. New York: Morrow.

Winograd, Terry. 1980. Language as a Cognitive Process. New York: Wiley.

Terms for the Glossary

The Language Cortex. Those portions of the cerebral cortex which are devoted largely to processing linguistic information.

Latent Connections. Very weak network connections which are abundantly available throughout the cortical information system and which can become strengthened as part of the learning process.

The One-Lexeme-One-Thing Fallacy. The assumption that a lexeme stands for just one thing, ruling out the possibility that it might have different senses in different contexts.

The Proximity Hypothesis. The hypothesis, derived from considerations of the learning process, that network structures with similar functions will tend to be in physical proximity to one another in the cortex, and that those which have connections to relatively distant cortical areas will tend to be intermediate between the areas connected, thus as close as possible to them.

Reification. The assumption that a nominal lexeme must represent a thing, leading to the unconscious ascription of substantial reality to abstractions.

Semantic Mirage. Any semantic relationship between lexical and conceptual units in a cognitive system which leads to assumptions about or projections onto the world of properties which are not actually there. Subtypes include the one-lexeme-one-thing fallacy, reification, and the unity fallacy.

The Unity Fallacy. The assumption that a concept represents an object that is an integral whole, even if closer examination would show it to be a relatively haphazard collection of diverse phenomena.