How do language and cognition interact in thinking? Is language used only to communicate completed thoughts, or is it fundamental to thinking? Existing approaches have not led to a computational theory. We develop a hypothesis that language and cognition are two separate but closely interacting mechanisms. Language accumulates cultural wisdom; cognition develops mental representations that model the surrounding world and adapts cultural knowledge to the concrete circumstances of life. Language is acquired “ready-made” from the surrounding language and can therefore be acquired early in life. This early acquisition in childhood encompasses the entire hierarchy, from sounds to words, to phrases, and to the highest concepts existing in the culture. Cognition is developed from experience. Yet cognition cannot be acquired from experience alone; language is a necessary intermediary, a “teacher.” A mathematical model is developed; it overcomes previous difficulties and leads to a computational theory. This model is consistent with Arbib's “language prewired brain” built on top of the mirror neuron system. It models recent neuroimaging data about cognition that have remained unnoticed by other theories. It explains a number of properties of language and cognition that previously seemed mysterious, including the influence of language grammar on cultural evolution, which may explain specifics of English and Arabic cultures.
How language interacts with cognition is unknown. How do they function in thinking? Is language just a communication device, or is it fundamental in developing thoughts? Do we think with words and phrases, or do we speak without thinking? If both abilities are important, how do we learn? Which words go with which thoughts? To use just 1000 words for 1000 objects, every kid has to learn the correct combinations among $1000^{1000}$ possible combinations, often without explicit teaching, as has been the case for most kids around the world for millennia. Learning abstract ideas is even more difficult. Words and sentences are not used in small sets combined with objects and events exactly fitting the intended meanings. Most objects present in every situation are irrelevant for this situation (say, a pattern on the floor is irrelevant for understanding whether a room is a lecture hall or a dining room). How do we learn to ignore the irrelevant majority of objects and events and to account for the relevant context? Which neural mechanisms of the brain enable learning language and cognition? After a brief review of existing theories and past difficulties, fundamental mechanisms of cognition are described, together with mathematical models that make it possible to overcome past difficulties.
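The size of the number quoted above, 1000 raised to the power 1000, is easier to appreciate on a logarithmic scale. The short sketch below reads the count as each of 1000 words being free to pair with any of 1000 objects, which is one interpretation of the figure in the text. The function name `log10_assignments` and the comparison constant (a commonly quoted rough estimate of about 10^80 atoms in the observable universe) are assumptions introduced for illustration only.

```python
import math

# Rough, commonly quoted order of magnitude for the number of atoms in the
# observable universe; an assumption used only as a point of comparison.
LOG10_ATOMS_IN_UNIVERSE = 80


def log10_assignments(n_words: int, n_objects: int) -> float:
    """Return log10 of n_objects ** n_words (each word may pair with any object)."""
    return n_words * math.log10(n_objects)


exponent = log10_assignments(1000, 1000)
print(f"about 10^{exponent:.0f} possible word-object assignments")
print(f"larger than the atom estimate by a factor of about "
      f"10^{exponent - LOG10_ATOMS_IN_UNIVERSE:.0f}")
```

Working with the exponent rather than the number itself keeps the arithmetic in ordinary floating point and makes the gap to any physically realizable amount of experience explicit.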
For a long time, logic dominated the thinking of mathematicians and the intuitions of psychologists and linguists. Logical mechanisms are similar for language and cognition; both are based on logical statements and rules. Deficiencies of logic established by the fundamental Gödelian results [
Contemporary linguistic interests in the mind mechanisms of language were initiated by Chomsky [
Initially the available mathematics of logical rules, similar to rule systems of artificial intelligence, was used by Chomsky’s followers. Eventually a new mathematical paradigm in Chomsky’s linguistics was proposed in [
Many psychological linguists, however, disagreed with the separation of language and cognition. In the 1970s, cognitive linguistics emerged to unify language and cognition and to explain the creation of meanings. Chomsky’s idea of a special module in the mind devoted to language was rejected. In this view, language and cognition use similar mechanisms, and language is embodied and situated in the environment. Related research on construction grammar argues that language is not compositional, and not all phrases are constructed from words using the same syntax rules and maintaining the same meanings; metaphors are good examples [
Evolutionary linguistics emphasizes the importance of evolving language and meanings. Language mechanisms are shaped by transfer from generation to generation [
Many aspects of the interaction between language and cognition cannot be modeled by existing mathematical techniques. Existing theories of language and cognition do not explain many salient aspects of human neural mechanisms, which remain mysterious. These mechanisms are addressed here. The proposed model resolves some long-standing language-cognition issues. How does the mind learn correct associations between words and objects among an astronomical number of possible associations? Why can kids talk about almost everything but cannot act like adults? What exactly are the brain-mind differences? Why do animals not talk and think like people? How do language and cognition participate in thinking? Recent brain imaging experiments indicate support for the proposed model.
Important properties of perception and cognition are revealed by a simple experiment, properties ignored by most theories [
Explaining this experiment requires understanding mechanisms of instincts, emotions, and mental representations. Perception and understanding of the world are due to the mechanism of mental representations, or concepts. Concept representations are like mental models of objects and situations; this analogy is quite literal. For example, during visual perception, a mental model of the object stored in memory projects an image (top-down signals) onto the visual cortex, where it is matched to an image projected from the retina (bottom-up signals; for more details see [
Mental representations are an evolutionarily recent mechanism, which evolved to satisfy more ancient mechanisms of instincts. Here, “instinct” is a simple inborn, nonadaptive mechanism described in [
Instinctual-emotional theory of Grossberg-Levine [
Top-down neural signals projected from a mental model to the visual cortex “prime” visual neurons, making them more receptive to matching bottom-up signals. This projection produces the imagination that we perceive with closed eyes, as in the close-open eye experiment. Conscious perception occurs, as mentioned, after top-down and bottom-up signals match. For a while, the process of matching presented difficulties for mathematical modeling, as discussed below.
Computer intelligence cannot compete with animals [
CC difficulties have been related to Gödelian limitations of logic; they are manifestations of logic inconsistency in finite systems [
Dynamic logic (DL) was proposed to overcome limitations of logic [
DL models the open-close eye experiment: Initial states of the models are vague. Recent brain imaging experiments measured many details of this process. Bar et al. [
The mind has an approximately hierarchical structure from sensory signals at the bottom to representations of the highest concepts at top [
The knowledge instinct (KI) is similar to other instincts in that the mind has a sensor-like mechanism, which measures a similarity between top-down and bottom-up signals, between concept-models and sensory signals, and maximizes this similarity. Brain areas participating in KI were discussed in [
In a single layer of the mental hierarchy, neurons are enumerated by index
A mathematical model of the knowledge instinct is maximization of a similarity between top-down and bottom-up signals,
Here,
KI maximizes similarity
DL determining the Neural Modeling Fields (NMF) dynamics is given by
When solving this equation iteratively,
The process of DL always converges [
Below in Figure
Dynamic logic operation example, finding cognitively related events in noise, in EEG signals. The searched processes are shown in Figure
Exact pattern shapes are not known and depend on unknown parameters; these parameters should be found by fitting the pattern model to the data. At the same time, it is not clear which subset of the data points should be selected for fitting. A previous state-of-the-art algorithm for this type of problem, multiple hypotheses testing, tries various subsets [
The models and conditional similarities for this case are described in detail in [
Here, we consider the next higher level in the hierarchy of cognition. At each level of the hierarchy, bottom-up signals interact with top-down signals. For concreteness, we consider learning situations composed of objects. In the real brain-mind, learning and recognition of situations proceed in parallel with perception of objects. To simplify the presentation, we assume that objects have already been recognized. Situations are collections of objects. The fundamental difficulty of learning and recognizing situations is that, when looking in any direction, a large number of objects are perceived. Some combinations of objects form “situations” important for learning and recognition, but most combinations of objects are just random sets, which the human mind learns to ignore. The total number of combinations exceeds by far the number of objects in the Universe. This is why the problem has not been solved over the decades [
This example is considered in detail in [
Learning situations; white dots show present objects, and black dots correspond to absent objects. Vertical axes show 1000 objects, and horizontal axes show 10 situations, each containing 10 relevant objects and 40 random ones; in addition, there are 5000 “clutter” situations containing only random objects. (a) shows situations sorted along the horizontal axis; hence, there are horizontal lines corresponding to relevant objects (the right half contains only random noise). (b) shows the same situations in random order, which looks like random noise.
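Synthetic data of the kind described in this caption can be generated along the following lines. This is a minimal sketch, not the authors' code. Two quantities are not stated in the caption and are assumed here: 500 observations per situation (so that real situations fill half of the horizontal axis, matching the remark that the right half is noise) and 50 random objects per clutter observation. The function name `make_situations` is hypothetical. Observations are stored as sets of object indices rather than as a dense 0/1 matrix.

```python
import random


def make_situations(n_objects=1000, n_situations=10, n_relevant=10,
                    n_random=40, obs_per_situation=500, n_clutter=5000,
                    seed=0):
    """Return (relevant, observations, labels) for the toy situation data.

    obs_per_situation and the clutter size are assumptions, see the lead-in.
    """
    rng = random.Random(seed)
    # Each situation is defined by its own list of relevant objects.
    relevant = [rng.sample(range(n_objects), n_relevant)
                for _ in range(n_situations)]
    observations, labels = [], []
    for s, core in enumerate(relevant):
        core_set = set(core)
        others = [o for o in range(n_objects) if o not in core_set]
        for _ in range(obs_per_situation):
            # Relevant objects are always present, plus random irrelevant ones.
            observations.append(core_set | set(rng.sample(others, n_random)))
            labels.append(s)
    for _ in range(n_clutter):
        # Clutter observations contain random objects only.
        observations.append(set(rng.sample(range(n_objects),
                                           n_relevant + n_random)))
        labels.append(-1)
    return relevant, observations, labels


relevant, observations, labels = make_situations()

# Sorted order corresponds to panel (a); a random permutation to panel (b).
order = list(range(len(observations)))
random.Random(1).shuffle(order)
shuffled = [observations[k] for k in order]
print(len(observations), "observations,", labels.count(-1), "of them clutter")
```

In the shuffled order no column arrangement is given to the learner, which is the setting in which the relevant objects have to be discovered.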
To solve this problem using a standard algorithm, one can try to sort the horizontal axis until white lines appear, similar to Figure
(a) shows DL initiation (random) and the first three iterations; the vertical axis shows objects, and the horizontal axis shows models (from 1 to 20). The problem is approximately solved by the third iteration. This is illustrated in (b), where the error is shown on the vertical axis. The correct situations are chosen by minimizing the error. The error does not go to 0 for numerical reasons as discussed in [
Figure
The procedure outlined in this section is general in that it is applicable to all higher layers in the mind hierarchy and to cognitive as well as language models. For example, at higher layers, abstract concepts are subsets of lower level ones. The mathematical procedure outlined above is applicable without change.
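The vague-to-crisp learning of situations described in this section can be illustrated with a small runnable sketch. The equations of NMF-DL are not reproduced in this text, so the sketch substitutes a standard technique: a Bernoulli mixture with equal model weights, fitted by EM-style iterations. It is a stand-in under stated assumptions, not the authors' algorithm. It shares two features with the description above: models start vague (every object about equally likely in every model), and observations are associated with models softly rather than by sorting or enumerating subsets. The function name `fit_situation_models`, the toy data sizes, and the clipping constant `eps` are all assumptions.

```python
import math
import random


def fit_situation_models(data, n_models, n_iter=10, seed=0, eps=1e-3):
    """Fit object-presence probabilities p[h][i] for n_models situation models.

    data is a list of 0/1 lists, one per observed collection of objects.
    Returns (p, history), where history holds the log-similarity per iteration
    (up to an additive constant from the equal model weights).
    """
    rng = random.Random(seed)
    n_obj = len(data[0])
    # Vague initial models: all probabilities close to 0.5.
    p = [[0.5 + 0.1 * (rng.random() - 0.5) for _ in range(n_obj)]
         for _ in range(n_models)]
    history = []
    for _ in range(n_iter):
        # Association step: soft assignment of each observation to each model.
        resp, total = [], 0.0
        for x in data:
            logs = [sum(math.log(q if xi else 1.0 - q)
                        for q, xi in zip(p[h], x))
                    for h in range(n_models)]
            top = max(logs)
            w = [math.exp(v - top) for v in logs]
            norm = sum(w)
            total += top + math.log(norm)
            resp.append([wi / norm for wi in w])
        history.append(total)
        # Update step: re-estimate each model from its softly assigned data.
        for h in range(n_models):
            mass = sum(f[h] for f in resp)
            if mass == 0.0:
                continue  # model received no data; keep its parameters
            for i in range(n_obj):
                est = sum(f[h] * x[i] for f, x in zip(resp, data)) / mass
                p[h][i] = min(max(est, eps), 1.0 - eps)
    return p, history


# Toy data: 30 objects, two situations with 5 relevant objects each,
# 5 random objects per observation, and half of the observations pure clutter.
rng = random.Random(1)
cores = [set(range(0, 5)), set(range(10, 15))]
data = []
for n in range(200):
    present = set(rng.sample(range(30), 5))
    if n % 2 == 0:
        present |= cores[(n // 2) % 2]
    data.append([1 if i in present else 0 for i in range(30)])

p, history = fit_situation_models(data, n_models=4)
for h, row in enumerate(p):
    print(h, [i for i, q in enumerate(row) if q > 0.9])
```

Each iteration costs time proportional to the number of observations times models times objects, so no combinations of objects are enumerated. Like any EM-type fit, the result can depend on initialization, and this sketch makes no claim about matching the convergence behavior reported for DL.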
The procedure outlined in the previous section is applicable to learning language in the entire hierarchy from words up. Phrases are composed of words, and larger chunks of text are composed of smaller chunks; these can be learned similarly to the situation models composed of objects described above. Grammar rules, syntax, and morphology are learned using markers as discussed above. Lower layer models may require continuous parametric models, like laryngeal models of phonemes [
Do we use phrases to label situations that we have already understood, or, the other way around, do we just talk with words without understanding any cognitive meanings? It is obvious that different people have different cognitive and linguistic abilities and may tend toward different poles in the cognitive-language continuum, while most people are somewhere in the middle, using cognition to help with language and vice versa. What are the neural mechanisms that enable this flexibility? How do we learn which words and objects come together? If there is no specific language module, as assumed by cognitive linguists, why do kids learn a language by 5 or 7 but do not think like adults? And why are there no animals that think like humans but without human language?
Little is known about neural mechanisms for integrating language and cognition. Here, we propose a computational model that can potentially answer the above questions and that is computationally tractable; it does not lead to combinatorial complexity. It also implies relatively simple neural mechanisms and explains why human language and human cognition are inextricably linked. It suggests that human language and cognition have evolved jointly.
Whereas Chomskyan linguists could not explain how language and cognition interact, cognitive linguists could not explain why kids learn language by 5 but cannot think like adults; neither theory can overcome combinatorial complexity.
Consider first how it is possible to learn which words correspond to which objects. Contemporary psycholinguists follow Locke's old idea of “associationism”: associations between words and objects are just remembered. But this is mathematically impossible. The number of combinations among 100 words and 100 objects is larger than all elementary particle interactions in the Universe. Combinations of 30,000 words and objects are practically infinite. No experience would be sufficient to learn associations. No mathematical theory of language offers any solution. NMF-DL solves this problem using the Dual model [
This dual-model equation suggests that the connection between language and cognitive models is inborn. In a newborn mind, both types of models are vague placeholders for future cognitive and language contents. An image, say of a chair, and the sound “chair” do not exist in a newborn mind. But the neural connections between the two types of models are inborn; therefore, the brain does not have to learn associations between words and objects, that is, which concrete word goes with which concrete object. Models acquire specific contents in the process of growing up and learning, and linguistic and cognitive contents always stay properly connected. Zillions of combinations need not be considered. Initial implementations of these ideas lead to encouraging results [
Consider language hierarchy higher up from words, Figure
Parallel hierarchies of language and cognition consist of lower-level concepts (like situations consist of objects). A set of objects (or lower-level concepts) relevant to a situation (or higher-level concept) should be learned among a practically infinite number of possible random subsets (as discussed, larger than the Universe). No amount of experience would be sufficient for learning useful subsets from random ones. The previous section overcame combinatorial complexity of
Now that the fundamental problem is solved, learning language will be solved in due course. Practically, significant effort will be required to build machines that learn language. However, the principal difficulty has been solved in the previous section. The mathematical model of learning situations, considered in the previous section, is similar to learning how phrases are composed from words. Syntax can be learned similarly to relations between objects [
The next step beyond current mathematical linguistics is modeling interaction between language and cognition. It is fundamental because cognition cannot be learned without language. Consider a widely held belief that cognition
NMF-DL with Dual model and dual hierarchy suggests that information is coming from language. This is the reason why no animal without human-type language can achieve human-level cognition. This is the reason why humans learn language early in life, but learning cognition (making cognitive representations models as crisp and conscious as language ones) takes a lifetime. Information for learning language is coming from the surrounding language at all levels of the hierarchy. Language model representations exist in the surrounding language “ready-made.” Learning language is thus grounded in the surrounding language.
For this reason, language models become less vague and more specific by 5 years of age, much faster than the corresponding cognitive models. This is especially true of the contents of abstract models, which cannot be directly perceived by the senses, such as “law,” “abstractness,” and “rationality.” While language models are acquired ready-made from the surrounding language, cognitive models remain vague and gradually acquire more concrete contents throughout life, guided by experience and language. According to the Dual model, this is an important aspect of the mechanism of what is colloquially called “acquiring experience.”
Human learning of cognitive models continues throughout the lifetime and is guided by language models. If we imagine a familiar object with closed eyes, this imagination is not as clear and conscious as perception with opened eyes. With opened eyes, it is virtually impossible to remember imaginations. Language plays the role of eyes for abstract thoughts. On the one hand, abstract thoughts are only possible due to language; on the other, language “blinds” our mind to the vagueness of abstract thoughts. When talking about an abstract topic, one might think that the thought is clear and conscious in the mind. But the above discussion suggests that we are conscious about the
Animal vocalizations are inseparable from instinctual needs and emotional functioning. The Dual model has enabled separation of semantic and emotional contents, which made possible deliberate thinking. Yet operations of the Dual model, connecting sounds and meanings, require motivation. Motivation in language is carried by sounds [
Evolution of the language ability required rewiring of the human brain. Animal brains cannot develop the ability for deliberate discussions because conceptual representations, emotional evaluations, and behavior including vocalization are unified, undifferentiated states of the mind. Language required freeing vocalization from emotions, at least partially [
Another mystery of human cognition, which is not addressed by current mathematical linguistics, is basic human irrationality. This has been widely discussed and experimentally demonstrated following discoveries of Tversky and Kahneman [
The Dual model also suggests that the inborn neural connection between cognitive brain modules and language brain modules is sufficient to set humans on an evolutionary path separating us from the animal kingdom. Neural connections between these parts of the cortex existed millions of years ago due to the mirror neuron system, which Arbib called the “language prewired brain” [
The combination of NMF-DL and the dual hierarchy introduces new mechanisms of language and its interaction with cognition. These mechanisms suggest solutions to a number of psycholinguistic mysteries that have not been addressed by existing theories. These include the fundamental interaction between cognition and language; similarities and differences between these two mechanisms; word-object associations; why children learn language early in life, but cognition is acquired much later; and why animals without human language cannot think like humans. These mechanisms also connect the language-cognition dichotomy to the “irrationality” of the mind discovered by Tversky and Kahneman and to the story of the Fall and Original Sin.
The mathematical mechanisms of NMF-DL-Dual model are relatively simple ((
An experimental indication in support of the Dual model has appeared in [
This provides evidence for neural connections between perception and language, a foundation of the Dual model. It supports another aspect of the Dual model: the crisp and conscious language part of the model hides the vaguer cognitive part of the model from our consciousness. This is similar to what we observed in the close-open eye experiment: with opened eyes, we are not conscious of vague imaginations.
Further experimental evidence for the Dual model comes from the Mirror Neuron System (MNS) [
Every complex neural mechanism requires motivation to function; correspondingly, functioning of the Dual model requires motivations, or emotions, connecting the language and cognitive sides of the Dual model, as illustrated in Figure
Developing meanings by connecting language and cognition requires motivation, in other words, emotions. If language emotionality is too weak, language is disconnected from the world, meanings are lost, and cultures disintegrate. If language emotionality is too strong, connections cannot evolve and cultures stagnate. Is it possible to keep the balance?
Emotionality of languages resides in their sounds, just as the sound of music moves us emotionally. Animal voicing is fused with emotions; animals lack voluntary control over voice muscles and therefore cannot develop language. Evolution of language required rewiring the brain so that the automatic connection between voice and emotions was severed. Language and voice started separating from ancient emotional centers possibly millions of years ago. Nevertheless, emotions are present in language. Most of these emotions originate in the cortex and are controllable aesthetic emotions. Emotional centers in the cortex are neurally connected to old emotional limbic centers, so both influences, new and old, are present. Emotionality of languages is carried in language sounds, which linguists call prosody or melody of speech. This ability of the human voice to affect us emotionally is most pronounced in songs [
Emotionality of everyday speech is low, unless affectivity is specifically intended. We may not notice the emotionality of everyday “nonaffective” speech. Nevertheless, “the right level” of emotionality is crucial for developing cognitive parts of models. If language parts of models were highly emotional, any discourse would immediately turn into a fight, and there would be no room for language development (as among primates). If language parts of models were entirely nonemotional, there would be no motivational force to engage in conversations and to develop the Dual model. The Dual model is fundamental for developing representations of situations and higher cognition [
Primordial fused language-cognition-emotional models, as discussed, were differentiated long ago. The involuntary connections between voice, emotion, and cognition dissolved with the emergence of language. They have been replaced with habitual connections. Sounds of all languages have changed over history, and sound-emotion-meaning connections in languages could have been severed. However, if the sounds of a language change slowly, the connections between sounds and meanings persist, and consequently the emotion-meaning connections persist. This persistence is a foundation of meanings because meanings imply motivations. If the sounds of a language change too fast, the cognitive models are severed from motivations, and meanings disappear. If the sounds change too slowly, the meanings are nailed emotionally to the old ways, and culture stagnates.
These arguments suggest that an important step toward understanding cultural evolution is to identify mechanisms determining changes of language sounds. These changes are controlled by grammar. In inflectional languages, affixes, endings, fusion, and other inflectional devices are fused with the sounds of word roots. Pronunciation of affixes and other inflections is controlled by a few rules, which persist over thousands of words. These few rules are manifest in every phrase. Therefore, every child learns to pronounce them correctly. Positions of the vocal tract and mouth muscles for pronunciation of inflections are fixed throughout the population and are conserved across generations. Correspondingly, pronunciation of whole words cannot vary too much, and language sound changes slowly. Inflections, therefore, play the role of a “tail that wags the dog,” as they anchor language sounds and preserve meanings. This, I think, is what Humboldt [
This has happened with the English language after the transition from Middle English to Modern English [
Semitic languages, and in particular the Arabic language, are highly inflected. An inflection mechanism called fusion affects the sound of the entire word, and the meaning of the word changes with changing sounds; suffixes also control verbs and moods. Therefore, sounds are closely fused with meanings. This strong connection between sounds and meanings contributes to the beauty and affectivity of Classical Arabic texts, including the Quran. On the other hand, creation of new meanings in Classical Arabic is difficult because of these strong connections, which have remained unchanged for centuries, and also because of religious restrictions. The Arabic language leads to a culture where meanings and values are strong, but conceptual cultural development is slow. There are significant differences between Classical Arabic and street Arabic languages; however, this topic requires separate study.
Neural mechanisms of grammar, language sound, related emotions-motivations, and meanings hold a key to connecting neural mechanisms in individual brains to the evolution of cultures. Studying them experimentally is a task for future research. The difficulty is not methodological, because experimental methodologies are at hand; they just need to be applied to these issues. The following sections develop mathematical models based on existing evidence that can guide this future research.
The Dual model implies a relatively minimal neural change from the animal to the human mind. It could emerge through combined cultural and genetic evolution, and this cultural evolution might continue today. DL resolves a long-standing mystery of how human language, thinking, and culture could have evolved in a seemingly single big step, too large for an evolutionary mutation, too fast, and involving too many advances in language, thinking, and culture, happening almost momentarily around 50,000 years ago [
Mathematical models of some of the mechanisms of evolving languages and cultures have been discussed in [
The author is thankful to M. Alexander, M. Bar, R. Brockett, M. Cabanac, R. Deming, F. Lin, J. Gleason, R. Kozma, D. Levine, A. Ovsich, and B. Weijers, to AFOSR PMs Drs. J. Sjogren and D. Cochran for supporting part of this research, and to the paper reviewers for valuable suggestions.