The widespread use of ChatGPT has normalized the dialogue Turing test. To meet this challenge, China's major national development strategy suggests that, for a new generation of artificial intelligence, it is first necessary to answer the big questions raised by Turing in 1950 from the perspective of cognitive physics: Can machines think? How do machines think? How do machines cognize? Whether it is carbon-based human cognition or silicon-based machine cognition, cognition is an interaction between complex constructs composed of the four most basic elements: matter, energy, structure, and time. Both humans and machines depend on negative entropy to live, and time is the cornerstone of cognition. Structure and time are parasitic on matter and energy in physical space, forming hard-structured ware. The soft-structured ware in cognitive space is mind, which is parasitic on the hard-structured ware or on other existing soft-structured ware, and constitutes a rich hierarchy of multi-scale feelings, concepts, information, and knowledge. Extending “abstraction” from the symbolic school of artificial intelligence, “association” from the connectionist school, and “interaction” from the behaviorist school, the core of cognition is established on the shoulders of scientific giants such as Schrödinger, Turing, and Wiener. Soft-structured and hard-structured ware interact. A cognitive machine can comprise heterogeneous hard-structured ware, such as field programmable gate arrays (FPGAs), data processing units (DPUs), central processing units (CPUs), graphics processing units (GPUs), tensor processing units (TPUs), and memory. It can also be implanted with the “Baby Cognitive Nucleus,” hard-structured ware that is genetically inherited and naturally evolved, to form the embodied machine. The hard-structured ware is then parasitized by rich, multi-scale soft-structured ware.
By regulating matter and energy through soft-structured ware, machines produce orderly events and form coordinated, orderly thinking activities. The heterogeneous sensors configured on the machine and its speed of thinking will no longer be limited by the extreme values of the biochemical parameters of carbon-based organisms; the machine will be able to perceive through multi-channel, cross-modal means, carry out intense thinking, and maintain cognitive continuity through memory. Computational and memory intelligence that can bootstrap, self-reuse, and self-replicate is generated in cognitive space, and imagination and creativity are improved through memory-constrained computing. The new generation of artificial intelligence will leap beyond mechanized mathematics to automated thinking and the self-driven growth of cognition; thinking in cognitive space and behavior in physical space verify each other, moving from the dialogue Turing test to the embodied Turing test. Humans have entered the intelligent era of human-machine co-creation, in which cognitive machines iteratively invent, discover, and create alongside scientists, engineers, and skilled craftsmen, each wise in its own way, improving thinking ability and amplifying human energy.

Humans have two spaces. One is the objective, external physical space, or world. In this space, human beings perceive the universe and recognize celestial bodies, such as the sun, stars, moon, and earth; natural substances, such as oceans, rivers, lakes, and mountains; and microorganisms, plants, and animals, that is, all life outside of “me.” We also recognize tools, tables and chairs, electric lights, cars, buildings, books, maps, schools, sculptures, machines, artificial satellites, and other artifacts. All of these entities are real and physical. The other is the subjective, internal cognitive or spiritual space associated with consciousness, desire, emotion, belief, intelligence, feelings, experiences, and perceptions [1]. Different combinations of consciousness, desire, emotion, belief, and intelligence in the inner world evolve into different value systems.

In the value system in which consciousness, desire, emotion, and belief are considered the main values, love, equality, justice, human rights, respect, growth, and creation become our value standards [1]. There is also a tendency to materialize or reify human cognitive processes. Words, tools, art, machines, and even intelligent machines are used to realize thoughts, thereby stripping intelligence from life and extending it outside the body without the entanglement of consciousness and emotions. Artificial intelligence existing in the physical world should become part of the ecology of human civilization and boost the development of human intelligence.

When the cognition of a carbon-based life ends, spirit turns into nothing and organic matter turns into inorganic substance. However, the universe is still vast and the stars still move. The universe is approximately 14 billion years old, the Earth is approximately 4.5 billion years old, and the evolution that has led to human cognition is at best 5 million years old. If the age of the Earth were shortened to one year, humans would appear only in the last half hour [2]. Descartes' statement [3], “I think, therefore I am,” refers to spirit in cognitive space. However, regardless of thoughts, the Earth exists. Compared to the objective physical world, the human subjective spiritual world should not be overstated. The forces of the universe are inherently separate from individual desires and happiness, and it is impossible to know whether the physical world is what it is perceived to be. This is the basic context for understanding the human invention of machine cognition.

At present, ChatGPT [4] is widely used and accepted in all walks of life around the world and is revitalizing the dialogue Turing test. ChatGPT poses a challenge to the new generation of artificial intelligence, a significant national development strategy of China and a concept that China was the first in the world to propose. A profound understanding of the basis, connotation, extension, technical characteristics, and development of the new generation of artificial intelligence provides important assurance for the realization of this national strategy. Whether the subject is human cognition or machine cognition, the frontier of global artificial intelligence or artificial intelligence with Chinese characteristics, it is necessary to use the method of cognitive physics [5] to address the source: the formalization of cognition.

2.1 Definition of Cognition

Human cognition helps explain and solve the realistic problems encountered in the process of human survival and reproduction. Each cognitive activity can be divided into a cycle of perception, cognition, and action, with action feeding back into perception. Perception is the source of cognition; thinking is the activity that takes place in cognitive space. Thinking is self-contained and full of imagination, and at times can be more profound than the physical world. Action is the externalized expression and purpose of cognition. Both perception and action take place in physical space and form embodied intelligence through interaction. Cognition keeps spiraling back and forth between the objective physical space and the subjective cognitive space to answer questions such as “where,” “what,” “why,” and “how.” The great physicist Albert Einstein said, “The most incomprehensible thing about the universe is that it is understandable” [6]. However, the individual's position in the universe is insignificant, and a single life span of less than 100 years makes the comprehension of infinity very difficult or even impossible. Through generations of culture and civilization, human beings have attempted to explain such a vast universe with scientific theories and technological inventions, never ceasing to explore and increasingly transforming the unknown into the known and, to a certain degree, the interpretable. This bounded rationality is in fact a group consensus in the process of human cognition. What was incomprehensible a thousand years ago may now be understandable, although some of it may still be incomprehensible because of knowledge gaps. Thus, there is no end to the spiral of human cognition. The larger the radius of cognition, the larger the interface between human cognition and the unknown. Although human cognition has twists and turns, it gradually approaches the truth, step by step.
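The cycle of perception, thinking, and action described above can be illustrated with a toy sketch. This is only an illustration under stated assumptions, not a model proposed in this paper: the `World` and `Mind` classes, the one-dimensional goal-seeking task, and the "close half the gap" rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Physical space: where perception and action take place (hypothetical toy)."""
    agent_pos: float
    goal_pos: float

    def perceive(self):
        # Perception: observe the signed offset between the agent and the goal.
        return self.goal_pos - self.agent_pos

    def act(self, step):
        # Action: change the physical state; the next percept reflects this change.
        self.agent_pos += step


@dataclass
class Mind:
    """Cognitive space: thinking, plus a memory of past percepts (hypothetical toy)."""
    memory: list = field(default_factory=list)

    def think(self, percept):
        self.memory.append(percept)  # each cycle modifies memory
        return 0.5 * percept         # assumed rule: close half of the remaining gap


def run_cycles(world, mind, n_cycles):
    """Run the perception -> thinking -> action loop n_cycles times."""
    for _ in range(n_cycles):
        percept = world.perceive()
        step = mind.think(percept)
        world.act(step)
    return world.agent_pos


if __name__ == "__main__":
    world, mind = World(agent_pos=0.0, goal_pos=8.0), Mind()
    print(run_cycles(world, mind, 3))  # 7.0
    print(mind.memory)                 # [8.0, 4.0, 2.0]
```

Each pass through the loop moves the agent closer to the goal without ever reaching it exactly, a loose analogy to the spiral that approaches the truth step by step.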

Cognitive abilities comprise the ability to learn (i.e., the ability to explain and solve preset problems) and the ability to explain and solve real-world problems [7]. Preset problems, which are often adapted from real-world problems, have been formalized and proven to have solutions, like those found in school textbooks. Learning with outside help, which turns the previously unknown into the known, is the basis for explaining and solving new problems. Explaining problems makes them easier to solve, and solving problems makes them easier to explain. The embodied intelligence of physical space includes perceptual and behavioral intelligence. Perceptual intelligence can be further divided into spatiotemporal recognition intelligence (i.e., the ability to recognize location, direction, and time) and pattern recognition intelligence. Because of the needs of survival and reproduction, perceptual intelligence can even develop into perceptual intuition, such as face recognition and speech recognition. Computational intelligence and memory intelligence coexist in cognitive space. However, memory comes before computation and constrains it: memory reflects the boundary of the applicable scope of computation. Learning, explaining real problems, and solving them all require thinking activities in cognitive space, as well as repeated interaction between the physical and cognitive spaces for verification. This requirement is reflected externally through embodied behavior, including language, and through the accumulation of memory, which together realize the self-driven growth of cognition. The result of learning is the modification and remodeling of memory, as well as the storage, regulation, and extraction of memory. The purpose of learning is to explain and solve newly encountered real problems.
As the saying goes, “Learning without thinking is labor lost; thinking without learning is dangerous.” In China, “supervised learning” is translated as “监督学习” and “unsupervised learning” as “无监督学习,” but these terms are not very accurate, because “guided learning” includes direction, explanation, and error correction in addition to supervision, and has other rich connotations. Nature has endowed human beings with more desire than wisdom. “Seeking truth,” “seeking knowledge,” and “seeking beauty” are natural human desires, developed through natural evolution according to the principle of survival of the fittest and the needs of human reproduction. Cognitive space is not only the repository of human memory and knowledge but also the sky of imagination. Matter can hardly constrain the scope of imagination; humans can imagine what does not exist.

2.2 Four Basic Cognitive Modes: Deduction, Induction, Creation and Discovery

To discuss the formalization of cognition, it is necessary to analyze the openness of cognition, especially its interactivity; the uncertainty of cognition, particularly the fundamental determinism within uncertainty; the hierarchical nature of cognition, particularly its recursiveness; the initiative of cognition, especially the attention mechanism; the complexity of cognition, particularly emergence; and the holistic nature of cognition, especially the synergy between perception, thinking, and behavior. It is also necessary to analyze how individuals or machines gradually approximate reality through the iterative formation of cognition in the objective, real, external physical space and in the subjective, abstract, internal cognitive space. Human cognition, which is formed through the iterative interaction of collectively inherited and individual intelligence across generations, involves four fundamental modes: deduction, induction, creation, and discovery. These correspond to knowledge-driven patterns, such as mathematical theorem proving; experience-driven patterns, such as deep learning; association-driven creative patterns, such as celestial navigation, satellite positioning, and the Starlink project; and hypothesis-driven discovery patterns, such as Mendeleev's prediction of new chemical elements. The intelligence formed by multiple cognitive modes covers visual thinking, logical thinking, and epiphany. Abstraction, association, and interaction support, drive, lead, and complete each other, and advance in a spiral. Even in the cognitive development of a single person facing the same question, imagination and creativity differ: different cognitive modes may be adopted depending on the period and situation, which introduces uncertainty.
This paper discusses the formalization of cognition and how the artificial intelligence ecology, formed by the extension and accumulation of human intelligence in silico, can enhance human cognition. The multiple types of cognition that arise in the process of thinking, learning, and human or machine growth coexist, are inclusive, and contribute to each other's intelligence; the transformation between cognitive modes sets off an endless cycle of cognition that tends toward unification.

2.3 Recursion and Iteration are Used to Overcome the Entropy Increase and Maintain the Order of the Machine

Both human and machine cognition follow the laws and principles of physics, the most fundamental of which is the principle of entropy increase [8]. Referring to the degree of total disorder of an isolated system, “entropy” always increases in nature. When entropy reaches its maximum value, the system will suffer from severe chaos and come to an end. The universe, life, and machines are all good at looping through simple repetitive basic operations to maintain consistency and order.

The survival and reproduction of human beings is carried out iteratively from generation to generation and is expressed as the inheritance of genes. There are many recursive and fractal phenomena in the organization of an individual life. Human evolution, including human intelligence, can be said to follow a circular process. Regardless of the agent being humans or machines, cyclical phenomena are prevalent in cognitive activities.

In discussing the formalization of cognition, we note that an important cyclic activity in life and cognition is iteration, which uses the result of one cycle as the initial value of the next, continuously updating and accumulating development. Another important form of cyclic activity is recursion; however, recursion is not the same as iteration. Iteration moves forward, metaphysically. For example, the knowledge in the human brain, from elementary school to university to adulthood, is reused and grows iteratively. Science and technology in human society, especially the mass production of intelligent machines from generation to generation, also develop iteratively. Recursion looks backwards, physically. For example, the embodied behavioral intelligence in cognitive machines is ultimately executed recursively by machine instructions in hard-structured ware [9]. ChatGPT's autoregressive generation system takes advantage of both recursion and iteration [32]. Recursion and iteration are especially important for self-guidance and self-driven growth in life and cognition.
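The distinction between iteration and recursion drawn above can be made concrete with a minimal sketch. It is a generic programming illustration, not the implementation of any system named in the text; the `grow` rule and the `fib_rule` next-token rule are hypothetical stand-ins chosen only so that the results are easy to check by hand.

```python
def iterate(step, state, n_cycles):
    """Iteration: the result of one cycle is the initial value of the next."""
    for _ in range(n_cycles):
        state = step(state)
    return state


def recurse(step, state, n_cycles):
    """Recursion: the result is defined in terms of a smaller instance of itself."""
    if n_cycles == 0:  # base case ends the descent
        return state
    return step(recurse(step, state, n_cycles - 1))


def autoregress(next_token, prompt, n_new):
    """Toy autoregressive loop: every new output is fed back as part of the input."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(next_token(tokens))
    return tokens


def grow(knowledge):
    # Hypothetical growth rule: each cycle adds half of what is already known.
    return knowledge + 0.5 * knowledge


def fib_rule(tokens):
    # Hypothetical "model": the next token is the sum of the last two.
    return tokens[-1] + tokens[-2]


if __name__ == "__main__":
    print(iterate(grow, 1.0, 3))             # 3.375
    print(recurse(grow, 1.0, 3))             # 3.375
    print(autoregress(fib_rule, [1, 1], 5))  # [1, 1, 2, 3, 5, 8, 13]
```

Both functions reach the same value here. The iterative version carries the state forward from cycle to cycle, whereas the recursive version descends to the base case first and builds the result on the way back.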

Carbon-based life consists of cells. In “What is Life?” [10], Nobel Prize winner Erwin Schrödinger wrote that life is a code book that can determine the complete pattern of the future development of an individual. To live is to fight against the law of entropy. Humans and all living things are subject to the most basic laws of physics: we all age, and we all live on negentropy. Charles Darwin's theory of evolution [11], especially the diversity of species, can be construed in terms of intelligent machines. Similarly, to understand Francis Crick's genetics [12], especially genetic engineering, and to understand Eric Kandel's cytology [13], especially cognitive neurobiology, it is necessary to understand how machines form order by relying on energy, how they generate negentropy through interaction with the outside world, and how they think.

The universe is vast and the stars move. The universe is made up of matter, and matter and energy are interchangeable. Some people say that the Big Bang was a new emergence, while others argue that the Big Bang gave birth to the Earth [14]. Before life appeared on the Earth, the Earth already existed in the form of a variety of natural substances. Humans have been on the Earth for millions of years. The sun rises and sets; winter goes and spring comes. The evolution of human cognition produced more and more human creations and, starting from the matter of the universe, gradually developed into today's “four most basic elements in machine cognition.”

3.1 Two Elements Theory of Tools

First, we present the “two elements theory of tools,” that is, the theory of matter and structure, which has existed since the Stone Age and the Agrarian Age. The material of a tool invented by humans is matter; various structures are directly parasitic on the matter, and the relationships between the parts constitute the tool. The structure determines the function and forms the hard-structured ware (Figure 1). It took hundreds of thousands of years to “make the first stone into a knife” [15]. The structure and matter of physical space are inseparable. In 3200 BC, the Sumerians invented the early wheel [16]. If a natural tree trunk is cut along two parallel “planar structures” and shaped into a “round structure,” it becomes a “wheel.” Without this seemingly simple invention, and the other structural inventions that “roll things around their axes” and constitute hard-structured ware, it is hard to imagine that any mechanized tool would be working today. This accumulation of knowledge covers the entire spectrum from gears to bicycles, cars, jet engines, and precision instruments. The invention of the wheel has a history of 5,500 years, and the importance of the wheel in human history is often compared with that of fire. The first known geared calculation tool invented by humans was the ancient Greek Antikythera mechanism, which was built more than 2,000 years ago. If we trace back the mechanization of Chinese mathematics, the earliest abacus, with structure parasitic on matter, was invented by Yue Xu, a mathematician of the Eastern Han Dynasty of China, who wrote “Number.” It is clear that hard-structured ware is by no means simply equivalent to matter. The variety of complex structures parasitic on matter embodies the unique imagination of human beings. The scale of a structure can be smaller or larger than that of the human body, spanning 18 different levels.
Tools invented during the Stone Age and Agrarian Age, such as swords, spears, and wheels, were designed without considering energy. Those tools were not machines, and were not life; however, they were able to greatly expand the physical strength and behavior of people.

Figure 1. Later versions of tools originally invented in the Stone Age: examples of hard-structured ware whose structure is parasitic on matter.

3.2 Three Elements Theory of Machine

In the Industrial Age, an important element, “energy,” was added, such that matter, energy, and structure constituted the “three elements of the machine,” allowing machines to replace and extend human physical power. The structure is directly parasitic on matter and energy, and the interrelations of the parts make up the machine (Figure 2), forming hard-structured ware that replaces the body and can operate but cannot think. For example, the pendulum, steam engines, and electric cars expand human physical strength and behavioral ability. Human walking speed and muscle strength have not changed much for thousands of years, yet the invention of powered machines has greatly extended and expanded human physical strength and behavioral ability. The vehicles, ships, planes, and rockets invented by humans have expanded the range of human activity to land, sea, sky, and space. A rocket is four or five orders of magnitude faster than human walking. Nowadays, whether the energy is mechanical, radiant, thermal, chemical, electrical, or nuclear in nature, all kinds of powered machines are widely used and ubiquitous. Energy has even become a symbol of whether a country is developed. In physical space, the variety of complex hard-structured ware formed by matter, structure, and energy, whether tools or machines, has brought fundamental changes to the forms of production, organizational structure, economy, and lifestyle of human society.

Figure 2. Industrial Age machines: examples of hard-structured ware.

Some say that the information revolution is the fourth industrial revolution; however, this is not exactly true. In human history, there have been and will continue to be agricultural, industrial, and cognitive revolutions, all developing in parallel as well as one after another. In the Stone Age, people invented all kinds of tools to extend their physical strength. During the Industrial Revolution, humans invented all kinds of machinery, including powered machines, thereby extending and expanding human physical strength and behavior. However, powered machines cannot think. The clock, for example, is a machine that depends on matter, energy, and structure to run, but time does not contribute to the operation of the clock. Other examples of Industrial Age hard-structured ware are the steam locomotive and the combustion engine.

3.3 An Analysis of Basic Elements in Machine Cognition

Cognitive machines in the Intelligent Age are different from the powered machines of the Industrial Age. In this paper, we propose the four-elements theory of machine cognition, which adds a new element, time, to matter, energy, and structure.

In physics, especially astrophysics, space and time are often the two major dimensions. The structure of any object in physics must be parasitic on matter and energy; structure cannot exist in isolation in the physical world. The movement and change of matter and energy in space are described by time, and their structure changes with time. Matter and energy are interchangeable. Matter and energy exist in movement and change, and life exists in growth or aging. We believe that time was created by humans as a subjective concept. In the universe, there is no absolute time; time does not objectively exist. Time is not an entity, and it should not be spatialized or materialized. Humans invented the concept of time and divided it into moments and time intervals to describe the movement and change of matter and energy in the universe. People once gave time the highest philosophical status [17], but time refers only to the movement and change of matter and energy as represented in cognitive space. Structure is independent of time and does not appear too precise. Structure and time are the cornerstones of human cognition and machine cognition. Intelligence originates from the human brain, especially from the complexity caused by the interaction of a large number of diverse nerve cells. The neocortex of the brain is the thinking organ that forms structure and time. Without memory, we would always live in the present, and there would be no concept of structure or time. It is memory that provides continuity between past and present cognition. Memory is the biological basis for the human invention of structure and time. Water, air, and nutrition are necessary to maintain life; memory, which is human cognition about structure and time, is necessary to maintain thinking and the soul of life throughout life. Only by maintaining normal memory can human beings have intelligence, accumulate intelligence, and realize value.
Therefore, life has a sense of history, and cognition has a sense of growth. Structure and time permeate the whole cognitive space of human beings. Mathematics is the most abstract soft structure created by human beings and the most abstract professional language built on natural language. According to Einstein's definition, time is just a reading on the face of a clock. If pendulum clocks produced in the same batch are placed at different locations in the universe, they will display different times, which shows that time cannot be completely separated from or independent of space; time is only used to express a property of space. With the concept of time, matter and energy can be represented in cognitive space as topological structures and relationships that behave differently at different times (as if time were frozen).

Matter and energy are real entities that exist at the physical level, whereas structure and time are abstract thoughts that exist at the cognitive level; they are the state parameters with which humans cognize the existence and change of matter and energy. In the spiritual world of humans, structure is used to express the topology and deformation of matter in space, and time is used to express the movement and change of matter, reflecting the transfer and conversion of energy. Structure and time, parasitic on matter and energy, form numerous kinds of hard-structured ware, which constitute the embodiment of the machine. The numbers, symbols, and information in the machine's thinking represent the soft-structured ware, which, just like the thoughts expressed in human cognitive space, depends on hard-structured ware or other existing soft-structured ware. Soft-structured ware can bootstrap, be reused, be applied recursively and iteratively, and self-replicate or self-modify in order to imagine. There is a “next” time period for the machine to re-“think.” Time introduces order into the activities of the machine; thinking can race. The soft-structured ware in the human cognitive space is the element of thinking and supports visual, logical, and intuitive thinking. It reflects people's rich imagination and creativity and their spiritual world, providing a sense of scale, time, and hierarchy. To name the underlying soft-structured ware, symbols, letters, strokes, and numbers are used to indicate before and after, left and right, up and down, order, fast and slow, and so on. Soft-structured ware, also known as the heart language, can be collectively referred to as the association state between symbols. Above the level of feeling, the upper soft-structured ware, which consists of concepts, information, and knowledge, is formed by merging and grouping and reflects the different scales, mirror images, and superstructure of the physical world in cognitive space, as well as imagined reality.
At present, the historically significant success of deep learning, which includes ChatGPT, lies in training machines with a vast amount of hard-structured ware from the physical world, replacing memory with annotations, and using numbers that are large enough and small enough to approximate infinity, so as to form soft-structured ware at various scales. Deep learning belongs to the memory-driven empirical cognitive mode at the macro level. It can recognize the hard-structured ware “A” written by different people on paper. This common structure is merged, grouped, and abstracted into the letter “A” of the soft-structured ware and forms memories. Deep learning can judge and recognize the existence of thousands of distinct physical chairs (hard-structured ware) in the physical world and form abstract concepts of soft-structured ware in cognitive space, such as “mountain,” “water,” “tree,” “grass,” “chair,” “house,” “person,” and “pet.” At the cognitive level, all human thinking activities are abstract. Soft-structured ware is the result of abstraction, the “virtual unit” of abstract thinking, and the mirror image of hard-structured ware. DNA, a chemical substance carrying a genetic code in the cells of animals and plants, is matter with a parasitic structure and is therefore hard-structured ware, whereas the genetic code itself is soft-structured ware. Taking the unmanned vehicle as an example, the hard-structured ware of physical space includes the compartment, chassis, motor, sensors, and chips, whereas the soft-structured ware of cognitive space includes the operating system, driving program [18], driving map, and traffic rules. Soft-structured ware and hard-structured ware interact and complement each other to form embodied intelligence. There is no difference between the embodied behavior of an unmanned vehicle and that of a human driver.

Note that the elements of cognition are not the elements of the universe. This paper proceeds from the universe, composed of the single most basic element (matter), to the tool, composed of the two most basic elements (matter and structure), to the powered machine, composed of the three most basic elements (matter, structure, and energy), and then to the cognitive machine, composed of the four most basic elements (matter, structure, energy, and time). Although matter and energy are interchangeable in physics, in cognitive processes, if there is only matter and no energy, thought cannot be carried out, and perception and behavior are impossible. If the machine's power supply stops, as in a power failure, the machine shuts down; when power is restored, the machine can bootstrap, activate its operating system, and enter the cognitive working state. However, the hard-structured ware of a cognitive machine cannot grow, repair, or replicate itself, which is very different from soft-structured ware. If hard-structured ware ages or fails, it can restart after being repaired; if new hard-structured ware or soft-structured ware is added, cognitive ability can improve after the upgrade. The cognitive machine is not a living entity composed of cells, does not share the biological basis of cell fission and growth, cannot reproduce, cannot self-replicate, and cannot self-start; however, it can self-replicate soft-structured ware supported by the four elements, replicate and extend thought, and realize the self-driven growth of cognition. The machine exhibits embodied as well as general intelligence and can enter a self-induced state of sleep and wait to be awakened.

Four million years of human evolution led to the formation of genetic advantages. With structure directly parasitic on matter, humans moved from barbarism to the invention of tools. Three million years of human evolution provided the language advantage. Six thousand years ago, humankind invented writing and education, establishing the cultural civilization that led to the first cognitive revolution. The past 500 years have been characterized by the use of matter, structure, and energy: the invention of machines provided scientific and technological advantages, liberated human physical strength, and greatly expanded the physical space of human activities, all of which led to the second cognitive revolution. In the last 100 years, more sensors and intelligent machines have been invented to liberate human intelligence and provide advantages in intelligence, whereby human beings entered the third cognitive revolution. Matter, energy, structure, and time are the core elements of human cognition. They are also the core elements of machine cognition, although more soft-structured ware is added to the cognitive machine.

4.1 Einstein's Mass-Energy Equation

The mass-energy equation $E = mc^2$, proposed by Einstein in 1905, is an order established by human cognition that links matter and energy in the universe with structure and time in cognitive space. It is supported by the soft-structured ware concepts of “displacement,” “meter,” “kilogram,” “second,” “Joule,” and “velocity,” which explain the relationship between matter and energy in the universe [19]. The speed of light is $c = \lambda_M f_M$, where $\lambda_M$ and $f_M$ denote wavelength and frequency, respectively, reflecting the characteristic properties of matter in space (wavelength) and in time (frequency). The equation expresses that energy and mass are interchangeable and that each kilogram of mass can be converted into $9 \times 10^{16}$ joules of energy. Matter in the universe was formed from energy during the Big Bang. When an object emits energy in the form of radiation, its mass decreases, reflecting the overall invariance of the universe and the unification of mass and energy. Without these soft-structured ware concepts, the relationship between matter and energy cannot be explained. The mass-energy equation expresses the four elements of matter, energy, structure, and time, together with their transformation rules, in a single formula.
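As a quick check of the figure quoted above for one kilogram of mass, the following worked line assumes the rounded value $c \approx 3 \times 10^{8}$ m/s (the rounding is an assumption made here for illustration, not a value given in the text):

```latex
E = mc^{2}
  = (1\,\mathrm{kg}) \times \left(3 \times 10^{8}\,\mathrm{m/s}\right)^{2}
  = 9 \times 10^{16}\,\mathrm{kg\,m^{2}/s^{2}}
  = 9 \times 10^{16}\,\mathrm{J}
```

The units make the dependence on the supporting concepts visible: the joule is itself defined through the kilogram, the meter, and the second.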

4.2 Back to Simon's “Physical Symbol System Hypothesis”

In 1976, Herbert Simon and Allen Newell, pioneers of symbolic artificial intelligence, proposed the “physical symbol system” hypothesis [20], which expresses the ability to abstract in thinking activities and maintains that abstraction is a necessary and sufficient condition for general intelligent behavior. Mathematics is one method of cultivating abstract thinking. Abstraction comes from imitation in early humans and is a creative activity of thinking; an abstraction is a symbol of the generality and universality of a physical entity after it has been “purified from the coarse to the refined, from the false to the true” by human cognition, but it does not really exist. A symbol system consists of a set of abstract “symbols” that represent entities and can be recombined into “expressions” (or symbol structures). Simon further proposed the cognitive system model and the theory of chunking [21], which unifies scattered components into meaningful information units. These very limited symbols, expressions, and their sets and operations can actually be called soft-structured ware; however, there are multiple levels of abstraction, and soft-structured ware exists at different abstract scales. High-level abstract soft-structured ware can be supported by lower-level soft-structured ware, and low-level abstractions can in turn be realized by still lower-level abstractions. However, Simon's physical symbol system hypothesis greatly underestimates the richness of human abstract ability and is too simple to be a sufficient condition for general intelligence. There is no need for strict logical relations in soft-structured ware. This is evidenced by the success of some of today's large language models, such as ChatGPT, which have up to hundreds of billions of connection parameters. Apart from abstraction, association and interaction are also indispensable, which the physical symbol system hypothesis does not note. 
Creativity is the result of imagination, and imagination is the result of bold abstraction and simple association, which is then deepened by calm analogy and verified in practice. There are many kinds of topological connections in soft-structured ware. Associations, analogies [31], and inferences connect one example to another to form general knowledge and intelligence. Interaction also ensures that abstraction and association cannot be separated from the physical world. Nevertheless, Simon used a new symbolic structure, equivalent to high-level soft-structured ware, to reflect the iterative development of thinking activities, letting the machine think entirely through recursion, which was a great contribution. Simon won the Turing Award for his pioneering work in artificial intelligence, cognitive psychology, and list processing in programming. He later won the Nobel Prize in Economics and the Lifetime Achievement Award of the American Psychological Association. In 1994, he was also elected to the Chinese Academy of Sciences as one of the first foreign academicians.

4.3 Silicon-Based Machines Become Accelerators of Human Thinking

In 1936, Turing published “On Computable Numbers, with an Application to the Entscheidungsproblem” [22], which provides a strict mathematical definition of computability and describes the model that was later named the Turing machine. The proposed model was a very simple but powerful computing device that laid the theoretical foundation of the idea that “computation is intelligence.” Every computable number can be computed with a Turing machine. Later, the famous “Church-Turing thesis” [23] asserted the “equivalence of λ-calculus, recursive functions, and Turing computability,” that is, the idea that every function that can be computed by a mechanical procedure can be computed by general recursive functions, with the infinite approached step by step. “Turing computability” can be regarded as a process in which soft-structured ware relies on reuse to approximate the infinite, and it is the first reference to machine brute-force computing.
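To make the idea of a simple device driven by a reusable rule table concrete, here is a minimal sketch of a single-tape Turing machine simulator in Python. The rule table `INCREMENT` (binary increment), the state names, and the function `run` are illustrative assumptions introduced here; they do not come from Turing's paper or from this article.

```python
from typing import Dict, Tuple

# (state, symbol read) -> (symbol to write, head move, next state)
Rules = Dict[Tuple[str, str], Tuple[str, int, str]]

BLANK = "_"

# Hypothetical rule table: add 1 to a binary number,
# with the head starting on the rightmost bit.
INCREMENT: Rules = {
    ("carry", "1"): ("0", -1, "carry"),  # 1 + carry = 0, carry moves left
    ("carry", "0"): ("1", 0, "halt"),    # 0 + carry = 1, finished
    ("carry", BLANK): ("1", 0, "halt"),  # past the left end: new leading 1
}


def run(rules: Rules, tape_str: str, state: str = "carry",
        max_steps: int = 10_000) -> str:
    """Apply the same small rule table repeatedly until the machine halts."""
    tape = dict(enumerate(tape_str))  # sparse tape: position -> symbol
    head = len(tape_str) - 1
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = tape.get(head, BLANK)
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += move
        steps += 1
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape.get(i, BLANK) for i in cells).strip(BLANK)


if __name__ == "__main__":
    print(run(INCREMENT, "1011"))  # binary 11 + 1
    print(run(INCREMENT, "111"))   # binary 7 + 1
```

The three rules are reused at every step regardless of how long the input is, which is one way to read the remark that soft-structured ware relies on reuse.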

Let us consider the human calculation of π as an example. Around 1900 BC, an ancient Babylonian tablet recorded that π was equal to about 25/8 (3.125). In the third century BC, Archimedes imagined two polygons of 96 sides, one inside and one outside a circle, and from their perimeters calculated that the value of π should be between 3.140845 and 3.1428571. Around 500 AD, Zu Chongzhi used an algorithm invented by Liu Hui for calculating π to any desired accuracy and applied it to an inscribed polygon of 12,288 sides, estimating that the value of π should be between 3.1415926 and 3.1415927. Over time, humans have developed better and better methods for calculating the value of π with the help of simple tools. It took about 1,700 years to improve the accuracy by one decimal digit, and about another 800 years to improve it by four more digits. However, 2,037 decimal places of π were calculated on the ENIAC computer in 1950. In 1954, the NORC computer calculated 3,089 decimal places in 13 minutes. In 1989, the IBM-VF supercomputer was used to calculate π to 1.01 billion decimal places. In 2010, a Japanese team extended the calculation to 5 billion decimal places using custom-built computers. In 2011, the same team extended the calculation to a trillion decimal places, which would require 1 billion sheets of paper to write out, assuming that a sheet of A4 paper can hold 60 lines with 17 digits per line. Stacked together, the pile would be 100,000 meters high! With the help of computers, a Turing machine can be designed, and the soft-structured ware can be reused. The accuracy of π was improved to $10^{12}$ decimal places in only 70 years. It is apparent that silicon-based machines are super powerful accelerators of human thinking and super powerful amplifiers of intelligent behavior, and the exponential growth rate of brute-force computation is beyond the reach of carbon-based life intelligence (Figure 3). 
Humans should fully enjoy the dividends of brute-force machine thinking and let artificial intelligence serve the association-driven creation mode of engineers and the hypothesis-driven discovery mode of scientists.
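The polygon approach and the paper-stack estimate above can be checked with a short Python sketch. It is written in the spirit of the polygon-doubling methods mentioned in the text and is not a reconstruction of Archimedes' or Liu Hui's actual procedures. The function names, the starting hexagon, and the sheet thickness of 0.1 mm are assumptions made here for illustration.

```python
import math
from typing import Tuple


def polygon_bounds(doublings: int) -> Tuple[int, float, float]:
    """Bound pi with regular polygons inscribed in and circumscribed
    about a unit circle, starting from a hexagon and doubling the sides."""
    sides = 6
    s = 1.0  # side length of the inscribed hexagon in a unit circle
    for _ in range(doublings):
        # Side of the inscribed 2n-gon from the n-gon; numerically stable
        # form of sqrt(2 - sqrt(4 - s^2)).
        s = s / math.sqrt(2.0 + math.sqrt(4.0 - s * s))
        sides *= 2
    lower = sides * s / 2.0                     # inscribed half-perimeter
    upper = sides * s / math.sqrt(4.0 - s * s)  # circumscribed half-perimeter
    return sides, lower, upper


def sheets_needed(digits: int, lines_per_sheet: int = 60,
                  digits_per_line: int = 17) -> int:
    """Number of sheets needed to write out the given number of digits."""
    per_sheet = lines_per_sheet * digits_per_line  # 1,020 digits per sheet
    return -(-digits // per_sheet)                 # ceiling division


if __name__ == "__main__":
    for k in (4, 11):  # 6 * 2**4 = 96 sides, 6 * 2**11 = 12,288 sides
        n, lo, hi = polygon_bounds(k)
        print(f"{n:>6} sides: {lo:.10f} < pi < {hi:.10f}")
    sheets = sheets_needed(10**12)
    # Assumed sheet thickness of 0.1 mm (not stated in the text).
    print(sheets, "sheets, stack of about", sheets * 0.0001, "meters")
```

Four doublings give the 96-sided polygons and eleven doublings give the 12,288-sided polygon. With 1,020 digits per sheet, a trillion digits need roughly 0.98 billion sheets, which at the assumed thickness is a stack of about 98 kilometers, in line with the rounded figures in the text.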

Figure 3.

The precision of π is used to illustrate the computing power of silicon-based machines (The right image is from [17]).

The brute force computing power of silicon-based machines has pioneered new directions in machine animation and virtual reality. Through brute force computing, abstraction and association in the cognitive machine can generate vivid virtual realities that can fool the human eye, such as a virtual tsunami. Of course, just as excessive imagination can lead to hallucinations and delusions in mental patients, silicon-based machines can fall into an endless cycle of soft structures that renders them seemingly ineffective even though they are still consuming energy.

When discussing evolutionary change, the commonly used time scale is “ten thousand years.” When discussing the ecological phenomena of human civilization, the commonly used time scale is “a thousand years.” The time scale commonly used to discuss the progress of human thinking and cognition, especially the development of science and technology, is “100 years” or even “10 years.” Recently, intelligent machines reached the nanosecond ($10^{-9}$ s) level in thinking speed and are approaching the picosecond ($10^{-12}$ s) and even femtosecond ($10^{-15}$ s) levels in ever smaller steps. The speed of human thinking has not changed much for thousands of years, and the reaction speed of naturally evolved carbon-based organisms is still stuck at the millisecond ($10^{-3}$ s) level, or slower. The speed of human thinking is seven or eight orders of magnitude behind the speed of machine thinking. The working frequency of the CPU rises in step with improvements in the precision of the computer clock, which is equivalent to shortening the execution cycle of instructions (whether in a complex or a reduced instruction set) in the baby cognitive nucleus [24]. With the invention of quantum computers, computing power will skyrocket. It is not surprising that the AlphaGo program [25] and the AlphaFold program [26] surpass the human brain's ability to perform the same tasks. More importantly, intense machine thinking can in turn promote the imagination of the human brain. In “Computing Machinery and Intelligence” [27], Turing wrote: “We do not wish to penalise the machine for its inability to shine in beauty competitions, nor to penalise a man for losing in a race against an aeroplane,” which can be interpreted today as follows: we do not devalue thinking machines that lack consciousness and emotion, nor do we devalue biological humans whose thinking speed is far inferior to that of silicon-based machines. 
It is entirely possible for cognitive machines to surpass ordinary urban white-collar workers doing secretarial work; however, it is still difficult for them to replace people in jobs that require high emotional intelligence and direct service to people. Given time, this too will improve.
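The speed comparison made earlier in this paragraph can be written out as a worked ratio. The machine step of one nanosecond is taken from the text; the human response times of 1, 10, and 100 milliseconds are illustrative assumptions covering the millisecond level and slower.

```latex
\frac{10^{-3}\,\mathrm{s}}{10^{-9}\,\mathrm{s}} = 10^{6}, \qquad
\frac{10^{-2}\,\mathrm{s}}{10^{-9}\,\mathrm{s}} = 10^{7}, \qquad
\frac{10^{-1}\,\mathrm{s}}{10^{-9}\,\mathrm{s}} = 10^{8}
```

On these assumptions, a gap of seven or eight orders of magnitude corresponds to human response times of tens to hundreds of milliseconds, while a response time of exactly one millisecond would give six.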

5.1 The Future of Autonomous Driving: The Normalization of the Testing of the Embodied Behavior of Vehicles

Autonomous driving is often mistaken for an automatic control problem, perhaps because of the outsized influence of the L0 to L5 grading scale defined in the J3016 standard published by the Society of Automotive Engineers. It is also often treated as a problem of pre-training plus fine-tuning in deep learning, under the outsized influence of end-to-end deep learning such as the work being done by Nvidia. On the road, some people emphasize intelligent networking, relying on high-precision positioning from Beidou/GPS, the guidance of roadside facilities such as RSU/OBU, 5G/6G communication networks, or high-precision navigation maps [28] to provide driving cognition. Others are expanding the number of sensors in the vehicle, increasing the number of cameras to dozens, the number of lidars to seven or eight, and the number of lidar beams from 64 to 128 or even more, and even adding millimeter-wave radar, infrared sensors, and so on. Some people let the smart car drive millions of kilometers on actual roads to accumulate data on a variety of situations, including accidents. Traffic authorities are establishing a variety of test and evaluation criteria for issuing licenses to smart cars, covering as many driving situations as possible, such as changing lanes to overtake, unprotected left turns, merging into and out of traffic, turning at intersections, side parking, car following, driving in snow, rollovers, flat tires, and accident prevention. More than ten years ago, our team proposed the driving brain and took the lead by successfully completing unmanned trips on actual roads from Beijing to Tianjin and from Zhengzhou to Kaifeng. The chassis of the car was made by a well-known bus manufacturer, which recently decided to stop working on autonomous driving and outsourced the business. Observing little progress in the industrialization of automatic driving, people like to joke: “We can hear footsteps on the stairs, but no one comes down.”

The automobile has a glorious history of nearly 200 years of manufacturing development. It is not only a typical product of the industrial revolution but also a typical product of intelligent manufacturing and an achievement in human mobility. In particular, automotive ergonomics, practiced through the steering wheel, accelerator, and brake, requires the movement of human limbs and physical strength, so that the car becomes an extension of the body. Although research on vehicle dynamics is becoming more and more mature and the automation of cars has been pushed to the extreme, an autonomous car without a driver cannot learn like a human, does not give way to pedestrians, is not decisive enough when changing lanes, does not probe and test, does not respond to the needs of surrounding vehicles, cannot be flagged down by a pedestrian like a taxi, and cannot deal with various edge conditions, even if it has traveled millions of kilometers on the highway. Autonomous cars have had difficulty gaining recognition in human society. The core of intelligent driving is the formalization of driving cognition, the development and mass production of a machine “driving brain,” and the problem of how to ensure that machine driving is safer, more energy-efficient, and more comfortable than human driving. It is immensely difficult for car manufacturers to develop a driving brain. Autonomous driving will have succeeded when it has passed the embodied Turing test, that is, when people cannot distinguish between a benchmark driver and a machine driving itself. The “embodied Turing test” refers to a test in which a third party cannot distinguish between the behavior of a powered machine controlled by a human and that of the same machine controlling itself. Driving accidents are ubiquitous, and the embodied interactive intelligence of vehicles is the starting point and end goal of unmanned driving.

A prominent advantage of the machine driving brain is its unwavering attention. Unlike human drivers, who may become tired or emotional, the autonomous vehicle of the future can always correctly analyze a situation and take the best action. There is a three-stage process in teaching a machine to drive and training a machine's driving brain to do the job of a benchmark driver, as shown in Figure 4. In the first stage, supervised learning, the benchmark driver operates the vehicle, and the machine brain learns. In the next stage, semi-supervised learning, the machine brain operates the vehicle, and the benchmark driver intervenes if necessary. In the last stage, autonomous learning, the machine driving brain learns and operates the vehicle independently. The process is iterative; that is, it may be necessary to return to the first or second stage of learning, depending on the machine brain's performance. Supervised learning includes planning, task assignment, guidance, clarification, and interaction. Autonomous learning is an important link that transfers the results of supervised learning into long-term memory. The driving brain improves through self-directed learning and continuous iteration, thereby realizing the self-growth of cognition. In addition, it can be improved through the use of various memory sticks containing reference libraries on, for example, situation response, accident prevention, and parking. This information helps the driving brain gradually transition from operating with difficulty in unfamiliar situations to operating while remaining appropriately focused on extracting the relevant information from its environment. The normalization of the embodied Turing test in realizing the self-driven growth of cognition holds the key to the future of unmanned driving.
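The three-stage, iterative process described above can be sketched as a small state machine in Python. This is only an illustration of the stage transitions: the performance score, the thresholds `promote_at` and `demote_at`, and the rule of moving one stage at a time are hypothetical choices made here, not details given in the text or in Figure 4.

```python
from enum import Enum
from typing import Iterable, List


class Stage(Enum):
    SUPERVISED = 1       # benchmark driver operates, machine brain learns
    SEMI_SUPERVISED = 2  # machine operates, benchmark driver may intervene
    AUTONOMOUS = 3       # machine learns and operates independently


def next_stage(stage: Stage, score: float,
               promote_at: float = 0.9, demote_at: float = 0.6) -> Stage:
    """Move forward on good performance, fall back on poor performance.

    `score` is a hypothetical performance measure in [0, 1].
    """
    if score >= promote_at:
        return Stage(min(stage.value + 1, Stage.AUTONOMOUS.value))
    if score < demote_at:
        return Stage(max(stage.value - 1, Stage.SUPERVISED.value))
    return stage


def train(scores: Iterable[float]) -> List[Stage]:
    """Replay a sequence of evaluation scores and record the stage history."""
    stage = Stage.SUPERVISED
    history = [stage]
    for score in scores:
        stage = next_stage(stage, score)
        history.append(stage)
    return history


if __name__ == "__main__":
    for s in train([0.95, 0.7, 0.92, 0.4, 0.95]):
        print(s.name)
```

In the usage example, the fourth score is poor, so the sketch falls back from the autonomous stage to the semi-supervised stage before advancing again, mirroring the possible return to an earlier stage of learning.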

Figure 4.

Driving brain learning process.

5.2 A New Milestone: The Embodied Turing Test Succeeds the Dialogue Turing Test

Language intelligence is the most basic embodiment of human intelligence. Presently, ChatGPT is being subjected to daily, real-time Turing tests [29] across the globe, with remarkable performance. Dialogue, whether spoken, written, or typed, is an embodied act that consumes energy. However, ChatGPT cannot replace the multiple intelligences of a person, because it has never acquired any behavioral experience in the physical world outside of text. The long evolutionary process of human beings includes, in addition to visual and auditory interaction, physical interaction with the external environment through the limbs and body [30], often referred to as “labor.” All kinds of powered machines, such as tractors, harvesters, excavators, cranes, tunnel-boring machines, cars, planes, transport ships, engines, generators, machine tools, production lines in factories, and spacecraft, are widely used. The machines keep running, thereby creating new jobs, especially skilled jobs, and supporting a large number of excellent operators, skilled craftsmen, and master craftsmen, whose operating skills often amaze people. People revolve around machines, and machines revolve around people, with the cycle repeated generation after generation as part of normal life. Engels even exclaimed that “labor created mankind.” The design, implementation, and evaluation of powered machines are related to control by human beings, the interaction between human and machine, and the behavioral interaction between the human body and the machine. Currently, machine development is moving in a more perfect, natural, and convenient direction, with human beings at the control center of production activities.

If the kinds of machines invented and used by human beings since the industrial revolution could be invisibly imbued with artificial intelligence to realize continuous self-control, their embodied behavior would no longer be distinguishable from that of machines controlled by skilled craftsmen.

People would be freed from the day-to-day operation of hundreds of millions of machines of all kinds (especially hard labor) and would no longer be enslaved and bound to machines for long periods. The output of industrial and agricultural products and daily necessities would continue to increase, allowing humans to enjoy economic and social prosperity and to engage in more creative, free labor. How greatly human society would be changed! Therefore, the normalization of the embodied Turing test for unmanned machines will be another important milestone after the normalization of the dialogue Turing test. Perhaps this process will not take more than 100 years, and the autonomous control of vehicles may be the first step.

For any machine, we expect versatility but not omnipotence. We hope that intelligent machines can operate autonomously and replace all types of labor in human society, especially jobs performed under hard conditions and in harsh environments. The world perceived by the machine is determined by the heterogeneous sensors with which it is equipped. The type and accuracy of the sensors determine the quality of the machine's perception, which can limit the physical world observed by the machine and affect its cognition and intelligence. For example, a machine can be fitted with a microscope or a telescope; it can be made to see polarized light and electromagnetic fields or to hear ultrasound and infrasound; it can be configured with a Beidou positioning receiver; and individual machines can even be given a specific form of language, such as a programming language, the language of art, the language of chemistry, or the language of material formulas, so that they can interact with human experts in professional terms. The behavior of the machine is embodied intelligence, which is closely related to the properties of the embodied dynamics of the machine. For example, the behavior of the self-driving car is related to vehicle dynamics, that of the self-navigating surface ship to the marine dynamics of the ship, that of the self-flying airplane to the aerodynamics of the fuselage, and that of the self-steering shield machine to the behavior of its servo system. Self-operating surgical robots depend on the dexterity of the scalpel, and humans can equip machines with a variety of powerful or delicate devices for kinetic behavior. Because machines are no longer limited by human perception organs, behavioral capabilities, and living bodies, the self-growth of machine cognition will lead to machines surpassing humans in performance.

Self-controlled cognitive machines have powerful power systems and complex servo systems, and more importantly, they can be composed of heterogeneous silicon-based hardware such as field programmable gate arrays (FPGAs), data processing units (DPUs), central processing units (CPUs), graphics processing units (GPUs), tensor processing units (TPUs), and memory. They can also be embedded with hard-structured ware reflecting the genetic inheritance of the “baby cognitive nucleus” to form an embodied machine, which is parasitized by rich, multi-scale soft-structured ware that can be bootstrapped and reused. The interaction between intelligent machine and human is realized through the outer loop of behavior and cross-modal perception. The brain of a machine contains heterogeneous, parallel immediate, working, and long-term memories that cooperate with one another, thereby creating memory intelligence. In the current computer brain, immediate memory and working memory can be realized with DPUs, GPUs, TPUs, FPGAs, and other parallel processors and circuits according to system requirements, whereas calculations can be realized by CPUs, GPUs, and other processors. The future computer brain may be implemented on new architectures, such as 3D memory and integrated memory and computing, with higher processing efficiency. In summary, the new generation of intelligent machines will be heterogeneous, even hyper-heterogeneous, learning and performing value alignment in interactions with the people who teach them (Figure 5). Soft-structured ware receives feedback through hard constructs in the physical world, makes full use of prediction and control, and forms a loop of perception, thinking, and behavior, through which cognition is repeatedly confirmed and becomes more and more correct.

Figure 5.

Interaction and synergy in machine learning and machine work.

The flowchart of a self-operating machine with interactive, learning, and self-growing abilities is shown in Figure 6. The figure shows the multi-level nested execution of a control system, with three types of feedback at different levels: embodied behavioral feedback, choice of attention, and situation-awareness feedback. There is conversion among long-term memory, immediate memory, and working memory. There are engines for searching for relevant facts and knowledge. There are behavioral decisions, as well as modifications to memory and rapid retrieval from memory. Mission goals are aligned by correcting errors through negative feedback.
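One ingredient of the description above, correcting errors through negative feedback while recent and accumulated experience is kept in separate memories, can be illustrated with a very small Python sketch. It is a toy proportional feedback loop under stated assumptions and does not reproduce the nested control system of Figure 6; the function name `feedback_loop`, the scalar state, the `gain` value, and the memory sizes are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


def feedback_loop(goal: float, state: float, steps: int,
                  gain: float = 0.5, working_size: int = 3
                  ) -> Tuple[float, List[float], List[Tuple[float, float]]]:
    """Drive a scalar state toward a goal by negative feedback.

    working   : short, bounded memory holding the most recent errors
    long_term : growing record of (state, action) pairs
    """
    working: Deque[float] = deque(maxlen=working_size)
    long_term: List[Tuple[float, float]] = []
    for _ in range(steps):
        error = goal - state      # perception: compare the situation with the goal
        working.append(error)     # keep only the latest errors
        action = gain * error     # decision: act against the error
        state += action           # behavior changes the state of the world
        long_term.append((state, action))  # store the experience
    return state, list(working), long_term


if __name__ == "__main__":
    final_state, recent_errors, experience = feedback_loop(10.0, 0.0, steps=10)
    print(final_state, recent_errors, len(experience))
```

With a gain of 0.5, the remaining error halves at every step, so the state approaches the goal without overshooting; a richer sketch would nest several such loops at different levels.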

Figure 6.

Flowchart of machine self-operation.

Machine intelligence in the age of intelligence leaps from the mechanization of mathematics to the automation of thinking, and then to the self-driven growth of cognition. From the normalization of dialogue Turing tests to the normalization of embodied Turing tests, cognitive machines will invent, discover, and create together with scientists, engineers, and craftsmen. Human beings will enter the intelligent era of human-machine co-creation and iterative development, laying the foundation for the architecture of a new generation of artificial intelligence. Finally, it should be added that the simulation of human embodied behavior and the embodied Turing test are not the focus of artificial intelligence, because intelligent science and technology do not generate artificial life or pursue the bionic engineering of a humanoid appearance, and there is no sign of “robots of the world uniting.”

The fundamental purpose of humans inventing cognitive machines is to take the physical world directly as a cognitive object to explain and solve realistic problems encountered in the process of human survival. Let us embrace the new era of artificial intelligence and welcome cognitive machines that undertake embodied labor so that people can undertake more creative work. Human beings will surely live more intelligently, with more dignity and grace!

We would like to express our sincere thanks to Yike Guo, Nick, Guanrong Chen, Yu Wei, Jie Chen, Qionghai Dai, Yun Xie, Bing Li, Liwei Huang, Chenglin Liu, Jingnan Liu, Pengju Ren, Yan Peng, Sheng Jiang, Yuchao Liu, Chengqing Zong, Yu Hu, and many other scholars and graduate students for their help in writing this manuscript. The Chinese version of this paper was published in China Basic Science in 2023.

[1] Sun, R.: The self-creation of children's life with complete growth, pp. 25–41. China Women Publishing House, Beijing (2014).
[2] Wu, J.: Light of civilization. Posts & Telecom Press, Beijing (2014).
[3] Descartes, R.: The philosophical writings of Descartes, pp. 65–67. Cambridge University Press, Cambridge (1985).
[4] Sang, J., Jian, Y.: ChatGPT: a glimpse into AI's future. Journal of Computer Research and Development 60(06), 1191–1201 (2023).
[5] Li, D.: Cognitive physics: The enlightenment by Schrödinger, Turing, and Wiener and beyond. Intelligent Computing 9(2) (2023).
[6] Einstein, A.: Physics and reality. Journal of the Franklin Institute 221(3), 349–382 (1936).
[7] Li, D.: Philosophy of artificial intelligence. Science and Society 13(02), 123–135 (2023).
[8] Fain, V.M.: Principle of entropy increase and quantum theory of relaxation. Soviet Physics Uspekhi 6(2) (1963).
[9] Li, D., Gao, H.: A hardware platform framework for an intelligent vehicle based on a driving brain. Engineering 4(4), 464–470 (2018).
[10] Schrödinger, E.: What is life? The physical aspect of the living cell, pp. 84–89. Cambridge University Press, Cambridge (1943).
[11] Oldroyd, D.R.: Charles Darwin's theory of evolution: A review of our present understanding. Biology and Philosophy 1, 133–168 (1986).
[12] Crick, F.H.C.: The genetic code: yesterday, today, and tomorrow. In: Cold Spring Harbor Symposia on Quantitative Biology, vol. 31, pp. 3–9 (1966).
[13] Kandel, E.R.: Nerve cells and behavior. Scientific American 223(1), 57–71 (1970).
[14] Hawking, S., Mlodinow, L.: A briefer history of time. Bantam Books, New York (2008).
[15] Engels, F.: The part played by labor in the transition from ape to man. Monthly Review 47(6), 1–15 (1995).
[16] Childe, V.G.: The bronze age. Past & Present (12), 2–15 (1957).
[17] Heidegger, M.: Sein und Zeit. Akademie Verlag GmbH, Berlin (2007).
[18] Li, D.: Formalization of brain cognition-starting from the development of machine driving brain. Science and Technology Review 33(24) (2015).
[19] Hecht, E.: Einstein on mass and energy. American Journal of Physics 77(9), 799–806 (2009).
[20] Newell, A.: Physical symbol systems. Cognitive Science 4(2), 135–183 (1980).
[21] Mandelbrot, B., Simon, H.: A note on a class of skew distribution functions, Analysis and Critique. Information and Control 2(1), 90–99 (1959).
[22] Turing, A.M.: On computable numbers, with an application to the Entscheidungsproblem. Journal of Mathematics 58(5), 345–363 (1936).
[23] Copeland, J.: The Church-Turing thesis. AlanTuring.net (1997). Available at: https://www.alanturing.net/turing_archive/pages/reference%20articles/The%20Turing-Church%20Thesis.html. Accessed 2 January 2023.
[24] Li, D.: Basic problem of artificial intelligence: can machines think? CAAI Transactions on Intelligent Systems 17(04), 856–858 (2022).
[25] Alzubi, J., Nayyar, A., Kumar, A.: Machine learning from theory to algorithms: An overview. In: Journal of Physics: Conference Series, vol. 1142 (2018).
[26] Jumper, J., Evans, R., Pritzel, A., Green, T., et al.: Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
[27] Turing, A.M.: The essential Turing: the ideas that gave birth to the computer age. Computing Machinery and Intelligence (2012), 433–464 (1950).
[28] Liu, J.: Progress and consideration of high precision road navigation map. Engineering Science 20(2), 99–105 (2018).
[29] Guo, B., Zhang, X., Wang, Z.: How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection. arXiv preprint arXiv:2301.07597 (2023).
[30] Cannon, W.B.: The wisdom of the body. Nature (133) (1934).
[31] Hofstadter, D.R., Sander, E.: Surfaces and Essences: Analogy as the Fuel and Fire of Thinking. Basic Books, New York (2013).
[32] Assran, M., Duval, Q., Misra, I., Bojanowski, P., Vincent, P., Rabbat, M., LeCun, Y., Ballas, N.: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture. arXiv preprint arXiv:2301.08243 (2023).
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.