This article proposes a method that enables an artificial agent to behave socially. Defining proper social behavior is difficult because it differs from situation to situation; an agent following the proposed method nevertheless behaves appropriately in each situation by empathizing with the others around it. The method is realized by incorporating empathy into active inference. We evaluated the proposed method on the control of autonomous mobile robots in diverse situations. The results show that an agent controlled by the proposed method behaved more socially and more adaptively than an agent controlled by standard active inference. In the case of two agents, the agent controlled by the proposed method behaved in a social way that reduced the other agent’s travel distance by 13.7% and increased the margin between the agents by 25.8%, even though it increased its own travel distance by 8.2%. In addition, the agent controlled by the proposed method behaved more socially when it was surrounded by altruistic others and less socially when it was surrounded by selfish others.

We humans are social animals, and we are required to behave socially to live in a society. Artificial agents that operate alongside us in our daily spaces are therefore also required to behave socially. Social behaviors are, in general, behaviors among multiple agents and, in particular, behaviors that are “appropriate” for those agents. The challenge in realizing socially behaving artificial agents is that the appropriateness of social behaviors varies from situation to situation (Henrich, 2015). Although we humans define some appropriate social behaviors as explicit rules, such as laws or standards, we are also required to follow implicit socially appropriate behaviors that are not defined as explicit rules. For example, in one country it is considered appropriate to ride on the right side of an escalator, whereas in another it is considered appropriate to ride on the left side; appropriate behavior on escalators varies from country to country and region to region. These social behaviors, which are implicitly required in every situation, can be thought of as manners. Artificial agents that are capable of social behavior must adopt not only behaviors that can be stated as explicit rules but also these implicit social behaviors. However, such implicit socially appropriate behavior is difficult to implement as a predefined program in an artificial agent because it is not explicitly defined as a rule and because it varies from situation to situation.

On the basis of this motivation, the goal of this article is to develop an artificial agent that does not rely on a predefined program but is capable of learning and adopting appropriate implicit social behavior in diverse situations. We previously proposed a method for realizing artificial agents capable of adopting such implicitly desired social behaviors (Matsumura et al., 2022). This article describes the details of that method and reports evaluation results in more diverse situations. The core idea of the proposed method is that the others surrounding us are not only objects of recognition but also sources of social behavior: with the help of the presence of others, we humans can perceive situations appropriately and take proper action in diverse situations.

The proposed method is implemented based on active inference (Adams et al., 2013; Friston et al., 2011, 2016, 2017). Active inference was proposed in the context of the free energy principle (FEP), which is a hypothesis for understanding the mechanism of a biological agent’s cognitive activities (Friston, 2010; Friston et al., 2006). In the FEP, the brain is viewed as a device performing variational Bayesian inference: it is described as constantly predicting the future and working to decrease the uncertainty of its predictions. Similar ideas have been widely studied in related contexts, such as the Bayesian brain hypothesis (Knill & Pouget, 2004), predictive coding (Rao & Ballard, 1999), and the Helmholtz machine (Dayan et al., 1995). The unique point of active inference is that it explains action as well as perception with a single principle: minimization of variational free energy. It is assumed that the human brain contains an internal model for predicting the external environment. Perception is explained as the process of minimizing the free energy of the internal model by updating the state of that model. Action is likewise explained as the process of minimizing the expected free energy for the future under a certain action. Although active inference is discussed mainly for single-agent cases, we extend it to multiagent cases. In multiagent cases, we can consider two types of uncertainty: (a) the agent’s uncertainty about others and (b) others’ uncertainty about the agent. The proposed method determines the agent’s action based on both types of uncertainty. By incorporating the second type, an agent controlled by the proposed method (hereinafter an empathic agent) attempts to act in line with the others’ expectations, which makes it behave in an adaptive, socially desirable manner. This is achieved by estimating others’ predictions about the agent by simulating the others’ situations. The empathic agent estimates the predictions of others using the same model that it uses to predict the future of its own external environment. When the model is used to estimate the predictions of others, its input is changed from the agent’s observation to the others’ observations, which are also estimated by the agent.

We simulated the walking behavior of the proposed empathic agents in the presence of others to evaluate and discuss the social behavior produced by the proposed method. When we humans walk, we do not walk selfishly toward a destination; rather, we walk with consideration for the others around us. Walking is therefore a good setting for evaluating social behavior, and models such as social long short-term memory (LSTM) (Alahi et al., 2016; Mohamed et al., 2020; Vemula et al., 2018) have been proposed to predict walking that takes sociality into account. A simulation of a walking scene yields trajectories and travel times. We quantitatively discuss the social behavior of the proposed method by comparing the trajectory and travel time of an agent that behaves asocially, that is, selfishly and without considering the others around it, with those of an agent controlled by the proposed method.

We describe the details of active inference and the proposed method in section 2. In section 3, we discuss the social behavior of agents based on the proposed method by comparing it with the behavior of asocial agents. Then, in section 4, we summarize and discuss future work, and in section 5, we discuss related research.

2.1 Active Inference

This section describes active inference, on which the proposed method is based. As mentioned in section 1, active inference is a hypothesis for understanding the mechanisms of action of biological agents (Adams et al., 2013; Friston et al., 2011, 2016, 2017) and was proposed in the context of the FEP (Friston, 2010; Friston et al., 2006). The FEP has been widely studied and applied to explain many cognitive abilities and phenomena, such as behavior (Friston et al., 2010), planning (Kaplan & Friston, 2018), autism (Quattrocki & Friston, 2014), and attention (Feldman & Friston, 2010). The most basic cognitive mechanisms, perception and action, are also explained as processes of minimizing variational free energy. It seems natural to explain perception as inference that minimizes the uncertainty of the internal model. More interestingly, action is explained with the same principle. In the FEP, action is explained as a process of inference in which biological agents actively act to decrease uncertainty, and the best action is the one expected to decrease uncertainty the most. This process for performing actions is called active inference. Active inference has been studied in a variety of environments (Friston et al., 2018; Pio-Lopez et al., 2016; Parr & Friston, 2017). It has also been combined with deep learning to apply it to more complicated environments, such as robot control (Çatal et al., 2020, 2021; Fountas et al., 2020; Millidge, 2020; Tschantz et al., 2020; Ueltzhöffer, 2018). Although active inference is basically defined for single-agent cases, some recent works have extended it to multiagent cases (Albarracin et al., 2022; Friedman et al., 2021; Kaufmann et al., 2021; Wirkuttis & Tani, 2021). We now describe the FEP and active inference mathematically.

As illustrated in Figure 1, there is an agent that receives observations ($o_t$) from an environment at time $t$. There are hidden states ($s_t$) behind the process of generating the observation. The agent takes an action ($a_t$) each time, then receives the next observation ($o_{t+1}$) from the environment. The agent always infers the hidden state and action at the current state from the current observation as the following posterior:
$$P(s_t, a_t \mid o_t) \tag{1}$$
A variational density, $Q(s_t, a_t)$, is assumed in order to infer the posterior by variational methods. In this context, variational free energy ($F$) is expressed as
$$F = \log P(o_t) + \mathrm{KL}\left[Q(s_t, a_t) \,\|\, P(s_t, a_t \mid o_t)\right] \tag{2}$$
where KL is the Kullback–Leibler divergence. Variational free energy is the same as the negative evidence lower bound in machine learning (Blei et al., 2017). An important idea in the FEP is that the action ($a_t$) is inferred in the same way as the hidden state ($s_t$). On the basis of this idea, free energy can be minimized by two types of processes. The first is related to inference of the hidden state, that is, to perception: free energy can be decreased by refining the internal model for inferring the probability of the hidden state and its transition. More interestingly, an agent can also decrease free energy by refining the internal model for inferring the probability of the action; this is the second type of process. From this viewpoint, our action-making process is also a process of inference, as is perception. Agents based on the FEP perceive and act by minimizing variational free energy through these two types of processes.
Figure 1. 

Overview of the free energy principle. An agent receives observations from an environment and always infers the hidden state and action by minimizing free energy.

In active inference, the desired distribution of actions at a given hidden state, $P(a_t \mid s_t)$, is the one expected to minimize free energy for a future state when an action is taken from the distribution. $P(a_t \mid s_t)$ is defined by
$$P(a_t \mid s_t) = \sigma\left(\gamma\, G(s_t, a_t)\right) \tag{3}$$
where $\sigma$ is a softmax function, $\gamma$ is a precision weight, and $G(s_t, a_t)$ is the expected free energy. Expected free energy is the free energy estimated for a future time step. Equation 3 means that the agent estimates free energy for a certain action in the future; that is, the agent makes a plan. If the agent is assumed to plan several steps ahead, the expected free energy is estimated for a sequence of actions, $\pi = \{a_t, a_{t+1}, \ldots, a_{t+T}\}$. In the active inference literature, $\pi$ is called a policy. From Equation 2, the expected free energy for a single time step is expressed by
$$G(s_t, a_t) = \log P(o_t) + \mathrm{KL}\left[Q(s_t) \,\|\, P(s_t \mid o_t)\right] \tag{4}$$
Millidge (2020) used a neural network model to estimate expected free energy, and Fountas et al. (2020) used a Monte Carlo tree search (MCTS). In the active inference literature, the first term, $\log P(o_t)$, is treated as the agent’s preference for the observation, and the intention of the agent is encoded into this term as a reward signal, $r$:
$$G(s_t, a_t) = r(o_t) + \mathrm{KL}\left[Q(s_t) \,\|\, Q(s_t \mid o_t)\right] \tag{5}$$
where $P(s_t \mid o_t)$ is approximated by $Q(s_t \mid o_t)$. The second term in Equation 5 is called the intrinsic value and expresses curiosity to explore the environment. From Equation 5, the agent acting on the basis of active inference takes into account both the intentional behavior expressed by the reward term and curiosity to explore the environment.
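
As a minimal sketch of Equations 3 and 5 as written above, the following Python snippet scores a few candidate actions and converts the scores into an action distribution. The per-action reward and intrinsic-value numbers, the value of γ, and the function names are illustrative assumptions and do not come from the evaluation in this article.

```python
import math

def softmax(values, gamma=1.0):
    """Softmax with precision weight gamma (shifted by the max for stability)."""
    scaled = [gamma * v for v in values]
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def expected_free_energy(reward, intrinsic_value):
    """Equation 5 as written: reward term plus the KL (intrinsic value) term."""
    return reward + intrinsic_value

# Hypothetical estimates for three candidate actions.
rewards = [1.0, 0.5, 0.0]    # r(o_t) for each action (assumed numbers)
intrinsic = [0.0, 0.2, 0.9]  # KL term for each action (assumed numbers)

G = [expected_free_energy(r, k) for r, k in zip(rewards, intrinsic)]
policy = softmax(G, gamma=2.0)  # Equation 3 as written: sigma(gamma * G)
```

With these numbers, the first action receives the highest probability because its reward dominates, and the third action is preferred over the second because of its larger intrinsic value.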

2.2 Active Inference With Empathy Mechanism

We extended active inference to generate social behavior. For this purpose, we embed the human capacity for empathy into active inference. There are two main types of empathy: (a) cognitive empathy and (b) emotional (or affective) empathy (Davis, 1983). Cognitive empathy is the ability to infer another’s mental state. Emotional empathy is the ability to feel what others are feeling as if it were one’s own feeling. In either type of empathy, understanding or sharing the experiences or feelings of others is thought to be related to our sociality (Eisenberg & Miller, 1987). Theory of mind and simulation theory are theoretical frameworks for understanding the mechanisms of these forms of empathy. According to simulation theory, we infer another’s mental state by simulating what we would infer or feel if we were in the same situation as the other. Inspired by simulation theory, our proposed method enables an agent to empathize with others: the agent uses its own internal models to virtually experience the experiences of others, as if simulating the others’ mental states with its own body. Figure 2 shows the core idea of the proposed method, named empathic active inference.

Figure 2. 

Overview of the core idea of the proposed empathic active inference. An agent always infers the future external environment and attempts to decrease the uncertainty of the inference. Others surrounding the agent also always infer the future external environment and attempt to decrease the uncertainty of inference. The empathic agent infers the other’s inference and acts to decrease the uncertainty of the other’s inference.

Figure 2 shows two agents, I and other, both acting through active inference. The agent I always infers the future external environment and attempts to decrease the uncertainty of that inference. Importantly, the others surrounding I also always infer the future external environment and attempt to decrease the uncertainty of their inference, and I is an uncertain factor for them. In this situation, there is a way to decrease free energy in addition to those described in the previous section: acting to decrease the others’ uncertainty. Although such an action does not decrease I’s own free energy, it decreases the total free energy of the group of agents. Similarly, the free energy of I can be decreased by the actions of the others if they act to decrease it. By thinking of free energy collectively rather than individually, we can manipulate free energy not only through our own actions but also through the actions of others. Actions based on this idea can be said to be for others, that is, social actions. In this article, we refer to an agent that acts in this way as an empathic agent.

An empathic agent first needs to infer another’s inference about it. For this purpose, the empathic agent uses an idea inspired by simulation theory: the other’s inference is inferred using the same internal model that the empathic agent uses to infer its own external environment. Inference of another’s inference consists of two processes, as illustrated in Figure 3.

Figure 3. 

Processes of inferring another’s inference. Inference of another’s inference consists of two processes. The first process estimates the observation of others from the observation of the empathic agent. The second process generates a prediction of others by giving the estimated observation of others to the internal model that the empathic agent uses to predict its external environment.

The first process is to estimate the observation of others ($o_t^{\mathrm{oth}}$) from the observation of the empathic agent ($o_t^{\mathrm{my}}$). If the observation is a visual image, the agent infers the image observed by the others. In this case, a method for novel view synthesis, such as a generative query network (Eslami et al., 2018), can be used. If the observation is the set of positions of the others extracted from object detectors, such as infrared sensors, the observation of others can be estimated simply by coordinate transformation. However, even in such a simple case, partial observability means that the observation of others cannot always be estimated completely. For example, if an object is not visible to the empathic agent because of occlusions or other factors, information about that object will be missing, even if the others observe it. The second process generates a prediction of the external environment of others ($o_{t+1}^{\mathrm{oth}}$) by giving the estimated observation of others ($o_t^{\mathrm{oth}}$) to the internal model that the empathic agent uses to predict its own external environment. This method, inspired by simulation theory, infers the inferences of others by simulating the situation of “if I were in the other’s situation.” The action of the other ($a_t^{\mathrm{oth}}$) is also required to infer the inference of the other; however, inferring the action of the other is difficult in general. In this article, we assume a situation in which the others give priority to the artificial agent to act first, and we assume a no-operation (NOP) action as the action of others. As explained in the previous section, an agent acting in accordance with active inference takes the action that will most decrease the expected free energy. Intentional behavior is achieved by encoding the reward information corresponding to the behavioral intention as a preferable observation.
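We similarly extend Equation 5 to a form that encodes an intentional response to others’ expectations to the agent as a reward:
$$\begin{aligned} G(s_t, a_t) &= r(o_t) + \mathrm{KL}\left[Q(s_t) \,\|\, Q(s_t \mid o_t)\right] \\ &= r^{\mathrm{my}}(o_t) + \sum_i r^{\mathrm{oth}}(o_t, e_t^i) + \mathrm{KL}\left[Q(s_t) \,\|\, Q(s_t \mid o_t)\right] \end{aligned} \tag{6}$$
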
The first term is the reward for the agent’s own goal ($r^{\mathrm{my}}$), which is determined by the observed information and is the same as the reward in Equation 5. Equation 6 adds a reward term for the expectations of others ($r^{\mathrm{oth}}$), determined by the observation ($o_t$) and by the expectation that other $i$ has of the agent ($e_t^i$), both of which are estimated by the agent. The expectation of others is derived from the inference of others ($o_t^{\mathrm{oth}}$). For example, if the inference of an other is estimated as the positions of the agents surrounding that other, then the position of the empathic agent inferred by that other is the expectation from that other. Similar to the flexibility in encoding rewards in active inference (Millidge, 2020), expectations from others can be encoded flexibly, such as by using probability distributions. For multiple others, the rewards for the individual others ($i$) are summed. The proposed agent takes the action that will most decrease the expected free energy described in Equation 6. The important point is that an inference that others make about the empathic agent is interpreted as an expectation that others have of the empathic agent. An artificial agent controlled by the proposed method does not act only according to its own objectives; it also takes into account the expectations of the others around it. In other words, empathic agents are capable of taking social action, that is, action that takes others into account, and in particular, they realize the appropriate social action required in each situation as the action that others expect them to take.
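
To make Equation 6 concrete, the following Python sketch scores two candidate next positions in a walking scene. It is an illustration under stated assumptions rather than the implementation used in this article: rewards are taken to be negative Euclidean distances, the expectation of each other is represented as the position at which that other is estimated to expect the agent, the KL term is passed in as a plain number, and all names and coordinates are hypothetical. Following Equations 3 and 5 as written, in which rewards enter G positively, the sketch prefers the candidate with the largest score.

```python
import math

def reward_my(next_pos, goal):
    """r^my: being closer to the agent's own goal is better (assumed form)."""
    return -math.dist(next_pos, goal)

def reward_oth(next_pos, expected_pos):
    """r^oth: being closer to where an other expects the agent is better (assumed form)."""
    return -math.dist(next_pos, expected_pos)

def empathic_G(next_pos, goal, expectations, kl_term=0.0):
    """Equation 6 as written: r^my + sum over others of r^oth + KL term."""
    return (reward_my(next_pos, goal)
            + sum(reward_oth(next_pos, e) for e in expectations)
            + kl_term)

goal = (10.0, 0.0)
candidates = {"straight": (1.0, 0.0), "sidestep": (0.87, 0.5)}
expectations = [(0.87, 0.5)]  # hypothetical: one other expects a sidestep

# With no expectation terms, Equation 6 reduces to Equation 5.
standard = {name: empathic_G(p, goal, []) for name, p in candidates.items()}
empathic = {name: empathic_G(p, goal, expectations) for name, p in candidates.items()}

best_standard = max(standard, key=standard.get)
best_empathic = max(empathic, key=empathic.get)
```

In this toy setting, the agent without expectation terms goes straight toward its goal, whereas the agent with the added term accepts a slightly longer path to match what the other is estimated to expect.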

2.3 Ideal Behaviors in Simple Game

In this section, we illustrate the ideally expected behavior of an agent based on the proposed method in a simple game, a public goods game widely used in game theory. As mentioned in section 1, the definition of socially appropriate behavior depends on the situation; however, in the public goods game, the social appropriateness of behavior can be discussed quantitatively by the sum of public goods. In the following explanation, the agent that acts according to the proposed method is referred to as the empathic agent. The public goods game is a virtual game for examining the sociality of agents and is illustrated in Figure 4.

Figure 4. 

Overview of the public goods game. There are multiple players. Each player is required to decide how much of his or her goods to offer to the public. Goods provided by players are multiplied by k and returned (r) equally among all players. The multiplier k is greater than 1 and less than the number of participants, N.

There are multiple players; the case of N = 4 players is illustrated in Figure 4. Each player is required to decide how much of his or her goods (property) to offer to the public. Goods provided by players are multiplied by k and returned (r) equally among all players. The multiplier k is greater than 1 and less than the number of participants, N. These are the game settings for the public goods game.

When all players provide all of their goods to the public, the total goods of all players are maximized. Therefore, in terms of the total quantity of goods, the most socially desirable behavior is for all players to provide all goods to the public. However, each player’s personal goods are maximized when that player offers nothing, because a player who offers nothing to the public benefits from the others’ offerings without reducing his or her own goods at all. A player who provides no goods at all is called a free-rider. To maximize the benefits to society, it is important to reduce the number of free-riders; stated differently, in the public goods game, the appropriateness of social behavior can be discussed in terms of how altruistic it is. Experiments with human subjects have shown that free-riders are rarely observed in practice and that humans provide a small amount of goods (Fischbacher et al., 2001). In other words, humans do not act only to maximize their own goods but also act in altruistic ways that benefit others. Many discussions and experiments have sought to explain this behavior; for example, punishment of free-riders encourages social behaviors (Fehr & Gächter, 2000, 2002).
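
The tension between the total goods and each player’s personal goods can be checked with a short calculation. The following Python sketch assumes that every player starts with the same endowment; the endowment of 10, the multiplier k = 2, and the function name are hypothetical choices made only for illustration.

```python
def payoffs(contributions, endowment, k):
    """Return each player's final goods after one round of a public goods game."""
    n = len(contributions)
    assert 1 < k < n, "the multiplier k must satisfy 1 < k < N"
    share = k * sum(contributions) / n  # pooled goods are multiplied and split equally
    return [endowment - c + share for c in contributions]

all_in = payoffs([10, 10, 10, 10], endowment=10, k=2)    # everyone offers everything
nothing = payoffs([0, 0, 0, 0], endowment=10, k=2)       # everyone is a free-rider
one_rider = payoffs([0, 10, 10, 10], endowment=10, k=2)  # only player 0 free-rides
```

With these numbers, full contribution yields 20 goods per player (80 in total), universal free-riding yields 10 per player (40 in total), and a lone free-rider ends with 25 while each contributor ends with 15.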

If an empathic agent can ideally predict the expectations of others, its behavior can also explain human social behavior in public goods games. In Equation 6, the reward for each player is the amount of goods. Assuming an ideal situation in which the empathic agent can perfectly predict the behavior of others, we ignore the KL term in Equation 6. We examine the behavior of the empathic agent in two cases: the three players other than the empathic agent are all completely selfish, or they are all completely altruistic. A completely selfish player offers nothing to the public; that is, the player behaves as a free-rider. Conversely, a completely altruistic player offers all of his or her goods to the public.

First, we examine the behavior of the empathic agent when the other players are completely selfish. If the empathic agent can predict that the others will behave perfectly selfishly, it also predicts that the others expect it to behave perfectly selfishly. This is because the empathic agent uses the same internal model that predicts the behavior of others to estimate the others’ predictions of its own behavior. In this case, the behavior that best meets the expectations of others is the completely selfish behavior of offering nothing. The behavior that maximizes the agent’s own reward is also to offer nothing to the public. Therefore the reward-maximizing behavior for both self and others is to offer nothing to the public, and the empathic agent behaves as a free-rider, like the others surrounding it.

Second, we examine the behavior of the empathic agent when the other players are completely altruistic. As in the previous discussion, the behavior that maximizes the empathic agent’s own reward is to offer nothing to the public. In this case, however, the empathic agent predicts that all others will behave perfectly altruistically and therefore that all others expect it to behave perfectly altruistically; namely, the others expect the empathic agent to offer all of its goods to the public. Thus, as the average of the offering that maximizes its own reward (zero) and the offering that best meets the expectations of the N − 1 others (all), the empathic agent offers (N − 1)/N of the total amount to the public.

From the same discussion, as shown in Figure 5, the amount offered to the public by the empathic agent increases with an increase in the ratio of completely altruistic players in the surroundings.
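
The averaging argument above can be written out explicitly. As an assumption made here for illustration, let offers be normalized so that 0 means offering nothing and 1 means offering everything, and let $m$ denote the number of completely altruistic players among the $N - 1$ others (the symbols $x$ and $m$ are introduced only for this sketch). Averaging the agent’s own reward-maximizing offer with the offers expected by the others gives

$$x = \frac{1}{N}\Bigl(\underbrace{0}_{\text{own optimum}} + \underbrace{m \cdot 1}_{\text{altruistic others}} + \underbrace{(N - 1 - m) \cdot 0}_{\text{selfish others}}\Bigr) = \frac{m}{N}$$

This reduces to $x = 0$ when all others are selfish ($m = 0$) and to $x = (N - 1)/N$ when all others are altruistic ($m = N - 1$); for $N = 4$, the latter is $3/4$. The linear form for intermediate values of $m$ is one reading of the same averaging argument, not a result reported in the evaluation.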

Figure 5. 

Ideal behavior of empathic agents in public goods games. The amount offered to the public by the empathic agent increases with an increase in the ratio of completely altruistic players in the surroundings.

The same discussion holds not only for an increase in the ratio of completely altruistic players but also for an increase in the amount that each player offers to the public. As the foregoing discussion shows, the proposed empathic agent consequently behaves similarly to the others around it. Thus, if we take the amount offered to the public as an index of sociality, the sociality of the empathic agent changes adaptively according to the sociality of the surrounding others. According to the proposed method, therefore, people behave altruistically in public goods games because the others around them in the environment in which they were raised were altruistic. This behavior realizes the idea of implicitly required social behavior considered in this article, namely, the adoption of appropriate implicit social behaviors in given situations.

3.1 Simulation Setup

To evaluate and discuss the social behaviors of the proposed empathic agent, we ran multiagent simulations in which multiple agents walked from their initial positions to their destinations. We simulated walking scenes because human walking is a social behavior (Alahi et al., 2016; Mohamed et al., 2020; Vemula et al., 2018) and because behaviors can be compared quantitatively through travel times and trajectories. Social behaviors are behaviors among others. Conversely, asocial behaviors are behaviors in which others are not considered at all. The social behavior of the proposed empathic agents can therefore be discussed by comparing it with behavior in which surrounding others are not considered at all. For this comparison, we simulated the behavior of an agent controlled by standard active inference in addition to that of the proposed empathic agent.

In addition, we simulated several cases in which the altruism of others surrounding the empathic agent varied or the number of walkers varied. This is to confirm that the social behavior considered in this article is not simply altruistic behavior, as discussed in section 2.3, but altruistic behavior that is adaptively adjusted according to the altruism of others around the agent. Two types of conditions were changed, depending on the scenario. The first condition was the situation of the scenario, namely, the settings for the number of walkers and their starting and ending points. Three types of ideally symmetrical situations were assumed for the simulation, as illustrated in Figure 6.

Figure 6. 

Types of simulation situations. A player and some others move from their initial positions to each player’s destination, avoiding collisions with each other. The initial positions and destinations are symmetrically placed. The number of others in each situation varies from one to three.

Asymmetric situations were also simulated to evaluate more realistic situations, as discussed in section 3.4. In this section, the standard or empathic active inference agent is called the “player,” and the other agents are simply called the “others” or the “other.” The player is the red dot in Figure 6. Situation (a) is the simplest: Two agents, the player and one other, walk from their initial points to their destinations. Because their initial points and destinations are opposite each other, the player and the other must deviate from the shortest path (the straight line between the initial point and the destination) to avoid colliding. Situation (b) is denser than situation (a): Two others plus the player walk and cross each other at the center of the field. Situation (c) is the densest: Three others plus the player walk through a crossroad and pass each other at the center of the field. No obstacles (e.g., walls) are present in any of the situations, and agents can walk in any area of the field.

The second condition is the type (characteristic) of the other, that is, selfish or altruistic. The type of other is controlled by tuning the parameters of the model that controls the walking of the others. The others were controlled by the social force model (SFM) (Helbing & Molnar, 1995), which is a model for controlling pedestrians in a social space. Although there are a variety of extensions, the SFM basically models the motion of agents by the combination of driving and repulsive forces. The driving force describes the motivation of agents to move toward the given goal at a certain desired velocity. The repulsive force represents the motivation of agents to avoid colliding with others or with obstacles, such as walls. When others are selfish, the weight of the driving force toward the destination is set higher than that of the repulsive force with others. When others are altruistic, the weight of the repulsive force is set higher than that of the driving force toward the destination. Example trajectories of each type of other (green) are illustrated in Figure 7.
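
The role of the two weights can be illustrated with a simplified force computation. The following Python sketch is not the exact formulation of Helbing and Molnar (1995) and not the configuration used in the simulations: the exponential repulsion, the relaxation time `tau`, the `interaction_range`, the weight values, and all function and variable names are assumptions chosen for illustration.

```python
import math

def social_force(pos, vel, goal, others, w_drive, w_repulse,
                 desired_speed=1.0, tau=0.5, interaction_range=1.0):
    """Weighted sum of a driving force and exponential repulsive forces."""
    # Driving force: relax the current velocity toward the desired velocity.
    gx, gy = goal[0] - pos[0], goal[1] - pos[1]
    g = math.hypot(gx, gy) or 1e-9
    fx = w_drive * (desired_speed * gx / g - vel[0]) / tau
    fy = w_drive * (desired_speed * gy / g - vel[1]) / tau
    # Repulsive force: push away from each other agent, decaying with distance.
    for ox, oy in others:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy) or 1e-9
        strength = w_repulse * math.exp(-d / interaction_range)
        fx += strength * dx / d
        fy += strength * dy / d
    return fx, fy

# A walker already moving at the desired velocity, with one agent near its path.
state = dict(pos=(0.0, 0.0), vel=(1.0, 0.0), goal=(10.0, 0.0), others=[(2.0, 0.5)])
selfish = social_force(**state, w_drive=2.0, w_repulse=0.5)     # driving > repulsive
altruistic = social_force(**state, w_drive=0.5, w_repulse=2.0)  # repulsive > driving
```

In this configuration, both settings push the walker sideways, away from the nearby agent, and the altruistic weights produce the stronger sideways push, which corresponds to keeping a larger margin.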

Figure 7. 

Two types of other. The difference between the types of other appears in the difference of the margin to avoid collision. (right) When the other is altruistic, it walks with a larger margin with the player than (left) when it is selfish.


In this example, the player (red) moves in a straight line toward the destination, and the other takes the nonshortest path to avoid colliding with the player in both cases. The difference between the types of other appears in the difference of the margin to avoid collision. When the other is altruistic, it walks with a larger margin with the player than when it is selfish. In the evaluation of the symmetric cases, all of the others are assumed to have the same type (characteristic). For example, in the situation in Figure 6(c) with three others, assuming that the others are selfish means assuming that all three others are selfish. Alternatively, in the evaluation of asymmetric cases discussed in section 3.4, we also show the case in which the three others have different types. In summary, we evaluate and discuss the social behavior of the proposed empathic agent from two viewpoints: (a) comparing the behavior of the empathic agent with that of an asocial agent (standard active inference) that does not consider others at all and (b) how behavior changes with differences in the surrounding situations.

3.2 Setup for Empathic Agent Model

The observation of the player is the position of agents relative to the current position of the player. The observation is constructed from two consecutive simulation steps, that is, [ot−1, ot]. As an ideal case of observation assumed in this simulation, there is no lack of data caused by occlusions, and there is no noise in the observation. The action space is defined as a discrete space. The player moves a constant distance in one of five directions (−60°, −30°, 0°, 30°, 60°) relative to its current direction. The constant distance of the player’s movement is almost the same as that of the others. In addition to these actions, the player can select the NOP action, that is, to stop at the current position. A total of six actions comprises the action space for the player. Each action is encoded as a one-hot vector.
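The six-action space and its one-hot encoding can be sketched as follows. This is a minimal illustration, not the implementation used in the evaluation: the step length, the ordering of the actions, the sign convention of the angles (positive meaning counterclockwise), and the choice to update the heading after each move are all assumptions.

```python
import math

TURN_ANGLES_DEG = (-60, -30, 0, 30, 60)  # five heading-relative moves from the text
NOP = len(TURN_ANGLES_DEG)               # index 5: stay at the current position
NUM_ACTIONS = NOP + 1                    # six actions in total

def one_hot(action):
    """Encode an action index as a one-hot vector of length NUM_ACTIONS."""
    return [1.0 if i == action else 0.0 for i in range(NUM_ACTIONS)]

def apply_action(x, y, heading_deg, action, step=0.1):
    """Turn by the selected angle, then move a constant distance `step`.

    `step` is a hypothetical value; NOP leaves position and heading unchanged.
    """
    if action == NOP:
        return x, y, heading_deg
    heading_deg += TURN_ANGLES_DEG[action]
    rad = math.radians(heading_deg)
    return x + step * math.cos(rad), y + step * math.sin(rad), heading_deg
```

For instance, `one_hot(NOP)` gives a vector whose only nonzero entry is the last one, and `apply_action(0.0, 0.0, 0.0, 2, step=1.0)` moves the player one unit straight ahead.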

Two densities were assumed in the evaluation, and they were modeled using simple neural networks. The models are illustrated in Figure 8.

Figure 8. 

Models for empathic active inference. The observation model represents Pθ(st|ot) and Pθ(ot|st). The transition model represents Pϕ[δ(s)t+1|st, at] and Pϕ[δ(o)t+1|st+1]. Both the observation and transition models use a reparameterization trick to express probabilistic distributions with neural networks, like a variational autoencoder.

The first model is the observation model for Pθ(st|ot) and Pθ(ot|st), parameterized by θ. The second model is the transition model for Pϕ(δ(s)t+1|st, at) and Pϕ(δ(o)t+1|δst+1), parameterized by ϕ. As described in section 3.1, the movements of others were simulated by the SFM. The SFM determines the movements of agents based on their current positions and velocities (Helbing & Molnar, 1995), so the movements of others can be predicted from observations at two or more time steps. In addition, there is no need to maintain a longer history of movements. Therefore complicated models that can maintain history, such as recurrent neural networks or LSTM, are not required, and both models consist simply of two fully connected layers (linear block; see Figure 8). The observation and transition models use a reparameterization trick, a technique to express probabilistic models with neural networks (Kingma & Welling, 2013). Because probabilistic models require a sampling process that disconnects the data flow of neural networks, backpropagation, which is an efficient learning method for neural networks, cannot be applied directly. Neural networks with the reparameterization trick compute the mean (μ) and standard deviation (σ) deterministically and generate samples (z) as z = μ + σ ⊙ ε by using samples (ε) from a normal distribution N(0, I). This allows the neural network to sample and learn by backpropagation because the data flow is not disconnected. The latent vectors in both the observation and transition models are assumed to be distributed normally. The distributions of the latent vectors of the observation model (Pθ) and transition model (Pϕ) are learned to get closer to each other. The transition model is constructed to model the difference in the observations (i.e., difference in observed positions) between the present and future, not the absolute position in the future. This is because there is continuity in this environment, namely, the positions of agents change continuously, and agents do not jump to faraway places in a single time step. This feature can help in learning the transition model. For this modification, each of the distributions of the latent vectors of the observation model (Pθ) and the transition model (Pϕ) is learned to become closer to the standard normal distribution, N(0, I), and simultaneously, the distribution of the observation model (Pθ) for st+1 is learned to become closer to the sum of the normal distributions of the observation model (Qθ) for st and the transition model (Pϕ) for st+1. Namely, the KL divergence described in Equation 7 also decreases in the learning process:
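$$
\mathrm{KL}\left[\mathcal{N}\left(s_{t+1}^{\mu},\, s_{t+1}^{\sigma}\right) \,\middle\|\, \mathcal{N}\left(s_{t}^{\mu} + \delta s_{t+1}^{\mu},\; s_{t}^{\sigma} + \delta s_{t+1}^{\sigma}\right)\right] \tag{7}
$$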
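To make the two ingredients of this training objective concrete, the following sketch shows reparameterized sampling and the KL divergence between two diagonal Gaussians, applied in the manner of Equation 7. It is a plain-Python illustration under stated assumptions, not the neural network implementation used here: σ is treated as a standard deviation, as in Kingma and Welling (2013), the latent dimensions are treated as independent, and all function names and numerical values are hypothetical.

```python
import math
import random

def reparameterize(mu, sigma, rng=random):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, I)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

def kl_diag_gauss(mu_p, sigma_p, mu_q, sigma_q):
    """KL[N(mu_p, diag(sigma_p^2)) || N(mu_q, diag(sigma_q^2))]."""
    return sum(
        math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2.0 * sq ** 2) - 0.5
        for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q)
    )

# Hypothetical latent statistics for the state at time t and the predicted difference.
s_t_mu, s_t_sigma = [0.2, -0.1], [0.5, 0.4]
delta_mu, delta_sigma = [0.1, 0.0], [0.1, 0.1]

# Target of Equation 7: current state statistics plus predicted difference.
target_mu = [a + b for a, b in zip(s_t_mu, delta_mu)]
target_sigma = [a + b for a, b in zip(s_t_sigma, delta_sigma)]

# Hypothetical statistics that the observation model assigns to the state at t+1.
s_next_mu, s_next_sigma = [0.35, -0.1], [0.6, 0.5]
loss_eq7 = kl_diag_gauss(s_next_mu, s_next_sigma, target_mu, target_sigma)
z = reparameterize(s_next_mu, s_next_sigma)
```

Because the sample z is a deterministic function of μ and σ once ε is drawn, gradients can flow through μ and σ in an automatic differentiation framework; the plain-Python version above only shows the arithmetic.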

In this evaluation, we do not model the density of P(at|st) with neural networks, although it is modeled as a policy model in previous works (Fountas et al., 2020; Millidge, 2020), because the action of the player is determined by the action probability P(at|st) described in Equation 3 with Equation 6. The neural networks were trained in an offline manner because the aim was to evaluate the sociality of the empathic agent’s behavior, not to validate the feasibility of online training. Therefore the training data for each scenario, that is, each combination of the three situations and the two characteristics of others, were generated in advance with a randomly acting player. During training-data generation, the player and one of the others were interchanged at random to give the player experience of the others’ viewpoints. This is necessary because the player must infer the others’ inferences from the others’ viewpoints; without such experience, it cannot infer the inferences of others. One million samples were generated for each scenario. Adam was used for optimizing the neural networks’ parameters (Kingma & Ba, 2014), with a constant learning rate of 1e-4. Training proceeded for 10,000 epochs. KL vanishing is a known difficulty in learning VAE models, and KL annealing was proposed to tackle this problem (Bowman et al., 2016). In the learning process, as KL annealing, the weight of the KL loss starts at 1e-4 and is doubled every 1,000 epochs.
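The annealing schedule described above can be written as a short function. This is a sketch under one assumption that the text does not settle: the weight is not capped, so it keeps doubling until the end of training. The function and parameter names are hypothetical.

```python
def kl_weight(epoch, initial=1e-4, period=1000):
    """KL loss weight that starts at `initial` and doubles every `period` epochs."""
    return initial * 2 ** (epoch // period)

# Weight at the start, just before and after the first doubling, and at the last epoch.
schedule = [kl_weight(e) for e in (0, 999, 1000, 9999)]
```

With 10,000 epochs, the weight doubles nine times under this assumption, so the final weight is 512 times the initial one. The total loss at a given epoch would then take the form of a reconstruction term plus `kl_weight(epoch)` times the KL term.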

In the simulations, the player estimated the expected free energy described in Equation 6 by Monte Carlo tree search (MCTS) with the learned models (Coulom, 2006; Fountas et al., 2020). The learned models were used for predicting the future state and the value of each action (i.e., expected free energy) in MCTS. The maximum depth of the tree was set to 3 (i.e., at most three time steps ahead are estimated), and the search was run for 3,000 iterations at each time step. The action of the player is determined by Equation 3 with the estimated expected free energy for each action. The precision weight (γ) in Equation 3 is set to 1. The behavior of an agent controlled with standard active inference (hereafter, standard agent) was also evaluated for comparison. Although the standard agent is also based on predictions about the future environment using internal models, it does not take into account expectations from others. The reward (rmy) in Equation 6 is defined by how close the agent is to the destination. If the agent moves closer to the destination, a positive reward is returned, and the expected free energy will be smaller than if it moves away from the destination. Moreover, the empathic player (i.e., empathic agent) estimates the reward of its future state with respect to others’ expectations, roth, in Equation 6. This reward is defined by the distance between the position of the player and others’ expected positions.
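The final action-selection step can be illustrated as follows. The sketch assumes that Equation 3 has the softmax form commonly used in active inference, in which the probability of an action is proportional to exp(−γG) for its expected free energy G; if Equation 3 differs, the sketch does not apply. The function names and the example values of G are hypothetical, and the MCTS that would produce the estimates is not shown.

```python
import math
import random

def action_probabilities(expected_free_energy, gamma=1.0):
    """Softmax over negative expected free energy, scaled by the precision gamma."""
    logits = [-gamma * g for g in expected_free_energy]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def sample_action(probs, rng=random):
    """Draw one action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical expected free energies for the six actions (lower is better).
G = [2.0, 0.5, 1.0, 3.0, 2.5, 1.5]
probs = action_probabilities(G, gamma=1.0)
action = sample_action(probs)
```

Under this assumed form, the action with the lowest expected free energy receives the highest probability, and a larger γ would concentrate the distribution further on that action.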

3.3 Results for Ideal Symmetrical Cases

We discuss the social behavior of the proposed empathic agent. As described in previous sections, the social behavior is behavior among others, and we discuss it by comparing behavior of the empathic player with that of the standard player controlled by standard active inference. Although it is difficult to define the appropriateness of social behavior, the minimum distance between agents can be considered as an indicator from the point of view of safety to avoid collisions in the behavior of walking. The larger the minimum distance, the more socially desirable walking is. Moreover, from the point of view of equality, the average difference in distance traveled by each walker can be considered as an indicator. The smaller the average difference in distance traveled by walkers is, the more socially desirable walking is. Therefore we evaluated the minimum distance and the average difference (inequality) among the walkers. We also discuss the altruism of empathic players based on differences in travel distances between the empathic players and the standard players, as well as differences in travel distances based on differences of surrounding others’ characteristics (selfish or altruistic). The behaviors of the agents are shown in Figure 9 and Table 1.
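The three quantities used in this evaluation can be computed from recorded trajectories as sketched below. The sketch makes assumptions that are not stated explicitly in the text: each trajectory is a list of 2-D positions sampled at common time steps, the minimum distance is taken over all pairs of agents at the same time step, and inequality is the mean absolute pairwise difference of travel distances. The function names and the toy trajectories are hypothetical.

```python
import math
from itertools import combinations

def travel_distance(traj):
    """Total path length of one trajectory given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(traj, traj[1:]))

def minimum_distance(trajs):
    """Smallest distance between any two agents at the same time step."""
    return min(
        math.dist(a[t], b[t])
        for a, b in combinations(trajs, 2)
        for t in range(min(len(a), len(b)))
    )

def inequality(trajs):
    """Mean absolute pairwise difference of the agents' travel distances."""
    dists = [travel_distance(tr) for tr in trajs]
    pairs = list(combinations(dists, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Toy example: the player walks straight while the other detours around it.
player = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
other = [(2.0, 1.0), (1.0, 2.0), (0.0, 1.0)]
metrics = (minimum_distance([player, other]), inequality([player, other]))
```

In the toy example, the player travels the shorter path and the other bears the whole detour, so the inequality is greater than zero even though the agents never come close to each other.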

Figure 9. 

Trajectories of agents for each scenario. Two types of others (selfish and altruistic) are assumed for each situation. Results for both standard and empathic players are shown. Trajectories represent the travel history of each agent.

Table 1. Quantitative evaluation of behaviors.

| Situation | Others | Method | Travel distance (player) | Travel distance (average of others) | Minimum distance | Inequality (average difference) |
|---|---|---|---|---|---|---|
| (a) | Selfish | Standard | 4.68 | 5.04 | 0.51 | 0.358 |
| (a) | Selfish | Empathic | 4.77 | 4.80 | 0.81 | 0.027 |
| (a) | Altruistic | Standard | 4.65 | 5.93 | 1.20 | 1.279 |
| (a) | Altruistic | Empathic | 5.03 | 5.12 | 1.51 | 0.100 |
| (b) | Selfish | Standard | 4.77 | 5.09 | 0.70 | 0.244 |
| (b) | Selfish | Empathic | 4.95 | 4.97 | 0.57 | 0.128 |
| (b) | Altruistic | Standard | 4.58 | 6.05 | 0.75 | 1.405 |
| (b) | Altruistic | Empathic | 5.40 | 5.35 | 1.40 | 0.371 |
| (c) | Selfish | Standard | – | – | – | – |
| (c) | Selfish | Empathic | 4.85 | 5.12 | 0.61 | 0.338 |
| (c) | Altruistic | Standard | 4.85 | 6.45 | 0.87 | 1.341 |
| (c) | Altruistic | Empathic | 5.40 | 5.74 | 1.39 | 0.285 |

Note. Travel distances of the player and others, minimum distances between agents, and inequality of travel distances are shown. Inequality among agents is calculated as the average of the differences in the distance traveled by the agents. Data for situation (c) of selfish others are not listed because agents collided with each other before reaching each destination.

In situation (a) of Figure 6, the standard player walked almost straight to the destination (Figure 9). This is because it predicted that it would obtain the highest reward (i.e., lowest free energy) if it were to walk straight to the destination and simultaneously predicted that the other would pass without colliding with it even if it were to go straight. On the other hand, the empathic player moved in a more circuitous trajectory compared with the standard player. From Table 1, the total travel distance of the empathic player is larger than that of the standard player in situation (a), regardless of the type of other (selfish or altruistic). Conversely, the total travel distance of the other is smaller when the player is the empathic agent than when the player is the standard agent. Moreover, comparing the empathic agent and the standard agent when the other is altruistic for situation (a), the travel distance of the empathic agent increased by 8.2% (from 4.65 to 5.03), whereas the travel distance of the other decreased by 13.7% (from 5.93 to 5.12). These results indicate that the empathic player behaved more altruistically than the standard player. In other words, the empathic player behaved in a way that benefited the other, even to its own detriment. The minimum distance between the player and the other increased by 25.8% (from 1.20 to 1.51), and inequality among agents decreased by 92.2% (from 1.279 to 0.100) in the case of the altruistic other. Therefore the empathic player took a socially desirable action in terms of both safety (minimum distance) and equality. Because the only difference between the empathic and standard players was the reward term for the others (roth), the difference in behavior stems from this term. Motivation to respond to others’ expectations changes the behavior of the player; in other words, the presence of others around the player leads the player to be altruistic.
The evaluation also showed that the behavior of the player changes in accordance with the others around it. The total travel distance of the empathic player when the other was altruistic was larger by 5.5% (increasing from 4.77 to 5.03) than it was when the other was selfish for situation (a). In other words, as explained in section 2.3, differences in the others around the empathic player were reflected in differences in the player’s sociality. Similar quantitative results are observed for situations (b) and (c), as shown in Table 1.
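The percentages quoted for situation (a) follow directly from the entries of Table 1, as the short calculation below shows. The function name is our own; the input values are taken from the table.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Situation (a), altruistic other: standard player -> empathic player.
player_travel = round(percent_change(4.65, 5.03), 1)   # 8.2
other_travel = round(percent_change(5.93, 5.12), 1)    # -13.7
min_distance = round(percent_change(1.20, 1.51), 1)    # 25.8
inequality = round(percent_change(1.279, 0.100), 1)    # -92.2

# Situation (a), empathic player: selfish other -> altruistic other.
by_other_type = round(percent_change(4.77, 5.03), 1)   # 5.5
```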

The trajectories of the agents’ movements in situation (b) when the player is the empathic agent resulted in agents avoiding each other in a circle at the center, as in situation (a). In contrast, the behaviors were more complex when the others were selfish in situation (c). In situation (c), when the player was controlled by the standard active inference and the others were selfish, the player and the others collided. Meanwhile, the player and the others reached their destinations without collision when the others were selfish and the player was an empathic agent. Unlike in the other cases, in this case, the trajectory of the empathic player was almost straight, as shown in Figure 9. The social behavior of the empathic player in this case is expressed not in the final trajectory alone but in the details of the transition in movement. Figure 10 shows the transitions in movement when the others were selfish and the player was controlled by the empathic active inference in situation (c).

Figure 10. 

Transition of movements when others were selfish in situation (c) of Figure 6. Empathic players wait for others to pass. Sociality was found in time of travel.


In this situation, the green other and the orange other first passed each other, while the blue other and the player waited (t1). The blue other then passed the center point, while the player kept waiting (t2, t3). Finally, the player passed the center point (t4). Namely, the empathic player socially behaved like the blue agent, “waiting” for others to pass by. Thus, in this case, sociality was found in the time of travel, not in the trajectory of travel.

From these results, the behavior of the empathic agent adaptively changed according to the surrounding others. In particular, the behavior of the empathic player was similar to that of the surrounding others. This is because the empathic agent responds to the expectations of others, and the expectations of others are predicted as “how would I predict the observed other if I were in his or her situation?” by using the internal model learned from the others’ behavior. Therefore the empathic agent behaves like the others surrounding it. The empathic agent can change its sociality for a given situation without manually changing the reward for the situation. Sociality is automatically adjusted to every scene by the behavior of others.

3.4 Results for Asymmetrical Cases

In the previous section, we showed the behaviors of the agents in ideal situations in which the initial positions and destinations of the agents are perfectly symmetrical. In addition, we assumed that all the others have a single type (characteristic), selfish or altruistic. In this section, we show the behavior of the agent in more realistic situations: the case of asymmetric positions and others with several different characteristics. The settings of the player are the same as in the previous section. As shown in section 3.3, the trajectory of the standard player is almost a straight line, because it does not care about the others at all. Therefore we discuss only the case of the empathic player.

First, we evaluated the behaviors of the agents in the situation shown in Figure 11, in which the agents’ initial positions and destinations are asymmetric. Figure 12 shows the transitions in movement when all of the others were selfish.

Figure 11. 

Asymmetrical initial position and destination case.

Figure 12. 

Transition of movements for the geometrically asymmetric situation with selfish others. Agents pass each other with a small margin.


In Figure 12, the empathic player (red) and the blue other first started to turn to avoid colliding with the green other (t1). The player and the blue other continued to turn (t2), avoided each other by a small margin, and passed each other (t3). Finally, the green other passed the goal (t4). In this case, the minimum distance between agents was 0.42, and the movement of the empathic player was a more forceful trajectory than it was in the ideally symmetric case in Figure 6(b). Meanwhile, Figure 13 shows the different transitions in movement when the others were altruistic in the situation of Figure 11.

Figure 13. 

Transition of movements for the geometrically asymmetric situation with altruistic others. Minimum distances among agents were larger than they were in the case of selfish others.


At first, the empathic player (red) and blue other started to turn to avoid colliding, similarly to the case when others were selfish (t1). The empathic player turned to the left to avoid the blue other, unlike when the others are selfish, and the green other also started to turn (t2). Then all agents moved in one circle, all together (t3). Finally, all agents reached their goals. These results show that the empathic agent can properly move to avoid others around it even in the asymmetrical position cases shown in Figure 11. Moreover, avoidance depends on the surrounding others, and it can be seen that when the surrounding others are altruistic, the agent acts more altruistically to avoid others.

Next, we evaluate the empathic agent’s behavior when the characteristics of surrounding others are asymmetric, as shown in Figure 14.

Figure 14. 

Asymmetrical characteristic cases. Others have different characteristics. In patterns 1 and 2, there were two selfish others and one altruistic other. Placements of others are different in patterns 1 and 2. Similarly, patterns 3 and 4 show the case of one selfish other and two altruistic others.


Figure 14 shows three agents besides the player, and their characteristics are not homogeneous. We consider two cases with respect to the ratio of selfish to altruistic agents, namely, cases with (a) two selfish and one altruistic other and (b) one selfish and two altruistic others. Moreover, we consider two different arrangements of selfish and altruistic agents for each case.

Figure 15 shows the results for each case. In all cases, agents did not collide, and the player and all others moved appropriately to their destinations, avoiding each other. The resulting movement trajectories of the player do not show significant differences from case to case. However, there were differences in the detailed movement histories of the player, depending on the proportion of altruistic others. For cases (a) and (b) in Figure 15, in which the number of selfish others is greater than that of altruistic others, the empathic player reached the destination at a time close to that of the selfish others, with the altruistic agent reaching the destination last. For example, in case (a), the player reached the destination at step 91, by which time the selfish agents (green and orange) had also reached their destinations, while the altruistic agent (blue) had not yet done so. Similarly, in case (b), the player reached the destination at step 74, by which time the selfish agents (blue and orange) had also reached their destinations, while the altruistic agent (green) had not yet done so. On the other hand, for cases (c) and (d) in Figure 15, in which the number of altruistic others is greater than the number of selfish others, the player waited for others to pass near the center to avoid collisions (t2 or t3 in Figures 15(c) and 15(d)) and, as a result, reached the destination last. For example, in cases (c) and (d), the player reached the destination at steps 133 and 124, respectively. These results show that although the movement trajectory of the player itself does not change significantly depending on the characteristics of the surrounding others, differences can be seen in the time of travel, as in Figure 10. In particular, the more altruistic the behavior of the others around the player is, the more altruistically the player acts in terms of time.
In case (a), the player was observed to stop for a while near the goal. Figure 16 shows this behavior.

Figure 15. 

Transition of movements for asymmetrical characteristic cases. Although the trajectory of the player itself does not change significantly depending on the characteristics of the surrounding others, differences can be seen in the time of travel. The more altruistic is the behavior of the others around the player, the more altruistic the player will act in time.

Figure 16. 

Detailed transition of movements for asymmetrical characteristic cases (Figure 15(a)). The empathic player stops for a while, even though no others are in front of the player.


The player had moved to a position beyond the center by step 47 and had already crossed paths with the green and orange agents. There were no others in front of the empathic player. However, the player remained stopped until step 67. This is because the player inferred that the others expected the empathic player to remain stopped at that position. However, this inference may be inappropriate given the positional relationship between the player and the others. Because the proposed method determines actions based on the prediction of others’ expectations, if the prediction is inappropriate, the selected action may also be inappropriate, as in this case. In practical use, it will be important to address the risk of such predictions being wrong.

This article proposed a model of social behavior of artificial agents to realize artificial agents co-living with people in our human society. Social behavior is an action taken in consideration of surrounding others in a situation, and the challenge is to behave adaptively according to the situation. To solve this problem, the proposed method incorporates the mechanism of empathy into the active inference model, which is a behavioral model in cognitive science, to realize adaptive behavior in any situation through empathy with others around the agent.

We evaluated and discussed the social behavior of the proposed empathic agents using simulation results of walking scenes, a typical social behavior scene. Comparison with an agent controlled by standard active inference, which does not consider others around it at all, showed that the behavior of the empathic agent is appropriately social in terms of altruism, safety (minimum distance between agents), and inequality (average difference of travel distances). We also showed that the social behavior of empathic agents changes depending on the given situation by simulating their behavior in different situations, such as when the surrounding others have different characteristics (selfish or altruistic) and in different situations (number of walkers, layout of walkers). These results show that the proposed model can adaptively change its behavior by responding to others around it. When it learned and operated in an environment in which others were selfish, it tended to behave selfishly. Conversely, when it learned in an environment in which others were altruistic, it tended to behave altruistically.

An important point for the future is the control of empathy. In this article, we assume uniform empathy for others around us. However, the generation of more appropriate adaptive behavior requires a mechanism that can appropriately recognize others to empathize with and others for whom to avoid empathy and that can adjust empathy appropriately for each situation and each other. Future work is needed to model such a system based on findings on empathy control in psychology and neuroscience.

Many studies on behavioral models of artificial intelligence (AI), robots, and other agents using active inference have been proposed. For example, active inference has been evaluated for simple control systems (Friston et al., 2018; Parr & Friston, 2017; Pio-Lopez et al., 2016) and for more complex control systems by leveraging advances in deep learning (Çatal et al., 2020; Çatal et al., 2021; Fountas et al., 2020; Millidge, 2020; Tschantz et al., 2020; Ueltzhöffer, 2018). These studies mainly assumed a situation in which there was a single control target and in which no other agents were present. Therefore empathy for others around agents and social behavior as presented in this article is not discussed. In Friston and Frith (2015), active inference is discussed for the case of two agents. In addition, active inference has recently been applied to multiagent cases (Friedman et al., 2021; Kaufmann et al., 2021). The motivations and discussions of these two-agent and multiagent cases are similar to those of this article, and they comprise interesting phenomena, such as the synchronization of agents and the emergence of collective intelligence from the autonomous behavior of individuals by assuming multiple learnable agents, that are not discussed herein. However, the differences in behavior depending on the surrounding others in different situations are not evaluated. Because agents’ social behavior changes in response to others around them, we believe that the simulation results reported in this article for a variety of surrounding situations are useful for the research on social behavior within the literature on active inference.

Although many studies of empathy models and inference models of others’ internal states are not based on active inference, we discuss here those studies that have been applied to robot and AI behavior generation, as in this article. One of the articles that seems most relevant is “Game Theory of Mind” (Yoshida et al., 2008). Although the study is not based on active inference, the inference of others’ mental states and the sociality of agents in a simple, discrete game are discussed therein. In Yoshida et al., bounded rationality, which is not discussed in this article, is incorporated into the model. Incorporating bounded rationality is important because real-time performance is required in real applications, such as walking the robot in a real world. Many other models have been proposed, as shown in a survey paper by Paiva et al. (2017); one reason why so many models have been proposed is that empathy is multidimensional and has many aspects. Therefore, when constructing empathy models, it is also important to consider the multidimensional aspect of empathy, rather than limiting it to one aspect (estimating others’ states by simulating others), as in this article. A model that assumes a dialogue agent has been proposed as considering such a multidimensional approach to empathy (Yalcin & DiPaola, 2018). Yalcin and DiPaola summarize empathy into three categories—communicative ability, emotion regulation, and emotional situation evaluation—and a hierarchical model with the categories is proposed. Although this article implemented one aspect of empathy for active inference, this research should be extended in the future by incorporating different aspects of empathy and related cognitive functions, such as bounded rationality.

In the field of multiagent systems, there have been many studies on the emergence of cooperative behavior (e.g., Hernandez-Leal et al., 2019; Winfield, 2018). In these multiagent systems, it is mainly assumed that other agents are also machines and that agents can explicitly communicate observations, model parameters, and predictions between each other (Foerster et al., 2016; Gupta et al., 2017; Tampuu et al., 2017). In this study, we assumed that the other agents are human, and there is no explicit communication between an artificial agent and human agents. Similar to the situation discussed in this article, social robotics assumes that a robot functioning around humans has sociality and needs to infer the mental states of humans from their behavior. For example, ProxEmo generates a trajectory for movement based on estimates of others’ emotions from their gait behavior (Narayanan et al., 2020). Most of these studies modeled others as agents different from the agent itself, whereas the proposed method models others as the same as the agent.

Adams, R. A., Shipp, S., & Friston, K. J. (2013). Predictions not commands: Active inference in the motor system. Brain Structure and Function, 218(3), 611–643.
Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., & Savarese, S. (2016). Social LSTM: Human trajectory prediction in crowded spaces. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (pp. 961–971). IEEE.
Albarracin, M., Demekas, D., Ramstead, M. J. D., & Heins, C. (2022). Epistemic communities under active inference. Entropy, 24(4), 476.
Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational inference: A review for statisticians. Journal of the American Statistical Association, 112(518), 859–877.
Bowman, S. R., Vilnis, L., Vinyals, O., Dai, A. M., Jozefowicz, R., & Bengio, S. (2016). Generating sentences from a continuous space. In International Conference on Computational Natural Language Learning (pp. 10–21). Association for Computational Linguistics.
Çatal, O., Verbelen, T., Nauta, J., De Boom, C., & Dhoedt, B. (2020). Learning perception and planning with deep active inference. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 3952–3956). IEEE.
Çatal, O., Verbelen, T., Van de Maele, T., Dhoedt, B., & Safron, A. (2021). Robot navigation as hierarchical active inference. Neural Networks, 142, 192–204.
Coulom, R. (2006). Efficient selectivity and backup operators in Monte-Carlo tree search. In International Conference on Computers and Games (pp. 72–83). Springer.
Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44(1), 113–126.
Dayan, P., Hinton, G. E., Neal, R. M., & Zemel, R. S. (1995). The Helmholtz machine. Neural Computation, 7(5), 889–904.
Eisenberg, N., & Miller, P. A. (1987). The relation of empathy to prosocial and related behaviors. Psychological Bulletin, 101(1), 91–119.
Eslami, S. A., Jimenez Rezende, D., Besse, F., Viola, F., Morcos, A. S., Garnelo, M., Ruderman, A., Rusu, A. A., Danihelka, I., Gregor, K., Reichert, D. P., Buesing, L., Weber, T., Vinyals, O., Rosenbaum, D., Rabinowitz, N., King, H., Hillier, C., … Hassabis, D. (2018). Neural scene representation and rendering. Science, 360(6394), 1204–1210.
Fehr, E., & Gächter, S. (2000). Cooperation and punishment in public goods experiments. American Economic Review, 90(4), 980–994.
Fehr, E., & Gächter, S. (2002). Altruistic punishment in humans. Nature, 415(6868), 137–140.
Feldman, H., & Friston, K. (2010). Attention, uncertainty, and free-energy. Frontiers in Human Neuroscience, 4, 215.
Fischbacher, U., Gächter, S., & Fehr, E. (2001). Are people conditionally cooperative? Evidence from a public goods experiment. Economics Letters, 71(3), 397–404.
Foerster, J., Assael, I. A., De Freitas, N., & Whiteson, S. (2016). Learning to communicate with deep multi-agent reinforcement learning. In D. D. Lee, U. von Luxburg, R. Garnett, M. Sugiyama, & I. Guyon (Eds.), NIPS’16: Proceedings of the 30th International Conference on Neural Information Processing Systems (pp. 2145–2153). Curran Associates.
Fountas, Z., Sajid, N., Mediano, P., & Friston, K. (2020). Deep active inference agents using Monte-Carlo methods. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), NIPS’20: Proceedings of the 34th International Conference on Neural Information Processing Systems (pp. 11662–11675). Curran Associates.
Friedman, D. A., Tschantz, A. D. D., Ramstead, M. J. D., Friston, K., & Constant, A. (2021). Active inferants: An active inference framework for ant colony behavior. Frontiers in Behavioral Neuroscience, 15, 647732.
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.
Friston, K. J., Daunizeau, J., Kilner, J., & Kiebel, S. J. (2010). Action and behavior: A free-energy formulation. Biological Cybernetics, 102(3), 227–260.
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., & Pezzulo, G. (2016). Active inference and learning. Neuroscience and Biobehavioral Reviews, 68, 862–879.
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., & Pezzulo, G. (2017). Active inference: A process theory. Neural Computation, 29(1), 1–49.
Friston, K., & Frith, C. (2015). A duet for one. Consciousness and Cognition, 36, 390–405.
Friston, K., Kilner, J., & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology, 100(1–3), 70–87.
Friston, K., Mattout, J., & Kilner, J. (2011). Action understanding and active inference. Biological Cybernetics, 104(1), 137–160.
Friston, K. J., Rosch, R., Parr, T., Price, C., & Bowman, H. (2018). Deep temporal models and active inference. Neuroscience and Biobehavioral Reviews, 77, 388–402.
Gupta, J. K., Egorov, M., & Kochenderfer, M. (2017). Cooperative multi-agent control using deep reinforcement learning. In International Conference on Autonomous Agents and Multiagent Systems (pp. 66–83). Springer.
Helbing, D., & Molnar, P. (1995). Social force model for pedestrian dynamics. Physical Review E, 51(5), 4282.
Henrich, J. (2015). Culture and social behavior. Current Opinion in Behavioral Sciences, 3, 84–89.
Hernandez-Leal, P., Kartal, B., & Taylor, M. E. (2019). A survey and critique of multiagent deep reinforcement learning. Autonomous Agents and Multi-Agent Systems, 33(6), 750–797.
Kaplan, R., & Friston, K. J. (2018). Planning and navigation as active inference. Biological Cybernetics, 112(4), 323–343.
Kaufmann, R., Gupta, P., & Taylor, J. (2021). An active inference model of collective intelligence. Entropy, 23(7), 830.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. ArXiv.
Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. ArXiv.
Knill, D. C., & Pouget, A. (2004). The Bayesian brain: The role of uncertainty in neural coding and computation. Trends in Neurosciences, 27(12), 712–719.
Matsumura, T., Esaki, K., & Mizuno, H. (2022). Empathic active inference: Active inference with empathy mechanism for socially behaved artificial agent. In Artificial Life Conference Proceedings (Vol. 34, p. 18). MIT Press.
Millidge, B. (2020). Deep active inference as variational policy gradients. Journal of Mathematical Psychology, 96, 102348.
Mohamed, A., Qian, K., Elhoseiny, M., & Claudel, C. (2020). Social-STGCNN: A social spatio-temporal graph convolutional neural network for human trajectory prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 14424–14432). IEEE.
Narayanan, V., Manoghar, B. M., Dorbala, V. S., Manocha, D., & Bera, A. (2020). ProxEmo: Gait-based emotion learning and multi-view proxemic fusion for socially-aware robot navigation. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8200–8207). IEEE.
Paiva, A., Leite, I., Boukricha, H., & Wachsmuth, I. (2017). Empathy in virtual agents and robots: A survey. ACM Transactions on Interactive Intelligent Systems, 7(3), 1–40.
Parr, T., & Friston, K. J. (2017). Uncertainty, epistemics and active inference. Journal of the Royal Society Interface, 14(136), 20170376.
Pio-Lopez, L., Nizard, A., Friston, K., & Pezzulo, G. (2016). Active inference and robot control: A case study. Journal of the Royal Society Interface, 13(122), 20160616.
Quattrocki, E., & Friston, K. (2014). Autism, oxytocin and interoception. Neuroscience and Biobehavioral Reviews, 47, 410–430.
Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87.
Tampuu, A., Matiisen, T., Kodelja, D., Kuzovkin, I., Korjus, K., Aru, J., Aru, J., & Vicente, R. (2017). Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE, 12(4), e0172395.
Tschantz, A., Baltieri, M., Seth, A. K., & Buckley, C. L. (2020). Scaling active inference. In 2020 International Joint Conference on Neural Networks (pp. 1–8). IEEE.
Ueltzhöffer, K. (2018). Deep active inference. Biological Cybernetics, 112(6), 547–573.
Vemula, A., Muelling, K., & Oh, J. (2018). Social attention: Modeling attention in human crowds. In 2018 IEEE International Conference on Robotics and Automation (ICRA) (pp. 4601–4607). IEEE.
Winfield, A. F. (2018). Experiments in artificial theory of mind: From safety to story-telling. Frontiers in Robotics and AI, 5, 75.
Wirkuttis, N., & Tani, J. (2021). Leading or following? Dyadic robot imitative interaction using the active inference framework. IEEE Robotics and Automation Letters, 6(3), 6024–6031.
Yalcin, Ö. N., & DiPaola, S. (2018). A computational model of empathy for interactive agents. Biologically Inspired Cognitive Architectures, 26, 20–25.
Yoshida, W., Dolan, R. J., & Friston, K. J. (2008). Game theory of mind. PLoS Computational Biology, 4(12), e1000254.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.