Abstract
This article proposes a method for making an artificial agent behave in a social manner. Although defining proper social behavior is difficult because it differs from situation to situation, an agent following the proposed method adaptively behaves appropriately in each situation by empathizing with the surrounding others. The proposed method is achieved by incorporating empathy into active inference. We evaluated the proposed method on the control of autonomous mobile robots in diverse situations. The evaluation results show that an agent controlled by the proposed method behaved more adaptively and socially than an agent controlled by standard active inference across the diverse situations. In the case of two agents, the agent controlled with the proposed method behaved in a social way that reduced the other agent's travel distance by 13.7% and increased the margin between the agents by 25.8%, even though it increased its own travel distance by 8.2%. Also, the agent controlled with the proposed method behaved more socially when it was surrounded by altruistic others but less socially when it was surrounded by selfish others.
1 Introduction
We humans are social animals, and we are required to behave socially to live in a society. Therefore, artificial agents operating in our daily spaces with us are also required to behave socially. Social behaviors are, in general, behaviors among multiple agents and, in particular, behaviors that are "appropriate" for these multiple agents. The challenge in realizing socially behaving artificial agents is that the appropriateness of social behaviors varies from situation to situation (Henrich, 2015). Although we humans define some appropriate social behaviors as explicit rules, such as laws or standards, we are also required to follow implicit socially appropriate behaviors that are not defined as explicit rules. For example, when using an escalator, in one country it is considered appropriate to stand on the right side, whereas in another country it is considered appropriate to stand on the left side. Appropriate social behavior on escalators varies from country to country and region to region. These social behaviors, which are implicitly required in every situation, can be thought of as manners. Artificial agents that are capable of social behavior must adopt not only behaviors that can be stated as explicit rules but also these implicit social behaviors. However, such implicit socially appropriate behavior is difficult to implement as a predefined program in an artificial agent because it is not explicitly defined as a rule and because it varies from situation to situation.
On the basis of this research motivation, the goal of this article is to develop an artificial agent that is not a predefined program but is capable of learning and adopting appropriate implicit social behavior in diverse situations. We previously proposed a method to realize artificial agents capable of adopting such implicitly desired social behaviors (Matsumura et al., 2022). This article describes the details of the proposed method and reports evaluation results in more diverse situations. The core idea of the proposed method is that the others surrounding us are not only objects of recognition but also sources of social behaviors: we humans can perceive appropriately and act properly in diverse situations thanks to the presence of others.
The proposed method is implemented based on active inference (Adams et al., 2013; Friston et al., 2011, 2016, 2017). Active inference was proposed in the context of the free energy principle (FEP), a hypothesis for understanding the mechanism of a biological agent's cognitive activities (Friston, 2010; Friston et al., 2006). In the FEP, the brain is viewed as a device performing variational Bayesian inference: the brain is explained as always predicting the future and working to decrease the uncertainty of its predictions. Similar ideas have been widely studied in related contexts, such as the Bayesian brain hypothesis (Knill & Pouget, 2004), predictive coding (Rao & Ballard, 1999), and the Helmholtz machine (Dayan et al., 1995). The unique point of active inference is that it explains action as well as perception with a single principle: minimization of variational free energy. Active inference assumes that the human brain contains an internal model for predicting the external environment. The process of perception is explained as minimizing the free energy of the internal model by updating the state of that model. The process of taking action is likewise explained as minimizing the expected free energy of the future under a candidate action. Although active inference has been discussed mainly for single-agent cases, we extend it to multiagent cases. In multiagent cases, we can consider two types of uncertainty, that is, (a) the agent's uncertainty about others and (b) others' uncertainty about the agent. The proposed method determines the agent's action based on both types of uncertainty. By incorporating the second type, an agent controlled with the proposed method (hereinafter, an empathic agent) attempts to act according to others' expectations, which makes it behave in an adaptively socially desirable manner. This is achieved by estimating others' predictions about the agent by simulating others' situations. The empathic agent estimates the predictions of others using the same model that it uses to predict the future of its external environment; when the model is used to predict the predictions of others, its input is changed from the agent's own observation to the others' observations, which the agent also estimates.
We simulated the walking behavior of the proposed empathic agents in the presence of others to evaluate and discuss the social behavior of the proposed method. When we humans walk, we do not walk selfishly toward a destination; rather, we walk with consideration for others around us. Therefore, walking is a good situation for evaluating social behavior, and models such as social long short-term memory (LSTM) (Alahi et al., 2016; Mohamed et al., 2020; Vemula et al., 2018) have been proposed to predict walking in a way that takes sociality into account. From the simulation of a walking scene, the trajectory and travel time can be obtained. We quantitatively discuss the social behavior of the proposed method by comparing the trajectory and travel time of an agent that behaves asocially, that is, selfishly and without considering others around it, with those of an agent based on the proposed method.
We describe the details of active inference and the proposed method in section 2. In section 3, we discuss the social behavior of agents based on the proposed method by comparing it with the behavior of asocial agents. Then, in section 4, we summarize and discuss future work, and in section 5, we discuss related research.
2 Empathic Active Inference
2.1 Active Inference
This section describes the active inference on which the proposed method is based. As mentioned in section 1, active inference is a hypothesis for understanding the mechanisms of action of biological agents (Adams et al., 2013; Friston et al., 2011, 2016, 2017) and was proposed in the context of the FEP (Friston, 2010; Friston et al., 2006). The FEP is widely studied and has been applied to explain many cognitive abilities and phenomena, such as behavior (Friston et al., 2010), planning (Kaplan & Friston, 2018), autism (Quattrocki & Friston, 2014), and attention (Feldman & Friston, 2010). The most basic cognitive mechanisms, perception and action, are also explained as processes for minimizing variational free energy. It seems natural to explain perception as inference that minimizes the uncertainty of the internal model used for the inference. More interestingly, action is also explained by the same principle: in the FEP, action is the process of inference in which biological agents actively act to decrease uncertainty, and the best action is the one expected to decrease uncertainty the most. This process of performing actions is called active inference. Active inference has been studied in a variety of environments (Friston et al., 2018; Pio-Lopez et al., 2016; Parr & Friston, 2017) and has been combined with deep learning for application to more complicated environments, such as robot control (Çatal et al., 2020, 2021; Fountas et al., 2020; Millidge, 2020; Tschantz et al., 2020; Ueltzhöffer, 2018). Although active inference is basically defined for single-agent cases, some works have recently extended it to multiagent cases (Albarracin et al., 2022; Friedman et al., 2021; Kaufmann et al., 2021; Wirkuttis & Tani, 2021). We now mathematically describe the FEP and active inference.
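For reference, the standard formulation of variational free energy from the FEP literature can be written as follows; the notation here is ours and may differ from the article's original equations.

```latex
% Variational free energy F of an internal model with hidden states s,
% observations o, and an approximate posterior (recognition density) q(s):
F \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
  \;=\; \underbrace{D_{\mathrm{KL}}\big[q(s)\,\Vert\,p(s \mid o)\big]}_{\text{inference error}}
  \;-\; \underbrace{\ln p(o)}_{\text{log evidence}}
```

In this view, perception corresponds to minimizing F with respect to q(s), and action corresponds to choosing the action whose expected free energy over future observations is lowest.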
2.2 Active Inference With Empathy Mechanism
We extended active inference to generate social behavior. For this purpose, we embed the human capacity for empathy into active inference. There are two main types of empathy: (a) cognitive empathy and (b) emotional (or affective) empathy (Davis, 1983). Cognitive empathy is the ability to infer another's mental state. Emotional empathy is the ability to feel what others are feeling as if it were one's own feeling. For either type, understanding or sharing the experiences or feelings of others is thought to be related to our sociality (Eisenberg & Miller, 1987). Theory of mind and simulation theory are theoretical frameworks for understanding the mechanisms of these capacities. According to simulation theory, we can infer another's mental state by simulating what we would infer or feel if we were in the other's situation. Inspired by simulation theory, our proposed method enables an agent to empathize with others by virtually experiencing the experiences of others with its internal models, as if simulating others' mental states with its own body. Figure 2 shows the core idea of the proposed method, named empathic active inference.
Figure 2 shows two agents, I and other, both acting with active inference. The agent I always infers the future external environment and attempts to decrease the uncertainty of the inference. Importantly, the others surrounding I likewise always infer the future external environment and attempt to decrease the uncertainty of their inference; I is an uncertain factor for the others. Given this situation, there is a way to decrease free energy beyond those described in the previous section on active inference: act to decrease others' uncertainty. Although this does not decrease I's own free energy, it decreases the total free energy of the group of agents. Similarly, the free energy of I can decrease through the actions of the others if they act to decrease the free energy of I. By thinking of free energy collectively rather than individually, we can reduce free energy not only through our own actions but also through the actions of others. Actions based on this idea can be said to be for others, that is, social actions. In this article, we refer to an agent that acts in this way as an empathic agent.
An empathic agent first needs to infer another's inference about it. For this purpose, the empathic agent uses an idea inspired by simulation theory. The core idea for inferring another's inference is that the other's inference is inferred using the same internal model used for inferring the empathic agent's external environment. Inference of another's inference consists of two processes, as illustrated in Figure 3.
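A minimal sketch of these two processes is given below. The function names, the simple translation-only coordinate transform, and the model interface are hypothetical illustrations of the idea, not the article's implementation; the actual processes are those shown in Figure 3.

```python
import numpy as np

def infer_others_inference(model, my_obs, my_pos, other_pos):
    """Infer another agent's inference using the agent's OWN internal model.

    Process 1: estimate what the other observes by re-expressing the scene
    relative to the other's position (a pure translation here, ignoring
    heading; the paper estimates others' observations from the agent's own
    observation).
    Process 2: feed that estimated observation into the same internal model
    the agent uses for itself.
    """
    # Process 1: shift the frame of reference from me to the other.
    others_obs = my_obs + (my_pos - other_pos)
    # Process 2: the same model, different input -> the other's inference.
    return model.predict(others_obs)
```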
The first term of Equation 6 is the reward for the agent's own goal (rmy), which is determined by the observed information and is the same as the reward in Equation 5. In Equation 6, a reward term for the expectations of others (roth) is added; it is determined by the observation (ot) and by the others' expectations of the agent, which the agent itself estimates. The expectations of others are derived from the inferred inferences of others. For example, if the positions of the surrounding agents are estimated as an other's inference, the position of the empathic agent as inferred by that other constitutes the other's expectation of the agent. Similar to the flexibility in encoding rewards in active inference (Millidge, 2020), expectations from others can be encoded flexibly, such as with probability distributions. For multiple others, the reward for each other (i) is summed. The proposed agent takes the action that will most decrease the expected free energy described in Equation 6. Importantly, another's inference about the empathic agent is interpreted as an expectation of the empathic agent held by that other. An artificial agent controlled by the proposed method does not act only according to its own objectives; it also takes into account the expectations of others around it. In other words, empathic agents are capable of taking social action, that is, action that takes others into account; in particular, they realize the appropriate social action required in each situation as the action that others expect them to take.
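Under these definitions, the action evaluation of Equation 6 might be sketched as follows. We assume the convention stated in section 3.2 that higher rewards make the expected free energy smaller; the exact decomposition into an uncertainty term plus reward terms is our assumption, not the article's equation.

```python
def empathic_expected_free_energy(uncertainty_terms, r_my, r_oth_per_other):
    """Sketch of Equation 6's structure: the expected free energy of one
    candidate action combines the agent's own goal reward (r_my, as in
    Equation 5) and a reward for meeting each other's expectation
    (r_oth, summed over all others i). Lower values are preferred."""
    return uncertainty_terms - r_my - sum(r_oth_per_other)

def select_best_action(candidates):
    """The proposed agent takes the action that most decreases the expected
    free energy. candidates: {action: (uncertainty, r_my, [r_oth_i, ...])}."""
    return min(candidates,
               key=lambda a: empathic_expected_free_energy(*candidates[a]))
```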
2.3 Ideal Behaviors in Simple Game
In this section, we illustrate the ideal expected behavior of an agent based on the proposed method in a simple game: the public goods game, widely used in game theory. As mentioned in section 1, the definition of socially appropriate behavior depends on the situation; in the public goods game, however, the social appropriateness of behavior can be discussed quantitatively through the total amount of public goods. In the following explanation, the agent that acts according to the proposed method is referred to as the empathic agent. The public goods game is a virtual game for examining the sociality of agents and is illustrated in Figure 4.
There are multiple players; the case of N = 4 players is illustrated in Figure 4. Each player decides how much of his or her goods (property) to offer to the public. Goods provided by the players are multiplied by k, and the result (r) is returned equally to all players. The multiplier k is greater than 1 and less than the number of players, N. These are the settings of the public goods game.
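As a concrete sketch of these game mechanics (the unit endowment per player is our normalization; the article does not specify amounts):

```python
def public_goods_payoffs(contributions, k, endowment=1.0):
    """One round of the public goods game: each of the N players offers
    part of their goods; offers are multiplied by k (1 < k < N) and the
    resulting pot is returned equally to all players."""
    n = len(contributions)
    share = k * sum(contributions) / n     # r: equal return to everyone
    return [endowment - c + share for c in contributions]

# With k = 2 and N = 4: if all contribute fully, everyone ends with 2.0;
# a lone free-rider among full contributors ends with 2.5 while the
# contributors end with 1.5 -- the free-rider dilemma discussed below.
print(public_goods_payoffs([1, 1, 1, 1], k=2))  # [2.0, 2.0, 2.0, 2.0]
print(public_goods_payoffs([0, 1, 1, 1], k=2))  # [2.5, 1.5, 1.5, 1.5]
```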
When all players provide all goods to the public, the total goods of all players are maximized. Therefore, in terms of the total quantity of goods, the most socially desirable behavior is for all players to provide all goods to the public. However, each player's personal goods are maximized when he or she offers nothing, because a player who offers nothing to the public benefits from others without reducing his or her own goods at all. A player who provides no goods at all is called a free-rider. To maximize the benefits to society, it is important to reduce the presence of free-riders; stated differently, in the public goods game, the appropriateness of social behavior can be discussed in terms of how altruistic it is. Experimental results with human subjects have shown that free-riders are rarely observed in practice and that humans provide some amount of goods (Fischbacher et al., 2001). In other words, humans do not act only to maximize their own goods but also act in altruistic ways that benefit others. Many discussions and experiments help explain this behavior; for example, punishment of free-riders encourages social behavior (Fehr & Gächter, 2000, 2002).
If an empathic agent can ideally predict the expectations of others, the empathic agent's behavior can also explain human social behavior in public goods games. In Equation 6, the reward for each player is the amount of goods. Assuming the ideal situation in which the empathic agent can perfectly predict the behavior of others, the KL term in Equation 6 can be ignored. We examine the behavior of the empathic agent in two cases: one in which the three players other than the empathic agent are completely selfish and one in which they are completely altruistic. A completely selfish player offers nothing to the public; that is, the player behaves as a free-rider. Conversely, a completely altruistic player offers all of his or her goods to the public.
First, we examine the behavior of the empathic agent when the other players are completely selfish. In this case, if the empathic agent predicts that the others will behave perfectly selfishly, it also predicts that the other players will predict that it, too, will behave perfectly selfishly. This is because the empathic agent uses the same internal model to predict the behavior of others and to predict the others' predictions of its own behavior. The behavior that best meets the expectations of others is then the completely selfish behavior of offering nothing, and the behavior that maximizes the agent's own reward is likewise to offer nothing to the public. Therefore, the reward-maximizing behavior for both self and others is to offer nothing, and in this case the empathic agent behaves as a free-rider, like the others surrounding it.
Second, we examine the behavior of the empathic agent when the other players are completely altruistic. In this case, as in the previous discussion, the behavior that maximizes the empathic agent's own reward is to offer nothing to the public. Conversely, the empathic agent predicts that all others will behave perfectly altruistically and therefore that all others expect the empathic agent to behave perfectly altruistically as well, namely, to offer all of its goods to the public. Thus, as the average of the offering that maximizes its own reward (zero) and the offerings expected by the N − 1 others (all), the empathic agent offers (N − 1)/N of its total goods to the public.
From the same discussion, as shown in Figure 5, the amount offered to the public by the empathic agent increases with an increase in the ratio of completely altruistic players in the surroundings.
The same discussion holds not only for an increase in the ratio of completely altruistic players but also for an increase in the amount each player offers to the public. As the foregoing discussion shows, the proposed empathic agent consequently behaves similarly to the others around it. Thus, if we take the amount offered to the public as an index of sociality, the sociality of the empathic agent changes adaptively according to the sociality of the surrounding others. Under the proposed method, then, people behave altruistically in public goods games because the others around them in the environment in which they were raised were altruistic. This behavior realizes the idea of implicitly required social behaviors considered in this article, namely, the adoption of appropriate implicit social behaviors in given situations.
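Under the idealized assumptions of this subsection (perfect prediction, KL term ignored, and the averaging rule described above), the empathic agent's offer can be sketched as a function of the surrounding players' offers. The mirroring assumption (each other expects the agent to offer what that other offers itself) is our reading of the discussion above.

```python
def empathic_offer(others_offers):
    """Idealized offer of the empathic agent: the average of the offer
    maximizing its own reward (0) and the offers the N-1 others are
    predicted to expect of it (each other expects the agent to behave
    as that other does)."""
    n = len(others_offers) + 1        # N players, including the agent
    return (0.0 + sum(others_offers)) / n

print(empathic_offer([0.0, 0.0, 0.0]))  # 0.0: free-rides among the selfish
print(empathic_offer([1.0, 1.0, 1.0]))  # 0.75 = (N-1)/N among the altruistic
```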
3 Evaluation
3.1 Simulation Setup
To evaluate and discuss the social behaviors of the proposed empathic agent, we ran multiagent simulations in which multiple agents walked from their initial positions to their destinations. The reason for simulating walking scenes is that human walking is a social behavior (Alahi et al., 2016; Mohamed et al., 2020; Vemula et al., 2018) and that behavioral comparisons can be discussed quantitatively through travel times and trajectories. Social behaviors are behaviors among others; conversely, asocial behaviors are behaviors in which others are not considered at all. Therefore, the social behavior of the proposed empathic agents can be discussed by comparing it with behavior in which surrounding others are not considered at all. For this comparison, we also simulated the behavior of an agent controlled by standard active inference in addition to the proposed empathic agent.
In addition, we simulated several cases in which the altruism of the others surrounding the empathic agent varied or the number of walkers varied. This confirms that the social behavior considered in this article is not simply altruistic behavior, as discussed in section 2.3, but altruistic behavior that is adaptively adjusted according to the altruism of the others around the agent. Two types of conditions were varied, depending on the scenario. The first condition is the situation of the scenario, namely, the number of walkers and their starting and ending points. Three ideally symmetrical situations were assumed for the simulation, as illustrated in Figure 6.
Moreover, asymmetric situations were also simulated to evaluate more realistic settings, as discussed in section 3.4. In this section, the standard/empathic active inference agent is called the "player," and the other agents are simply called the "others" or the "other." The player is the red dot in Figure 6. Situation (a) is the simplest: Two agents, the player and one other, walk from their initial points to their destinations. Because their initial points and destinations are opposite each other, the player and the other must take a nonshortest path (a nonstraight line between the initial point and the destination) to avoid colliding. Situation (b) is denser than situation (a): the player plus two others walk and cross one another at the center of the field. Situation (c) is the densest: the player plus three others walk in a crossroad pattern and pass one another at the center of the field. No obstacles (i.e., walls) are present in any situation, and agents can walk in any area of the field.
The second condition is the type (characteristic) of the others, that is, selfish or altruistic. The type is controlled by tuning the parameters of the model that controls the others' walking. The others were controlled by the social force model (SFM) (Helbing & Molnar, 1995), a model for controlling pedestrians in a social space. Although there are various extensions, the SFM basically models the motion of agents as a combination of driving and repulsive forces. The driving force describes the motivation of an agent to move toward the given goal at a certain desired velocity. The repulsive force represents the motivation of an agent to avoid colliding with others or with obstacles, such as walls. When the others are selfish, the weight of the driving force toward the destination is set higher than that of the repulsive force from others; when the others are altruistic, the weight of the repulsive force is set higher than that of the driving force. Example trajectories of each type of other (green) are illustrated in Figure 7.
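A minimal sketch of the basic SFM of Helbing and Molnar (1995) is shown below. The weights w_drive and w_rep are the knobs that, per the text, distinguish selfish from altruistic others; all parameter names and default values here are illustrative assumptions.

```python
import numpy as np

def social_force(pos, vel, goal, others_pos,
                 v_desired=1.3, tau=0.5, A=2.0, B=0.3,
                 w_drive=1.0, w_rep=1.0):
    """Basic social force model: a driving force toward the goal plus
    exponential repulsion from other pedestrians (no walls here).
    Selfish others: w_drive > w_rep; altruistic others: w_rep > w_drive."""
    # Driving force: relax toward the desired velocity along the goal direction.
    e_goal = (goal - pos) / np.linalg.norm(goal - pos)
    f_drive = (v_desired * e_goal - vel) / tau
    # Repulsive forces: decay exponentially with the distance to each other agent.
    f_rep = np.zeros(2)
    for p in others_pos:
        d = pos - p
        dist = np.linalg.norm(d)
        f_rep += A * np.exp(-dist / B) * (d / dist)
    return w_drive * f_drive + w_rep * f_rep
```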
In the example of Figure 7, the player (red) moves in a straight line toward the destination, and the other takes a nonshortest path to avoid colliding with the player in both cases. The difference between the types of other appears in the margin used to avoid collision: when the other is altruistic, it walks with a larger margin from the player than when it is selfish. In the evaluation of the symmetric cases, all of the others are assumed to have the same type (characteristic); for example, in the situation of Figure 6(c) with three others, assuming that the others are selfish means that all three are selfish. In the evaluation of the asymmetric cases discussed in section 3.4, we also show the case in which the three others have different types. In summary, we evaluate and discuss the social behavior of the proposed empathic agent from two viewpoints: (a) comparing the behavior of the empathic agent with that of an asocial agent (standard active inference) that does not consider others at all and (b) examining how its behavior changes with differences in the surrounding situation.
3.2 Setup for Empathic Agent Model
The observation of the player is the position of the agents relative to the player's current position. The observation is constructed from two simulation steps, that is, [ot−1, ot]. As an ideal case of observation, we assume no missing data caused by occlusions and no observation noise. The action space is discrete: the player moves a constant distance in one of five directions (−60°, −30°, 0°, 30°, 60°) relative to its current heading. The constant distance of the player's movement is almost the same as that of the others. In addition, the player can select a NOP action, that is, stopping at the current position. In total, six actions constitute the player's action space, and each action is encoded as a one-hot vector.
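The action space can be sketched as follows; the step length value is an assumption (the text says only that it is constant and comparable to the others' step).

```python
import numpy as np

STEP = 0.1                        # constant move distance (value assumed)
ANGLES = [-60, -30, 0, 30, 60]    # degrees relative to the current heading

def action_to_onehot(idx, n_actions=6):
    """Actions 0-4: move STEP in one of five directions; action 5: NOP."""
    v = np.zeros(n_actions)
    v[idx] = 1.0
    return v

def apply_action(pos, heading_deg, idx):
    """Advance the player's position; index 5 stops at the current position."""
    if idx == 5:                  # NOP
        return pos
    theta = np.deg2rad(heading_deg + ANGLES[idx])
    return pos + STEP * np.array([np.cos(theta), np.sin(theta)])
```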
Two probability densities were assumed in the evaluation, and they were modeled with simple neural networks. The models are illustrated in Figure 8.
In this evaluation, we do not model the density P(at|st) with a neural network, although it is modeled as a policy network in previous works (Fountas et al., 2020; Millidge, 2020), because the action of the player is determined by the action probability P(at|st) described in Equation 3 together with Equation 6. The neural networks were trained in an offline manner because the aim was to evaluate the sociality of the empathic agent's behavior, not to validate the feasibility of online training. Therefore, the training data for each scenario, that is, three situations and two characteristics of others, were generated in advance with a randomly acting player. During training-data generation, the player and one of the others were interchanged at random to give the player experience of the others' viewpoints; this is necessary because the player must infer the others' inferences from the others' viewpoints, and a player with no experience of the others' viewpoints cannot infer the inferences of others. One million samples were generated for each scenario. Adam was used for optimizing the neural networks' parameters (Kingma & Ba, 2014), with the learning rate constantly set to 1e-4. Training proceeded for 10,000 epochs. KL vanishing is a known difficulty in training VAE models, and KL annealing has been proposed to tackle this problem (Bowman et al., 2016). In our learning process, as KL annealing, the weight of the KL loss starts at 1e-4 and is doubled every 1,000 epochs.
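The KL annealing schedule described above amounts to the following; the cap at 1.0 is our assumption, as the text specifies only the starting weight and the doubling interval.

```python
def kl_weight(epoch, start=1e-4, period=1000):
    """Weight of the KL loss: starts at 1e-4 and doubles every 1,000
    epochs, as KL annealing (Bowman et al., 2016). Capped at 1.0 here
    as a guard; the cap itself is an assumption."""
    return min(1.0, start * 2 ** (epoch // period))
```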
In the simulations, the player estimated the expected free energy described in Equation 6 by Monte Carlo tree search (MCTS) with the learned models (Coulom, 2006; Fountas et al., 2020). The learned models were used for predicting the future state and the value (i.e., expected free energy) of each action in the MCTS. The maximum depth of the tree was set to 3 (i.e., at most three time steps are estimated), and the search was run for 3,000 iterations at each time step. The action of the player is determined by Equation 3 with the estimated expected free energy for each action; the precision weight (γ) in Equation 3 is set to 1. The behavior of an agent controlled with standard active inference (hereafter, the standard agent) was also evaluated for comparison. Although the standard agent likewise predicts the future environment using internal models, it does not take into account expectations from others. The reward (rmy) in Equation 6 is defined by how close the agent is to the destination: if the agent moves closer to the goal, a positive reward is returned, and the expected free energy is smaller than if it moves away. Moreover, the empathic player (i.e., the empathic agent) estimates the reward of its future state with respect to others' expectations, roth, in Equation 6; this reward is defined by the distance between the player's position and the positions expected by the others.
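The final action selection can be sketched as a precision-weighted softmax over the negated expected free energies estimated by MCTS. This softmax form is the convention in deep active inference work (e.g., Fountas et al., 2020); we assume Equation 3 takes this shape.

```python
import numpy as np

def select_action(efe_per_action, gamma=1.0, rng=np.random.default_rng(0)):
    """Sample an action: lower expected free energy (from MCTS, depth 3,
    3,000 iterations per step) means exponentially higher probability;
    gamma is the precision weight (set to 1 in the evaluation)."""
    logits = -gamma * np.asarray(efe_per_action, dtype=float)
    p = np.exp(logits - logits.max())     # numerically stable softmax
    p /= p.sum()
    return rng.choice(len(p), p=p)
```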
3.3 Results for Ideal Symmetrical Cases
We now discuss the social behavior of the proposed empathic agent. As described in previous sections, social behavior is behavior among others, and we discuss it by comparing the behavior of the empathic player with that of the standard player controlled by standard active inference. Although it is difficult to define the appropriateness of social behavior, the minimum distance between agents can be considered an indicator from the point of view of safety (avoiding collisions) while walking: the larger the minimum distance, the more socially desirable the walking. Moreover, from the point of view of equality, the average difference in distance traveled by the walkers can be considered an indicator: the smaller the average difference, the more socially desirable the walking. Therefore, we evaluated the minimum distance and the average difference (inequality) among the walkers. We also discuss the altruism of the empathic players based on the differences in travel distance between the empathic and standard players, as well as on differences in travel distance arising from differences in the surrounding others' characteristics (selfish or altruistic). The behaviors of the agents are shown in Figure 9 and Table 1.
| Method | Travel distance (player) | Travel distance (average of others) | Minimum distance | Inequality (average difference) |
|---|---|---|---|---|
| Situation (a) | | | | |
| Selfish | | | | |
| Standard | 4.68 | 5.04 | 0.51 | 0.358 |
| Empathic | 4.77 | 4.80 | 0.81 | 0.027 |
| Altruistic | | | | |
| Standard | 4.65 | 5.93 | 1.20 | 1.279 |
| Empathic | 5.03 | 5.12 | 1.51 | 0.100 |
| Situation (b) | | | | |
| Selfish | | | | |
| Standard | 4.77 | 5.09 | 0.70 | 0.244 |
| Empathic | 4.95 | 4.97 | 0.57 | 0.128 |
| Altruistic | | | | |
| Standard | 4.58 | 6.05 | 0.75 | 1.405 |
| Empathic | 5.40 | 5.35 | 1.40 | 0.371 |
| Situation (c) | | | | |
| Selfish | | | | |
| Standard | – | – | – | – |
| Empathic | 4.85 | 5.12 | 0.61 | 0.338 |
| Altruistic | | | | |
| Standard | 4.85 | 6.45 | 0.87 | 1.341 |
| Empathic | 5.40 | 5.74 | 1.39 | 0.285 |

Note. Travel distances of the player and of the others, minimum distances between agents, and inequality of travel distances are shown. Inequality among agents is calculated as the average of the differences in the distances traveled by the agents. Data for the standard agent in situation (c) with selfish others are not listed because the agents collided with each other before reaching their destinations.
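For concreteness, the two indicators can be computed as in the sketch below; interpreting the "average difference" as the mean absolute pairwise difference reproduces the two-agent values in Table 1 up to rounding (e.g., |4.68 − 5.04| = 0.36 vs. the tabulated 0.358).

```python
import numpy as np
from itertools import combinations

def minimum_distance(trajectories):
    """Smallest distance between any pair of agents over all time steps.
    trajectories: array of shape (n_agents, n_steps, 2)."""
    n = len(trajectories)
    return min(np.linalg.norm(trajectories[i] - trajectories[j], axis=1).min()
               for i, j in combinations(range(n), 2))

def inequality(travel_distances):
    """Average absolute pairwise difference in distance traveled."""
    pairs = list(combinations(travel_distances, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

print(round(inequality([4.68, 5.04]), 3))  # 0.36 (Table 1 lists 0.358)
```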
From the results for situation (a) in Figure 6, the standard player walked almost straight to the destination. This is because it predicted that it would obtain the highest reward (i.e., the lowest free energy) by walking straight to the destination while also predicting that the other would pass without colliding with it even if it went straight. The empathic player, on the other hand, moved along a more circuitous trajectory than the standard player. From Table 1, the total travel distance of the empathic player is larger than that of the standard player in situation (a), regardless of the type of other (selfish or altruistic), whereas the total travel distance of the other is smaller when the player is the empathic agent than when the player is the standard agent. Comparing the empathic agent and the standard agent when the other is altruistic in situation (a), the travel distance of the empathic agent increased by 8.2% (from 4.65 to 5.03), whereas the travel distance of the other decreased by 13.7% (from 5.93 to 5.12).

From these results, the empathic player behaved more altruistically than the standard player; in other words, it behaved in a way that benefited the other, even to its own detriment. The minimum distance between the player and the other increased by 25.8% (from 1.20 to 1.51), and the inequality among the agents decreased by 92.2% (from 1.279 to 0.100) in the case of the altruistic other. Therefore, the empathic player took socially desirable action in terms of both safety (minimum distance) and equality. Because the only difference between the empathic and standard players is the reward term for the others (roth), the difference in behavior stems from this term: the motivation to respond to others' expectations changes the behavior of the player. In other words, the others surrounding the player lead the player to be altruistic.

The evaluation also showed that the behavior of the player changes in accordance with the others around it. The total travel distance of the player when the others were altruistic was 5.5% larger (increasing from 4.77 to 5.03) than when the others were selfish in situation (a). In other words, as explained in section 2.3, differences in the others around the empathic player were reflected in differences in the player's sociality. Similar quantitative results are observed for situations (b) and (c), as shown in Table 1.
In situation (b), when the player was the empathic agent, the trajectories resulted in the agents avoiding one another in a circle at the center, as in situation (a). The behaviors were more complex when the others were selfish in situation (c): when the player was controlled by standard active inference and the others were selfish, the player and the others collided, whereas when the player was an empathic agent, the player and the others all reached their destinations without collision. Unlike in the other cases, the trajectory of the empathic player in this case was almost straight, as shown in Figure 9. The social behavior of the empathic player here is not expressed in the final trajectory alone but in the details of the transitions in movement. Figure 10 shows the transitions in movement when the others were selfish and the player was controlled by empathic active inference in situation (c).
In this situation, the green other and the orange other first passed each other while the blue other and the player waited (t1). The blue other then passed the center point while the player kept waiting (t2, t3). Finally, the player passed the center point (t4). That is, the empathic player behaved socially like the blue other, "waiting" for others to pass by. In this case, sociality was thus found in the timing of travel, not in the trajectory of travel.
From these results, the behavior of the empathic agent adaptively changed according to the surrounding others; in particular, the behavior of the empathic player was similar to that of the surrounding others. This is because the empathic agent responds to the expectations of others, and the expectations of others are predicted as "how would I predict the observed other if I were in his or her situation?" using the internal model learned from the others' behavior. Therefore, the empathic agent behaves like the others surrounding it. The empathic agent can change its sociality for a given situation without any manual change of the reward for that situation: sociality is automatically adjusted to every scene by the behavior of others.
3.4 Results for Asymmetrical Cases
In the previous section, we showed the behaviors of the agents in ideal situations in which the initial positions and destinations of the agents were perfectly symmetrical and in which all the others had a single type (characteristic), selfish or altruistic. In this section, we show the behavior of the agent in more realistic situations: asymmetric positions and others with several different characteristics. The settings of the player are the same as in the previous section. As shown in section 3.3, the trajectory of the standard player is almost a straight line because it does not consider the others at all; therefore, we discuss only the case of the empathic player.
First, we evaluated the behaviors of the agents in the situation shown in Figure 11, in which the agents' initial positions and destinations are asymmetric. Figure 12 shows the transitions in movement when all of the others were selfish.
In Figure 12, the empathic player (red) and the blue other started to turn to avoid colliding with the green other (t1). The player and the blue other continued to turn (t2), avoided each other by a small margin, and passed each other (t3). Finally, the green other reached the goal (t4). In this case, the minimum distance between agents was 0.42, and the empathic player's trajectory was more forceful than in the ideally symmetric case of Figure 6(b). Meanwhile, Figure 13 shows the different transitions in movement when the others in Figure 11 were altruistic.
At first, the empathic player (red) and the blue other started to turn to avoid colliding, similarly to the case when the others were selfish (t1). Unlike in the selfish case, the empathic player turned to the left to avoid the blue other, and the green other also started to turn (t2). Then all agents moved in one circle together (t3), and finally all agents reached their goals. These results show that the empathic agent can properly move to avoid the others around it even in the asymmetric position cases shown in Figure 11. Moreover, the avoidance depends on the surrounding others: when the surrounding others are altruistic, the agent acts more altruistically to avoid them.
Next, we evaluate the empathic agent’s behavior when the characteristics of surrounding others are asymmetric, as shown in Figure 14.
Figure 14 shows three agents besides the player, and their characteristics are not homogeneous. We consider two cases with respect to the ratio of selfish to altruistic others, namely, two selfish others and one altruistic other (cases (a) and (b) in Figure 14) and one selfish other and two altruistic others (cases (c) and (d)). The two cases for each ratio correspond to two different arrangements of the selfish and altruistic agents.
Figure 15 shows the results for each case. In all cases, the agents did not collide, and the player and all others moved appropriately to their destinations while avoiding one another. The resulting movement trajectories of the player do not show significant differences from case to case. However, there were differences in the detailed movement histories of the player, depending on the proportion of altruistic others.

For cases (a) and (b) in Figure 14, in which the number of selfish others is greater than that of altruistic others, the empathic player reached the destination at a time closer to that of the selfish others, with the altruistic agent reaching the destination last. For example, in case (a), the player reached the destination at step 91, by which time the selfish others (green and orange) had also reached their destinations, while the altruistic other (blue) had not. Similarly, in case (b), the player reached the destination at step 74, by which time the selfish others (blue and orange) had also reached their destinations, while the altruistic other (green) had not. On the other hand, for cases (c) and (d) in Figure 14, in which the number of altruistic others is greater than the number of selfish others, the player waited for the others to pass near the center to avoid collisions (t2 or t3 in Figures 15(c) and 15(d)) and, as a result, reached the destination last: in cases (c) and (d), the player reached the destination at steps 133 and 124, respectively.

These results show that although the movement trajectory of the player does not change significantly depending on the characteristics of the surrounding others, differences can be seen in the timing of travel, as in Figure 10. In particular, the more altruistically the others around the player behave, the more altruistically the player acts in time. In case (a), the player also exhibited the behavior of stopping for a while near the goal; Figure 16 shows this behavior.
The player had moved to a position beyond the center by step 47 and had already crossed paths with the green and orange agents; there were no others in front of the empathic player. However, the player remained stopped until step 67. This is because the player inferred that the others expected it to remain stopped at that position. This inference may be inappropriate given the positional relationship between the player and the others. Because the proposed method determines actions based on predictions of others' expectations, an inappropriate prediction can lead to an inappropriate action, as in this case. In practical use, it will be important to address the risk of such predictions being wrong.
4 Conclusion
This article proposed a model of social behavior for artificial agents, aiming at artificial agents that co-live with people in our human society. Social behavior consists of actions taken in consideration of the surrounding others in a situation, and the challenge is to behave adaptively according to that situation. To solve this problem, the proposed method incorporates a mechanism of empathy into active inference, a behavioral model from cognitive science, to realize adaptive behavior in any situation through empathy with the others around the agent.
We evaluated and discussed the social behavior of the proposed empathic agents using simulation results of walking scenes, a typical setting for social behavior. Comparison with an agent controlled by standard active inference, which does not consider others around it at all, showed that the behavior of the empathic agent is appropriately social in terms of altruism, safety (minimum distance between agents), and equality (average difference of travel distances). We also showed that the social behavior of empathic agents changes depending on the given situation by simulating their behavior under different conditions, such as surrounding others with different characteristics (selfish or altruistic) and different situations (number and layout of walkers). These results show that the proposed model can adaptively change its behavior by responding to the others around it: when it learned and operated in an environment in which others were selfish, it tended to behave selfishly; conversely, when it learned in an environment in which others were altruistic, it tended to behave altruistically.
An important direction for the future is the control of empathy. In this article, we assumed uniform empathy toward the others around the agent. However, generating more appropriate adaptive behavior requires a mechanism that can recognize which others to empathize with and which not to empathize with, and that can adjust the degree of empathy appropriately for each situation and each other. Future work is needed to model such a system based on findings on empathy control in psychology and neuroscience.
5 Related Work
Many studies on behavioral models of artificial intelligence (AI), robots, and other agents using active inference have been proposed. For example, active inference has been evaluated for simple control systems (Friston et al., 2018; Parr & Friston, 2017; Pio-Lopez et al., 2016) and for more complex control systems by leveraging advances in deep learning (Çatal et al., 2020, 2021; Fountas et al., 2020; Millidge, 2020; Tschantz et al., 2020; Ueltzhöffer, 2018). These studies mainly assumed a single control target with no other agents present; therefore, empathy for the others around an agent and the social behavior presented in this article are not discussed. In Friston and Frith (2015), active inference is discussed for the case of two agents, and active inference has recently been applied to multiagent cases (Friedman et al., 2021; Kaufmann et al., 2021). The motivations and discussions of these two-agent and multiagent studies are similar to those of this article, and they cover interesting phenomena not discussed herein, such as the synchronization of agents and the emergence of collective intelligence from the autonomous behavior of individuals under the assumption of multiple learnable agents. However, they do not evaluate how behavior differs depending on the surrounding others in different situations. Because agents' social behavior changes in response to the others around them, we believe that the simulation results reported in this article for a variety of surrounding situations are useful for research on social behavior within the active inference literature.
Although many studies of empathy models and of inference models of others' internal states are not based on active inference, we discuss here those that have been applied to robot and AI behavior generation, as in this article. One of the most relevant is "Game Theory of Mind" (Yoshida et al., 2008). Although that study is not based on active inference, it discusses the inference of others' mental states and the sociality of agents in a simple, discrete game. Yoshida et al. incorporate bounded rationality, which is not discussed in this article, into their model. Incorporating bounded rationality is important because real-time performance is required in real applications, such as operating a robot in the real world. Many other models have been proposed, as shown in the survey by Paiva et al. (2017); one reason so many models exist is that empathy is multidimensional and has many aspects. Therefore, when constructing empathy models, it is also important to consider the multidimensional aspects of empathy rather than limiting attention to one aspect (estimating others' states by simulating others), as in this article. A model that assumes a dialogue agent has been proposed with such a multidimensional approach to empathy (Yalcin & DiPaola, 2018): Yalcin and DiPaola organize empathy into three categories, namely, communicative ability, emotion regulation, and emotional situation evaluation, and propose a hierarchical model based on these categories. Although this article implemented one aspect of empathy for active inference, this research should be extended in the future by incorporating different aspects of empathy and related cognitive functions, such as bounded rationality.
In the field of multiagent systems, there have been many studies on the emergence of cooperative behavior (e.g., Hernandez-Leal et al., 2019; Winfield, 2018). These systems mainly assume that the other agents are also machines and that agents can explicitly communicate observations, model parameters, and predictions with one another (Foerster et al., 2016; Gupta et al., 2017; Tampuu et al., 2017). In this study, we assumed that the other agents are humans and that there is no explicit communication between the artificial agent and the human agents. Similarly to the situation discussed in this article, social robotics assumes that a robot operating around humans has sociality and needs to infer the mental states of humans from their behavior. For example, ProxEmo generates a movement trajectory based on estimates of others' emotions from their gait (Narayanan et al., 2020). Most of these studies model others as agents different from the agent itself, whereas the proposed method models others as the same kind of agent as itself.