Introduction

In the context of future smart mobility, there is a growing demand for naturalistic scene generation for automated vehicle simulation, intelligence testing, and algorithm validation \cite{Feng2021}. In the foreseeable future, a mixture of human-driven vehicles, pedestrians, and other intelligent autonomous agents will be on the roads, sharing the right of way and interacting with one another. To generate high-fidelity scenes that represent this new transportation modality, the interactive behaviors among heterogeneous traffic participants should be carefully considered. Conventional human traffic participants, including pedestrians, cyclists, and human-driven cars, usually do not follow pre-defined trajectories or patterns, and their behaviors are difficult to predict in the real world. Their decisions and actions are nevertheless correlated: each participant decides under the constraints imposed by the surrounding participants, and its behavior in turn affects them \cite{Huang_2021,Yu_2018,Hang_2020}. Moreover, because humans have limited perception capability, the information a participant can obtain from the surrounding environment is limited \cite{Dingus_2016,Li_2016,Kuo_2019,Hu_2021}. Further, individual behaviors are usually highly personalized, as different road users have diverse travel demands, preferences, and habits \cite{Fridman_2019,Marina_Martinez_2018,Sama_2020,Xing_2020}. Thus, for scene generation for autonomous driving, it is worthwhile to explore intelligent methods that can realize naturalistic and human-like interactive behaviors between intelligent agents. Rather than constructing comprehensive, large-scale scenes, we focus on the intelligent representation of interacting moments. During interactions, the specific decision or intent of a road user is generally not available to the surrounding ones.
However, by observing driving performance or recognizing driving style, it is possible to infer a road user's intent or likely actions using trajectory prediction or aggressiveness estimation \cite{Huang_2021a}, which is crucial in competing for the right of way. Moreover, a road user's aggressiveness or behavior pattern may not remain constant, because the situation and demand vary, which makes the interactions game-like \cite{Hang_2020,Hang_2021,Liniger_2020}, with incomplete information.
Understanding and modeling the interaction modalities among various road users, including cars, pedestrians, and cyclists, is critical, because information exchange, time-varying reactions, and mutual influences strongly affect the results of scene generation. Given these facts, open questions remain in human-like interactive scene generation for autonomous driving: What is the best strategy to win the right of way during interactions? And what is the mechanism by which humans win it? To address these questions, the decision logic behind the interaction, with consideration of aggressiveness, should be explored first. Beyond this, the representation of human-likeness and a method for quantifying it should be investigated as well.
To be more specific, we list some representative interactions and possible conflicts in Figure \ref{243685}. The first is the vehicle-vehicle interaction, which occurs during lane changes and merging. The second is the vehicle-pedestrian interaction, which is especially important in unstructured or unsignalized areas. The third modality, the pedestrian-pedestrian interaction, which introduces additional uncertainty into autonomous driving scenarios, is included in the proposed paradigm as well. The most challenging situation arises when the expected trajectories of road users conflict because of their non-cooperative behaviors. For instance, in the vehicle-pedestrian interaction case, the individually optimal solution for each trajectory (the yellow and blue lines, respectively, in Figure \ref{243685}) is not to decelerate. However, if both maintain their current speeds, a collision is inevitable.
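As a purely illustrative sketch of this conflict, and not the model developed in this work, the vehicle-pedestrian case can be read as a two-player game in which each road user either maintains speed ($M$) or yields ($Y$). The symbols below are assumptions introduced only for this example: $g > 0$ is the gain from passing first, $d \geq 0$ is the delay cost of yielding, and $c > d$ is the cost of a collision. Each entry gives the payoffs (vehicle, pedestrian):
\[
\begin{array}{c|cc}
 & \mbox{pedestrian } M & \mbox{pedestrian } Y \\ \hline
\mbox{vehicle } M & (-c,\,-c) & (g,\,-d) \\
\mbox{vehicle } Y & (-d,\,g) & (-d,\,-d)
\end{array}
\]
Under these assumptions, each road user prefers to maintain speed if the other yields (since $g > -d$), but prefers to yield if the other maintains speed (since $-d > -c$). The outcomes $(M, Y)$ and $(Y, M)$ are therefore both stable, while $(M, M)$ corresponds to the collision described above. Which road user ends up yielding is not determined by the payoffs alone, which is one way to see why an estimate of the other party's aggressiveness matters when competing for the right of way.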