Discussion

As can be seen in Figure \ref{113875}, all three cases start with a confusing situation because of the incomplete-information assumption: V2 has no prior on V1. When \(\alpha_{svth}=0.5\), V2 believes that it is less aggressive than V1, as shown in Figure \ref{113875} (a), but it still accelerates slightly to confirm this belief. After a short period of acceleration, it finds that it is indeed less aggressive, so it decelerates to yield the right of way, as shown in Figure \ref{113875} (d). On the other hand, when \(\alpha_{svth}=0.7\), V2 believes it is more aggressive than V1, as shown in Figure \ref{113875} (b), where the green region is largest at the beginning. It accelerates for longer to demonstrate its willingness to compete. After a turning point at around 3 s, the subject vehicle begins to doubt whether it is really more aggressive than the surrounding one, because the obstacle vehicle follows pre-defined waypoints regardless of the interaction and can therefore be seen as extremely aggressive. V2 eventually yields until there is no driving conflict. However, when \(\alpha_{svth}=0.9\), V2 is convinced that it is more aggressive than the surrounding vehicle, so it accelerates strongly, consistent with its belief shown as the green region in Figure \ref{113875} (c). After a short period of probing, it chooses to compete. Yet, for the same reason, the obstacle vehicle appears extremely aggressive, and V2 doubts itself even while driving parallel to it. After a while, it realizes that it is less aggressive and accelerates to get out of the situation. In this case, V2 is not actually more aggressive than the obstacle vehicle, but while probing it accelerates further and finally blocks the surrounding road user.
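The belief dynamics above amount to a recursive Bayesian update of the subject vehicle's estimate of the opponent's aggressiveness. The following is a minimal sketch of such an update, not our exact observation model; the two-hypothesis set, the likelihood values, and all names are illustrative assumptions:

\begin{verbatim}
import numpy as np

def update_belief(prior, likelihoods, kappa=1.0):
    # One Bayesian update of the belief over the opponent's type.
    # prior       -- belief vector over aggressiveness hypotheses
    # likelihoods -- P(observed action | hypothesis) per hypothesis
    # kappa       -- assumed update-rate parameter (tempered Bayes)
    posterior = prior * likelihoods ** kappa
    return posterior / posterior.sum()

# V2 starts with a uniform prior over "V1 yields" vs. "V1 fights".
belief = np.array([0.5, 0.5])
# V1 keeps accelerating, which is far more likely if V1 fights:
for _ in range(3):                 # repeated probing observations
    belief = update_belief(belief, np.array([0.2, 0.8]))
print(belief)  # mass shifts toward "V1 fights", so V2 yields
\end{verbatim}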
As shown in Figure \ref{896130}, when \(\kappa = 1\), the driver does not make a rash decision compared with \(\kappa = 2\): the relative aggressiveness varies less sharply. By tuning these white-box parameters, various complex behaviors can be generated.
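Under the same sketch, the qualitative effect of \(\kappa\) can be reproduced directly (again assuming \(\kappa\) tempers the likelihood; our actual formulation may differ):

\begin{verbatim}
# Same observation stream, two update rates: kappa = 2 drives the
# belief to a (possibly premature) verdict much faster than kappa = 1.
for kappa in (1.0, 2.0):
    belief = np.array([0.5, 0.5])
    for _ in range(3):
        belief = update_belief(belief, np.array([0.2, 0.8]), kappa)
    print(kappa, belief.round(3))
# kappa=1.0 -> [0.015 0.985]; kappa=2.0 -> [0. 1.] (approximately)
\end{verbatim}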
As shown in Figure \ref{907317}, when the ego vehicle is less polite or, equivalently, more aggressive as defined in this paper, the driver intends to start a lane change earlier. However, our method is not strictly monotonic, because during probing the subject vehicle may misjudge the obstacle vehicle's intention, and it may also block the obstacle vehicle, forcing it to yield. This means the proposed method concurs with MOBIL, which has been proven effective in modeling large-scale traffic flow, while our method can generate microscopic and more complicated human behavior. Also, the comparison with Xuemin's\cite{Hu_2018} method in Figure \ref{665197} indicates that our method can be more aggressive. As can be seen in the early phase of the lane change, the compared method is more conservative because, when there is a conflict of interest, the subject vehicle chooses a less costly candidate trajectory without conflict. The proposed method, in contrast, is adversarial because the algorithm assumes that the obstacle driver will eventually yield.
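For reference, MOBIL's standard lane-change incentive criterion couples the decision to a politeness factor \(p\) and a switching threshold \(\Delta a_{th}\), where \(c\) denotes the lane-changing vehicle, \(n\) the new follower, \(o\) the old follower, and tildes denote accelerations after the prospective change:
\[
\tilde{a}_c - a_c > p\left[\left(a_n - \tilde{a}_n\right) + \left(a_o - \tilde{a}_o\right)\right] + \Delta a_{th}.
\]
A larger \(p\) or \(\Delta a_{th}\) postpones the lane change monotonically, which is the trend our method approximately reproduces at the decision level.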
When we set the collision weight to infinity, the trajectories selected by the algorithm are the same as those of the compared method, as validated in Figure \ref{693354}. The value of \(P\) does not matter because, when the collision cost is sufficiently high, our method always chooses a conservative alternative, which aligns with the literature.
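This limiting behavior can be illustrated with a toy trajectory selection; the candidate set, the cost terms, and the names below are illustrative, not our implementation:

\begin{verbatim}
# As w_col grows, the argmin is forced onto collision-free
# candidates regardless of the opponent model P.
candidates = [
    {"name": "assertive", "comfort_cost": 1.0, "p_collision": 0.3},
    {"name": "yield",     "comfort_cost": 2.5, "p_collision": 0.0},
]

def best(w_col):
    return min(candidates,
               key=lambda c: c["comfort_cost"]
                             + w_col * c["p_collision"])["name"]

print(best(1.0))   # 'assertive' -- collision risk is cheap
print(best(1e9))   # 'yield'     -- effectively infinite weight
\end{verbatim}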
We also compare with Hang et al.'s method. In Figure \ref{539751} (b), vehicle 2 is assumed to be an aggressive driver. In our case, \(\alpha_{th,v1} = 0.1\) and \(\alpha_{th,v2} = 1\); both methods indicate that the subject vehicle maintains a relatively low velocity, but there are two major differences. In Figure \ref{539751} (d), the subject vehicle of Hang et al.'s method accelerates and maintains its speed at around 21 m/s. In contrast, to enlarge the space for the lane change, our vehicle 1 decelerates and then starts to accelerate at 5 s to re-attempt the lane change. As can be seen, our method outperforms in lane-changing time: proactive deceleration enlarges the gap, enabling a faster lane change. As for driver 2's behavior, while vehicle 2 in Hang's method accelerates strongly and settles at around 22 m/s, our vehicle 2 does not accelerate much from 0 s to 1 s because vehicle 1 is faster and can therefore be perceived as more aggressive. Although the aggressiveness threshold of our vehicle 2 is 1, it accelerates conservatively to verify driver 1's aggressiveness; from 4 s to 8 s, however, it accelerates to a relatively high velocity to move away from the lane-changing vehicle. The other situation is the opposite of case one. As can be found in Figure \ref{539751} (c) and Figure \ref{539751} (e), vehicle 1 is quite sure that it is more aggressive, so it starts a lane change in the early phase and accelerates to 24 m/s at 3 s, while vehicle 2 does not stop competing from 0 s to 1 s. This phenomenon aligns with Hang's method, although from 3 s to 8 s the two methods differ because, in the scene-generation context, the vehicles are forced to return to their initial velocities after the interaction, which eliminates the chance of collisions with irrelevant vehicles; this constraint can easily be relaxed to match Hang's method. Additionally, although the trends of the two methods are identical, the velocities and trajectories are not exactly the same. This is partially because Hang et al. use MPC for vehicle control, which is a strong assumption since real human drivers cannot control the vehicle perfectly.
According to the above comparisons, at the decision level the proposed method concurs with the state-of-the-art literature and is more flexible and intelligent with respect to scene generation. Although we compare the proposed method with methodologies that have been proven human-like, to further evaluate the human-likeness of this method, a variation of the Turing test is conducted.
In the Turing test, most participants are unable to distinguish the human driver from the algorithm: half of them achieve approximately 50% accuracy, and the other participants' scores are close to 50% as well. This indicates that our method confuses the participants so that they cannot tell whether the subject vehicle is algorithm-generated or controlled by a real human driver. The proposed method is thus human-like and effective for scene generation in driving intelligence tests. Meanwhile, the accident rate is 14% (mostly rear-end collisions caused by participant 1), which is higher than in normal driving\cite{Feng2021}, indicating that the participants were actively testing the subject vehicle. This aligns with what we told the drivers before the test: drive normally as the primary task, but also test the subject vehicle, which makes our results more reliable. However, the proposed work still has some drawbacks. One major drawback reported by the participants is that there are cases in which driver A acts indecisively. This happens because, when the randomly sampled driving style is too mild, the algorithm behaves overly cautiously, which rarely happens in reality. There is also an overshoot problem for both the participants and the algorithm. Algorithm overshoot usually occurs in the early stage of lane changing because the control level is not constrained: when \(\alpha_{th}\) is too small, the driver tries to pull away from the obstacle driver. For real human drivers, the overshoot usually occurs in the late stage of lane changing because, when they accelerate too heavily, they cannot control the vehicle properly. These cases are rare, however, so we consider the results generally valid. In future work, we may take the control level into consideration to generate more human-like behavior, increase the number of participants, and establish a baseline for the test.
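Whether an individual participant's accuracy is statistically distinguishable from chance can be checked with a two-sided binomial test; the counts below are placeholders for illustration, not our experimental data:

\begin{verbatim}
from scipy.stats import binomtest

# Hypothetical participant: 26 correct judgments out of 50 trials.
result = binomtest(k=26, n=50, p=0.5, alternative="two-sided")
print(result.pvalue)  # large p-value: cannot reject chance-level guessing
\end{verbatim}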

Conclusion

This paper presents a human-like decision-making algorithm for driving intelligence tests. The interaction model of road users is first established using Bayesian game theory. Besides an extremely conservative choice or an extremely aggressive choice, a probing behavior can be generated by the proposed method based on the cost and the relative aggressiveness probability. To evaluate the aggressiveness of the opponent, an observation model is established, and a way to calibrate it is given by an experiment. Additionally, the driver's probing strategy generation method is developed to test the real aggressiveness of the background vehicle, and the strategy is reflected in the vehicle's behavior through a proposed Markov method. Next, the proposed methodology is compared with commonly used approaches and state-of-the-art literature. The comparison indicates that our method concurs with previous research while being capable of generating more complex and human-like behavior. Finally, the human-likeness of our algorithm is evaluated using the Turing test. The test results indicate that the participants cannot distinguish human behavior from the behavior generated by our algorithm.
Although the proposed method is designed for scene generation, it may shed some light on autonomous driving algorithms as well. One of the major challenges in autonomous driving is the uncertainty of traffic. Instead of passively accepting this uncertainty, we may actively take small steps to reduce the entropy without compromising safety, as shown in this paper. Current research focuses on prediction accuracy and learning convergence, which amounts to a trade-off between perception/computation burden and accuracy: the more data available and the more powerful the computer, the better the decision can be. In this way, we may eventually be able to predict the future and thus obtain the best decision, but this demand is endless. The decision algorithm, as well as the prediction and aggressiveness estimation methodology in this paper, is simple and direct, and thus computationally efficient, because we do not insist on the globally best decision. The same holds for normal human drivers: when confused, they simply probe with small steps, which is simple but powerful.
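The "small steps to reduce entropy" idea can be made concrete: among candidate probing actions, prefer the one with the lowest expected posterior entropy over the opponent's intention. A minimal sketch under an assumed two-hypothesis belief and assumed observation models:

\begin{verbatim}
import numpy as np

def entropy(b):
    b = b[b > 0]
    return float(-(b * np.log2(b)).sum())

def expected_posterior_entropy(belief, obs_model):
    # obs_model[o, h] = P(observation o | hypothesis h) for one action
    h_exp = 0.0
    for o in range(obs_model.shape[0]):
        joint = belief * obs_model[o]          # P(o, h)
        p_o = joint.sum()
        if p_o > 0:
            h_exp += p_o * entropy(joint / p_o)
    return h_exp

belief = np.array([0.5, 0.5])  # P(opponent yields), P(opponent fights)
# "wait" is uninformative; a small "nudge" reveals the opponent's type.
actions = {
    "wait":  np.array([[0.5, 0.5], [0.5, 0.5]]),
    "nudge": np.array([[0.9, 0.2], [0.1, 0.8]]),
}
for name, model in actions.items():
    print(name, round(expected_posterior_entropy(belief, model), 3))
# The probing "nudge" (0.603) beats passively waiting (1.0).
\end{verbatim}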
Additionally, the Turing test framework given in this paper might be applied to autonomous driving algorithm evaluation. As many researchers and manufacturers are developing human-like self-driving algorithms, this unified and objective method can be used to assess human-likeness.
Our future work will focus on the human control level. Other human behaviors, such as distraction and control latency, will be considered to generate more human-like behavior for autonomous driving tests. Also, the Markov method will be replaced with a better approximator that is even more tightly connected to the strategy. Moreover, a more general Turing test procedure with more participants might be our focus as well.