Socially Aware Motion Planning with Deep Reinforcement Learning >>
A. 0 >> Abstract
1. important things >>
a) model subtle human behaviors >>
b) navigation rules >>
2. Traditional methods >> use feature-matching techniques to describe and imitate human paths
3. This paper >> specify what not to do (violations of social norms) instead of what to do
B. 1 >> INTRODUCTION
1. challenging >>
a) subtle social norms that are difficult to quantify >>
b) not known >>
pedestrians’ intents
2. traditional ways >> avoiding collision
a) disadvantage >> generate unsafe/unnatural movements
3. Other ways >> predicted paths
pedestrians’ hidden intents
a) collision-free path >>
(1) freezing robot problem >>
4. Solution >> account for cooperation
a) model/anticipate the impact of the robot’s motion on the nearby pedestrians >>
5. cooperative, socially compliant navigation >>
a) Type 1 >> model-based
(1) extensions of multiagent collision avoidance >>
i) additional parameters introduced to account for social interactions >>
(2) Drawback 1 >> unclear whether humans follow such precise geometric rules
i) Drawback 2 >> oscillatory paths
b) Type 2 >> learning-based
(1) a policy that emulates human behaviors by matching feature statistics >>
i) Example >> Inverse Reinforcement Learning (IRL)
(2) Pro >> paths that more closely resemble human behaviors
(3) Con >> higher computational cost
i) applicability to different environments is questionable >>
6. Overall >> human-like navigation
a) arises from solving a cooperative collision avoidance problem >>
7. This paper >> main contributions
a) socially aware collision avoidance >>
b) developing a symmetrical neural network structure to generalize to multiagent scenarios >>
(1) multiagent (n > 2) scenarios >>
c) demonstrating on robotic hardware >>
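The symmetry idea in contribution b) can be illustrated with a small sketch. It assumes one common way to obtain such symmetry: every other agent is encoded with the same shared weights, and the encodings are combined with an element-wise max, so the result does not depend on the order in which the other agents are listed. The function names, the weight values, and the two-dimensional agent states are hypothetical; the network described in the paper may differ in detail.

```python
def relu(x):
    """Rectified linear activation."""
    return max(0.0, x)


def encode(agent_state, shared_weights):
    """Encode one agent's state with weights shared across all agents."""
    return [
        relu(sum(w * x for w, x in zip(row, agent_state)))
        for row in shared_weights
    ]


def symmetric_features(other_states, shared_weights):
    """Element-wise max over per-agent encodings (order-independent)."""
    encoded = [encode(state, shared_weights) for state in other_states]
    return [max(column) for column in zip(*encoded)]


# Hypothetical shared weights and three hypothetical 2-D agent states.
SHARED_W = [[0.5, -1.0], [1.0, 0.25]]
AGENT_A, AGENT_B, AGENT_C = [1.0, 2.0], [-3.0, 0.5], [0.0, -1.0]

features = symmetric_features([AGENT_A, AGENT_B, AGENT_C], SHARED_W)
shuffled = symmetric_features([AGENT_C, AGENT_A, AGENT_B], SHARED_W)
```

Because the pooled features are identical for any ordering of the other agents, the same weights can serve scenarios with different agent orderings without retraining for each permutation.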
C. 2 >> II. BACKGROUND
1. Collision avoidance (with DRL) >> A. Collision Avoidance with Deep Reinforcement Learning
a) a sequential decision making problem >>
b) Challenge >> A major challenge in finding the optimal value function
(1) the joint state s^jn is a continuous, high-dimensional vector >>
i) it is impractical to discretize and enumerate the state space >>
c) Solution >> deep neural networks
d) this work extends the collision avoidance with deep reinforcement learning framework (CADRL) [14] to characterize and induce socially aware behaviors in multiagent systems >>
e) recent works >>
(1) in unknown static environments >>
(2) computing control inputs directly from raw sensor data >>
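The discretization challenge noted above can be made concrete with a little arithmetic. The sketch below counts the cells in a uniform grid over the joint state. The 14 dimensions, the 10 bins per dimension, and the 4 bytes per stored value are assumed example numbers, not figures taken from these notes.

```python
def grid_cells(bins_per_dim, state_dims):
    """Number of cells in a uniform grid over a continuous state space."""
    return bins_per_dim ** state_dims


def table_size_terabytes(cells, bytes_per_value=4):
    """Memory for one stored value per cell, in decimal terabytes."""
    return cells * bytes_per_value / 1e12


# Assumed example: 14-dimensional joint state, 10 bins per dimension.
cells = grid_cells(10, 14)
size_tb = table_size_terabytes(cells)
```

Even this coarse grid needs 10^14 cells (about 400 TB at 4 bytes per value), which is why a function approximator such as a deep neural network is used instead of a lookup table.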
2. Social norm characterization >> B. Characterization of Social Norms
a) human navigation >>
(1) cooperative >>
(2) time-efficient >>
(3) two properties >>
i) min-time reward function >>
ii) reciprocity assumption >>
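A min-time reward of the kind listed above can be sketched as a sparse reward combined with discounting: the agent is rewarded only on reaching its goal, so a shorter time to goal yields a larger discounted return. The reward values, the discount factor, and the function names below are illustrative assumptions, and the sketch discounts by elapsed time only, which may be simpler than the formulation in the paper.

```python
def step_reward(reached_goal, collided, goal_reward=1.0, collision_penalty=-0.25):
    """Sparse reward: penalize collisions, reward reaching the goal."""
    if collided:
        return collision_penalty
    if reached_goal:
        return goal_reward
    return 0.0


def discounted_return(time_to_goal, gamma=0.97):
    """Return of a collision-free trajectory that reaches the goal.

    Only the final step is rewarded, so the return shrinks as the
    time to goal grows.
    """
    return gamma ** time_to_goal * step_reward(reached_goal=True, collided=False)


fast = discounted_return(5.0)   # reaches the goal after 5 time units
slow = discounted_return(8.0)   # reaches the goal after 8 time units
```

Under these assumptions `fast` is larger than `slow`, so maximizing the return favors time-efficient paths.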
D. 3 >> III. APPROACH
1. From two to many >> two-agent
multiagent
2. A. Inducing Social Norms >>
a) To induce a particular norm, a small bias can be introduced in the RL training process in favor of one set of behaviors over others. >>
b) the parameters defining the penalty set S_norm affect the rate of convergence >>
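The small bias described above can be sketched as a reward penalty applied whenever another agent's relative configuration falls inside a penalty set. The rectangular region, the heading tolerance, the penalty size, and the function names below are hypothetical choices for a pass-on-the-right norm; they are not the penalty set defined in the paper.

```python
import math


def in_penalty_set(rel_x, rel_y, rel_heading):
    """Hypothetical S_norm for a pass-on-the-right norm.

    Coordinates are in the ego agent's frame: x points toward its goal,
    y points to its left. An oncoming agent that is ahead and on the
    ego agent's right means the ego agent is passing on the left.
    """
    oncoming = abs(abs(rel_heading) - math.pi) < math.pi / 4
    ahead_on_right = 0.0 < rel_x < 4.0 and -2.0 < rel_y < 0.0
    return oncoming and ahead_on_right


def biased_reward(base_reward, rel_x, rel_y, rel_heading, norm_penalty=0.05):
    """Subtract a small penalty when the configuration violates the norm."""
    if in_penalty_set(rel_x, rel_y, rel_heading):
        return base_reward - norm_penalty
    return base_reward


violating = biased_reward(0.0, 2.0, -1.0, math.pi)   # oncoming agent on the right
compliant = biased_reward(0.0, 2.0, 1.0, math.pi)    # oncoming agent on the left
```

The penalty is kept small relative to the goal reward in this sketch so that it tilts the learned behavior toward one passing side without overriding collision avoidance or time efficiency.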
3. B. Training a Multiagent Value Network >>
a) two important modifications >>
(1) 1 >> two experience sets, E, E_b, are used to distinguish between trajectories that reached the goals and those that ended in a collision
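A minimal sketch of keeping two experience sets is shown below. It stores samples from goal-reaching trajectories (E) and collision trajectories (E_b) separately and draws each training batch from both. The class name, the capacity, and the 50/50 mixing ratio are assumptions for illustration; the notes do not state how the two sets are combined during training.

```python
import random
from collections import deque


class DualExperience:
    """Two bounded experience sets: goal-reaching (E) and collision (E_b)."""

    def __init__(self, capacity=1000, seed=0):
        self.goal_set = deque(maxlen=capacity)        # E
        self.collision_set = deque(maxlen=capacity)   # E_b
        self._rng = random.Random(seed)

    def add_trajectory(self, samples, collided):
        """Store a trajectory's samples in the set matching its outcome."""
        target = self.collision_set if collided else self.goal_set
        target.extend(samples)

    def sample(self, batch_size, collision_fraction=0.5):
        """Draw a shuffled batch containing samples from both sets."""
        n_bad = min(len(self.collision_set), int(batch_size * collision_fraction))
        n_good = min(len(self.goal_set), batch_size - n_bad)
        batch = self._rng.sample(list(self.collision_set), n_bad)
        batch += self._rng.sample(list(self.goal_set), n_good)
        self._rng.shuffle(batch)
        return batch
```

Keeping the sets separate lets the ratio of collision to goal-reaching samples in a batch be controlled directly, instead of depending on how often collisions happen to occur.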