Human-level performance in 3D multiplayer games with population-based reinforcement learning

doi:10.1126/science.aau6249


DOI	10.1126/science.aau6249
Title	Human-level performance in 3D multiplayer games with population-based reinforcement learning
Authors	Jaderberg, Max; Czarnecki, Wojciech M.; Dunning, Iain; Marris, Luke; Lever, Guy; Castaneda, Antonio Garcia; Beattie, Charles; Rabinowitz, Neil C.; Morcos, Ari S.; Ruderman, Avraham; Sonnerat, Nicolas; Green, Tim; Deason, Louise; Leibo, Joel Z.; Silver, David; Hassabis, Demis; Kavukcuoglu, Koray; Graepel, Thore
Publication date	2019-05-31
Journal	SCIENCE
ISSN	0036-8075
EISSN	1095-9203
Publication year	2019
Volume	364 Issue: 6443 Pages: 859-+
Article type	Article
Language	English
Country	England
Abstract	Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input. We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.
Field	Earth Science ; Climate Change ; Resources and Environment
Indexed in	SCI-E ; SSCI
WOS accession number	WOS:000469887900056
WOS keywords	TIME ; GO
WOS category	Multidisciplinary Sciences
WOS research area	Science & Technology - Other Topics
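Document type	Journal article
Item identifier	http://119.78.100.173/C666/handle/2XK7JSWQ/201527
Collection	Earth Science ; Resources and Environment Science ; Climate Change
Author affiliation	DeepMind, London, England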
Recommended citation, GB/T 7714	Jaderberg, Max, Czarnecki, Wojciech M., Dunning, Iain, et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning[J]. SCIENCE, 2019, 364(6443): 859-+.
APA	Jaderberg, Max, Czarnecki, Wojciech M., Dunning, Iain, Marris, Luke, Lever, Guy, ... & Graepel, Thore. (2019). Human-level performance in 3D multiplayer games with population-based reinforcement learning. SCIENCE, 364(6443), 859-+.
MLA	Jaderberg, Max, et al. "Human-level performance in 3D multiplayer games with population-based reinforcement learning." SCIENCE 364.6443 (2019): 859-+.