DOI: 10.1126/science.abb0184
Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex
Philippe Domenech; Sylvain Rheims; Etienne Koechlin
2020-08-28
Journal: Science
Publication year: 2020
Abstract: Successful behavior in an uncertain, changing, and open-ended environment critically relies on the ability to decide between continuing with the ongoing strategy or exploring new options. Neuroimaging studies have shown that the human medial prefrontal cortex (mPFC) is the part of the brain that primarily deals with this dilemma. However, the contribution of the different mPFC regions remains largely unknown. Domenech et al. recorded neuronal activity in six epileptic patients with depth electrodes in this brain area (see the Perspective by Steixner-Kumar and Gläscher). The ventral mPFC inferred the reliability of the ongoing action plan according to action outcomes. It proactively flagged outcomes either as learning signals to better exploit this plan or as potential triggers to explore new ones. The dorsal mPFC then evaluated action outcomes and generated an adaptive behavioral strategy.

Science, this issue p. [eabb0184][1]; see also p. [1056][2]

### INTRODUCTION

Everyday life often requires arbitrating between pursuing an ongoing action plan by possibly adjusting it versus exploring new action plans instead. Resolving this so-called exploitation-exploration dilemma is critical to gradually build a repertoire of action plans for efficient adaptive behavior in uncertain, changing, and open-ended everyday environments. Previous studies have shown that its resolution primarily involves the medial prefrontal cortex (mPFC). Human functional magnetic resonance imaging shows that activations in the ventromedial PFC (vmPFC) reflect the subjective value of the ongoing plan according to action outcomes, whereas the dorsomedial PFC (dmPFC) exhibits activations when this value drops and the plan is abandoned for exploring new ones. However, the neural mechanisms that resolve the dilemma and make the decision to exploit versus explore remain largely unknown.
### RATIONALE

We addressed this issue by recording neuronal activity in participants using intracranial electroencephalography while they were performing a task that induced systematic exploitation-exploration dilemmas in an uncertain, changing, and open-ended environment. Participants were six epileptic patients with electrodes implanted in the vmPFC and dmPFC (see the figure), who were eventually diagnosed with temporal or parietal lobe epilepsy with no impacts in the PFC. Using computational modeling, we identified from participants’ behavior the so-called stay trials, when participants adjusted and exploited their ongoing action plan through reinforcement learning, and the switch trials, when action outcomes instead led participants to covertly switch away from this plan and explore new ones in the following trials. We then analyzed vmPFC and dmPFC neural activity in both stay and switch trials.

### RESULTS

vmPFC neural activity in the high-gamma frequency band (>50 Hz) that reflects local processing was found to encode outcome expectations after action selection. This vmPFC high-gamma activity further encoded the prior and posterior reliability of the ongoing action plan relative to action outcomes, which, according to the computational model, subserved the arbitration between exploiting and exploring. Notably, this reliability encoding yielded vmPFC activity to proactively flag forthcoming action outcomes as potential triggers to explore rather than as learning signals to exploit. Preceding the occurrence of action outcomes, switch trials, unlike stay trials, witnessed an increased neural activity in the beta frequency band (13 to 30 Hz) that reflects top-down neural processing (see the figure). Following action outcomes in switch compared with stay trials, dmPFC neural activity then decreased in the theta frequency band (4 to 8 Hz), which indicates that the dmPFC was then configured to respond to action outcomes according to this vmPFC proactive construct.
In stay trials, outcome expectations encoded in the vmPFC were transmitted to the dmPFC, so that from 300 ms after action outcomes, dmPFC neural activity in the high-gamma frequency band encoded the reward prediction error (i.e., the discrepancy between expected and actual outcomes that scales reinforcement learning). In switch trials, by contrast, this encoding was disrupted through reconfiguring dmPFC activity in the alpha frequency band (8 to 12 Hz) to release the inhibition bearing upon alternative action plans from 250 ms after action outcomes.

### CONCLUSION

The medial PFC resolves exploitation-exploration dilemmas through a predictive coding mechanism that was originally proposed for perception. The vmPFC monitors the reliability of the ongoing action plan to proactively set the functional signification of forthcoming action outcomes as either learning signals to exploit or potential triggers to explore. The dmPFC responds to action outcomes according to this functional construct, yielding to either stay and adjust the ongoing plan through reinforcement learning or switch away from this plan to explore new ones. This predictive coding mechanism has the advantage of speeding up the abandonment of ongoing action plans and preventing action outcomes that trigger exploration from inappropriately acting as learning signals. These findings support the idea that predictive coding also operates within the prefrontal executive system and constitutes a general mechanism that underlies information processing across the cerebral cortex.

In perceptual neural systems, predictive coding operates so that observers’ prior beliefs about a scene alter how they perceive the scene. Our findings suggest that within the prefrontal executive system, predictive coding operates by proactively altering the functional signification of behavioral events according to the agents’ beliefs about their own behavior.

![Figure][3] Action outcomes triggering exploration.
Neural activity around outcome onsets in switch compared with stay trials recorded in ventromedial (orange, vmPFC) and dorsomedial (blue, dmPFC) prefrontal electrodes implanted in the six patients. Electrode localizations are shown on a canonical sagittal brain slice [Montreal Neurological Institute (MNI) coordinate: x = −10], and neural activity is shown against time according to its spectral decomposition. vmPFC activity reflecting top-down neural processing increased and proactively flagged action outcomes as potential triggers to explore rather than as learning signals to exploit. dmPFC activity followed action outcomes triggering exploration through reconfiguring neural processing. Stim, stimulus.

Everyday life often requires arbitrating between pursuing an ongoing action plan by possibly adjusting it versus exploring a new action plan instead. Resolving this so-called exploitation-exploration dilemma involves the medial prefrontal cortex (mPFC). Using human intracranial electrophysiological recordings, we discovered that neural activity in the ventral mPFC infers and tracks the reliability of the ongoing plan to proactively encode upcoming action outcomes as either learning signals or potential triggers to explore new plans. By contrast, the dorsal mPFC exhibits neural responses to action outcomes, which results in either improving or abandoning the ongoing plan. Thus, the mPFC resolves the exploitation-exploration dilemma through a two-stage, predictive coding process: a proactive ventromedial stage that constructs the functional signification of upcoming action outcomes and a reactive dorsomedial stage that guides behavior in response to action outcomes.

[1]: /lookup/doi/10.1126/science.abb0184
[2]: /lookup/doi/10.1126/science.abd7258
[3]: pending:yes
Field: Climate Change; Resources and Environment
Document type: Journal article
Item identifier: http://119.78.100.173/C666/handle/2XK7JSWQ/293218
Collections: Climate Change; Resources and Environmental Science
Recommended citation
GB/T 7714
Philippe Domenech, Sylvain Rheims, Etienne Koechlin. Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex[J]. Science, 2020.
APA Philippe Domenech, Sylvain Rheims, & Etienne Koechlin. (2020). Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex. Science.
MLA Philippe Domenech, et al. "Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex". Science (2020).