DOI: 10.1126/science.abd7258
Strategies for navigating a dynamic world
Saurabh Steixner-Kumar; Jan Gläscher
2020-08-28
Journal: Science
Publication year: 2020
English abstract: One of the most difficult problems for an adaptable agent is gauging how to behave in a nonstationary environment. When conditions are stable, an organism generally pursues a strategy known to provide the best outcome. However, when environmental conditions change, an organism abandons the current action plan and searches for a new best option. The most challenging aspect of this search, calculating the exact time point at which to change strategies, requires the brain to integrate past and present observations and evaluate whether they remain consistent with current environmental conditions. On page 1076 of this issue, Domenech et al. (1) report on the modeling of rare direct electrical recordings from the prefrontal cortices (PFCs) of a small group of human epilepsy patients as they flexibly negotiated a nonstationary environment.
To understand the brain's mode of navigation, consider a sailor at sea (see the figure). The winds and the currents determine the waves that drive the sailor to continuously adjust the rudder so as to stay on course. By observing the wave patterns, he can anticipate the navigational effects of his actions and adapt accordingly. But when the currents or the weather change, the sailor must adapt his course to reach the next port of call. At that time, the sailor observes essentially the same stimulus (the waves) but has to remap his action plan (rudder adjustments) to the new wind conditions and currents.
This difficult decision problem (how to detect and then adapt to a nonstationary environment) is captured perfectly in the exploration-exploitation dilemma: When should I stop exploiting my current action plan and start exploring different ways to reach my goals? An optimal solution tracks the discounted sum of normalized future rewards. However, this approach applies strictly to stationary environments and thus does not capture the dynamic changes that organisms encounter in their daily lives (2).
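The "discounted sum of normalized future rewards" mentioned above can be written out explicitly. The abstract does not name the solution it has in mind; the sketch below assumes it is an index policy of the Gittins type, and the symbols ($\nu_i$, $\gamma$, $\tau$, $r_i$, $s_t$) are introduced here for illustration only.

```latex
% Assumed form (Gittins-style index), not taken from the abstract itself.
% \gamma \in (0,1): discount factor;  \tau: stopping time;
% r_i(s_t): reward from option i when its internal state is s_t.
\nu_i(s) \;=\; \sup_{\tau \ge 1}
  \frac{\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \gamma^{t}\, r_i(s_t) \,\middle|\, s_0 = s\right]}
       {\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \gamma^{t} \,\middle|\, s_0 = s\right]}
```

Under this reading, the agent always chooses the option with the largest index $\nu_i$. The numerator is the expected discounted reward and the denominator normalizes it by the discounted time spent. The index is defined for options whose reward process changes only when the option is sampled, which is one way to see why such a solution does not carry over to environments that change on their own.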
Yet the human brain and those of other species seem to smoothly solve the exploration-exploitation dilemma in nonstationary environments. Decision neuroscience has investigated the flexible adaptation to changing environmental contingencies with diverse experimental paradigms and assorted computational models. The simplest paradigm is probabilistic reversal learning, in which the agent has to search for reward among two options with complementary reward probabilities. This adaptation problem can be solved by hidden Markov models (3), which are well approximated by reinforcement learning (RL) models that also update nonchosen actions (4). Extension of this paradigm to include independently changing reward probabilities reveals two distinct neural responses: Expected-value signals, which reflect “exploitative” choices, spur activation of the ventromedial prefrontal cortex (vmPFC); and “explorative” choices (that is, the choosing of a currently lesser valued option) activate the frontopolar cortex (5).
[Figure: A sailor solves a dilemma at sea. As the ship nears bad weather, the sailor's ventromedial prefrontal cortex (vmPFC) evaluates the ongoing (orange) action plan (exploitation) and the prospective (brown, red) plans (exploration). Once the red (calm waters) plan is exploited, the sailor's dorsomedial PFC (dmPFC) uses trial-and-error learning to map the proper rudder adjustments. Graphic: A. Kitterman/Science]
Another task with both rapid and slow changes in the reward probabilities of various options was used to develop a hierarchical Bayesian model that estimates the volatility of the environment and adjusts the learning rate accordingly (6). This model has been generalized in the hierarchical Gaussian filter (HGF) framework (7), which is widely used in modeling social and nonsocial human decision-making in nonstationary environments.
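The probabilistic reversal learning paradigm described above, together with an RL learner that also updates the nonchosen action, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model used in the cited studies: the function name `simulate_reversal_learning`, the epsilon-greedy choice rule, and all parameter values (`p_high`, `alpha`, `epsilon`, `reversal_every`) are hypothetical choices made for the example.

```python
import random


def simulate_reversal_learning(n_trials=200, reversal_every=50, p_high=0.8,
                               alpha=0.3, epsilon=0.1, seed=0):
    """Two-option task with complementary reward probabilities that swap
    every `reversal_every` trials (all values are illustrative assumptions)."""
    rng = random.Random(seed)
    q = [0.5, 0.5]      # value estimates for the two options
    best = 0            # index of the currently better option
    correct = []        # whether each choice picked the better option

    for t in range(n_trials):
        # Reversal: the better and worse options swap roles.
        if t > 0 and t % reversal_every == 0:
            best = 1 - best

        # Epsilon-greedy choice (an assumed choice rule).
        if rng.random() < epsilon:
            choice = rng.randrange(2)
        else:
            choice = 0 if q[0] >= q[1] else 1

        p_reward = p_high if choice == best else 1.0 - p_high
        reward = 1.0 if rng.random() < p_reward else 0.0

        # Standard RL update for the chosen option.
        q[choice] += alpha * (reward - q[choice])

        # Update of the nonchosen option: because the reward probabilities
        # are complementary, the outcome is treated as evidence for the
        # opposite (fictive) outcome on the other option.
        other = 1 - choice
        q[other] += alpha * ((1.0 - reward) - q[other])

        correct.append(choice == best)

    return q, correct


if __name__ == "__main__":
    values, correct = simulate_reversal_learning()
    print("final values:", values)
    print("fraction of choices on the better option:",
          sum(correct) / len(correct))
```

Because both values move on every trial, the learner's estimates stay complementary (they sum to 1 when initialized at 0.5 each), so a single surprising outcome shifts the preference between options faster than updating the chosen option alone.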
Although these computational modeling frameworks differ, all are trying to solve similar problems: how to infer the latent structure of the world from discrete observations and how to detect transitions between different states of the world. Domenech et al. address the same problems with yet another experimental paradigm, this one carried out with a small group of human epilepsy patients. Electrodes implanted deep in the patients' PFCs delivered direct electrical recordings from the vmPFC and dorsomedial PFC (dmPFC) while the patients performed a multioption decision task. The participants had to associate three different stimuli with three distinct actions, thus constituting an action plan. The mapping changed every 33 to 57 trials, and participants had to relearn the association of the same stimuli with a different combination of actions, much like our sailor at sea who faces changes in weather and currents that alter wave patterns. The computational model (8) generates a reliability value for the ongoing action plan and other concurrently monitored plans. When the ongoing action plan is deemed reliable, the model is in “exploitation” mode and learns the stimulus-action mapping through RL mechanisms. When the ongoing action plan is deemed unreliable, the model switches to “exploration” mode. New provisional action plans are created and evaluated until one emerges as a reliable predictor for successful stimulus-action mapping (see the figure).
Using a state-of-the-art model-based analysis that associates the model-derived variables with the brain activity in various frequency bands of the neural recordings, the authors found a delicate interplay between the vmPFC and dmPFC that supports a predictive coding interpretation for resolution of the exploration-exploitation dilemma. The vmPFC monitors and represents the reliability of the ongoing action plan. It relays the ongoing action plan to the dmPFC as either a “stay” or “switch” trial. A stay trial triggers additional learning through RL mechanisms in the dmPFC. In contrast, the dmPFC responds to a switch trial by suppressing activity related to maintaining the ongoing action plan. These findings resonate with and extend earlier results obtained with functional neuroimaging (5, 9).
These computational approaches to the problem of behavioral flexibility in a nonstationary environment share one commonality: They are all building a model of the environment and the transitions therein, either explicitly (as in the HGF framework) or implicitly (by evaluating the ongoing action plan, as in the Domenech et al. study). Although all of these models strive for generality, each was developed for a specific experimental context. It remains to be seen which of these provides the best account of flexible decision-making in humans and other species, preferably using a unified experimental paradigm. A model-free RL account (10) likely will not suffice, as several studies have demonstrated the superiority of more complex models over this “vanilla” RL model. Rather, an agent requires a rich representation of the environment and its dynamic transitions (often referred to as model-based learning) (10) to solve the exploration-exploitation dilemma and flexibly respond to a changing world.
References:
1. P. Domenech, S. Rheims, E. Koechlin, Science 369, eabb0184 (2020).
2. J. D. Cohen, S. M. McClure, A. J. Yu, Philos. Trans. R. Soc. London Ser. B 362, 933 (2007).
3. A. N. Hampton, P. Bossaerts, J. P. O'Doherty, J. Neurosci. 26, 8360 (2006).
4. J. Gläscher, A. N. Hampton, J. P. O'Doherty, Cereb. Cortex 19, 483 (2009).
5. N. D. Daw, J. P. O'Doherty, P. Dayan, B. Seymour, R. J. Dolan, Nature 441, 876 (2006).
6. T. E. J. Behrens, M. W. Woolrich, M. E. Walton, M. F. S. Rushworth, Nat. Neurosci. 10, 1214 (2007).
7. C. Mathys, J. Daunizeau, K. J. Friston, K. E. Stephan, Front. Hum. Neurosci. 5, 39 (2011).
8. A. Collins, E. Koechlin, PLOS Biol. 10, e1001293 (2012).
9. M. Donoso, A. G. E. Collins, E. Koechlin, Science 344, 1481 (2014).
10. N. D. Daw, P. Dayan, Philos. Trans. R. Soc. London Ser. B 369, 20130478 (2014).
Field: Climate Change; Resources and Environment
Document type: Journal article
Item identifier: http://119.78.100.173/C666/handle/2XK7JSWQ/293214
Collections: Climate Change; Resources and Environmental Science
Recommended citation
GB/T 7714: Saurabh Steixner-Kumar, Jan Gläscher. Strategies for navigating a dynamic world[J]. Science, 2020.
APA: Saurabh Steixner-Kumar, & Jan Gläscher. (2020). Strategies for navigating a dynamic world. Science.
MLA: Saurabh Steixner-Kumar, et al. "Strategies for navigating a dynamic world". Science (2020).