Synergistic reinforcement learning by cooperation of the cerebellum and basal ganglia
bioRxiv (2024)
Abstract
The cerebral cortex, cerebellum, and basal ganglia play central roles in flexible learning in mammals. However, how these three structures work together is not fully understood. Recently, it has been suggested that reinforcement learning may be implemented not only in the basal ganglia but also in the cerebellum, as the activity of cerebellar climbing fibers represents reward prediction error. If the same learning mechanism via reward prediction error occurs simultaneously in the basal ganglia and cerebellum, it remains unclear how these two regions function together. Here, we recorded neuronal activity in the output nuclei of the cerebellum and basal ganglia, the cerebellar nuclei and substantia nigra pars reticulata, respectively, from ChR2 transgenic rats with high-density Neuropixels probes while optogenetically stimulating the cerebral cortex point-by-point. The temporal response patterns could be categorized into two classes in both the cerebellar nuclei and the substantia nigra pars reticulata. Among them, the fast excitatory response of the cerebellar nuclei due to mossy fiber input and the inhibitory response of the substantia nigra pars reticulata via the direct pathway were synchronized. This coincidence, reproduced in a spiking network simulation based on connectome data, was expected to synchronously activate the cerebral cortex via the thalamus. To further investigate the significance of this synchronous positive feedback, we constructed a reservoir model that mimics the time course of the activity dynamics of the cerebral cortex and the temporal responses of the cerebellar nuclei and substantia nigra pars reticulata. Plasticity at both parallel fiber inputs to Purkinje cells and corticostriatal synapses onto striatal neurons of the direct pathway was essential for successful learning of a reinforcement learning task.
Notably, learning was inhibited when the timing of the cerebellar or basal ganglia output was delayed from the real data by 10 ms; the larger this delay, the slower the learning rate. This necessary temporal precision was observed only when the cerebral cortex operated in the β-to-γ frequency range. These results indicate that coordinated output of the cerebellum and basal ganglia, with input from the cerebral cortex in a narrow frequency band, facilitates brain-wide synergistic reinforcement learning. Thus, our findings contribute to a holistic understanding of the interactions among the cerebellum, basal ganglia, and cerebral cortex.
### Competing Interest Statement
The authors have declared no competing interest.