Build Complementary Models on Human Feedback for Simulation to the Real World
Knowledge-Based Systems (2021)
Keywords
Safe reinforcement learning, Human-in-the-loop reinforcement learning, Markov decision processes, Supervised learning