By using positive reinforcement (strengthening), an approach familiar to anyone who’s used treats to change a dog’s behavior, a team dramatically improved a robot’s skills and did it quickly enough to make training robots for real-world work a more workable enterprise. The findings are newly published in a paper titled “Good Robot!”
Unlike humans and animals that are born with highly intuitive (having intuition) brains, robots are blank slates (writing tablets) and must learn everything from scratch. But true learning is often accomplished by trial and error, and roboticists are still figuring out how robots can learn efficiently from their mistakes.
With some effort, the team accomplished that by designing a reward system that works for a robot in the way treats work for a dog. Where a dog might get a cookie for a job well done, the robot got an incentive too.
Andrew Hundt, lead author of the study, recalled how he once taught his puppy named Leah the command “Leave it”, so she could ignore squirrels on walks. He used two types of treats, ordinary trainer treats and something even better, like cheese. When Leah was excited and sniffing around the treats, she got nothing. But when she calmed down and looked away, she got the good stuff. “That’s when I gave her the cheese and said, ‘Leave it! Good Leah!’” He later decided to try that method on robots.
Similarly, to stack (pile up neatly) blocks, Spot, the robot, needed to learn how to focus on constructive actions. As the robot explored the blocks, it quickly learned that correct behaviors for stacking earned high points, but incorrect ones earned nothing. Reach out but don’t grasp a block? No points. Knock over a stack? Definitely no points. Spot earned the most by placing the last block on top of a four-block stack.
“The robot wants the higher score,” Hundt said. “It quickly learns the right behavior to get the best reward. In fact, it used to take a month of practice for the robot to achieve 100% accuracy. We were able to do it in two days.”
8. What did the team want to achieve by creating the new approach?
A. To reduce the error rate of robots performing tasks.
B. To develop a robot able to learn by itself.
C. To figure out robots’ way of learning.
D. To help improve robots’ learning efficiency.
9. Which of the following can replace the underlined part in Paragraph 3?
A. Won an award.
B. Had a special talent.
C. Raised an interest.
D. Experienced some mistakes.
10. Why does the author talk about Andrew Hundt’s training his dog?
A. To help us better understand their findings.
B. To make the text more attractive to animal lovers.
C. To show dogs are actually very clever.
D. To explain how he got inspiration for the study.
11. What does the last but one paragraph mainly tell us?
A. The process of training Spot to stack blocks.
B. The possibility of a robot’s earning high points.
C. The superiority of Spot as a newly developed robot.
D. The significance of having robots work with blocks.