Incentive Centered Design

Making the Internet Safe, Fun, and Profitable

Nov 19 Seminar: Satinder Singh

Date: 
Thu, 11/19/2009 - 4:10pm - 5:30pm
Seminar Information: 

Satinder Singh

Professor of Electrical Engineering and Computer Science, University of Michigan

"Rethinking Rewards in Reinforcement Learning"
Location: 

4-5:30 pm
UM: 411 West Hall
WSU: 313 State Hall (via videoconference)

Seminar Description: 

In the computational reinforcement learning (RL) framework, rewards (more specifically, reward functions) determine the problem the learning agent is trying to solve. Properties of the reward function influence how easy or hard the problem is and how well an agent may do, but RL theory and algorithms are completely insensitive to the source of rewards. This is a strength of the framework because of the generality it confers, but it is also a weakness because it defers key questions about the nature of reward functions. In this talk, I address this weakness from two directions. First, I consider the role of evolution in determining where rewards come from in natural agents. Specifically, I present a computational framework in which evolved rewards capture regularities across environments, leaving the agent to learn regularities within its environment during its lifetime. Second, I describe how, in designing artificial agents, the current use of rewards confounds their role in defining preferences over behaviors with their role as parameters of actual agent behavior (RL agents act so as to maximize reward). Disentangling this "preferences-parameters confound" can be beneficial in designing artificial agents. I will present many empirical illustrations of both of these aspects of rethinking rewards in RL.

* This talk describes joint work with Richard Lewis, Andrew G. Barto, Jonathan Sorg and Akram Helou.

Satinder's website is http://www.eecs.umich.edu/~baveja/

Seminar Speaker Bio: 

Satinder Singh Baveja is a Professor of Electrical Engineering and Computer Science. His main research interest is in the old-fashioned goal of Artificial Intelligence (AI), that of building autonomous agents that can learn to be broadly competent in complex, dynamic, and uncertain environments. The field of reinforcement learning (RL) has focused on this goal, and accordingly his deepest contributions are in RL. More recently, he has been taking seriously the challenge of building agents that can interact with other agents and even humans in both artificial and natural environments. This has led to research in human-computer interaction, computational game theory, and mechanism design.