Technical Reports (CIS)

Document Type

Technical Report

Date of this Version

March 1992


University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-92-27.


We use findings in machine learning, developmental psychology, and neurophysiology to guide a robotic learning system's level of representation both for actions and for percepts. Visually-driven grasping is chosen as the experimental task since it has general applicability and has been extensively researched from several perspectives. A robotic system comprising a gripper, a compliant instrumented wrist, an arm, and a vision system is used to test these ideas. Several sensorimotor primitives (vision segmentation and manipulatory reflexes) are implemented in this system and may be thought of as its "innate" perceptual and motor abilities.

Applying empirical learning techniques to real situations raises important issues such as observation sparsity in high-dimensional spaces, arbitrary underlying functional forms of the reinforcement distribution, and robustness to noise in exemplars. The well-established technique of non-parametric projection pursuit regression (PPR) is used to accomplish reinforcement learning by searching for projections of high-dimensional data sets that capture task invariants.
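As a rough illustration of the idea behind projection pursuit regression, the following Python sketch fits an additive model of ridge functions, y ≈ g_1(w_1·x) + g_2(w_2·x) + ..., one term at a time on the current residuals. It is not the report's implementation. Random search over unit directions stands in for the numerical direction optimization used in standard PPR, and a piecewise-constant binned-mean smoother stands in for a proper smoother. All function names, parameters, and defaults here are hypothetical.

```python
import math
import random

def project(X, w):
    """Project each sample x onto direction w, giving one scalar per sample."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in X]

def fit_binned_smoother(z, y, n_bins=8):
    """Fit a piecewise-constant ridge function g(z) from equal-width bin means.
    (Assumption: a crude stand-in for the smoother a real PPR would use.)"""
    lo, hi = min(z), max(z)
    width = (hi - lo) / n_bins or 1.0  # guard against a degenerate range
    sums, counts = [0.0] * n_bins, [0] * n_bins

    def bin_of(v):
        return min(n_bins - 1, max(0, int((v - lo) / width)))

    for zi, yi in zip(z, y):
        b = bin_of(zi)
        sums[b] += yi
        counts[b] += 1
    overall = sum(y) / len(y)
    # Empty bins fall back to the overall mean.
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda v: means[bin_of(v)]

def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def fit_ridge_term(X, y, n_candidates=500, seed=0):
    """Search candidate projections and keep the one whose smoothed
    1-D fit leaves the smallest residual sum of squares."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        w = random_unit_vector(len(X[0]), rng)
        z = project(X, w)
        g = fit_binned_smoother(z, y)
        rss = sum((yi - g(zi)) ** 2 for zi, yi in zip(z, y))
        if best is None or rss < best[0]:
            best = (rss, w, g)
    return best  # (rss, direction, ridge function)

def fit_ppr(X, y, n_terms=2, n_candidates=500, seed=0):
    """Greedy additive fit: each new term is fitted to the residuals."""
    residual, terms = list(y), []
    for m in range(n_terms):
        _, w, g = fit_ridge_term(X, residual, n_candidates, seed + m)
        terms.append((w, g))
        residual = [r - g(z) for r, z in zip(residual, project(X, w))]
    return terms

def predict(terms, x):
    """Sum the ridge functions evaluated at the projections of x."""
    return sum(g(sum(wi * xi for wi, xi in zip(w, x))) for w, g in terms)
```

When the outcome depends on the attributes mainly through one or a few linear combinations, the selected directions approximate those combinations, which is the sense in which a projection can capture a task invariant.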

We also pursue the following problem: how can we use human expertise and insight into grasping to train a system to select appropriate hand preshapes and approaches for a wide variety of objects, and then have it verify and refine its skills through trial and error? To accomplish this learning, we propose a new class of Density Adaptive reinforcement learning algorithms. These algorithms use statistical tests to identify possibly "interesting" regions of the attribute space in which the dynamics of the task change. They automatically concentrate high-resolution descriptions of the reinforcement in those areas and build low-resolution representations in regions that are either not populated in the given task or highly uniform in outcome.
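The density-adaptive idea can be sketched in a few lines of Python. The abstract does not specify the statistical tests or data structures used, so everything below is an assumption made for illustration: binary success/failure outcomes, a recursive midpoint split of the attribute space, and a two-proportion z-test to decide whether the two halves of a cell differ in outcome. Cells with too few samples, or with no significant difference between halves, stay coarse. All names and thresholds are hypothetical.

```python
import math

def two_proportion_z(s1, n1, s2, n2):
    """z statistic for the difference between two success rates (pooled)."""
    p = (s1 + s2) / (n1 + n2)
    var = p * (1 - p) * (1 / n1 + 1 / n2)
    if var == 0:
        return 0.0  # both halves are uniformly success or uniformly failure
    return (s1 / n1 - s2 / n2) / math.sqrt(var)

def build_cell(samples, bounds, min_count=20, z_crit=1.96,
               depth=0, max_depth=8):
    """samples: list of (x, outcome), x a tuple of attributes, outcome 0 or 1.
    bounds: list of (lo, hi) intervals, one per attribute."""
    n = len(samples)
    rate = sum(o for _, o in samples) / n if n else None
    cell = {"bounds": bounds, "n": n, "rate": rate, "children": None}
    # Sparsely populated regions keep a low-resolution description.
    if n < min_count or depth >= max_depth:
        return cell
    best = None
    for axis, (lo, hi) in enumerate(bounds):
        mid = (lo + hi) / 2
        left = [s for s in samples if s[0][axis] < mid]
        right = [s for s in samples if s[0][axis] >= mid]
        if not left or not right:
            continue
        z = abs(two_proportion_z(sum(o for _, o in left), len(left),
                                 sum(o for _, o in right), len(right)))
        if best is None or z > best[0]:
            best = (z, axis, mid, left, right)
    # Regions that look uniform in outcome are not refined further.
    if best is None or best[0] < z_crit:
        return cell
    _, axis, mid, left, right = best
    lo, hi = bounds[axis]
    lb = bounds[:axis] + [(lo, mid)] + bounds[axis + 1:]
    rb = bounds[:axis] + [(mid, hi)] + bounds[axis + 1:]
    cell["children"] = (
        axis, mid,
        build_cell(left, lb, min_count, z_crit, depth + 1, max_depth),
        build_cell(right, rb, min_count, z_crit, depth + 1, max_depth),
    )
    return cell

def leaf_for(cell, x):
    """Descend to the finest cell that contains attribute vector x."""
    while cell["children"] is not None:
        axis, mid, left, right = cell["children"]
        cell = left if x[axis] < mid else right
    return cell
```

For example, with a single attribute on [0, 1) and outcomes that flip from success to failure at 0.3, the cells near 0.3 end up much narrower than the cell covering the uniformly failing region around 0.75. A midpoint split can miss structure that is symmetric about the midpoint, which is one reason this should be read only as a sketch.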

Additionally, any learning process generally implies failures along the way. The mechanics of the untrained robotic system must therefore tolerate mistakes during learning without damaging the system. We address this with an instrumented, compliant robot wrist that controls impact forces.



Date Posted: 17 August 2007