Özgür Şimşek

Ph.D. Candidate
Department of Computer Science
University of Massachusetts Amherst
ozgur [at] cs [dot] umass [dot] edu


RESEARCH INTERESTS

Machine learning, artificial intelligence, reinforcement learning, knowledge discovery and data mining, statistical relational learning, decision making under uncertainty, developmental learning, skill acquisition, network science, information propagation in complex networks

RESEARCH OVERVIEW

Machine learning and artificial intelligence (AI) techniques have widespread use today, with considerable impact on other academic disciplines and on society, but current techniques are far from delivering on the field's promise. My research is directed at what may be considered the biggest limitation of existing methods: their need for substantial human design effort in addressing large, complex tasks. I aim to equip artificial agents with the capability of generating on their own the type of task-specific guidance that they currently receive from people.

My focus is on developing techniques for autonomously identifying and exploiting structural properties of the domain to restrict the alternatives that should be considered during the learning process—which is typically the type of guidance we provide artificial agents, relying on our domain knowledge to guess the likely form of the solution. An approach I have found useful in a range of problems is to view the problem in an abstract, graphical form and to examine the impact of its structural properties on solution characteristics and the learning process. Network analysis techniques play an important role in this approach. In the process, I also contribute to network science, the interdisciplinary study of the laws that govern complex physical, social, biological, and information networks that surround us.


CONTRIBUTIONS

Skill Acquisition in Autonomous Agents

A fundamental problem in AI is how to form useful skills. A skill is a behavior composed of primitive actions, the smallest behavioral units that are available to an agent. For example, using its primitive sensory and motor actions, a robot may form a skill for grasping objects or for navigating to a certain location. Skills may be treated as individual actions themselves—those that specify coarser behavioral units than primitives—and can be useful in efficiently searching the space of behaviors to achieve a desired outcome. Despite early recognition in the AI literature of their utility, there has been little progress in developing methods for forming skills autonomously. My contributions are centered around three questions:

What constitutes a useful skill? I introduced access skills, a class of skills defined using a graphical representation of an agent's interaction with its environment. These skills capture efficient ways of navigating the environment graph using the graph-theoretic measure betweenness centrality. In a wide variety of domains, access skills substantially improve learning performance and, surprisingly, are similar to skills people hand-craft for these domains.

How can an agent identify such skills? I developed fast and effective methods for identifying access skills without explicitly or completely representing the environment graph—for instance, by detecting particular patterns in short trajectories of the agent's interaction with its environment. These methods are useful when it is not possible to construct the environment graph to obtain access skills directly, for example, when action consequences are unknown.

How can an agent efficiently acquire such skills? I developed a class of algorithms for efficiently acquiring a desired skill—for example, for efficiently and fully learning how to grasp, once grasping is identified as a useful skill. These algorithms follow from a novel formulation of the problem that models the internal state of the agent in addition to the state of the environment. My approach applies more generally to reinforcement learning problems in which a training phase is present before the agent starts accumulating reward.

My contributions form a basis for a developmental process during which an agent learns to display behaviors of increasing complexity through continuously building on its existing skills to acquire new ones. For example, grasping may be followed by manipulating objects in different ways, which may be followed by using a key to unlock the door to an adjacent room, and so on, forming a continuously growing skill hierarchy. In this process, an agent develops the capability to perform tasks of increasing difficulty without guidance tailored for the specific domain in which it operates, but as the result of a generic developmental process, which, in a different domain, would result in learning to perform an entirely different set of tasks. This type of open-ended developmental process is fundamentally different from how artificial agents learn to perform complex tasks today—by considerable human design effort tailored for specific tasks—and has the potential to dramatically increase the capabilities of artificial agents. Exploring this potential is a major focus of my current and future work, with particular emphasis on testing these ideas on complex real-world problems.

Selected Publications

Özgür Şimşek and Andrew G. Barto, An intrinsic reward mechanism for efficient exploration, 23rd International Conference on Machine Learning (ICML), 2006 [pdf] [bibtex] [abstract]

Özgür Şimşek, Alicia P. Wolfe, Andrew G. Barto, Identifying useful subgoals in reinforcement learning by local graph partitioning, 22nd International Conference on Machine Learning (ICML), 2005 [pdf] [bibtex] [abstract]

Özgür Şimşek and Andrew G. Barto, Using relative novelty to identify useful temporal abstractions in reinforcement learning, 21st International Conference on Machine Learning (ICML), 2004 [ps] [pdf] [bibtex] [abstract]


Navigating networks using homophily and degree

In a classic study, social psychologists Travers and Milgram asked participants to route a letter to a target person through a chain of first-name acquaintances, giving them basic information about the target person, including name, address, and occupation. Remarkably, participants were able to route many letters successfully, despite lacking global knowledge of the social network. The task they performed—routing messages to specific nodes in a large network, knowing only their own neighborhood—is replicated in a variety of distributed systems in fields as diverse as sociology, neuroscience, biology, psychology, ecology, economics, and computer science. Some examples are request routing in social networks, message routing in dynamic communications networks, and query routing in true peer-to-peer networks.

Two structural properties of networks support this type of navigation: the presence of high-degree nodes and homophily, the tendency of neighboring nodes to have correlated attributes. The social network, like many other real-world networks, displays both properties: There are individuals with an unusually high number of acquaintances, acting as hubs that connect different social circles. And, acquaintances tend be similar in some dimension, for example, hobbies, occupation, or geographic location.

Starting from a novel probabilistic formulation of the navigation problem, we developed a method of navigation that is unique in successfully incorporating both homophily and degree information. The method is surprisingly simple, effective in a broad range of networks, and a candidate model of how people make decisions in similar settings. This work was featured in a number of media outlets, including New Scientist, a popular science magazine, and ScienceNow, Science magazine's online news site.

Relevant Publication
Özgür Şimşek and David Jensen, Decentralized search in networks using homophily and degree disparity, 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005 [pdf] [bibtex] [abstract]


Using Relational Knowledge Discovery to Prevent Securities Fraud

In a collaborative effort with the National Association of Securities Dealers (NASD), we applied statistical machine learning techniques to the problem of identifying and preventing securities fraud. NASD is the world's largest private-sector securities regulator, responsible for every firm in the United States that conducts securities business with the public. We developed statistical relational models of broker behavior that provide a ranking of active brokers with respect to their probability of committing a serious securities violation. Our models incorporate social, professional, and organizational relationships among brokers, which NASD experts believe are central to identifying misconduct, but which they were not able to easily use. NASD officials intend to use our models to improve the assignment of field examinations, which serve to identify fraud as well as to prevent it through increased scrutiny.

Relevant Publication
Jennifer Neville, Özgür Şimşek, David Jensen, John Komoroske, Kelly Palmer, Henry Goldberg, Using relational knowledge discovery to prevent securities fraud, 11th International Conference on Knowledge Discovery and Data Mining (KDD), 2005 [pdf] [bibtex] [abstract]