Information Retrieval, Databases, and Data Mining

(James Allan, Bruce Croft, Yanlei Diao, David Jensen, Victor Lesser, R. Manmatha, Andrew McCallum, Alexandra Meliou, Gerome Miklau, Edwina Rissland, Hanna Wallach, Shlomo Zilberstein)

The information that interests us comes from a variety of sources, including text documents, photographic images, sensor data, Web pages, and biological sources. Accessing this data requires that information meaningful to humans be extracted from weakly structured or totally unstructured sources, in addition to conventional structured sources. The information must then be efficiently indexed and accurately retrieved. The most common approaches require formal statistical modeling and extensive empirical validation of the access techniques. We also explore methods that can accommodate high-volume streams of data and that adapt well to situations where resource availability is unpredictable. Our data mining and knowledge discovery work focuses on finding unexpected but interesting patterns within any of the varied types of information. Patterns might be found in relationships between individual pieces of information, in recurring sensor events over time, or in collections of strongly related text documents. Finally, to ensure information is valuable to users, we investigate techniques to assess the quality, reliability, and authenticity of information. To ensure information is handled safely, we investigate techniques for protecting against unexpected disclosures that can threaten privacy.

Center for Intelligent Information Retrieval
The National Center for Intelligent Information Retrieval (CIIR) is an NSF-created S/IUCRC Center. The CIIR carries out basic research and technology transfer in the area of text-based and multimedia information systems. The research group investigates questions related to searching and browsing collections of documents.

Database and Information Management Laboratory
The Database and Information Management Laboratory (DBLab) focuses on the development of information infrastructures and data management systems for efficiently and securely managing large volumes of data. The research group is particularly interested in the challenges posed by emerging data types like XML and streaming data, and issues that arise in non-traditional architectures like embedded systems.

Information Extraction and Synthesis Laboratory
The Information Extraction and Synthesis Laboratory (IESL) specializes in the theoretical development and implementation of systems for extracting databases from unstructured text on the Web, and mining those databases to find patterns, predict the future, and provide decision support.

Knowledge Discovery Laboratory
The Knowledge Discovery Laboratory investigates how to find useful patterns in large and complex databases. We study the underlying principles of data mining algorithms, develop innovative techniques for knowledge discovery, and apply those techniques to practical tasks in areas such as fraud detection, scientific data analysis, and web mining.

Machine Learning for Data Science
The Machine Learning for Data Science laboratory (MLDS) focuses on the development of machine learning models and algorithms for addressing a variety of challenging problems in the emerging areas of computational social science, computational ecology, computational behavioral science and computational health science.

Multi-Agent Systems Laboratory
The Multi-Agent Systems Laboratory is concerned with the development and analysis of sophisticated AI problem-solving and control architectures for both single-agent and multiple-agent systems. Current research projects include cooperative information gathering, distributed situation assessment, distributed scheduling, auditory scene analysis, multi-agent learning of coordination strategies, and multi-agent coordination and negotiation protocols.

Resource-Bounded Reasoning Research Group
The Resource-Bounded Reasoning Research Group studies the construction of intelligent systems that can operate in real-time environments under uncertainty and with limited computational resources. The group conducts research in decision theory, real-time planning, autonomous agent architectures, and reasoning under uncertainty.