Rico Angell

I am a final-year PhD candidate in computer science at the University of Massachusetts Amherst working with Andrew McCallum on machine learning and natural language processing. My research interests include machine learning, optimization, and human-centered artificial intelligence, with an increasing focus on AI safety and alignment. I have been generously supported by the NSF Graduate Research Fellowship and the Spaulding-Smith Fellowship during my PhD. My CV can be found here.

I completed my undergraduate degree in computer science and engineering, with a minor in math, at the University of Michigan in Ann Arbor. There I worked with Andrew DeOrio on applying machine learning to hardware verification and with Grant Schoenebeck on a heuristic for the influence maximization problem.

I have completed internships at Google Research, the Chan-Zuckerberg Initiative, and MIT Lincoln Laboratory.

I am actively seeking a postdoc with a focus on AI safety and alignment.

Research Interests

Scalable Machine Learning and Optimization: My thesis research has yielded a new algorithm for solving general semidefinite programs that scales practically and efficiently to massive problem sizes (e.g. 10^13 decision variables) while provably finding optimal solutions. On multiple benchmarks it runs over 500x faster than the previous state of the art (arXiv ‘23). The approach combines a novel spectral bundle method with matrix sketching techniques, implemented in standalone JAX. I have also worked on leveraging approximate matrix decomposition techniques to improve the scalability of nearest neighbor search in a non-metric similarity space parameterized by a cross-encoder LLM (EMNLP ‘22).

Clustering with Application to Entity Resolution: I have developed novel training and inference procedures for large-scale clustering, driven by applications to entity linking and entity resolution, central tasks in automated knowledge base construction. This research yielded two novel clustering methods operating jointly on both mention-mention and mention-entity affinities from a specially-trained LLM (NAACL ‘21, NAACL ‘22). I have also developed inference algorithms based on combinatorial optimization for incorporating novel human-in-the-loop feedback into entity resolution decisions (ICML ‘22).

Fairness Testing: I have developed multiple tools that help machine learning practitioners test a model’s fairness towards a protected group or attribute. The first computes causal discrimination scores based on counterfactuals (ESEC/FSE ‘18). The second allows a user to view the Pareto frontier trading off performance and fairness metrics (EJDP ‘23).

AI Safety and Alignment: I am now transitioning my research squarely into AI safety and alignment. I am particularly interested in developing and scaling interpretability techniques with the end goal of evaluating and auditing AI systems. As part of my thesis work, I am leveraging my fast and scalable semidefinite programming algorithm to improve sparse dictionary learning used for decomposing neural representations in superposition.

Publications

Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching [pdf] [code]
Rico Angell, Andrew McCallum
arXiv preprint 2023

Fairkit, Fairkit, on the Wall, Who’s the Fairest of Them All? Supporting Data Scientists in Training Fair Models [pdf]
Brittany Johnson, Jesse Bartola, Rico Angell, Katherine Keith, Sam Witty, Stephen J Giguere, Yuriy Brun
EURO Journal on Decision Processes, 2023

Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization [pdf] [code]
Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum
EMNLP 2022

Entity Linking via Explicit Mention-Mention Coreference Modeling [pdf] [code]
Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum
NAACL 2022

Interactive Correlation Clustering with Existential Cluster Constraints [pdf] [code]
Rico Angell, Nicholas Monath, Nishant Yadav, Andrew McCallum
ICML 2022

Event and Entity Coreference using Trees to Encode Uncertainty in Joint Decisions [pdf]
Nicholas Monath, Nishant Yadav, Rico Angell, Andrew McCallum
EMNLP/CRAC 2021

Clustering-based Inference for Biomedical Entity Linking [pdf] [code]
Rico Angell, Nicholas Monath, Sunil Mohan, Nishant Yadav, Andrew McCallum
NAACL 2021

Low Resource Recognition and Linking of Biomedical Concepts from a Large Ontology [pdf]
Sunil Mohan, Rico Angell, Nick Monath, Andrew McCallum
BCB 2021

Relation-Dependent Sampling for Multi-Relational Link Prediction [pdf]
Arthur Feeney*, Rishabh Gupta*, Veronika Thost, Rico Angell, Gayathri Chandu, Yash Adhikari and Tengfei Ma
ICML 2020 Workshop on Graph Representation Learning and Beyond (GRL+)

Inferring Latent Velocities from Weather Radar Data using Gaussian Processes [pdf]
Rico Angell and Daniel Sheldon
Conference on Neural Information Processing Systems (NeurIPS) 2018

Themis: Automatically Testing Software for Discrimination [pdf]
Rico Angell, Brittany Johnson, Yuriy Brun and Alexandra Meliou
Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2018

Don’t Be Greedy: Leveraging Community Structure to Find High Quality Seed Sets for Influence Maximization [pdf]
Rico Angell and Grant Schoenebeck
International Conference on Web and Internet Economics (WINE) 2017

A Topological Approach to Hardware Bug Triage [pdf]
Rico Angell, Ben Oztalay, Andrew DeOrio
Microprocessor and SOC Test and Verification (MTV) 2015

Mentoring

Independent Studies
  • Dhruv Agarwal (MS -> PhD, Spring 2021-Present) - Clustering for Entity Resolution
  • Sriharsha Hatwar (MS, Spring 2023) - Modeling Uncertainty of Human Feedback for Entity Resolution
  • Aneri Rana (Spring 2023) - Modeling Uncertainty of Human Feedback for Entity Resolution
  • Haritha Ananthakrishna (MS, Fall 2022) - Scalable Semidefinite Programming
  • Pragya Prakash (MS, Fall 2022) - End-to-end Supervised Correlation Clustering
  • Manay Patel (Undergrad, Fall 2021) - Entity Resolution for Patent Author Disambiguation
  • Shubham Shetty (MS, Fall 2021) - Lifelong Entity Linking and Discovery
  • Ronald Soeh (MS, Spring 2021) - Efficiently Updating Nearest Neighbor Indices During Model Training
  • Matt Pearce (Undergrad, Fall 2020) - Visualization of Discovered Biomedical Entities and Concepts

COMPSCI 696DS, Industry Mentorship Program Projects
  • Vijayalakshmi Vasudevan, Rishabh Garg, Sriharsha Hatwar (Spring 2024) - Investigations in Compressed Context Windows
  • Aditya Kuppa, Alexandra Burushkina, Yugantar Prakash (Spring 2023) - A Unified Natural Language Understanding Re-ranker with Deep Reinforcement Learning
  • Ruei-Yao Sun, Nilesh Khade (Spring 2021) - Non-Gradient Based Adversarial Attack and Defense for Sequence Labeling
  • Arthur Feeney, Yash Adhikari, Gayathri Chandu, Rishabh Gupta (Spring 2020) - Using Graph Neural Networks for Drug-Drug Interaction Detection

Teaching

  • COMPSCI 696DS (University of Massachusetts Amherst) - Industry Mentorship Program - Lead TA - Spring 2022
  • COMPSCI 696DS (University of Massachusetts Amherst) - Industry Mentorship Program - Spring 2021
  • EECS 280 (University of Michigan) - Programming and Introductory Data Structures - Winter 2015

Additional Interests

Outside of work, I enjoy skiing, hiking, and training Brazilian jiu-jitsu (I am a purple belt under John Clarke in the Carlson Gracie lineage).

Contact

rangell [at] cs [dot] umass [dot] edu
LinkedIn

College of Information and Computer Sciences
University of Massachusetts
140 Governors Dr
Amherst, MA 01002