Research Assistant Professor, Department of Computer Science, University of Massachusetts Amherst
Formerly: Natural Language Processing at Johns Hopkins University; and Head Programmer, Perseus Project, Tufts University
See also my curriculum vitae in PDF.
Fall 2009: Introduction to Natural Language Processing (CS 585).
Spring 2009: James Allan, R. Manmatha, and I are leading a seminar on Mining Text and Images in Digital Libraries Using Grid Computing.
August 2006: Charles Schafer and I presented a tutorial, Overview of Statistical Machine Translation [pdf], at the Association for Machine Translation in the Americas.
Fall 2005: Noah Smith and I designed and taught a course on Empirical Research Methods in Computer Science.
David Mimno, Hanna Wallach, Jason Naradowsky, David A. Smith, and Andrew McCallum. Polylingual topic models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 880-889, 2009. [ PDF ]
David A. Smith and Jason Eisner. Parser adaptation and projection with quasi-synchronous grammar features. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 822-831, 2009. [ PDF | PowerPoint slides ]
David A. Smith and Jason Eisner. Dependency parsing by belief propagation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 145-156, 2008. [ PDF | PowerPoint slides ]
David A. Smith and Jason Eisner. Bootstrapping feature-rich dependency parsers with entropic priors. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 667-677, 2007. [ PDF | PowerPoint slides ]
David A. Smith and Noah A. Smith. Probabilistic models of nonprojective dependency trees. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 132-140, 2007. [ PDF | PowerPoint slides ]
Keith Hall, Jiří Havelka, and David A. Smith. Log-linear models of non-projective trees, k-best MST parsing and tree-ranking. In Proceedings of the CoNLL Shared Task, pages 962-966, 2007.
David A. Smith and Jason Eisner. Quasi-synchronous grammars: Alignment by soft projection of syntactic dependencies. In Proceedings of the HLT-NAACL Workshop on Statistical Machine Translation, pages 23-30, 2006. [ PDF | PowerPoint slides ]
Markus Dreyer, David A. Smith, and Noah A. Smith. Vine parsing and minimum risk reranking for speed and precision. In Proceedings of the CoNLL Shared Task, pages 201-205, 2006. [ PDF ]
David A. Smith and Jason Eisner. Minimum risk annealing for training log-linear models. In Proceedings of the International Conference on Computational Linguistics and the Association for Computational Linguistics, pages 787-794, 2006. [ PDF ]
Noah A. Smith, David A. Smith, and Roy W. Tromble. Context-based morphological disambiguation with random fields. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 475-482, 2005. [ PDF ]
F.J. Och, D. Gildea, S. Khudanpur, A. Sarkar, K. Yamada, A. Fraser, S. Kumar, L. Shen, D. Smith, K. Eng, V. Jain, Z. Jin, and D. Radev. A smorgasbord of features for statistical machine translation. In Proceedings of the Conference on Human Language Technology and the North American Association for Computational Linguistics, pages 161-168, 2004. [ PDF ]
David A. Smith and Noah A. Smith. Bilingual parsing with factored estimation: Using English to parse Korean. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 49-56, 2004. [ PDF ]
Michael Bendersky, W. Bruce Croft, and David Smith. Two-stage query segmentation for information retrieval. In the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '09), Boston, MA, USA, 2009.
David A. Smith and Gideon S. Mann. Bootstrapping toponym classifiers. In Proceedings of the HLT-NAACL Workshop on Analysis of Geographic References, pages 45-49, 2003. [ PDF ]
David A. Smith. Detecting and browsing events in unstructured text. In Proceedings of the 25th Annual ACM SIGIR Conference, pages 73-80, Tampere, Finland, August 2002. [ PDF ]
David A. Smith. Detecting events with date and place information in unstructured text. In Proceedings of the 2nd ACM+IEEE Joint Conference on Digital Libraries, pages 191-196, Portland, OR, July 2002. [ PDF ]
David A. Smith and Gregory Crane. Disambiguating geographic names in a historical digital library. In Proceedings of the European Conference on Digital Libraries (ECDL), pages 127-136, Darmstadt, Germany, September 2001. [ PDF ]
Gregory Crane, Clifford E. Wulfman, Lisa M. Cerrato, Anne Mahoney, Thomas L. Milbank, David Mimno, Jeffrey A. Rydberg-Cox, David A. Smith, and Christopher York. Towards a cultural heritage digital library. In Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2003, pages 75-86, Houston, TX, June 2003. [ PDF ]
David A. Smith, Anne Mahoney, and Gregory Crane. Integrating harvesting into digital library content. In Proceedings of the 2nd ACM+IEEE Joint Conference on Digital Libraries, pages 183-184, Portland, OR, July 2002. [ PDF ]
Gregory Crane, David A. Smith, and Clifford E. Wulfman. Building a hypertextual digital library in the humanities: A case study on London. In Proceedings of the First ACM+IEEE Joint Conference on Digital Libraries, pages 426-434, Roanoke, VA, June 2001. Best paper award. [ PDF ]
David A. Smith, Anne Mahoney, and Jeffrey A. Rydberg-Cox. Management of XML documents in an integrated digital library. In Proceedings of Extreme Markup Languages 2000, pages 219-224, Montreal, August 2000.
Gregory R. Crane, Robert F. Chavez, Anne Mahoney, Thomas L. Milbank, Jeffrey A. Rydberg-Cox, David A. Smith, and Clifford E. Wulfman. Drudgery and deep thought: Designing a digital library for the humanities. Communications of the Association for Computing Machinery, 44(5):35-40, 2001. [ PDF ]
David A. Smith, Jeffrey A. Rydberg-Cox, and Gregory R. Crane. The Perseus Project: A digital library for the humanities. Literary and Linguistic Computing, 15(1):15-25, 2000.
David A. Smith, Anne Mahoney, and Jeffrey A. Rydberg-Cox. Management of XML documents in an integrated digital library. Markup Languages: Theory and Practice, 2(3):205-214, 2000. [ PDF ]
David A. Smith. Textual variation and version control in the TEI. Computers and the Humanities, 33(1-2):103-112, 1999.
David A. Smith. Debabelizing libraries: Machine translation by and for digital collections. D-Lib Magazine, 12(3), March 2006. [ HTML ]
Anne Mahoney, Jeffrey A. Rydberg-Cox, David A. Smith, and Clifford E. Wulfman. Generalizing the Perseus XML document manager. In Linguistic Exploration: Workshop on Web-based Language Documentation and Description, Philadelphia, December 2000. [ HTML ]