Error-Tolerant Record Matching


Thursday, March 10, 2011 - 10:00am


Surajit Chaudhuri
Research Area Manager
Data Management, Exploration and Mining Group
Microsoft Research


Christophe Giraud-Carrier

Record matching is a key element of data cleaning technology. Error-tolerant record matching reconciles multiple representations of the same entity in the presence of errors such as spelling mistakes and abbreviations. In this talk, we describe some of the key scenarios and the underlying technology for error-tolerant record matching that we have developed as part of our Data Cleaning project at Microsoft Research.
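
As a rough illustration of what "error-tolerant" means here, the sketch below matches a misspelled, abbreviated name against a small reference list using Jaccard similarity over character trigrams. This is a generic textbook technique chosen for illustration, not necessarily the method presented in the talk; the function names, the sample records, and the 0.3 threshold are all assumptions.

```python
def qgrams(text, q=3):
    """Return the set of character q-grams of a normalized, padded string."""
    # Lowercase and collapse runs of whitespace so formatting noise is ignored.
    normalized = " ".join(text.lower().split())
    # Pad both ends so leading and trailing characters form full q-grams.
    padded = "#" * (q - 1) + normalized + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b, q=3):
    """Jaccard similarity of the q-gram sets of two strings (0.0 to 1.0)."""
    grams_a, grams_b = qgrams(a, q), qgrams(b, q)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def best_match(record, reference, threshold=0.3):
    """Return the most similar reference entry, or None if below threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    candidate = max(reference, key=lambda r: jaccard(record, r), default=None)
    if candidate is None or jaccard(record, candidate) < threshold:
        return None
    return candidate


# Hypothetical reference table of clean entity names.
REFERENCE = ["Microsoft Corporation", "Micron Technology", "Oracle Corporation"]

if __name__ == "__main__":
    # A misspelled, abbreviated input still resolves to the intended entity.
    print(best_match("Microsft Corp", REFERENCE))
    # An unrelated input falls below the threshold and yields no match.
    print(best_match("Zzz Qqq", REFERENCE))
```

Trigram overlap tolerates both the dropped letter and the abbreviation because most of the short substrings survive either kind of error. Comparing every input against every reference entry, as done here, would not scale to large tables.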


Surajit Chaudhuri is a Research Area Manager at Microsoft Research, Redmond. He started the AutoAdmin project on self-tuning database systems. Surajit has also worked in the area of data cleaning. His research on both physical database design and data cleaning has been incorporated in Microsoft products and services such as SQL Server and Bing. Surajit received his Ph.D. from Stanford University and is an ACM Fellow. He was awarded the ACM SIGMOD Contributions Award in 2004 and a VLDB 10-Year Best Paper Award in 2007.