Recent Projects

    • iBench  The first metadata generator that can be used to evaluate a wide range of integration tasks.
    • Data Quality
      Dirty data is a serious problem for businesses: it leads to incorrect decision making and inefficient daily operations, and ultimately wastes both time and money. Dirty data often arises when domain constraints and business rules, meant to preserve data consistency and accuracy, are enforced incompletely or not at all in application code. In this work, we study how to detect errors in data. In a sister project, BART, we provide a scalable way to generate dirty data to systematically evaluate data cleaning systems.
    • LinkedCT
      A Linked Data Space for Clinical Drug Trials (part of the Linking Open Drug Data (LODD) project at W3C).
    • Data Exchange & Schema Mappings: the theory behind Clio
      Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this project, we address foundational and algorithmic issues related to the semantics of data exchange and to the query answering problem in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem, or none at all.

Past Projects

    • Clio  Creating and managing schema mappings
    • ConQuer  Efficient management and querying of inconsistent data
    • Hyperion  Data management support for dynamic peer-to-peer data sharing applications
    • Iliads  Leveraging data and structure in ontology integration
    • Limbo  Structure discovery in large datasets
    • ToMAS  Managing and evolving schema mappings
