The Database Group @ University of Toronto
The xCurator Project
         
 
  Try a preview of xCurator's mapping interface here.
 
   Project Description
 
   Semistructured data is abundant on the Web. Many Web data sources and APIs make their data available in XML, JSON, or a domain-specific semistructured format, with the goal of making the data easily accessible and usable by Web application developers. Although such formats are more machine-processable than plain text documents, managing and analyzing this data at large scale is often nontrivial. This is mainly due to the lack of a well-defined structure and clear semantics in such formats, which can also degrade the quality of the data over time.
 In the xCurator project, our goal is to add structure to such data and enhance its quality by extracting entities along with their types and associations, identifying and merging duplicate entities, linking related entities, and publishing the results on the Web, all in a lightweight, easy-to-use, and scalable framework that effectively incorporates user feedback in all phases. We have designed our system based on our experience in managing large volumes of (user-generated) data on the Web in several real-world applications.
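
 As a rough illustration of the first two steps described above (entity extraction and duplicate merging), the following minimal Python sketch walks a nested JSON record. It is not xCurator's implementation: the function names, the rule that every JSON object becomes an entity typed by its parent key, and the name-based duplicate key are all assumptions made for this example.

```python
import json
from collections import defaultdict


def extract_entities(node, type_name="record"):
    """Yield (type, attributes) pairs from a nested JSON-like value.

    Assumption: each dict is one entity, typed by the key it was found
    under; only its scalar fields are kept as attributes.
    """
    if isinstance(node, dict):
        attrs = {k: v for k, v in node.items()
                 if not isinstance(v, (dict, list))}
        yield type_name, attrs
        for key, value in node.items():
            if isinstance(value, (dict, list)):
                yield from extract_entities(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from extract_entities(item, type_name)


def normalize(text):
    """Lowercase and collapse whitespace so near-identical names match."""
    return " ".join(str(text).lower().split())


def merge_duplicates(entities, key_attr="name"):
    """Merge entities of the same type whose normalized key_attr is equal.

    Assumption: entities lacking key_attr are skipped; on conflicting
    attribute values, the later entity wins.
    """
    merged = defaultdict(dict)
    for type_name, attrs in entities:
        if key_attr not in attrs:
            continue
        merged[(type_name, normalize(attrs[key_attr]))].update(attrs)
    return dict(merged)


# Hypothetical input: one trial record with two spellings of one sponsor.
sample = json.loads("""
{"id": "T1", "name": "Trial A",
 "sponsor": [{"name": "Acme  Pharma"},
             {"name": "acme pharma", "country": "CA"}]}
""")

entities = merge_duplicates(extract_entities(sample, "trial"))
for key, attrs in entities.items():
    print(key, attrs)
```

 A real system would replace the exact-match key with approximate string matching and would let user feedback confirm or reject each proposed merge; the sketch only shows where those decisions sit in the pipeline.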
 
   Live Applications
 
  • LinkedCT
    The Linked Clinical Trials (LinkedCT) data set is a Linked Data source of clinical trials data.
  • BibBase3
    The BibBase data server aims to create high-quality Linked Data out of the BibTeX files of BibBase's users.
 
   People
 
 
   Publications
 
  • Linking Semistructured Data on the Web
    S. Hassas Yeganeh, O. Hassanzadeh, and R.J. Miller.
    Proceedings of the 14th International Workshop on the Web and Databases (WebDB 2011) at SIGMOD 2011
 
   Related Projects
 
  • LinQuer:  Linkage Query Writer
  • Stringer:  Duplicate Detection System for String Data 
 
   Datasets and Experimental Results
 
  • Will be available soon.
 
 
Copyright © 2011 The Database Group @ University of Toronto | Last Updated May 25th, 2011