Karma

A Data Integration Tool

WHAT IS KARMA

Karma is an information integration tool that enables users to quickly and easily integrate data from a variety of data sources including databases, spreadsheets, delimited text files, XML, JSON, KML and Web APIs. Users integrate information by modeling it according to an ontology of their choice using a graphical user interface that automates much of the process. Karma learns to recognize the mapping of data to ontology classes and then uses the ontology to propose a model that ties together these classes. Users then interact with the system to adjust the automatically generated model. During this process, users can transform the data as needed to normalize data expressed in different formats and to restructure it. Once the model is complete, users can publish the integrated data as RDF or store it in a database.

All the project publications are listed under Publications below. The best paper on the technical aspects of Karma is our ESWC'2012 paper, and the best application paper is our ESWC'2013 paper, which received the best in-use paper award at the conference.

KARMA INNOVATIONS

Ease of Use

Karma uses programming-by-example, learning techniques and a Steiner tree optimization algorithm to automate as much of the mapping process as possible, so that end users can map their data to a chosen ontology. Users adjust the automatically generated model using a graphical user interface and never see the complex mapping rules used in other systems.
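
To give a feel for the Steiner tree idea, the sketch below treats ontology classes as graph nodes and candidate properties as weighted edges, then looks for a low-cost tree connecting the classes that the data was mapped to. It is a textbook greedy approximation (repeatedly attach the nearest unconnected class by its cheapest path), not Karma's actual algorithm, and the class names and weights are made up for illustration.

```python
import heapq


def undirected(edge_list):
    """Build an adjacency map {node: {neighbor: weight}} from (a, b, weight) triples."""
    graph = {}
    for a, b, weight in edge_list:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def cheapest_path_to_tree(graph, tree_nodes, targets):
    """Multi-source Dijkstra from the current tree to the nearest target node."""
    dist = {node: 0 for node in tree_nodes}
    prev = {}
    heap = [(0, node) for node in tree_nodes]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry
        if node in targets:
            # Walk back until we reach a node that was already in the tree.
            path = [node]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    raise ValueError("some terminals are not connected")


def approximate_steiner_tree(graph, terminals):
    """Greedy approximation: grow a tree from the first terminal."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    remaining = set(terminals[1:])
    tree_edges, total_cost = set(), 0
    while remaining:
        cost, path = cheapest_path_to_tree(graph, tree_nodes, remaining)
        tree_edges.update(frozenset(pair) for pair in zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= set(path)
        total_cost += cost
    return tree_edges, total_cost


# Hypothetical ontology graph: lower weight means a more plausible link.
graph = undirected([
    ("Person", "Artwork", 1),
    ("Artwork", "Place", 5),
    ("Person", "Organization", 1),
    ("Organization", "Place", 1),
    ("Person", "Place", 3),
])
tree_edges, total_cost = approximate_steiner_tree(graph, ["Artwork", "Person", "Place"])
print(sorted(tuple(sorted(edge)) for edge in tree_edges), total_cost)
```

In this toy graph the cheapest way to reach Place runs through Organization, a class that no column was mapped to. Pulling in such intermediate nodes is what distinguishes a Steiner tree from a spanning tree over the mapped classes alone.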

Hierarchical Sources

Many systems have been developed to map tabular sources to ontologies. Karma is unique in that it also supports hierarchical data sources such as XML, JSON and KML.
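
The difference between tabular and hierarchical sources can be seen in a small example. The sketch below unrolls a nested JSON document into flat rows with path-style column names, so that each row carries its full nesting context. The function name, the dotted-path convention and the sample document are assumptions made for illustration; this is not a description of how Karma represents hierarchical data internally.

```python
import json
from itertools import product


def flatten(value, prefix=""):
    """Return a list of flat rows (dicts) for a nested JSON value."""
    if isinstance(value, dict):
        # Flatten every field, then combine the alternatives across fields.
        per_field = [
            flatten(child, f"{prefix}.{key}" if prefix else key)
            for key, child in value.items()
        ]
        rows = []
        for combination in product(*per_field):
            row = {}
            for part in combination:
                row.update(part)
            rows.append(row)
        return rows
    if isinstance(value, list):
        # Every list element contributes its own rows.
        rows = []
        for item in value:
            rows.extend(flatten(item, prefix))
        return rows or [{}]
    return [{prefix: value}]


# Hypothetical nested source.
document = json.loads("""
{
  "museum": "Example Museum",
  "artists": [
    {"name": "A", "works": [{"title": "X"}, {"title": "Y"}]},
    {"name": "B", "works": [{"title": "Z"}]}
  ]
}
""")

rows = flatten(document)
for row in rows:
    print(row)
```

Each of the three works becomes one row that repeats its artist and museum, which shows the nesting information a purely tabular mapping tool would otherwise have to be given by hand.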

Web APIs

In addition to static sources (databases and files), Karma supports data integration from Web APIs, enabling users to leverage the thousands of data sources that are available today via Web APIs.

Semantic Models

Karma uses ontologies as a basis for integrating information, leveraging the class and property hierarchies, domain and range information and other ontology constructs to help users integrate their data. Users can combine multiple ontologies and map their data to standard vocabularies.
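
As a rough illustration of how domain and range information can narrow down the links between two classes, the sketch below keeps a tiny ontology in plain dictionaries and lists the properties that could connect a source class to a target class, taking superclasses into account. The ontology, its class and property names, and the function names are all hypothetical, and the logic is far simpler than what a real ontology reasoner does.

```python
# Hypothetical ontology: subclass links and (domain, range) per property.
SUBCLASS_OF = {"Painter": "Person", "Person": "Agent", "Painting": "Artwork"}
PROPERTIES = {
    "creator": ("Artwork", "Agent"),
    "birthPlace": ("Person", "Place"),
    "depicts": ("Painting", "Place"),
}


def ancestors(cls):
    """Return the class itself followed by all of its superclasses."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def candidate_properties(source, target):
    """Properties whose domain covers `source` and whose range covers `target`."""
    source_types = set(ancestors(source))
    target_types = set(ancestors(target))
    return sorted(
        name
        for name, (domain, range_) in PROPERTIES.items()
        if domain in source_types and range_ in target_types
    )


print(candidate_properties("Painting", "Painter"))
print(candidate_properties("Painter", "Place"))
```

Because Painting is an Artwork and Painter is an Agent, the general `creator` property is proposed even though neither class is named in its definition.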

Scalable Processing

Users work with a subset of their data to define the models that integrate their data sources, which keeps the user interface responsive during modeling. Karma can then apply these models in batch mode to integrate large data sources.
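
The split between interactive modeling and batch execution can be sketched as follows: a model, once defined, is just data, and can be applied row by row to a source of any size. The model layout, the `example.org` URIs and the function names below are invented for illustration and do not reflect Karma's model format. The output is written as N-Triples, one RDF statement per line.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical model: how one kind of row maps to a class and its properties.
MODEL = {
    "subject_template": "http://example.org/artist/{id}",
    "class": "http://example.org/ontology/Person",
    "columns": {
        "name": "http://example.org/ontology/name",
        "born": "http://example.org/ontology/birthYear",
    },
}


def escape(literal):
    """Escape a string for use as an N-Triples literal."""
    return (
        literal.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def triples(rows, model):
    """Lazily turn rows into N-Triples lines, so large sources can be streamed."""
    for row in rows:
        subject = model["subject_template"].format(**row)
        yield f'<{subject}> <{RDF_TYPE}> <{model["class"]}> .'
        for column, prop in model["columns"].items():
            value = row.get(column)
            if value:  # skip empty cells
                yield f'<{subject}> <{prop}> "{escape(value)}" .'


# A small in-memory CSV stands in for a large source here.
source = io.StringIO("id,name,born\n1,Ada,1815\n2,Bob,\n")
lines = list(triples(csv.DictReader(source), MODEL))
print("\n".join(lines))
```

Because `triples` is a generator, the same model can be run over a file far larger than the sample used to define it without loading the whole source into memory.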

Data Transformation

Karma offers a programming-by-example interface to enable users to define data transformation scripts that transform data expressed in multiple data formats into a common format.
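
The idea behind programming-by-example can be shown with a deliberately tiny sketch: the user supplies a few input/output pairs, and the system searches for a program that reproduces all of them and can then be applied to the remaining data. The program language below (copy the i-th token of the input, or emit a literal character) and all function names are assumptions made for illustration; Karma's transformation learning is considerably more capable than this.

```python
import re


def tokens(text):
    """Split a string into its alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", text)


def run(program, text):
    """Apply a program to an input string; None if a token index is missing."""
    toks = tokens(text)
    pieces = []
    for kind, value in program:
        if kind == "tok":
            if value >= len(toks):
                return None
            pieces.append(toks[value])
        else:
            pieces.append(value)
    return "".join(pieces)


def learn(examples):
    """Search for a program that reproduces every (input, output) example."""
    first_in, first_out = examples[0]
    toks = tokens(first_in)

    def search(pos, program):
        if pos == len(first_out):
            fits = all(run(program, i) == o for i, o in examples)
            return program if fits else None
        # Try to explain the output at this position as a copied input token.
        for index, tok in enumerate(toks):
            if first_out.startswith(tok, pos):
                found = search(pos + len(tok), program + [("tok", index)])
                if found is not None:
                    return found
        # Otherwise treat the next output character as a literal.
        return search(pos + 1, program + [("lit", first_out[pos])])

    program = search(0, [])
    if program is None:
        raise ValueError("no program in this simple language fits the examples")
    return program


examples = [("2016-03-07", "03/07/2016"), ("1999-12-31", "12/31/1999")]
program = learn(examples)
print(program)
print(run(program, "2020-01-15"))
```

The second example matters: with only one pair, an input such as "2016-07-07" would leave it ambiguous which token each "07" came from, and additional examples are how the user resolves that ambiguity.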

PUBLICATIONS

  2016 (4)
Semantic labeling: A domain-independent approach. Pham, M.; Alse, S.; Knoblock, C.; and Szekely, P. In ISWC 2016 - 15th International Semantic Web Conference, 2016.
Leveraging Linked Data to Discover Semantic Relations within Data Sources. Taheriyan, M.; Knoblock, C.; Szekely, P.; and Ambite, J. L. In ISWC 2016 - 15th International Semantic Web Conference, 2016.
Maximizing Correctness with Minimal User Effort to Learn Data Transformations. Wu, B.; and Knoblock, C. A. In Proceedings of the 21st International Conference on Intelligent User Interfaces, 2016.
Learning the semantics of structured data sources. Taheriyan, M.; Knoblock, C. A.; Szekely, P.; and Ambite, J. L. Journal of Web Semantics. 2016.
  2015 (4)
Leveraging Linked Data to Infer Semantic Relations within Structured Sources. Taheriyan, M.; Knoblock, C. A.; Szekely, P.; Ambite, J. L.; and Chen, Y. In Proceedings of the 6th International Workshop on Consuming Linked Data (COLD 2015), 2015.
An Iterative Approach to Synthesize Data Transformation Programs. Wu, B.; and Knoblock, C. A. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), 2015.
Assigning Semantic Labels to Data Sources. Ramnandan, S.; Mittal, A.; Knoblock, C. A.; and Szekely, P. In Proceedings of the 12th ESWC 2015, 2015.
Exploiting Semantics for Big Data Integration. Knoblock, C. A.; and Szekely, P. AI Magazine. 2015.
  2014 (3)
A Scalable Approach to Learn Semantic Models of Structured Sources. Taheriyan, M.; Knoblock, C. A.; Szekely, P.; and Ambite, J. L. In Proceedings of the 8th IEEE International Conference on Semantic Computing (ICSC 2014), 2014.
Publishing the Data of the Smithsonian American Art Museum to the Linked Data Cloud. Szekely, P.; Knoblock, C. A.; Yang, F.; Zhu, X.; Fink, E.; Allen, R.; and Goodlander, G. International Journal of Humanities and Art Computing (IJHAC), 8: 152-166. 2014.
Minimizing User Effort in Transforming Data by Example. Wu, B.; Szekely, P.; and Knoblock, C. A. In Proceedings of the International Conference on Intelligent User Interface, 2014.
  2013 (6)
Semantics for Big Data Integration and Analysis. Knoblock, C. A.; and Szekely, P. In Proceedings of the AAAI Fall Symposium on Semantics for Big Data, 2013.
Publishing Data from the Smithsonian American Art Museum as Linked Open Data. Knoblock, C. A.; Szekely, P.; Gupta, S.; Manglik, A.; Verborgh, R.; Yang, F.; and de Walle, R. V. In Proceedings of the ISWC 2013 Posters & Demonstrations Track, pages 129-132, 2013.
On-the-fly Integration of Static and Dynamic Sources. Harth, A.; Knoblock, C.; Stadtmüller, S.; Studer, R.; and Szekely, P. In Proceedings of the Fourth International Workshop on Consuming Linked Data (COLD2013), 2013.
A Semantic Approach to Retrieving, Linking, and Integrating Heterogeneous Geospatial Data. Zhang, Y.; Chiang, Y.; Szekely, P.; and Knoblock, C. A. In Proceedings of the 2013 IJCAI Workshop on Semantic Cities, 2013.
A Graph-based Approach to Learn Semantic Descriptions of Data Sources. Taheriyan, M.; Knoblock, C. A.; Szekely, P.; and Ambite, J. L. In Proceedings of the 12th International Semantic Web Conference (ISWC 2013), 2013.
Connecting the Smithsonian American Art Museum to the Linked Data Cloud. Szekely, P.; Knoblock, C. A.; Yang, F.; Zhu, X.; Fink, E.; Allen, R.; and Goodlander, G. In Proceedings of the 10th Extended Semantic Web Conference, Montpellier, May 2013. Awarded Best In-Use Paper at ESWC 2013.
  2012 (6)
Rapidly Integrating Services into the Linked Data Cloud. Taheriyan, M.; Knoblock, C. A.; Szekely, P.; and Ambite, J. L. In Proceedings of the 11th International Semantic Web Conference (ISWC 2012), 2012.
Semi-Automatically Mapping Structured Sources into the Semantic Web. Knoblock, C. A.; Szekely, P.; Ambite, J. L.; Gupta, S.; Goel, A.; Muslea, M.; Lerman, K.; Taheriyan, M.; and Mallick, P. In Proceedings of the Extended Semantic Web Conference, Crete, Greece, 2012.
Semi-Automatically Modeling Web APIs to Create Linked APIs. Taheriyan, M.; Knoblock, C. A.; Szekely, P.; and Ambite, J. L. In Proceedings of the ESWC 2012 Workshop on Linked APIs, 2012.
Learning Data Transformation Rules through Examples: Preliminary Results. Wu, B.; Szekely, P.; and Knoblock, C. A. In Ninth International Workshop on Information Integration on the Web (IIWeb 2012), 2012.
Mapping Existing Data Sources into VIVO. Knoblock, C. A.; Szekely, P.; Muslea, M.; and Gupta, S. August 2012.
Exploiting Structure within Data for Accurate Labeling Using Conditional Random Fields. Goel, A.; Knoblock, C. A.; and Lerman, K. In Proceedings of the 14th International Conference on Artificial Intelligence (ICAI), 2012.
  2011 (5)
Building Mashups by Demonstration. Tuchinda, R.; Knoblock, C. A.; and Szekely, P. ACM Transactions on the Web (TWEB), 5(3). July 2011.
Interactively Mapping Data Sources into the Semantic Web. Knoblock, C. A.; Szekely, P.; Ambite, J. L.; Gupta, S.; Goel, A.; Muslea, M.; Lerman, K.; and Mallick, P. In Proceedings of the First International Workshop on Linked Science 2011 in Conjunction with the 10th International Semantic Web Conference, Bonn, Germany, 2011.
Mind Your Metadata: Exploiting Semantics for Configuration, Adaptation, and Provenance in Scientific Workflows. Gil, Y.; Szekely, P.; Villamizar, S.; Harmon, T. C.; Ratnakar, V.; Gupta, S.; Muslea, M.; Silva, F.; and Knoblock, C. A. In Proceedings of the 10th International Semantic Web Conference (ISWC 2011), 2011.
Using Conditional Random Fields to Exploit Token Structure and Labels for Accurate Semantic Annotation. Goel, A.; Knoblock, C. A.; and Lerman, K. In Proceedings of the 25th National Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, 2011.
Exploiting Semantics of Web Services for Geospatial Data Fusion. Szekely, P.; Knoblock, C. A.; Gupta, S.; Taheriyan, M.; and Wu, B. In Proceedings of the SIGSPATIAL International Workshop on Spatial Semantics and Ontologies (SSO 2011), Chicago, IL, 2011.
  2010 (1)
Building Geospatial Mashups to Visualize Information for Crisis Management. Gupta, S.; and Knoblock, C. A. In Proceedings of the 7th International Conference on Information Systems for Crisis Response and Management, 2010.
  2008 (1)
Building Mashups by Example. Tuchinda, R.; Szekely, P.; and Knoblock, C. A. In Proceedings of the 2008 International Conference on Intelligent User Interface, January 2008.
  2007 (1)
Building Data Integration Queries by Demonstration. Tuchinda, R.; Szekely, P.; and Knoblock, C. A. In Proceedings of the International Conference on Intelligent User Interface, January 2007.

ACKNOWLEDGMENT

This research is based upon work supported in part by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL) under contract number FA8750-14-C-0240; the Smithsonian American Art Museum; the National Science Foundation under awards IIS-1117913 and CMMI-0753124; the NIH through the NCRR grant for the Biomedical Informatics Research Network (1 U24 RR025736-01); and the National Institutes of Health under grant number 1 UL1 RR031986-01 at the University of Southern California.