Detailed information about my work:
- All my publications are on this page, on Google Scholar and on ResearchGate
- Videos are on YouTube
- Recent presentations are on SlideShare
- News and useful tidbits for students are on my blog
- Detailed information: DIG page, Karma page
I like working with students. More info here.
Nowadays we have access to lots of data for making decisions, but it is difficult to combine these data and act on them. The problem is that the data are scattered across different sources, in different formats and schemas, with no metadata to describe their meaning and provenance. Data may be stored in databases, Excel spreadsheets, or CSV, XML or JSON files, or may be accessible only via a Web service or REST API. My research objective is to help consumers of these data easily clean, transform and combine them for analysis, and to help providers publish their data with the appropriate metadata so that the data are more useful to consumers.
Our approach is based on two ideas: semantics and examples. When tools understand the meaning of data, they can more effectively help users combine the data in a meaningful way. To this end, we are developing techniques to semi-automatically infer the semantics of the data from examples. Users then show the system, using sample data, how they want the data combined and processed, and the system infers a workflow that can be run in batch on large datasets (big data).
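To make the "examples" idea concrete, here is a minimal sketch in Python. It is not Karma's algorithm or API: the candidate transformations, the function names (infer_transform, apply_batch) and the sample values are all hypothetical, chosen only to illustrate the general pattern of inferring a transformation from a few input/output examples and then running it in batch.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical candidate transformations (an assumption for illustration).
# A real system would search a much larger space of programs.
CANDIDATES: Dict[str, Callable[[str], str]] = {
    "identity": lambda s: s,
    "upper": str.upper,
    "lower": str.lower,
    "strip": str.strip,
    # "Last, First" -> "First Last"
    "last_first_to_first_last": lambda s: " ".join(
        part.strip() for part in reversed(s.split(",", 1))
    ),
}


def infer_transform(examples: List[Tuple[str, str]]) -> Optional[str]:
    """Return the name of the first candidate consistent with every example."""
    for name, fn in CANDIDATES.items():
        if all(fn(src) == dst for src, dst in examples):
            return name
    return None


def apply_batch(name: str, values: List[str]) -> List[str]:
    """Apply the inferred transformation to a whole column of values."""
    fn = CANDIDATES[name]
    return [fn(v) for v in values]


# The user demonstrates the desired result on two sample rows...
examples = [("Homer, Winslow", "Winslow Homer"), ("Cassatt, Mary", "Mary Cassatt")]
inferred = infer_transform(examples)

# ...and the inferred transformation is then applied to the remaining data.
batch = apply_batch(inferred, ["O'Keeffe, Georgia", "Hopper, Edward"])
print(inferred, batch)
```

The user never writes the transformation; two examples are enough to pick it out of the candidate set, and the same choice can then be replayed over a dataset of any size.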
I am interested in both technology and applications. Our information integration toolkit, Karma, is open source software that you can download to solve your information integration problems. I also collaborate with multiple organizations to apply Karma to build interesting applications in domains such as intelligence analysis, bioinformatics, cultural heritage and business intelligence.
Here is a video that illustrates how we use Karma to publish the data from the Smithsonian American Art Museum as Linked Open Data:
At ISI, I work in Craig Knoblock's Information Integration Group, and I collaborate very closely with him on most projects. I collaborate with Jose Luis Ambite on information integration topics, with Gully Burns on bioinformatics data integration, with Yolanda Gil on provenance and workflows, with Yao-Yi Chiang on data mining and geospatial data integration, and with Rajiv Maheswaran and Yu-Han Chang on the analysis of spatio-temporal data.
I am working with Rudi Studer, Andreas Harth and Steffen Stadtmüller from KIT on combining their Data-Fu engine with Karma to support integration of dynamic data; with Freddy Priyatna in Oscar Corcho's group to use Karma in his work with Google Fusion Tables; with Alex Viggio and other folks from the VIVO community to use Karma as a data ingestion tool for VIVO; with Rachel Allen from the Smithsonian American Art Museum and Eleanor Fink on our work to publish museum data to the Linked Open Data cloud; with Joan Cobb from the Getty on publication of the Getty vocabularies to the Linked Open Data cloud; and with Miel Vander Sande and an enthusiastic group of USC undergraduates to adapt his wonderful everythingisconnected work to produce stories using the Smithsonian American Art Museum data.
I am always looking for new opportunities to collaborate, so please send me a note if you see any topics of mutual interest. Nowadays, I attend the Semantic Web conferences (ISWC and ESWC) and the Intelligent User Interfaces Conference (IUI), so look for me there.
In the past, I was conference chair for UIST and IUI, and I was IUI program co-chair in 2013. I regularly review for HCI, semantic web and AI conferences. I figure I should review at least as many papers as I submit. I often have at least 2 coauthors, so things should balance out.
Lately, I have become interested in promoting the Semantic Web in Latin America. In 2012 and 2013 I taught summer courses on the Semantic Web at the Universidad de los Andes, my undergraduate college, and at the Pontificia Universidad Javeriana, both in Bogota, Colombia. Both times I had enthusiastic students and it was a pleasure to teach the course. I intend to go back every summer to teach this class (I would like to do it in Medellin in 2014, and I need an invitation, hint?). I am also working with a team from the Universidad de los Andes in Bogota and the Universidad Simon Bolivar in Caracas on a bid to host the 2015 International Semantic Web Conference, yes ISWC, in Latin America.
There is a group of Latin American Semantic Web researchers, scattered all around the world but eager to work to promote Semantic Web technologies in Latin America. Boris Villazon-Terrazas is doing the heavy lifting of organizing the group, kudos to him, and if you can help, please email me or contact Boris.