Data Integration: Prometheus Framework  
  We can use a range of extraction techniques to pull data from a wide variety of sources. However, different sources often have different schemas, access methods, and coverage. To address this issue, we have developed a data integration framework called Prometheus that provides uniform access to the sources. Prometheus provides an infrastructure that can be used (a) to quickly build applications that integrate data from various data sources, and (b) as a test bed for information integration researchers to build and test new information integration techniques. Our group has worked on data integration systems for a long time. Before Prometheus, our group designed the SIMS and Ariadne mediators.  
       
  Figure 1 shows an example application that can be built using Prometheus. As shown in the figure, there are four data sources. In this example application, all data sources are web sources, i.e., web sites, but data sources can also be web services or databases. Given the data sources, the user can ask the integration system queries about hotels and restaurants.  
       
 
Figure 1. An Example Application
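
As a rough illustration of what uniform access means here, the sketch below wraps two sources with different schemas behind one common relation and answers a single query across both. It is a minimal sketch under stated assumptions, not Prometheus code: the source contents, field names, wrapper functions, and `query_hotels` are all hypothetical.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]

# Hypothetical raw records standing in for data extracted from two web sites
# that describe hotels using different schemas.
SOURCE_A = [{"hotel_name": "Sea View", "town": "Venice", "usd": 120}]
SOURCE_B = [
    {"name": "Canal Inn", "city": "Venice", "price": "95"},
    {"name": "Hill Lodge", "city": "Pasadena", "price": "80"},
]


def wrap_a() -> Iterator[Record]:
    """Map source A's schema onto the common hotel(name, city, price) relation."""
    for row in SOURCE_A:
        yield {"name": row["hotel_name"], "city": row["town"], "price": float(row["usd"])}


def wrap_b() -> Iterator[Record]:
    """Map source B's schema (string prices) onto the same common relation."""
    for row in SOURCE_B:
        yield {"name": row["name"], "city": row["city"], "price": float(row["price"])}


# The integration layer only sees wrappers, not source-specific schemas.
WRAPPERS: List[Callable[[], Iterator[Record]]] = [wrap_a, wrap_b]


def query_hotels(city: str, max_price: float) -> List[Record]:
    """Answer one query over all sources through the common relation."""
    results = [
        rec
        for wrapper in WRAPPERS
        for rec in wrapper()
        if rec["city"] == city and rec["price"] <= max_price
    ]
    return sorted(results, key=lambda rec: rec["price"])
```

A call such as `query_hotels("Venice", 100.0)` is written once against the common relation; adding a third source in this sketch would only require another wrapper function.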
 
       
  Research Focus  
       
  Building on our earlier SIMS and Ariadne mediators, in this project we are addressing the following limitations of data integration systems:  
       
 
  1. Efficient execution of recursive integration plans - Traditionally, data integration systems have not been able to execute recursive integration plans efficiently. Prometheus addresses this issue with a query reformulation algorithm that supports the generation of recursive integration plans, and it executes the generated plans (both recursive and non-recursive) efficiently using the Theseus execution engine. More details on techniques for translating recursive and non-recursive datalog programs into plans for a streaming, dataflow-style execution engine are in the IJCAI 2003 workshop article or the VLDB Journal article.

  2. Extending view integration techniques to support dynamic service composition - We have applied the Prometheus mediator to the problem of dynamic composition of web services. While traditional view integration techniques apply directly to this problem, new challenges arise. We developed a new optimization technique called tuple-level filtering that introduces sensing and filtering operations to optimize the template integration plans generated for web services.

  3. Incorporating operations to support geospatial datatypes - In the TerraWorld project, we are extending the mediator to support integration of a wide variety of geospatial data, such as satellite imagery, maps, and vector data. The key new challenge in the geospatial domain is accurately combining the retrieved information. For example, when integrating road vector data with satellite imagery, we need to align the vector data with the imagery, i.e., ensure that roads in the vector data line up with roads in the imagery. In our ACMGIS 2004 paper, we described techniques to automatically align vector data with imagery. Currently, we are working on representing the alignment operations in the mediator, so that the mediator can dynamically generate plans that not only retrieve geospatial data but also accurately integrate different types of geospatial data.
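
To make the notion of a recursive plan in item 1 more concrete, the sketch below evaluates a small recursive datalog program, `reachable(X, Y) :- edge(X, Y)` and `reachable(X, Z) :- reachable(X, Y), edge(Y, Z)`, using semi-naive fixpoint iteration. This is a generic textbook technique shown for illustration only; it is not the Prometheus reformulation algorithm or the Theseus execution model, and the relation names and sample facts are assumptions.

```python
from typing import Set, Tuple

Fact = Tuple[str, str]


def reachable(edges: Set[Fact]) -> Set[Fact]:
    """Compute the transitive closure of `edges` by semi-naive evaluation."""
    # Base rule: reachable(X, Y) :- edge(X, Y).
    total: Set[Fact] = set(edges)
    delta: Set[Fact] = set(edges)

    # Recursive rule: reachable(X, Z) :- reachable(X, Y), edge(Y, Z).
    # Only facts derived in the previous round (delta) are joined with edge,
    # so earlier derivations are not recomputed on every iteration.
    while delta:
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - total  # keep only facts that are actually new
        total |= delta

    return total


# Hypothetical sample facts: a simple chain a -> b -> c -> d.
EDGES: Set[Fact] = {("a", "b"), ("b", "c"), ("c", "d")}
```

The loop stops once a round derives no new facts, which also guarantees termination on cyclic inputs because the set of possible facts is finite.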