Geospatial Information Fusion  
  The abundance of geospatial data sources on the Internet greatly enhances the ability to reason over geospatial entities using publicly available information. Traditional data sources such as satellite imagery, maps, gazetteers, and vector data have long been used in geographic information systems (GIS). However, incorporating non-traditional sources such as phone books raises integration issues that have not previously been addressed. The goal of this project is to develop robust techniques that integrate such open-source data in order to identify building structures in a satellite image. By building identification, we mean labeling the buildings in the satellite image with their correct postal addresses and then associating the information from the relevant phone books with those addresses. Figure 1 below illustrates the objective.  
       
 
Figure 1: Objective
 
       
  Approach  
       
  We model the Building IDentification (BID) problem as a constraint satisfaction problem (CSP) (Figure 2) to capture its various aspects and express the addressing rules as constraint relations [Michalowski and Knoblock, 2005]. We have also developed advanced constraint-reformulation techniques and constraint-based inference techniques (known as constraint-propagation techniques) to solve the BID problem more efficiently [Bayer et al., 2007; Michalowski et al., 2007]. The approach we present (Figure 3) is a novel way to use both explicit and implicit information in publicly available data sources. The key challenge lies in combining this information and using it to label buildings in satellite imagery with a high degree of accuracy. A constraint satisfaction framework lets us address the integration issue by generating a single CSP model into which all of the information can be plugged easily. Finally, leveraging common properties of streets and addresses allows us to provide solutions that could not be deduced from any individual source and instead require combining data from multiple sources.  
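To make the CSP view concrete, the following is a minimal sketch and not the project's actual model or solver. It assumes three illustrative addressing rules (numbers on one side of a street share parity, opposite sides differ in parity, and numbers increase along a side), a hypothetical data layout (`BUILDINGS`, `DOMAINS`), and plain backtracking rather than the reformulation and propagation techniques cited above.

```python
# Hypothetical data model: each building has a street, a side of that street,
# and an integer position along it (assumed distinct per side and known from
# vector data). None of these names come from the original project.
BUILDINGS = {
    "B1": {"street": "Main St", "side": "north", "pos": 0},
    "B2": {"street": "Main St", "side": "north", "pos": 1},
    "B3": {"street": "Main St", "side": "south", "pos": 0},
}
# Candidate house numbers per street, e.g. as gathered from a phone book.
DOMAINS = {"Main St": [101, 103, 104]}


def consistent(assignment, buildings):
    """Check the assumed addressing rules on a (partial) assignment."""
    names = list(assignment)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ba, bb = buildings[a], buildings[b]
            if ba["street"] != bb["street"]:
                continue  # rules below only relate buildings on one street
            na, nb = assignment[a], assignment[b]
            if na == nb:
                return False  # two buildings cannot share an address
            same_side = ba["side"] == bb["side"]
            if same_side != (na % 2 == nb % 2):
                return False  # parity rule: same side iff same parity
            if same_side and (ba["pos"] < bb["pos"]) != (na < nb):
                return False  # ordering rule along one side of the street
    return True


def solve(buildings, domains):
    """Enumerate every consistent building-to-number mapping by backtracking."""
    names = sorted(buildings)
    solutions = []

    def backtrack(idx, assignment):
        if idx == len(names):
            solutions.append(dict(assignment))
            return
        name = names[idx]
        for number in domains[buildings[name]["street"]]:
            assignment[name] = number
            if consistent(assignment, buildings):
                backtrack(idx + 1, assignment)
            del assignment[name]

    backtrack(0, {})
    return solutions


print(solve(BUILDINGS, DOMAINS))
```

In this toy instance no single source fixes the labels, but the phone-book numbers combined with the assumed rules leave exactly one mapping. With sparser data, `solve` would return several mappings, which is the ambiguity that additional sources are meant to reduce.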
       
 
Figure 2: CSP Formulations
 
       
       
 
Figure 3: Modeling Building Identification Problem in a Constraint Satisfaction Framework
 
       
  Geospatial Reasoning Framework  
       
  Based on this approach, we have developed a geospatial reasoning framework that allows a user to interactively gather information from different data sources, such as OpenStreetMap and the Yellow Pages, and to execute a constraint-reasoning process over the collected data for building identification. The geospatial reasoning process broadly involves the following steps:  
       
 
  • Identification of streets and buildings: The framework allows the user to identify streets and buildings by interactively gathering their associated vector data in various ways:
    • Import OpenStreetMap data
    • Import existing vector data in formats such as KML or shapefiles
    • Create the vector data manually through the framework's interface

  • Execution of the constraint-reasoning process: Once the locations of the buildings and streets have been identified, the user can execute the constraint-reasoning process over the collected information to generate mappings between building locations and addresses.
  • Exploitation of other public data sources: Information from other public data sources can be integrated into the system to resolve ambiguities (e.g. one building being mapped to multiple addresses) that may be present in the mappings produced, as explained in the demonstration below. One such data source is the maps of the area that may appear on the websites of nearby businesses (URLs are gathered from the Yellow Pages data). The MapFinder service is used to classify images found on those websites as maps or non-maps.
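The last step can be pictured as filtering the set of mappings returned by the reasoner. The sketch below is illustrative only: the function names, the list-of-dicts representation of mappings, and the `evidence` format (a building pinned to an address, for instance after locating it on a business's map) are all assumptions, and no MapFinder or framework API is used.

```python
def candidate_addresses(solutions):
    """Collect, per building, every address it receives in some mapping."""
    candidates = {}
    for solution in solutions:
        for building, address in solution.items():
            candidates.setdefault(building, set()).add(address)
    return candidates


def ambiguous(solutions):
    """Buildings that are still mapped to more than one address."""
    cands = candidate_addresses(solutions)
    return sorted(b for b, addrs in cands.items() if len(addrs) > 1)


def apply_evidence(solutions, evidence):
    """Keep only the mappings that agree with the extra evidence."""
    return [
        s for s in solutions
        if all(s.get(b) == addr for b, addr in evidence.items())
    ]


# Hypothetical reasoner output: three mappings consistent with the constraints.
solutions = [
    {"B1": "101 Main St", "B2": "103 Main St"},
    {"B1": "101 Main St", "B2": "105 Main St"},
    {"B1": "103 Main St", "B2": "105 Main St"},
]
# Hypothetical evidence from another source that pins B2 to one address.
evidence = {"B2": "103 Main St"}

remaining = apply_evidence(solutions, evidence)
print(ambiguous(solutions), ambiguous(remaining))
```

Here a single extra fact about B2 also settles B1, because only one of the original mappings is compatible with it.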