Information Integration Research Group


	Map Extraction



	Raster maps are widely available for areas around the globe and are an important source of geospatial data. Compared with other geospatial data, raster maps are easily accessible and provide geographic features that are difficult to find elsewhere, such as landmarks in historical maps. For example, the tourist map shown in Figure 1(a), found using an image search engine on the Internet, contains location information such as the gas stations, hotels, and road names of Tehran, Iran, while the hybrid view from Google Maps shown in Figure 1(b) shows only major roads and their labels.


	(a) A tourist map found on the Internet	(b) The hybrid view of Tehran from Google Maps

	Figure 1: The tourist map contains rich information that is difficult to find elsewhere for the city of Tehran, Iran

	We can exploit the geographic features in raster maps (e.g., roads and text labels) to provide additional knowledge for viewing and understanding other geospatial data. For example, Figure 2(a) shows that by aligning the tourist map to the imagery, we can create an integrated representation of the geospatial datasets. Figure 2(b) shows that we can align the text layer from a raster map to imagery to annotate the objects in the imagery. Figure 2(c) shows that we can extract road-intersection templates, align them to imagery, and use them as seed points to extract roads from the imagery in areas where we have limited access to vector data.

	(a) Fusing a tourist map with imagery	(b) Labeling roads in imagery with the text layer from a map

	(c) Extract roads with road-intersection templates

	Figure 2: Exploiting information in raster maps for imagery understanding

	Harvesting the geographic features in raster maps is a challenging task because of (1) the varying image quality (e.g., scanned maps with poor image quality versus digitally generated maps with good image quality), (2) the complexity of maps (i.e., overlapping features), and (3) the typical lack of metadata (e.g., map geocoordinates, map source, and original vector data). To overcome these difficulties, this project investigates a general approach to extract feature layers from raster maps and recognize geographic features from the extracted layers. Our approach is able to process raster maps with varying map complexity and image quality (such as the examples shown in Figure 3) and does not rely on auxiliary information about the raster maps.

	Map Decomposition Techniques

	We developed two map decomposition techniques, each requiring a different amount of user input, to first decompose raster maps with varying image quality into individual layers of geographic features.

	For raster maps with good image quality, we developed a fully automatic technique that exploits the distinctive geometry of the desired geographic features (e.g., road lines) to decompose raster maps into feature layers, namely the road layer and the text layer [Chiang et al., 2005, 2008].

	For raster maps with poor image quality (such as scanned and compressed maps that are otherwise difficult to process automatically and tedious to process manually), we developed a pixel-based supervised technique and a color-based supervised technique, both of which include user labeling, to extract the road layers from raster maps [Chiang and Knoblock, 2006, 2009c].

	We are currently investigating user-labeling techniques to extract the text layers from raster maps (preliminary work is described in [Chiang and Knoblock, 2006]).

	In addition, to minimize repetitive manual work in the supervised technique, we plan to include a map classification technique that automatically selects a trained map profile (i.e., the training results of the supervised technique) to apply to previously unseen maps for extracting their feature layers (preliminary work is described in [Chiang and Knoblock, 2009a]).
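	To make the idea of a color-based supervised technique concrete, the following is a minimal sketch and not the method from the cited papers. It assumes the map is a small grid of RGB tuples, that the user labels a few pixels known to lie on roads, and that a pixel belongs to the road layer when each of its channels is within a tolerance of some labeled color. The function names `learn_road_colors` and `extract_road_layer` and the `tolerance` parameter are hypothetical.

```python
def learn_road_colors(image, labeled_pixels):
    """Collect the distinct RGB colors found at user-labeled road pixels.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    `labeled_pixels` is a list of (row, col) positions chosen by the user.
    """
    return {image[r][c] for r, c in labeled_pixels}


def extract_road_layer(image, road_colors, tolerance=0):
    """Return a binary mask with 1 where a pixel matches a labeled road color.

    A pixel matches when every channel differs from the labeled color by at
    most `tolerance` (a hypothetical knob, e.g. to absorb scanning noise).
    """
    def is_road(pixel):
        return any(
            max(abs(p - q) for p, q in zip(pixel, color)) <= tolerance
            for color in road_colors
        )

    return [[1 if is_road(px) else 0 for px in row] for row in image]


# Toy 3x3 "map": black road pixels, a red symbol, white background.
WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (200, 30, 30)
tiny_map = [
    [WHITE, BLACK, WHITE],
    [BLACK, BLACK, BLACK],
    [WHITE, RED, WHITE],
]

# The user labels a single road pixel at row 1, column 1.
road_colors = learn_road_colors(tiny_map, [(1, 1)])
road_layer = extract_road_layer(tiny_map, road_colors, tolerance=10)
```

	In this toy example the mask keeps the black cross-shaped road pixels and drops the red symbol and the background. Real scanned maps would need far more than a per-channel tolerance, which is why the sketch should be read only as an illustration of the labeling-then-filtering workflow.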

	Feature Recognition Techniques

	We also developed techniques to convert the feature layers into machine-editable map context, such as extracting the road-intersection templates and road vector data from a road layer [Chiang and Knoblock, 2008, 2009b; Chiang et al., 2005, 2008]. For the text layer, we plan to develop an automatic technique that identifies strings in the text layer and employs an optical character recognition (OCR) component to translate the text strings into machine-editable text.
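	As an illustration of how road intersections might be located in an extracted road layer, here is a minimal sketch under stated assumptions. It is not the approach of the cited papers. It assumes the road layer has already been thinned to a one-pixel-wide binary skeleton and uses the crossing number (the count of 0-to-1 transitions around a pixel's eight neighbors) to estimate how many road branches meet at that pixel. The names `crossing_number` and `find_intersections` are hypothetical.

```python
# Offsets of the 8 neighbors in clockwise order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(skeleton, r, c):
    """Count 0->1 transitions while walking once around the 8 neighbors."""
    rows, cols = len(skeleton), len(skeleton[0])

    def at(dr, dc):
        rr, cc = r + dr, c + dc
        # Pixels outside the image are treated as background.
        return skeleton[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0

    ring = [at(dr, dc) for dr, dc in RING]
    return sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)


def find_intersections(skeleton, min_branches=3):
    """Return (row, col) of skeleton pixels where at least 3 branches meet."""
    return [
        (r, c)
        for r, row in enumerate(skeleton)
        for c, value in enumerate(row)
        if value == 1 and crossing_number(skeleton, r, c) >= min_branches
    ]


# A plus-shaped skeleton: two one-pixel-wide roads crossing at (2, 2).
plus = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
intersections = find_intersections(plus)
```

	Counting transitions rather than raw neighbors avoids flagging the pixels directly next to a junction, which touch several road pixels diagonally but still lie on a single branch.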

	Map Imagery Alignment Techniques

	Since roads are a common geographic feature that appears in many geospatial datasets, the recognized road vectors can be used as a matching feature to align the raster map, the extracted feature layers, and the recognized features to other geospatial data that contain roads. We developed conflation techniques that use the set of road-intersection templates of a raster map as a reference feature to compute a transformation matrix for aligning the map with other geospatial data, such as imagery [Chen et al., 2006, 2008].
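	The following sketch shows one way a transformation matrix can be computed from matched intersection points. It assumes the matching has already been done and that an affine transform is adequate; the text above does not say which transform family the conflation techniques use, so this is an illustration and not the authors' method. The helper names `solve3`, `fit_affine`, and `apply_affine` are hypothetical, and the fit is an ordinary least-squares solution of the normal equations.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [val] for row, val in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("need at least 3 non-collinear control points")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(map_pts, image_pts):
    """Least-squares 3x3 affine matrix mapping map points to image points."""
    rows = [(x, y, 1.0) for x, y in map_pts]
    # Normal equations: (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

    def fit(targets):
        atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
        return solve3(ata, atb)

    row_x = fit([p[0] for p in image_pts])
    row_y = fit([p[1] for p in image_pts])
    return [row_x, row_y, [0.0, 0.0, 1.0]]


def apply_affine(matrix, point):
    """Map a single (x, y) point through the affine matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# Made-up matched intersections: map pixels -> image pixels
# (here the image is the map scaled by 2 and shifted by (5, -3)).
map_pts = [(0, 0), (10, 0), (0, 10), (10, 10)]
image_pts = [(5, -3), (25, -3), (5, 17), (25, 17)]

matrix = fit_affine(map_pts, image_pts)
center = apply_affine(matrix, (5, 5))
```

	With more than three matched intersections the system is overdetermined, so the least-squares fit averages out small localization errors in individual points. In practice a numerical library routine would normally replace the hand-written solver.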

	(a) A TIGER/Line map	(b) A USGS topographic map

	(c) A scanned Thomas-Guide map

	Figure 3: Example raster maps