The major goal of this project is to develop a framework that efficiently extracts contents and builds semantics for large volumes of maps, and that links the extracted contents through space and time for robust, meaningful change analysis. The framework is intended to exploit existing geographic data to build generic semantic models of geographic phenomena and to use these models to extract geographic features from maps, evaluate the semantic consistency of the extracted data, update those data, and link them across space and time.
Millions of historical maps are in digital archives today. For example, the U.S. Geological Survey has created and scanned over 200,000 topographic maps covering a 125-year period. Maps are a form of "evolutionary visual documents" because they display landscape changes over long periods of time and across large areas. Such documents are of tremendous value because they provide a high-resolution window into the past at a continental scale. Unfortunately, without time-intensive manual digitization, scanned maps are unusable for research purposes. Map features, such as wetlands and roads, while readable by humans, are only available as images. This interdisciplinary collaborative project, involving researchers and their students at the University of Southern California and the University of Colorado, Boulder, will develop a set of open-source technologies and tools that allow users to extract map features from a large number of map sheets and track changes of features between map editions in a Geographic Information System. The resulting open-source tools will enable exciting new forms of research and learning in history, demography, economics, sociology, ecology, and other disciplines. The data produced by this project will be made publicly available and, through case studies, integrated with other historical archives. Spatially and temporally linked knowledge covering man-made and natural features over more than 125 years holds enormous potential for the physical and social sciences. The wealth of information contained in these maps is unique, especially for the time before the widespread use of aerial photography. The ability to automatically transform the scanned paper maps stored in large archives into spatio-temporally linked knowledge will create an important resource for social and natural scientists studying global change and other socio-geographic processes that play out over large areas and long periods of time.
The research goal of this project is to develop a recognition and data integration framework that extracts, organizes, and links the knowledge found in visual documents that evolve over time, such as a map series. While past work has focused on feature extraction from single, well-conditioned map images, this framework will handle large-volume historical map archives for efficient, robust extraction of man-made and natural features and will link the features across time (map editions), space (map sheets), and scale. The framework will perform recognition in maps with poor graphical quality by exploiting contextual information in the form of linked knowledge. This contextual information comes from existing spatial data sources or has been extracted from more recent, high-quality map editions, and it can be used to improve and refine the training steps for automatically processing maps in an archive. The framework also exploits knowledge of the semantic relationships between features to increase the robustness, efficiency, and degree of automation of the methods developed, and to characterize uncertainty both in the extracted data and in the links between extracted data across space, time, and scale. This research project will validate the methods through case studies that evaluate the extracted, fully linked data collections for major feature types (built-up areas, infrastructure, hydrography, and vegetation) from both USGS and Ordnance Survey maps. The researchers will use multiple study regions that represent different histories of landscape evolution and transitions driven by processes such as urbanization and its effects on rural and wild landscapes (e.g., the I-95 megapolitan urban corridor). Publications, software, and datasets for this project will be made available on the project website.