Data Science Chair

    MapLUR in ACM TSAS - Special Issue on Deep Learning


    Our paper "MapLUR: Exploring a new Paradigm for Estimating Air Pollution using Deep Learning on Map Images" was recently accepted for publication in the ACM journal Transactions on Spatial Algorithms and Systems (TSAS) in its special issue on Deep Learning.

    Figure 1: Land-use regression with the DOG paradigm.

    Our group works on multiple projects, such as BigData@Geo and p2Map, in which we aim to develop machine learning techniques that improve the analysis of environmental problems. One such problem that affects many cities is air pollution. Pollutant concentrations can only be measured at select locations where expensive monitoring stations are deployed. To obtain estimates for other locations, land-use regression (LUR) models are used. They learn to estimate pollutant concentrations from the land use of an area (e.g. industrial areas or motorways). This is intuitive since, for example, air quality near roads is usually worse than in forests. Typically, LUR models use manually engineered features (e.g. distance to the nearest traffic light) and rely on data that is neither globally nor openly available. In contrast, we propose the novel DOG (Data-driven, Open, Global) paradigm for LUR, which encompasses using purely data-driven models on open and globally available data. Models adhering to DOG work directly on raw data, automatically extracting their features from the input.

    This has multiple advantages compared to previous approaches:

    1. Models can be fit more easily to different study areas.
    2. Models do not require manual feature engineering.
    3. Models can be reproduced by other researchers without requiring access to data sources that are not easily available.
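
    To make the contrast concrete, a traditional LUR baseline with manually engineered features can be sketched roughly as follows. This is only an illustration, not the baseline code from the paper: the two features, the site values, and the helper names fit_ols and predict are all hypothetical, and the targets are synthetic numbers generated from a made-up linear rule.

```python
# Hypothetical illustration of a traditional LUR baseline: ordinary least
# squares on two manually engineered features. All numbers are made up.

def fit_ols(X, y):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    n = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(A[k][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for k in range(n):
            if k != c:
                f = A[k][c]
                A[k] = [vk - f * vc for vk, vc in zip(A[k], A[c])]
    return [A[i][n] for i in range(n)]

def predict(coef, features):
    """Apply the fitted intercept and slopes to one feature vector."""
    return coef[0] + sum(c * f for c, f in zip(coef[1:], features))

# Features per monitoring site (assumed for this sketch):
# (distance to nearest major road in km, fraction of industrial land nearby).
X = [(0.1, 0.6), (0.5, 0.2), (1.0, 0.1), (2.0, 0.0), (0.2, 0.4), (1.5, 0.3)]
# Synthetic, exactly linear NO2 targets; not real measurements.
y = [50 - 10 * d + 20 * ind for d, ind in X]

coef = fit_ols(X, y)
estimate = predict(coef, (0.8, 0.25))  # estimate for an unmonitored location
```

    Every feature in such a model has to be designed and computed by hand for each study area, which is the effort the DOG paradigm aims to avoid.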

    We also propose the MapLUR model based on the DOG paradigm, which uses map images as a globally and openly available source of information and deep learning for automatic feature extraction. We evaluate MapLUR with modeled NO2 concentrations from London and map images from OpenStreetMap as well as Google Maps. We show that our model is able to outperform baseline models that employ manually engineered features, namely linear regression, random forests, and multi-layer perceptrons.
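
    As a rough intuition for how a convolutional model can extract features directly from a map image, the following minimal sketch applies a single filter, a ReLU, and global average pooling to a tiny single-channel tile. It does not reproduce the MapLUR architecture: the 5x5 tiles, the hand-set filter, and all function names are assumptions for illustration, and in a trained network the filter weights would be learned from data.

```python
# Hypothetical sketch: one convolution filter + ReLU + global average pooling
# on a tiny single-channel "map tile" (1 = road pixel, 0 = anything else).

def conv2d_valid(image, kernel):
    """2D cross-correlation without padding (the 'convolution' used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def relu(fmap):
    """Set negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def global_average_pool(fmap):
    """Collapse a feature map into a single scalar feature."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)

# 5x5 tile with a vertical road in the middle column, and an empty tile.
tile_with_road = [[0, 0, 1, 0, 0] for _ in range(5)]
tile_without_road = [[0] * 5 for _ in range(5)]

# Hand-set filter that responds to vertical lines (stand-in for a learned one).
vertical_line = [[-1.0, 2.0, -1.0],
                 [-1.0, 2.0, -1.0],
                 [-1.0, 2.0, -1.0]]

road_feature = global_average_pool(
    relu(conv2d_valid(tile_with_road, vertical_line)))
empty_feature = global_average_pool(
    relu(conv2d_valid(tile_without_road, vertical_line)))
```

    The pooled value is larger for the tile containing a road, so a regression layer on top of such features could relate road presence to pollutant concentration without anyone defining a "road" feature by hand.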

    Our analysis of the MapLUR model shows that models based on DOG are not necessarily black boxes. Using guided backpropagation and artificially created map images, we can demonstrate that MapLUR automatically learns features that share strong similarities with the hand-crafted features typically used for land-use regression models.
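
    For readers unfamiliar with guided backpropagation, the core rule can be sketched on a toy network with one ReLU layer: a gradient passes backwards through a ReLU only if the unit was active in the forward pass and the incoming gradient is positive. The network, weights, and function names below are hypothetical and unrelated to the actual MapLUR model; the sketch only contrasts the guided rule with the plain gradient.

```python
# Hypothetical toy network: output = v . relu(W x). Weights are made up.

def hidden_preactivations(x, W):
    """Forward pass up to the ReLU: z = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def backprop_to_input(g, W, n_inputs):
    """Propagate hidden-layer gradients g back to the input: W^T g."""
    return [sum(W[k][i] * g[k] for k in range(len(W))) for i in range(n_inputs)]

def plain_gradient(x, W, v):
    z = hidden_preactivations(x, W)
    # Standard ReLU rule: pass the gradient wherever the unit was active.
    g = [vk if zk > 0 else 0.0 for zk, vk in zip(z, v)]
    return backprop_to_input(g, W, len(x))

def guided_backprop(x, W, v):
    z = hidden_preactivations(x, W)
    # Guided rule: additionally drop negative incoming gradients.
    g = [vk if (zk > 0 and vk > 0) else 0.0 for zk, vk in zip(z, v)]
    return backprop_to_input(g, W, len(x))

x = [1.0, 0.5, 0.0]            # toy input, e.g. three pixel intensities
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [-1.0, 0.0, 1.0]]         # hidden-layer weights
v = [2.0, -1.0, 3.0]           # output weights

plain = plain_gradient(x, W, v)
guided = guided_backprop(x, W, v)
```

    In this toy case the plain gradient assigns a negative value to the second input, while the guided variant keeps only the input that positively supports the output, which tends to give cleaner visualizations when applied to images.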

    Directions for future work include, for example, studies on real-world data, the development of a comprehensive framework for extracting and interpreting features, and the exploration of techniques that reduce the data requirements of data-driven models.

    You will be able to find the final version of our paper on ACM's website after its publication, but you can already read the accepted version on arxiv.org.
