Evaluation of low-cost sensors for quantitative personal exposure monitoring

https://doi.org/10.1016/j.scs.2020.102076

Highlights

  • Challenges associated with the robustness of low-cost sensors are studied.

  • Field experiments are performed to analyse sensor performance in diverse conditions.

  • Pre/post deployment colocation experiments are performed for LCS and reference monitors.

  • We investigated four calibration methods for SC kits: LR, ANN, SVR and RF.

  • The SVR model outperformed the other models, with an average RMSE of 3.39 for PM2.5 and 4.10 for PM10.

Abstract

Observation of air pollution at high spatio-temporal resolution has become easier with the emergence of low-cost sensors (LCS). LCS provide new opportunities to enhance existing air quality monitoring frameworks, but questions remain about the accuracy and quality of their data. In this study, we assess the performance of LCS against industry-grade instruments. We use linear regression (LR), artificial neural networks (ANN), support vector regression (SVR) and random forest (RF) regression to develop calibration models for the LCS, which were Smart Citizen (SC) kits developed in the iSCAPE project. Initially, outdoor colocation experiments are conducted in which ten SC kits are collocated with a GRIMM, an industry-grade instrument. A quality check is performed on the LCS data, and the data are then used to develop the calibration models. The models are evaluated by testing them on 9 SC kits. We observed that the SVR model outperformed the other three models for PM2.5, with an average root mean square error (RMSE) of 3.39 and an average R2 of 0.87. The model is validated by testing it on PM10, where the SVR model shows similar results. The results indicate that SVR is a promising approach for LCS calibration.

Introduction

Deterioration of air quality is an important challenge in most urban parts of the world (Kumar, Khare et al., 2015; Kumar, Morawska et al., 2015; Kumar, Nagar et al., 2015; 2016). Pollutants such as particulate matter (PM), carbon monoxide (CO) and nitrogen dioxide (NO2) can cause respiratory as well as cardiovascular diseases (Cascio & Long, 2018; Kumar, Khare et al., 2015; Kumar, Morawska et al., 2015; Kumar, Nagar et al., 2015). Poor air quality adversely affects not only human well-being, health and productivity but also overall development and sustainability. Environmental damages are also enormous, and cities require intensive monitoring (Morawska et al., 2019) to understand the trends and sources responsible for particulate matter concentration (Hama et al., 2020; Shukla, Kumar, Mann, & Khare, 2020). Raising awareness among the people is of utmost importance to efficiently monitor the changes in air quality and assess the harmful impact of air pollution on human health and the sustainability of cities (Ortolani & Vitale, 2016).

Traditionally, government agencies have been the primary participants in air quality monitoring. Their sole purpose is to perform regular inspections for air quality compliance and to inform policy-making. Monitoring sites are few because industry-grade instruments are expensive and require regular maintenance (Kumar, Khare et al., 2015; Kumar, Morawska et al., 2015; Kumar, Nagar et al., 2015; Rai et al., 2017). The small number of sites also limits the spatial resolution of the data (Morawska et al., 2018). It has been observed that pollutant concentration can show complex short-term spatial variations (Monn, 2002). For example, depending on the meteorological conditions, different parts of a city are affected differently by emission sources (Mangia, Gianicolo, Bruni, Angela Vigotti, & Cervino, 2013). In the case of streets, the concentration can vary over a short distance and within minutes (Goel & Kumar, 2015). This makes it necessary to have an efficient network of sensors that can generate large data-sets and thereby improve the spatio-temporal resolution (Mahajan, Chen, & Tsai, 2018; Mahajan, Liu, Tsai, & Chen, 2018) of air quality data. Knowledge extracted from these data can then be used by the public to take precautionary measures.

One of the driving forces behind efficient air quality monitoring at finer spatio-temporal resolution is the availability of low-cost sensors (LCS) for large-scale data sensing (Boulos et al., 2011; Chen et al., 2018; Commodore, Wilson, Muhammad, Svendsen, & Pearce, 2017). These include portable sensors that are cost-effective as well as reliable in capturing pollution peaks and reproducing the data. The use of such sensors can help increase the spatial density of air pollution monitoring, which can lead to more information and services (Castell et al., 2015) that can support citizens (Mahajan et al., 2020) as well as policymakers (Mahajan et al., 2020). LCS can be seen as a cost-effective way to monitor the environment in cities (Mahajan, Tang, Wu, Tsai, & Chen, 2019) as well as rural areas (Karagulian et al., 2019). They can not only increase the spatial coverage (Kumar, Khare et al., 2015; Kumar, Morawska et al., 2015; Kumar, Nagar et al., 2015; Rai et al., 2017) but also provide an alternative for designing cost-effective air quality monitoring frameworks (Chen, Ho, Hsieh et al., 2017; Chen, Ho, Lee et al., 2017) that can be easily deployed in different parts of the world. Compared to industry-grade instruments, LCS are a convenient alternative for static as well as mobile sensing (Rai et al., 2017; Spinelle, Gerboles, Kok, Persijn, & Sauerwald, 2017). This makes LCS easy to use and deploy in regions with limited monitoring facilities.

The downside of using LCS for large-scale deployment is that the data they generate are less accurate (Yi et al., 2015). Studies have noted significant differences between the measurements reported by LCS and by industry-grade instruments (Jiao et al., 2016). Properly calibrating these sensors is one way to improve their data quality. Calibration is often needed both before and after deployment (Maag, Zhou, & Thiele, 2018). The challenge in calibrating LCS is that they are affected by meteorological conditions (Masson, Piedrahita, & Hannigan, 2015; Williams et al., 2013) and anthropogenic factors (Liou, Luo, Mahajan, & Chen, 2020). In particular, LCS have been found to be affected by temperature and humidity (Wei et al., 2018). If the relative humidity exceeds 75 %, the error rate rises significantly (Masson et al., 2015). Although calibration-related research has been going on for many years, it still attracts a lot of interest for the following reasons: (i) the availability of new LCS that are cost-effective; and (ii) air quality sensing frameworks and applications that use crowdsourcing and crowd-sharing for personal exposure monitoring (Maag, Zhou, & Thiele, 2018).

Table 1 presents a summary of relevant literature on different calibration strategies for LCS. While the related works have demonstrated LCS calibration by using co-location techniques, followed by the implementation of statistical models, the approach has not been extensively explored for realistic deployment scenarios. Some of the important questions that need to be answered include efficient and accurate calibration algorithms, sensor correlation with reference instruments pre- and post-deployment and performance of calibration algorithms when environmental conditions are different from the training set. We attempt to answer these questions by performing extensive colocation and using the dataset to assess the performance of various calibration algorithms. The idea is to develop a model that is not just accurate and efficient but can potentially be used.

Section snippets

Modelling approach

As described in Table 1, most of the existing methods use either linear regression (LR) methods or advanced neural network methods. The drawback of LR methods is that they can only capture linear relationships. On the other hand, artificial neural network (ANN) models do capture non-linear relationships, but their overall computational load in terms of memory and time is higher. Random Forest (RF) models are found to be a good option but to have an accurate and efficient model, large
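To make the calibration and scoring workflow concrete, the following is a minimal sketch of the simplest of the four approaches: an LR calibration scored with RMSE and R2. It is not the authors' implementation. The data are synthetic, and the assumed sensor bias (slope 1.4, offset 3.0), the noise level and the 150/50 train/test split are illustrative assumptions. Only the Python standard library is used; the non-linear models (ANN, SVR, RF) would need a machine-learning library and are not shown.

```python
import math
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic colocation data (assumption): a reference PM2.5 series and a
# low-cost sensor that over-reads with a linear bias plus Gaussian noise.
rng = random.Random(0)
reference = [rng.uniform(2.0, 60.0) for _ in range(200)]
raw = [1.4 * r + 3.0 + rng.gauss(0.0, 2.0) for r in reference]

# Hold out the last 50 samples for evaluation (illustrative split).
train_raw, test_raw = raw[:150], raw[150:]
train_ref, test_ref = reference[:150], reference[150:]

# Fit the calibration: map raw sensor output onto the reference scale.
slope, intercept = fit_line(train_raw, train_ref)
calibrated = [slope * x + intercept for x in test_raw]

rmse_raw = rmse(test_ref, test_raw)
rmse_cal = rmse(test_ref, calibrated)
r2_cal = r2(test_ref, calibrated)

print(f"RMSE before calibration: {rmse_raw:.2f}")
print(f"RMSE after calibration:  {rmse_cal:.2f}")
print(f"R2 after calibration:    {r2_cal:.3f}")
```

Scoring on held-out samples rather than on the training data mirrors the idea of testing a calibration under conditions it was not fitted on. The same `rmse` and `r2` helpers could score any other calibration model.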

Variations in data and data correlation

Table 2 presents the summary statistics of the field data, including the median, mean and standard deviation of pollutant concentration for GRIMM and ten SC kits. It can be observed that the statistical parameters are very similar for all SC kits, which shows the consistent behavior of the SC kits. Also, the mean, median and standard deviation (SD) values are very similar for SC kits and GRIMM for PM1 and PM2.5, whereas the difference is larger in the case of PM10.
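Summary statistics of this kind can be computed per device as in the sketch below. The device names and readings are hypothetical placeholders, not values from Table 2, and the sample standard deviation is an assumed choice.

```python
from statistics import mean, median, stdev

# Hypothetical PM2.5 readings per device (placeholders, not study data).
readings = {
    "GRIMM": [8.0, 10.0, 12.0, 14.0, 16.0],
    "SC_01": [9.0, 11.0, 12.0, 15.0, 18.0],
}


def summarise(values):
    """Return the median, mean and sample standard deviation of a series."""
    return {
        "median": median(values),
        "mean": mean(values),
        "sd": stdev(values),
    }


summary = {device: summarise(values) for device, values in readings.items()}

for device, stats in summary.items():
    print(device, {name: round(value, 2) for name, value in stats.items()})
```

Comparing the rows for the SC kits with each other indicates how consistent the kits are, while comparing them with the reference row indicates how far the kits deviate from the reference instrument.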

To get a better understanding of how the SC

Summary, conclusions and future work

Air pollution and its consequences have affected the majority of countries in the world. Measurement of personal exposure to particulate matter can help in understanding and reducing the exposure to harmful air pollutants such as PM2.5. Technological advances have led to the development of cost-effective LCS for exposure monitoring, but there is always a doubt regarding the data accuracy of such LCS devices. In this work, we have addressed the issue of data reliability of LCSs as well as measures

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was carried out by the University of Surrey’s GCARE team under the framework of the iSCAPE (Improving Smart Control of Air Pollution in Europe) project, which is funded by the European Union Horizon 2020 research and innovation programme under Grant Agreement No. 689954. We thank Mr KV Abhijith of the GCARE team for helping during the various stages of the work. We also thank Guillem Camprodon and Oscar González for providing the citizen science kits as a part of the iSCAPE project.

References (56)

  • L. Morawska et al.

    Applications of low-cost sensing technologies for air quality monitoring and exposure assessment: How far have they gone?

    Environment International

    (2018)
  • H. Omidvarborna et al.

    Envilution™, chamber for performance evaluation of low-cost sensors

    Atmospheric Environment

    (2020)
  • C. Ortolani et al.

    The importance of local scale for assessing, monitoring and predicting of air quality in urban areas

    Sustainable Cities and Society

    (2016)
  • A.C. Rai et al.

    End-user perspective of low-cost sensors for outdoor air pollution monitoring

    The Science of the Total Environment

    (2017)
  • K. Shukla et al.

    Mapping spatial distribution of particulate matter using Kriging and inverse distance weighting at supersites of megacity Delhi

    Sustainable Cities and Society

    (2020)
  • T.F. Yusaf et al.

    Crude palm oil fuel for diesel-engines: Experimental and ANN simulation approaches

    Energy (Oxford, England)

    (2011)
  • R.M. Balabin et al.

    Support vector machine regression (SVR/LS-SVM) - An alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data

    The Analyst

    (2011)
  • M. Boulos et al.

    Crowdsourcing, citizen sensing and sensor web technologies for public and environmental health surveillance and crisis management: Trends, OGC standards and application examples

    International Journal of Health Geographics

    (2011)
  • W.E. Cascio et al.

    Ambient air quality and cardiovascular health

    North Carolina Medical Journal

    (2018)
  • J.-H. Chang et al.

    Analysis of correlation between secondary PM2.5 and factory pollution sources by using ANN and the correlation coefficient

    IEEE Access: Practical Innovations, Open Solutions

    (2017)
  • L.-J. Chen et al.

    ADF: An anomaly detection framework for large-scale PM2.5 sensing systems

    IEEE Internet of Things Journal

    (2017)
  • L.-J. Chen et al.

    An open framework for participatory PM2.5 monitoring in smart cities

    IEEE Access: Practical Innovations, Open Solutions

    (2017)
  • V. Cherkassky

    The nature of statistical learning theory

    IEEE Transactions on Neural Networks

    (1997)
  • A. Commodore et al.

    Community-Based Participatory Research for the Study of Air Pollution: A Review of Motivations, Approaches, and Outcomes

    Environmental Monitoring and Assessment

    (2017)
  • E.S. Cross et al.

    Use of electrochemical sensors for measurement of air pollution: Correcting interference response and validating measurements

    Atmospheric Measurement Techniques

    (2017)
  • U. Grömping

    Variable importance assessment in regression: Linear regression versus random forest

    The American Statistician

    (2009)
  • D.M. Holstius et al.

    Field calibrations of a low-cost aerosol sensor at a regulatory monitoring site in California

    Atmospheric Measurement Techniques

    (2014)
  • L.-P. Huang et al.

    A vector mosquitoes classification system based on Edge computing and deep learning

    2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI), IEEE

    (2018)