class: center, middle, title-slide

# Shiny apps and aiR quality
## EARL 2021
### Adithi R. Upadhya
### ILK Labs
### 10 September, 2021

---

# Hi!

My name is Adithi, and I am a Geospatial Data Analyst at ILK Labs in Bengaluru, India.

I will present how I use shiny to build tools that provide a workflow for analysing air quality data.

---

# Who all R there?

![Multidisciplinary team](img/team.png)

---

# R, shiny everywhere!

<img src="WWW/R.png" width="100px" />

- R and shiny are a powerful combination.
- With the ever-increasing global measurements of air pollutants (through stationary, mobile, low-cost, and satellite monitoring), data management platforms have become necessary.
- So here we present two shiny applications.

???

R and shiny are a powerful couple that can be used to build interactive platforms to manage and work with the data collected. With the ever-increasing global measurements of air pollutants (through stationary, mobile, low-cost, and satellite monitoring), the amount of data being collected is huge, and it necessitates the use of management platforms. To address this, we developed two shiny applications to analyse and visualise air quality data.

---

# Why shiny apps?

- Shiny apps can be deployed
- The user needs no programming knowledge
- Help available

---

# 2 types of Air Quality Measurements

.left-column[
### Stationary monitoring
![Stationary reference grade monitor](WWW/SM.png)
]

--

.right-column[
### Mobile monitoring
![Sensors in a car](img/google-street-view.png)
]

???

Typically, we have two main types of air quality measurements.

---

# Stationary Monitoring

![Network of air quality sensors in Bengaluru, India](img/pa.png)

???

This is what a stationary monitoring network looks like. Multiple sensors record data in real time, at a frequency of about 1 minute. This gives information about trends at the city scale.

---

# Mobile Monitoring

![Individual instruments for each parameter](img/setup.png)

???

For neighborhood-level measurements, we use mobile monitoring. This schematic shows the many instruments that go into the mobile platform. You can imagine the complexity: different instruments have different download methods and, from the point of view of analysis, different data formats, different time stamps, etc.

---

.left-column[
### Our set-up in Bengaluru, India
]

.right-column[
![Internal set up in the mobile monitoring platform](img/cng.jpg)
]

???

We are looking at the inside of the car now. We have a laptop as a logger for some instruments, and all instruments are secured in the blue tray with inlets outside the vehicle. This is what it looks like in practice.

---

.left-column[
### but sometimes our ride is unexpected
]

.right-column[
![Unplanned and unanticipated road blocks](img/challenges.jpg)
]

???

All that to say that many times, things are out of our control in the field.

---

# Our next step

- Build an app that helps the public use the available open source air quality data and think about science.
- Build another app that helps the team at ILK Labs perform quality checks on the high frequency data.

???

We saw that data from stationary monitors can sometimes need cleaning, and that the freely available data was meant for public use. To facilitate that, why not build an app that guides a user through the open source air quality data and leaves them free to think about science? And we at ILK Labs were waiting to understand how to perform quality checks on the high dimensional data we were collecting every day.

---

.left-column[
# Data Cleaning - Important!!!
![](img/tidyverse_logo.png)
]

.right-column[
![](img/tidyverse.png)
]

???

We use the tidyverse packages for data cleaning: ggplot2 for highly flexible plotting, purrr and its map functions for more concise code in place of for loops, and forcats for working with factors.
With readr, `read_csv()` automatically parses date-time columns, which is very helpful.

---

# Working with datetime using `lubridate`

.left-column[
![](img/lubridate.jpeg)
]

.right-column[
Dates from different instruments can differ:

![](img/data_issues.png)

- Parsing date-times with `ymd_hms()`, `dmy_hms()`...
- Converting between time zones with `with_tz()`
]

???

Step 1 is to make the time stamps consistent so that we can join all the data sets together. Some instruments record in UTC; others, depending on their country of origin, record in a different time zone.

---

class: center

# Stationary Monitoring - pollucheck

<img src="WWW/SM.png" width="200px" />

.center[CSTEP, Bengaluru, India]

???

The second shiny application, "pollucheck", helps process open source air quality data from what is usually called stationary monitoring. There are several platforms which provide open source air quality data. We built this application so that users of these platforms can quickly generate analyses and basic air quality plots and think about science.

---

class: center

# [pollucheck](https://aruapps.shinyapps.io/OpenSourceAirQualityApp)

<img src="WWW/pollucheck_app.png" width="800px" />

???

This is what pollucheck looks like.

---

# [pollucheck](https://github.com/adithirgis/pollucheck)

<img src="WWW/PolluCheck.png" width="100px" />

- Compatible sources: [CPCB (specific to India)](https://app.cpcbccr.com/ccr/#/caaqm-dashboard-all/caaqm-landing), [OpenAQ](https://openaq.org/#/countries/IN?_k=5ecycz), and [AirNow](https://www.airnow.gov/international/us-embassies-and-consulates/#India)
- Data processing options available along with different summary statistics
- Generates statistical and summary plots
- Implements linear or multilinear regression
- Allows users to compare data from two sites

???

"pollucheck" can be used for data downloaded from CPCB (specific to India), OpenAQ, and AirNow. It aims to generate a range of statistical plots and summary statistics with several data processing options.
There are options for different averaging periods as well. It also checks for normality, generates density and Q-Q plots, and checks for trends in the time series of the selected parameter. It can implement linear or multilinear regression. pollucheck allows users to upload a second set of data to compare selected parameters and generate plots. It also implements two plots from the openair package. pollucheck is also available as a package. Don't forget to check out the air quality of your city using our application!

---

class: center

# Mobile Monitoring - mmaqshiny

<img src="img/google-street-view.png" width="400px" />

.center[Google Street View car]

---

class: center

# [mmaqshiny](https://aruapps.shinyapps.io/mmaqshiny/)

<img src="WWW/mmaqshiny_app.png" width="800px" />

???

This is what mmaqshiny looks like. The first shiny application, "mmaqshiny", helps process high resolution air quality data collected on a moving platform, an approach usually called mobile monitoring. We carry multiple sensors in a car and take repeated measurements of each road in an area to generate stable high resolution air pollutant maps (usually daily maps are generated). Long-term mean concentrations are estimated more reliably when multiple repeated drives are possible.

---

# [mmaqshiny](https://github.com/meenakshi-kushwaha/mmaqshiny)

<img src="WWW/mmaqshiny.png" width="100px" />

- Handles high frequency data (~ 1 Hz)
- Only the GPS file is mandatory; multiple inputs are possible
- Visualises air pollution hot spots
- Pre-processes for various instrumental sensitivities
- Provides the unit of analysis for further study
- Joins output files from different instruments
- Reduces computational labour
- Alerts the user to instrument errors
- Near real time quality check of the data (usually after the ride)

???

mmaqshiny can handle thousands of records every day from each instrument; we had nearly 5 instruments for this study.
It reduces the time spent analysing each pollutant individually and helps visualise the data collected in the field each day. It can also be used to look for pollution hotspots, that is, locations that are relatively more polluted than neighboring areas. Each pollutant or sensor requires a specific kind of pre-processing, which depends on its operating principle or its mechanical setup. Eventually, the app joins the data from the different instruments into one single corrected file. This really helps us in achieving the unit of analysis for the rest of the study. This application has reduced computational labour. Since mobile monitoring data is inherently spatial, the GPS data file is mandatory, but files from the other instruments are optional. The application also has an Alarm tab, which gives the user a near real-time check on the health of all the instruments used. This application is available as a package on CRAN.

---

# Coming up next

- Working with models for low cost sensor data
- Working with satellite images

---

# Resources used

.pull-left[
- R packages
  - [xaringan](https://github.com/yihui/xaringan)
  - [xaringanExtra](https://github.com/gadenbuie/xaringanExtra)
- Images from [rawpixel](https://www.rawpixel.com/)
- Logos from [The Noun Project](https://thenounproject.com/)
]

---

![](img/Team_ILK.PNG)

---

.left-column[
## Thank You everyone and EARL 2021!

Website: [Adithi R. Upadhya](https://adithirugis.rbind.io/)

Twitter: [AdithiUpadhya](https://twitter.com/AdithiUpadhya)

Github: [@adithirgis](https://github.com/adithirgis)

Email: [adithi@ilklabs.com](mailto:adithi@ilklabs.com)
]

.right-column[
![](img/code_hero.jpg)
]

???

Please let me know if you have any further questions. I am also available in the Lounge if you want to discuss. Have a great day ahead! Enjoy EARL!