The Nowcast for Ozone - 2019 Update (Partial Least Squares Method)

On August 1, 2019, the AirNow Data Management Center implemented a new EPA method for calculating the ozone Nowcast using a Partial Least Squares (PLS) method. All relevant R code and the corresponding white paper cited in this post can be found on the US EPA’s GitHub.

Ozone Nowcast Overview

From the EPA’s document The O3 Nowcast: U.S. EPA’s Method for Characterizing and Communicating Current Air Quality:

To address the need for real-time reporting of air pollution data to the public and because the AQI is based on longer-term averages, EPA has developed a “Nowcast” to estimate the AQI in real time. For O3, the Nowcast is a prediction of what the AQI would be for an 8-hour average centered on the current hour.

This O3 Nowcast works by building a statistical relationship between 8-hour rolling mean O3 concentrations and 1-hour concentration measurements. This relationship is then applied to the current 1-hour concentration measurements to predict the 8-hour mean. The AQI value that corresponds with this 8-hour mean can be computed to provide a message to the public in real time about air quality and exposure reduction measures.
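As a rough illustration of the last step in the quoted passage, the Python sketch below converts an 8-hour mean ozone concentration into an AQI value. The breakpoints are the standard EPA 8-hour ozone AQI breakpoints and are not taken from this post. The function name `ozone_8hr_aqi`, the truncation to whole ppb, and the round-half-up step are assumptions made for illustration. Values above 200 ppb are rejected because the 8-hour scale does not define an AQI there.

```python
import math

# (low ppb, high ppb, low AQI, high AQI) for 8-hour ozone.
# Assumption: standard EPA breakpoints, not quoted from this post.
OZONE_8HR_BREAKPOINTS = [
    (0, 54, 0, 50),
    (55, 70, 51, 100),
    (71, 85, 101, 150),
    (86, 105, 151, 200),
    (106, 200, 201, 300),
]


def ozone_8hr_aqi(mean_ppb: float) -> int:
    """Linearly interpolate an AQI value from an 8-hour mean in ppb."""
    conc = math.floor(mean_ppb)  # assumption: truncate to whole ppb
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for bp_lo, bp_hi, aqi_lo, aqi_hi in OZONE_8HR_BREAKPOINTS:
        if bp_lo <= conc <= bp_hi:
            scaled = (aqi_hi - aqi_lo) * (conc - bp_lo) / (bp_hi - bp_lo) + aqi_lo
            return math.floor(scaled + 0.5)  # round half up
    raise ValueError("8-hour AQI is not defined above 200 ppb")
```

Under these assumptions, `ozone_8hr_aqi(60)` returns 67, which falls in the 51-100 category.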


Ozone Nowcast Calculation

The Nowcast Function R code, located here, and decision tree code, located here, contain information on how the ozone Nowcast is calculated. In short, there are two different calculation methods that may be used to determine the ozone Nowcast.

  • If the completeness criteria are met for the PLS method, and there is at least 1 valid hour of ozone data in the past 3 hours, the PLS method is used. With this improved method, the Nowcast for current conditions more closely represents the 8-hour average ozone value and should be more consistent with the daily AQI value than the previously used methods. The previous 2 weeks (336 data points) of hourly ozone data are considered when calculating a PLS Nowcast.

  • If the completeness criteria are not met, and there is at least 1 valid hour of ozone data in the past 3 hours, the Surrogate method is used. This is a simple slope/intercept equation, y = 0.85x + 4.5 ppb, where x is the hourly ozone value and y is the resulting Nowcast. It is meant as a fallback/secondary method to produce a Nowcast, and was derived from a linear regression between concurrent 8-hour mean and 1-hour O3 concentrations using historical data from about 40 monitoring sites in major continental U.S. cities.

  • If there has been no hourly ozone data in the past 3 hours, the Nowcast will be N/A.
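
The selection logic in the bullets above can be sketched in Python. This is an illustration only, not the EPA R code: the function names `surrogate_nowcast` and `choose_method`, the use of `None` for a missing hour, and the `pls_criteria_met` flag are assumptions, and the PLS regression itself is not shown.

```python
from typing import Optional, Sequence

SURROGATE_SLOPE = 0.85      # from the post: y = 0.85x + 4.5 ppb
SURROGATE_INTERCEPT = 4.5   # ppb


def surrogate_nowcast(hourly_o3_ppb: float) -> float:
    """Surrogate Nowcast (ppb) from a single hourly ozone value (ppb)."""
    return SURROGATE_SLOPE * hourly_o3_ppb + SURROGATE_INTERCEPT


def choose_method(last_three_hours: Sequence[Optional[float]],
                  pls_criteria_met: bool) -> str:
    """Return "PLS", "Surrogate", or "N/A" for the current hour.

    last_three_hours holds the hourly values for the past 3 hours,
    with None marking a missing or invalid hour (an assumption).
    """
    has_recent_data = any(v is not None for v in last_three_hours)
    if not has_recent_data:
        return "N/A"
    return "PLS" if pls_criteria_met else "Surrogate"
```

For example, an hourly value of 60 ppb gives a Surrogate Nowcast of 0.85 × 60 + 4.5 = 55.5 ppb.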


Ozone Nowcast Completeness Criteria and Calculation Pathways

Below is an unofficial flowchart for the new Nowcast method.

Changes in ozone data completeness at a monitoring site will affect which path is taken. The chart defines 15 different pathways that may be taken to determine the Nowcast value for a given hour.

Path | Brief Path Description | Resulting Nowcast method/value
1 | All data are missing in the two week window | N/A
2 | <75% completeness of midpoint 8-hour averages; no current hour data; no previous hour data; no 2 hour old data | N/A
3 | <75% completeness of midpoint 8-hour averages; no current data; no previous hour data; valid 2 hour old data | The Surrogate result from 2 hours ago is repeated
4 | <75% completeness of midpoint 8-hour averages; no current data; valid previous hour data | The Surrogate result from 1 hour ago is repeated
5 | <75% completeness of midpoint 8-hour averages; valid current hour data | The Surrogate method is used
6 | >=75% completeness of midpoint 8-hour averages; >= 8 consecutive N/A hourly values in 2 week window; no current hour data; no previous hour data; no 2 hour old data | N/A
7 | >=75% completeness of midpoint 8-hour averages; >= 8 consecutive N/A hourly values in 2 week window; no current hour data; no previous hour data; valid 2 hour old data | The Surrogate result from 2 hours ago is repeated
8 | >=75% completeness of midpoint 8-hour averages; >= 8 consecutive N/A hourly values in 2 week window; no current hour data; valid previous hour data | The Surrogate result from 1 hour ago is repeated
9 | >=75% completeness of midpoint 8-hour averages; >= 8 consecutive N/A hourly values in 2 week window; valid current hour data | The Surrogate method is used
10 | >=75% completeness of midpoint 8-hour averages; <8 consecutive N/A hourly values in 2 week window; all values in the two week window are 0 ppb | 0
11 | >=75% completeness of midpoint 8-hour averages; <8 consecutive N/A hourly values in 2 week window; no current hour data; no previous hour data; no 2 hour old data | N/A
12 | >=75% completeness of midpoint 8-hour averages; <8 consecutive N/A hourly values in 2 week window; no current hour data; no previous hour data; valid 2 hour old data | The PLS Nowcast from 2 hours ago is repeated
13 | >=75% completeness of midpoint 8-hour averages; <8 consecutive N/A hourly values in 2 week window; no current hour data; valid previous hour data | The PLS Nowcast from 1 hour ago is repeated
14 | >=75% completeness of midpoint 8-hour averages; <8 consecutive N/A hourly values in 2 week window; valid current hour data; resulting PLS Nowcast is <=0 | 0
15 | >=75% completeness of midpoint 8-hour averages; <8 consecutive N/A hourly values in 2 week window; valid current hour data; resulting PLS Nowcast is >0 | The PLS NowCast method is used
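
The pathway table can be read as a small decision function. The Python sketch below is an unofficial illustration, not the EPA decision tree code. The names `longest_missing_run` and `nowcast_path`, the use of `None` for a missing hour, and passing the midpoint completeness in as a precomputed fraction are all assumptions. It also assumes the checks run in table order and that "all values are 0 ppb" in path 10 refers to the valid (non-missing) values.

```python
from typing import Optional, Sequence


def longest_missing_run(hourly: Sequence[Optional[float]]) -> int:
    """Length of the longest run of consecutive missing (None) hours."""
    longest = current = 0
    for value in hourly:
        current = current + 1 if value is None else 0
        longest = max(longest, current)
    return longest


def nowcast_path(hourly: Sequence[Optional[float]],
                 midpoint_completeness: float,
                 pls_result: Optional[float] = None) -> int:
    """Return the path number (1-15) from the table.

    hourly: two-week window of hourly ozone (ppb), oldest first, with the
        current hour last and None for missing hours.
    midpoint_completeness: fraction (0-1) of valid midpoint 8-hour averages.
    pls_result: PLS Nowcast for the current hour, needed for paths 14/15.
    """
    if all(v is None for v in hourly):
        return 1

    # Age in hours (0, 1, 2) of the newest valid value in the last 3 hours.
    newest_first = list(hourly[-3:])[::-1]
    age = next((i for i, v in enumerate(newest_first) if v is not None), None)
    offset = {None: 0, 2: 1, 1: 2, 0: 3}[age]

    if midpoint_completeness < 0.75:
        return 2 + offset            # paths 2-5 (Surrogate branch)
    if longest_missing_run(hourly) >= 8:
        return 6 + offset            # paths 6-9 (Surrogate branch)
    if all(v == 0 for v in hourly if v is not None):
        return 10
    if age != 0:
        return 11 + offset           # paths 11-13 (repeat or N/A)
    if pls_result is None:
        raise ValueError("pls_result is required when current hour data is valid")
    return 14 if pls_result <= 0 else 15
```

For instance, a fully populated 336-hour window with 100% completeness and a positive PLS result lands on path 15, while the same window with only the current hour missing lands on path 13.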

AirNow Data Management Center policy on calculations and recalculations of historical NowCast values

  • Nowcast calculations for the current hour are pre-populated for all AirNow sites at the top of the hour (e.g., 08:00-09:00 Nowcast values are first calculated between ~09:00 and 09:03). Typically, data for the previous hour has not arrived yet when these calculations are made. Thus, all sites in the system will likely use the 1-Hour-Old Surrogate or 1-Hour-Old PLS method to start the hour (but, depending on data availability, may instead use the 2-Hour-Old method or be N/A). Then, when data arrives, typically between :05 and :25 past the hour, the current hour’s Nowcast values will be updated accordingly using the PLS or Surrogate method. If data does not arrive for the given site for the hour, the Nowcast is not recalculated unless that point is backfilled. This behavior is not typically noticed in any public products, as they aren’t produced until :25-:30 past the hour. However, the pre-populated NowCast values may be noticed in some queries that directly hit the AirNow database, such as API web services. This is one possible reason why you may see Nowcast values change over the course of the hour.

  • Nowcast values stored in our system before the rollout of the PLS method on August 1, 2019 will not be updated, with two exceptions (mentioned below).

    • Nowcast values will be recalculated if a data point in our system is updated, that is, if the QC code of the data point changes, the value of the data point changes, or the point is backfilled. However, Nowcast values from hours that did not have direct QC code or value changes will not be updated.

      For example, if the ozone value or QC code was updated for 1pm yesterday at a given site, the 1pm Nowcast value from yesterday will be recalculated. Subsequent Nowcast values (2pm yesterday, 3pm yesterday, etc.), while dependent on the 1pm raw value in their PLS Nowcast calculations, will not be updated.

    • On rollout August 1, 2019, new ozone Nowcast values were backfilled to July 26, 2019.


When Nowcast calculations are made

  • Nowcast calculations for a given hour are made for all sites at the top of the next hour (e.g., 08:00-09:00 Nowcast values begin being calculated at 09:00). In almost all cases, data for the previous hour has not arrived yet when these initial calculations are made. These calculations are made as a fallback in case data do not arrive later in the hour.

  • Nowcast calculations are made on ingest of new ozone data. This happens when an agency delivers a file to the AirNow Data Management Center, typically between :05 and :25 past the hour.


Where is the Nowcast used?

