Air temperature model and areal interpolation
Air temperature predictions were drawn from a geostatistical model
detailed previously (9). Briefly, we trained a machine learning model
(XGBoost) on ground station data aggregated by the National Oceanic and
Atmospheric Administration (NOAA) Meteorological Assimilation Data
Ingest System (MADIS). The specific MADIS dataset we used was the
National Mesonet, with >4,000 weather stations across the
region. Predictors in the model included land surface temperature from
the MODIS instrument on the Aqua and Terra satellites, an
inverse-distance weighting interpolation of air temperature, enhanced
vegetation index, landcover and landform characteristics, elevation, a
topographic position index, and temporal terms for seasonality. Careful
attention was paid to avoid overfitting with spatial cross-validation
methods, and the model was assessed and validated against an independent
external dataset. The resulting model produced predictions on
approximately 1-km grid cells at an hourly timestep. For comparison, air
temperature predictions from this model have a root mean square error of
1.6 Kelvin when compared to ground observations, whereas the North
American Land Data Assimilation System-2 (NLDAS-2) model, used in the
Centers for Disease Control and Prevention Heat and Health Tracker, has a
root mean square error of 2.5 Kelvin.
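As an illustration of two quantities named above, the following minimal sketch computes an inverse-distance weighting estimate of air temperature and a root mean square error against observations. It is not the model code: the station coordinates and temperatures are hypothetical, the distance exponent of 2 is an assumed default, and distances are taken as planar distances in projected coordinates (km).

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: iterable of (x, y, value) tuples in projected coordinates.
    power: distance exponent (2.0 is an assumed default, not from the paper).
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # target coincides with a station: use its observation
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical stations: (x_km, y_km, air temperature in Kelvin)
stations = [
    (0.0, 0.0, 290.0),
    (10.0, 0.0, 292.0),
    (0.0, 10.0, 294.0),
    (10.0, 10.0, 296.0),
]

# Leave-one-out check: estimate each station from the remaining ones
observed = [s[2] for s in stations]
loo_pred = [
    idw(sx, sy, [s for j, s in enumerate(stations) if j != i])
    for i, (sx, sy, _) in enumerate(stations)
]
loo_rmse = rmse(loo_pred, observed)
```

Leaving each station out in turn gives a simple error estimate for the interpolation alone; it is not a substitute for the spatial cross-validation and external validation described above.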
An areal interpolation procedure was designed to align temperature data
with census tracts. First, 1-km prediction cells were reprojected to
align with NASA’s Gridded Population of the World version 4.11 data
product for 2000 and 2010 (25). We then used the exactextractr package
in R to calculate area-weighted mean population density values
in each cell (26). Finally, we used the population density and coverage
area of each prediction cell to weight temperature predictions in each
cell and compute a weighted average for each tract. The result was a
population-weighted areal interpolation of temperature within each
census tract.
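The weighting step can be sketched as follows, assuming each prediction cell that overlaps a tract contributes a weight equal to its population density multiplied by its coverage area within the tract. The record layout, the function name, and the handling of tracts with zero total weight are illustrative assumptions, not details taken from the original workflow.

```python
from collections import defaultdict


def tract_weighted_temperature(records):
    """Population-weighted mean temperature per census tract.

    records: iterable of (tract_id, temperature, pop_density, coverage_area),
    one per prediction cell overlapping a tract (hypothetical layout).
    Returns {tract_id: weighted mean, or None if the total weight is zero}.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for tract_id, temperature, pop_density, coverage_area in records:
        weight = pop_density * coverage_area
        num[tract_id] += weight * temperature
        den[tract_id] += weight
    return {t: (num[t] / den[t] if den[t] > 0 else None) for t in den}


# Hypothetical cells: two overlap tract "A", one overlaps tract "B"
records = [
    ("A", 300.0, 100.0, 0.5),
    ("A", 302.0, 300.0, 0.5),
    ("B", 295.0, 50.0, 1.0),
]
result = tract_weighted_temperature(records)
```

In this toy input, the denser cell in tract "A" pulls the tract value toward 302.0, which is the intended effect of weighting by population rather than by area alone.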
Air temperature data were transformed to create CDDs, restricted to
predictions in May–September of each year. We used Fahrenheit instead
of Celsius to align with energy policies in the U.S. CDD calculations
were adapted based on NOAA methods:
\(\text{Cooling degree days} = \sum_{i=1}^{n} \begin{cases} \frac{\text{Tmax}_{i} + \text{Tmin}_{i}}{2} - 65, & \text{if} > 0 \\ 0, & \text{otherwise} \end{cases}\) (1)
where Tmax was the maximum hourly temperature and Tmin was
the minimum on day i, summed over n days in
May–September.
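A minimal sketch of this calculation is given below, assuming the daily mean is the midpoint of Tmax and Tmin, as in NOAA's degree-day convention, with a base of 65 °F. The Kelvin-to-Fahrenheit conversion, the function names, and the input layout (one list of hourly temperatures per day) are assumptions made for illustration.

```python
def kelvin_to_fahrenheit(kelvin):
    """Convert a temperature from Kelvin to degrees Fahrenheit."""
    return (kelvin - 273.15) * 9.0 / 5.0 + 32.0


def cooling_degree_days(hourly_by_day, base=65.0):
    """Sum of daily cooling degree days over a season.

    hourly_by_day: list of days, each a list of hourly temperatures in
    degrees Fahrenheit (hypothetical layout).
    base: base temperature in degrees Fahrenheit.
    """
    total = 0.0
    for temps in hourly_by_day:
        daily_mean = (max(temps) + min(temps)) / 2.0  # midpoint of Tmax and Tmin
        total += max(daily_mean - base, 0.0)  # days below the base contribute 0
    return total


# Three hypothetical days with daily midpoints of 80, 55, and 70 degrees F
days = [[70.0, 90.0], [50.0, 60.0], [60.0, 80.0]]
cdd = cooling_degree_days(days)
```

Here the three days contribute 15, 0, and 5 degree days respectively, so only days whose midpoint exceeds the base add to the seasonal total.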