HNR2/HR20 Forecast Comparison Executive Summary

    • The DTC conducted its second extensive test of a HWRF configuration (HD33), demonstrating that a robust testing environment, functionally similar to EMC’s, is available.

    • Over 600 HWRF runs for the Eastern North Pacific and North Atlantic basins for the 2010 season were conducted in order to establish a benchmark of the community code (HD33) and to compare the forecasts against a counterpart set produced at EMC (H21A).

    • Track errors for HD33 increase linearly with time from near zero at initialization time to 280 nm at the 5-day forecast in both basins.

    • Absolute intensity errors increase sharply in the first 6 h of the forecast and then grow slowly out to 3 days, after which they remain virtually unchanged.

    • A negative intensity bias is noted for HD33 in the Pacific basin after the second day of forecasting, while in the Atlantic there is no statistically significant bias.

    • The HD33 forecast storm size is larger than observed, and grows continuously with lead time, for the 34-, 50-, and 64-kt wind radii in the Atlantic basin. In the Eastern North Pacific, the forecast size is overpredicted at the onset, but decreases with forecast lead time.

    • The worst track and absolute intensity forecasts (outliers) were identified so that forecast improvements for these poorly performing cases can be addressed in the future.

    • While an exact match between the HD33 and H21A forecasts was not expected due to differences in computational platform and a few other minor setup differences noted in Section 3, a large number of statistically significant differences in track, intensity, and structure were found between the two sets.

    • Diagnostic investigations conducted after the test revealed that the differences were caused by a coding error in the convective parameterization. This bug behaved differently on different computational platforms. After the bug was corrected, a small sample of forecasts was rerun and indicated that HD33 and H21A results were much closer.

    • Model output files have been archived and are available to the community for future studies. Forecast maps and verification graphics, along with this report and additional information, are available on this website.

Codes Employed

The software packages used in the HWRF HNR2/HR20 Forecast Comparison Test included:

    • WRF - revision 4594

    • WPS - revision 573

    • WPP - official release v3.2

    • GSI - official release v2.5

    • Vortex relocation and initialization, prep_hybrid, miscellaneous libraries and tools - hwrf-utilities revision 173

    • Princeton Ocean Model (POM) and POM initialization - revision 60

    • NCEP Coupler - revision 35

    • GFDL Vortex Tracker - revision 49

    • National Hurricane Center Verification System - revision 3

    • Statistical Programming Language R for aggregation of verification results and computation of confidence intervals

HWRF Model: HNR2 Configuration

    2011 Operational HWRF Baseline configured from the community code repositories and run by DTC.

HWRF Model: HR20 Configuration

    2011 Operational HWRF Baseline configured from the NCEP/EMC code repositories and run by EMC.

Differences in Configuration:

                              HNR2                            HR20
    Institution               DTC                             EMC
    Platform                  Linux                           IBM
    Source code               Community                       EMC
    Scripts                   DTC                             EMC
    Automation                NOAA GSD Workflow Manager       EMC HWRF History Sequence Manager
    I/O format                NetCDF                          Binary
    WPP                       WPP v3.2                        NAM Post modified for HWRF
    Tracker                   Community repository            EMC operational
    Sharpening in ocean init  Used in spin up Phases 3 and 4  Used in Phase 3 only (known bug)
    Snow Albedo               Older dataset                   Newer dataset

Domain Configuration

    The HWRF domain was configured as in the NCEP/EMC operational system. The atmospheric model employed a parent grid and a movable nested grid. The parent grid covered a 75 x 75 deg area with approximately 27 km horizontal grid spacing. There were a total of 216 x 432 grid points in the parent grid. The nest covered a 5.4 x 5.4 deg area with approximately 9 km grid spacing. There were a total of 60 x 100 grid points in the nest. The locations of the parent and nest, as well as the pole of the projection, varied from run to run and were dictated by the location of the storm at the time of initialization.

    HWRF was run coupled to the POM ocean model for Atlantic storms and in atmosphere-only mode for East Pacific storms. The POM domain for the Atlantic storms depended on the location of the storm at the initialization time and on the 72-h NHC forecast for the storm location. Those parameters defined whether the East Atlantic or United domain of the POM was used.

    The image shows the atmospheric parent and nest domains (yellow) and the United POM domain (blue).



Cases Run

Storms: 53 complete storms from 2008, 2009, and 2010.

    • 2008 Atlantic: Fay, Gustav, Hanna, Ike

    • 2008 Pacific: Elida, Fausto, Genevieve, Marie, Norbert

    • 2009 Atlantic: Bill, Claudette, Danny, Erika, Fred, Henri, Ida

    • 2009 Pacific: Felicia, Guillermo, Hilda, Ignacio, Jimena, Linda,
      Olaf, Rick

    • 2010 Atlantic: Alex, Bonnie, Collin, Danielle, Earl, Fiona,
      Gaston, Hermine, Igor, Julia, Karl, Lisa, Matthew, Nicole, Otto,

    • 2010 Pacific: Blas, Celia, Darby, Six, Estelle, Eight, Frank, Ten,
      Eleven, Georgette

Initializations: Every 6 h, in cycled mode.

Forecast Length: 126 hours; output files available every 6 hours
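
The cycling and output schedule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example dates are hypothetical, and including an output file at the 0-h lead time is an assumption not stated in this document.

```python
from datetime import datetime, timedelta

CYCLE_INTERVAL_H = 6      # initializations every 6 h (from the text)
FORECAST_LENGTH_H = 126   # forecast length in hours (from the text)
OUTPUT_INTERVAL_H = 6     # output files every 6 h (from the text)


def lead_hours():
    """Forecast lead times in hours; the 0-h entry is an assumption."""
    return list(range(0, FORECAST_LENGTH_H + 1, OUTPUT_INTERVAL_H))


def cycle_times(first, last):
    """Initialization times from `first` to `last` inclusive, every 6 h."""
    times = []
    t = first
    while t <= last:
        times.append(t)
        t += timedelta(hours=CYCLE_INTERVAL_H)
    return times


def valid_times(init):
    """Valid time of each output file for one initialization."""
    return [init + timedelta(hours=h) for h in lead_hours()]


if __name__ == "__main__":
    # Hypothetical one-day window, for illustration only.
    for init in cycle_times(datetime(2010, 8, 25, 0), datetime(2010, 8, 26, 0)):
        print(init, "->", valid_times(init)[-1])
```

Under these assumptions, each run produces 22 output times (0 h through 126 h).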

Verification

The characteristics of the forecast storm (location, intensity, structure) were compared against the Best Track using the National Hurricane Center (NHC) Verification System (NHCVx). The HNR2 ATCF files were produced by the DTC as part of this test, while the HR20 ATCF files were supplied by NOAA/NCEP/EMC. The NHCVx was run separately for each case, at 6-hourly forecast lead times, out to 120 h, in order to generate a distribution of errors. Verification was performed for any geographical location for which Best Track was available, including over land. No verification was performed when the observed storm was classified as a low or wave.
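
As an illustration of the per-case errors described above, the following Python sketch computes a track error as the great-circle distance between the forecast and Best Track positions, and an intensity error as forecast minus observed maximum wind. It is not the NHCVx code: the haversine formula, the spherical Earth radius, the sign convention, and all function names are assumptions made for this example.

```python
import math

# Mean Earth radius in nautical miles (6371.0 km / 1.852 km per n mi);
# using a spherical Earth is an assumption of this sketch.
EARTH_RADIUS_NMI = 3440.065


def track_error_nmi(fcst_lat, fcst_lon, obs_lat, obs_lon):
    """Great-circle distance (haversine) between forecast and observed
    storm positions, in nautical miles. Inputs are in degrees."""
    phi1 = math.radians(fcst_lat)
    phi2 = math.radians(obs_lat)
    dphi = phi2 - phi1
    dlam = math.radians(obs_lon - fcst_lon)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point round-off.
    return 2 * EARTH_RADIUS_NMI * math.asin(math.sqrt(min(1.0, a)))


def intensity_error_kt(fcst_vmax_kt, obs_vmax_kt):
    """Signed intensity error (forecast minus observed), in knots.
    A negative value means the forecast storm is too weak."""
    return fcst_vmax_kt - obs_vmax_kt


def absolute_intensity_error_kt(fcst_vmax_kt, obs_vmax_kt):
    """Magnitude of the intensity error, in knots."""
    return abs(intensity_error_kt(fcst_vmax_kt, obs_vmax_kt))
```

For example, a forecast position one degree of latitude away from the Best Track position gives a track error of roughly 60 nautical miles.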

An R script was run on a homogeneous sample of the HNR2 and HR20 datasets to aggregate the errors and to create summary metrics, including the mean and median of track error, intensity error, absolute intensity error, and the 34-, 50-, and 64-kt wind radii in all four quadrants. All metrics are accompanied by 95% confidence intervals to describe the uncertainty in the results due to sampling limitations.
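
The aggregation step can be sketched as follows. This is a Python stand-in, not the R script used in the test. It assumes that a homogeneous sample means keeping only the cases verified in both datasets, that each case is keyed by a hypothetical (storm, initialization time, lead hour) tuple, and that the interval on the mean uses a normal approximation; the actual interval method is not described in this document.

```python
import math
import statistics


def homogeneous_sample(errors_a, errors_b):
    """Keep only the cases present in both datasets.

    Each argument maps a case key, e.g. a hypothetical
    (storm_id, init_time, lead_hour) tuple, to an error value.
    Returns two lists in the same case order.
    """
    common = sorted(set(errors_a) & set(errors_b))
    return [errors_a[k] for k in common], [errors_b[k] for k in common]


def mean_with_ci95(values):
    """Mean and an approximate 95% confidence interval.

    Uses mean +/- 1.96 * s / sqrt(n), a normal approximation chosen
    for this sketch; it needs at least two values.
    """
    n = len(values)
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Made-up track errors (n mi) for illustration only.
    hnr2 = {("AL09", "2010082500", 24): 40.0,
            ("AL09", "2010082506", 24): 60.0,
            ("AL09", "2010082512", 24): 50.0}
    hr20 = {("AL09", "2010082500", 24): 45.0,
            ("AL09", "2010082512", 24): 52.0}
    a, b = homogeneous_sample(hnr2, hr20)
    print(mean_with_ci95(a), mean_with_ci95(b))
```

Restricting both datasets to the same cases keeps the two sets of summary metrics comparable, since neither is influenced by cases the other lacks.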

To compare the HNR2 and HR20 forecasts, pairwise differences (HNR2-HR20) of track error, absolute intensity error, and absolute wind radii error were computed and aggregated with an R script. Ninety-five percent confidence intervals on the median were computed to determine whether there is a statistically significant difference between the two configurations.
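
A minimal Python sketch of the pairwise comparison is given below. It is not the R script used in the test. Two choices here are assumptions: the confidence interval on the median is obtained with a percentile bootstrap, and a difference is treated as statistically significant when the interval excludes zero. All function names are hypothetical.

```python
import random
import statistics


def pairwise_differences(errors_hnr2, errors_hr20):
    """Case-by-case differences (HNR2 minus HR20) for matched lists."""
    if len(errors_hnr2) != len(errors_hr20):
        raise ValueError("inputs must be a homogeneous (matched) sample")
    return [a - b for a, b in zip(errors_hnr2, errors_hr20)]


def median_ci95_bootstrap(diffs, n_boot=2000, seed=0):
    """Median and a 95% percentile-bootstrap interval (an assumed method).

    A fixed seed keeps the result reproducible.
    """
    rng = random.Random(seed)
    n = len(diffs)
    medians = sorted(
        statistics.median(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    lower = medians[int(0.025 * n_boot)]
    upper = medians[int(0.975 * n_boot) - 1]
    return statistics.median(diffs), lower, upper


def is_significant(lower, upper):
    """Assumed criterion: the interval does not contain zero."""
    return lower > 0 or upper < 0


if __name__ == "__main__":
    # Made-up absolute intensity errors (kt) for illustration only.
    diffs = pairwise_differences([12.0, 15.0, 9.0, 20.0], [10.0, 16.0, 9.0, 14.0])
    med, lo, hi = median_ci95_bootstrap(diffs)
    print(med, lo, hi, is_significant(lo, hi))
```

With this sign convention, a negative median difference for an error metric means HNR2 had the smaller error.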