HNR2/HR20 Forecast Comparison

Executive Summary

    • The DTC conducted its first extensive test of an HWRF configuration, demonstrating that the development of a testing infrastructure functionally equivalent to NOAA EMC's is complete.

    • Over 1000 HWRF runs covering the Eastern North Pacific and Atlantic basins for the 2008, 2009, and 2010 seasons were conducted to produce a robust Reference Configuration of the community code (HWRF PS:85.98.98.88.88.2.4, nicknamed HNR2).

    • Track errors for HNR2 increase linearly with time from near zero at initialization time to 280 nm at the 5-day forecast.

    • A negative intensity bias is noted for HNR2 at all lead times, with a marked increase in errors in the first 6 h of the forecast, suggesting a problem with the initialization. Later results, not presented in this report, indicate that this problem has been mitigated in more recent HWRF configurations.

    • Absolute intensity errors increase sharply in the first 6 h of the forecast and then grow slowly out to 3 days, after which they remain virtually unchanged.

    • The forecast storm size is larger than the observed one for the 34-, 50-, and 64-kt wind radii, with the worst errors occurring for the 34-kt radii.

    • HNR2 was compared with HR20 (a similar HWRF test conducted by EMC) to assess how closely forecasts produced with the community code match forecasts produced with a similar configuration from the EMC code repository. An exact match was not expected because of differences in computational platform and a few other minor setup differences noted in Section 3.

    • The HNR2 forecast skill is shown to be similar to HR20, with very few statistically significant differences in track and intensity. Several statistically significant differences favoring HNR2 were noted in storm structure but their magnitude is small compared to the actual errors.

    • The worst track and absolute intensity forecasts (outliers) were identified so that forecast improvements for these poorly performing cases can be addressed in the future. These results are included in the Final Report.

    • Model output files have been archived and are available to the community for future studies.

Codes Employed

The software packages used in the HWRF HNR2/HR20 Forecast Comparison Test included:

    • WRF - revision 4594

    • WPS - revision 573

    • WPP - official release v3.2

    • GSI - official release v2.5

    • Vortex relocation and initialization, prep_hybrid, miscellaneous libraries and tools - hwrf-utilities revision 173

    • Princeton Ocean Model (POM) and POM initialization - revision 60

    • NCEP Coupler - revision 35

    • GFDL Vortex Tracker - revision 49

    • National Hurricane Center Verification System - revision 3

    • Statistical Programming Language R for aggregation of verification results and computation of confidence intervals

HWRF Model: HNR2 Configuration

    2011 Operational HWRF Baseline configured from the community code repositories and run by DTC.

HWRF Model: HR20 Configuration

    2011 Operational HWRF Baseline configured from the NCEP/EMC code repositories and run by EMC.

Differences in Configuration:
                         | HNR2                           | HR20
Institution              | DTC                            | EMC
Platform                 | Linux                          | IBM
Source code              | Community                      | EMC
Scripts                  | DTC                            | EMC
Automation               | NOAA GSD Workflow Manager      | EMC HWRF History Sequence Manager
I/O format               | NetCDF                         | Binary
WPP                      | WPP v3.2                       | NAM Post modified for HWRF
Tracker                  | Community repository           | EMC operational
Sharpening in ocean init | Used in spin up Phases 3 and 4 | Used in Phase 3 only (known bug)
Snow Albedo              | Older dataset                  | Newer dataset

Domain Configuration

    The HWRF domain was configured the same way as in the NCEP/EMC operational system. The atmospheric model employed a parent grid and a movable nested grid. The parent grid covered a 75 x 75 deg area with approximately 27 km horizontal grid spacing, for a total of 216 x 432 grid points. The nest covered a 5.4 x 5.4 deg area with approximately 9 km grid spacing, for a total of 60 x 100 grid points. The location of the parent and nest, as well as the pole of the projection, varied from run to run and were dictated by the location of the storm at the time of initialization.

    HWRF was run coupled to the POM ocean model for Atlantic storms and in atmosphere-only mode for East Pacific storms. The POM domain for the Atlantic storms depended on the location of the storm at the initialization time and on the 72-h NHC forecast for the storm location. Those parameters defined whether the East Atlantic or United domain of the POM was used.

    The image shows the atmospheric parent and nest domains (yellow) and the United POM domain (blue).


Cases Run

Storms: 53 complete storms from 2008, 2009, and 2010.

    • 2008 Atlantic: Fay, Gustav, Hanna, Ike

    • 2008 Pacific: Elida, Fausto, Genevieve, Marie, Norbert

    • 2009 Atlantic: Bill, Claudette, Danny, Erika, Fred, Henri, Ida

    • 2009 Pacific: Felicia, Guillermo, Hilda, Ignacio, Jimena, Linda,
      Olaf, Rick

    • 2010 Atlantic: Alex, Bonnie, Collin, Danielle, Earl, Fiona,
      Gaston, Hermine, Igor, Julia, Karl, Lisa, Matthew, Nicole, Otto,
    • 2010 Pacific: Blas, Celia, Darby, Six, Estelle, Eight, Frank, Ten,
      Eleven, Georgette

Initializations: Every 6 h, in cycled mode.

Forecast Length: 126 hours; output files available every 6 hours.

Verification

The characteristics of the forecast storm (location, intensity, structure) were compared against the Best Track using the National Hurricane Center (NHC) Verification System (NHCVx). The HNR2 ATCF files were produced by the DTC as part of this test, while the HR20 ATCF files were supplied by NOAA/NCEP/EMC. The NHCVx was run separately for each case, at 6-hourly forecast lead times, out to 120 h, in order to generate a distribution of errors. Verification was performed for any geographical location for which Best Track was available, including over land. No verification was performed when the observed storm was classified as a low or wave.
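As a rough illustration of the quantities being verified, the sketch below computes a track error as the great-circle distance between the forecast and Best Track positions, and an intensity error as forecast minus observed maximum wind. It is not the NHCVx code: the function names, the haversine formula on a spherical Earth, the sign convention, and the sample coordinates are all assumptions made for this example.

```python
import math

# Mean Earth radius expressed in nautical miles (1 nm = 1.852 km).
EARTH_RADIUS_NM = 6371.0088 / 1.852


def track_error_nm(fcst_lat, fcst_lon, obs_lat, obs_lon):
    """Great-circle distance (nm) between forecast and observed storm centers.

    Hypothetical helper using the haversine formula; inputs are in degrees.
    """
    phi1 = math.radians(fcst_lat)
    phi2 = math.radians(obs_lat)
    dphi = phi2 - phi1
    dlam = math.radians(obs_lon - fcst_lon)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def intensity_error_kt(fcst_vmax_kt, obs_vmax_kt):
    """Signed intensity error (kt); negative means the forecast is too weak."""
    return fcst_vmax_kt - obs_vmax_kt


# Made-up positions one degree of latitude apart (roughly 60 nm).
example_track_error = track_error_nm(25.0, -80.0, 26.0, -80.0)
example_intensity_error = intensity_error_kt(85.0, 100.0)
```

With this sign convention, a negative mean intensity error corresponds to the kind of negative intensity bias described in the summary, and the absolute intensity error is simply the absolute value of the signed error.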

An R script was run on a homogeneous sample of the HNR2 and HR20 datasets to aggregate the errors and to create summary metrics, including the mean and median of track error, intensity error, absolute intensity error, and radii of 34-, 50-, and 64-kt wind in all four quadrants. All metrics are accompanied by 95% confidence intervals to describe the uncertainty in the results due to sampling limitations.
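A homogeneous sample keeps only the cases that were verified in both datasets, so that the two configurations are summarized over exactly the same forecasts. The test itself used R; the Python sketch below only illustrates the idea. The case key (storm, initialization time, lead hour), the function name, and all values are hypothetical.

```python
from statistics import mean, median


def homogeneous_sample(errors_a, errors_b):
    """Return (key, error_a, error_b) for cases present in both datasets.

    Each input maps a case key (storm, init time, lead hour) to an error.
    """
    common = sorted(set(errors_a) & set(errors_b))
    return [(key, errors_a[key], errors_b[key]) for key in common]


# Made-up track errors (nm), for illustration only.
hnr2 = {
    ("STORM_A", "2010010100", 24): 45.0,
    ("STORM_A", "2010010100", 48): 90.0,  # no HR20 counterpart, so dropped
    ("STORM_A", "2010010106", 24): 50.0,
}
hr20 = {
    ("STORM_A", "2010010100", 24): 40.0,
    ("STORM_A", "2010010106", 24): 62.0,
}

sample = homogeneous_sample(hnr2, hr20)
hnr2_mean = mean(a for _, a, _ in sample)
hnr2_median = median(a for _, a, _ in sample)
hr20_mean = mean(b for _, _, b in sample)
```

In practice the aggregation would be repeated separately for each forecast lead time, since errors at different lead times are not comparable.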

To compare the HNR2 and HR20 forecasts, pairwise differences (HNR2-HR20) of track error, absolute intensity error, and absolute wind radii error were computed and aggregated with an R script. Ninety-five percent confidence intervals on the median were computed to determine whether there is a statistically significant difference between the two configurations.
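The pairwise test can be sketched as follows. The source does not state how the confidence interval on the median was computed, so this example substitutes a percentile bootstrap, which is an assumption and not necessarily the method used in the R script. The function names and the error values are hypothetical.

```python
import random
from statistics import median


def median_ci_bootstrap(values, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the median of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    medians = sorted(median(rng.choices(values, k=n)) for _ in range(n_boot))
    alpha = (1 - level) / 2
    lo_idx = round(alpha * n_boot)
    hi_idx = round((1 - alpha) * n_boot) - 1
    return medians[lo_idx], medians[hi_idx]


def is_significant(ci):
    """A difference is treated as significant when the interval excludes zero."""
    lo, hi = ci
    return lo > 0 or hi < 0


# Made-up absolute errors on a homogeneous sample (same cases, same order).
hnr2_err = [12.0, 20.0, 15.0, 30.0, 18.0, 25.0, 22.0, 16.0]
hr20_err = [10.0, 17.0, 11.0, 24.0, 15.0, 19.0, 20.0, 13.0]

# Pairwise differences HNR2-HR20: negative values favor HNR2.
diffs = [a - b for a, b in zip(hnr2_err, hr20_err)]
ci = median_ci_bootstrap(diffs)
significant = is_significant(ci)
```

Because the differences are of errors, an interval lying entirely below zero would favor HNR2, one entirely above zero would favor HR20, and one containing zero indicates no statistically significant difference at the chosen level.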