Welcome to the Third Annual 2009 AMS AI Contest
Sponsored by the
American Meteorological Society
Committee on Artificial Intelligence Applications to
Environmental Science
Note: Due date for predictions changed to 23:59 MST Jan 7, 2010.
"Uncertainty is thus a fundamental characteristic of weather,
seasonal climate, and hydrological prediction, and no forecast
is complete without a description of its uncertainty."
... from Completing the Forecast:
Characterizing and Communicating Uncertainty for Better Decisions
Using Weather and Climate Forecasts, National Academies
Press.
A forecast that includes an expression of uncertainty or a probability
can be much more useful than the forecast of a single value.
For this reason, the 2009 AMS Artificial Intelligence Competition
will be based on forecasting a probability.
The challenge for this year's competition is to predict the
probability of turbulence exceeding a specific threshold.
The AMS Artificial Intelligence contest is open to all, and is intended
to promote the study of statistical artificial intelligence
techniques applied to meteorology.
Turbulence Prediction: Problem Description
The AI Contest this year is focused on predicting atmospheric turbulence that
affects aviation. It uses a dataset collected during summer months (June -
September) in which convectively-induced turbulence (CIT)--turbulence in and
around thunderstorms--is particularly prevalent, though mountain-wave
turbulence (MWT) and clear-air turbulence (CAT) are also present. Studies have
suggested that CIT is responsible for over 60% of turbulence-related aircraft
accidents; thus, accurate real-time turbulence diagnoses that include CIT could
improve airline safety and also help mitigate the significant delays that now
frequently afflict the national airspace system during periods of widespread
convection.
The mechanisms for the generation and propagation of atmospheric turbulence,
and CIT in particular, are a topic of current research and are still only
partially understood. However, the likelihood of CIT is thought to be related
to the proximity (vertical and horizontal), intensity, depth and extent of
convection as well as the state of the atmosphere around the storm. It seems
plausible that an empirical model that uses numerical weather prediction model
data to get an indication of larger-scale environmental conditions and
satellite, radar reflectivity, and lightning observations that indicate the
extent and severity of the storms and associated clouds could have good skill
in predicting turbulence. MWT could similarly be modeled based on location
(e.g., the presence of rough topography) as well as environmental conditions
reflected in numerical weather prediction model data, and CAT may be predicted
based on environmental conditions. Observations of turbulence generally entail
pilot reports or automated reports. Automated reports of eddy dissipation rate
(EDR, a measure of atmospheric turbulence) produced every minute by a
collection of commercial aircraft will provide the "truth" data for this
contest.
Entering the AMS AI Contest
Anybody may enter as follows.
- Download the training and test datasets.
They are packaged as a single zip file, at
http://verif.rap.ucar.edu/ams.ai/ams_ai_2009.zip .
- Calculate your predicted probabilities for the test dataset.
- Write an abstract discussing your forecasting techniques, tests,
and results.
Submit your abstract (not answers) to Gillian Peguero:
gpeguero@ametsoc.org
by Nov 16, 2009
- Pay the AMS abstract fee, $90 USD,
by calling Gillian Peguero at the AMS at 617-227-2426 x251.
Since this is an educational contest, submitting an abstract
and writing a paper are required.
You are encouraged to attend the annual AMS meeting in Atlanta,
and to present your results in person. Generally everyone who
enters presents a short talk on their techniques. However, attendance
at the AMS meeting is not required to enter the contest. If you cannot
attend you may send us a poster to display, or we will display your paper.
- Send your predicted probabilities to us at:
ams-ai-2009@rap.ucar.edu
before 23:59 MST Jan 7, 2010 .
Make sure your entry has the format described below.
- Finally, submit your paper online at:
http://ams.confex.com/ams/
before 23:59 MST Jan 13, 2010 .
Gillian Peguero will send you the userid and password to access the site
when you submit your abstract.
The paper is sometimes called an extended abstract, and is typically
three or four pages long.
We will announce the winners after receiving the papers.
Judging Criteria
There are two important attributes for probabilistic forecasts - reliability
and resolution. For a probabilistic forecast to be reliable, the frequency of
an observed event should agree with the forecast probability value. For
example, when a forecast of 20% is made, one should observe this event 20% of
the time. When this is true, a forecast is considered reliable.
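One way to check reliability in practice is to group forecasts into probability bins and compare the mean forecast in each bin with the observed event frequency. The following is a minimal sketch, not the judges' procedure; the function name `reliability_table`, the choice of ten equal-width bins, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def reliability_table(forecasts, outcomes, n_bins=10):
    """Bin forecasts by probability and compare them with observed frequency.

    Returns a list of (bin index, count, mean forecast, observed frequency).
    """
    cells = defaultdict(lambda: [0, 0.0, 0])  # bin -> [count, forecast sum, events]
    for p, o in zip(forecasts, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        cell = cells[b]
        cell[0] += 1
        cell[1] += p
        cell[2] += o
    return [
        (b, n, fsum / n, events / n)
        for b, (n, fsum, events) in sorted(cells.items())
    ]

# Toy data (made up): forecasts of 20% verify 1 time in 5, forecasts of 80%
# verify 4 times in 5, so this small forecast set is reliable.
forecasts = [0.2] * 5 + [0.8] * 5
outcomes = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
table = reliability_table(forecasts, outcomes)
for b, n, mean_p, freq in table:
    print(f"bin {b}: n={n} mean forecast={mean_p:.2f} observed freq={freq:.2f}")
```

A forecast set is reliable to the extent that the last two columns agree in every bin that contains enough cases to estimate a frequency.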
However, a reliable forecast is not necessarily a useful forecast. By only
forecasting the long-term chance of an event occurring, one would
have a reliable forecast. However, one can see this would be of limited
utility. For this reason, we also need to consider the resolution of a
forecast. A forecast with perfect resolution will always correctly forecast
either 0% or 100%. A completely random forecast, or a constant forecast such
as the climatological average probability, has no resolution.
To reward both reliability and resolution, the forecasts in this competition
will be assessed using the Brier Skill Score (BSS). The Brier Skill Score
combines features of resolution, reliability and observational uncertainty. The
reliability component is the weighted mean squared difference between each
forecast probability and the observed frequency of the event when that
probability was forecast, weighted by how often each probability was issued.
This component should be minimized. The resolution component is the weighted
mean squared difference between those observed frequencies and the
climatological frequency of the event. This value should be maximized, which
happens when forecasts are either 0% or 100%, are correct, and therefore occur
in proportion to the climatological frequency.
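The score can be sketched in a few lines. The code below assumes the usual definitions: the Brier score (BS) is the mean squared difference between forecast probability and the 0/1 outcome, BSS = 1 - BS / BS_ref with the sample climatology as the reference forecast, and BS = reliability - resolution + uncertainty when forecasts are grouped by distinct value. The function names and toy data are illustrative, and the contest's own scoring code may differ in detail.

```python
from collections import defaultdict

def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

def brier_skill_score(forecasts, outcomes):
    """BSS relative to a constant forecast of the sample climatology.

    Undefined (division by zero) if the event always or never occurs.
    """
    base_rate = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(forecasts, outcomes) / bs_ref

def brier_decomposition(forecasts, outcomes):
    """Return (reliability, resolution, uncertainty), grouped by forecast value."""
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)  # forecast value -> outcomes seen with it
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)
    rel = sum(len(obs) * (p - sum(obs) / len(obs)) ** 2
              for p, obs in groups.items()) / n
    res = sum(len(obs) * (sum(obs) / len(obs) - base_rate) ** 2
              for obs in groups.values()) / n
    unc = base_rate * (1.0 - base_rate)
    return rel, res, unc

# Toy data (made up for illustration).
forecasts = [0.1] * 4 + [0.9] * 4
outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
bs = brier_score(forecasts, outcomes)
bss = brier_skill_score(forecasts, outcomes)
rel, res, unc = brier_decomposition(forecasts, outcomes)
print(f"BS={bs:.4f} BSS={bss:.4f} rel={rel:.4f} res={res:.4f} unc={unc:.4f}")
```

A constant forecast of the sample climatology scores BSS = 0 and a perfect forecast scores BSS = 1, so positive values indicate skill over climatology.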
It should be noted that the Brier Skill Score is not without its weaknesses.
The value of the BSS, like all skill scores, is dependent on the sample
climatology. Different climatologies will result in different scores. In this
competition, everyone will be using the same sample dataset, so comparing
scores is appropriate. Also, with these two components, a single Brier Skill
Score can be the result of different combinations of resolution and reliability
components. In the real world, different uses will have specific requirements
for resolution and reliability.
Further information about calculating the Brier Skill Score can be found in the
following references.
Training Dataset Format
The training dataset contains 103990 data rows,
and 136 columns.
The dataset is in ASCII format with comma separated values.
The columns are described below under "Variables".
The ismog column is the predictand variable.
The peak_edr is provided for
your information but is not to be used as a predictor variable.
The peak_edr is not found in the test dataset.
The remaining variables might, or might not,
be useful in predicting ismog.
All variables can contain missing values, encoded as "NA".
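A minimal sketch of reading such a file with the Python standard library follows. It assumes that the first line of the file is a header row of column names and that every non-missing value is numeric; neither is stated above, so check the actual files first. The function names and the three-column sample are made up for illustration.

```python
import csv
import io

def read_rows(stream):
    """Yield one dict per data row, mapping "NA" to None and numbers to float.

    Assumes a header row with column names (an assumption, not stated in the
    contest description).
    """
    for row in csv.DictReader(stream):
        yield {
            name: None if value.strip() == "NA" else float(value)
            for name, value in row.items()
        }

def missing_fraction(rows, name):
    """Fraction of rows in which the named column is missing."""
    return sum(1 for row in rows if row[name] is None) / len(rows)

# Tiny made-up sample in the assumed layout; real rows have 136 columns.
sample = io.StringIO("altitude,dbz,ismog\n35000,NA,0\n37000,22.5,1\n")
rows = list(read_rows(sample))
print(rows[0])
print(missing_fraction(rows, "dbz"))
```

For a real file, the same function can be given an open file handle, for example `open(path, newline="")`, where `path` is wherever the unzipped training file was saved.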
Test Dataset Format
The test dataset contains 50127 data rows in a format
similar to the training dataset,
but without the peak_edr and ismog columns.
The contest problem is to predict, for each data line (row)
in the test dataset,
the probability that ismog is true.
This is the same as the probability that peak_edr >= 0.25.
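The relationship between the two training-only columns can be written out directly. This sketch uses the 0.25 threshold given above; the function names are illustrative, and treating a missing peak_edr as an unknown label is an assumption. The second function computes the climatological probability of the event in a sample, which is the natural constant baseline forecast.

```python
MOG_THRESHOLD = 0.25  # EDR threshold for moderate-or-greater turbulence

def is_mog(peak_edr):
    """Return 1 if peak_edr >= 0.25, 0 if below, None if the value is missing."""
    if peak_edr is None:
        return None
    return 1 if peak_edr >= MOG_THRESHOLD else 0

def climatological_probability(peak_edrs):
    """Fraction of non-missing measurements that are moderate-or-greater."""
    labels = [is_mog(value) for value in peak_edrs if value is not None]
    return sum(labels) / len(labels)

# Made-up EDR values for illustration.
print(is_mog(0.25), is_mog(0.24))
print(climatological_probability([0.05, 0.30, None, 0.25, 0.10]))
```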
Contest Entry Format
All contest entries must be in the following format.
The entry must be an uncompressed ASCII file having
50127 lines - the same number of data rows as the test dataset.
Each line (row) should have two columns separated by a comma.
The columns are:
- Line (row) number
- Predicted probability p, where 0 <= p <= 1,
that ismog is true.
Note this is a probability (0 to 1), not a percentage (0 to 100).
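A sketch of writing and checking an entry in this format is shown below. The function names are hypothetical, and printing six decimal places is an assumption, since the number of digits is not specified. The line numbers are taken to be the ones supplied with each test row.

```python
def format_entry(line_numbers, probabilities):
    """Build the entry text: one "line_number,probability" row per test row."""
    if len(line_numbers) != len(probabilities):
        raise ValueError("need exactly one probability per test row")
    lines = []
    for number, p in zip(line_numbers, probabilities):
        if not 0.0 <= p <= 1.0:  # also rejects NaN and percentages such as 20
            raise ValueError(f"row {number}: {p} is not a probability in [0, 1]")
        lines.append(f"{number},{p:.6f}")
    return "\n".join(lines) + "\n"

def check_entry(text, expected_rows=50127):
    """Raise ValueError if the entry text does not match the required format."""
    rows = text.splitlines()
    if len(rows) != expected_rows:
        raise ValueError(f"expected {expected_rows} lines, found {len(rows)}")
    for row in rows:
        number, p = row.split(",")  # exactly two comma-separated columns
        int(number)
        if not 0.0 <= float(p) <= 1.0:
            raise ValueError(f"bad probability in row: {row}")

# Two-row illustration; a real entry has 50127 rows.
entry = format_entry([1, 2], [0.5, 0.0])
check_entry(entry, expected_rows=2)
print(entry, end="")
```

Running `check_entry` on the saved file with the default `expected_rows` before submitting catches the two most likely mistakes: a wrong row count and percentages in place of probabilities.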
Variables
In the AI Contest dataset, collocated observation and model-derived variables
have been extracted for each aircraft EDR measurement. The object of this
contest is to use these variables to predict the probability that the measured
turbulence is moderate-or-greater (MoG). The 'target' or 'truth' variable to
be predicted is ismog, which is 0 (false) if the EDR measurement
(peak_edr, also included) reflects null or light turbulence, and 1
(true) if it is at or above the threshold for MoG turbulence. The peak_edr
and ismog fields are provided in the training dataset, but not in the
testing dataset.
The NWP model, satellite and radar fields surrounding the plane's EDR
measurement location have been used to calculate potential predictor variables
that indicate a plane's distance from various intensity levels of storms and
clouds, as well as environmental characteristics at the measurement point.
These variables may have skill individually or in combination. However, there
are many times that the satellite or radar readings are missing or null; those
field values are labeled 'NA' in the data set. (Since MoG turbulence is quite
rare, the proportion of null to positive instances in both training and testing
datasets has been manipulated for the purposes of this contest by removing 2/3
of the null report instances.)
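Because the null reports were thinned, probabilities fitted to these data are higher than they would be for the unthinned reports. No correction is needed for contest entries, since the test set was thinned in the same way. For readers who want to relate a contest probability to the original data, the sketch below rescales the odds, assuming that "null" means a non-MoG report and that the removed nulls were chosen at random, independently of the predictors (neither is stated above). The function name is hypothetical.

```python
def undo_null_subsampling(p_contest, null_keep_fraction=1.0 / 3.0):
    """Convert a probability valid for the thinned data to the unthinned data.

    Keeping a fraction k of the null cases multiplies the odds of the event
    by 1/k, so the original probability is p*k / (p*k + (1 - p)).
    """
    kept = p_contest * null_keep_fraction
    return kept / (kept + (1.0 - p_contest))

# A 50% probability in the contest data corresponds to 25% in the unthinned
# data under the assumptions above.
print(undo_null_subsampling(0.5))
```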
In the comma-separated value (CSV) training and testing data files, each line
represents an instance with predictor and predictand variables associated with
an aircraft turbulence measurement. Below are brief descriptions of the
variables. A line number is also provided for each instance.
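Fields in the training set only: (last 2 fields)
- peak_edr: the peak EDR measurement (for information only; not to be used as a
predictor)
- ismog: 1 (true) if peak_edr >= 0.25, i.e. MoG turbulence; 0 (false) otherwise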
Predictor fields in both training and testing sets: (listed in order)
Airplane information at time the EDR measurement was recorded:
- aircraft id: identifier of the reporting aircraft (1-98)
- utc: Coordinated Universal Time of the EDR measurement (formatted as hhmmss,
where hh = hour, mm = minute and ss = second)
- latitude: latitude of the EDR measurement location
- longitude: longitude of the EDR measurement location
- altitude: altitude of the EDR measurement location
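The utc field packs the time of day into one hhmmss number, so it usually needs to be unpacked before use as a predictor. The sketch below assumes the value may arrive with or without leading zeros (for example 000530 or 530), which is not stated above; the function names are illustrative.

```python
def parse_utc(value):
    """Split an hhmmss value into (hour, minute, second)."""
    number = int(float(value))
    if not 0 <= number <= 235959:
        raise ValueError(f"not an hhmmss time: {value!r}")
    text = f"{number:06d}"  # restore any leading zeros that were dropped
    hour, minute, second = int(text[0:2]), int(text[2:4]), int(text[4:6])
    if minute > 59 or second > 59:
        raise ValueError(f"not an hhmmss time: {value!r}")
    return hour, minute, second

def seconds_since_midnight(value):
    """Time of day in seconds, convenient as a single numeric feature."""
    hour, minute, second = parse_utc(value)
    return 3600 * hour + 60 * minute + second

print(parse_utc("530"))
print(seconds_since_midnight("235959"))
```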
Lightning information:
- ltg_strikes_1hr: the number of lightning strikes in the previous hour near the
EDR measurement location
Satellite radiance channels from the NOAA GOES imager:
- ch2_3_9_micron: 3.9 micron radiance at the EDR measurement location
- ch2_3_9_micron_DMean10: mean value in a 10km disc around the
EDR measurement location
- ch2_3_9_micron_DMean20: ... in a 20km disc
- ch2_3_9_micron_DMean40: ... in a 40km disc
- ch2_3_9_micron_DMean80: ... in an 80km disc
- ch2_3_9_micron_DMin10: minimum value in a 10km disc around the EDR measurement
- ch2_3_9_micron_DMin20: ... in a 20km disc
- ch2_3_9_micron_DMin40: ... in a 40km disc
- ch2_3_9_micron_DMin80: ... in an 80km disc
- ch3_6_7_micron: 6.7 micron radiance at the EDR measurement
- ch3_6_7_micron_DMean10: same meanings as above
- ch3_6_7_micron_DMean20
- ch3_6_7_micron_DMean40
- ch3_6_7_micron_DMean80
- ch3_6_7_micron_DMin10
- ch3_6_7_micron_DMin20
- ch3_6_7_micron_DMin40
- ch3_6_7_micron_DMin80
- ch4_11_micron: 11 (approx.) micron radiance at the EDR measurement
- ch4_11_micron_DMean10: same meanings as above
- ch4_11_micron_DMean20
- ch4_11_micron_DMean40
- ch4_11_micron_DMean80
- ch4_11_micron_DMin10
- ch4_11_micron_DMin20
- ch4_11_micron_DMin40
- ch4_11_micron_DMin80
- ch6_13_3_micron: 13.3 micron radiance at the EDR measurement
- ch6_13_3_micron_DMean10: same meanings as above
- ch6_13_3_micron_DMean20
- ch6_13_3_micron_DMean40
- ch6_13_3_micron_DMean80
- ch6_13_3_micron_DMin10
- ch6_13_3_micron_DMin20
- ch6_13_3_micron_DMin40
- ch6_13_3_micron_DMin80
NEXRAD radar-derived storm intensity and proximity information:
- dbz: composite radar reflectivity (dBZ) at the EDR measurement location
- dbz_DMax10: maximum dbz value in a 10km disc around the EDR measurement
- dbz_DMax20: ... in a 20km disc
- dbz_DMax40: ... in a 40km disc
- dbz_DMax80: ... in an 80km disc
- dbz_DMean10: mean dbz value in a 10km disc around the EDR measurement
- dbz_DMean20: ... in a 20km disc
- dbz_DMean40: ... in a 40km disc
- dbz_DMean80: ... in an 80km disc
- dbz_DNGood10: number of dbz pixels above detection threshold in a
10km disc around the EDR measurement
- dbz_DNGood20: ... in a 20km disc
- dbz_DNGood40: ... in a 40km disc
- dbz_DNGood80: ... in an 80km disc
- NSSL_DBZ10_Wedge0_5_mindistance: minimum horizontal distance to an
NSSL 3-D reflectivity mosaic pixel >= 10 dBZ
- NSSL_DBZ20_Wedge0_5_mindistance: ... >= 20 dBZ
- NSSL_DBZ30_Wedge0_5_mindistance: ... >= 30 dBZ
- NSSL_DBZ40_Wedge0_5_mindistance: ... >= 40 dBZ
- NSSL_VIL0_14_Wedge0_5_mindistance: minimum horizontal distance to a
vertically integrated liquid (VIL) pixel >= 0.14 kg m^-2
- NSSL_VIL0_76_Wedge0_5_mindistance: ... >= 0.76 kg m^-2
- NSSL_VIL3_50_Wedge0_5_mindistance: ... >= 3.50 kg m^-2
- NSSL_VIL6_90_Wedge0_5_mindistance: ... >= 6.90 kg m^-2
- UniEchoTops20_Wedge0_5_mindistance: minimum horizontal distance to
UniSys radar reflectivity echo top >= 20,000 ft.
- UniEchoTops30_Wedge0_5_mindistance: ... >= 30,000 ft.
- nssl_10dbz_top_0_: radar reflectivity 10 dBZ "echo top" at EDR measurement location derived from NSSL 3-D reflectivity mosaic
- nssl_10dbz_top_DMax0_10: maximum 10 dBZ echo top value in a 10km disc around the EDR measurement
- nssl_10dbz_top_DMax0_20: ... in a 20km disc
- nssl_10dbz_top_DMax0_40: ... in a 40km disc
- nssl_10dbz_top_DMax0_80: ... in an 80km disc
- nssl_10dbz_top_DMean0_10: mean 10 dBZ echo top value in a 10km disc around the EDR measurement
- nssl_10dbz_top_DMean0_20: ... in a 20km disc
- nssl_10dbz_top_DMean0_40: ... in a 40km disc
- nssl_10dbz_top_DMean0_80: ... in an 80km disc
- nssl_18dbz_top_0_: radar reflectivity 18 dBZ "echo top" at EDR measurement location derived from NSSL 3-D reflectivity mosaic
- nssl_18dbz_top_DMax0_10: maximum 18 dBZ echo top value in a 10km disc around the EDR measurement
- nssl_18dbz_top_DMax0_20: ... in a 20km disc
- nssl_18dbz_top_DMax0_40: ... in a 40km disc
- nssl_18dbz_top_DMax0_80: ... in an 80km disc
- nssl_18dbz_top_DMean0_10: mean 18 dBZ echo top value in a 10km disc around the EDR measurement
- nssl_18dbz_top_DMean0_20: ... in a 20km disc
- nssl_18dbz_top_DMean0_40: ... in a 40km disc
- nssl_18dbz_top_DMean0_80: ... in an 80km disc
- nssl_18dbz_top_DNGood0_10: number of "good" 18 dBZ echo top pixels in a 10km disc around the EDR measurement
- nssl_18dbz_top_DNGood0_20: ... in a 20km disc
- nssl_18dbz_top_DNGood0_40: ... in a 40km disc
- nssl_18dbz_top_DNGood0_80: ... in an 80km disc
- nssl_dbz_0_: NSSL 3-D radar reflectivity measurement at the EDR measurement location
- nssl_dbz_DMax0_10: maximum 3-D radar reflectivity in a horizontal 10km disc around the EDR measurement
- nssl_dbz_DMax0_20: ... in a 20km disc
- nssl_dbz_DMax0_40: ... in a 40km disc
- nssl_dbz_DMax0_80: ... in an 80km disc
- nssl_dbz_DMean0_10: mean 3-D radar reflectivity in a horizontal 10km disc around the EDR measurement
- nssl_dbz_DMean0_20: ... in a 20km disc
- nssl_dbz_DMean0_40: ... in a 40km disc
- nssl_dbz_DMean0_80: ... in an 80km disc
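The disc-statistic columns in the satellite and radar lists follow a regular naming pattern: a base field, a statistic (DMean, DMin, DMax or DNGood), and a radius of 10, 20, 40 or 80 km. The sketch below parses that pattern so that columns can be grouped by field, statistic, or radius. The regular expression is inferred from the names listed above, and it simply skips the "0_" that appears in the nssl_* names, whose meaning is not described here.

```python
import re

# Pattern inferred from the listed column names (an assumption, not a spec).
DISC_STAT = re.compile(
    r"^(?P<field>.+)_D(?P<stat>Mean|Min|Max|NGood)(?:0_)?(?P<radius_km>10|20|40|80)$"
)

def parse_disc_column(name):
    """Return (field, statistic, radius in km), or None for other columns."""
    match = DISC_STAT.match(name)
    if match is None:
        return None
    return match["field"], match["stat"], int(match["radius_km"])

for name in ("ch2_3_9_micron_DMean10", "nssl_18dbz_top_DNGood0_80",
             "dbz_DMax40", "NSSL_DBZ10_Wedge0_5_mindistance"):
    print(name, "->", parse_disc_column(name))
```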
NWP model-derived fields:
The following fields are provided by or calculated from the output of the Rapid Update Cycle (RUC) numerical weather prediction model analysis. The values are linearly interpolated from the model grid to the location of the EDR measurement. Most are 3-D fields, but some are 2-D. Individual descriptions are not provided.
- NVA_Linear
- PRES_SFC_Linear
- PWAT_ATMO_Linear
- RICH_Linear
- RITW_Linear
- ROL_Linear
- RPVORT_Linear
- RSTAB_Linear
- SATRI_Linear
- SCATR_Linear
- SHOWI_Linear
- SIGW10_Linear
- SMHGT_Linear
- SMTEMP_Linear
- SPDEDRL_Linear
- SPDSIGX_Linear
- STONE_Linear
- TOTALS_Linear
- UBFDT_Linear
- VPTMP_HYBL_Linear
- VWS_Linear
- DTF3_Linear
- DTF5_Linear
- EDRLL_Linear
- EDRS10_Linear
- ELLROD1_Linear
- F2D_Linear
- FRNTGTH_Linear
- IWMR_HYBL_Linear
- LAPSE_Linear
- MIXR_HYBL_Linear
- MSLMA_MSL_Linear
- NCSU1_Linear
- NCSU2_Linear
- ABSDIV_Linear
- CLIMO_Linear
- BROWN2_Linear
- DEFSQ_Linear
Questions
If you have questions please email them to:
ams-ai-2009@rap.ucar.edu
Questions and answers will be posted on this web site for all to read.
Please check this site for updates before sending your question.
Acknowledgements
We'd like to thank the
UCAR Research Applications Laboratory
for their support in running the forecasting contest.
The small print
Currently there is no prize offered other than recognition.
If your organization is interested in sponsoring, please contact
us at ams-ai-2009@rap.ucar.edu. Past sponsorships have been
$1000, which covers prizes for the first few places. Sponsors
get recognized in the AMS conference agenda, on the contest web site,
and at the paper presentations.
The decision of the judges is final.
The AMS, UCAR, and all persons and organizations associated
with the contest have no liability for any actions associated
with the contest.
Any communications with us regarding the contest may be published.
Version: 2009-08-24 A