On the Evaluation of Skill in Binary Forecast


  • Thitithep Sitthiyot, Faculty of Commerce and Accountancy, Chulalongkorn University, Thailand
  • Kanyarat Holasut, Faculty of Engineering, Khon Kaen University, Thailand


Keywords: Directional Change, Forecast Skill Score, Forecast Verification, Random Event, Rare or Extreme Event


Good predictions matter for scientific, economic, and administrative purposes, so it is necessary to know whether a predictor is skillful enough to predict the future. Given the growing reliance on predictions across disciplines, this study devises a prediction skill index (PSI). Twenty-four numerical examples demonstrate how the PSI method works. The results show that the PSI awards the same score to random prediction and to always predicting the same value, and that it awards nontrivial scores for correct predictions of rare or extreme events. Moreover, the PSI distinguishes a perfect forecast of rare or extreme events from a perfect forecast of random events by awarding them different skill scores, whereas conventional methods cannot and award both the same score. The Bank of Thailand's forecasts of real gross domestic product growth between 2000 and 2019 are also used to demonstrate how the PSI evaluates a forecaster's skill in practice.
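The contrast with conventional methods can be made concrete with a minimal sketch. It does not implement the PSI, whose formula is not given on this page. It computes two conventional measures from the reference list, the Peirce skill score (Peirce, 1884) and the Heidke skill score (Heidke, 1926), from a 2x2 contingency table in their standard textbook forms. The class name, function names, and example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2x2 table for a yes/no forecast (illustrative field names)."""
    hits: int               # event forecast and observed
    false_alarms: int       # event forecast but not observed
    misses: int             # event observed but not forecast
    correct_negatives: int  # event neither forecast nor observed


def peirce_skill_score(t: Contingency) -> float:
    """Hit rate minus false alarm rate; NaN if either rate is undefined."""
    observed_yes = t.hits + t.misses
    observed_no = t.false_alarms + t.correct_negatives
    if observed_yes == 0 or observed_no == 0:
        return float("nan")
    return t.hits / observed_yes - t.false_alarms / observed_no


def heidke_skill_score(t: Contingency) -> float:
    """Correct forecasts relative to the number expected by chance."""
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    n = a + b + c + d
    if n == 0:
        return float("nan")
    # Correct forecasts expected if forecasts and events were independent.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    if n == expected:
        return float("nan")
    return (a + d - expected) / (n - expected)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    cases = {
        "perfect, rare event": Contingency(1, 0, 0, 99),
        "perfect, common event": Contingency(50, 0, 0, 50),
        "always forecast 'no'": Contingency(0, 0, 10, 90),
    }
    for label, table in cases.items():
        print(label, peirce_skill_score(table), heidke_skill_score(table))
```

Under these assumptions both measures give a perfect forecast the score 1 whether the event occurs once in 100 cases or in half of them, and give a constant forecast the score 0, which is the behavior of conventional methods that the abstract describes.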



Bank of Thailand. (2020). Monetary Policy Report (2000Q3-2019Q4) [dataset]. Retrieved from https://www.bot.or.th/English/MonetaryPolicy/MonetPolicyComittee/MPR/Pages/


Bloom, N. (2014). Fluctuations in uncertainty. Journal of Economic Perspectives, 28, 153-176.

Briggs, W., & Ruppert, D. (2005). Assessing the skill of yes/no predictions. Biometrics, 61, 799-807.

Camporeale, E. (2019). The challenge of machine learning in space weather: Nowcasting and forecasting. Space Weather, 17, 1166-1207.

Clayton, H. H. (1934). Rating weather forecasts. Bulletin of the American Meteorological Society, 15, 279-283.

Dice, L. R. (1945). Measures of the amount of ecologic association between species. Ecology, 26, 297-302.

Doolittle, M. H. (1888). Association ratios. Bulletin of the Philosophical Society of Washington, 10, 83-87, 94-96.

Faisal, Z. Md., Monira, S. S., & Hirose, H. (2013). DF-ReaL2Boost: A hybrid decision forest with Real L2Boost decision stumps. In F. L. Gaol (Ed.), Recent progress in data engineering and internet technology, vol. 1 (pp. 47-53). New York: Springer.

Ferro, C. A. T., & Stephenson, D. B. (2011). Extremal dependence indices: Improved verification measures for deterministic forecasts of rare binary events. Weather and Forecasting, 26, 699-713.

Finley, J. P. (1884). Tornado predictions. American Meteorological Journal, 1, 85-88.

Foresti, L., Reyniers, M., Seed, A., & Delobbe, L. (2016). Development and verification of a real-time stochastic precipitation nowcasting system for urban hydrology in Belgium. Hydrology and Earth System Sciences, 20, 505-527.

Gandin, L. S., & Murphy, A. H. (1992). Equitable skill scores for categorical forecasts. Monthly Weather Review, 120, 361-370.

Gilbert, G. K. (1884). Finley’s tornado predictions. American Meteorological Journal, 1, 166-172.

Granger, C. W. J., & Pesaran, M. H. (2000). Economic and statistical measures of forecast accuracy. Journal of Forecasting, 19, 537-560.

Halide, H. (2009). Implementing predictive models for domestic decision-making against dengue haemorrhagic fever epidemics. Dengue Bulletin, 33, 1-10.

Heidke, P. (1926). Berechnung des Erfolges und der Güte der Windstärkevorhersagen im Sturmwarnungsdienst. Geografiska Annaler, 8, 301-349.

Hogan, R. J., & Mason, I. B. (2012). Deterministic forecasts of binary events. In I. T. Jolliffe & D. B. Stephenson (Eds.), Forecast verification: A practitioner’s guide in atmospheric science (2nd ed., pp. 31-59). West Sussex: John Wiley & Sons.

Holliday, J. R., Rundle, J. B., & Turcotte, D. L. (2009). Earthquake forecasting and verification. In R. A. Meyers (Ed.), Encyclopedia of complexity and systems science (pp. 2438-2449). Berlin: Springer.

Jolliffe, I. T. (2016). The Dice co-efficient: A neglected verification performance measure for deterministic forecasts of binary events. Meteorological Applications, 23, 89-90.

Jolliffe, I. T., & Stephenson, D. B. (2012). Epilogue: New directions in forecast verification. In I. T. Jolliffe & D. B. Stephenson (Eds.), Forecast verification: A practitioner’s guide in atmospheric science (2nd ed., pp. 221-230). West Sussex: John Wiley & Sons.

Kubo, Y., Den, M., & Ishii, M. (2017). Verification of operational solar flare forecast: Case of regional warning center Japan. Journal of Space Weather and Space Climate, 7(A20), 1-16. https://doi.org/10.1051/swsc/2017018

Lahiri, K., & Yang, L. (2013). Forecasting binary outcomes. In G. Elliott & A. Timmermann (Eds.), Handbook of economic forecasting, vol. 2, part B (pp. 1025-1106). New York: North Holland.

Manzato, A., & Jolliffe, I. T. (2017). Behaviour of verification measures for deterministic binary forecasts with respect to random changes and thresholding. Quarterly Journal of the Royal Meteorological Society, 143, 1903-1915.

McGovern, A., Gagne II, D. J., Williams, J. K., Brown, R. A., & Basara, J. B. (2014). Enhancing understanding and improving prediction of severe weather through spatiotemporal relational learning. Machine Learning, 95, 27-50.

Murphy, A. H. (1993). What is a good forecast?: An essay on the nature of goodness in weather forecasting. Weather and Forecasting, 8, 281-293.

Murphy, A. H. (1996). The Finley affair: A signal event in the history of forecast verification. Weather and Forecasting, 11, 3-20.

Peirce, C. S. (1884). The numerical measure of the success of predictions. Science, 4, 453-454.

So, R., Teakles, A., Baik, J., Vingarzan, R., & Jones, K. (2018). Development of visibility forecasting modeling framework for the Lower Fraser Valley of British Columbia using Canada’s Regional Air Quality Deterministic Prediction System. Journal of the Air & Waste Management Association, 68, 446-462.

Wilks, D. S. (2011). Statistical methods in the atmospheric sciences (3rd ed., pp. 305-316). San Diego: Academic Press.

Woodcock, F. (1976). The evaluation of yes/no forecasts for scientific and administrative purposes. Monthly Weather Review, 104, 1209-1214.

Yule, G. U. (1900). On the association of attributes in statistics. Philosophical Transactions of the Royal Society, 194A, 257-319.




How to Cite

Sitthiyot, T., & Holasut, K. (2022). On the Evaluation of Skill in Binary Forecast. Thailand and The World Economy, 40(3), 33–54. Retrieved from https://so05.tci-thaijo.org/index.php/TER/article/view/261138