Date Added: May 2011
This is a comment on Mitchell and Wallis (2011), which is in turn a critical reaction to Gneiting et al. (2007). The comment discusses the notion of forecast calibration, the advantages of using scoring rules, the sharpness principle, and a general approach to testing calibration. The aim is to show how a more general, explicitly stated framework for evaluating probabilistic forecasts can provide further insight. Both Gneiting et al. (2007) (hereafter GBR) and Mitchell and Wallis (2011) (hereafter MW) examine important aspects of calibration, but neither proposes a consistent framework, and neither gives sufficiently universal definitions of "calibration" or "ideal forecast".
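To make the idea of testing calibration concrete, the sketch below illustrates the probability integral transform (PIT) diagnostic that is central to GBR: if a sequence of predictive distributions is probabilistically calibrated, the PIT values u_t = F_t(y_t) should be i.i.d. uniform on [0, 1]. The simulated Gaussian forecasters, the sample size, and the use of a Kolmogorov-Smirnov test are illustrative assumptions of this sketch, not anything taken from the comment itself.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative setup (assumed, not from the papers): outcomes are drawn
# from N(mu_t, 1) with a time-varying mean. The "ideal" forecaster issues
# the true predictive distribution; a misspecified forecaster is
# overconfident, understating the predictive standard deviation.
n = 2000
mu = rng.normal(0.0, 2.0, size=n)   # time-varying conditional mean
y = rng.normal(mu, 1.0)             # observed outcomes

# Probability integral transform: u_t = F_t(y_t).
pit_ideal = stats.norm.cdf(y, loc=mu, scale=1.0)
pit_sharp = stats.norm.cdf(y, loc=mu, scale=0.5)  # overconfident forecast

# Under probabilistic calibration the PIT values are uniform on [0, 1];
# a Kolmogorov-Smirnov test against U(0, 1) is one simple (informal)
# diagnostic. An overconfident forecast yields a U-shaped PIT histogram
# and a large KS statistic.
for name, pit in [("ideal", pit_ideal), ("overconfident", pit_sharp)]:
    ks = stats.kstest(pit, "uniform")
    print(f"{name}: KS statistic = {ks.statistic:.3f}, p-value = {ks.pvalue:.3g}")
```

Note that the KS test ignores the serial dependence of PIT values, which is one reason the comment's call for a more general, explicitly stated testing framework matters.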