Revista Cubana de Meteorología 27, Apr-Jun 2021, ISSN: 2664-0880
Artículo Original
Fog/haze events forecast validation using the mesoscale model WRF
Validación del pronóstico de eventos de niebla/neblina a través del modelo mesoescalar WRF

Lic. Pedro Manuel González Jardines 1,*

Dra. Maibys Sierra Lorenzo 2

MSc. Carlos Manuel González Ramírez 3

Lic. Israel Borrajero Montejo 4


1 José Martí International Airport, Boyeros Ave, Havana, Cuba.

2 Center for Atmospheric Physics, Institute of Meteorology, Casablanca, Havana, Cuba.

3 Provincial Meteorological Center La Habana-Artemisa-Mayabeque, Institute of Meteorology, Casablanca, Havana, Cuba.

4 Center for Atmospheric Physics, Institute of Meteorology, Casablanca, Havana, Cuba.


*Corresponding author: Pedro Manuel González Jardines. E-mail:



The main objective of this research is to validate numerical tools used to forecast fog/haze events over the national territory. It is an extension of the SisPI project (Short-range Forecasting System, from its Spanish acronym), which runs operationally at the Institute of Meteorology (INSMET, from its Spanish acronym). Version 3.8.1 of the WRF-ARW mesoscale model is used, initialized at 00:00 and 06:00 UTC to evaluate the impact of initialization time on the forecasts. The study area is the region comprising the provinces of Havana, Artemisa and Mayabeque, which has ten conventional weather stations, divided into north coast, inner zone and south coast for a more detailed assessment. Mean absolute errors and linear correlations of the variables involved in the genesis and evolution of these phenomena were calculated, revealing a tendency of the model to overestimate the predicted values over the study area. Contingency tables for binary events were also used to evaluate the forecasts; they show that the use of a cumulative distribution function allows a high degree of detection of these phenomena.

Palabras clave: 
nieblas; neblinas; visibilidad; probabilidad; sondeos numéricos; WRF.

El objetivo esencial de la presente investigación es la validación de herramientas numéricas orientadas al pronóstico de eventos de niebla/neblina sobre el territorio nacional. Es una extensión del proyecto SisPI (Sistema de Pronóstico Inmediato) que trabaja operacionalmente en el Instituto de Meteorología (INSMET). Se emplea el modelo mesoescalar WRF-ARW en su versión 3.8.1 inicializado en los horarios de las 00:00 y 06:00 UTC para evaluar el impacto del horario de inicialización sobre el pronóstico. Se escoge como área de estudio la región que comprende las provincias de La Habana, Artemisa y Mayabeque, que cuenta con 10 estaciones meteorológicas convencionales, divididas en costa norte, interior y costa sur para obtener una evaluación más detallada. Se calcularon errores medios y correlaciones lineales de las variables implicadas en la génesis y evolución de estos fenómenos, lo cual permitió determinar una tendencia a la sobreestimación de los valores pronosticados en el área de estudio. Se emplearon además tablas de contingencia para eventos binarios con el propósito de evaluar el pronóstico, lo cual arrojó que el empleo de una función de distribución acumulativa permite un alto grado de detección de estos fenómenos.

Keywords: fog; haze; visibility; probability; numerical sounding; WRF.

Forecasting fog/haze events is a constant concern of the national weather service. For these phenomena it is essential to consider the dynamic and synoptic characteristics that determine the boundary layer, as well as its variation over time, in order to adequately predict their extent, intensity and duration.

Some statistical tools have been developed to predict fog/haze events, based primarily on climatology. A conditional climatology has greater value than a simple climatology, but its main limitation is that it does not adequately consider the dynamic processes and gives more weight to the available data.

In the 1990s, some numerical models were developed for fog forecasting. Golding (1993) used a general-purpose mesoscale numerical weather prediction model to simulate the development of fog in Perth, Australia. His results show that local terrain irregularities and the development of local nocturnal winds can often determine where and when fog/haze appears.

Bergot & Guedalia (1994) described an improved forecast of radiation fog using a one-dimensional nocturnal boundary layer scheme fed by an operational three-dimensional limited-area mesoscale model. Their paper shows the correlation between the observed data and the predictions made by the proposed model. The influence of different physical processes, including dew deposition, is also determined.

Currently, the rapid advance of numerical models, particularly the mesoscale model WRF (Weather Research and Forecasting) with the ARW (Advanced Research WRF) dynamical core, makes it possible to represent microphysical, dynamic and planetary boundary layer characteristics that reproduce, in real time and with sufficient accuracy, the environments in which fog/haze processes develop. In this regard, the doctoral thesis of Ryerson (2012) proposes an ensemble method for fog/haze forecasting at ranges of up to 20 hours.

Most national studies refer to nocturnal radiative cooling in the formation of these phenomena (Alfonso & Florido, 1980). However, given the horizontal transport of warm and humid air by the southeasterly synoptic flow that prevails when these phenomena occur, it cannot be ruled out that two kinds of fog take place: advection and radiation. Other studies describe the synoptic conditions associated with these events, highlighting the days preceding the approach of a cold front and the presence of weak pressure gradients under a strong anticyclonic influence (Guzmán, 2013).

This investigation evaluates a group of numerical tools for forecasting fog/haze events with the WRF-ARW mesoscale model. To do this, the forecasts are evaluated for sea level pressure, wind speed and direction, relative humidity, air temperature and dew point temperature, variables that are all involved in the genesis and development of fog/haze events. Subsequently, using numerical soundings, visibility fields and cumulative distribution functions based on Weibull parameters, forecast maps are created, and days on which the phenomenon occurred are compared with days on which it did not occur under similar synoptic conditions.

Fig. 1.  Study area

Study area

In the region comprising the provinces of Artemisa, Havana and Mayabeque, fog and haze occur with low frequency; they are mostly seasonal and local in character, and are largely related to physical and geographic characteristics. They are associated with the second-quadrant flow imposed by the periphery of the subtropical anticyclone, a situation that precedes the arrival of a frontal system, or with the influence of weak pressure gradients (Guzmán, 2013).

Additionally, this region is of extraordinary social and economic importance because of its high agricultural and industrial development. One example is the Mariel Exclusive Economic Zone, where port activities and the transport of goods are of growing importance. Livestock and agriculture are also increasingly active, with large areas dedicated to these purposes, mainly in the provinces of Artemisa and Mayabeque, and with high-demand crops such as garlic, onions and potatoes.

Southeast of the capital, José Martí International Airport is one of the largest airport facilities in the country. Given its location, it is affected by these phenomena mainly during the dry season (November to April), which sometimes causes delays in normal operations. Another airport, where fog/haze forecasts are also necessary, lies in the Baracoa zone (Artemisa province). The region is also densely populated: the capital alone has more than two million inhabitants.

Study cases

Six study cases from the dry season of 2017 were selected, based on the present-weather codes reported at the stations between 00:00 and 12:00 UTC. The six cases were grouped into three pairs of consecutive days on which, under similar synoptic conditions, fog/haze outbreaks differed significantly (in both spatial-temporal extent and intensity) from one day to the next. The purpose is to evaluate the sensitivity of the model to these changes.

The first pair of days comprises January 26th and 27th. The 26th was characterized by a weak influence of the subtropical ridge, which imposed a southeasterly wind regime over the study area, while a cold front, weakened in its southern portion, was passing over the central-eastern Gulf of Mexico. On the next day these conditions persisted, with a slightly strengthened anticyclonic center and the weakened cold front moving over the Florida peninsula; the front advanced toward the north of the study area during the remainder of the 27th.

The second pair of days includes February 8th and 9th, which were characterized by a marked influence of high pressure, with an extended oceanic ridge over the eastern portion of the Gulf of Mexico. A greatly weakened cold front moved across the southeastern United States and did not affect the western part of the country. The pressure gradients weakened considerably toward the early hours of the 9th, resulting in a small secondary center northwest of the island during the morning, which quickly disappeared with the advance of a migratory anticyclone accompanying the new air mass.

Finally, April 5th and 6th were included. The 5th was characterized by the influence of an oceanic anticyclone that extended its ridge into the Gulf of Mexico, imposing a second-quadrant flow, while a well-structured cold front advanced toward the western portion of the Gulf of Mexico. In the early morning of the 6th, the front reached the eastern Gulf, and it began to affect the study area during the afternoon.

WRF-ARW model experiments design

The experiments were run with version 3.8.1 of WRF-ARW in the operational configuration used by SisPI (Sierra et al., 2014), which has two outer domains with two-way nesting, at horizontal resolutions of 27 and 9 kilometers (km) respectively, and a single inner domain of 3 km with one-way nesting, enabled through the ndown tool.

The output interval of the first two domains is three hours, while the 3 km domain provides forecasts every hour. The initial and boundary conditions were taken from GFS (Global Forecast System) forecast data with a spatial resolution of 0.5 degrees and a temporal resolution of three hours.

Fig. 2.  Operational SisPI domains. (a) 27 km, (b) 9 km, (c) 3 km.

In addition to purely meteorological considerations, model spin-up must be taken into account. The spin-up effect can matter for forecasts of some phenomena and determine whether or not they are detected.

Following this reasoning, an analysis was made of how close to the initialization time the phenomenon should be. Given that the most frequent hour of fog/haze occurrence is 12:00 UTC (Álvarez et al., 2011), it was decided to evaluate the WRF forecasts initialized at 00:00 and 06:00 UTC.

The set of parameterizations used by SisPI for the 3 km domain includes the Morrison double-moment microphysics scheme, which is a second-order representation of the processes involving ice, snow, rain and graupel. No cumulus parameterization is applied, since at high resolutions convective precipitation can be resolved by the microphysics scheme. The Mellor-Yamada Nakanishi and Niino (MYNN) level 2.5 TKE boundary layer scheme is also used, which predicts the turbulent kinetic energy terms at the subgrid scale.

The ARWpost package, version 3.1, which is publicly available online, was used for post-processing, along with GrADS (Grid Analysis and Display System) scripts. A shell script automated the entire process. The simulations also computed a number of diagnostic variables that the Air Force Weather Agency (AFWA) uses in its operational model MEPS (Mesoscale Ensemble Prediction Suite) (Creighton et al., 2014).

Most of these diagnostics are calculated only at output time steps, since they are just snapshots of the model state. However, one of the benefits of running diagnostics in-line is the ability to collect information on rapidly evolving fields between output time steps, although this involves an additional computational cost.

The visibility values associated with reductions due to hydrometeors, dust and fog or haze are used as the Weibull β value, and a prognostic Weibull alpha value is used if the lowest visibility is associated with haze or fog. The alpha term is dimensionless and describes the shape of the Weibull curve: it behaves more like a Gaussian curve when absolute humidity is high and more like an exponential when absolute humidity is low. In practice, this ensures that the highest probability of reduced visibility lies in the mid-range of 4.83 to 8.05 km (3-5 miles) (Creighton et al., 2014).

The alpha term is calculated as follows:

$$\alpha_{fog} = 0.1 + \frac{Pwat}{25} + \frac{Wind}{3} + \frac{rh}{10} + \frac{1}{mix} \tag{1}$$

where Pwat is the precipitable water, Wind is the wind at 100 meters height, rh is the relative humidity at 2 meters and mix is the mixing ratio at 2 meters. The value of this parameter is 3.6, and its decrease implies that the bias shifts toward a fog/haze event.

The empirical algorithm used by Creighton et al. (2014) is based on relative humidity, and the visibility value, in meters, is obtained from equation (2):

$$Vis_{meters} = \min\left(Vis_{hydro},\, Vis_{dust},\, Vis_{fog}\right) \tag{2}$$

where the visibility due to dust obscuration is calculated only for WRF-CHEM simulations, and the visibilities due to hydrometeors and to fog/haze are obtained using:

$$Vis_{hydro} = \frac{3.912}{1.1\,(Rain + Graupel)^{0.75} + 10.36\,(Snow)^{0.78}} \tag{2.1}$$

where rain, graupel and snow are mass concentrations in g/m³, and

$$Vis_{fog} = 1500\,(105 - rh)\left(\frac{5}{mix}\right) \tag{2.2}$$

where rh is 2 meter relative humidity and mix is 2 meters mixing ratio.
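Equations (2), (2.1) and (2.2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the AFWA implementation: the function names are hypothetical, the inputs are assumed to be in the units of the source equations, and equation (2) is read here as taking the lowest of the component visibilities. The text does not state the unit of equation (2.1), so the sketch leaves unit conversion to the caller.

```python
import math


def vis_hydro(rain, graupel, snow):
    """Equation (2.1): visibility limited by hydrometeors.

    rain, graupel and snow are mass concentrations in g/m^3.
    """
    if min(rain, graupel, snow) < 0:
        raise ValueError("mass concentrations must be non-negative")
    extinction = 1.1 * (rain + graupel) ** 0.75 + 10.36 * snow ** 0.78
    # With no hydrometeors this term places no limit on visibility.
    return math.inf if extinction == 0.0 else 3.912 / extinction


def vis_fog(rh, mix):
    """Equation (2.2): visibility limited by fog/haze.

    rh is the 2 m relative humidity (%), mix the 2 m mixing ratio.
    """
    if mix <= 0:
        raise ValueError("mixing ratio must be positive")
    return 1500.0 * (105.0 - rh) * (5.0 / mix)


def combined_visibility(*components):
    """Equation (2): the lowest of the component visibilities.

    Assumption: the caller has already expressed all components
    in the same unit; no conversion is done here.
    """
    return min(components)
```

For instance, `vis_fog(100.0, 5.0)` evaluates to 7500.0, and the value grows as relative humidity falls, which is the behavior equation (2.2) is meant to capture.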

By combining the elements just described, it is possible to determine the probability of occurrence of a fog/haze event, as defined by the World Meteorological Organization (WMO), using a cumulative distribution function (CDF):

$$CDF(X) = \left(1 - e^{-\left(\frac{X - X_0}{\beta}\right)^{\alpha}}\right) \cdot 100 \tag{3}$$

This function gives the probability of obtaining a value less than or equal to X. Beta, which in this case sets the scale of the distribution curve, is taken from the visibility value calculated in equation (2), and alpha is obtained explicitly for this phenomenon, as shown in equation (1) (Creighton et al., 2014).
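A minimal Python sketch of equation (3) follows, read as a Weibull cumulative distribution expressed in percent. The function name is hypothetical, and treating X0 as an offset that defaults to zero is an assumption, since the text does not define it.

```python
import math


def weibull_cdf_percent(x, beta, alpha, x0=0.0):
    """Equation (3): probability (%) of a value less than or equal to x.

    beta  -- model visibility from equation (2)
    alpha -- Weibull alpha term from equation (1)
    x0    -- offset; zero by default (an assumption, not from the text)
    """
    if beta <= 0 or alpha <= 0:
        raise ValueError("beta and alpha must be positive")
    if x <= x0:
        return 0.0
    return (1.0 - math.exp(-(((x - x0) / beta) ** alpha))) * 100.0
```

Under these assumptions, evaluating the function at a visibility limit such as 1000 m, with beta set to the forecast visibility, gives the probability that visibility falls at or below that limit. When x equals beta the result is about 63.2 % regardless of alpha, a general property of the Weibull distribution.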

To obtain a more detailed analysis of the results, the study area was divided into three sub-regions, following the subdivision criteria used at INSMET for the assessment of forecasts.

These sub-regions were defined by grouping meteorological stations with similar statistical behavior. They are the north coast, comprising the stations Bahía Honda (318), Bauta (376) and Casablanca (325); the south coast, including Güira de Melena (320), Batabanó (322) and Melena del Sur (375); and the inner sub-region, which encompasses Santiago de Las Vegas (373), Tapaste (374), Bainoa (340) and Güines (323).

Fig. 3.  Sub-regions inside the study area

For the statistical evaluation, the mean absolute error, taken as the observation (O) minus the prediction (P), (O-P), was calculated for all study cases, together with the Pearson correlation coefficient described in equation (4).

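$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left(n\sum x_i^2 - \left(\sum x_i\right)^2\right)\left(n\sum y_i^2 - \left(\sum y_i\right)^2\right)}} \tag{4}$$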

The calculations were made with a program developed in C/C++, with some graphics produced by code written in Octave; both were included in the automated post-processing program.
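The two statistics can be sketched as follows. The authors' own programs were written in C/C++ and Octave; this Python version is only an illustration with hypothetical function names, computing the O-P error exactly as defined in the text and the correlation as in equation (4).

```python
import math


def mean_error(obs, pred):
    """Mean of observation minus prediction (O - P)."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)


def pearson_r(x, y):
    """Pearson correlation coefficient, equation (4)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    denom = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return (n * sum_xy - sum_x * sum_y) / denom
```

With this sign convention a negative mean error means the forecast exceeds the observation, which is how the overestimation discussed in the results shows up in the tables.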

Finally, the numerical forecasts are verified by treating fog/haze occurrences as binary events. The verification uses a contingency table for the alpha, visibility and CDF variables, which can explicitly predict the genesis, evolution and dissipation of the event.

Table I.  Contingency table for binary events.
Event forecast    Event observed: Yes    Event observed: No
Yes               Hit                    False alarm
No                Miss                   Correct rejection

Based on the results obtained for the two initialization times in all study cases, the hit rate (H) was obtained, which indicates the proportion of occurrences that were correctly identified by the model. The probability of false detection (F), which indicates the proportion of non-occurrences that were incorrectly forecast as events, and the critical success index (CSI) are also included.

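$$H = \frac{a}{a + c} \qquad F = \frac{b}{b + d} \qquad CSI = \frac{a}{a + b + c} \tag{5}$$

where a is the number of hits, b the number of false alarms, c the number of misses and d the number of correct rejections.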
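As a small illustration of equation (5), the three scores can be computed from the four cells of the contingency table. The function name and the convention of returning NaN when a denominator is zero are assumptions of this sketch, not part of the original method.

```python
def verification_scores(a, b, c, d):
    """Scores of equation (5) from contingency-table counts.

    a -- hits, b -- false alarms, c -- misses, d -- correct rejections
    """
    def ratio(num, den):
        # A score is undefined when its denominator is zero.
        return num / den if den else float("nan")

    return {
        "H": ratio(a, a + c),        # hit rate
        "F": ratio(b, b + d),        # probability of false detection
        "CSI": ratio(a, a + b + c),  # critical success index
    }
```

For example, 8 hits, 2 false alarms, 2 misses and 8 correct rejections give H = 0.8, F = 0.2 and CSI = 8/12.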

Relationship between real data and simulations with WRF

Hernández et al. (2017) defined thresholds for a number of variables involved in the genesis and development of fog/haze processes in the study area. These thresholds were obtained for each month from a sample that included 6372 haze cases and 904 fog cases; they are shown in Table II.

Table II.  Thresholds of variables involved in the occurrence of fog/haze events (Hernández et al., 2017).

Month      Wind speed (km/h)   Temperature (ºC)   Dew Point (ºC)   Relative humidity (%)
January    0.9 - 2.0           14.8 - 20.0        14.3 - 20.7      96.2 - 99.0
February   0.2 - 8.6           15.6 - 21.0        15.1 - 20.0      96.2 - 98.0
April      0.0 - 5.0           18.6 - 22.7        17.7 - 21.7      94.1 - 98.5

Month      Wind speed (km/h)   Temperature (ºC)   Dew Point (ºC)   Relative humidity (%)
January    0.0 - 4.6           15.0 - 20.0        14.0 - 20.0      92.0 - 96.6
February   1.0 - 4.7           15.0 - 20.4        14.0 - 19.0      91.0 - 95.8
April      0.0 - 6.0           19.0 - 22.8        18.0 - 21.6      89.8 - 94.9

The analysis of the mean absolute errors shows a tendency of the model to overestimate the values of all variables. The largest errors are in wind direction.

Table III.  Mean absolute error for all study cases, (a) initialized at 00:00 UTC, (b) initialized at 06:00 UTC.

(a)
Sub-regions   T (ºC)   Td (ºC)   dd (º)    ff (km/h)   P (hPa)   RH (%)   Vis (km)
North coast   -0.82    0.18      -46.49    -3.91       -0.51     -2.07    -16.03
Inner zone    -2.10    -1.03     -70.03    -6.10       -0.19     -4.38    -16.62
South coast   -1.09    -0.44     -30.90    -4.42       -0.05     -3.70    -12.09

(b)
Sub-regions   T (ºC)   Td (ºC)   dd (º)    ff (km/h)   P (hPa)   RH (%)   Vis (km)
North coast   -1.76    -0.07     -22.98    -4.48       -0.57     0.61     -19.93
Inner zone    -2.55    -0.64     -48.62    -7.25       -0.19     0.64     -22.08
South coast   -1.31    -0.29     -19.14    -5.24       -0.14     -1.52    -15.18

A determining factor here is that observers report zero for direction when the wind is calm, which significantly increases the errors. It should also be mentioned that the wind direction calculated by the model has much greater variability than that measured at conventional stations, although the estimate is suitable from a qualitative point of view.
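The effect of calm reports on the direction error can be illustrated with a short sketch. It is not the procedure used to build Table III: the function names are hypothetical, and both the wrapping of differences to the shortest arc and the exclusion of calm observations are assumptions introduced here only to show how a direction coded as zero distorts the statistic.

```python
def direction_error(obs_deg, pred_deg):
    """Signed O - P difference wrapped to the interval [-180, 180) degrees."""
    return (obs_deg - pred_deg + 180.0) % 360.0 - 180.0


def mean_direction_error(obs_deg, obs_speed, pred_deg):
    """Mean O - P direction error, skipping calm observations (speed 0)."""
    errors = [
        direction_error(o, p)
        for o, s, p in zip(obs_deg, obs_speed, pred_deg)
        if s > 0
    ]
    # NaN signals that every observation in the sample was calm.
    return sum(errors) / len(errors) if errors else float("nan")
```

A calm observation coded as 0º against a forecast of 140º contributes an error of -140º even though no observed direction exists; leaving it out keeps only the cases in which the comparison is meaningful.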

An important issue for fog/haze prediction is that the model overestimates wind speed in the range between 1 and 10 km/h. This may be significant in some cases because, during nocturnal radiative cooling, low wind speeds reduce turbulent mixing, which favors an increase in near-surface relative humidity and allows saturation. The increased turbulent mixing due to stronger winds considerably limits the appearance of the phenomena, apparently because it dries the air, as moisture loss increases through dew formation or through greater intrusion of dry air from higher levels.

The WRF forecasts for January 26th illustrate this situation. In this case the model overestimates wind speed in the 8 to 10 km/h range, which coincided with the worst fog/haze forecast over the study area. This bias is attributed to the model also overestimating the strength of the subtropical ridge, and therefore the pressure gradients over the study area, as corroborated by reanalysis data (Figure 4).

Fig. 4.  Sea level pressure for January 26th, 12:00 UTC; (a) reanalysis fields; (b) forecast initialized at 00:00 UTC; (c) forecast initialized at 06:00 UTC.

Visibility is another variable for which the errors grow significantly. One factor adds to the subjectivity inherent in these observations and acts indirectly on the error value: all conventional stations estimate visibility from reference points at known distances from the station. As a result, the maximum visibility reported by any station in the study area does not exceed 15 km. This imposes an upper limit on observed visibility that the model does not have, since it does not depend on any reference point, and it causes the absolute daytime errors to grow significantly.

In general, temperature, dew point, pressure and relative humidity showed small errors that can be considered insignificant for forecast skill, although for initializations at 06:00 UTC these errors were slightly larger than for those at 00:00 UTC. Another interesting aspect is that the largest errors are confined to the inner zone of the study area. This may be caused by the greater variability of the meteorological variables there, which is limited in coastal regions by the moderating effect of the sea.

Table IV shows the mean absolute error for the variables analyzed at the times when, according to the present-weather code reports, the station detected fog/haze. The values reveal some interesting aspects.

Table IV.  Mean absolute error at the times the phenomena appeared, for all study cases, (a) initialization at 00:00 UTC, (b) initialization at 06:00 UTC.

(a)
Sub-regions   T (ºC)   Td (ºC)   dd (º)     ff (km/h)   P (hPa)   RH (%)   Vis (km)
North coast   -0.26    0.70      -55.93     -3.41       -0.89     -1.67    -6.55
Inner zone    -0.43    -0.02     -90.07     -4.55       -0.74     -0.87    -6.61
South coast   -0.27    0.11      -22.89     -3.46       -0.48     -1.38    -4.54

(b)
Sub-regions   T (ºC)   Td (ºC)   dd (º)     ff (km/h)   P (hPa)   RH (%)   Vis (km)
North coast   -1.53    -0.61     -69.53     -4.73       -0.91     3.05     -8.44
Inner zone    -1.86    -1.07     -131.52    -5.04       -0.69     2.61     -9.84
South coast   -0.89    -0.40     -49.67     -5.05       -0.44     1.93     -6.38

First, the visibility forecast error decreases by more than 10 km, which confirms the explanation above. For runs initialized at 00:00 UTC, the model predicts more reliable visibility values than when initialized at 06:00 UTC. The example in Figure 5 shows the significant differences between both initializations for the forecast valid at 12:00 UTC on February 9th; this pattern is reproduced in all cases studied.

The main cause of this difference is that WRF was unable to detect an environment favorable for fog formation during the first 6 hours of the forecast. On the other hand, a 24-hour forecast shows no significant differences regardless of the initialization time.