The Weather Research and Forecasting mesoscale model (WRF) was used to simulate hourly 10 m wind speed and direction over the city of Taranto, Apulia region (south-eastern Italy). This area is characterized by a large industrial complex including the largest European steel plant and is subject to a Regional Air Quality Recovery Plan. This plan constrains industries in the area to reduce by 10 % the mean daily emissions by diffuse and point sources during specific meteorological conditions named wind days. According to the Recovery Plan, the Regional Environmental Agency ARPA-PUGLIA is responsible for forecasting these specific meteorological conditions with 72 h in advance and possibly issue the early warning.

In particular, an accurate wind simulation is required. Unfortunately, numerical weather prediction models suffer from errors, especially for what concerns near-surface fields. These errors depend primarily on uncertainties in the initial and boundary conditions provided by global models and secondly on the model formulation, in particular the physical parametrizations used to represent processes such as turbulence, radiation exchange, cumulus and microphysics. In our work, we tried to compensate for the latter limitation by using different Planetary Boundary Layer (PBL) parameterization schemes. Five combinations of PBL and Surface Layer (SL) schemes were considered.

Simulations are implemented in a real-time configuration since our intention is to analyze the same configuration implemented by ARPA-PUGLIA for operational runs; the validation is focused over a time range extending from 49 to 72 h with hourly time resolution. The assessment of the performance was computed by comparing the WRF model output with ground data measured at a weather monitoring station in Taranto, near the steel plant. After the analysis of the simulations performed with different PBL schemes, both simple (e.g. average) and more complex post-processing methods (e.g. weighted average, linear and nonlinear regression, and artificial neural network) are adopted to improve the performances with respect to the output of each single setup. The neural network approach comes out as the most promising method.

Due to the numerous and high-impact industrial activities, Taranto (south-eastern Italy) is included in the so-called areas at “high risk of environmental crisis”. In fact, it hosts a heavy industrial district, unfavourably positioned very close to residential areas. Among these industrial plants, particular attention is addressed to the steelworks: the ILVA plant is extended over a surface of 15 million square meters, including 200 km of railway tracks, 50 km of roads, 190 km of conveyor belts, 5 blast furnaces and 5 converters. Storage and handling of primary materials in the stockyards and transportation of materials by heavy duty trucks are the major sources of particulate. The most critical neighbourhood, called Tamburi, is located at less than 1 km from the stockyard, downwind of the plant with respect to the prevailing north-westerly wind.

In the last few years, particles under 10

For the above reasons, to predict well in advance the occurrence of “wind
days”, it is necessary to accurately simulate wind speed and direction. In
this effort, the Weather Research and Forecasting model (WRF) was
used to simulate hourly 10 m wind speed and direction over the city of
Taranto (Fedele et al., 2014, 2015). WRF is a consolidate mesoscale numerical weather
prediction system, which represents the state-of-the-art in meteorological
limited area modelling. This model has been successfully used in the past to
simulate the evolution of boundary layer features over Apulia region (e.g.,
De Tomasi et al., 2011; Comin et al., 2015). Unfortunately, meteorological
model outputs, particularly near-surface fields (e.g., precipitation, wind,

In the present work, we considered a total of five combinations of PBL and Surface Layer (SL) schemes, i.e. the Yonsei University with topographic correction, Mellor-Yamada-Janjic and Mellor-Yamada-Nakanishi-and-Niino level 2.5 schemes for PBL parametrization; for the latter, three compatible SL schemes are implemented to analyse the specific effect of SL schemes. Simulations are considered in real-time implementation: the validation is performed for a forecast range from 49 to 72 h with hourly time resolution. We compared the results from each single model setup with different linear and nonlinear post-processing approaches as described in Sect. 2. Section 3 deals with the results of wind speed and direction analysis. In Sect. 4, the discussion and interpretation of the results are shown.

Hourly 10 m wind speed and direction in Taranto are obtained by running the
WRF model using five different combinations of PBL (and associated SL)
schemes as reported in Table 1. In subsequent sections, we will refer to the
“

Combination of Boundary Layer and Surface Layer Schemas considered in the present work.

In each run, all other parameterization schemes were kept unchanged. We considered two nesting domains with 16 and 4 km grid spacing respectively and 40 eta levels. The model configuration can be summarized as follows: the “Thompson scheme” microphysics, for longwave and shortwave radiation the “rrtm scheme” and “Dudhia scheme” respectively, the effect of resolved and unresolved orography on surface wind as in Jiménez and Dudhia (2012), the Kain–Fritsch cumulus parameterization and the Noah land-surface model.

The analysis was performed over a period of eight months, from 1 August 2015 to 31 March 2016. Simulations are implemented in a real-time configuration since our intention is to analyze the results for the operational runs implemented at ARPA-PUGLIA; for validation purposes, the range from 49 to 72 h with hourly resolution is considered. The Global Forecast System (GFS) analyses/forecasts were used as initial/boundary conditions.

To compare the WRF outputs with the weather monitoring station measurements, we used the nearest neighbor approach to the model fields interpolated by Grid Analysis and Display System (GrADS).

The wind speed prediction errors ES (bias) were obtained as difference between
the WRF model outputs WS

The Kalman Filter (KF) method is a recursive algorithm commonly used to
estimate an unknown variable at time “

In our analysis, we used the Kalman Filter to evaluate, for each model and
for each daily hour, the error at time “

The best architecture in terms of minimum Root Mean Square Error (RMSE) is
found to be a feed-forward multilayer perceptron, one hidden layer with
three neurons, and seven input neurons, five for the corrected predictions
and two for the hourly cyclical component,

Comparison of hourly mean values of wind speed using the five selected model configurations and their post-processed value with respect to the observed data.

This procedure is similar to the procedure described in the Sect. 2.1.1. The KF algorithm is replaced with an additional ANN. In order to correct each model configuration, a dedicated ANN is necessary. The best architecture in terms of RMSE is found to be a feed-forward multilayer perceptron, one hidden layer with two neurons, and three input neurons, one for the predicted wind speed provided by each single model and two for the hourly cyclical component. To train this ANN, the best size period in terms of RMSE is found to be 10 days.

Also in this procedure, to combine the five corrected values, we used an
additional ANN in the Stacked Generalization method, similar to that
described in Sect. 2.1.1. For the additional ANN, the best size of the
training period in terms of RMSE is found to be 20 days. Summarizing, we
need a period of 30 days to estimate WS

The ANN is a nonlinear regression approach (Gardner and Dorling, 1998). To
evaluate the best architecture, i.e. to estimate the best size of the
training period and the best input features, various tests are implemented.
The best configuration in terms of RMSE is found to be a feed-forward
multilayer perceptron architecture, one hidden layer with three neurons, and
seven input neurons, five for the corrected predictions and two for the
hourly cyclical component,

Taylor Diagram to compare the five models and their combinations with respect to the observed data in terms of Centered Root Mean Square Error, Correlation Coefficient, and Standard Deviation.

Conditional Quantile Plot to compare the five model configurations and their post-processed combinations with respect to the observed data in terms of distribution.

Direction Accuracy plot versus Angle_ref (see text for definition) for each
model and correction. Only values smaller than 45

Throughout this study, the wind direction prediction error ED was
obtained by comparing the WRF outputs WD

First, a partial average

Bar Diagram of data subdivided by quadrant. For each quadrant, the
frequencies are reported below and above the threshold of 1 m s

The analysis shows that all considered model configurations present a positive mean bias. In fact, Fig. 1 shows that all model runs (colour continuous lines) simulate a wind speed that is on average larger than the observed values (yellow continuous lines). For this reason, the simple average (blue small circles) and the weighted average (green small circles) cannot improve the performances. Therefore, as described in Sect. 2, we implemented four post-processing methods based on the training with past data. The Artificial Neural Network (blue stars) is found to be the best procedure to combine the five wind speed predictions, reproducing the mean hourly wind speed very well. Similarly, the Taylor diagram in Fig. 2 shows that ANN is better also in terms of centered RMSE and correlation. Conditional Quintile Plots are reported in Fig. 3. The conditional quantile plot splits the predicted values into evenly spaced bins and calculates the median, the 25/75th percentiles, and the 10/90th percentiles; also, for each bin the corresponding values of the observations are identified (Wilks, 2005). Additionally, in each subplot, the observed WS distribution (histogram with blue-contoured columns), and the predicted WS distribution (grey histogram) are reported. A good performance is obtained when the median (red line) coincides with the bisector (blue straight line) and when the spread in the percentile is as small as possible.

Scatter Diagram between wind speeds and wind direction errors.
(WS values smaller than 1 m s

Some remarks can be deduced from Fig. 3:

On average, the five original WRF model configurations show values higher than observed, i.e. the medians are under the bisectors;

Although model 5 shows a WS distribution very similar to the observed distribution, it shows a great error in terms of median due to data mismatching;

The ANN shows a better mean WS correction: effectively, its median overlaps
the bisector well at least for WS smaller than 5 m s

The analysis shows that all models have wide WD error distributions. In
contrast with the WS analysis, the WD analysis does not include the
parameter “daily hour” because the WD error is found to be independent of
it. The application of the procedure described in Sect. 2.2 shows an
improvement in terms of DACC for reference values smaller than 45

The quadrant analysis shows that the improvement is greater for the
Nord-West quadrant (not shown) and we focus on this subset of directions in
the following. This improvement is due to the higher frequency of occurrence
of wind direction from this quadrant (better trained network), and for its
higher intensity (Fig. 5). In fact, higher wind speeds imply smaller WD
error: Fig. 6 shows that WS values smaller than 1 m s

With the purpose of studying and improving 10 m wind speed and direction simulations over Taranto, a city in south-eastern Italy including the largest European steel plant, the WRF model was employed with a total of five different combinations of PBL and SL schemes. Simulations are considered in real-time implementation, and are evaluated over a forecast range extending from 49 to 72 h with hourly time resolution over a period of eight months. The analysis shows that all considered model configurations present a positive mean bias for the wind speed. Instead, for the wind direction, the analysis shows that all considered model configurations present a wide error distribution centered at zero.

To improve the wind speed predictions, six different methods are considered. The analysis shows that the simple average and weighted average are not able to improve the performances. Among the other four methods based on training with past data, the ANN is found able to better eliminate the mean bias, to better reduce the centered RMSE, and shows a better overlap between the median values and the observations.

To improve the wind direction predictions, since neural networks were not
effective in improving the results, a dedicated procedure was implemented,
based on the partial average and on the linear regression. The proposed
method is found to improve the wind direction prediction error of each model
in terms of direction accuracy. A better improvement is found when the
analysis is carried out separately for each quadrant, and by excluding the
data with WS less than 1 m s

The authors declare that they have no conflict of interest. Edited by: D. Reinert Reviewed by: two anonymous referees