Impact of the assimilation of lightning data on the precipitation forecast at different forecast ranges

This study investigates the impact of the assimilation of total lightning data on the precipitation forecast of a numerical weather prediction (NWP) model. The impact of the lightning data assimilation, which uses water vapour substitution, is investigated at different forecast time ranges, namely 3, 6, 12, and 24 h, to determine how long and to what extent the assimilation affects the precipitation forecast of long-lasting rainfall events (> 24 h). The methodology developed in a previous study is slightly modified here and is applied to twenty case studies that occurred over Italy, simulated with a mesoscale model run at convection-permitting horizontal resolution (4 km). The performance is quantified by dichotomous statistical scores computed using a dense raingauge network over Italy. Results show the important impact of the lightning assimilation on the precipitation forecast, especially for the 3 and 6 h forecasts. The probability of detection (POD), for example, increases by 10 % for the 3 h forecast using the assimilation of lightning data compared to the simulation without lightning assimilation, for all precipitation thresholds considered. The Equitable Threat Score (ETS) is also improved by the lightning assimilation, especially for thresholds below 40 mm day−1. Results show that the forecast time range is very important because the performance decreases steadily and substantially with the forecast time. The POD, for example, is improved by 1–2 % for the 24 h forecast using lightning data assimilation, compared to 10 % for the 3 h forecast. The impact of the false alarms on the model performance is also evidenced by this study.


Introduction
Continuous advances in computing power have made the regional atmospheric forecast available at convection-permitting scales (Δx ≤ 4 km) in several countries worldwide. The adoption of these high horizontal resolutions opens the possibility to assimilate convective-scale observations (Weisman et al., 1997; Weygandt et al., 2008).
Among the data at the convection-permitting scale, lightning offers several advantages, such as the ability to locate precisely the convection and heavy precipitation, availability with few temporal gaps, compactness (a low bandwidth is required to transfer the data), and long-range detection over the oceans and beyond the radar range (Mansell et al., 2007).
In the studies of Alexander et al. (1999) and Chang et al. (2001), the lightning data were first converted into precipitation rates, which were assimilated into the Numerical Weather Prediction (NWP) model. The study of Papadopulos et al. (2005) used lightning to locate convection, and the model water vapour profile was nudged toward vertical profiles recorded during convective events. Mansell et al. (2007) and Giannaros et al. (2016) assimilated lightning by forcing the Cumulus Parameterization Scheme (CPS) at the positions where lightning was recorded. Interestingly, this methodology can also be used to suppress convection where the model simulates convection (as shown by the fact that the CPS is active at that position) while no flashes are recorded. The performance is improved when lightning controls both activation and suppression of the CPS. These studies at non-convection-permitting scales demonstrated the positive impact of the lightning data assimilation not only on large-scale fields, such as the sea-level pressure, but also on the precipitation.
The above schemes are limited to non-convection-permitting scales because: (a) the horizontal resolution is coarser than 10 km; (b) they use the CPS to force or suppress convection, and the CPS is not suitable for simulating convection at convection-permitting scales. Of course, these methods can be used at higher spatial resolution through grid nesting, using the assimilation of lightning data for the coarser grid and transferring the information to the convection-permitting scale through the nesting.
The first attempt to directly assimilate lightning at the convection-permitting scale was performed by Fierro et al. (2012). They nudged water vapour in the mixed-phase region (−20 °C ≤ T ≤ 0 °C), where the charge separation occurs, toward a profile that depends on the flash rate and on the graupel mixing ratio. This profile gets closer to saturation as the flash rate increases, while the amount of nudging decreases as the graupel mixing ratio increases. Qie et al. (2014) extended the methodology of Fierro et al. (2012) by nudging the ice crystal, snow and graupel mixing ratios toward functions derived by fitting observations for three thunderstorms that occurred in Northern China. For these thunderstorms, radar observations, lightning data and favourable numerical simulations were available. Federico et al. (2017) adapted the methodology of Fierro et al. (2012) to the Regional Atmospheric Modeling System (RAMS). In their study, however, the simulated water vapour profile is substituted by the function proposed by Fierro et al. (2012) at each grid point where flashes are observed within a square centred on the grid point and with side equal to the grid spacing. Dixon et al. (2016) nudged water vapour toward the saturation water vapour mixing ratio at all vertical levels in the troposphere within 10 km of the observed flashes.
The above studies examined the positive impact of the lightning data assimilation at different forecast ranges. Giannaros et al. (2016) showed the positive impact of the assimilation of lightning data on the 24 h precipitation forecast for eight rainfall events over Greece. They also showed the dependence of the performance on the type of event, specifically widespread versus non-widespread events. Alexander et al. (1999) and Papadopulos et al. (2005) showed the positive impact of the assimilation of lightning data up to 12 h after the end of the assimilation period for simulations of intense thunderstorms that occurred over the United States (Alexander et al., 1999) and over Greece (Papadopulos et al., 2005). Fierro et al. (2014) showed the positive impact of the lightning data on the simulation of the 29–30 June 2012 event over the United States. Comparison of simulated and observed radar reflectivity showed that the assimilation of lightning improved the simulation for the 6 h following the end of the assimilation period. Federico et al. (2017) considered the impact of the assimilation of lightning data on the 3 h precipitation forecast for twenty cases, showing an important improvement of the rainfall forecast over Italy. Qie et al. (2014) considered the impact of the assimilation of lightning data on the simulation of a Mesoscale Convective System (MCS) over two megacities in China, showing that the representation of convection, as well as the quantitative precipitation forecast, were clearly improved during the assimilation period and after 1 h of forecast.
In general, the above studies are focused on different forecast ranges, showing that the assimilation of lightning data has a positive impact on the NWP. However, a systematic assessment of the impact of the assimilation of lightning data on the precipitation forecast at different forecast time ranges is missing. This study is a first step in this direction: the goal is to assess for how long and to what extent the assimilation of lightning data positively affects the precipitation forecast.
To investigate in detail the impact of the assimilation of lightning data on the precipitation forecast at different time ranges, this study applies the assimilation methodology of Federico et al. (2017), which consists of the substitution of the water vapour when specific conditions are met, to the precipitation forecast at 3, 6, 12 and 24 h. The performance is evaluated considering the twenty cases used in Federico et al. (2017), which occurred over Italy in Fall 2012, because a dense raingauge dataset is available for these cases (see the Data and methods section). As noted by Giannaros et al. (2016), the performance of the assimilation of lightning data depends on the event. In particular, they showed that the assimilation of lightning data gave a better precipitation forecast for widespread long-lasting (> 24 h) convective events than for non-widespread (i.e. scattered over Greece or located in some specific areas of the country) short (< 24 h) events. This study considers only cases of widespread convection over Italy, lasting more than 24 h, and analyses the impact of the assimilation of lightning for different forecasting time ranges.

Data and methods
We use the Regional Atmospheric Modeling System (RAMS). RAMS is a general-purpose limited-area model designed to be used at the mesoscale or finer horizontal resolutions. It is based on a full set of non-hydrostatic, compressible equations of the atmospheric dynamics and thermodynamics, and on conservation equations for scalar quantities such as water vapour and liquid and ice hydrometeor mixing ratios. The model is widely used for research as well as for weather forecasting (Cotton et al., 2003). The model is configured with two domains, shown in Fig. 1: the first domain (R10) grid has 10 km horizontal resolution and 301 × 301 grid points in the WE and NS directions. The second domain (R4) grid has 4 km horizontal resolution and 401 × 401 grid points. The second domain is used to evaluate the model performance, while the first domain provides the initial and boundary conditions to the second domain, avoiding an abrupt change of resolution between the global-scale model (here we use the Integrated Forecasting System (IFS) analysis and forecast fields at 0.25° horizontal resolution of the European Centre for Medium-Range Weather Forecasts (ECMWF)) and the 4 km horizontal resolution of the inner RAMS domain. The interaction between R10 and R4 is one-way, and the physical options of R10 and R4 are chosen as in Federico et al. (2017). The CPS is activated for R10 only.
Lightning data are provided by LINET (Betz et al., 2009). This network has more than 550 sensors worldwide and is expanding. LINET (http://www.nowcast.de) has a very good performance for both precision and efficiency over Europe (Lagouvardos et al., 2009).
For the assimilation, which is performed only for R4, the flashes are mapped onto the RAMS inner grid, and the water vapour mixing ratio $q_v$ is computed using the following equation:

$$q_v = A\,q_s + B\,q_s \tanh(C X)\left[1 - \tanh\left(D\,Q_g^{\alpha}\right)\right] \tag{1}$$

where A = 0.86, B = 0.15, C = 0.30, D = 0.25, α = 2.2, $q_s$ is the saturation mixing ratio at the model atmospheric temperature, and $Q_g$ is the simulated graupel mixing ratio (g kg−1). $X$ is the flash rate remapped onto the R4 grid (number of flashes in each grid point registered in 5 min). In other words, every 5 min and for each grid point of the R4 grid, we compute the number of LINET strokes [available as a triple (longitude, latitude, time)] registered in the previous five minutes in a square centred on the grid point and with side equal to the grid mesh size. This number is $X$ and is used in Eq. (1).
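As an illustration only, the flash count X and the right-hand side of Eq. (1) could be sketched as below. This is not the RAMS implementation: the function names, the assumption that stroke positions have already been projected to kilometres on the model grid, and the array layout are all choices made for this sketch; only the constants come from the text.

```python
import numpy as np

# Constants quoted in the text for Eq. (1)
A, B, C, D, ALPHA = 0.86, 0.15, 0.30, 0.25, 2.2

def flash_rate_grid(x_km, y_km, nx, ny, dx_km):
    """Count strokes per grid cell (X in Eq. 1) for one 5 min window.

    Assumption of this sketch: x_km and y_km are stroke positions already
    projected onto the model grid, measured from the south-west corner of
    the first cell, so that each cell is a square of side dx_km centred on
    a grid point. Strokes outside the grid are ignored.
    """
    ix = np.floor(np.asarray(x_km) / dx_km).astype(int)
    iy = np.floor(np.asarray(y_km) / dx_km).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    counts = np.zeros((ny, nx), dtype=int)
    # np.add.at accumulates correctly when several strokes fall in one cell
    np.add.at(counts, (iy[inside], ix[inside]), 1)
    return counts

def qv_eq1(q_s, flash_rate, q_g):
    """Right-hand side of Eq. (1); q_g in g/kg, result in the units of q_s."""
    return A * q_s + B * q_s * np.tanh(C * flash_rate) * (1.0 - np.tanh(D * q_g ** ALPHA))

# Hypothetical strokes on a 3 x 3 grid with 4 km spacing
X = flash_rate_grid([1.0, 1.5, 9.0], [1.0, 2.0, 9.0], nx=3, ny=3, dx_km=4.0)
print(X)
print(qv_eq1(q_s=10.0, flash_rate=X, q_g=0.5))
```

With no flashes the expression reduces to A·q_s, and for high flash rates with little graupel it approaches (A + B)·q_s, which is the behaviour described for the profile of Fierro et al. (2012).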
The water vapour mixing ratio of Eq. (1) is substituted for the simulated value at grid points where electric activity is observed if the value of Eq. (1) is larger than that simulated, while no change is made if the value of Eq. (1) is less than that simulated. The water vapour mixing ratio is changed in the charging zone, between 0 and −25 °C, and it is redistributed by the model through adiabatic and diabatic processes.
Twenty cases are considered in this study (Table 1). All events lasted more than 24 h over Italy and most of them belong to the HyMeX-SOP1 (Hydrological cycle in the Mediterranean Experiment - First Special Observing Period; Ferretti et al., 2014). These events were selected because the precipitation forecast can be accurately verified by a database of hourly precipitation from 2944 raingauges over Italy (http://mistrals.sedoo.fr/?editDatsId=1282&datsId=1282&project_name=MISTR&q=DPC; Davolio et al., 2015). Starting from hourly data, 3, 6, 12 and 24 h precipitation can be easily derived.
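The substitution rule for Eq. (1) stated in this section (replace the simulated water vapour only where flashes are observed, only in the charging zone between 0 and −25 °C, and only when Eq. (1) gives the larger value) can be sketched as follows. The function name, argument names and the (nz, ny, nx) array layout are assumptions of this sketch and are not taken from RAMS.

```python
import numpy as np

def substitute_water_vapour(qv_model, qv_eq1, temp_c, flash_mask,
                            t_warm_c=0.0, t_cold_c=-25.0):
    """Return the water vapour field after the substitution step.

    qv_model, qv_eq1, temp_c: arrays of shape (nz, ny, nx) holding the
    simulated mixing ratio, the value of Eq. (1) and the temperature (deg C).
    flash_mask: boolean array of shape (ny, nx), True where electric
    activity is observed in the current 5 min window.
    """
    charging_zone = (temp_c <= t_warm_c) & (temp_c >= t_cold_c)
    replace = charging_zone & flash_mask[np.newaxis, :, :] & (qv_eq1 > qv_model)
    # Elsewhere the simulated value is left unchanged: vapour is only added
    return np.where(replace, qv_eq1, qv_model)
```

Because the rule can only raise the water vapour, a scheme of this kind has no way to remove moisture where the model produces convection that is not observed.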
For each of the twenty cases, the following configurations are considered (Fig. 2):
(a) R10: a 36 h simulation of R10 without lightning data assimilation, using as initial and boundary conditions the ECMWF analysis/forecast cycle issued at 12:00 UTC of the day before the actual day;
(b) CNTRL: this simulation is performed by nesting R4 in R10 using one-way nesting and without lightning data assimilation. Each CNTRL simulation starts at 18:00 UTC of the day before the actual day, and the first six hours, accounting for the spin-up time, are discarded from the evaluation;
(c) F3HA6: these simulations consist of eight runs of 9 h duration. During the first 6 h, lightning data are assimilated (assimilation stage), then a short-term 3 h forecast is made (forecast stage). The first simulation starts at 18:00 UTC of the day before the actual day, using as initial and boundary conditions the R10 forecast, and gives the forecast for the hours 00:00-03:00 UTC of the actual day (Fig. 2). Simulations two to eight are as the first one, but each is shifted 3 h ahead of the previous one. Therefore, eight F3HA6 simulations are needed to span the forecast of a whole day. Simulations two to eight use the output of the previous F3HA6 forecast as initial conditions to maximize lightning data assimilation;
(d) F6HA6: these simulations consist of four runs of 12 h duration. During the first 6 h, lightning data are assimilated, then a short-term 6 h forecast is made. Initial and boundary conditions follow the same strategy as F3HA6, i.e. the boundary conditions are given by R10 for all the runs, while the initial conditions are given by R10 for the first simulation, starting at 18:00 UTC of the day before the actual day, and by the previous F6HA6 forecast for the following ones. Four F6HA6 simulations are needed to span the forecast over a whole day;
(e) F12HA6: these simulations consist of two runs of 18 h duration. During the first 6 h, lightning data are assimilated, then a 12 h forecast is made. Initial and boundary conditions follow the same strategy as F3HA6. Two F12HA6 simulations are needed to span the forecast over a whole day;
(f) F24HA6: it consists of one run of 30 h duration. During the first 6 h, lightning data are assimilated, then a 24 h forecast is made. One F24HA6 simulation is needed to span the forecast over a whole day. Initial and boundary conditions are given by R10;
(g) ASSIM: this simulation is performed by nesting R4 in R10 using one-way nesting and performing lightning data assimilation continuously for the whole run.
In order to have a common verification range for all simulation configurations, we consider the daily precipitation. In the case of the FXHA6 simulations (where X can be 3, 6, 12 or 24), the daily precipitation is obtained by summing all the forecast stages of the FXHA6 runs (eight 3 h forecasts for F3HA6, four 6 h forecasts for F6HA6, two 12 h forecasts for F12HA6 and one 24 h forecast for F24HA6). Considering this simulation and verification strategy, it follows that, for each simulated day, the constraint given by the assimilation of lightning data is stronger for shorter forecast ranges. For F3HA6, for example, the assimilation is performed eight times for 6 h, while for F24HA6 it is performed once for 6 h.
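The run schedule implied by this strategy can be made explicit with a short sketch. The function name and the returned tuple layout are hypothetical; the 6 h assimilation stage, the forecast ranges and the 18:00 UTC start of the first run are taken from the text.

```python
from datetime import datetime, timedelta

ASSIM_HOURS = 6  # length of the assimilation stage in every FXHA6 run

def fxha6_schedule(day, forecast_hours):
    """List the runs needed to cover one day for a given forecast range.

    day: datetime at 00:00 UTC of the actual day.
    Returns tuples (run_start, forecast_start, forecast_end).
    """
    if 24 % forecast_hours != 0:
        raise ValueError("forecast range must divide 24 h")
    runs = []
    for k in range(24 // forecast_hours):
        forecast_start = day + timedelta(hours=k * forecast_hours)
        run_start = forecast_start - timedelta(hours=ASSIM_HOURS)
        forecast_end = forecast_start + timedelta(hours=forecast_hours)
        runs.append((run_start, forecast_start, forecast_end))
    return runs

day = datetime(2012, 10, 15)  # one of the case-study days
for x in (3, 6, 12, 24):
    runs = fxha6_schedule(day, x)
    print(f"F{x}HA6: {len(runs)} run(s) of {ASSIM_HOURS + x} h each, "
          f"assimilation performed {len(runs)} time(s) for {ASSIM_HOURS} h")
```

The daily precipitation for a configuration is then the sum of the rainfall accumulated between forecast_start and forecast_end over all of its runs.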
The simulation strategy outlined above and the assimilation technique are the same as in Federico et al. (2017); there are, however, important differences between the two studies, which are discussed in the following. First, the aims of the two papers are different: Federico et al. (2017) adapted the methodology of Fierro et al. (2012) to the RAMS model and discussed the impact of the technique on the 3 h precipitation forecast. Therefore, the R10, CNTRL, F3HA6 and ASSIM configurations (20 days for each configuration) are shared with that paper. The aim of this paper is to show how long the impact of lightning assimilation on the precipitation forecast lasts. For this purpose, the configurations F6HA6, F12HA6 and F24HA6 (each used for the simulations of the 20 cases) are presented here for the first time. Moreover, to be consistent with other studies (Fierro et al., 2012; Dixon et al., 2016), where the water vapour mixing ratio is modified within a distance of 10 km from the points where flashes are observed, the water vapour mixing ratio is changed also at the four grid points adjacent (two in the WE and two in the NS directions) to the grid point where flashes are observed, even if no electric activity is observed at those adjacent grid points. In this way, we assimilate the water vapour in a circle 8 km in diameter, in better agreement with the cited studies. As a consequence of this difference, the F3HA6 and ASSIM simulations in this paper differ from those presented in Federico et al. (2017). The CNTRL simulations shown in this paper also differ from those in Federico et al. (2017) because the orography dataset was updated from GTOPO30 (Global 30 Arc-Second Elevation; https://lta.cr.usgs.gov/GTOPO30, used in Federico et al., 2017) to GMTED2010 (Global Multi-resolution Terrain Elevation Data 2010; https://lta.cr.usgs.gov/GMTED2010, used in this study). However, the results for CNTRL shown in this paper do not change significantly compared to Federico et al. (2017).
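The extension of the assimilation to the four adjacent grid points can be pictured as a spreading of the flash mask along the two horizontal axes. The sketch below is one possible way to express it with NumPy slicing; the function name and the convention that the mask is a 2-D boolean array are assumptions, and the actual RAMS code may differ.

```python
import numpy as np

def spread_to_adjacent(flash_mask):
    """Extend a 2-D boolean flash mask to the four adjacent grid points.

    A point is True in the output if it, or one of its two neighbours
    along either horizontal axis, is True in the input. Points on the
    domain edge simply have fewer neighbours (no wrap-around).
    """
    mask = np.asarray(flash_mask, dtype=bool)
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]   # neighbour at the next index along axis 0
    out[:-1, :] |= mask[1:, :]   # neighbour at the previous index along axis 0
    out[:, 1:] |= mask[:, :-1]   # neighbour at the next index along axis 1
    out[:, :-1] |= mask[:, 1:]   # neighbour at the previous index along axis 1
    return out

flashes = np.zeros((5, 5), dtype=bool)
flashes[2, 2] = True
print(spread_to_adjacent(flashes).astype(int))
```

A single flash-bearing grid point therefore triggers the water vapour check at five grid points arranged as a cross.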
Statistical verification is performed by dichotomous contingency tables (2 × 2) for different precipitation thresholds, and the Probability of Detection (POD), the Bias, the False Alarm Ratio (FAR) and the Equitable Threat Score (ETS) are considered. Indicating with a the number of hits, b the false alarms, c the misses and d the correct no-forecasts, we have:

$$\mathrm{POD} = \frac{a}{a+c}, \qquad \mathrm{Bias} = \frac{a+b}{a+c}, \qquad \mathrm{FAR} = \frac{b}{a+b},$$

$$\mathrm{ETS} = \frac{a - a_r}{a + b + c - a_r}, \qquad a_r = \frac{(a+b)(a+c)}{a+b+c+d},$$

where $a_r$ is the number of hits expected by chance (Wilks, 2006).
The POD gives the fraction of observed events that are correctly forecast (range [0, 1]; 1 is the perfect score, attained when there are no misses). The Bias is the ratio of predicted events to observed events (range [0, +∞); 1 is the perfect score). The FAR gives the fraction of forecast events that were not observed (range [0, 1]; 0 is the perfect score). The ETS (range [−1/3, 1]; 1 is the perfect score, while 0 indicates a useless forecast) measures the fraction of observed events that were correctly predicted, adjusted for hits associated with a random forecast, in which forecast occurrence/non-occurrence is independent of observation/non-observation.
The elements of the contingency tables (a, b, c, d) are summed for all the twenty events before the computation of the scores.
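A minimal sketch of this verification step is given below, assuming the standard definitions of the scores (Wilks, 2006). The function names are hypothetical, the example tables are invented for illustration and are not data from this study, and no guard against empty categories (zero denominators) is included.

```python
def scores(a, b, c, d):
    """POD, Bias, FAR and ETS from hits a, false alarms b, misses c
    and correct no-forecasts d."""
    n = a + b + c + d
    a_r = (a + b) * (a + c) / n  # hits expected by chance
    return {
        "POD": a / (a + c),
        "Bias": (a + b) / (a + c),
        "FAR": b / (a + b),
        "ETS": (a - a_r) / (a + b + c - a_r),
    }

def pooled_scores(tables):
    """Sum the (a, b, c, d) elements over all events, then compute the
    scores once on the pooled contingency table."""
    a, b, c, d = (sum(t[i] for t in tables) for i in range(4))
    return scores(a, b, c, d)

# Invented contingency tables for two events at one precipitation threshold
example = [(30, 10, 20, 140), (20, 10, 10, 160)]
print(pooled_scores(example))
```

Pooling the tables before computing the scores gives more weight to events with many raingauge reports above the threshold than averaging per-event scores would.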

Results
Figure 3a shows the POD for all simulation types and thresholds considered in this paper. The POD decreases for increasing precipitation, showing the difficulty of correctly forecasting the rainfall for increasing daily amounts. Without lightning assimilation (CNTRL), the POD ranges from 0.69 (1 mm day−1) to 0.42 (80 mm day−1). The POD for 60 mm day−1 is about 0.50, i.e. half of the potentially dangerous events are correctly predicted by RAMS without assimilation of lightning data.
The F3HA6 simulation has the best POD among the forecasts. For this setting, the POD decreases from 0.79 (1 mm day−1) to 0.52 (80 mm day−1). For 60 mm day−1, the POD is 0.62, i.e. 62 % of the potentially dangerous events are correctly forecast by F3HA6. The lightning assimilation has an important impact on the rainfall prediction and the best result is for F3HA6. For this model configuration, the improvement of the POD compared to CNTRL is larger than 10 % for all precipitation thresholds. The impact of the assimilation of lightning data is apparent also for F6HA6 and F12HA6 because the POD improvement compared to CNTRL is larger than 5 % for all thresholds.
The forecast range has an important impact on the rainfall forecast because, from Fig. 3a, we note a steady decrease of the POD as the forecasting time increases; the best performance is for F3HA6, followed by F6HA6, F12HA6, F24HA6, and finally CNTRL. This is caused by the stronger constraint given to the forecast by the assimilation of lightning data as the forecasting time decreases. For example, F3HA6 assimilates the lightning for eight 6 h periods for each day of forecast, while F24HA6 assimilates the lightning for one 6 h period for the same forecast (Fig. 2). The stronger constraint imposed by the lightning assimilation on the forecast gives a better representation of the convection and of the precipitation fields for shorter-range forecasts. Stated in other terms, the model errors become more important than the analysis errors for longer forecast time ranges and the performance worsens.
The POD of ASSIM is the largest among all simulations because the convection is forced by the lightning assimilation when and where it is observed for the whole simulation duration.
Figure 3b shows the results for the Bias. RAMS overforecasts the precipitation events (larger precipitation areas and overestimation) as the rainfall threshold increases. For example, the Bias for CNTRL rises from 0.85 (1 mm day−1) to 3.3 (80 mm day−1). For each precipitation threshold, the Bias increases steadily for shorter-range forecasts. Because the assimilation of lightning adds water vapour to the forecast, a larger amount of water vapour is added to the simulations for shorter forecast ranges, increasing the Bias.
Figure 3c shows the FAR for the different simulations considered in this paper. There is little variation of the score among the different RAMS configurations. This is mainly because the variation of the hits and false alarms among different configurations is much smaller than the values of the hits and false alarms themselves. The FAR increases from 0.2 (1 mm day−1 threshold) to 0.85 (80 mm day−1 threshold), showing that the fraction of the predicted events that are false alarms increases from 20 % (1 mm day−1) to 85 % (80 mm day−1).
A drawback of the lightning data assimilation used in this paper is the increase, for a fixed threshold, of the false alarms for shorter forecast ranges, as revealed by the inspection of the contingency tables for the twenty cases. For example, for the 20 mm day−1 threshold, the hits (a) of F3HA6 are 3426 and the false alarms (b) are 2969. These values are, respectively, 2939 and 2686 for CNTRL. In general, the assimilation of lightning gives a larger number of hits (a) but also of false alarms (b).
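As a worked check on these counts, assuming the standard definition FAR = b/(a + b) (Wilks, 2006), the false alarm ratios at the 20 mm day−1 threshold are:

$$\mathrm{FAR}_{\mathrm{F3HA6}} = \frac{2969}{3426 + 2969} = \frac{2969}{6395} \approx 0.464, \qquad \mathrm{FAR}_{\mathrm{CNTRL}} = \frac{2686}{2939 + 2686} = \frac{2686}{5625} \approx 0.478$$

F3HA6 adds 487 hits and 283 false alarms relative to CNTRL, so the absolute number of false alarms grows while the ratio stays nearly unchanged, which is consistent with the small spread of the FAR among configurations noted for Fig. 3c.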
Unlike POD, ETS (Fig. 3d) is penalized by false alarms. Among the forecasts, F3HA6 has the best ETS, followed by F6HA6, F12HA6, F24HA6 and CNTRL, showing the improvement of the forecast for shorter forecast ranges. As the forecast range decreases, the stronger constraint given by the assimilation of lightning data improves the rainfall forecast. This is confirmed by the ETS of ASSIM, which is the largest among all simulations.
For the larger precipitation thresholds, however, the spread of the ETS among the forecasts becomes smaller and, for the 80 mm day−1 threshold, CNTRL outperforms F24HA6 and F12HA6. The improvement of the ETS by the assimilation of lightning data is reduced, and eventually cancelled, by the increase of the false alarms for larger thresholds, with results depending on the forecast range and precipitation threshold.
Nevertheless, the lightning data assimilation is helpful overall because, with the two exceptions noted above, the ETS with the assimilation of lightning data outperforms that of CNTRL (Fig. 3d).

Conclusions
This study shows the impact of the assimilation of lightning data on the precipitation forecast for different forecast time ranges, from 3 to 24 h. The performance is evaluated for twenty long-lasting (> 24 h) cases characterized by widespread convection and moderate to heavy precipitation over Italy, using a dense raingauge network to verify the forecast. The RAMS model is employed at convection-permitting horizontal scales (4 km).
Results show that the assimilation of lightning data has an important impact on the forecast performance: the F24HA6 run, which assimilates lightning for 6 h and then performs a 24 h forecast, already improves on the CNTRL forecast, which does not use lightning data assimilation. However, the forecast range has an important impact on the quality of the simulations, because the performance is notably better for short forecast ranges, specifically the 3 and 6 h forecasts, and improves steadily as the forecast range becomes shorter. Model errors become more important than the analysis errors for longer forecast ranges, worsening the performance.
The assimilation scheme used in this paper adds water vapour to the forecast. This determines a general increase of the false alarms, limiting the usefulness of lightning data assimilation for the largest precipitation thresholds. To deal with this issue, future studies will consider the possibility of decreasing the water vapour when lightning is simulated but not observed. This option is already available in the assimilation schemes at non-convection-permitting scales that use the CPS to assimilate lightning (Mansell et al., 2007; Giannaros et al., 2016); in these schemes, the CPS can be partially or totally suppressed where lightning is not observed while the model activates the CPS.
On the other hand, when nudging or substitution of water vapour is performed, adjusting the water vapour where lightning is not observed is not easy because additional observations and/or further assumptions are needed. Alexander et al. (1999) and Chang et al. (2001) showed how the assimilation of the Integrated Water Vapour (IWV) can be used to decrease the modelled water vapour when the model overestimates the convective activity. If additional observations are not available, however, the assimilation of water vapour is problematic: while we can reasonably assume that the charging zone is saturated when flashes are observed, it is difficult to estimate the water vapour profile when lightning is not observed while the model is simulating convection.
It is finally noted that the overestimation of the false alarms is not only a consequence of the assimilation scheme, as shown by the high values of the Bias of CNTRL, where there is no assimilation of lightning data. In particular, the horizontal resolution (4 km in this study) has an important role; RAMS simulations at 2.5 km horizontal resolution for two case studies also included in this paper (15 and 27 October 2012) have shown a considerable reduction of the false alarms, which will be further investigated in the future (see the discussion associated with the paper Federico et al., 2017).
Data availability. The precipitation forecast of the RAMS model can be requested from the corresponding author. LINET data were provided by Nowcast GmbH (https://www.nowcast.de/) within a scientific agreement between Hans Dieter Betz and the Satellite Meteorological Group of CNR-ISAC in Rome. The precipitation dataset, used to verify the RAMS forecast, is available through the HyMeX website: http://mistrals.sedoo.fr/?editDatsId=1282&datsId=501282&project_name=MISTR&q=DPC.
Competing interests. The authors declare that they have no conflict of interest.

Figure 1. The two domains (D1, D2). D1 has 301 grid points in both the WE and SN directions; D2 has 401 grid points in both the WE and SN directions.

Figure 2. The simulations (see text for details); d and d − 1 are the actual day and the day before the actual day.

Figure 3. Scores for the daily precipitation computed by summing the contingency tables of the twenty case studies; (a) Probability of Detection; (b) Bias; (c) False Alarm Ratio; (d) Equitable Threat Score; F3HA6 is in red, F6HA6 is in blue, F12HA6 is in green, F24HA6 is in yellow, ASSIM is in violet, CNTRL is in cyan.

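Table 1. The twenty case studies.
September 2012: 12, 13, 14, 24, 26, 30
October 2012: 12, 13, 15, 26, 27, 28, 29, 31
November 2012: 4, 5, 11, 20, 21, 28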
This article is part of the special issue "16th EMS Annual Meeting & 11th European Conference on Applied Climatology (ECAC)". It is a result of the 16th EMS Annual Meeting & 11th European Conference on Applied Climatology (ECAC), Trieste, Italy, 12-16 September 2016.