This study investigates the characteristic time-scales of variability found in long-term time-series of daily means of estimates of surface solar irradiance (SSI). The study is performed at various levels to better understand the causes of variability in the SSI. First, the variability of the solar irradiance at the top of the atmosphere is scrutinized. Then, estimates of the SSI in cloud-free conditions as provided by the McClear model are dealt with, in order to reveal the influence of the clear atmosphere (aerosols, water vapour, etc.). Lastly, the role of clouds in the variability is inferred by the analysis of in-situ measurements. A description of how the atmosphere affects SSI variability is thus obtained on a time-scale basis. The analysis is also performed with estimates of the SSI provided by the satellite-derived HelioClim-3 database and by two numerical weather re-analyses: ERA-Interim and MERRA2. It is found that HelioClim-3 estimates render an accurate picture of the variability found in ground measurements, not only globally, but also with respect to individual characteristic time-scales. In contrast, the variability found in the re-analyses correlates poorly with the variability of the ground measurements at all scales.

The Sun is of the utmost importance for the planet Earth. Not only does it
play a central role in our solar system, but it also represents the main
power source for the Earth, being the main driver behind weather and climate

The SSI is known to exhibit variations on a large dynamic range with respect
to both time and geographical position. This is due to a wide array of factors.
Some of these operate at long time-scales, from decades to millennia and
beyond, and are related to the stellar variability of the Sun

In order to analyse the temporal variability of the SSI, measurements of this
physical quantity are needed. A primary way of obtaining such data is
recording the values of the SSI at ground stations by using pyranometers or
pyrheliometers. Nevertheless, even sources of high quality solar radiation
measurements, such as the Baseline Surface Radiation Network (BSRN), a
worldwide radiometric network providing accurate readings of the SSI at high
temporal resolution

In practice, information about the SSI is often required at geographical
locations different from any measuring station. But extending the
representativity of ground station measurements to surrounding areas cannot
be applied to regions where the physical and/or climatological distance
between stations is large

Another option is to make use of satellite-based methods, which are a good
supplement in long-term solar resource assessment.

Yet another possibility for estimating the solar radiation at ground level is
provided by global atmospheric re-analyses from numerical weather models. The
main benefits are the wide, even global, coverage and the spanning of
multi-decennial time periods. However, some authors have found a large
uncertainty relative to satellite-based irradiance estimates and advise
against using data from re-analyses

In this context, we analyse the temporal variability in time-series of daily means of SSI for two geographical locations, at different time-scales, as found in the outputs of different models, satellite estimates, re-analyses, and ground measurements. To gain better insight into the causes of variability of the SSI, we follow the downwelling solar shortwave irradiance along its path through the atmosphere towards the surface. The modelled top of the atmosphere (TOA) solar irradiance is first analysed as a clean input signal, devoid of any atmospheric perturbations, in order to reveal the natural variability of the exo-atmospheric solar input. To account for variability owing to atmospheric effects such as scattering or absorption due to water vapour or aerosols, but excluding any influence of clouds, the output of a clear-sky (i.e. cloud-free) model of the SSI is scrutinized. Lastly, the role of clouds in the variability is inferred by analysing pyranometric ground measurements. The fitness for use of satellite estimates and re-analysis data is then assessed by comparison with the measured data.

The novelty of our work stems from the fact that, unlike previous studies
where global statistical indicators are employed

The study develops as follows. Section

The data consists of multiple time-series of daily means of solar irradiance
corresponding to two geographical locations in Europe: Vienna, Austria
(48.25

The six solar irradiance time-series for VIE investigated in this study, spanning 1 February 2004 to 31 January 2013. From top to bottom: TOA, McClear, ERA, MERRA2, HC3v5, and WRDC. Each point corresponds to a daily mean of irradiance. Time markers on the abscissa indicate the start of the corresponding year.

Six datasets are used for each location:

modelled exo-atmospheric irradiance;

modelled clear-sky irradiance at ground level;

pyranometric measurements of the SSI;

Meteosat satellite-based SSI estimates;

radiation products from the ERA-Interim re-analysis;

radiation products from the MERRA2 re-analysis.

The top of the atmosphere (TOA) solar irradiance time-series has been
generated using the SG2 algorithm

The dataset of downwelling surface solar irradiance, under clear-sky
conditions (i.e. cloud-free), is generated using the McClear model

Estimates of the SSI derived from Meteosat satellite imagery by the
Heliosat-2 method, as described by

Pyranometric ground measurements of the daily SSI were obtained from the
World Radiation Data Centre (WRDC)

The ERA-Interim product “Surface Solar Radiation Downwards” (ECMWF2009),
from 2004 to 2014, was retrieved using the

The 1 h radiation diagnostics M2T1NXRAD from MERRA2 have been extracted for
the four nearest points and bi-linearly interpolated to generate the time
series at the exact location. The MERRA2 data are in W m

The goal of the study at hand is to first decompose the scrutinized
time-series into uncorrelated sub-constituents that have distinct
characteristic time-scales. Analysis then ensues at each distinct scale of
intrinsic variability. These time-scales, or characteristic periods, are
nothing more than the inverse of the frequency of the various processes from
which the data stems. As such, analysis techniques that depict the changes
with respect to time of the spectral content of a time-series are to be
favoured, since they enable both the identification of periodicities and the
following of the dynamic evolution of the processes generating the data. For a
review of such regularly employed methods in geophysical signal processing
see

The non-linear and non-stationary characteristics of the SSI

As such, this study employs the Hilbert–Huang Transform, an adaptive,
data-driven analysis technique. The HHT is ideally suited for non-linear and
non-stationary data and it adaptively decomposes any time-series into basis
functions derived from the local properties of the data

The HHT consists of two steps, the empirical mode decomposition (EMD), followed by Hilbert spectral analysis (HSA), both detailed hereafter.

The EMD is algorithmic in nature, and iteratively decomposes data into
several series of oscillations. Within each series, called an Intrinsic Mode
Function (IMF), the oscillations share a common local time-scale. An IMF is a
function that satisfies two criteria: (1) its number of zero crossings and
number of extrema differ at most by one; (2) at any point, the mean value of
its upper and lower envelopes is zero. The theoretical signal model for IMFs
is an amplitude modulation–frequency modulation (AM–FM) one. Given the adaptive
nature of the EMD, the IMFs represent the basis functions onto which the data
is projected during decomposition. Any two IMFs are locally orthogonal for
all practical purposes; however, given the empirical nature of the method, no
theoretical guarantee can be provided. In practice, it is found that the
relative difference between the variance of the input signal and the sum of
variances of the IMFs (i.e. the spectral leakage) is typically less than
1 %; only for extremely short data ranges does the leakage increase to 5 %,
comparable to that of a collection of pure trigonometric functions having the
same data length

let

let

find the minima and maxima of

interpolate minima to find lower envelope:

interpolate maxima to find upper envelope:

find mean of envelopes:

subtract the mean:

if

store IMF:

update the residual:

if

return IMFs
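
The steps listed above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the algorithm used in this study (which relies on an improved ensemble variant of the EMD). It makes three assumptions: envelopes are built by linear interpolation (`np.interp`) instead of the customary cubic splines, the sifting loop stops after a fixed number of iterations (`n_sift`) instead of testing the IMF criteria, and the names `local_extrema`, `emd`, `max_imfs` and `n_sift` are hypothetical.

```python
import numpy as np

def local_extrema(h):
    """Indices of strict local maxima and minima (flat plateaus are ignored)."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def emd(x, max_imfs=10, n_sift=10):
    """Simplified EMD: returns (imfs, residual) with imfs.sum(axis=0) + residual == x."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size)
    residual = x.copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if min(maxima.size, minima.size) < 2:
            break  # too few extrema left: the residual is the trend
        h = residual.copy()
        for _ in range(n_sift):  # sifting loop, fixed iteration count (assumption)
            maxima, minima = local_extrema(h)
            if min(maxima.size, minima.size) < 2:
                break
            # Assumption: linear envelopes, held constant beyond the outermost extrema
            upper = np.interp(t, maxima, h[maxima])
            lower = np.interp(t, minima, h[minima])
            h = h - 0.5 * (upper + lower)  # subtract the mean of the envelopes
        imfs.append(h)             # store the IMF
        residual = residual - h    # update the residual
    return np.array(imfs), residual
```

By construction, the extracted modes and the residual add up to the input, so the decomposition is complete regardless of how well each mode satisfies the IMF criteria. With linear envelopes the modes are rougher than those of a spline-based implementation, so the sketch is suited to illustrating the control flow rather than to reproducing results.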

Step 3 is called the sifting loop and it controls the filter character of the
EMD. An infinite number of sifting iterations would asymptotically approach
the result of the Fourier decomposition (i.e. constant amplitude envelopes)

After all the IMFs are extracted, what is left of the data is called a trend
or residue, which can no longer be considered as an oscillation at the span
of the data.
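
Once the IMFs and the residue are available, the spectral leakage described earlier (the relative difference between the variance of the input signal and the sum of the variances of its components) can be checked with a short helper. This is a minimal sketch: the function name is hypothetical, and whether the residue is passed among the components is left to the caller, since the text does not specify it.

```python
import numpy as np

def spectral_leakage(signal, components):
    """Relative difference between var(signal) and the summed variances of the components."""
    signal = np.asarray(signal, dtype=float)
    total = np.var(signal)
    parts = sum(np.var(np.asarray(c, dtype=float)) for c in components)
    return abs(total - parts) / total
```

For mutually orthogonal components the cross terms vanish and the leakage is close to zero, whereas two identical components give a leakage of one half; values of a few percent therefore indicate near-orthogonality.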

The IMFs obtained by decomposing the WRDC time-series for VIE,
plotted as

This study uses a modified version of the original EMD algorithm, the
Improved Complete Ensemble Empirical Mode Decomposition, introduced by

Once the EMD has decomposed the data into IMFs, the last step of the HHT
is the Hilbert spectral analysis. For each IMF
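
As an illustration of what the Hilbert spectral analysis computes for one IMF, the sketch below derives an instantaneous amplitude and an instantaneous frequency from the analytic signal. It assumes the common FFT-based discrete Hilbert transform and a phase derivative obtained by finite differences; these are choices made for this example and not necessarily those of the toolkit used in the study. The function names and the `dt` parameter (sampling step, e.g. one day) are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real series via the FFT (one-sided spectrum)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the mean as is
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin as is
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def instantaneous_attributes(imf, dt=1.0):
    """Instantaneous amplitude and frequency (cycles per unit of dt) of one IMF."""
    z = analytic_signal(imf)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.gradient(phase, dt) / (2.0 * np.pi)
    return amplitude, frequency
```

For a pure cosine with a 16-day period sampled daily over a whole number of periods, the amplitude is constant and the frequency equals 1/16 cycles per day; the local time-scale is the inverse of that frequency.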

The 12 datasets (6 datasets per station) have been decomposed by the EMD into
10 IMFs and a residual, as shown in Fig.

Mean IMF time-scales in days for the VIE datasets.

Mean IMF amplitudes in W m

From Tables

For the McClear time-series, as well as for the rest of the VIE datasets,
IMF1…IMF5 display remarkably similar features, such as monotonically
decreasing amplitudes and time-scales that exhibit period doubling, roughly
following the dyadic scale: 3 days

The Hilbert marginal spectra for the VIE datasets: TOA, McClear, WRDC, HC3v5, ERA, MERRA2. The abscissa indicates the time-scale on a binary logarithmic scale, and the ordinate denotes power in dB.

The WRDC dataset unsurprisingly shares many features with HC3v5, ERA and
MERRA2 datasets, since the latter three are intended to be accurate estimates
of the former. The rest of the results will be presented in a lumped form for
these four time-series. For IMF1…IMF5, HC3v5 agrees better with WRDC
than the re-analyses in terms of mean amplitudes and takes on only slightly
higher values (1 to 3 W m

Generally similar results are also obtained for KIV, as summarized in
Tables

The Hilbert marginal spectra for the KIV datasets: TOA, McClear, WRDC, HC3v5, ERA, MERRA2. The abscissa indicates the time-scale on a binary logarithmic scale, and the ordinate denotes power in dB.

The previous summary of the results, although informative, is static in the
sense that only two features are used to characterize, in an approximate
manner, each time-evolving IMF: the long-term average amplitude and time
period. To make use of the full potential of the HHT, which can follow both
the temporal and the spectral evolution of the data, Hilbert spectra were
also computed for all the datasets (not shown) and are provided in the
Supplement. From these Hilbert spectra the marginal,
time-integrated, versions were computed and are presented in
Fig.

The TOA spectrum for VIE in Fig.

The McClear dataset is seen to introduce variability in the high-frequency
regime, whose power decreases almost monotonically from 30 dB at 2 days, to
about 2 dB at roughly 300 days. Most of this variability occurs during
summer, as observed in Fig.

As previously shown, the high-frequency variability (IMF1…IMF5) of
HC3v5 matches that of WRDC more closely, while the re-analyses have slightly
(2–5 dB) less power. The power of these features is 15 dB greater in the
estimates and ground measurements of the SSI than in the clear-sky
regime. From 170 days to 256 days, WRDC and HC3v5 have more power than the
re-analyses. Beyond 256 days, the power in the re-analyses exceeds that of
WRDC and HC3v5, up to approximately one year. This can also be seen from
Tables

The spectra for the KIV datasets in Fig.

Yet another way of investigating the data is to exploit the adaptive, data-driven, time-domain filter character of the EMD. By pairing IMFs in the time domain, 2-D histograms can be constructed of the satellite and re-analysis estimates of the SSI against the concomitant ground measurements. This gives a good overview of the similarity, at each characteristic time-scale of variability, between the satellite estimates of the SSI or the re-analysis radiation products and the WRDC measurements, which serve as ground truth.

The 2-D histogram for IMF2 of HC3v5 and WRDC for VIE. Each pixel encodes relative frequency according to the colour-bar on the right. The solid black line denotes the identity line and the dash-dotted red line represents the best fit line. The linear regression equation is indicated in the legend. The time-scale, root-mean-square error and coefficient of determination are indicated in the panel above the legend.
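
The quantities shown in such a panel (relative-frequency histogram, best fit line, root-mean-square error and coefficient of determination) can be computed for a pair of IMFs along the following lines. This is a minimal sketch under stated assumptions: the RMSE is taken between the estimate and the reference directly, the coefficient of determination is the squared Pearson correlation of the linear fit, and the function name, the dictionary keys and the default number of bins are hypothetical.

```python
import numpy as np

def compare_modes(estimate, reference, bins=50):
    """Compare an IMF of an SSI estimate against the matching IMF of the measurements."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Best fit line: estimate ~ slope * reference + intercept
    slope, intercept = np.polyfit(reference, estimate, 1)
    # Root-mean-square error of the estimate with respect to the reference
    rmse = np.sqrt(np.mean((estimate - reference) ** 2))
    # Coefficient of determination of the linear fit
    r2 = np.corrcoef(reference, estimate)[0, 1] ** 2
    # 2-D histogram normalised to relative frequencies
    counts, ref_edges, est_edges = np.histogram2d(reference, estimate, bins=bins)
    return {
        "slope": slope,
        "intercept": intercept,
        "rmse": rmse,
        "r2": r2,
        "relative_frequency": counts / counts.sum(),
        "ref_edges": ref_edges,
        "est_edges": est_edges,
    }
```

A slope close to one, a small RMSE and a coefficient of determination close to one indicate that the estimate reproduces the measured variability at that time-scale.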

Figure

Figure

Statistical indicators for correlations at different time-scales between SSI estimates and ground measurements for VIE.

Statistical indicators for correlations at different time-scales between SSI estimates and ground measurements for KIV.

The 2-D histogram for IMF2 of ERA and WRDC for VIE. Each pixel encodes relative frequency according to the colour-bar on the right. The solid black line denotes the identity line and the dash-dotted red line represents the best fit line. The linear regression equation is indicated in the legend. The time-scale, root-mean-square error and coefficient of determination are indicated in the panel above the legend.

Similar plots to those in Figs.

Table

Similar statements can be made about the results for KIV, presented in
Table

The results in the previous section have highlighted some features of the data that will be expanded upon here.

It has been inferred from the mean time-scales and mean amplitudes of the
decomposed data (Tables

Apart from this yearly component, the TOA exhibits no other form of
significant variability, also in good agreement with its trace from
Fig.

High-frequency variability, from 2 days up to 2–3 months, is manifest in
McClear through its first five IMFs. This feature is also present in the rest
of the datasets with greater power when compared with McClear. Hence, this
feature can be attributed to clear-sky (no cloud) atmospheric effects
(scattering and absorption by ozone, water vapour, aerosols, etc.). Looking
at the McClear graph in Fig.

Another significant result of this study is the fact that, for both VIE and
KIV, the McClear datasets do not have a variability component in between this
high-frequency feature and the yearly cycle. In other words, IMF6 for the
McClear data represents the yearly cycle, unlike the ground measurements or
the SSI estimates, where IMF6 is an intermediate component before the yearly
component represented by IMF7. This was first discussed as a
“variability gap” by

Larger scales of variability (IMF8…IMF10) have been excluded from this analysis because they do not rise above the uncertainty threshold.

In this study we have investigated the characteristic time-scales of variability found in long-term time-series of daily means of SSI. We have also studied the fitness for use of satellite estimates of the SSI and of re-analysis radiation products as alternatives to pyranometric ground measurements. The novelty of our work is the use of the adaptive, data-driven Hilbert–Huang Transform (HHT) to decompose the datasets into their distinct characteristic time-scales of variability before they are analysed.

We have shown that the TOA only presents variability at the one-year time-scale. The clear-sky atmosphere introduces stochastic high-frequency variability, from 2 days to 2–3 months, which exhibits non-linear cross-scale phase-amplitude coupling with the yearly cycle. This feature is also present, and amplified, in ground measurements, satellite estimates and re-analysis products. The fact that the cloud-free atmosphere does not introduce variability from 2–3 months to one year, i.e. the “variability gap” alluded to in previous studies, has been confirmed. It has also been shown that HC3v5 outperforms ERA and MERRA2 by a large margin in terms of estimating the measured SSI, not only at a global, whole-dataset level, but also on a per-time-scale basis, and especially with respect to the stochastic variability component. This has implications for the forecasting and modelling of the SSI, where satellite estimates should be preferred over re-analysis products. Our study, hence, refines the existing methodology to assess the fitness for use of surrogate SSI products, through an improved in-depth comparison of their local time-scales of variability.

A limitation of our study needs to be pointed out. Before carrying out the
analysis, we applied the EMD to each time-series separately and compared only
modes with similar time-scales. That is, we used the mono-variate
version of the EMD, where mode alignment (identical time-scales for the IMFs
across datasets) is not enforced. Nevertheless, the non-alignment of modes is
not to be considered a weakness of our approach. Identical time-series are
decomposed into identical modes; hence, when similar time-scales are not
enforced across the modes of different datasets, changes in the
time-scales of the modes (e.g. IMF6 of HC3v5 matches WRDC unlike ERA or
MERRA2) provide supplementary clues as to the fitness for use of the
surrogate SSI datasets in lieu of ground measurements. Mode alignment can be
enforced by more advanced, multi-variate versions of the EMD. Two such
techniques are the noise-assisted multi-variate empirical mode decomposition (NA-MEMD)
introduced by

Lastly, we recognize the limited geographical scope of the study and, as a future exercise, we propose its extension to many more geographical locations, possibly including several different satellite estimates and re-analysis radiation products, in order to determine whether the findings reported herein also hold for different regions and for different SSI surrogates.

The software used for this study, comprising general EMD and HSA routines, is publicly available online, as follows.

The fast EMD routine used in this study is provided by

Methods pertaining to Hilbert spectral analysis are part of a general HHT
toolkit provided by

The code for the ICEEMD(AN) algorithm

The data can be accessed as follows:

The ERA-Interim data set (ECMWF2009) can be accessed at:

The MERRA2 radiation diagnostics M2T1NXRAD timeseries (GMAO2015)
is available at:

TOA and McClear data from Copernicus Atmosphere Monitoring Service
(Copernicus2015,

The WRDC global radiation daily sums for Europe (WRDC2014) can be
accessed at:

The HelioClim-3v5 dataset was downloaded from the SoDa Service web site
(

Tables

Mean IMF time-scales in days for the KIV datasets.

Mean IMF amplitudes in W m

All authors contributed equally to this work.

The authors declare no competing interests. HC3v5 solar
radiation products are commercialized by Transvalor SA through its online SoDa
Service at

The authors thank the World Radiation Data Centre for maintaining the radiation archives and hosting the website for downloading data. They thank the ground station operators of the WMO network for their valuable measurements. The authors are indebted to the company Transvalor SA, which takes care of the SoDa Service for the common good, thereby permitting efficient access to the HelioClim databases.

Edited by: S.-E. Gryning
Reviewed by: two anonymous referees