Climate reference stations in Germany: Status, parallel measurements and homogeneity of temperature time series

Germany’s national meteorological service (Deutscher Wetterdienst, DWD) operates a network of so-called “climate reference stations”. These stations fulfill several tasks. At these locations, observations have already been performed for several decades. Observations will continue to be performed at the traditional observing times, so that the existing time series are consistently prolonged. Currently, one specific task is to perform parallel measurements that allow the comparison of manual and automatic observations. These parallel measurements will be continued at a subset of these stations until at least 2018. Later, all stations will be operated as automatic stations but will also be used for the comparison of subsequent sensor technologies. New instrumentation will be operated in parallel to the previously used sensor types over sufficiently long periods to allow an assessment of the effect of such changes. Here, we present the current status and an analysis of parallel measurements of temperature at 2 m height. The analysis shows that the automation of stations did not cause an artificial increase in the series of daily mean temperature. Depending on the screen type, a bias with a seasonal cycle occurs for maximum temperature, with larger differences in summer. The effect can be avoided by optimizing the position of the sensor within the screen.


Introduction
Climate at a specific location is influenced by large-scale as well as local factors. For a detailed understanding of the climate system, atmospheric conditions have to be observed over a sufficiently long time. The Global Climate Observing System (GCOS) was introduced to support and ensure systematic observation at a global scale (Karl et al., 2010). GCOS defined a list of variables to be observed with priority (so-called Essential Climate Variables, ECVs, see Bojinski et al., 2014) and defined so-called Climate Monitoring Principles. Atmospheric near-surface variables are typically observed by networks of surface stations operated by national weather services. Taken together, these form the GCOS surface network (GSN). For the reliable description of climate, i.e. the statistical features of various atmospheric variables and the assessment of their long-term variability and change, high-quality meteorological observations have to be performed over a sufficiently long time, and non-climatic influences on these time series have to be understood. It is well known that over such periods the observation networks and procedures are affected by various modifications. One important example is the transition from manual to automatic observation techniques. The GCOS Climate Monitoring Principles suggest that traditional and new sensors should be operated with a sufficiently long temporal overlap.
Such overlapping ("parallel") measurements have therefore been performed in several countries. Especially in the case of temperature, interest in an analysis of these series is also motivated by the aim of better understanding the uncertainty of global temperature datasets and their trends (Jones, 2016). To allow studying systematic biases at a global scale, the Parallel Observations Science Team (POST) of the International Surface Temperature Initiative (ISTI, see Rennie et al., 2014) compiles a database with parallel measurements.
Different settings are used for such parallel measurements, e.g. with a focus on the screens or the sensors. In the Netherlands the same type of sensor was used in a comparison of nine thermometer screens over 6 years (Brandsma, 2004; Brandsma and Van der Meulen, 2008). With this dataset, transfer functions were derived to relate the measurements from different thermometer screens to each other. Böhm et al. (2010) compared temperature differences at screened and unscreened sites at Kremsmünster (Austria) in order to assess biases in earlier measurements in the Greater Alpine Region. Auchmann and Brönnimann (2012) use parallel measurements from Switzerland to propose a correction method for sub-daily temperature data. They also discuss several former studies that intercompare different types of ventilation or sensor shielding. Doesken (2005) used a twenty-year time series of parallel temperature measurements from one site to study the impact of the replacement of liquid-in-glass thermometers by electrical thermometers in the US.
DWD operates a network of so-called "climate reference stations" (CRS). Currently, manual and automatic observations are performed in parallel in order to understand the impact of changes in the instrumentation in Germany.
Here we provide a summary of the status of this network and conclusions on the homogeneity of temperature series. For a general summary of DWD's contribution to climate observation see Deutscher Wetterdienst (2013).

Table 1. Length of time series, time range in which the station has parallel measurements (for conventional measurements: three times per day), location and height (in meters) of DWD's climate reference stations. The time range of parallel measurements refers to the interval when AMDA stations were in use (see text). The data from these intervals have been used for the analysis in this study, including data until end of June 2016. Aachen was relocated and continued as Aachen-Orsbach in 2011.

History and status of DWD's climate reference stations
After Germany's reunification, the observation network was successively modernized and automated. In a first step, seven "reference stations" were introduced to allow the comparison of conventional and automatic measurements. Since 2008, 12 stations have been operated as climate reference stations (Table 1). Parallel measurements at these stations officially started on 1 May 2008. At these stations observations are performed by observers and automatic instruments. The stations are located in different climatic regions of Germany (Fig. 1, Table 1). At these locations observations have already been performed for several decades, in most cases since the end of the 19th century.
Currently, 10 stations are operated as climate reference stations. Konstanz was converted to a standard station in 2012 because of difficulties in ensuring representative surroundings, and Fichtelberg was a CRS until the end of 2013 (see Fig. 1). In the current configuration, the CRSs are staffed with observers around the clock. The automatic measurements are performed in the same way as at other stations in DWD's main observing network. In addition, manual observations are performed in parallel at the three traditional observing times (so-called "Mannheimer Stunden"): 06:30, 13:30 and 20:30 UTC (in the following: observing times I, II, III). These include readings of instantaneous and extreme values (see Table 3) and the interpretation of recordings (see Table 4). Observed parameters are: air pressure, air temperature, humidity, precipitation, sunshine duration, snow height (Table 2) and soil temperatures at different depths (Table 3). The instruments used at these stations are listed in Table 2. The manual readings are transferred into the central database of DWD (Kaspar et al., 2013). The time periods in Table 1 refer to the interval when so-called AMDA stations (Automatische Meteorologische Datenerfassungs-Anlage) were used (Klapheck and Wolff, 2005). These synoptic-climatological stations also perform the first step of the data quality control (QC) procedures. After the transfer of the data into the central database, further QC algorithms are applied.
Five CRSs will be operated in the current mode until 2018 (Schleswig, Lindenberg, Brocken, Frankfurt, Hohenpeißenberg). Ten years of parallel measurements will then be available for these stations ("type I" stations). After that period, they will be converted to automatic CRSs ("type II"). The five additional existing stations will be operated as automatic CRSs ("type II") from now onward. At these automatic stations no manual observations will be performed. When new automatic instruments are introduced into DWD's network, the previous and new instruments will be operated in parallel at the CRSs. The intended duration for such parallel measurements is 2 to 5 years. From 2019 onward, all CRSs will be of type II, i.e. automatic CRSs.
A first comparison of the manually measured data with the automatically recorded data was performed by Augter (2013). In that study the data of the climate reference stations available until the end of 2010 were used together with data from five additional stations where parallel measurements had already been performed in earlier years. This analysis led to the following conclusions: The change of the observing technique resulted in only small differences for air pressure and temperature, i.e. no inhomogeneities were caused. Precipitation is slightly higher for traditional measurements, but the mean differences are in the range of uncertainty of the manual readings. For humidity, values > 95 % were measured more often with the traditional technique (Aßmann-Psychrometer). For sunshine duration the traditional technique typically resulted in higher values. For some stations the difference in the annual sunshine duration was greater than 100 h for selected years. The traditional measurement technique ("Campbell-Stokes") is based on a burned trace in a paper card, whereas the automatic observations are based on radiation measurements.
With the introduction of automatic stations, a new procedure for the calculation of the daily mean temperature has been introduced. Traditionally, the daily means have been calculated based on three daily state observations according to the following formula:

$$T_{\mathrm{mean}} = \frac{T_{\mathrm{I}} + T_{\mathrm{II}} + 2\,T_{\mathrm{III}}}{4} \qquad (1)$$

where $T_{\mathrm{I}}$ is the temperature observed at 06:30 UTC, $T_{\mathrm{II}}$ at 13:30 UTC and $T_{\mathrm{III}}$ at 20:30 UTC. This formula, which gives double weight to the third observation, was suggested by Kämtz (1831, p. 102) and aims at providing the best estimate for the daily mean for this combination of observation times. Since April 2001 daily mean values have been calculated based on the hourly observations. Augter (2013) also compared the differences between these approaches. On average, the results based on hourly values were 0.1 K lower. The spread of the differences between daily means from hourly and three-times-daily measurements was found to be wider than the one that arises from the comparison of traditional and automatic measurements.
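As an illustration, the two averaging procedures described above can be sketched in a few lines of Python. The function names and the sample readings are hypothetical and serve only to show the arithmetic of Eq. (1) next to the hourly mean; this is not DWD's operational code.

```python
from statistics import fmean


def daily_mean_kaemtz(t_i, t_ii, t_iii):
    """Traditional daily mean (Eq. 1): the 20:30 UTC reading gets double weight."""
    return (t_i + t_ii + 2.0 * t_iii) / 4.0


def daily_mean_hourly(hourly_values):
    """Daily mean as the arithmetic mean of 24 hourly values."""
    if len(hourly_values) != 24:
        raise ValueError("expected 24 hourly values")
    return fmean(hourly_values)


# Hypothetical readings in degrees Celsius, for illustration only.
print(daily_mean_kaemtz(10.0, 18.0, 12.0))  # (10 + 18 + 2 * 12) / 4 = 13.0
```

The double weight on the evening reading stands in for the unobserved night hours, which is why the two procedures can differ on individual days even when the same sensor is used.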

Comparison of parallel temperature measurements
The temperature measurements from DWD's station network are regularly used to provide information on climate change in Germany (e.g. Kaspar and Mächel, 2017). It is therefore important to understand whether there are any artificial breaks in these time series, e.g. caused by changes in the observing technique. Here we use the parallel measurements to analyse the impacts of changes of the sensors, screen types and data processing.

Temperature measurements
Traditionally, temperature was measured with a mercury thermometer. Minimum temperature was measured with alcohol thermometers. At DWD's automatic stations, a platinum resistance thermometer ("Pt 100") is used. The tolerance class of these resistance thermometers is 1/3 of Class B (i.e. ±(0.1 + 0.00167 · |T|) with T in °C) according to the IEC 60751 standard (IEC, 2008; see footnote 2). Details of the measurement accuracy are described in the appendix of Deutscher Wetterdienst (2015). The electrical thermometers are calibrated every 60 months, the liquid thermometers every 120 months.
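To give a feeling for the size of this tolerance, the expression quoted above can be evaluated for a few temperatures. This is a minimal sketch: the function name is hypothetical, and the result is interpreted here as a tolerance in K, which is an assumption based on the quoted expression.

```python
def pt100_tolerance_third_class_b(temp_c):
    """Tolerance for 1/3 Class B: +/-(0.1 + 0.00167 * |T|), with T in deg C."""
    return 0.1 + 0.00167 * abs(temp_c)


# Illustrative temperatures only; not taken from the station data.
for temp_c in (-20.0, 0.0, 30.0):
    tolerance = pt100_tolerance_third_class_b(temp_c)
    print(f"{temp_c:6.1f} degC -> +/-{tolerance:.3f}")
```

Over the range of near-surface air temperatures the tolerance stays between 0.1 and roughly 0.15, which is of the same order as the mean differences discussed later in the text.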
The traditional thermometers are operated in a Stevenson screen, except for the station Brocken, where a screen of type "Gießener Hütte" is used. At standard stations, the automatic thermometers are operated in lamellar shelters "LAM 630", a multi-plate screen with artificial ventilation. At the mountain stations Brocken and Fichtelberg the automatic thermometers are placed in a different screen type: a Stevenson screen is used at Fichtelberg, a "Gießener Hütte" at Brocken. A Stevenson screen was also used for the automatic measurements at the airport station Frankfurt until 22 October 2014. From 22 October 2014 onwards, a LAM 630 was used at that station for the automatic measurements. Generally, the LAM 630 is equipped with two identical sensors for temperature and two for humidity. One of each is used operationally, the second one for quality assurance.
Manual state measurements are performed at the traditional observation times: 06:30, 13:30 and 20:30 UTC. The electrical thermometers measure continuously and data are stored in the database every ten minutes. The value that is encoded and transferred for a specific observation time (e.g. 20:30 UTC) is taken 10 min before that nominal time and is the average of a 1 min interval, i.e. in the case of 20:30 it is the mean value of 20:19 to 20:20 UTC. In this analysis daily minima and maxima refer to the nominal interval from 20:30 UTC of the previous day to 20:30 UTC.
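The timing conventions above can be made concrete with a short sketch. Both helper functions are hypothetical; in particular, the handling of a reading at exactly 20:30 UTC is an assumption, since the text does not state on which side of the boundary it falls.

```python
from datetime import datetime, time, timedelta


def sampling_window(nominal):
    """Return (start, end) of the 1 min averaging interval for a nominal time.

    The value is taken 10 min before the nominal time, so a nominal time of
    20:30 UTC maps to the interval 20:19 to 20:20 UTC.
    """
    end = nominal - timedelta(minutes=10)
    return end - timedelta(minutes=1), end


def climatological_day(timestamp):
    """Date label of the 20:30-to-20:30 UTC interval containing `timestamp`.

    Assumption: a reading at exactly 20:30 UTC closes the current day;
    later readings count towards the extremes of the next day.
    """
    if timestamp.time() > time(20, 30):
        return (timestamp + timedelta(days=1)).date()
    return timestamp.date()


start, end = sampling_window(datetime(2016, 6, 1, 20, 30))
print(start.time(), end.time())  # 20:19:00 20:20:00
```

Grouping automatic readings with `climatological_day` before taking minima and maxima keeps the automatic extremes on the same 24 h interval as the manual readings of the extreme thermometers.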
In the following, differences between automatic and manual measurements are analysed as automatic minus manual, i.e. positive differences indicate that the automatic measurements provided higher values. Differences larger than 2 K were excluded from the analyses, as these obviously result from incorrect measurements.
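A minimal sketch of this difference calculation is given below. The function name and the sample values are hypothetical, and two details are assumptions not stated in the text: the 2 K limit is applied to the absolute difference, and the standard deviation is computed as a sample standard deviation.

```python
from statistics import fmean, stdev

OUTLIER_LIMIT_K = 2.0  # exclusion threshold given in the text


def difference_stats(automatic, manual, limit=OUTLIER_LIMIT_K):
    """Mean and SD of (automatic - manual) after excluding outliers.

    Returns (mean, standard deviation, number of excluded pairs).
    """
    if len(automatic) != len(manual):
        raise ValueError("series must have the same length")
    diffs = [a - m for a, m in zip(automatic, manual)]
    kept = [d for d in diffs if abs(d) <= limit]
    return fmean(kept), stdev(kept), len(diffs) - len(kept)


# Hypothetical paired readings in degrees Celsius.
automatic = [10.1, 12.0, 15.3, 9.0]
manual = [10.0, 12.1, 15.0, 12.5]
mean_diff, sd_diff, n_excluded = difference_stats(automatic, manual)
print(round(mean_diff, 2), round(sd_diff, 2), n_excluded)  # 0.1 0.2 1
```

The last pair differs by 3.5 K and is dropped, so the statistics are based on the remaining three differences.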

Comparison of air temperature at 2 m height
The comparison is based on the three daily state observations at traditional observation times. Figure 2 (top) shows the results for Frankfurt: The mean of the differences is 0.03 K with a standard deviation of 0.2 K. For most dates, the differences are close to 0 K and only for a small number of cases (less than 3 %) the differences are larger than ±0.5 K (see histogram in Fig. 2, top right). Similar to Frankfurt, the mean of the differences is small for the majority of stations, as shown in the second column of Table 5. With −0.16 K, station Schleswig shows the largest mean difference (Fig. 2, bottom). The standard deviation (0.28 K) is also larger than for Frankfurt. The average of the mean difference for all stations is −0.02 K, i.e. the automatic measurements are on average slightly lower than the traditional measurements (Table 5, column 2).

Footnote 2: 1/3 Class B is also called Class AA.

Comparison of averaging procedures for daily temperature
Breaks in the time series can result not only from changes in the sensors themselves, but also from changes in the data processing. Similar to the comparison in the previous section, column 4 in Table 5 shows the differences in the daily mean temperature when automatic observations are used instead of manual observations, but without changing the averaging procedure, i.e. for both cases, Eq. (1) has been applied. The analysis is therefore based on the same observations as the one in the previous section (column 2), but daily means are calculated before comparing the data (i.e. a different averaging procedure is applied to the same data as in the previous section: double weight is given to the third observation, see Eq. 1). This only leads to small differences in the results. Figure 3 (left) shows the histogram of the differences: The mean difference between the daily values for all days and all stations is −0.03 K with a standard deviation of 0.16 K.
The new approach for calculating daily mean temperature is the arithmetic mean of all automatically taken hourly values. To illustrate the impact of changing this procedure, column 6 in Table 5 shows the mean difference between the daily means of the new and the traditional approach. For that comparison the arithmetic mean has been applied to the hourly automatic observations and Eq. (1) has been applied to the manually observed data. Figure 3 (right) shows the histogram of the difference between both approaches for all days and stations.
From the histogram and the standard deviation (0.52 K for all stations) it is obvious that the spread of differences is larger than the spread that is caused by the change of the sensors alone (standard deviation: 0.16 K). The mean of the differences for all stations is −0.08 K. The change of the formula thus leads to larger differences than changing the observing technique (−0.03 K). However, the mean differences are still rather small for all stations. Again, the largest bias occurs for station Schleswig (−0.16 K). These results also show that the formula of Kämtz (Eq. 1) is a good approach for estimating the daily mean temperature.
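The comparison of the new and the traditional procedure can be sketched as follows. The function names and the two sample days are hypothetical and only illustrate how the per-day differences (hourly automatic mean minus Eq. 1 applied to manual readings) would be formed before computing their mean and spread.

```python
from statistics import fmean, stdev


def kaemtz_mean(t_i, t_ii, t_iii):
    """Eq. (1): daily mean from three readings, third reading weighted twice."""
    return (t_i + t_ii + 2.0 * t_iii) / 4.0


def procedure_differences(days):
    """Per-day difference: new procedure minus traditional procedure.

    `days` is an iterable of (hourly_automatic, (t_i, t_ii, t_iii)) pairs,
    where `hourly_automatic` holds the hourly automatic values of one day.
    """
    return [fmean(hourly) - kaemtz_mean(*manual) for hourly, manual in days]


# Two hypothetical days in degrees Celsius, for illustration only.
days = [
    ([12.0] * 24, (10.0, 18.0, 12.0)),  # hourly mean 12.0, Eq. (1) gives 13.0
    ([8.0] * 24, (8.0, 8.0, 8.0)),      # both procedures give 8.0
]
diffs = procedure_differences(days)
print(diffs)  # [-1.0, 0.0]
print(fmean(diffs), stdev(diffs))
```

Because this difference mixes a change of sensor with a change of formula, its spread is expected to be larger than that of the sensor-only comparison, as reported above.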
For both types of changes (sensors and averaging procedure) the overall mean bias is slightly negative, i.e. the values based on the new approach are slightly lower. This is important to note, as the question has been raised whether the introduction of new methodologies could have contributed to the observed long-term increase in temperature. This analysis shows that there is no artificial increase in temperature for the CRSs of DWD and that the effects are rather small compared to the climate change signal. The increase of the average annual temperature for Germany from 1881 to 2015 is 1.4 K (linear trend, see Kaspar and Friedrich, 2016).

Table 5. Summary of results for all climate reference stations. Second and third column: mean difference (automatic minus manual) and standard deviation (SD) based on the three traditional observation times. Fourth to seventh column: comparison of different approaches for calculating the daily mean; (4, 5): based on the same formula for three daily automatic and manual observations; (6, 7): based on the traditional and new formula, i.e. based on 24 hourly automatic vs. three manual observations. Eighth to eleventh column: mean differences for daily extremes; (8, 9): maximum; (10, 11): minimum. (12, 13): mean difference of daily values based on daily minimum and maximum.

Comparison of daily extreme temperatures
Daily extremes of temperature (minimum and maximum) are also measured at the CRSs with traditional and automatic instruments. The liquid thermometers are read at 20:30 UTC. The 24 h interval ending at this time is therefore also used to provide the daily extremes for the automatic thermometers in this analysis. Figure 4 shows the histograms for the differences in observed extremes (maximum and minimum) for all stations. On average, the automatic sensors measure slightly lower values for the maximum (−0.03 K) and the minimum (−0.08 K). The results for the individual stations are shown in Table 5 (columns 8 to 11).
Figure 5 shows the time series of the differences in the observed maximum temperature for Potsdam. A seasonal cycle with positive differences in summer is visible, i.e. the automatic thermometer measures higher values. A similar seasonal cycle is visible for several other climate reference stations, specifically those where a LAM 630 shelter is used. It is not visible in the time series for Brocken, Fichtelberg and Frankfurt, where either Stevenson screens or a "Gießener Hütte" are used. Figures 7 and 8 show the average annual cycle of the differences based on monthly aggregated results.
In contrast to Potsdam (Fig. 7), no distinct cycle is visible for Fichtelberg (Fig. 8). It has already been noted in other studies that the LAM 630 shelter was warmer than some other screens in the case of high solar radiation and low wind speed (e.g. Lacombe et al., 2011, for desert conditions in Algeria).
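The monthly aggregation behind such an annual cycle can be sketched as below. The function name and the records are hypothetical; the sketch only shows the grouping of daily differences (automatic minus manual) by calendar month across all years, not the boxplots or the significance test shown in the figures.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean


def monthly_mean_differences(records):
    """Average (automatic - manual) daily maximum per calendar month.

    `records` is an iterable of (date, tmax_automatic, tmax_manual).
    """
    by_month = defaultdict(list)
    for day, tmax_auto, tmax_manual in records:
        by_month[day.month].append(tmax_auto - tmax_manual)
    return {month: fmean(values) for month, values in sorted(by_month.items())}


# Hypothetical daily maxima in degrees Celsius, for illustration only.
records = [
    (date(2015, 1, 10), 2.0, 2.0),
    (date(2015, 7, 10), 30.5, 30.0),
    (date(2014, 7, 11), 28.25, 28.0),
]
print(monthly_mean_differences(records))  # {1: 0.0, 7: 0.375}
```

A seasonal cycle of the kind described for Potsdam would show up as systematically positive values for the summer months in the returned mapping.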
Further internal investigations at DWD have led to the conclusion that this effect is at least partly caused by radiation effects and is influenced by the positioning of the sensor within the screen. Additional internal guidelines for the placement of the sensor have been defined (Deutscher Wetterdienst, 2015). According to the guidelines, the operational sensor should be placed at the north-eastern position within the LAM 630 to reduce the radiation effect close to sunset. For Potsdam the placement was changed in March 2016. Figure 5 shows that the bias no longer occurs in summer 2016.
For the minimum temperature, a seasonal cycle is only visible for two stations (Schleswig, Potsdam). An explanation for this has not yet been found and further investigations are necessary.
In some countries the daily mean temperature is calculated based on observed maximum and minimum temperature: $T_{\mathrm{mean}} = (T_{\max} + T_{\min})/2$ (see discussion in Weiss and Hays, 2005). It is therefore also of interest to see the impact of the changes in the sensors for this approach. Figure 6 and Table 5 (columns 12 and 13) show the differences when this approach is applied to the manually and automatically observed extremes. Consistent with the previous results, on average the automatic system results in slightly lower values (−0.05 K).

In this paper, we analysed the impact on temperature series. The change in the technology does not introduce an artificial increase in the mean temperature. The procedure for the calculation of daily means has a slightly stronger impact on the time series, but the mean bias due to that effect is also small (−0.08 K for all CRSs). This confirms earlier results of Augter (2013), where data from the CRSs until 2010 were used together with additional measurements from an earlier type of station. The effect on the daily extremes of temperature is also small on average, but a seasonal cycle for the daily maximum temperature was noted for stations where a LAM 630 shelter is used, with increased values for the electrical thermometer in summer. This effect can be avoided by optimizing the placement of the sensors within the screen.

Data availability
Observations from the station network of Deutscher Wetterdienst at hourly, sub-daily, daily and monthly resolution are available at ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/.

Figure 1. Location of DWD's climate reference stations (CRS). •: manual observations will be performed until 2018, : manual observations have been performed until at least 2014 (Helgoland: 2013), ×: Konstanz was CRS until 2012, Fichtelberg until 2014. • + : stations will be operated as CRS with parallel observations of automatic sensors ("type II") after the end of the manual observations.

Figure 2. Comparison of automatic versus traditional temperature measurements: The black line (left) and the histogram (right) show the difference of the automatic minus traditional measurements in K based on all traditional observation times (I, II, III). Obvious outliers were removed from the analysis and are also not considered in the mean and standard deviation. The blue line is three times the standard deviation. The mean difference is shown in red and the moving average in green (based on 150 values). The top figure shows the results for station Frankfurt, the bottom for Schleswig.

Figure 3. Comparison of different methods for calculating the daily mean temperature. Left: Automatic and traditional observations are used to calculate the daily mean temperature based on the three traditional observation times. The histogram is based on the differences (in K; automatic minus traditional) of the daily mean values for all stations and all available days. Right: The histogram is based on the differences of the daily mean values of the new and the traditional procedure, i.e. the arithmetic average of all hourly values from the automatic measurements versus the daily mean values calculated with the traditional formula based on three manual observations (Eq. 1). Outliers are removed.

Figure 4. Histograms of differences between automatic and manually observed daily extremes (in K, left: maximum, right: minimum). All dates and stations are included. The daily extremes refer to the 24 h interval from 20:30 to 20:30 UTC. Obvious outliers are removed.

Figure 5. Differences of daily temperature maximum in K as time series (left) and histogram (right) for station Potsdam: automatic minus manual measurements (black line), triple standard deviation (blue line), mean (red line), moving average (green line; 50 days). Outliers are removed.

Figure 8. Boxplots of the differences of temperature maxima (automatic minus conventional measurements) for Fichtelberg for each month (as Fig. 7). At Fichtelberg, the automatic temperature sensors are operated in a Stevenson screen.
DWD operates a network of climate reference stations. At these stations automatic and manual observations of several meteorological parameters have been performed in parallel for several years. They allow an analysis of the impact of changes of the sensor technology on the homogeneity of the time series.

F. Kaspar et al.: Climate reference stations in Germany, Adv. Sci. Res., 13, 163-171, 2016, www.adv-sci-res.net/13/163/2016/

Table 2. Conventional and automatic measurements taken at climate reference stations and instruments used for these measurements. Observation times for these parameters for traditional observations are: (I): 06:30 UTC, (II): 13:30 UTC and (III): 20:30 UTC.

Table 3. Observing time of temperature extremes, soil temperature, precipitation and snow height.
Figure 7. Boxplots of the differences of temperature maxima (automatic minus conventional measurements) for Potsdam for each month. At Potsdam, the automatic temperature sensors are operated in a LAM 630 shelter. Numbers above each boxplot are the p value of a t test which was performed to check whether the means of the automatic and the conventional time series are equal. For this plot, only data until 14 March 2016 have been included. After that date, the positioning of the sensors was modified (see text).